Back in June, apparently, the FTC said that a do-not-email list (like
the do-not-call list) would not work, and would generate more spam
because spammers would use it as a source of new email addresses.
Though it's a bit late now, I have to wonder about the latter
point. Why not simply map each address into its MD5 checksum
before storing it?
So foo@example.com would become "a0b6e8fd2367f5999b6b4e7e1ce9e2d2"
which is useless for sending email. However, spammers could use any of many available tools
to check for "hits" on their email lists, so it's still perfectly
usable for filtering out email addresses. Of course it would also
tell spammers that they have a 'real' email address on their list, but
only if they already had it -- so I don't think that would be giving
them much information at all.
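To make that concrete, here's a quick sketch of my own (the addresses and registry contents are invented for illustration) of how a registry holding only MD5 hashes could still be used to scrub a mailing list:

    import hashlib

    def md5_of_address(address):
        # Normalize before hashing so "Foo@Example.COM" and "foo@example.com" match.
        return hashlib.md5(address.strip().lower().encode("utf-8")).hexdigest()

    # What the registry would publish: hashes only, never the addresses themselves.
    do_not_email_hashes = {
        md5_of_address("foo@example.com"),
        md5_of_address("bar@example.org"),
    }

    # What a compliant mailer would do: drop any address whose hash is registered.
    mailing_list = ["foo@example.com", "someone.else@example.net"]
    scrubbed = [a for a in mailing_list
                if md5_of_address(a) not in do_not_email_hashes]
    print(scrubbed)  # ['someone.else@example.net']

Note the normalization step: the registry and the mailers would have to agree on case-folding and whitespace handling for the hashes to match.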
I still think the list would be useless because spammers would simply
ignore it. But it wouldn't generate new spam, and it would drive
up the cost of spamming by making the threat of legal action a bit more credible.
Wednesday, December 15, 2004
Tuesday, December 14, 2004
The Noosphere Just Got Closer
Of course it'll take several years, but Google's just-announced project to digitize major university library collections means that the print-only "dark matter" of the noosphere
is about to be mapped out and made available to anyone with an Internet
connection. Well, at least the parts that have passed into the
public domain; the rest will be indexed.
I'm clearly a geek -- my toes are tingling.
Monday, December 13, 2004
The "5th Estate"
Interesting quote, from my point of view, in this article:
Jonathan Miller, Head of AOL in the US, testifies to the popularity of Citizen's Media. He says that 60-70 per cent of the time people spend on AOL is devoted to 'audience generated content'.
(Though he's talking mostly about things like message boards and chat rooms, of course, rather than blogs.)
Monday, December 6, 2004
Welcome MSN Spaces!
A surprise to welcome me back from sabbatical: Microsoft released the beta of MSN Spaces
(congratulations guys!). I've been playing with it a bit over the
past few days; there's some very cool stuff there, especially the
integrations between Microsoft applications.
(I've seen a few comments about the instability of the Spaces service; come on folks, it's a beta. And they're turning around bug fixes in 48 hours while keeping up with what has got to be a ton of traffic.)
Wednesday, November 24, 2004
The Atom Publishing Protocol Summarized
The slides
from Joe Gregorio's XML 2004 talk about the Atom Publishing Protocol
are online. It's an excellent summary, and makes a good case for
the document literal and addressable web resource approaches. The
publishing protocol is where Atom really starts to get exciting.
Tuesday, November 23, 2004
Software Patents Considered Harmful
This post by Paul Vick is, I think, a very honest and representative take on software patents -- and in particular the over-the-top IsNot patent -- from the point of view of an innovator. I find myself agreeing with him wholeheartedly:
Microsoft has been as much a victim of this as anyone else, and yet we're right there in there with everyone else, playing the game. It's become a Mexican standoff, and there's no good way out at the moment short of a broad consensus to end the game at the legislative level.
And we all know how Mexican standoffs typically end. Paul, my name is on a couple of patents which I'm not proud of either. But in the current environment, there really isn't a choice: We're all locked in to locally 'least bad' courses, which together work to guarantee the continuation of the downward spiral (and in the long run, make all companies worse off -- other than Nathan Myhrvold's, of course.)
Monday, November 22, 2004
Web Services and KISS
Adam Bosworth argues for the 'worse is better' philosophy of web services eloquently in his ISCOC talk and blog entry.
I have a lot of sympathy for this point of view. I'm also
skeptical about the benefits of the WS-* paradigm. They seem to
me to be well designed to sell development tools and enterprise
consulting services.
Sunday, November 14, 2004
Why Aggregation Matters
Sometimes, I feel like I'm banging my head against a wall trying to describe just why feed syndication and aggregation is important. In an earlier post,
I tried to expand the universe of discourse by throwing out as many
possible uses as I could dream up. Joshua Porter has written a
really good article about why aggregation is a big deal, even just
considering its impact on web site design: Home Alone? How Content Aggregators Change Navigation and Control of Content.
Monday, November 1, 2004
Prediction is Difficult, Especially the Future
My
second hat at AOL is development manager for the AOL Polls system.
This means I've had the pleasure of watching the conventions and
debates in real time while sitting on conference calls watching the
performance of our instant polling systems. Which had some potential
issues, but which, after a lot of work, seem to be just fine now.
Anyway: The interesting thing about the instant polling during the
debates was how different the results were from the conventional
instant phone polls. For example, after the final debate the AOL
Instapoll respondents gave the debate win to Kerry by something like
60% to 40%. The ABC news poll was more like 50%/50%. Frankly, I don't
believe any of these polls. However, I'll throw this thought out: The
online insta polls are taken by a self selected group of people who are
interested in the election and care about making their opinions known.
Hmmm... much like the polls being conducted tomorrow.
I'll go out on a limb and make a prediction based on the various poll results and on a lot of guesswork: Kerry will win the popular vote by a significant margin. And, he'll win at least half of the "battleground" states by a margin larger than the last polls show. But, I make no predictions about what hijinks might ensue in the Electoral College.
Update 11/11: Well, maybe not...
Monday, October 18, 2004
Random Note: DNA's Dark Matter
Scientific American's The Hidden Genetic Program of Complex Organisms
grabbed my attention last week. This could be the biological
equivalent of the discovery of dark matter. Basically, the 'junk'
or intron DNA that forms a majority of our genome may not be junk at
all, but rather control code that regulates the expression of other
genes.
The programming analogy would be, I think, that the protein-coding parts of the genome would be the firmware or opcodes while the control DNA is the source code that controls when and how the opcodes are executed. Aside from the sheer coolness of understanding how life actually works, there's a huge potential here for doing useful genetic manipulation. It's got to be easier to tweak control code than to try to edit firmware... (Free link on same subject: The Unseen Genome.)
Monday, October 11, 2004
Things in Need of a Feed
Syndicated feeds are much bigger than blogs and news stories; they're a platform. Here's a bunch of use cases, several of which already exist in some form and others that are just things I'd like to see (a sketch of one such feed follows the list):
- Blog entries for blogs I'm interested in
- Feed of all comments on entries I've authored
- News stories matching a custom filter I've set up
- Traffic conditions on my customary route(s)
- Fedex shipping feed giving status and history for all of my packages
- Customer support feed giving status and history for all my issues (any company)
- Product safety/recall information for everything I buy
- Amazon feed of new books matching my preferences
- All new material by a specific author (on any blog or online source)
- Feed of new feeds, of various types:
- Just my friends
- Authored by people whose blogs I already subscribe to
- Filtered on personal profile/interests
- House for sale listings
- Newly discovered prime numbers (okay, a niche audience)
- Airport flight status alerts
- Movies in my Netflix queue and recommendations
- Audio / video content pushed onto my iPod (Podcasting)
- Auction information
- Multiplayer game results feed
- New government publications feed
- New computer virus alerts feed (with metadata giving virus signatures)
- Book queue
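To make "feeds as a platform" concrete, here's a rough sketch of my own of what one entry in a package-tracking feed might look like; the shipper, tracking number, and URLs are invented, and the Atom details are simplified:

    from xml.sax.saxutils import escape

    ENTRY_TEMPLATE = """<entry>
      <title>Package {tn}: {status}</title>
      <id>tag:shipper.example.com,2004:{tn}</id>
      <updated>{updated}</updated>
      <link rel="alternate" href="http://shipper.example.com/track/{tn}"/>
      <summary>{status}</summary>
    </entry>"""

    def shipping_entry(tracking_number, status, updated):
        # Build one simplified Atom-style entry for a tracking event.
        # (Namespace declarations and some required elements omitted for brevity.)
        return ENTRY_TEMPLATE.format(tn=escape(tracking_number),
                                     status=escape(status), updated=updated)

    print(shipping_entry("1Z999AA10123456784", "Out for delivery",
                         "2004-10-11T08:30:00Z"))

Nothing blog-like about it; the same machinery that carries weblog posts carries any time-ordered stream of events.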
Tuesday, October 5, 2004
Niche Markets
Niche markets are where it's at: Chris Anderson's The Long Tail
is exactly right. The Internet not only eliminates the overhead of
physical space but also, more importantly, reduces the overhead of
finding what you want to near-zero. When your computer tracks your
preferences and auto-discovers new content that you actually want, it enables new markets that couldn't otherwise exist.
Update 10/11: Joi Ito's take.
Sunday, August 1, 2004
Network Protocols and Vectorization
Doing things in parallel is one of the older performance tricks. Vector SIMD machines -- like the Cray supercomputers -- attack problems that benefit from doing the same thing to lots
of different pieces of data simultaneously. It's just a performance
trick, but it drove the design and even the physical shape of those
machines because the problems they're trying to tackle -- airflow
simulation, weather prediction, nuclear explosion simulation, etc. --
are both important and difficult to scale up. (More recently, we're
seeing massively parallel machines built out of individual commodity
PCs; conceptually the same, but limited mostly by network
latency/bandwidth.)
So what does this have to do with network protocols? Just as the problems of doing things like a matrix-vector multiply very, very fast drove the designs of supercomputers, the problems of moving data from one place to another very quickly, on demand drive the designs of today's network services. The designs of network APIs (whether REST, SOAP, XML-RPC, or whatever) need to take these demands into account.
In particular, transferring lots of small pieces of data in serial fashion over a network can be a big problem. Lots of protocols that are perfectly fine when run locally or over a LAN fail miserably when expected to deal with 100-200ms latencies on a WAN or the Internet. HTTP does a decent job of balancing out performance/latency issues for retrieving human-readable pages -- a page comes down as a medium-sized chunk of data, followed, if necessary, by associated resources such as scripts, style sheets, and binary images, which can all be retrieved in parallel/behind the scenes. Note that this is achieved only through lots of work on the client side and deep knowledge of the interactions between HTML, HTTP, and the final UI. The tradeoff is complexity of protocol and implementation.
How does this apply to network protocols in general? One idea is to carefully scrutinize protocol requests that transfer a single small piece of data. Often a single small piece of data isn't very useful on its own. Are there common use cases where a system will do this in a loop, perhaps serially, to get enough data to process or present to a user? If so, perhaps it would be a good idea to think of "vectorizing" that part of the protocol. Instead of returning a single piece of data, for example, return a variable-length collection of those pieces of data. The semantics of the request may change only slightly -- from "I return an X" to "I return a set of X". Ideally, the length should be dynamic and the client should be able to ask for "no more than N" on each request.
For example, imagine a protocol that requires a client to first retrieve a set of handles (say, mailboxes for a user) then query each one in turn to get some data (say, the number of unread messages). If this is something that happens often -- for example, automatically every two minutes -- there are going to be a lot of packets hitting servers. If multiple mailboxes are on one server, it would be fairly trivial to vectorize the second call and effectively combine the two queries into one -- call it "get mailbox state(s)". This would let a client retrieve the state for all mailboxes on a given server, with better latency and far less bandwidth than the first option. Of course there's no free lunch; if a client is dealing with multiple servers, it now has to group the mailboxes for each server for purposes of retrieving state. But conceptually, it's not too huge of a leap.
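Here's a sketch of the difference, with invented names (this isn't any real mail API), assuming the client has already fetched the list of mailbox handles:

    from collections import defaultdict

    def fetch_states_serial(client, mailboxes):
        # One round trip per mailbox: with ~150 ms WAN latency, 50 mailboxes
        # costs ~7.5 seconds in latency alone, every polling cycle.
        return {m: client.get_mailbox_state(m) for m in mailboxes}

    def fetch_states_vectorized(clients_by_server, mailboxes):
        # Vectorized: group handles by server, then ask each server for the
        # whole set at once -- "I return a set of X" instead of "I return an X".
        by_server = defaultdict(list)
        for m in mailboxes:
            by_server[m.server].append(m)
        states = {}
        for server, group in by_server.items():
            states.update(clients_by_server[server].get_mailbox_states(group))
        return states

The serial version is simpler; the vectorized one turns N small requests into one request per server, at the cost of the client-side grouping mentioned above.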
There are other trade-offs. If the "extra" data is large -- like a binary image -- it might well be better to download it separately, perhaps in parallel with other things. If it's cacheable, but the main data isn't, it may again be better to separate it out so you can take advantage of things like HTTP caching.
To summarize, one might want to vectorize part of a network protocol if:
- Performance is important, and network latency is high and/or variable;
- The data to be vectorized are always or often needed together in common use cases;
- It doesn't over-complexify the protocol;
- There's no way to achieve similar performance by other means (parallel requests, caching, etc.)
Sunday, July 4, 2004
Office Space
How important is the physical workspace to knowledge workers generally,
and software developers specifically? Everybody agrees it's
important. Talk to ten people, though, and you'll get nine different
opinions about what aspects are important and how much
they impact effectiveness. But there are some classic studies that shed some light on the subject; from what I can find, they haven't been refuted. At the same time, a lot of people in the software industry don't seem to have heard of them.
Take the amount and kind of workspace provided to each knowledge worker. You can quantify this (number of square feet, open/cubicle/office options). What effects should you expect from, say, changing the number of square feet per person from 80 to 64? What would this do to your current project's effort and schedule?
There's no plug-in formula for this, but based on the available data, I'd guesstimate that the effort would expand by up to 30%. Why?
"Programmer Performance and the Effects of the Workplace" describes the Coding War Games, a competition in which hundreds of developers from dozens of companies competed on identical projects. (Also described in Peopleware: Productive Projects and Teams.) The data are from the 1980s and, as far as I can tell, the study hasn't been replicated since. The developers were ranked according to how quickly they completed the projects, into the top 25%, middle 50%, and bottom 25%. The competition work was done in their normal office environments.
- The top 25% had an average of 78 square feet of dedicated office space.
- The bottom 25% had an average of 46 square feet of dedicated office space.
- The top 25% finished 2.6 times faster, on average, than the bottom 25%, with a lower defect rate.
- They ruled out the idea that top performers tended to be rewarded with larger offices.
In itself, this doesn't give us an answer to the question we started out with (changing from 80 square feet to 64 square feet per person, and bumping up the people density commensurately). As a first approximation, let's assume a linear relationship between dedicated area per person and productivity ratios. 64 is just over halfway between 46 and 78, so it seems reasonable to use half of the 2.6 factor, or 1.3, as a guesstimate. So using this number, a project that was going to take two weeks in the old environment would take 1.3 times as long, or around two and a half weeks, in the new environment. (In the long term, of course.)
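Spelled out as arithmetic (same assumption as above: the move from ~80 down to 64 square feet costs roughly half of the 2.6x spread seen in the Coding War Games data):

    # The guesstimate above, made explicit.
    schedule_multiplier = 2.6 / 2                    # "half of the 2.6 factor" = 1.3

    def new_duration(old_weeks, multiplier=schedule_multiplier):
        return old_weeks * multiplier

    print(new_duration(2.0))   # 2.6 weeks -- about two and a half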
To put this into perspective, it appears that increasing an organization's CMM level by one generally results in an 11% increase in productivity, and that the ratio of effort between worst and best real-world processes appears to be no more than 1.43.
You can't follow the numbers blindly here. This probably depends a lot on the kind of work you actually do, and I can think of dozens of caveats. My gut feeling is that the penalty is likely to be more like 10% than 30%, assuming you're really holding everything else (noise, interruptions, etc.) as constant as possible. I suspect that the organizations which are squeezing people into ice cube sized cubicles are likely to be destroying productivity in other ways as well. But, these numbers do provide some guidance as to what to expect in terms of costs and consequences of changing the workplace environment.
Links and references:
- In How office space affects programming productivity
(IEEE Computer Vol. 28 No. 1; Jan 1995, p. 76) Capers Jones gives a
guideline of at least 80 square feet of space per person, with full
walls and doors, for optimal productivity.
- The most well-documented planning exercise for knowledge worker facilities is IBM's Santa Teresa facility; a discussion is here.
- Steve McConnell gives a good overview of this and other issues in Quantifying Soft Factors (IEEE Software Vol. 17 No. 6: Nov/Dec 2000, pp. 9-11).
- T. DeMarco and T. Lister, "Programmer Performance and the Effects of the Workplace", Proc. 8th Int'l Conf. Software Eng., ACM Press, New York, 1985, pp. 268-272.
- A great anecdote: Joel Spolsky, Bionic Office. He's betting a lot of money that it's effective to equip his company with spacious, private offices.
Thursday, July 1, 2004
Community, social networks, and technology at Supernova 2004
Some afterthoughts
from the Supernova conference, specifically about social networks and
community. Though it's difficult to separate the different topics.
A quick meta-note here: Supernova is itself a social network of people and ideas, specifically about technology -- more akin to a scientific conference than an industry conference. And, it's making a lot of use of various social tools: http://www.socialtext.net/supernova/, http://supernova.typepad.com/moblog/.
Decentralized Work (Thomas Malone) sounds good, but I think there are powerful entrenched stakeholders that can work against or reverse this trend (just because it would be good doesn't mean it will happen). I'm taking a look at The Future of Work right now; one first inchoate thought is how some of the same themes are treated differently in The Innovator's Solution.
The Network is People - a panel with Christopher Allen, Esther Dyson, Ray Ozzie, and Mena Trott. Interesting/new thoughts:
- Chris Allen on spreadsheets: They are a social tool for convincing people with numbers and scenarios, just like presentation software is for convincing people with words and images. So if you consider a spreadsheet social software, well, what isn't social software?
- "43% of time is spent on grooming in large monkey troupes." (But wait, what species of monkeys are we talking about here? Where are our footnotes?) So,
the implication is that the amount of overhead involved in maintaining
true social ties in large groups is probably very high. Tools that
would actually help with this (as opposed to just growing the size of
your 'network' to ridiculous proportions) would be a true killer app.
- Size
of network is not necessarily a good metric, just one that's easy to
measure. Some people really only want a small group.
- Kevin stated that # of subscribers to a given feed follows a power law almost exactly, all the way down to 1. So even having a handful of readers is an accomplishment. One might also note that this means the vast majority of subscriptions are in this 'micropublishing' area.
- New syndication possibilities mentioned: Traffic cameras for your favorite/current route.
- The Web is like a vast library; syndicated feeds are about what's happening now (stasis vs. change). What does this mean?
- The one interesting thing to come out of the how-to-get-paid-for-this discussion: What if you could subscribe to a feed of advertising that you want to see? How much more would advertisers pay for this? (Reminds me of a discussion I heard recently about radio stations going back to actually playing more music and less talk/commercials: They actually get paid more per commercial-minute because advertisers realize their ad won't be buried in a sea of crap that nobody is listening to.)
Friday, June 25, 2004
Supernova 2004 midterm update
I'm at the Supernova 2004 conference
at the moment. I'm scribbling notes as I go, and plan to go back
and cohere the highlights into a post-conference writeup. First
impressions: Lots of smart and articulate people here, both on
the panels and in the 'audience'. I wish there were more time for
audience participation, though there is plenty of time for informal
interactions between and after sessions. The more panel-like sessions are better than the formal presentations.
The Syndication Nation panel had some good points, but it ratholed a bit on standard issues and would have benefited from a longer term/wider vision. How to pay for content is important, but it's a well trodden area. We could just give it a code name, like a chess opening, and save a lot of discussion time...
I am interested in the Autonomic Computing discussion and related topics, if for no other reason than we really need to be able to focus smart people on something other than how to handle and recover from system issues. It's addressing the technical complexity problem.
Next problem: The legal complexity problem (IP vs. IP: Intellectual Property Meets the Internet Protocol) - I think this problem is far harder because it's political. There's no good solution in sight for how to deal with the disruptions technology is causing to business models and the structure of IP law.
And, on a minor note, I learned the correct pronunciation of Esther Dyson's first name.
Sunday, June 20, 2004
Atom Proposal: Simple resource posting
On the Atom front, I've just added a proposal to the Wiki: PaceSimpleResourcePosting. The abstract is:
This proposal extends the AtomAPI to allow for a new creation URI, ResourcePostURI, to be used for simple, efficient uploading of resources referenced by a separate Atom entry. It also extends the Atom format to allow a "src" attribute of the content element to point to an external URI as an alternative to providing the content inline.
This proposal is an alternative to PaceObjectModule, PaceDontSyndicate, and PaceResource. It is almost a subset of and is compatible with PaceNonEntryResources, but differs in that it presents a very focused approach to the specific problem of efficiently uploading the parts of a compound document to form a new Atom entry. This proposal does not conflict with WebDAV but does not require that a server support WebDAV.
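As a rough sketch of the flow the proposal describes (my reading of it; the URIs, headers, and response handling below are illustrative assumptions, not part of the draft), posting a picture plus a referencing entry might look something like this:

    import urllib.request

    RESOURCE_POST_URI = "http://example.org/atom/resources"   # hypothetical
    ENTRY_POST_URI = "http://example.org/atom/entries"        # hypothetical

    # Step 1: upload the binary resource (say, a picture) by itself.
    with open("cat.jpg", "rb") as f:
        image_bytes = f.read()
    req = urllib.request.Request(RESOURCE_POST_URI, data=image_bytes, method="POST",
                                 headers={"Content-Type": "image/jpeg"})
    with urllib.request.urlopen(req) as resp:
        # Assume the server reports where the uploaded picture now lives.
        picture_uri = resp.headers["Location"]

    # Step 2: post a small Atom entry whose content element points at the
    # uploaded resource via the proposed "src" attribute, rather than carrying
    # the image bytes inline.
    entry = """<entry xmlns="http://purl.org/atom/ns#">
      <title>My cat</title>
      <content type="image/jpeg" src="{src}"/>
    </entry>""".format(src=picture_uri)
    req = urllib.request.Request(ENTRY_POST_URI, data=entry.encode("utf-8"), method="POST",
                                 headers={"Content-Type": "application/atom+xml"})
    urllib.request.urlopen(req)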
Saturday, June 5, 2004
Atom: Cat picture use case
To motivate discussion about some of the basic needs for the Atom API, I've documented a use case that I want Atom to support: Posting a Cat Picture.
This use case is primarily about simple compound text/picture entries,
which I think are going to be very common. It's complicated
enough to be interesting but it's still a basic usage.
The basic idea here is that we really want compound documents that contain both text and pictures without users needing to worry about the grungy details; that (X)HTML already offers a way to organize the top level part of this document; and that Atom should at least provide a way to create such entries in a simple way.
Friday, June 4, 2004
Who am I?
Technorati Profile
I'm currently a tech lead/manager at Google, working on Blogger engineering.
Before that, I was a system architect and technical manager for web-based products at AOL. I last managed development for Journals and Favorites Plus. I've helped launch Public & Private Groups, Polls, and Journals for AOL.
History:
Around 1991, before the whole Web thing, I began my career at a startup which intended to compete with Intuit's Quicken software on the then-new Windows 3.0 platform. This was great experience, especially in terms of what not to do[*]. In 1993 I took a semi-break from the software industry to go to graduate school at UC Santa Cruz. About this time Usenet, ftp, and email started to be augmented by the Web. I was primarily interested in machine learning, software engineering, and user interfaces rather than hypertext, though, so I ended up writing a thesis on the use of UI usability analysis in software engineering.
Subsequently, I worked for a startup that essentially attempted to do Flash before the Web really took hold, along with a few other things. We had plugins for Netscape and IE in '97. I played a variety of roles -- API designer, technical documentation manager, information designer, project manager, and development manager. In '98 the company was acquired by CA and I moved shortly thereafter to the combination of AtWeb/Netscape/AOL. (While I was talking to a startup called AtWeb, they were acquired by Netscape and Netscape was in turn acquired by AOL -- an employment trifecta.)
At AtWeb I transitioned to HTML UIs and web servers, working on web and email listserver management software before joining the AOL Community development group. I worked as a principal software engineer and then engineering manager. I've managed the engineering team for the AOL Journals product from its inception in 2003 until the present time; I've also managed the Groups@AOL, Polls, Rostering, and IM Bots projects.
What else have I been doing? I've followed and promoted the C++ standardization process and contributed a tiny amount to the Boost library effort. On a side note, I've taught courses in object-oriented programming, C++, Java, and template metaprogramming for UCSC Extension, and published two articles in the C++ Users Journal.
I'm interested in software engineering, process and agile methods, Web standards, language standards, generic programming, information architectures, user interface design, machine learning, evolution, and disruptive innovation.
First Post
The immediate purpose of this blog is to publish thoughts about web technologies, particularly Atom.
Of course that suffers from the recursive blogging-about-blogging
syndrome, so I'll probably expand it to talk about software in general.
What does the name stand for? Mostly, it stands for "something not currently indexed by Google". Hopefully in a little while it will be the only thing you get when you type "Abstractioneer" into Google. Actually, it's a contraction of "Abstract Engineering", which is a meme I'm hoping to propagate. More on that later.