Wednesday, April 26, 2006

Atom feed updates: Pagination

One of the hidden changes from last week's release is support for Atom pagination.  This will potentially let tools browse all of the entries in a blog, copy them, archive them, search them, etc.  Technically, this means we're supporting the link@rel="first", "last", "next", and "previous" relations.  Get the current feed and follow the "previous" links until you run out of data; then you've got all of the entries in a blog, in standard Atom format.  And, we're valid according to http://feedvalidator.org.  Let me know if you see any problems.
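
Here's a rough sketch of what "follow the previous links" could look like for a client.  It's only an illustration, not our implementation: the fetch callable (anything that takes a URL and returns the feed XML as text), the max_pages guard, and the function name are all made up for the example, and it assumes the link href values are absolute URLs.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def collect_entries(start_url, fetch, max_pages=1000):
    """Walk rel="previous" links from the current feed and gather every entry.

    `fetch` is a hypothetical callable: URL in, feed XML text out.
    """
    entries = []
    seen = set()
    url = start_url
    # Stop when there is no "previous" link, on a loop, or at the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        feed = ET.fromstring(fetch(url))
        entries.extend(feed.findall(ATOM + "entry"))
        # Only feed-level links are checked; entry-level links are ignored.
        url = next(
            (link.get("href")
             for link in feed.findall(ATOM + "link")
             if link.get("rel") == "previous"),
            None,
        )
    return entries
```

A real client would also want to resolve relative hrefs and be polite about request rates.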

Friday, April 14, 2006

Tags: Web Bumper Stickers

Our new entry tagging secret beta stealth feature might be a little difficult to see since it doesn't work on IE yet, though Joe did a great job with screen shots.  (Joe: It's not much of a stealth feature if you tell everybody about it, is it?)

Tags are just labels that you can apply to your entries; since they're public, they're kind of like electronic bumper stickers.  If you use Firefox or Mozilla, you can play with them on beta.journals.aol.com/<your screen name>.  Otherwise, well, here's a little animation:

Picture from Hometown

...and you can see the results below.  I have no idea what "stealth" is going to link to, since right now it just does a general web-wide tag search.  I think that's kind of fun, actually, but your mileage may vary.  We're looking at various ideas, including having the links go to a blog-specific search page (but perhaps with links off to the general web search to see which other people have chosen the same bumper stickers).  Also, we'll leverage the results to provide better categorization tools for your entries and blogs.  It's pretty wide open at the moment.  So if you have opinions, let us know.

Whoops. Seriously.

This week, AOL accidentally started spam blocking email with "dearaol.com" URLs embedded in the text.  And then we fixed it.  Never ascribe to malice that which is adequately explained by... um... never mind.  I know of no conspiracy, and the people running our spam filters are good folks; it's just that sometimes the software they wrangle gets a little obstreperous.

(techdirt)

Tuesday, April 11, 2006

Code, and other laws... (part 2)

In part 1 I talked about the ideal world where feeds were all clearly licensed. So now I'll turn to the real world, and I'll be very US-centric because this article is quite long enough as it is. You might want to skip to the happy fun summary at the bottom.

Millions of feeds aren't explicitly licensed.  Some can't be because their generators don't allow for it.  For others, the owner doesn't know or care about licensing.  For unlicensed feeds, it's not reasonable to make the default assumption "nothing more than fair use" because there are millions of feeds out there whose owners want their content syndicated as-is (headline feeds with links back to content, for example).  On the other hand, if you assume anything more than fair use, you also need to be prepared to handle exceptions.  So how do we do both of these in a way that minimizes overhead and lets aggregation happen without lawyers while respecting copyright?

My take is that a reasonable default is to assume the Creative Commons Attribution license when the feed owner hasn't specified otherwise.  This means that by default, we'd assume that copying of feed content is allowed as long as attribution is given through an appropriate hyperlink.  Then, provide easy ways for feed owners to declare a different license.

If a feed owner is happy with the default, they need to do nothing.  My sense is that this covers 98% of unlicensed feeds.  For the remainder, a feed owner could go to individual aggregators and tell them explicitly what license they prefer.  They can always choose a completely restrictive license that allows only fair use for the general public.  Or, they can choose a noncommercial license.  My take is that something equivalent to the current Creative Commons license chooser is sufficient.

Of course, what we'd all really prefer is for feed owners to put the licenses in their feeds directly.  That way, our AOL proxies and caches would simply pass the information along to clients, which would make appropriate decisions about what to do based on the particular license.  If we're dealing with a small number of well understood licenses, this is the easy part.

How should the feed licenses work?  There's a pretty good page with reasonable recommendations at Creative Commons on the subject.  James Snell's Feed License Link Relation works well for Atom and is pretty flexible:
<link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
The Creative Commons RSS Module works for RSS 2.0:
<creativeCommons:license>http://www.creativecommons.org/licenses/by-nc/2.5</creativeCommons:license>
Both of these work with CC and other licenses and have been deployed in real implementations.  There's an RDF version for RSS 1.0 as well (cc:license).

Finally there's the RSS 2.0 <copyright> element, which is just plain text.  But, given that some tools might allow people to put text in this field but not embed the other types of licenses, I think it's reasonable to look for a known license URL in the copyright text as well:
<copyright>The contents of this feed are licensed to the public under http://creativecommons.org/licenses/by-nc-sa/1.0/</copyright>
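
To make the lookup order above concrete, here is a minimal sketch of a processor that checks the Atom license link first, then the creativeCommons RSS module element, then a known license URL inside the RSS 2.0 copyright text.  The function name and the URL pattern are assumptions for illustration; the sketch skips the RSS 1.0 cc:license case and simply takes the first match in document order without distinguishing feed-level from entry-level licenses.

```python
import re
import xml.etree.ElementTree as ET

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"
CC_LICENSE = "{http://backend.userland.com/creativeCommonsRssModule}license"
# Assumed pattern: only Creative Commons license URLs count as "known".
CC_URL = re.compile(r"https?://(?:www\.)?creativecommons\.org/licenses/[\w./-]+")


def embedded_license(feed_xml):
    """Return the first license URL embedded in the feed, or None."""
    root = ET.fromstring(feed_xml)
    # 1. Atom: <link rel="license" href="..."/>
    for el in root.iter(ATOM_LINK):
        if el.get("rel") == "license" and el.get("href"):
            return el.get("href")
    # 2. RSS 2.0 module: <creativeCommons:license>URL</creativeCommons:license>
    for el in root.iter(CC_LICENSE):
        if el.text and el.text.strip():
            return el.text.strip()
    # 3. RSS 2.0 plain text: a known license URL inside <copyright>
    for el in root.iter("copyright"):
        match = CC_URL.search(el.text or "")
        if match:
            # Drop a sentence-ending period if one got swept up.
            return match.group(0).rstrip(".")
    return None
```

A None result is the cue to fall back to the declared-license list described next.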
If a processor can't find any of the above licenses, I'm proposing that AOL feed consumers fall back to a license based on an explicit list that AOL maintains by feed owner request.  This would be part of our feed infrastructure.  I see this working two ways.  First, we would add metadata to feeds which are requested via our feed proxies.  For Atom and RSS 2.0, the two output formats we support, this would be a namespaced extension, aol:declared-license:
<aol:declared-license>
      <link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
</aol:declared-license>
It would contain a Feed License Link Relation indicating which license the owner specified to AOL.  It could potentially contain multiple license links.  It could contain other namespaced elements in the future as well, but feed consumers can ignore ones they don't understand.

A client might also want to inquire about a feed's declared license without retrieving it.  For this, we could provide a simple REST API:
GET http://example.aol.com/declared-license/example.org/feed/atom.xml
which returns a simple XML document:
<?xml version="1.0" encoding="utf-8" ?>
<declared-license xmlns="http://example.aol.com/2006/aolfeeds">
    <link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
</declared-license>
Note that non-AOL clients could potentially make use of this; you'd just have to believe that AOL is maintaining a good declared license list (the licenses themselves are the ones the feed owners want to provide to the general public, not to AOL specifically).  We could even potentially share these lists between feed aggregators.  An embedded (original) license would always override any declared license; this would let feed owners easily start embedding their own licenses in the future.  (Should we eliminate any declared license as soon as the source feed starts licensing itself?  I think so, but our legal team would need to weigh in on that.)
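
Putting the precedence together (embedded license first, then the declared license, then the default), a client could do something like the sketch below.  Everything named here is hypothetical: the function names, the example namespace from the response above, and in particular the choice of version 2.5 of the Attribution license as the default.

```python
import xml.etree.ElementTree as ET

# Assumption: "Creative Commons Attribution" default pinned to version 2.5.
DEFAULT_LICENSE = "http://creativecommons.org/licenses/by/2.5/"
# Example namespace taken from the sample response; not a real endpoint.
DECLARED = "{http://example.aol.com/2006/aolfeeds}declared-license"


def declared_licenses(doc_xml):
    """Pull license hrefs out of a declared-license document."""
    root = ET.fromstring(doc_xml)
    if root.tag != DECLARED:
        return []
    hrefs = []
    for child in root:
        # Match on local name so the link can live in any namespace;
        # unknown child elements are ignored.
        local_name = child.tag.rsplit("}", 1)[-1]
        if local_name == "link" and child.get("rel") == "license":
            if child.get("href"):
                hrefs.append(child.get("href"))
    return hrefs


def effective_licenses(embedded, declared):
    """Embedded license wins, then declared ones, then the default."""
    if embedded:
        return [embedded]
    if declared:
        return list(declared)
    return [DEFAULT_LICENSE]
```

Here embedded would be whatever license URL the client found in the source feed itself, or None if there wasn't one.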

Finally, we'd advertise a variety of ways for feed owners to contact us and declare their licenses.  There does need to be some sort of validation step to ensure they really own the feed.  As part of the hopefully painless process we'd ask them to pick from one of the existing Creative Commons licenses.  If these aren't sufficient we can add other licenses but it's easier all around if people can agree on a small set.

How about a real world example?  Brian Alvey of Weblogs Inc. recently announced support for excerpt feeds, for example Engadget full vs. Engadget headlines.  The full Engadget feed has the copyright statement:
<copyright>Copyright 2006 Weblogs, Inc. The contents of this feed are available for non-commercial use only.</copyright>
Translating into license-speak, we'd get an Attribution-NonCommercial-NoDerivs license for the full feed, meaning no commercial exploitation, links back are required, and editing of the material is not allowed beyond fair use:
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/2.5/</creativeCommons:license>
The excerpt Engadget feed has the copyright statement:
<copyright>Copyright 2006 Blogsmith, LLC. The contents of this headlines and excerpts feed are available for limited commercial distribution. You may repost this feed to your site provided you link back to the original story, do not edit the material, and do not remove this copyright notice.</copyright>
Translating into license-speak, we'd get Attribution-NoDerivs for the excerpt feed, meaning that commercial use is OK but links back are required and the material may not be edited:
<creativeCommons:license>http://creativecommons.org/licenses/by-nd/2.5/</creativeCommons:license>
(I'm assuming here that the restriction on editing applies to the individual articles, not the feed document as a whole, since feed documents are not intended to be kept intact in any case.  This minor ambiguity goes away with Atom's Feed License Link Relation.)

So far, so good.  Having multiple versions does raise the question of how automated processors are supposed to find these feeds.  I think that's going to have to be a followup post.

That's about it.
In summary:
None of this is black or white.  I should also mention that I'm completely conflicted here, in that my company both syndicates and aggregates content and I'm directly involved on both sides.  I'm coming at this from the viewpoint of someone trying to provide online feed aggregation services where the end users subscribe to the feeds; they're not being selected or screened by editors.  In other situations other rules about default licenses might be better.  Explicit licenses are definitely best to avoid problems down the road.

Here are some other links I've stumbled across:  A basic practical primer on copyright and RSS.  One re-aggregator's viewpoint (Palfrey).  Producers' viewpoints: Shelley Powers, Om Malik (here and here).  Some legal discussion (with Wendy Seltzer, previously of the EFF, weighing in).

(Feedburner already does CC licensing following the methods outlined above, except that they're using the creativeCommons namespace extension for Atom as well as RSS 2.0; consumers should look for either one in Atom feeds.)

Tags: Creative Commons, RSS, Atom, syndication

Monday, April 10, 2006

Buddy Updates for Blog Entries

Greg of aiminfo blogs about AIM Triton release 1.2.37.2:

"Buddy Updates allow you to view changes or additions your buddies make to their away messages, message boards and profiles.  You will see a new icon next to the buddy in the buddy list when an update has happened: "

You can grab the latest AIM Triton here.  What Greg doesn't mention is that this also works for blog entries made through Journals.  So if you use the latest AIM client, you'll be notified about your buddies' latest blog posts.  If you try it out, please let me (or Susan or Joe or John) know what you think.  This only works for public blogs, the ones that you can find through AOL or Google search in any case, but it does give you an up-to-the-minute picture of what's going on with your buddies.

Oh, and we have an update for Journals going out tomorrow morning.  After it's complete, one nonobvious change is that you'll be able to see the list of Journals someone publishes by going to their screen name on Journals (for example, http://beta.journals.aol.com/panzerjohn/ will give you a list of mostly test blogs).  The page lists public blogs, plus any private blogs that you're a reader of.  (Others are invisible.)  Also, the page has a nifty search box where you can type in screen names to try to find their Journals if they have any.  Again, let us know what you think.  It's sort of a hidden feature right now in that you have to know to type in the right URL.  So feedback is welcomed!

Monday, April 3, 2006

Danah Boyd at AOL Mountain View

Danah Boyd just wrapped up a great talk about online social spaces here at AOL Mountain View (the podcast is up already).  She delivered information via firehose. Some random notes...

There were several reasons why Friendster faded, and some lessons.
  • Conflict between the user community and the space creators (they wanted a dating site, the users wanted to do a lot of other things).  Lesson: Listen to the community; be flexible; adjust the business plan when needed.
  • Servers buckled under load when it got too popular.  Lesson: The technology has to work or people will lose patience and go to the competition.
  • When Friendster started to try to go mainstream beyond the early adopter clusters, new users couldn't find any friends on the site so it wasn't useful to them.  Lesson:  Network effects work in reverse too.  Start with small clusters and grow organically.
MySpace did a big thing right: When people started 'hacking' HTML in their own spaces, the creators let it happen, then made it easier by adding the features that people actually wanted to use, like sound files (for indie bands) and videos from YouTube.  This means the community is designing the service as much as the creators are: not just contributing content, but guiding much of the design direction, along with highly visible and passionate designers engaged with the community.  Danah calls this Embedded Community Design.

Best quote: "[Teens are] immune to bouncy visual overload." They've been immunized to this by mass media. (What does this mean for advertising as a business model?)

Last week, teens used MySpace to organize mass school walkouts to protest HR 4437. That's impressive regardless of your political views.
