RSS and Atom

Table of Contents

Publish And Subscribe With RSS And Atom
Evolution Of RSS Standards
RSS 2.0 Feed Elements
Atom 1.0 Feed Elements
The MetaWeblog API
The Blogger API
The Atom Protocol
AtomPub in the Wild

Section 1

Publish And Subscribe With RSS And Atom

By Dave Johnson

The explosive growth of RSS and Atom feeds on the internet makes it easier than ever for your software to publish, edit, monitor and extract data from the web. That's why feeds are at the core of a host of new RESTful web services, from simple blog publishing protocols to Google's expansive GData and OpenSocial APIs. You'll find this reference card useful whether you are creating and serving feeds or subscribing to and parsing them. It lists the XML elements in the most widely used feed formats and illustrates the relationship between the variants of RSS and Atom. It also lists and explains the methods of the XML-RPC based Blogger and MetaWeblog APIs, and it provides a guide to the new REST-based Atom web publishing protocol.

RSS and Atom make it easy to read and write the web. Applications can use the Atom Publishing Protocol (RFC-5023) and the MetaWeblog API to publish any type of content to blog, wiki and CMS servers. Servers, in turn, can make any type of content available to media players, feed readers and other applications via the RSS and Atom formats.

Section 2

Evolution Of RSS Standards

Depending on whom you ask, RSS stands for RDF Site Summary, Rich Site Summary, Really Simple Syndication or just RSS. The diagram below shows the evolution of the RSS and Atom feed formats. There are two variants of RSS: Dave Winer's simple fork and the RSS-DEV group's RDF fork. Atom (RFC-4287) is the new standard feed format.

You can find the specifications for all of these formats online at the following locations:

Format	Specification
RSS 0.94	No longer available online
RSS 1.0	http://web.resource.org/rss/1.0/spec
RSS 2.0	No longer available online
RSS 2.0.1	http://blogs.law.harvard.edu/tech/rss
Atom 1.0	http://www.atomenabled.org/developers/syndication/atom-format-spec.php

Section 3

RSS 2.0 Feed Elements

The root element is <rss>. It contains one <channel> element, which in turn contains <item> elements. Dates are represented in RFC-822 format. The RSS 2.0 diagram below is broken into two parts. The first part shows the feed-level metadata under <channel>; the children of the <item> element are omitted.

Feed Diagram Key

The feed diagrams use the following notations to indicate required elements, cardinality, containment and XML attributes.

Feed Diagram

You can extend RSS by adding your own extension elements, i.e. new XML elements, as long as they are placed in their own XML namespace.

Feed Diagram

The second part of the RSS 2.0 diagram shows the item element and its children. Item content is carried in the <description> element and is represented as escaped HTML.

RSS 2.0 Feed Examples

Example of an RSS 2.0 feed with one item and a podcast.

<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/blog</link>
    <item>
      <title>Here is an item with a podcast</title>
      <!-- Item content is HTML, so its markup is escaped -->
      <description>
        My first &lt;b&gt;podcast&lt;/b&gt;.
      </description>
      <pubDate>Wed, 20 Apr 2005 13:30:45 EDT</pubDate>
      <link>http://example.com/blog/20050420?id=132</link>
      <!-- The enclosure element carries the podcast file -->
      <enclosure url="http://example.com/casts/file1.mpg"
                 type="audio/mpeg3" length="13456170"/>
    </item>
  </channel>
</rss>
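As a rough illustration of reading the structure described above, here is a minimal Python sketch that uses only the standard library. The sample feed is a trimmed copy of the example, and the function name `parse_rss` and the dictionary keys are assumptions made for this sketch, not part of any feed library.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Sample feed modeled on the example above (XML declaration omitted for brevity).
RSS_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/blog</link>
    <item>
      <title>Here is an item with a podcast</title>
      <description>My first &lt;b&gt;podcast&lt;/b&gt;.</description>
      <pubDate>Wed, 20 Apr 2005 13:30:45 EDT</pubDate>
      <link>http://example.com/blog/20050420?id=132</link>
      <enclosure url="http://example.com/casts/file1.mpg"
                 type="audio/mpeg3" length="13456170"/>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return channel metadata and a list of items as plain dictionaries."""
    root = ET.fromstring(xml_text)          # <rss>
    channel = root.find("channel")          # the single <channel> child
    items = []
    for item in channel.findall("item"):
        pub = item.findtext("pubDate")
        enclosure = item.find("enclosure")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # The XML parser unescapes the content, leaving an HTML string.
            "description": item.findtext("description"),
            # RFC-822 dates are parsed by the email utilities.
            "published": parsedate_to_datetime(pub) if pub else None,
            "enclosure_url": enclosure.get("url") if enclosure is not None else None,
        })
    return {
        "title": channel.findtext("title"),
        "link": channel.findtext("link"),
        "items": items,
    }


feed = parse_rss(RSS_SAMPLE)
for entry in feed["items"]:
    print(entry["title"], entry["published"], entry["enclosure_url"])
```

Real-world feeds are often less tidy than this sample, so production code would normally rely on a dedicated feed parser library.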
​

RSS 2.0 feeds in the wild often use extension elements instead of the standard elements of RSS. For example, this feed uses the Dublin Core <dc:date> and <dc:creator> instead of the standard <pubDate> and <author> elements.

​
<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/blog</link>
    <item>
      <title>Here is an item that uses Dublin Core</title>
      <description>Just another &lt;b&gt;blog entry&lt;/b&gt;.
      </description>
      <link>http://example.com/blog/20050420?id=133</link>
      <!-- Dublin Core extension elements replace pubDate and author -->
      <dc:date>2005-04-20T17:41:04+05:00</dc:date>
      <dc:creator>Dave Johnson</dc:creator>
    </item>
  </channel>
</rss>
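A parser that wants to cope with both styles can look for the standard element first and fall back to the Dublin Core one. The following Python sketch shows one way to do that. The helper names `item_date` and `item_author` are assumptions for illustration, and the sample is a trimmed version of the feed above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

# ElementTree addresses namespaced elements as "{namespace-uri}localname".
DC = "{http://purl.org/dc/elements/1.1/}"

DC_SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/blog</link>
    <item>
      <title>Here is an item that uses Dublin Core</title>
      <dc:date>2005-04-20T17:41:04+05:00</dc:date>
      <dc:creator>Dave Johnson</dc:creator>
    </item>
  </channel>
</rss>"""


def item_date(item):
    """Prefer the standard <pubDate>, fall back to <dc:date>."""
    pub = item.findtext("pubDate")
    if pub:
        return parsedate_to_datetime(pub)       # RFC-822 format
    dc_date = item.findtext(DC + "date")
    if dc_date:
        # W3C DateTime format; fromisoformat needs a two-digit hour offset.
        return datetime.fromisoformat(dc_date)
    return None


def item_author(item):
    """Prefer the standard <author>, fall back to <dc:creator>."""
    return item.findtext("author") or item.findtext(DC + "creator")


item = ET.fromstring(DC_SAMPLE).find("channel/item")
print(item_author(item), item_date(item))
```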
​
​

Use Feed Autodiscovery to Advertise Your Feeds

If your web application or site provides feeds, advertise those feeds by listing each with an HTML <link> element in the HTML of your web page. For each feed, you can specify a content type, title and href, as shown here:

​
<link rel="alternate"
type="application/rss+xml" title="My RSS feed"
href="http://localhost/feed.rss" />
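On the consuming side, a client can discover feeds by scanning a page for such <link> elements. Here is a minimal Python sketch using the standard library HTML parser. The class name `FeedLinkFinder` and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser

# Content types that commonly identify feeds in autodiscovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects (title, href, type) for each advertised feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        ctype = (a.get("type") or "").lower()
        if "alternate" in rels and ctype in FEED_TYPES and a.get("href"):
            self.feeds.append((a.get("title"), a["href"], ctype))


PAGE = """<html><head>
<title>My site</title>
<link rel="alternate" type="application/rss+xml"
      title="My RSS feed" href="http://localhost/feed.rss" />
<link rel="stylesheet" type="text/css" href="/site.css" />
</head><body></body></html>"""

finder = FeedLinkFinder()
finder.feed(PAGE)
print(finder.feeds)
```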
​

Section 4

Atom 1.0 Feed Elements

The root element of an Atom feed is the <feed> element, which contains metadata and a collection of <entry> elements.

ID must be a valid URI (RFC 4287 specifies an IRI); a URN is one option
There should be a self-link containing the URI location of the feed
Dates in Atom are represented in W3C DateTime format
Text constructs (indicated with <<text>> in the diagram) may contain a type attribute with a value of text for plain text, html for escaped HTML, or xhtml for XHTML. If not present, content is assumed to be text.
An author must be specified at the <feed> level or in each <entry>

Feed Diagram

Here are the elements at the Atom <entry> level. Note that the <content> element has a type attribute like that of a text construct, but it can also be set to any MIME content type, thus allowing an Atom entry to carry any type of data.
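The type attribute determines how a consumer should read an element's content. The Python sketch below shows one possible way to dispatch on it. The function name `read_construct` and the sample entry are assumptions for illustration, and handling of other MIME types is deliberately left simple.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XHTML = "{http://www.w3.org/1999/xhtml}"

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Plain title</title>
  <summary type="html">A &lt;b&gt;bold&lt;/b&gt; summary</summary>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml"><p>Hello</p></div>
  </content>
</entry>"""


def read_construct(elem):
    """Return (type, payload) for a text construct or <content> element."""
    ctype = elem.get("type", "text")        # a missing type means plain text
    if ctype == "text":
        return ("text", elem.text or "")
    if ctype == "html":
        # The XML parser has already unescaped the markup.
        return ("html", elem.text or "")
    if ctype == "xhtml":
        # The payload is the wrapping XHTML <div> element itself.
        return ("xhtml", elem.find(XHTML + "div"))
    # Any other MIME type: hand back the raw text for the caller to decode.
    return (ctype, elem.text or "")


entry = ET.fromstring(ENTRY)
title = read_construct(entry.find(ATOM + "title"))
summary = read_construct(entry.find(ATOM + "summary"))
content = read_construct(entry.find(ATOM + "content"))
print(title, summary, "".join(content[1].itertext()))
```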

Atom 1.0 Example

Example Atom 1.0 feed with XHTML content.

​
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en-us'>
  <title>Oh no, Mr. Bill</title>
  <!-- Feed-level id is required; this value is illustrative -->
  <id>http://nbc.com/sluggo/</id>
  <link href='http://nbc.com/sluggo/' />
  <link rel='self' href='http://nbc.com/sluggo/index.atom' />
  <updated>2005-04-06T20:25:05-08:00</updated>
  <author><name>Mr. Bill</name></author>
  <entry>
    <title>A post about stuff</title>
    <link href='http://nbc.com/sluggo/20050420?id=321' />
    <id>http://nbc.com/sluggo/20050420?id=321</id>
    <updated>2005-04-06T20:25:05-08:00</updated>
    <content type='xhtml'>
      <div xmlns='http://www.w3.org/1999/xhtml'>
        <!-- xhtml content -->
      </div>
    </content>
  </entry>
</feed>
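The feed-level rules listed earlier in this section can be checked mechanically. Here is a small Python sketch that does so for a trimmed sample feed. The function names `missing_required` and `self_link` are assumptions for this illustration, and the check covers only the rules mentioned in this section, not the whole specification.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "{http://www.w3.org/2005/Atom}"

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Oh no, Mr. Bill</title>
  <id>http://nbc.com/sluggo/</id>
  <link rel="self" href="http://nbc.com/sluggo/index.atom" />
  <updated>2005-04-06T20:25:05-08:00</updated>
  <author><name>Mr. Bill</name></author>
  <entry>
    <title>A post about stuff</title>
    <id>http://nbc.com/sluggo/20050420?id=321</id>
    <updated>2005-04-06T20:25:05-08:00</updated>
  </entry>
</feed>"""


def missing_required(feed):
    """List required feed-level elements that are absent."""
    problems = [name for name in ("id", "title", "updated")
                if feed.find(ATOM + name) is None]
    # An author is needed at the feed level or in every entry.
    if feed.find(ATOM + "author") is None:
        for entry in feed.findall(ATOM + "entry"):
            if entry.find(ATOM + "author") is None:
                problems.append("author")
                break
    return problems


def self_link(feed):
    """Return the href of the rel="self" link, or None."""
    for link in feed.findall(ATOM + "link"):
        if link.get("rel") == "self":
            return link.get("href")
    return None


atom_feed = ET.fromstring(ATOM_SAMPLE)
# Dates with a numeric offset parse directly; a trailing "Z" needs Python 3.11+.
updated = datetime.fromisoformat(atom_feed.findtext(ATOM + "updated"))
print(missing_required(atom_feed), self_link(atom_feed), updated)
```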
​

Validate your Feeds with feedvalidator.org

If your web application or site provides feeds, validate those feeds by using the free feed validation service at feedvalidator.org. If you're serving private or behind-the-firewall feeds, then you can download the (Python) source code for the Feed Validator and run it on your own machine.

Parsing Feeds with Java and ROME

Here's a simple example that shows how to parse and print a feed using Java and the ROME feed parser library. The example uses the SyndFeedInput class, which parses any format of RSS (0.9X, 1.0, 2.0) or Atom (0.3 or 1.0) to a SyndFeed object containing a collection of SyndEntry objects.

// Package names as used by ROME 1.0; later releases moved to com.rometools.rome
import java.io.InputStreamReader;
import java.util.Iterator;
import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;

// inputStream is an open java.io.InputStream on the feed
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new InputStreamReader(inputStream));
Iterator entries = feed.getEntries().iterator();
while (entries.hasNext()) {
    SyndEntry entry = (SyndEntry) entries.next();
    System.out.println("Title: " + entry.getTitle());
    System.out.println("Link: " + entry.getLink());
    System.out.println("Date: " + entry.getPublishedDate());
    System.out.println("Desc: " + entry.getDescription());
    System.out.println("\n");
}

ROME is covered in detail in RSS and Atom in Action, Chapter 7. For more information on ROME, visit the project's web site at http://rome.dev.java.net.
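A library such as ROME identifies the feed format for you. To illustrate the idea only, the Python sketch below guesses the format from the root element and its namespace. It is not how ROME is implemented, and the function name `detect_format` and its return labels are assumptions.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_10_NS = "http://www.w3.org/2005/Atom"
ATOM_03_NS = "http://purl.org/atom/ns#"


def detect_format(xml_text):
    """Guess the feed format from the root element."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # Winer-style RSS carries its version in an attribute.
        return "RSS " + root.get("version", "unknown")
    if root.tag == "{%s}RDF" % RDF_NS:
        # RDF-based RSS; this sketch does not tell 0.90 and 1.0 apart.
        return "RSS (RDF family)"
    if root.tag == "{%s}feed" % ATOM_10_NS:
        return "Atom 1.0"
    if root.tag == "{%s}feed" % ATOM_03_NS:
        return "Atom 0.3"
    return "unknown"


print(detect_format('<rss version="2.0"><channel/></rss>'))
print(detect_format('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
```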

Parsing Feeds with C# & Windows RSS

Here's an example that shows how to parse and print a feed using C# and the Windows RSS Platform's Feeds API.

​
using System;
using Microsoft.Feeds.Interop;  // Windows RSS Platform interop assembly

IFeedsManager fm = new FeedsManagerClass();
IFeed feed = null;
if (!fm.IsSubscribed(url)) {
    // Not subscribed yet: create the feed in the root folder, named after its URL
    IFeedFolder rootFolder = (IFeedFolder)fm.RootFolder;
    feed = (IFeed)rootFolder.CreateFeed(url, url);
} else {
    feed = (IFeed)fm.GetFeedByUrl(url);
}
// Fetch the latest items before reading them
feed.Download();
foreach (IFeedItem item in (IFeedsEnum)feed.Items) {
    Console.Out.WriteLine("item.Title: " + item.Title);
    Console.Out.WriteLine("item.pubDate:" + item.PubDate);
    Console.Out.WriteLine("item.Desc: " + item.Description);
}
​

Section 5

The MetaWeblog API

The MetaWeblog API was created by Dave Winer; it extends the Blogger API by adding six new methods to allow posting and editing blog entries with better metadata than the Blogger API and to allow uploading of media files (image, video, etc.).

List of Methods

Method Name	Parameters	Description
metaWeblog.newPost	string blogid, string username, string password, struct post, boolean publish	Creates a new post in the blog specified by blogid using the data from the post structure. The names in the post structure correspond to the names of the XML elements in an RSS <item>. Returns the string ID of the newly created post.
metaWeblog.editPost	string postid, string username, string password, struct post, boolean publish	Updates the post specified by postid using data from the post structure.
metaWeblog.getPost	string postid, string username, string password	Returns the post specified by postid as a post structure.
metaWeblog.getRecentPosts	string blogid, string username, string password, int numPosts	Returns the most recent blog posts as an array of post structures. Maximum number of posts to return is numPosts.
metaWeblog.newMediaObject	string blogid, string username, string password, struct object	Uploads an image, video, or audio file to the blog specified by blogid. The file is specified by the object structure with fields name, type, and bits. The bits field is the file data encoded as Base64 data. Returns a string, which is the URL of the uploaded file.
metaWeblog.getCategories	string blogid, string username, string password	Returns the categories available in the blog specified by blogid as a structure of structures, each structure representing a category and having members description, htmlUrl, and rssUrl.
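To make the parameter lists above concrete, the following Python sketch encodes metaWeblog.newPost and metaWeblog.newMediaObject calls as XML-RPC requests with the standard `xmlrpc.client` module. Nothing is sent over the network. The helper names, blog ID and credentials are placeholders invented for this sketch.

```python
import xmlrpc.client
from datetime import datetime


def encode_new_post(blogid, username, password, post, publish=True):
    """Build the XML-RPC request body for metaWeblog.newPost."""
    params = (blogid, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


def media_object(name, mime_type, data):
    """Build the struct expected by metaWeblog.newMediaObject."""
    # Binary values are transmitted as Base64 on the wire.
    return {"name": name, "type": mime_type, "bits": xmlrpc.client.Binary(data)}


# Member names follow the elements of an RSS <item>.
post = {
    "title": "Hello",
    "description": "My first <b>post</b>.",
    "dateCreated": datetime(2005, 4, 20, 13, 30, 45),
}
post_request = encode_new_post("blog1", "user", "secret", post)

media_request = xmlrpc.client.dumps(
    ("blog1", "user", "secret", media_object("pic.png", "image/png", b"\x89PNG")),
    methodname="metaWeblog.newMediaObject",
)
print(post_request)
```

To actually send a call, `xmlrpc.client.ServerProxy(endpoint_url).metaWeblog.newPost(blogid, username, password, post, True)` performs the same encoding and posts it to the server, where `endpoint_url` is your blog server's XML-RPC endpoint.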

Section 6

The Blogger API

The Blogger API was created in 2001 for Blogger.com and it's being replaced by the Atom protocol, but it is still an important API because it is the foundation of the widely used MetaWeblog API.

List of Methods

Method Name	Parameters	Description
blogger.newPost	string appkey, string blogid, string username, string password, string content, boolean publish	Create a new blog post in the blog specified by blogid with the content specified by content. Some servers interpret publish=true to mean publish publicly and publish=false to mean save as a private draft. Others interpret it to mean simply publish immediately. Returns a string, which is the postid of the new post.
blogger.editPost	string appkey, string postid, string username, string password, string content, boolean publish	Update the blog post specified by postid with new content.
blogger.deletePost	string appkey, string postid, string username, string password, boolean publish	Delete the blog post specified by postid and optionally republish the blog.
blogger.getRecentPosts	string appkey, string blogid, string username, string password, int numPosts	Get the most recent blog posts as an array of structures, each having members dateCreated, userid, postid, and content. Maximum number of posts to return is numPosts.
blogger.getUsersBlogs	string appkey, string username, string password	Get the specified user's blogs as an array of structures, each having members url, blogid, and blogName.
blogger.getUserInfo	string appkey, string username, string password	Get the specified user's information as a structure with members nickname, userid, url, email, lastname, firstname.
blogger.getTemplate	string appkey, string blogid, string username, string password, string type	Get the blog's template of the specified type.
blogger.setTemplate	string appkey, string blogid, string username, string password, string template, string type	Change the blog's template of the specified type. The format of blog templates varies depending on the blog server.
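As an illustration of the request and response shapes, this Python sketch encodes a blogger.getUsersBlogs call and decodes a response with the standard `xmlrpc.client` module. The response is simulated locally rather than fetched from a server, and the helper names, appkey and credentials are placeholders.

```python
import xmlrpc.client


def encode_get_users_blogs(appkey, username, password):
    """Build the XML-RPC request body for blogger.getUsersBlogs."""
    return xmlrpc.client.dumps((appkey, username, password),
                               methodname="blogger.getUsersBlogs")


def decode_users_blogs(response_xml):
    """Turn the array-of-structs response into (blogid, blogName, url) tuples."""
    (blogs,), _method = xmlrpc.client.loads(response_xml)
    return [(b["blogid"], b["blogName"], b["url"]) for b in blogs]


# Simulated server response: one array holding one struct per blog.
SAMPLE_RESPONSE = xmlrpc.client.dumps(
    ([{"url": "http://example.com/blog", "blogid": "1", "blogName": "Example Blog"}],),
    methodresponse=True,
)

blogs_request = encode_get_users_blogs("appkey", "user", "secret")
blogs = decode_users_blogs(SAMPLE_RESPONSE)
print(blogs)
```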

Making a Post With MetaWeblog API

Here is an example that uses Apache XML-RPC to post a blog entry with a title and description to a blog with a blogid, username and password. The blog server has an endpointURL.

import java.util.*;
import java.io.*;
import org.apache.xmlrpc.XmlRpcClient;

// Build the post structure; member names follow RSS <item> element names
Hashtable post = new Hashtable();
post.put("dateCreated", new Date());
if (title != null) post.put("title", title);
post.put("description", description);

// Parameters in the order metaWeblog.newPost expects them
Vector params = new Vector();
params.addElement(blogid);
params.addElement(username);
params.addElement(password);
params.addElement(post);
params.addElement(Boolean.TRUE);  // publish

// Send the call; the server returns the ID of the new post
XmlRpcClient xmlrpc = new XmlRpcClient(endpointURL);
String newEntryId =
    (String)xmlrpc.execute("metaWeblog.newPost", params);
​

Section 7

The Atom Protocol

The Atom protocol (RFC-5023) is a REST-based protocol for creating, retrieving, updating and deleting collections of objects on a server. Objects are represented as Atom entries and collections as Atom feeds.

Service Document

To find out what workspaces and collections are available on an Atom server, send an authenticated HTTP GET request to the server's end-point URI.

You'll get back an Atom service document like the one below, which includes one workspace that contains two collections: one of entries and one of images. Note that each collection has a collection URI.

​
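<?xml version="1.0" encoding='utf-8'?>
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>My blog</atom:title>
    <collection href="http://example.org/reilly/main">
      <atom:title>Entries</atom:title>
      <!-- Accepts Atom entries -->
      <accept>application/atom+xml;type=entry</accept>
    </collection>
    <collection href="http://example.org/reilly/pic">
      <atom:title>Pictures</atom:title>
      <!-- Accepts any image type -->
      <accept>image/*</accept>
    </collection>
  </workspace>
</service>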
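A client typically reads the service document to find each collection's URI and the media types it accepts. Here is a minimal Python sketch of that step. The sample document follows the RFC-5023 layout, with titles as atom:title child elements, and the function name `list_collections` is an assumption. The helper also falls back to a title attribute, in case a server uses that older form.

```python
import xml.etree.ElementTree as ET

APP = "{http://www.w3.org/2007/app}"
ATOM = "{http://www.w3.org/2005/Atom}"

SERVICE_DOC = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>My blog</atom:title>
    <collection href="http://example.org/reilly/main">
      <atom:title>Entries</atom:title>
      <accept>application/atom+xml;type=entry</accept>
    </collection>
    <collection href="http://example.org/reilly/pic">
      <atom:title>Pictures</atom:title>
      <accept>image/*</accept>
    </collection>
  </workspace>
</service>"""


def _title(elem):
    """Title from an atom:title child, else from a title attribute."""
    return elem.findtext(ATOM + "title") or elem.get("title")


def list_collections(xml_text):
    """Return (workspace, collection, href, accepted types) tuples."""
    root = ET.fromstring(xml_text)
    result = []
    for workspace in root.findall(APP + "workspace"):
        for coll in workspace.findall(APP + "collection"):
            accepts = [a.text.strip() for a in coll.findall(APP + "accept") if a.text]
            result.append((_title(workspace), _title(coll), coll.get("href"), accepts))
    return result


collections = list_collections(SERVICE_DOC)
for row in collections:
    print(row)
```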
​

Listing Collections

To retrieve the contents of a collection, send an authenticated HTTP GET request to the collection's URI.

The server will respond by sending back an Atom feed containing the first portion of the collection and a next URI, which you can use to retrieve the next portion of the collection.

​
<feed xmlns="http://www.w3.org/2005/Atom">
<link rel="next"
href="http://example.org/entries/60" />
<link rel="previous"
href="http://example.org/entries/20" />
...
<entry> ... </entry>
<entry> ... </entry>
<entry> ... </entry>
<entry> ... </entry>
...
</feed>
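Walking a whole collection means following the next link until none remains. The Python sketch below shows that loop. To stay runnable offline it takes a `fetch` callable that maps a URI to feed XML, which is an assumption of this sketch. A real client would perform an authenticated HTTP GET there. The page URIs and contents are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def iter_collection(first_uri, fetch):
    """Yield every entry in a collection, following rel="next" links."""
    uri = first_uri
    seen = set()                       # guards against link loops
    while uri and uri not in seen:
        seen.add(uri)
        feed = ET.fromstring(fetch(uri))
        for entry in feed.findall(ATOM + "entry"):
            yield entry
        # Pick up the next page, if the server advertises one.
        uri = next((link.get("href") for link in feed.findall(ATOM + "link")
                    if link.get("rel") == "next"), None)


# Two hypothetical pages standing in for server responses.
PAGES = {
    "http://example.org/entries/0": """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="next" href="http://example.org/entries/2" />
  <entry><title>one</title></entry>
  <entry><title>two</title></entry>
</feed>""",
    "http://example.org/entries/2": """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="previous" href="http://example.org/entries/0" />
  <entry><title>three</title></entry>
</feed>""",
}

titles = [entry.findtext(ATOM + "title")
          for entry in iter_collection("http://example.org/entries/0", PAGES.__getitem__)]
print(titles)
```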
​

Creating an Entry

To create a new entry within a collection, send the XML for the entry in an HTTP POST request to the collection's URI. For example, here's an entry suitable for posting to an Atom server.

​
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id></id>
<title>Atom test post title</title>
<content>Atom test post content</content>
<updated>2006-05-16T00:00:00Z</updated>
</entry>
​

The server will respond by creating an entry based on what you posted. It will fill in some blanks, such as the ID, and will return the Atom entry as it appears on the server. It will also add an edit URI, carried in the <link rel="edit"> element shown below, which you can use to retrieve, update or delete the entry.

​
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>http://example.com/blog/entry/2223</id>
<link rel="alternate" type="text/html"
href="http://example.com/blog/entry/2223" />
<link rel="edit" type="text/html"
href="http://example.com/app/blog/entry/2223" />
<title>Atom test post title</title>
<content>Atom test post content</content>
<updated>2006-05-16T00:00:00Z</updated>
</entry>
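The create step can be sketched in Python with the standard library as follows. The code builds the POST request without sending it, and then extracts the edit URI from a server response like the one above. The function names and the collection URI are assumptions for this sketch, and authentication is left out because it depends on the server.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

NEW_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id></id>
  <title>Atom test post title</title>
  <content>Atom test post content</content>
  <updated>2006-05-16T00:00:00Z</updated>
</entry>"""

SERVER_RESPONSE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.com/blog/entry/2223</id>
  <link rel="alternate" type="text/html"
        href="http://example.com/blog/entry/2223" />
  <link rel="edit" href="http://example.com/app/blog/entry/2223" />
  <title>Atom test post title</title>
  <content>Atom test post content</content>
  <updated>2006-05-16T00:00:00Z</updated>
</entry>"""


def build_create_request(collection_uri, entry_xml):
    """Prepare (but do not send) the POST that creates an entry."""
    return urllib.request.Request(
        collection_uri,
        data=entry_xml.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


def edit_uri(entry_xml):
    """Return the href of the rel="edit" link in a returned entry."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") == "edit":
            return link.get("href")
    return None


create_request = build_create_request("http://example.org/reilly/main", NEW_ENTRY)
print(create_request.get_method(), create_request.full_url)
print(edit_uri(SERVER_RESPONSE))
```

Passing the prepared request to `urllib.request.urlopen` would send it. A successful response carries the stored entry in its body.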
​

Updating an Entry

To update an entry, you first send an authenticated HTTP GET request to the entry's edit URI to get the latest copy of the entry. You then edit the entry and use an authenticated HTTP PUT to the edit URI to update it on the server.
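The edit-then-PUT half of that cycle might look like the Python sketch below. It changes the title of an entry that has already been retrieved and prepares the PUT request without sending it. The function names and URIs are assumptions for illustration, and authentication is again omitted.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialize Atom elements without a prefix (this setting is process-wide).
ET.register_namespace("", ATOM_NS)

CURRENT_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.com/blog/entry/2223</id>
  <link rel="edit" href="http://example.com/app/blog/entry/2223" />
  <title>Atom test post title</title>
  <content>Atom test post content</content>
  <updated>2006-05-16T00:00:00Z</updated>
</entry>"""


def retitle(entry_xml, new_title):
    """Return the entry XML with its <title> replaced."""
    entry = ET.fromstring(entry_xml)
    entry.find("{%s}title" % ATOM_NS).text = new_title
    return ET.tostring(entry, encoding="unicode")


def build_update_request(edit_uri, entry_xml):
    """Prepare (but do not send) the PUT that updates an entry."""
    return urllib.request.Request(
        edit_uri,
        data=entry_xml.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/atom+xml;type=entry"},
    )


updated_xml = retitle(CURRENT_ENTRY, "A better title")
update_request = build_update_request(
    "http://example.com/app/blog/entry/2223", updated_xml)
print(update_request.get_method(), update_request.full_url)
```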

Section 8

AtomPub in the Wild

The Atom protocol has been adopted widely since the specification was finalized in October 2007. Google and Microsoft have adopted it as the basis for many of their web services interfaces, for example:

Google Data (GData): more than a dozen APIs that use the Atom protocol plus extensions (http://code.google.com/apis/gdata)

OpenSocial: access and manage Social Graph data via the Atom protocol (http://code.google.com/apis/opensocial)

Microsoft Windows Live: manage photos and store gadget data with the Atom protocol (http://dev.live.com)

Microsoft ADO.Net Data Services: use the Atom protocol to access relational databases (http://astoria.mslivelabs.com)