Podcasting with IPWorks

Tutorial - The Next Generation of RSS: Podcasting With IPWorks

Requirements: IPWorks .Net Edition
Podcast Demo: Download

Content Syndication

A relatively new way of consuming information is spreading across the globe, and if you're reading this article you may already be aware of it. It goes by many names, and there is some argument over what it should "officially" be called. I'm speaking of RSS. I mean Atom. I mean Feeds...newsfeeds. People use all of these terms when speaking about the syndication of online content, which was made popular by blogging (web logging). RSS and Atom are XML formats used for syndication. An aggregator is an application that brings many of these syndicated feeds together so that you can read information from many different sources in one place. This allows you to read what interests you from CNN, Netflix, the local news, knowledgeable people in your field, and any other feeds you are interested in - but much more quickly and efficiently than if you had to visit each of these websites individually with a browser.

Podcasting

What is Podcasting? Podcasting applies RSS technology to binary media distribution. It is the process of automatically feeding media (usually audio, but really it could be anything: video, images, etc.) to a device. It gets its name from casting these media feeds to iPods. "Broadcasting" to an iPod - Podcasting.

A Podcasting application monitors media feeds (RSS 2.0) for content, automatically downloads that content, and then makes that content available for copying to some mobile device (cell phone, iPod, Pocket PC). If this were audio media, it would also be nice to provide the ability to write the media to a CD for later listening in a car, for example (for those without a mobile handheld device).

In this article I'll walk through the basics of writing a simple Podcasting application in C#. I'll use the IPWorks .NET Edition, specifically the RSS component included in it to monitor a media feed and auto-download enclosures. I will cover the basics; that is, I will demonstrate downloading the feeds, parsing the feeds, finding new items, and downloading enclosures in those new items. If you'd like to see the full project, you can download it here.

The application can be expanded to take specific action depending on the enclosure type. For example:

  • Audio/Video enclosure types can be automatically copied to a mobile device for later listening/viewing, automatically copied to a CD or DVD, or automatically copied to a playlist folder that can be synched with an auto playlist in a media player.
  • Graphic enclosure types can be automatically copied to a graphic library folder for later viewing.
  • BitTorrent enclosure types (links) could be downloaded automatically using BitTorrent (a P2P application designed to "share" network bandwidth). Some Podcast producers are using BitTorrent as a transport format in order to cut the bandwidth burden of serving large audio/video files from their web servers to many consumers.
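
The type-specific handling listed above could start from a small dispatch on the enclosure's MIME type. The following is only a sketch: the EnclosureAction enum, the EnclosureRouter class, and the use of "application/x-bittorrent" as the torrent MIME type are assumptions made for illustration and are not part of the demo or of IPWorks.

```csharp
using System;

// Hypothetical follow-up actions, one per bullet above plus a default.
public enum EnclosureAction
{
    CopyToDevice,
    CopyToImageLibrary,
    HandOffToBitTorrent,
    KeepInDownloadFolder
}

public static class EnclosureRouter
{
    // Maps the value of an enclosure's "type" attribute to a follow-up action.
    public static EnclosureAction Classify(string mimeType)
    {
        string type = (mimeType ?? "").Trim().ToLowerInvariant();

        if (type.StartsWith("audio/", StringComparison.Ordinal) ||
            type.StartsWith("video/", StringComparison.Ordinal))
            return EnclosureAction.CopyToDevice;

        if (type.StartsWith("image/", StringComparison.Ordinal))
            return EnclosureAction.CopyToImageLibrary;

        // Assumed MIME type for .torrent links.
        if (type == "application/x-bittorrent")
            return EnclosureAction.HandOffToBitTorrent;

        // Unknown or missing type: leave the file where it was downloaded.
        return EnclosureAction.KeepInDownloadFolder;
    }
}
```

The application would call Classify with the enclosure@type value it already reads for each item, and then branch on the result after the download completes.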

If I were going to build a simple news aggregator, such as in the demo included in the installation of IPWorks called "RSS", I would wait for the user to click on a feed, and then download and display the feed items to the user and allow them to select which ones they want to read. When the user clicks on a news item the details of the item would be displayed.

A Podcasting application behaves differently. Instead of requiring the user to click a feed and then downloading the text items and displaying them - the Podcasting application will automatically check each feed periodically. If there are new items in the feed that contain binary enclosures, the Podcasting application will automatically download these behind the scenes, not bothering the user with this decision. Whenever the user is ready, new items will be available, and optionally automatically copied onto a mobile device of some sort.
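
One way to decide which feeds to check on each pass is to record when each feed was last checked and compare that against a polling interval. This is a minimal sketch under that assumption; the FeedSchedule class and DueFeeds method are hypothetical names and do not appear in the demo.

```csharp
using System;
using System.Collections.Generic;

public static class FeedSchedule
{
    // Returns the feed URLs whose last check is at least one interval old.
    public static List<string> DueFeeds(
        IDictionary<string, DateTime> lastChecked, TimeSpan interval, DateTime now)
    {
        List<string> due = new List<string>();
        foreach (KeyValuePair<string, DateTime> entry in lastChecked)
        {
            if (now - entry.Value >= interval)
                due.Add(entry.Key);
        }
        return due;
    }
}
```

A timer tick handler could call DueFeeds with the current time, download each feed it returns, and then update that feed's entry in the dictionary.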

We'll be porting this C# application over to Cocoa for Mac OS X users, and posting that code in this article when finished. On the Mac you can have iTunes automatically handle writing the files to a connected iPod. I'll also be porting the code to a .Net CF application. These ports will be very easy because IPWorks exists for so many platforms (including the Unix and .Net CF Editions), with the same API.

Downloading Feeds

A news aggregator and a Podcasting application both need to allow the user to "subscribe" to different feeds (URLs). The URL of a feed points to an XML file whose format depends on the feed. All Podcast feeds are in RSS 2.0 format (which allows enclosure elements). Non-Podcast news feeds can be in any of the "standard" formats, like RDF, RSS 1.0, RSS 2.0, or Atom. These files are parsed by the news aggregating or Podcasting application. If you are new to Podcasting, here are some interesting podcast feeds:

IT Conversations http://www.itconversations.com/rss/recentWithEnclosures.php
Channel 9 http://channel9.msdn.com/rss.aspx
The Slashdot Review http://slashdotreview.com/wp-rss2.php
The Scripting News http://www.scripting.com/rss.xml
Daily Source Code http://radio.weblogs.com/0001014/categories/dailySourceCode/rss.xml

I'll download these feeds with the RSS component included in IPWorks. The RSS component will automatically parse the raw XML data into the individual items contained in the feed.

Parsing Feeds

The RSS component will parse the feeds for you - so for this step all that is required is to call the GetFeed() method of the RSS component. Before doing so, I'll configure the RSS component to use an IfModifiedSince date in the HTTP request header. This is very important, and tells the web server to only send the feed XML data if it has been modified since the specified date. This helps to preserve server bandwidth, which is valuable since many news aggregators and Podcast applications are configured to check for new items on a regular basis.

rss1.Config("IfModifiedSince=" + lastreaddate);
rss1.GetFeed(feedurl);

For each feed the original lastreaddate (the first time you retrieve the feed) is an empty string. Every time the feed is downloaded, the server sends the last-modified date in its response, and the application saves that date for use the next time that particular feed is downloaded.
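
To make the round trip concrete, here is a sketch of the same conditional request written against the standard System.Net.Http classes rather than the RSS component. It is an illustration only and says nothing about how IPWorks implements the setting; the ConditionalFetch class and its method names are hypothetical, and the tuple return type assumes a recent C# compiler.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ConditionalFetch
{
    // Builds a GET request that lets the server skip the body if the feed is unchanged.
    public static HttpRequestMessage BuildRequest(string feedUrl, DateTimeOffset? lastModified)
    {
        HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Get, feedUrl);
        if (lastModified.HasValue)
            request.Headers.IfModifiedSince = lastModified; // first fetch sends no header
        return request;
    }

    // Returns the feed XML (null on 304 Not Modified) and the date to send next time.
    public static async Task<(string Xml, DateTimeOffset? LastModified)> FetchAsync(
        HttpClient client, string feedUrl, DateTimeOffset? lastModified)
    {
        using (HttpRequestMessage request = BuildRequest(feedUrl, lastModified))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            if (response.StatusCode == HttpStatusCode.NotModified)
                return (null, lastModified); // nothing new, keep the old date

            response.EnsureSuccessStatusCode();
            string xml = await response.Content.ReadAsStringAsync();

            // Save the server's Last-Modified value, if it sent one, for the next request.
            return (xml, response.Content.Headers.LastModified ?? lastModified);
        }
    }
}
```

A null Xml result corresponds to the "feed has not changed" case: the caller has no items to process and should leave its saved date untouched.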

After the component gets the feed XML - it will automatically begin parsing it. The result will be a set of arrays of information about each item in the feed. For example, the titles of the items will be in the ItemTitle[] array. The descriptions of the items will be in the ItemDescription[] array. Other non-required (from the standpoint of RSS 2.0 at least) elements in the items will be accessible through the GetProperty method. This method is provided so that changes in the RSS specification will not result in a broken application. You can get any element you wish using the GetProperty method.

Podcast items can include an enclosure element that points to any binary data being enclosed. This element always includes at least three attributes: url (a URL to the data), length (the size of the data), and type (the MIME type of the data, e.g. "audio/mpeg"). You can use the GetProperty method to get any element or any attribute by using the syntax:

item/element@attribute
	

Where "@attribute" is optional (without it you'll get the value of the element, with it you'll get the value of the attribute). For example, in order to get the url attribute of an enclosure element from the 5th item:

string url = rss1.GetProperty("item[5]/enclosure@url");
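
For readers who want to see what such a lookup amounts to, here is a rough equivalent written with LINQ to XML instead of the RSS component. It is a sketch, not the component's implementation: the EnclosureLookup class is a hypothetical name, and returning an empty string for a missing item or attribute is an assumption chosen to match the empty-string checks used in this article's loop.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class EnclosureLookup
{
    // Looks up one attribute of the <enclosure> element in the given item.
    // itemIndex is 1-based, like item[5] in the path syntax above.
    public static string GetEnclosureAttribute(string rssXml, int itemIndex, string attribute)
    {
        XDocument doc = XDocument.Parse(rssXml);

        // RSS 2.0 elements are not in an XML namespace, so plain names work.
        XElement item = doc.Descendants("item").ElementAtOrDefault(itemIndex - 1);
        XAttribute attr = item?.Element("enclosure")?.Attribute(attribute);

        return attr?.Value ?? "";
    }
}
```

With a feed string in hand, GetEnclosureAttribute(xml, 5, "url") plays the same role as the GetProperty call shown above.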

Now I can loop through each item in the feed and check for enclosures.

for (int j = 1; j <= rss1.ItemCount; j++)
{
  string type = rss1.GetProperty("item[" + j.ToString() + "]/enclosure@type");
  string url = rss1.GetProperty("item[" + j.ToString() + "]/enclosure@url");
  string bytes = rss1.GetProperty("item[" + j.ToString() + "]/enclosure@length");

  //if there is an enclosure url, remember it
  if ((!url.Equals("")) && (!enclosures.Contains(url))) enclosures.Add(url);

  //and if the url is new (never been downloaded before), download it:
  if (!url.Equals("") && isNew(feedname, url))
  {
    //expected size of the download (the enclosure's length attribute)
    totalbytes = Convert.ToDecimal(bytes);
    //DownloadEnclosure takes the feed name and the enclosure url
    DownloadEnclosure(feedname, url);
  }
}

Finding New Items

Note that in the code above the decision to download a particular enclosure is made based on whether or not there is an enclosure URL and whether or not that URL is new. "isNew" is a Boolean function to check a "history" of downloads. If the URL has been downloaded already, there is no need to download it again. You can maintain your history any way you wish, but I'll quickly explain the way that I've dealt with it.

I've used a set of global ArrayLists to keep track of several pieces of management data in my application:

private ArrayList history = new ArrayList();
private ArrayList enclosures = new ArrayList();
private ArrayList donotclean = new ArrayList();
private ArrayList subscriptions = new ArrayList();
private ArrayList bookmarks = new ArrayList();

The history ArrayList is synched with a data file at run time. This contains a list of all of the enclosure URLs that have been downloaded in the past.
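
A simple way to keep the history in step with a data file is to store one entry per line. The sketch below assumes that file format; the HistoryFile class and its Load and Save methods are hypothetical and may differ from what the demo project does.

```csharp
using System;
using System.Collections;
using System.IO;

public static class HistoryFile
{
    // Reads one history entry per line; a missing file yields an empty history.
    public static ArrayList Load(string path)
    {
        ArrayList history = new ArrayList();
        if (File.Exists(path))
        {
            foreach (string line in File.ReadAllLines(path))
            {
                if (line.Length > 0) history.Add(line);
            }
        }
        return history;
    }

    // Writes the whole history back, one entry per line.
    public static void Save(string path, ArrayList history)
    {
        string[] lines = (string[])history.ToArray(typeof(string));
        File.WriteAllLines(path, lines);
    }
}
```

Load would run once at startup and Save when the application closes or after each batch of downloads.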

The history data file is kept "clean" by removing items that are no longer even available in a particular feed. This keeps the history file from growing indefinitely and keeps it at a stable size. I do this by keeping a list of all enclosure urls found in each feed. All of these URLs get saved in the enclosures ArrayList. When I save the history file, I compare the array list of all current enclosures (the enclosures ArrayList) with the ArrayList of the saved history. Anything that is in the history list that is no longer available in the feeds can be deleted from the history.

The donotclean ArrayList is used to list feeds whose items should not be deleted from the history. This is necessary in order to support the IfModifiedSince HTTP request header, which is very important. If the feed has not changed since the IfModifiedSince date, no items will be returned (and so they will not be in the enclosures ArrayList), but we do not want to clean them from the history ArrayList.
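
The cleaning rule described above can be sketched as follows. It assumes history entries have the form "feedname|url" (as the isNew function in this article stores them), that enclosures holds bare URLs, and that donotclean holds feed names; the HistoryCleaner class itself is a hypothetical helper, not code from the demo.

```csharp
using System;
using System.Collections;

public static class HistoryCleaner
{
    // Returns the history entries that should survive a cleaning pass.
    public static ArrayList Clean(ArrayList history, ArrayList enclosures, ArrayList donotclean)
    {
        ArrayList kept = new ArrayList();
        foreach (string entry in history)
        {
            int sep = entry.IndexOf('|');
            if (sep < 0)
            {
                kept.Add(entry); // unexpected format: keep it rather than lose data
                continue;
            }

            string feed = entry.Substring(0, sep);
            string url = entry.Substring(sep + 1);

            // Keep the entry if its feed returned nothing this time (not modified)
            // or if the URL is still listed in a current feed.
            if (donotclean.Contains(feed) || enclosures.Contains(url))
                kept.Add(entry);
        }
        return kept;
    }
}
```

Splitting on the first "|" means a feed name containing that character would be misread, so a real application may want a different separator or a small record type.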

The subscriptions ArrayList is used to keep track of all of the feeds the application is currently subscribed to. The subscriptions ArrayList also keeps track of the lastmodified date for each feed. This is also synched with a data file at runtime.

My application uses the Windows Media Player ActiveX object to allow listening to or viewing of downloaded items. The bookmarks ArrayList is used to keep track of any bookmarks that have been saved in the application. I've implemented bookmarks as a way to pick up listening to a particular feed item (audio or video) at the place where you left off at some previous time. The bookmarks are also synched with a data file.

The history, donotclean, and enclosures ArrayLists form the basis of my structure for determining whether an item is new. I can just check whether the enclosure URL (stored together with its feed name) is contained in the history ArrayList. If it is not, I know it is new. Note that this assumes every enclosure has a unique URL. The RSS specification does not require a unique identifier for each item, nor a permanent URL (permaLink). The isNew function does this history checking for me:

private bool isNew(string feed, string url)
{
	//if it's already there, it's not new, move on...
	if (history.Contains(feed + "|" + url)) return false;
	//otherwise, it's NOT here, so add it
	history.Add(feed + "|" + url);
	return true;
}

If the item is new, the isNew function adds it to the history ArrayList. When isNew returns true, the DownloadEnclosure function is called to download the new item.

Downloading Enclosures

The DownloadEnclosure method simply downloads the enclosure file. This is done by setting LocalFile on the RSS component (through its Config method in the code below) to the name of the local file in which the URL data should be saved, and then calling the GetURL method. I also use the Transfer event, which fires periodically during the transfer process, to implement a progress bar for the download.

private void DownloadEnclosure(string feedname, string url)
{
	//the fileizeURL function converts a URL to a suitable filename
	filename = fileizeURL(url);
	
	//All downloaded items are saved in a subdirectory of a main download path.
	//The subdirectory name is the name of the feed:
	System.IO.Directory.CreateDirectory(txtPath.Text + feedname);
	
	rss1.Config("Localfile=" + txtPath.Text + feedname + "\\" + filename);
	
	rss1.GetURL(url);
}
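
The fileizeURL helper is not shown in this article. The following is one possible sketch of such a function, not the demo's actual implementation: it takes the last path segment of the URL, drops any query string or fragment, and replaces characters that Windows does not allow in file names. The class name, the fallback name "enclosure.bin", and the replacement character are all assumptions.

```csharp
using System;
using System.Text;

public static class UrlFileNames
{
    // Characters that are not allowed in Windows file names.
    private const string Invalid = "\\/:*?\"<>|";

    // Converts a URL to a file name based on its last path segment.
    public static string FileizeUrl(string url)
    {
        string path = url ?? "";

        // Drop the query string and fragment, if any.
        int cut = path.IndexOfAny(new char[] { '?', '#' });
        if (cut >= 0) path = path.Substring(0, cut);

        // Keep what follows the last slash and decode escapes such as %20.
        string name = Uri.UnescapeDataString(path.Substring(path.LastIndexOf('/') + 1));

        StringBuilder safe = new StringBuilder();
        foreach (char c in name)
        {
            bool bad = c < 32 || Invalid.IndexOf(c) >= 0;
            safe.Append(bad ? '_' : c);
        }

        string result = safe.ToString().Trim();
        return result.Length > 0 ? result : "enclosure.bin"; // assumed fallback name
    }
}
```

Two enclosures in the same feed that end in the same file name would overwrite each other with this approach, so a fuller version might add a hash of the URL or a counter to the name.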

Although Podcasting is new at the time of this writing, its use is growing rapidly. Available Podcasts to subscribe to are multiplying every day. There are tools available to encourage this growth - aggregators, Podcast clients (like the one here), Podcast directories, etc. As new and improved software is introduced, the technology will continue to grow and more consumers will take advantage of it.

Using the powerful IPWorks toolkit, you can develop all kinds of innovative and productive applications with relative ease. I hope this sample will prove useful, and encourage your own contribution. Please feel free to contact me with any comments, ideas, or improvements that you would like to contribute. Again, you can download this Podcasting sample application here.


We appreciate your feedback.  If you have any questions, comments, or suggestions about this article please contact our support team at kb@nsoftware.com.