Threading NNTP News Articles

In this article I will walk through how to create a newsgroup browser in ASP.Net. To do so, I'll use the NNTP control that is included in IPWorks. Specifically, I'll use the .Net Edition of IPWorks, since I will be creating an ASP.Net application. The API of the NNTP control is uniform across all editions of IPWorks, so don't be afraid to read on if you are developing in some other language like Delphi, C++, or Visual Basic. To download the full code of the sample ASP.Net web application, click here.

Threading

Newsgroup articles are stored on the news server in order of arrival, not in order of message thread or subject. When displaying these articles for reading, I do not want to show them in order of arrival, but in a way that makes more sense to readers: a threaded layout. In a threaded layout, replies fall under their parent message in the familiar tree view. To do this, I'll download a large number of message headers and sort them into the correct order, using the XmlDocument object in .Net to store a tree of related message IDs.

Connect to NNTP Server and Download Message Headers

To connect to the server, I just set the NewsServer property of the NNTP component and call the Connect method.

	Nntp1.NewsServer = "msnews.microsoft.com"
	Nntp1.Connect()

Now that I'm connected, the next step is to download the message details of however many messages I want to thread. For this demo, I'll just download the newest 100 articles in the group. To do this I'll use the GroupOverview method of the NNTP component, which downloads the headers (subject, from, references, message ID, etc.) of the range of articles specified in the OverviewRange property. Once the CurrentGroup property has been set, the FirstArticle and LastArticle properties hold the numbers of the first and last articles in the group, respectively. I can use these to specify the most recent 100 articles in the OverviewRange property.

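Private Sub GetThreads(ByVal group As String)
    Nntp1.NewsServer = "msnews.microsoft.com"
    'select the group to read, for example "microsoft.public.dotnet.languages.vb"
    Nntp1.CurrentGroup = group
    Nntp1.Connect()

    'LastArticle - 99 through LastArticle covers exactly 100 articles
    Dim rangeStart As Long = Nntp1.LastArticle - 99
    Dim rangeEnd As Long = Nntp1.LastArticle
    Nntp1.OverviewRange = rangeStart.ToString() + "-" + rangeEnd.ToString()

    'fires the GroupOverview event once for each article in the range
    Nntp1.GroupOverview()

    Nntp1.Disconnect()
End Sub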
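
The range arithmetic is easy to get off by one, and a small group may hold fewer than 100 articles. Below is a minimal sketch of a helper that builds the range string and never goes below the first article in the group. The name BuildOverviewRange is hypothetical and is not part of IPWorks.

```vbnet
Imports System

Module OverviewRangeSketch
    ' Hypothetical helper (not part of IPWorks): returns "start-end" covering
    ' at most `count` of the newest articles, never starting below firstArticle.
    Public Function BuildOverviewRange(ByVal firstArticle As Long, _
                                       ByVal lastArticle As Long, _
                                       ByVal count As Long) As String
        Dim startArticle As Long = Math.Max(firstArticle, lastArticle - count + 1)
        Return startArticle.ToString() & "-" & lastArticle.ToString()
    End Function
End Module
```

With this in place, the assignment could read Nntp1.OverviewRange = BuildOverviewRange(Nntp1.FirstArticle, Nntp1.LastArticle, 100).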

Creating The Tree

As a result of calling the GroupOverview method, the GroupOverview event will fire once for each message in the OverviewRange. The GroupOverview method is synchronous, so it will not return until all of the GroupOverview events have fired. The event provides me with the following details in the form of event parameters:

  • ArticleNumber - contains the number of the article within the group.
  • Subject - contains the subject of the article.
  • From - contains the email address of the article author.
  • ArticleDate - contains the date the article was posted.
  • MessageId - contains the unique message id for the article.
  • References - contains the message ids for the articles this article refers to (separated by spaces).
  • ArticleSize - contains the size of the article in bytes.
  • ArticleLines - contains the number of lines in the article.
  • OtherHeaders - contains any other article headers the news server provides for the article.

To create a threaded layout, the most important parts are the MessageId and References parameters. Every NNTP article has its own unique message ID contained in the "Message-ID" header. Just so that you know what these look like, here are five example message IDs:

<e8ZlPf8rCHA.1132@TK2MSFTNGP12>
<vARP9.2130$L61.370841@news1.west.cox.net>
<OKG$Dr8rCHA.2484@TK2MSFTNGP10>
<KV_P9.4450$9N5.440007@newsread2.prod.itd.earthlink.net>
<3E1086F3.2050805@.com>

Every NNTP article which is a reply also has a list of references contained in the "References" header. If a new message is posted (the beginning of a thread), there is no references header. Each time a reply is formed, its references header will contain all the references of the article it is in reply to (if there are any) plus the message-ID of the article it is in reply to, space separated. For example:

Tom A creates a new thread about smurfs.

From: Tom A.
Subject: This is a new thread about smurfs
Message-ID: <abc123@tomsnetwork.com>

I like Smurfs

If Sally B replies to Tom A, Sally's References header will include the references of Tom's message (none) and the Message-ID of Tom's message.

	From: Sally B.
	Subject: Re[1]: This is a new thread about smurfs
	Message-ID: <abc456@sallysnetwork.com>
	References: <abc123@tomsnetwork.com>

	>I like Smurfs

	Me too!

If John C then replies to Sally B's message, his references header will include the references of Sally B's message, plus the Message-ID of Sally's message.

		From: John C.
		Subject: Re[2]: This is a new thread about smurfs
		Message-ID: <abc789@johnsnetwork.com>
		References: <abc123@tomsnetwork.com> <abc456@sallysnetwork.com>
		
		>>I like Smurfs

		>Me too!
		
		I do not like Smurfs.

If Mike D. also replies to Tom A's message:

	
	From: Mike D.
	Subject: Re[1]: This is a new thread about smurfs
	Message-ID: <def123@mikesdomain.com>
	References: <abc123@tomsnetwork.com>
		
	>I like Smurfs

	I'm not a big fan.				

Notice that the References form a chain that can be followed to determine the branches of replies. It is these references, along with the message IDs, that I'll use to construct a tree that makes the message threads easy to traverse. Each time the GroupOverview event fires with these pieces of information, I'll insert that message ID into the tree for later retrieval.
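
The event handler shown later in this article calls a helper named GetLastReferenceID whose code is not listed here. The following is only a sketch of what such a helper might look like, not the sample project's actual code. It assumes that message IDs are stored in the tree without their angle brackets, so it strips them from the result.

```vbnet
Imports System

Module ReferencesSketch
    ' Sketch of a helper like GetLastReferenceID: returns the last message ID
    ' in a References header value, without the surrounding angle brackets
    ' (an assumption, so that it matches msgid values stored without brackets).
    Public Function GetLastReferenceId(ByVal references As String) As String
        If references Is Nothing Then Return ""
        ' A Nothing separator array makes Split break on any whitespace,
        ' which also covers headers folded with tabs or line breaks.
        Dim separators() As Char = Nothing
        Dim parts() As String = references.Split(separators, _
                                                 StringSplitOptions.RemoveEmptyEntries)
        If parts.Length = 0 Then Return ""
        ' The last entry is the direct parent; a stray comma is trimmed too.
        Return parts(parts.Length - 1).Trim("<"c, ">"c, ","c)
    End Function
End Module
```

For John C's message above, GetLastReferenceId would return abc456@sallysnetwork.com, the ID of Sally B's message.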

If the References parameter is empty, I know that I have a new message which is NOT a reply. In this case I'll simply append a new node to the root of the tree so that the tree now could be described by the XML below. For details on how to add new nodes to the XMLDocument object, check out the source code of this application, or the MSDN documentation of the XMLDocument object.

<newsgroup name="microsoft.public.dotnet.languages.vb">
	<message msgid="abc123@mynetwork.com"></message>
</newsgroup>

If the References parameter is not empty, I know that I have a new message which IS a reply. Now I scan the tree to find the message thread to which this new message belongs. To do this, I look only at the last message ID in the references list, since that is the message to which this new message is a direct reply. I traverse down the tree and search for that message ID. If it is found, I append a new node to the matched node, so the tree would now look like:

<newsgroup name="microsoft.public.dotnet.languages.vb">
	<message msgid="abc123@mynetwork.com">
		<message msgid="abc456@othernetwork.com"/>
	</message>
</newsgroup>

If I traverse the tree and do not find a match, then it's safe to assume that the new message is in reply to an old message (at least older than the newest 100 articles that I'm looking at). So I'll just start it as a new node at the root level:

<newsgroup name="microsoft.public.dotnet.languages.vb">
	<message msgid="abc123@mynetwork.com"></message>
	<message msgid="slkjdsldkjsd"></message>
</newsgroup>

After all the GroupOverview events fire, I'll have a large tree with all of the articles in it, indexed so that replies are child nodes of the replied-to message. The most difficult code of the project is done. To see the complete code, please download the sample project here.

Private Sub Nntp1_OnGroupOverview(ByVal sender As System.Object, _
    ByVal e As nsoftware.IPWorks.NntpGroupOverviewEventArgs) Handles Nntp1.OnGroupOverview
    If e.References = "" Then
        'this message has no references, it is not a reply, start a new thread:
        AddNode(e.ArticleDate, e.Subject, e.From, e.MessageId, Msgs.ChildNodes(0))
    Else
        'this message refers to an earlier message
        Dim ref As String = GetLastReferenceID(e.References)
        Dim place As System.Xml.XmlNode
        'search the whole tree, replies included, for the parent message
        place = Msgs.SelectSingleNode("//message[@msgid = '" + ref + "']")
        If (Not place Is Nothing) Then
            'add node to existing tree
            AddNode(e.ArticleDate, e.Subject, e.From, e.MessageId, place)
        Else
            'or create new thread b/c reference is too old
            AddNode(e.ArticleDate, e.Subject, e.From, e.MessageId, Msgs.ChildNodes(0))
        End If
    End If
End Sub
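
The AddNode helper called above is part of the sample project and is not listed in this article. As a rough, self-contained sketch of the same idea, the module below builds the tree with the standard System.Xml classes. The attribute names (msgid, subject, from) follow the ones the stylesheet reads; the date attribute, the InsertMessage routine, and the stripping of angle brackets are assumptions made for illustration. The parent lookup compares attribute values in code rather than building an XPath string, so that a quote character inside a message ID cannot break the query.

```vbnet
Imports System
Imports System.Xml

Module ThreadTreeSketch
    ' Hypothetical stand-in for the sample's AddNode: appends a <message>
    ' element under `parent` and stores the article details as attributes.
    Public Function AddNode(ByVal articleDate As String, ByVal subject As String, _
                            ByVal sender As String, ByVal messageId As String, _
                            ByVal parent As XmlNode) As XmlElement
        Dim node As XmlElement = parent.OwnerDocument.CreateElement("message")
        ' Assumption: IDs are stored without their angle brackets.
        node.SetAttribute("msgid", messageId.Trim("<"c, ">"c))
        node.SetAttribute("subject", subject)
        node.SetAttribute("from", sender)
        node.SetAttribute("date", articleDate)
        parent.AppendChild(node)
        Return node
    End Function

    ' Places one article: under its parent when the parent is already in the
    ' tree, otherwise at the root level as the start of a new thread.
    ' parentId is the bracket-free ID of the direct parent, or "" for none.
    Public Sub InsertMessage(ByVal doc As XmlDocument, ByVal articleDate As String, _
                             ByVal subject As String, ByVal sender As String, _
                             ByVal messageId As String, ByVal parentId As String)
        Dim place As XmlNode = Nothing
        If parentId <> "" Then
            ' Look at every message in the tree, replies included.
            For Each candidate As XmlNode In doc.SelectNodes("//message")
                If CType(candidate, XmlElement).GetAttribute("msgid") = parentId Then
                    place = candidate
                    Exit For
                End If
            Next
        End If
        ' No parent found (or none given): attach to the root element.
        If place Is Nothing Then place = doc.DocumentElement
        AddNode(articleDate, subject, sender, messageId, place)
    End Sub
End Module
```

To try it, load an empty root such as <newsgroup name='test.group'/> into an XmlDocument with LoadXml, call InsertMessage once per article in arrival order, and inspect doc.OuterXml.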

Displaying the Resulting Tree - XSL Transform

After the GroupOverview method returns, I have a populated XMLDocument object. This can be traversed programmatically and displayed in easily customizable forms using an xsl transformation.

I'll use the "XML" WebControl object to do this transformation. This object is used to display XML data in a webforms application. It has a DocumentContent property, which takes the XML data as a string, and a TransformSource property, which takes the path of an XSL file. I'll provide the DocumentContent property with the OuterXml property of the XmlDocument tree, and set the TransformSource property to an XSL file. This will allow me to customize the display of the data without having to do so programmatically.

    Private Sub DisplayThreads()
        Xml1.TransformSource = "xmlnewsthread.xsl"
        Xml1.DocumentContent = Msgs.OuterXml
    End Sub

My xmlnewsthread.xsl sheet displays the messages in a tree structure. Each child node is displayed with a left margin (in pixels) of 10 times its level in the tree. The subject and sender of each article are displayed. The XSL looks like so:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:template match="newsgroup">
<xsl:apply-templates select="message"/>
</xsl:template>

<xsl:template match="message">
  <span>
  <xsl:attribute name="style">margin-left:<xsl:value-of 
  	select="count(ancestor::*)"/>0px;</xsl:attribute> 
  	
  <a>
  	<xsl:attribute name="href">read.aspx?ID=<xsl:value-of select="@msgid"/>&amp;group=<xsl:value-of select="/newsgroup/@name"/></xsl:attribute>
  	<xsl:value-of select="@subject"/>
  </a>
  
  </span>
  
  by <span><xsl:attribute name="style">color:#008080;</xsl:attribute>
  	<xsl:value-of select="@from"/></span>
  	<br/>
  
  <xsl:apply-templates select="message"/>
</xsl:template>

</xsl:stylesheet>
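
When adjusting a stylesheet like this one, it can be handy to run the transformation outside of a web page and look at the raw output. The sketch below does that with the standard XslCompiledTransform class (available from .Net 2.0 onward). It is not what the sample project uses, and the ApplyStylesheet name is hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module TransformSketch
    ' Applies an XSL stylesheet, passed in as text, to an XmlDocument and
    ' returns the transformed output as a string.
    Public Function ApplyStylesheet(ByVal doc As XmlDocument, _
                                    ByVal xslText As String) As String
        Dim xslt As New XslCompiledTransform()
        ' Compile the stylesheet from the supplied text.
        Using reader As XmlReader = XmlReader.Create(New StringReader(xslText))
            xslt.Load(reader)
        End Using
        ' Run the transformation and capture the result in memory.
        Using output As New StringWriter()
            xslt.Transform(doc, Nothing, output)
            Return output.ToString()
        End Using
    End Function
End Module
```

Passing the message tree and the contents of xmlnewsthread.xsl (read with File.ReadAllText, for instance) returns the generated markup, which makes it easy to check the margin-left values and the links.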

Reading Articles

The XSL displays each message as a link to read.aspx with the messageID as a querystring variable called ID, and the news group as a querystring variable called group, like so:

http://myserver/xmlnewsthread/read.aspx?ID=1234&group=server.group
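
Real message IDs can contain characters such as "$", "+", "#", or "&" that have a special meaning in a URL. If the links were generated in code rather than in the stylesheet, the values could be escaped first. The following sketch shows one way to do so; BuildReadUrl is a hypothetical helper, and Uri.EscapeDataString is the standard .Net method for escaping a query string value.

```vbnet
Imports System

Module ReadLinkSketch
    ' Hypothetical helper: builds the read.aspx link for one message, escaping
    ' characters that would otherwise change the meaning of the query string.
    Public Function BuildReadUrl(ByVal messageId As String, _
                                 ByVal groupName As String) As String
        ' The link carries the ID without angle brackets; read.aspx adds them back.
        Dim bareId As String = messageId.Trim("<"c, ">"c)
        Return "read.aspx?ID=" & Uri.EscapeDataString(bareId) & _
               "&group=" & Uri.EscapeDataString(groupName)
    End Function
End Module
```

ASP.Net decodes query string values when they are read through Request, so read.aspx receives the original message ID.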

read.aspx is the webform used to display the content of a particular article. In the Page Load of this page, the querystring variables are retrieved, and the article with the specified message ID is fetched from the specified group. The FetchArticle method is used to fetch an entire article (body and headers) from the news server. Before calling the FetchArticle method, I must set the CurrentGroup to fetch from, and the CurrentArticle (a message ID) to fetch.

Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) _
	                      Handles MyBase.Load
    'fetch and display a particular message by message id
    Nntp1.NewsServer = "msnews.microsoft.com"
    Nntp1.CurrentGroup = Request("group")
    Nntp1.CurrentArticle = "<" + Request("ID") + ">"

    Nntp1.FetchArticle()

    'populate the labels and textbox on the form with the contents of the article
    lblFrom.Text = from
    lblSubject.Text = subject
    lblDate.Text = msgdate
    txtarticle.text = Nntp1.ArticleText
End Sub

The from, subject, and msgdate variables are globals that are set in the Header event of the NNTP component, which fires during the FetchArticle method execution.

Private Sub Nntp1_OnHeader(ByVal sender As System.Object, _
	ByVal e As nsoftware.IPWorks.NntpHeaderEventArgs) Handles Nntp1.OnHeader
    Select Case (e.Field)
    Case ("Subject") : subject = e.Value
    Case ("Date") : msgdate = e.Value
    Case ("From") : from = e.Value
    End Select
End Sub

This application can easily be built upon to allow the user to compose replies (including HTML, embedded images, and file attachments) and post them to the news server. The application could also be modified to connect to SSL-secured NNTP servers, using the NNTPS component in IPWorks.

We appreciate your feedback.  If you have any questions, comments, or suggestions about this article please contact our support team at kb@nsoftware.com.