I Hate RSS

I am in the midst of writing a newsreader client that must display content from both RSS and Atom feeds. I am using ROME to parse the feeds, which I then display in my GUI. ROME handles the vast majority of the parsing nastiness; I just have to interpret and display whatever ROME gives back to me. ROME provides the least common denominator of every possible feed, which leaves a lot of room for ambiguity.

Today I discovered that for some RSS feeds, ROME tells me the content type is “text/plain”. When I dutifully attempt rendering as text/plain, I find the feeds in fact contain HTML markup like hyperlinks. When I test these exact same feeds in tools like Google Reader, I see that they treat them as HTML instead of plain text.

Other RSS feeds report their content type (again, via ROME) as text/html.

I do not know a good way to handle this. I guess I’ll just blindly assume that every RSS feed contains HTML, regardless of what the content type says.
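For what it's worth, here is a rough sketch of a slightly less blind version of that idea. It is not what my client actually does, and it deliberately avoids any ROME classes so it stands alone: I am assuming the three strings come from ROME's SyndFeed.getFeedType() (values like "rss_2.0" or "atom_1.0"), SyndContent.getType(), and SyndContent.getValue(). The class name, the method name, and the regular expression are made up for illustration. For RSS it ignores a "text/plain" label whenever the body looks like it contains tags or entities; for everything else it trusts the declared type.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper: decides whether an entry body should be rendered as HTML.
final class FeedContentSniffer {

    // Heuristic only: something shaped like a tag (<a href="...">, </p>, <br />)
    // or a character entity (&amp;, &#8217;, &#x2019;).
    private static final Pattern MARKUP = Pattern.compile(
            "</?[a-zA-Z][a-zA-Z0-9]*(\\s[^<>]*)?/?>"
            + "|&(#\\d+|#x[0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);");

    private FeedContentSniffer() {
    }

    static boolean shouldRenderAsHtml(String feedType, String declaredType, String body) {
        // A declared HTML type ("text/html", "html", "xhtml") is taken at its word.
        if (declaredType != null
                && declaredType.toLowerCase(Locale.ROOT).contains("html")) {
            return true;
        }
        // For RSS the declared type is not reliable, so look at the body itself.
        boolean isRss = feedType != null
                && feedType.toLowerCase(Locale.ROOT).startsWith("rss");
        if (isRss) {
            return body != null && MARKUP.matcher(body).find();
        }
        // Anything else (for example Atom): trust what the feed declared.
        return false;
    }
}
```

The fully blind approach would simply return true for every RSS feed. The sniffing variant only buys you correct rendering of the occasional RSS entry that really is plain text, at the cost of being fooled by text that happens to contain something tag-shaped.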

This sucks. If you ever have the opportunity to PRODUCE your own feed, PLEASE consider using Atom. And if your feed contains HTML tags, why on earth would your feed tell everyone it is text/plain?


One Response to “I Hate RSS”

cooper Says:

As a Rome developer, thanks!

The problem here is really one of RSS — there is no differentiation between entity-escaped HTML and plain text in any node, nor a certified XHTML content quality.

It does totally blow. Quite frankly, if you are using Rome Fetcher, we are already second-guessing update times vs Last-Modified vs ETags vs TTL, UTF-8 vs Windows-1251 encoding, and a THOUSAND other things that people screw up all the time. Even with Atom it is hard. Add to that (in my area) Apple and Yahoo not honoring capitalization or trailing “/”s in namespaces, and the number of workarounds in Rome grows to a truly horrendous level.

Rome does a best-fit guess well. If you need some additional code, I will gladly share some of my workarounds, but frankly, there are still just some shortcomings in the specs, and a MILLION bad feeds to try and deal with.
