gofeed
gofeed copied to clipboard
Allow parsing of atom tags in RSS
As needed per #151, I added parsing of atom tags in RSS.
This adds a mechanism to add a parser for certain extension namespaces. This is now only implemented for calling Atom from RSS. This avoids re-handling Atom in RSS by calling the Atom parser directly.
The implementation tries to be very agnostic: the RSS parser knows nothing about Atom. I played with the idea of removing the abstraction and couple them tighter, but the result was not that different -- so I settled with the nicer agnostic one.
The translation phase, of course, is not agnostic anymore, because it has to know what to put where.
Remarks:
- This implementation currently only supports calling the external parser on the item level. Adding parsing on channel level is trivial, the translation phase then is more involved, because currently everything expects an
*atom.Entry. - The initialisation of the inner atom parser might not be completely correct (see
atom/parser.go#ParseAsExtension), because I do not know the idea behind base and the url stack. - Due to the agnostic ways of the approach, the JSON decoder does not precisely know what to do with the extensions. Thus, for the tests, the extensions of the
actualobject take a JSON-detour (i.e., en- then decoded) to match theexpected. - The translation of atom-tags is currently only added for
authorandupdated. If more are needed, they need to be added.
Resolves #151
Hi @mmcdole, would you mind giving a feedback, whether this has any chance of getting included and/or what is needed to do so?
(I also noted, that the coverage bot avoids this PR...)
@Necoro thanks for the submission.
Can you provide a bit more context around the problem for me, just so I grok it. Do you often find atom tags within RSS feeds?
@mmcdole I use your library for the parsing part in an RSS reader. A user found the StackOverflow RSS feed to have that problem and opened an issue with me. I'm using a patched version of gofeed since my fix and cannot say how many other RSS feeds are out there with that strange structure (as they just work ;)).
Coverage increased (+0.5%) to 75.256% when pulling 5a9d204f8125960177546c9f4f83461b24c79e86 on Necoro:fix_151 into 5d81d4b54b22ccc4b0f038a0130d342bc6b88312 on mmcdole:master.