Provide RSS feed history metadata

Open jameysharp opened this issue 5 years ago • 0 comments

Is your feature request related to a problem? Please describe.

RSS feed consumers who want to read further back than the most recent items_per_page entries currently can't, except perhaps with Known-specific hacks.

Describe the solution you'd like

I'd like Known to implement RFC 5005, "Feed Paging and Archiving".

When the total number of entries is at most items_per_page, so the whole history fits within one feed document, you can implement section 2 just by adding <fh:complete xmlns:fh="http://purl.org/syndication/history/1.0"/> in the <channel> section. (Or declare the namespace on the <rss> tag along with all the others, if you prefer.)
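As a rough illustration of the section 2 case, here is a minimal Python sketch (not Known's actual PHP template code) that emits the fh:complete marker only when the whole history fits in one document. The function name build_feed, its parameters, and the (title, guid) entry shape are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the issue text above.
FH_NS = "http://purl.org/syndication/history/1.0"
ET.register_namespace("fh", FH_NS)


def build_feed(title, link, entries, items_per_page):
    """Return an RSS 2.0 document as a string.

    `entries` is a hypothetical list of (title, guid) tuples, newest first.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title

    # Only claim completeness when every entry fits in this one document.
    if len(entries) <= items_per_page:
        ET.SubElement(channel, "{%s}complete" % FH_NS)

    for entry_title, guid in entries[:items_per_page]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry_title
        ET.SubElement(item, "guid").text = guid

    return ET.tostring(rss, encoding="unicode")
```

ElementTree declares the fh namespace on the root <rss> element, which matches the parenthetical alternative mentioned above.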

Paginated archives are trickier, because section 4 requires that archive documents should not change once they've been published. That's a useful requirement for two reasons:

  • Stability allows consumers to aggressively cache the archives: the only page they need to revalidate is the main feed document.
  • Stability also allows consumers to aggressively discard cache contents: they can keep minimal metadata such as entry title, publication date, and GUID, and if someone wants to see the full contents of an old entry, the feed reader can go refetch the archive page with the expectation that the contents will still be available.

Describe alternatives you've considered

Since section 4 can be tricky to implement, you could instead implement section 3. If a feed has older entries available, you'd add an <atom:link rel="prev"> tag to it, passing the offset query parameter as the current offset plus items_per_page. If I'm reading the source code correctly, you might be able to implement that without touching anything outside of templates/rss/shell.tpl.php.
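The offset arithmetic for that link can be sketched as follows. This is a hedged Python illustration, not the template change itself: the helper name prev_link, the total_entries argument, and the example URLs are assumptions. The rel value is a parameter that defaults to the one named in the issue; it is worth checking against the link relation names RFC 5005 defines.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def prev_link(feed_url, offset, items_per_page, total_entries, rel="prev"):
    """Return an <atom:link> tag pointing at the next-older page, or None.

    The older page starts at the current offset plus items_per_page.
    """
    older_offset = offset + items_per_page
    if older_offset >= total_entries:
        return None  # no older entries, so no link

    parts = urlsplit(feed_url)
    # Keep every existing query parameter except a previous offset.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "offset"]
    query.append(("offset", str(older_offset)))
    href = urlunsplit(parts._replace(query=urlencode(query)))

    # Escape for use inside an XML attribute value.
    return '<atom:link rel="%s" href="%s"/>' % (rel, escape(href, quote=True))
```

The tag assumes the atom prefix is already declared on the feed's root element.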

The downside is that this style prevents any meaningful caching. A consumer has to revalidate all pages of the feed history to detect any changes. Further, if they want to look at an old entry they've evicted from cache, they have to walk the whole chain of atom:links from the beginning until they find the entry they're looking for.
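To make that cost concrete, here is a small sketch of the consumer side, assuming a hypothetical fetch callable that returns a parsed page as a dict with an "entries" list and an optional "prev" URL. It counts how many requests are needed to reach an old entry.

```python
def find_entry(first_url, guid, fetch):
    """Walk the chain of prev links from the newest page.

    Returns (entry, requests_made); entry is None if the GUID is not found.
    `fetch` is a stand-in for an HTTP request plus feed parsing.
    """
    url = first_url
    requests_made = 0
    seen = set()  # guard against a chain that loops back on itself

    while url is not None and url not in seen:
        seen.add(url)
        page = fetch(url)
        requests_made += 1
        for entry in page["entries"]:
            if entry["guid"] == guid:
                return entry, requests_made
        url = page.get("prev")

    return None, requests_made
```

Because no page can be trusted to stay the same, nothing in this loop can be skipped or served from cache: an entry on the oldest of N pages costs N requests.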

So section 3 is really only useful for consumer software where HTTP requests for navigating the paginated archives are tied directly to infinite-scroll-style user interactions. Granted, that may be good enough for the things Known is used for.

Additional context

I have a very rough prototype of an RFC 5005 feed reader at http://reader.minilop.net/, although I've only implemented sections 2 and 4, not section 3.

I can make some suggestions on how to meet section 4's stability requirements, if you're interested, and if you can answer some questions that I couldn't figure out from browsing the source code. Specifically, does Known already have code for any of these things, and if not, how much of a pain would each be to implement:

  • reliably update a post's last-modified timestamp when that post is edited in any way
  • keep a timestamped "tombstone" when a post is deleted
  • paginate on month/week/whatever boundaries instead of a fixed number of entries
  • store a change counter on each post which is set higher than any existing counter when that post or any earlier post changes
  • produce a hash of the contents of a given page of posts, and of the hashes of all older pages

Various subsets of the above are sufficient in combination; they aren't all necessary.
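The last item in the list, a hash over a page of posts and over the hashes of all older pages, could look roughly like the sketch below. It is an illustration only: the function name page_digests, the (guid, last_modified, content) post shape, and the choice of SHA-256 are assumptions, not something Known provides.

```python
import hashlib


def page_digests(pages):
    """Return one hex digest per page, with pages ordered oldest first.

    Each page is a list of (guid, last_modified, content) string tuples.
    A page's digest covers its own posts plus the digest of the page
    before it, so it changes if that page or any older page changes.
    """
    digests = []
    previous = b""  # the oldest page has no predecessor
    for page in pages:
        h = hashlib.sha256()
        h.update(previous)
        for post in page:
            for field in post:
                data = field.encode("utf-8")
                # Length prefix keeps field boundaries unambiguous.
                h.update(len(data).to_bytes(8, "big"))
                h.update(data)
        previous = h.digest()
        digests.append(h.hexdigest())
    return digests
```

Editing a post on the newest page leaves every older digest untouched, while editing a post on the oldest page changes all of them.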

jameysharp, Jun 26 '19 20:06