Margin content in feed.xml

Open ekstroem opened this issue 8 years ago • 26 comments

Howdy.

When feed.xml is created the content is provided by

   <description>{{ post.content | xml_escape }}</description>

However, the side margin information is included in post.content, which (for footnotes and references) can make the text rather unreadable. Is it possible to get only the content corresponding to the main column (i.e., the content you can see on the smallest screen)?

ekstroem avatar Mar 06 '16 23:03 ekstroem

Hmm, I think that makes sense. I'm happy to write a pull request if you and @clayh53 will review it.

ghost avatar Mar 07 '16 07:03 ghost

I'd be happy to test that out

ekstroem avatar Mar 07 '16 07:03 ekstroem

@ekstroem please test. You can cherry-pick this commit into your repo to test the code: https://github.com/clayh53/tufte-jekyll/commit/81ce4e356e42715faf7d737923349cc8d9e256f4 Thanks!

ghost avatar Mar 07 '16 09:03 ghost

Testing it out right now, but I'm having some trouble with nokogiri. Installation worked fine (local), but running jekyll build throws:

`require': cannot load such file -- nokogiri (LoadError)

Trying to figure out why my installation isn't working.

ekstroem avatar Mar 07 '16 11:03 ekstroem

Please list the following:

  • OS and OS version
  • Ruby version
  • Jekyll version

ghost avatar Mar 07 '16 12:03 ghost

  • Mac OSX 10.9.5
  • ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin13]
  • jekyll 2.5.3

ekstroem avatar Mar 07 '16 12:03 ekstroem

Hmm, what command did you use to install the gem locally? It may be a path issue with your gems.

ghost avatar Mar 07 '16 12:03 ghost

I ran

gem install nokogiri 

which gave no errors, and when I check gem list --local I get

...
mustache (0.99.8)
netrc (0.10.3)
nokogiri (1.6.7.2, 1.6.6.2)
pango (2.2.5)
parslet (1.5.0)
...

ekstroem avatar Mar 07 '16 12:03 ekstroem

I don't think you should be able to run gem install without sudo on Mac. Sorry, I'm not sure what to suggest next.

ghost avatar Mar 07 '16 12:03 ghost

Maybe this helps? http://stackoverflow.com/questions/19643153/error-to-install-nokogiri-on-osx-10-9-maverick

ghost avatar Mar 07 '16 12:03 ghost

Possibly. I also tried installing with sudo gem install nokogiri, but that resulted in the same error. I don't get any installation errors, though; the error only appears when running jekyll. I'll dig further.

ekstroem avatar Mar 07 '16 12:03 ekstroem

Alright. Got nokogiri up and running by updating ruby and reinstalling the jekyll gem.

Here's a snippet from feed.xml where there is/was a sidemargin note.

frem til december 2015&lt;label for=&quot;sn-dst-fødselsmåned&quot; class=&quot;margin-toggle sidenote-number&quot;&gt;&lt;/label&gt;. Figuren herunder 

I think it looks great. Thanks a bunch. I'm not sure if it's worth keeping the whole <label ...> ... </label> since that won't work in an RSS reader anyway, but it doesn't do any harm, I suppose.

ekstroem avatar Mar 07 '16 18:03 ekstroem

So just to verify, the output you shared is being generated by my changes? Or was that from before you tested my changes?

ghost avatar Mar 08 '16 01:03 ghost

Generated by your changes. Could be merged directly (IMO)

ekstroem avatar Mar 08 '16 01:03 ekstroem

Hmm, that's not how it should look. I want to completely remove the HTML elements in question. I'll have a look at this.

ghost avatar Mar 08 '16 01:03 ghost

The sidenote reference ends up being in <label> ... </label> so it disappears altogether from the actual viewable text.

ekstroem avatar Mar 08 '16 01:03 ekstroem

Sure, but there's still that mess of &lt;label for=&quot;sn-dst-fødselsmåned&quot; class=&quot;margin-toggle sidenote-number&quot;&gt;&lt;/label&gt; in the feed, which shouldn't be in there.

ghost avatar Mar 08 '16 01:03 ghost

Ah, I forgot a CSS selector. I need to find and remove labels that have the classes margin-toggle sidenote-number.

ghost avatar Mar 08 '16 02:03 ghost

@ekstroem OK, I just updated my branch, and all elements of margin and sidenote content should now be fully removed from feed.xml. Please test again when you can. Thanks!

ghost avatar Mar 08 '16 02:03 ghost

Just ran the updated version of strip_margin_content.rb, and the same position in feed.xml now reads

frem til december 2015. Figuren herunder viser 

All the unnecessary html fluff is gone. Bravo! I really appreciate you spending time on writing the ruby code for this.

ekstroem avatar Mar 08 '16 08:03 ekstroem

Hurrah! Glad to do it. Thanks for testing it.

ghost avatar Mar 08 '16 08:03 ghost

Btw, I modified an existing filter for this, so credit goes to @sumdog and this gist https://gist.github.com/sumdog/99bf642024cc30f281bc

ghost avatar Mar 08 '16 08:03 ghost

Thinking further about this: removing marginnote and sidenote content completely isn't the best way to handle this. It would be better to convert all marginnote and sidenote content into footnotes for feed.xml. This way if someone is reading the site via a feed reader, they could still see the content at the bottom of each article.

Using nokogiri, you can replace a node in a document, as well as move nodes around. So to convert to footnotes perhaps something like this:

  1. For each instance of <span class='marginnote'></span> and <span class='sidenote'></span> replace that node with a super-script number and footnote id, perhaps <span class='footnote' id='footnote-1'>1</span> or something and style this as super-script.
  2. Move just the content of each node into a new footnote node located at the bottom of the page. Link this content back to the footnote id.
  3. Cleanup the page by deleting all existing instances of marginnote and sidenote nodes. This commit does exactly that: https://github.com/clayh53/tufte-jekyll/commit/8f13e2a3a6a6f182677da019ed9bc8e2bdb8ae65

This leaves you with a page of footnote numberings in the text content, and associated footnote pairings at the end of the content. Write this as a Jekyll filter, and put it right after post.content in feed.xml to process each post's content before rendering it in the feed.

This isn't something I have time for. Maybe someone else can use these notes to write the necessary plugin.
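
For anyone picking this up, the three steps above could be sketched roughly as follows. This is a minimal sketch, not the plugin from the commits linked in this thread: it swaps nokogiri for plain String#gsub with regular expressions, so it only handles notes that contain no nested <span> elements. The module name, the filter name margin_notes_to_footnotes, and the fn-N / fnref-N ids are all made up for illustration, and the label/input/span markup it expects is an assumption based on the snippet quoted earlier in the thread.

```ruby
# Hypothetical filter: turns margin notes and sidenotes into footnotes.
# Uses regular expressions rather than nokogiri, so it assumes that note
# spans never contain another <span>.
module MarginNotesToFootnotes
  # The toggle <label> and checkbox <input> that precede each note.
  TOGGLE = %r{<label\b[^>]*\bmargin-toggle\b[^>]*>.*?</label>|<input\b[^>]*\bmargin-toggle\b[^>]*>}m
  # The note itself; capture group 1 holds the note text.
  NOTE = %r{<span\b[^>]*\bclass=["'](?:sidenote|marginnote)["'][^>]*>(.*?)</span>}m

  def margin_notes_to_footnotes(html)
    notes = []
    # Steps 1 and 3: drop the toggles and swap each note span for a
    # superscript number that links to the footnote.
    body = html.gsub(TOGGLE, "").gsub(NOTE) do
      notes << Regexp.last_match(1).strip
      %(<sup id="fnref-#{notes.size}"><a href="#fn-#{notes.size}">#{notes.size}</a></sup>)
    end
    return body if notes.empty?

    # Step 2: collect the note text in a list at the end, with back-links.
    items = notes.each_with_index.map do |text, i|
      %(<li id="fn-#{i + 1}">#{text} <a href="#fnref-#{i + 1}">back</a></li>)
    end
    %(#{body}<ol class="footnotes">#{items.join}</ol>)
  end
end

# Register the filter only when running inside Jekyll/Liquid.
Liquid::Template.register_filter(MarginNotesToFootnotes) if defined?(Liquid::Template)
```

Saved in _plugins/, this would be used in feed.xml as {{ post.content | margin_notes_to_footnotes | xml_escape }}. A nokogiri version would be more robust against nested markup.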

"Is it possible to only get the content corresponding to the main column (i.e., the content you can see for the smallest screen)?"

In reply to OP: If you just want to strip out all marginnotes and sidenotes from feed.xml, cherry-pick this commit: https://github.com/clayh53/tufte-jekyll/commit/8f13e2a3a6a6f182677da019ed9bc8e2bdb8ae65

ghost avatar Mar 10 '16 14:03 ghost

It'd be fine for me to close this since I got the tweak I wished for, but I'm not sure if @clayh53 / @xHN35RQ have figured out what the best way to proceed would be.

ekstroem avatar Apr 04 '16 19:04 ekstroem

Another thing I want to investigate is whether we can tweak the jekyll rendering engine to do this for us when it creates the feed's xml file.

clayh53 avatar Apr 04 '16 23:04 clayh53

Right now the feed XML contains the generated post content, so it already contains any margin notes or sidenotes. You'd need to write a custom content renderer plugin to generate feed-specific post content where the margin notes and sidenotes are turned into footnotes. Essentially, instead of including each note directly in the post, transform it into a footnote and put the note content in a new footnote section at the bottom of the post. Again, this seems like an awful lot of work just for the feed content.

Recently I found this plugin, which uses gsub to filter out content to make it appropriate for inclusion in an RSS feed. If you just want to filter out note content entirely, this approach is probably better than my solution, which uses nokogiri: gsub is part of Ruby, so you don't need an extra gem:

https://github.com/mpc-hc/mpc-hc.org/blob/master/source/_plugins/fix_rss.rb
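
Applied to this theme, a gsub-only filter might look something like the sketch below. It is not taken from the linked plugin; the module name, the filter name strip_margin_notes, and the assumed markup (a label and an input with class margin-toggle, followed by a span with class sidenote or marginnote) are illustrative guesses based on the snippet quoted earlier in the thread.

```ruby
# Hypothetical gsub-only filter; no gems required.
module StripMarginNotes
  # Assumed markup: each note is a <label class="margin-toggle ...">,
  # an <input class="margin-toggle">, and a <span class="sidenote"> or
  # <span class="marginnote"> holding the note text.
  LABEL = %r{<label\b[^>]*\bclass=["'][^"']*\bmargin-toggle\b[^"']*["'][^>]*>.*?</label>}m
  INPUT = %r{<input\b[^>]*\bclass=["'][^"']*\bmargin-toggle\b[^"']*["'][^>]*>}m
  # Non-greedy match, so this breaks if a note contains a nested <span>.
  NOTE  = %r{<span\b[^>]*\bclass=["'](?:sidenote|marginnote)["'][^>]*>.*?</span>}m

  def strip_margin_notes(html)
    html.gsub(LABEL, "").gsub(INPUT, "").gsub(NOTE, "")
  end
end

# Register the filter only when running inside Jekyll/Liquid.
Liquid::Template.register_filter(StripMarginNotes) if defined?(Liquid::Template)
```

In feed.xml it would go before the escaping step: {{ post.content | strip_margin_notes | xml_escape }}. The trade-off is that regular expressions do not understand nesting, which is where nokogiri still has the edge.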

ghost avatar Apr 08 '16 04:04 ghost