
Feed not found if no `<head>` start tag is present

Open j9t opened this issue 4 years ago • 23 comments

NetNewsWire doesn’t appear to find feeds if there’s no <head> start tag:

  • Failure: https://hell.meiert.org/temp/netnewswire/failure.html
  • Success: https://hell.meiert.org/temp/netnewswire/success.html

The failure case is valid HTML, however, and therefore it would be useful if NetNewsWire would not rely on <head> to locate feeds.

(Thanks @danburzo for alerting me about this.)

j9t avatar Sep 15 '21 21:09 j9t
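
As an illustration of the report above, here is a minimal Python sketch of feed discovery that does not depend on a `<head>` start tag. It is not NetNewsWire's code: `FeedLinkFinder`, `find_feed_links`, `FEED_TYPES`, and the sample page are hypothetical names and data chosen for this example, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Assumption: only RSS and Atom MIME types are treated as feeds in this sketch.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects hrefs of <link rel="alternate"> feed links wherever they appear."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link hidden>), so normalize them.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feed_links(html):
    """Return feed URLs found in the markup, with or without a <head> tag."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds


# Hypothetical page that omits the <head> start tag, as valid HTML may.
NO_HEAD_PAGE = """<!DOCTYPE html>
<title>Example</title>
<link rel=alternate type=application/rss+xml href=https://example.com/feed.xml>
<p>Hello
"""

print(find_feed_links(NO_HEAD_PAGE))
```

Because the parser reacts to every `<link>` start tag it encounters, the presence or absence of `<head>` makes no difference to the result.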

Does this occur regularly? (Just gathering info; I personally think it's worth fixing.)

Wevah avatar Oct 09 '21 03:10 Wevah

We're bailing early on the failure page because we explicitly check for a <body> in isProbablyHTML. Even without that, I don't think we search past <head>, so it looks like there are at least two places to fix.

Wevah avatar Oct 09 '21 04:10 Wevah

> Does this occur regularly? (Just gathering info; I personally think it's worth fixing.)

This should affect less than 0.5% of pages. However, omitting the tag is not just valid but also an optimization offered by tooling like html-minifier. I’d suspect it will become more common (though I’m also a bit biased).

j9t avatar Oct 09 '21 11:10 j9t

I think the isProbablyHTML part could be robustified a bit by checking for <!DOCTYPE html (case-insensitively) in the first x bytes (allowing for, e.g., a possible XML preamble). As for scanning past <head>, maybe that could be done but bail after a certain number of characters and/or if we've already found appropriate <link>s in the <head>? It's certainly an edge case, but it's possible that someone would put a <link> in at the end of a ginormous file…

Wevah avatar Oct 09 '21 17:10 Wevah
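
The heuristic suggested in the comment above could be sketched roughly as follows, in Python rather than in NetNewsWire's own code. Everything here is an assumption made for illustration: the function names, the `SNIFF_BYTES` and `MAX_SCAN_BYTES` limits, and the fallback tag check are hypothetical and do not reflect how `isProbablyHTML` is actually implemented.

```python
import re

SNIFF_BYTES = 1024           # hypothetical value for the "first x bytes"
MAX_SCAN_BYTES = 64 * 1024   # hypothetical cap on how far to look for <link>s

DOCTYPE_RE = re.compile(rb"<!doctype\s+html", re.IGNORECASE)
HTML_TAG_RE = re.compile(rb"<(html|body)[\s>]", re.IGNORECASE)
LINK_RE = re.compile(rb"<link\b[^>]*>", re.IGNORECASE)


def is_probably_html(data: bytes) -> bool:
    """Guess whether the bytes are HTML without requiring <head> or <body>."""
    # A doctype near the start is enough; searching (not matching at offset 0)
    # tolerates a leading XML preamble, BOM, or whitespace.
    if DOCTYPE_RE.search(data[:SNIFF_BYTES]):
        return True
    # Fallback for doctype-less pages: look for an <html> or <body> tag.
    return bool(HTML_TAG_RE.search(data))


def candidate_link_tags(data: bytes, limit: int = MAX_SCAN_BYTES):
    """Return raw <link> tags found within the first `limit` bytes only."""
    return [m.group(0) for m in LINK_RE.finditer(data[:limit])]
```

Capping the scan keeps the cost bounded on very large pages, at the price of missing a `<link>` placed beyond the limit, which is the trade-off the comment describes.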

Feeling free to nudge this issue—coming from an angle of this not working with valid HTML, is it something that could be fixed?

j9t avatar Nov 01 '22 13:11 j9t

I've run into this several times over the past week. Most recently with https://blog.littlepolygon.com. This has happened to me every time I've gone to add a feed, so there might be new (valid) html patterns that are spreading.

ottumm avatar Feb 03 '23 00:02 ottumm

Seems like that one has a <head> tag, too. 🤔

Wevah avatar Feb 28 '23 21:02 Wevah

I'm not sure if the cause is related, but I came to report that I keep getting 'feed not found' errors when I tried to add a feed from https://blueskiesdaily.com . I also tried and got the same error with https://blueskiesdaily.com/feed.xml , which is the direct path to the feed.

pketh avatar Mar 31 '23 18:03 pketh

https://blueskiesdaily.com/feed.xml redirects to http://blueskiesdaily.com/feed.xml — which redirects to https://blueskiesdaily.com/feed.xml — it’s a redirect loop. Server error.

brentsimmons avatar Mar 31 '23 18:03 brentsimmons

It's highly likely I'm misinterpreting something, but when I hit https://blueskiesdaily.com/feed.xml in firefox, this .rss (xml) file is downloaded by the browser (I zipped it up for github). I don't get a 500 response or any other error from the browser.

tIkuiIBk.rss.zip

pketh avatar Mar 31 '23 18:03 pketh

I misdiagnosed earlier — the actual feed URL is https://blueskiesdaily.com/feed/

@pketh Is there any chance you’re trying to add this to a Feedly account in NetNewsWire? We’ve seen a Feedly bug where that doesn’t always work. If not, what kind of account are you using?

brentsimmons avatar Mar 31 '23 19:03 brentsimmons

yes I'm indeed using Feedly. That said, I'm actually looking to migrate away from it – probably to the icloud sync. So I'm happy to do that sooner rather than later to solve my specific issue :)

pketh avatar Mar 31 '23 20:03 pketh

You can migrate a little at a time. You can add iCloud (or whatever account type) and keep using Feedly for as long as you want to.

brentsimmons avatar Mar 31 '23 20:03 brentsimmons

can confirm, https://blueskiesdaily.com/ works as expected with icloud sync. For other people that might have similar issues in the future, it'd be awesome to have the extra detail in the error message of whether the error was because either:

  1. the feed address couldn't be found in the page html,
  2. whether something went wrong with the feed service/feedly, or
  3. whether it was an error with netnewswire

From the current error message, I had wrongly assumed the problem was a problem on the NNW end (3)

pketh avatar Mar 31 '23 20:03 pketh
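
The three causes listed in the comment above could be modeled as distinct failure reasons so that each gets its own message. This is only a hypothetical Python sketch: `AddFeedFailure` and `describe_failure` are invented names, and the message wording is illustrative rather than NetNewsWire's actual error text.

```python
from enum import Enum


class AddFeedFailure(Enum):
    """Hypothetical failure reasons, mirroring the three cases in the comment."""
    FEED_NOT_IN_PAGE = 1     # (1) no feed address found in the page HTML
    SYNC_SERVICE_ERROR = 2   # (2) the feed service (e.g. Feedly) failed
    APP_ERROR = 3            # (3) an error inside the app itself


def describe_failure(reason, service_name="the sync service"):
    """Return a user-facing message that names where the failure happened."""
    if reason is AddFeedFailure.FEED_NOT_IN_PAGE:
        return "No feed address was found in the page's HTML."
    if reason is AddFeedFailure.SYNC_SERVICE_ERROR:
        return f"{service_name} reported an error while adding the feed."
    return "NetNewsWire hit an internal error while adding the feed."


print(describe_failure(AddFeedFailure.SYNC_SERVICE_ERROR, "Feedly"))
```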

> I'm not sure if the cause is related,

It doesn’t seem so, given that this issue is specifically about the <head> start tag not being present, an issue which itself still seems to persist.

@brentsimmons, is there any chance to get this fixed? It’s valid HTML, it’s a legitimate code optimization step, and it seems feed detection could work without looking for this tag.

j9t avatar Mar 31 '23 21:03 j9t

I'm not sure if this is the same issue, but I've also run into this with YouTube user page URLs (e.g. https://www.youtube.com/@unpeeled_). NNW gives an error when trying to add them (using Feedly as the back end), while adding the same URL in Feedly directly works as expected.

ottumm avatar Mar 31 '23 23:03 ottumm

@ottumm Works if adding to NetNewsWire directly instead of using a Feedly account in NetNewsWire.

Our error reporting could make it clear that it’s a Feedly issue. We should also report the bug to Feedly.

brentsimmons avatar Mar 31 '23 23:03 brentsimmons

@brentsimmons ok...do you recommend migrating off of Feedly specifically at this time, for continued use of NetNewsWire? I seem to be getting errors on most feeds I try to add, which is obviously pretty annoying.

ottumm avatar Mar 31 '23 23:03 ottumm

@ottumm I think your options are 1) when you want to add feeds, add them via the Feedly web app in your browser, or 2) when you want to add feeds, add them to an iCloud (or other kind of) account in NetNewsWire.

I don’t recommend migrating all at once. If you do want to migrate, start with just putting new feeds in a new account.

brentsimmons avatar Mar 31 '23 23:03 brentsimmons

Ok, that makes sense. I'd prefer to keep all my feeds in one place for portability, but I can work around this for now. If there's a Feedly issue to follow for this that'd be wonderful. Thanks!

ottumm avatar Mar 31 '23 23:03 ottumm

for what it's worth, I was able to export my feedly opml with 630 feeds (not sure if that's big or not), remove the feedly account from nnw, and import them into a nnw icloud account. It beachballed during the import for 20 seconds or so on my M1 mac, but everything worked great after that.

(before this issue, I'd been thinking about leaving feedly anyways because I never use the website directly, and because some of their marketing lately has been sus: https://c.im/@[email protected]/110113209095150456)

pketh avatar Apr 01 '23 00:04 pketh

Withdrawing issue report.

j9t avatar Apr 01 '23 07:04 j9t

Reopening because this needs additional verification.

brentsimmons avatar Aug 05 '24 04:08 brentsimmons