Nick Sweeting
OK, the RSS parser is fixed. There's now a dedicated RSS parser for the Shaarli export format. Give it a try with the latest version of master.
Very strange. You can see in the output that it now says `Adding 11 new links to index from /data/sources/demo.shaarli.org-1549427254.txt (Plain Text format)`. Notice the `Plain Text format` at the end,...
Ah sorry I forgot to push it to master! It was just on my local branch. Try pulling master and uncommenting that line now.
At a conference right now and have a busy week ahead, so apologies if I don't get around to fixing this for a bit.
A redacted copy of your `/data/sources/demo.shaarli.org-1549685314.txt` would be helpful, thx.
@mawmawmawm I think I fixed it (in eff0100), pull the latest master and give it a shot. Comment if it's still broken and I'll reopen the issue.
I just ran the latest master on this sample Shaarli export you provided: https://github.com/pirate/ArchiveBox/issues/135#issuecomment-460898443 and it worked as expected (imported 4 links and parsed as Shaarli RSS format). If the...
Sorry for the delay, just fixed this @jeanregisser in 58c9b47. Pull the latest master and give it a try. Comment back here if it doesn't work and I'll reopen the...
w3.org and purl.org are expected in full-text parsing mode (which it's falling back to due to a bug) because they are linked to in the RSS even though the links...
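To illustrate why those domains show up in full-text mode, here is a minimal sketch. It is not ArchiveBox's actual parser code: the sample feed, the regex, and the function names `urls_from_plain_text` and `urls_from_rss` are all hypothetical, made up for this example. A plain-text pass picks up every URL-looking string in the file, including the XML namespace declarations, while an RSS-aware pass only reads the `<link>` of each `<item>`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed (not a real Shaarli export). The xmlns attributes
# point at purl.org and w3.org even though no bookmarked link does.
SAMPLE_RSS = (
    '<rss version="2.0" '
    'xmlns:content="http://purl.org/rss/1.0/modules/content/" '
    'xmlns:atom="http://www.w3.org/2005/Atom">'
    "<channel><title>Demo</title>"
    "<item><title>Example</title><link>https://example.com/post</link></item>"
    "</channel></rss>"
)

# Rough URL matcher for illustration only: stops at whitespace, quotes, or angle brackets.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def urls_from_plain_text(text):
    """Full-text fallback: return every URL-looking string found anywhere in the file."""
    return sorted(set(URL_RE.findall(text)))


def urls_from_rss(text):
    """RSS-aware parse: return only the <link> text of each <item>."""
    root = ET.fromstring(text)
    return [item.findtext("link") for item in root.iter("item")]


if __name__ == "__main__":
    print("plain text:", urls_from_plain_text(SAMPLE_RSS))
    print("rss parser:", urls_from_rss(SAMPLE_RSS))
```

On this sample the plain-text pass returns three URLs (the two namespace URLs plus the real link), while the RSS-aware pass returns only `https://example.com/post`.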
Try increasing the download timeout in case it's slow: `archivebox config --set TIMEOUT=180`.