Parse pre-fetched HTML with command-line tool
Hi,
With the command line tool, is it possible to parse custom or pre-fetched HTML by passing an HTML string to the parse function?
I want to do something like the following, but using the command line tool provided:
Mercury.parse(url, {
  html:
    '<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>',
}).then(result => console.log(result));
I tried the following but it doesn't seem to be supported:
./mercury-parser "http://example.com" --html='<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>'
Any ideas?
You can write a custom wrapper, but there seem to be some encoding problems when passing pre-fetched data into Mercury.parse.
ref:
https://github.com/ttimasdf/ArchiveBox/commit/78477dc387908677fb65c7d0f1b09edd1063d970#commitcomment-42621227