Parse pre-fetched HTML with command-line tool

Open · acontia opened this issue 4 years ago · 1 comment

Hi,

With the command line tool, is it possible to parse custom or pre-fetched HTML by passing an HTML string to the parse function?

I want to do something like the following, but using the command line tool provided:

Mercury.parse(url, {
  html:
    '<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>',
}).then(result => console.log(result));

I tried the following but it doesn't seem to be supported:

./mercury-parser "http://example.com" --html='<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>'

Any idea?

acontia, May 24 '20 14:05

You can write a custom wrapper, but there seem to be some encoding problems when passing pre-fetched data into Mercury.parse. Ref: https://github.com/ttimasdf/ArchiveBox/commit/78477dc387908677fb65c7d0f1b09edd1063d970#commitcomment-42621227

ttimasdf, Sep 23 '20 10:09