web-auto-extractor

Error in jsonld parse

Open floflock opened this issue 7 years ago • 12 comments

Hey there,

How can I prevent this error thrown by jsonld-parser.js? The input HTML has no JSON-LD, which is why w-a-e throws the error.

Maybe w-a-e should check whether JSON-LD is present first, because the current error message is misleading when there is no JSON-LD in the HTML, right?

Error in jsonld parse - SyntaxError: Unexpected end of JSON input
POST /detailpage 200 4729.179 ms - 678
POST /detailpage 200 5661.727 ms - 2388
Error in jsonld parse - SyntaxError: Unexpected end of JSON input
POST /detailpage 200 4616.781 ms - 664
Error in jsonld parse - SyntaxError: Unexpected end of JSON input
POST /detailpage 200 4659.136 ms - 1339
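
For context, `JSON.parse` raises a `SyntaxError` of this kind when it is handed an empty string, so an empty or missing JSON-LD block would be enough to trigger the message. A minimal sketch of the "check first" idea, assuming the script text is already available as a string (`parseJsonLdBlock` is a hypothetical helper, not part of web-auto-extractor):

```javascript
// Hypothetical helper, not part of the web-auto-extractor API.
// Returns null when there is nothing to parse instead of letting
// JSON.parse fail on empty input.
function parseJsonLdBlock (text) {
  const trimmed = (text || '').trim()
  if (trimmed === '') {
    return null // no JSON-LD content present
  }
  // Malformed (non-empty) JSON still throws a SyntaxError here.
  return JSON.parse(trimmed)
}
```

With this guard, an absent block is reported as "no data" while genuinely broken JSON still surfaces as an error.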

floflock avatar Sep 14 '18 15:09 floflock

@floflock Can you include a code snippet to reproduce this issue?

paambaati avatar Sep 14 '18 15:09 paambaati

Sure, here is the code:

const wae = require('web-auto-extractor').default

const html = '<html></html>' // from a request of my development server

wae().parse(html)

Maybe try this page: https://www.docmorris.de/melissaphosphorus-compmischung/01632860

See data here: https://search.google.com/structured-data/testing-tool/u/0/#url=https%3A%2F%2Fwww.docmorris.de%2Fmelissaphosphorus-compmischung%2F01632860

floflock avatar Sep 14 '18 15:09 floflock

@floflock This should be easy to handle by wrapping the .parse() call with a try/catch.

paambaati avatar Sep 14 '18 15:09 paambaati

OK, that's clear to me, but I thought it was conceptually wrong to throw a syntax error if there is no JSON-LD present. :)

floflock avatar Sep 14 '18 15:09 floflock

Throwing is better, so you know your input isn't in the expected format.

paambaati avatar Sep 14 '18 15:09 paambaati

Try/catch doesn't help, because the message comes from a `console.log()`, not from a thrown error:

try {
  return wae().parse(html)
} catch (e) {
  // Never reached for this case: the parser logs the message instead of throwing.
}

floflock avatar Sep 14 '18 15:09 floflock

@floflock Okay so the error isn't thrown, it is just a log statement; it should be ignorable.
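
If the log noise still matters, one possible workaround is to swap out `console.log` for the duration of the call and restore it afterwards. This is only a sketch: `withCapturedLogs` is a hypothetical helper, and `noisyParse` is a stand-in that imitates a parser which logs instead of throwing, not the library's actual code.

```javascript
// Hypothetical helper: runs fn while collecting console.log output
// instead of printing it, then restores the original console.log.
function withCapturedLogs (fn) {
  const original = console.log
  const messages = []
  console.log = (...args) => {
    messages.push(args.map(String).join(' '))
  }
  try {
    return { result: fn(), messages }
  } finally {
    console.log = original // always restore, even if fn throws
  }
}

// Stand-in for a parser that logs a parse failure rather than throwing
// (an assumption for illustration only).
function noisyParse (html) {
  try {
    JSON.parse('')
  } catch (e) {
    console.log('Error in jsonld parse - ' + e)
  }
  return {}
}

const { result, messages } = withCapturedLogs(() => noisyParse('<html></html>'))
```

The caller can then inspect `messages` and decide whether a logged failure should be treated as an error. Note that this only works for synchronous calls and hides every `console.log` made while `fn` runs.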

paambaati avatar Sep 14 '18 16:09 paambaati

FWIW I'm working on this now and expected it to throw. I do get the log message, but that's not as explicit as throwing and handling a specific case.

Alternative designs:

  • explicitly declare this operation throws -- that should feel normal to JS developers since anybody who has ever used JSON.parse has had to consider that case (whether they knew it or not)
  • populate an errors property on the instance I can inspect manually
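
A rough sketch of what those two designs could look like, using a hypothetical parser over raw JSON-LD strings (none of these names exist in web-auto-extractor):

```javascript
// Hypothetical error type so callers can handle this specific case.
class JsonLdParseError extends Error {
  constructor (message, cause) {
    super(message)
    this.name = 'JsonLdParseError'
    this.cause = cause
  }
}

// Design 1: the operation explicitly throws on the first bad block.
function parseStrict (blocks) {
  return blocks.map((text, index) => {
    try {
      return JSON.parse(text)
    } catch (e) {
      throw new JsonLdParseError(`Invalid JSON-LD in block ${index}`, e)
    }
  })
}

// Design 2: keep going and record failures on an `errors` property.
class LenientParser {
  constructor () {
    this.errors = []
  }

  parse (blocks) {
    this.errors = []
    const results = []
    blocks.forEach((text, index) => {
      try {
        results.push(JSON.parse(text))
      } catch (e) {
        this.errors.push({ index, error: e })
      }
    })
    return results
  }
}
```

The first design forces the caller to handle the failure; the second returns whatever parsed cleanly and leaves it to the caller to check `errors` afterwards.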

TheDahv avatar Dec 17 '18 20:12 TheDahv

Hi @paambaati, I trust that you're all quite busy so I don't mean to come off with any undue urgency. I'm curious to know what you would advise regarding https://github.com/indix/web-auto-extractor/pull/25

Is Indix open to contributions on this project? Given your estimation on time available to review submissions, do you recommend I work off a fork for a while?

TheDahv avatar Apr 11 '19 15:04 TheDahv

@paambaati would you be so kind as to give some short feedback on that topic? As far as I can see, @paambaati wants to add additional features. I will prepare the package for ESNext-ready syntax (classes, etc.)...

floflock avatar May 04 '19 06:05 floflock

@TheDahv @floflock First off, apologies for the silence; we've been busy for the past few months. To answer your questions, yes, please go ahead with your fork. I'll carve out some time to review it.

paambaati avatar May 04 '19 06:05 paambaati

@paambaati thanks for your feedback. I use your package in a certain project. I've made a fork and will try to improve some things.

floflock avatar May 06 '19 08:05 floflock