Log actual LD+JSON on parsing error

Open dremeika opened this issue 7 years ago • 0 comments

The error log should also include the offending JSON, so that we can learn how to pre-process it and avoid such errors.

WARN  l.t.c.p.u.JsonLdParser - Failed to parse ld+json
com.fasterxml.jackson.core.JsonParseException: Document contains more content after json-ld element - (possible mismatched {}?)
 at [Source: java.io.StringReader@72eeb417; line: 31, column: 10]
        at com.github.jsonldjava.utils.JsonUtils.fromJsonParser(JsonUtils.java:167) ~[crawler-standalone.jar:?]
        at com.github.jsonldjava.utils.JsonUtils.fromReader(JsonUtils.java:122) ~[crawler-standalone.jar:?]
        at com.github.jsonldjava.utils.JsonUtils.fromString(JsonUtils.java:190) ~[crawler-standalone.jar:?]
        at lt.tokenmill.crawling.parser.utils.JsonLdParser.parse(JsonLdParser.java:37) [crawler-standalone.jar:?]
        at lt.tokenmill.crawling.parser.ArticleExtractor.extractArticleWithDetails(ArticleExtractor.java:35) [crawler-standalone.jar:?]
        at lt.tokenmill.crawling.parser.ArticleExtractor.extractArticle(ArticleExtractor.java:22) [crawler-standalone.jar:?]
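
One way this could look, as a minimal sketch rather than the project's actual code: wrap the parse call and put the raw input into the warning message when parsing fails. The names `LdJsonLogging`, `RawParser`, `parseOrLog`, `truncate` and `MAX_LOGGED_CHARS` are hypothetical, and `java.util.logging` is used only to keep the example dependency-free. In `JsonLdParser.parse` the parser argument would presumably be the `JsonUtils.fromString` call seen in the trace above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LdJsonLogging {
    private static final Logger LOG = Logger.getLogger(LdJsonLogging.class.getName());

    // Assumed cap so that one huge <script> block cannot flood the log.
    static final int MAX_LOGGED_CHARS = 2000;

    // Hypothetical stand-in for the real parse call; may throw checked exceptions.
    interface RawParser {
        Object parse(String raw) throws Exception;
    }

    // Returns the input unchanged if it fits, otherwise the first `max`
    // characters plus a note saying how many characters were dropped.
    static String truncate(String raw, int max) {
        if (raw == null) {
            return "null";
        }
        if (raw.length() <= max) {
            return raw;
        }
        return raw.substring(0, max) + "... [" + (raw.length() - max) + " more chars]";
    }

    // Parses `raw`; on failure logs the exception together with the
    // (possibly truncated) input and returns null.
    static Object parseOrLog(String raw, RawParser parser) {
        try {
            return parser.parse(raw);
        } catch (Exception e) {
            LOG.log(Level.WARNING,
                    "Failed to parse ld+json, input was:\n" + truncate(raw, MAX_LOGGED_CHARS), e);
            return null;
        }
    }
}
```

Truncating keeps the log readable, but the error position reported by the parser (line 31 in the trace above) may then fall outside the logged part, so the cap may need to be larger or configurable.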

dremeika, Sep 22 '17 09:09