boilerpipe
Extraction Issue
Hello @kohlschuetter,
First off, I have to say, Boilerpipe is AMAZING! Thank you for your work on this.
In a few cases, I am running into an extraction issue. With the code from GitHub, there are some articles where the extracted text starts late. For example, on https://en.wikipedia.org/wiki/New_York_City the output starts at "Further information: Police surveillance in New York City and Crime in New York City". However, when I check that same article on https://boilerpipe-web.appspot.com/, the web API always returns the full text. I've been banging my head against the wall trying to figure out what I was doing wrong, and figured I should message the inventor. The only two explanations I could think of are: 1) I am totally missing something, or 2) the web API might be running a slightly different version. Do you know what might be going on here?
Hope you are having a great weekend!
Best, Kevin