Extraction Issue

Open: vinylrichie opened this issue 4 years ago • 1 comment

Hello @kohlschuetter,

First off, I have to say, Boilerpipe is AMAZING! Thank you for your work on this.

In a few cases, I am running into an extraction issue. With the code from GitHub, there are some articles where the extracted text starts too late. For example, on https://en.wikipedia.org/wiki/New_York_City the output starts at "Further information: Police surveillance in New York City and Crime in New York City". However, when I check that same article on https://boilerpipe-web.appspot.com/, the web API always returns the full text. I've been banging my head against the wall trying to figure out what I am doing wrong, and figured I should just message the inventor. The only two explanations I can think of are: 1) I am totally missing something, or 2) the web API might be running a slightly different version. Do you know what might be going on here?

Hope you are having a great weekend!

Best, Kevin

vinylrichie, Jan 31 '21 01:01