boilerpipe
It's not working for a news site
What steps will reproduce the problem?
1. String content = CommonExtractors.DEFAULT_EXTRACTOR.getText(new URL("http://www.nytimes.com/2014/06/06/business/gm-ignition-switch-internal-recall-investigation-report.html?hp"));
2. System.out.println(content);
3. It prints nothing.
When I run this with the above URL, it does not extract anything. I have tried
all of the extractors, but the result is always blank.
I also tried the same URL on http://boilerpipe-web.appspot.com/, and it works fine there.
Please advise.
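One way to narrow this down is to fetch the page yourself and look at what the server actually returns before handing it to the extractor. The sketch below is only a diagnostic under assumptions: the class name HtmlFetcher, the "Mozilla/5.0" User-Agent value, and the UTF-8 decoding are illustrative choices, and nothing in this report confirms that the request headers are the cause of the blank result.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for debugging; not part of boilerpipe.
class HtmlFetcher {

    // Reads the whole stream into a String.
    // Assumption: the page is UTF-8; a real page may declare another charset.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Downloads the page with a browser-like User-Agent header.
    // Assumption: the site may answer differently to Java's default User-Agent.
    static String fetch(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HtmlFetcher <url>");
            return;
        }
        String html = fetch(new URL(args[0]));
        // If this is tiny, or a login/redirect page, the extractor has nothing to work with.
        System.out.println("Fetched " + html.length() + " characters");
        // With boilerpipe on the classpath, the HTML string can then be passed in directly:
        // String content = CommonExtractors.DEFAULT_EXTRACTOR.getText(html);
    }
}
```

If the fetched HTML looks like the full article but getText(html) is still blank, the problem is more likely in the extraction step than in the download.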
Original issue reported on code.google.com by [email protected]
on 6 Jun 2014 at 9:25
If possible, can you provide the libraries used by the online demo? It seems that the
online version is more robust than the downloaded library.
Original comment by [email protected]
on 6 Jun 2014 at 10:16