node-boilerpipe
Error on fulltext extraction
I'm running a script that uses boilerpipe, and after some time I get this error:
Boilerpipe error: Error: Error running instance method
java.lang.NullPointerException
    at de.l3s.boilerpipe.filters.heuristics.SimpleBlockFusionProcessor.process(SimpleBlockFusionProcessor.java:45)
    at de.l3s.boilerpipe.extractors.DefaultExtractor.process(DefaultExtractor.java:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
I'm using this code:
getText = (html, callback) ->
  unless html
    callback(false)
  else
    boilerpipe = new Boilerpipe(
      html: html
    , (err) ->
      util.log 'Boilerpipe error: ' + err if err
    )
    boilerpipe.getText((err, text) ->
      if text?
        callback text
      else
        callback(false)
    )
My getText() function runs in a loop. Could you please suggest something?
It would also be great to have a method that takes the HTML or URL and returns the text in a callback. That way I wouldn't need to create a new object on every loop iteration.
I am running into the same problem. Any fixes yet?
I guess this is still an open issue?
I got this error too and finally figured out that 'html' was an object, not a string. Ensure html is a string before passing it in.
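In case it helps, here is a minimal sketch of that check in plain JavaScript. The names toHtmlString and safeGetText are hypothetical helpers, not part of node-boilerpipe, and extract is a stand-in for whatever function actually calls Boilerpipe (for example the getText from the original post). Treating a Buffer as acceptable input is my own assumption, since HTTP clients often return one.

```javascript
// Hypothetical helper, not part of node-boilerpipe:
// return the input as a non-empty string, or null if it cannot be used.
function toHtmlString(input) {
  if (typeof input === 'string') {
    return input.length > 0 ? input : null;
  }
  // Assumption: Buffers (e.g. a raw HTTP response body) are decoded as UTF-8.
  if (Buffer.isBuffer(input)) {
    return input.length > 0 ? input.toString('utf8') : null;
  }
  // Objects, numbers, undefined, etc. are rejected instead of being passed on.
  return null;
}

// Hypothetical wrapper: only call `extract` when the input is a usable string.
// `extract(htmlString, callback)` is a stand-in for the Boilerpipe call.
function safeGetText(html, extract, callback) {
  const htmlString = toHtmlString(html);
  if (htmlString === null) {
    callback(new TypeError('html must be a non-empty string or Buffer, got ' + typeof html));
    return;
  }
  extract(htmlString, callback);
}
```

With this in place, a response object passed in by mistake produces a readable TypeError in the callback instead of reaching the extractor.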