bejean

Results: 29 issues by bejean

With FF 6.x, it looks like the handleResults callback function is not called properly after $j.post("options-general.php?page=solr-for-wordpress/solr-for-wordpress.php", {method: "load", type: $type}, handleResults, "json"); so only the first 100 posts are sent...

What about providing optional extraction directives? In the majority of cases the extraction algorithm works great, but for some web sites it can fail to extract relevant content. For...

A great feature would be to detect the published date of the web page. This information is often located near the top or the bottom of the main text.

When trying to test the link-finding action with tools_test_scripts.sh, I never see any output of found links, even when hardcoding them. Nor does it show exceptions (after intentionally...

bug

When I import a source, the source is not crawled. I exported a running source and compared it with a newly imported source, and they look different. And the...

bug

As free IP geolocation web services are often unavailable or deprecated, allow easy custom class implementation. http://www.geoiptool.com/ doesn't provide information as XML anymore.

enhancement

Add a maximum page count option. Should this be the maximum number of pages fetched on the server or the maximum number of pages sent to the pipeline? This...

enhancement

Create a fast recrawl option. This option could allow recrawling a web site often and quickly by crawling only to a maximum depth of 1 or 2 levels for...

enhancement

https://groups.google.com/forum/#!topic/crawl-anywhere/tdkJNIjuB5E

Task

See https://groups.google.com/forum/#!topic/crawl-anywhere/3WPCZuwtZCc

enhancement