crawl-anywhere
add DjVu support
Can you please add DjVu indexing support? There is a tool like pdftotext available for djvu files: http://djvu.sourceforge.net/doc/man/djvutxt.html I like crawl anywhere, because it is super fast. Sadly I'm not able to add djvu support by myself as I do not understand Java.
Thank you for the "I like crawl anywhere, because it is super fast" :) Can you provide some URLs to DjVu files that djvutxt can extract text from?
Good files for testing are not easy to find. It would be best if you could provide some DjVu files that produce text with the djvutxt utility.
Thanks for your fast response. Here is a djvu file with a hidden text layer that can be extracted: http://www.djvuzone.org/support/results.djvu I can look for better examples, if this is not a good file for testing.
Yes please, I need some good files for tests.
Did you get my E-Mail with additional links to DjVu files?
yes, thank you
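For anyone picking this up, here is a minimal sketch of how the djvutxt tool mentioned above could be called from Java, in the same way one would shell out to pdftotext. It assumes `djvutxt` is installed and on the PATH. The class and method names (`DjvuTextExtractor`, `buildCommand`, `extractText`) are hypothetical and are not part of crawl-anywhere; how this would plug into the crawler's document conversion is not shown here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of crawl-anywhere.
public class DjvuTextExtractor {

    // Builds the command line "djvutxt <input>". When no output file is given,
    // djvutxt writes the extracted text to standard output.
    static List<String> buildCommand(Path djvuFile) {
        return List.of("djvutxt", djvuFile.toString());
    }

    // Runs djvutxt and returns the hidden text layer of the file.
    // Assumption: the djvutxt binary is on the PATH and its output is UTF-8.
    static String extractText(Path djvuFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(djvuFile))
                // Discard stderr so a full error pipe cannot block the child process.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("djvutxt exited with code " + exitCode + " for " + djvuFile);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DjvuTextExtractor <file.djvu>");
            return;
        }
        System.out.print(extractText(Paths.get(args[0])));
    }
}
```

A file without a hidden text layer would simply yield empty output, so the test files from this thread (for example results.djvu) are useful for checking that text really comes back.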