
add DjVu support

Open ghost opened this issue 11 years ago • 6 comments

Can you please add DjVu indexing support? A tool similar to pdftotext is available for DjVu files: http://djvu.sourceforge.net/doc/man/djvutxt.html I like crawl anywhere, because it is super fast. Sadly, I'm not able to add DjVu support myself, as I do not understand Java.

ghost avatar May 17 '13 22:05 ghost

Thank you for the "I like crawl anywhere, because it is super fast" :) Can you provide some URLs of DjVu files with content that djvutxt can extract?

bejean avatar May 18 '13 08:05 bejean

It is not easy to find good files for testing. It would be best if you could send me some DjVu files from which the djvutxt utility produces text.

bejean avatar May 18 '13 17:05 bejean

Thanks for your fast response. Here is a djvu file with a hidden text layer that can be extracted: http://www.djvuzone.org/support/results.djvu I can look for better examples, if this is not a good file for testing.

ghost avatar May 19 '13 20:05 ghost

Yes please, I need some good files for testing.

bejean avatar May 19 '13 21:05 bejean

Did you get my email with additional links to DjVu files?

ghost avatar May 30 '13 10:05 ghost

Yes, thank you.

bejean avatar May 30 '13 11:05 bejean
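
For reference, the extraction step requested in the opening comment could be sketched as below. This is only an illustration of calling the djvutxt command-line tool on a local file, not Crawl Anywhere's actual integration (which would be in Java). The function names `build_djvutxt_command` and `extract_text` and the injectable `run` parameter are assumptions made for this example. It relies on djvutxt (from DjVuLibre) being installed and on its documented behavior of printing UTF-8 text to standard output when no output file is given.

```python
import subprocess
from typing import List, Optional


def build_djvutxt_command(djvu_path: str, page: Optional[int] = None) -> List[str]:
    """Build the djvutxt argument list (hypothetical helper).

    With no output file argument, djvutxt writes the text to stdout.
    """
    cmd = ["djvutxt"]
    if page is not None:
        # Restrict extraction to a single page, e.g. -page=2.
        cmd.append(f"-page={page}")
    cmd.append(djvu_path)
    return cmd


def extract_text(djvu_path: str, page: Optional[int] = None, run=subprocess.run) -> str:
    """Return the hidden text layer of a local DjVu file.

    An empty string means the file has no extractable text layer.
    `run` is injectable so the function can be exercised without djvutxt.
    """
    result = run(
        build_djvutxt_command(djvu_path, page),
        capture_output=True,  # collect stdout/stderr instead of printing
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.decode("utf-8", errors="replace").strip()
```

Assuming the sample file linked above has been downloaded as `results.djvu`, calling `extract_text("results.djvu")` would return its text layer. A file that yields an empty string is a poor test case, since it has no hidden text to index.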