
Behemoth is an open-source platform for large-scale document analysis based on Apache Hadoop.

Results 13 behemoth issues

It would be really helpful to use Apache Commons CLI for command line processing and then to try to standardize the names of input/output arguments, etc.
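
As a rough illustration of what standardized input/output argument names could look like, here is a minimal sketch in plain Java. The option names `-i`/`--input` and `-o`/`--output` and the class `StandardArgs` are assumptions made for this example, not Behemoth's actual options. The hand-rolled loop only stands in for a real parser; with Apache Commons CLI it would presumably be replaced by that library's `Options` and `DefaultParser` classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: one shared definition of argument names that every
// command-line job could reuse. Not part of Behemoth.
class StandardArgs {
    static final String INPUT = "input";   // assumed standard name
    static final String OUTPUT = "output"; // assumed standard name

    // Parses "-i/--input <path>" and "-o/--output <path>"; both are required.
    static Map<String, String> parse(String[] args) {
        Map<String, String> values = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            String key;
            switch (args[i]) {
                case "-i":
                case "--input":
                    key = INPUT;
                    break;
                case "-o":
                case "--output":
                    key = OUTPUT;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + args[i]);
            }
            values.put(key, args[++i]); // consume the option's value
        }
        if (!values.containsKey(INPUT) || !values.containsKey(OUTPUT)) {
            throw new IllegalArgumentException("Usage: <job> -i <input> -o <output>");
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = parse(args);
        System.out.println("input=" + parsed.get(INPUT) + " output=" + parsed.get(OUTPUT));
    }
}
```

Keeping the option names in one place means each job accepts the same flags, which is the kind of consistency the issue asks for.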

The UIMA and GATE annotation and type filters are configured using strings; by default, if the user specifies nothing, no annotations are produced in the output. Instead it...

// Exception in thread "main" java.io.IOException: can't find class: com.digitalpebble.behemoth.tika.TextArrayWritable because com.digitalpebble.behemoth.tika.TextArrayWritable
//     at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:204)