Chris Mattmann

7 repositories owned by Chris Mattmann

tika-python

1.4k
Stars
234
Forks

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

MLwithTensorFlow2ed

135
Stars
68
Forks

Code for Machine Learning with TensorFlow, 2nd Edition, published by Manning Publications.

imagecat

94
Stars
40
Forks

ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest tens of millions of files (images, but could be extended to other files) in place, and to extrac...

tika-similarity

103
Stars
59
Forks

Tika-Similarity uses the Tika-Python package (a Python binding to Apache Tika) to compute file similarity based on metadata features.

etllib

16
Stars
36
Forks

This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.

lucene-geo-gazetteer

36
Stars
21
Forks

Uses Apache Lucene, OpenNLP and GeoNames to extract locations from text and geocode them.

nutch-python

35
Stars
19
Forks

Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.