Clément Doumouro

13 comments by Clément Doumouro

Sent an email to the Tika mailing list asking for an update on the issue's status (see the [email protected] inbox).

Update on June 10th: Tim/Tilman said they would add parameters to PDFParserConfig that allow soft line breaks to be kept. Waiting for the feature to be implemented.

Wait for spaCy NER to be implemented, which will allow faster prototyping and easier text processing: https://github.com/ICIJ/datashare/issues/1452