Friedrich Lindenberg

156 issues by Friedrich Lindenberg

You have to give Microsoft credit for its consistency: instead of storing E-Mail messages in Outlook as RFC822 plain text, they came up with their own super funky file format...

At the moment, the `ingestors` will call on an HTTP service provided by `convert-document` (in this repo) to convert documents in various types (Word, Powerpoint, etc.) to PDF files, which...

question

What's broken?
* We're seeing incorrect text extraction out of some documents, especially those containing Arabic text.
* Text from images isn't being extracted into the right location in the...

It seems like we fail to parse files which are created in Excel with write-protection, even though they are readable without a password in the app. There has to be...

file-types

This is specifically with regards to `csvsql`, where loading a CSV file with `Some manually entered - header (TM)` will give you a data structure that is really hard to...

feature
framework
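As an illustration of the `csvsql` header problem described above, here is a minimal sketch of one way a free-form header could be reduced to a plain identifier before it becomes a column name. It is not `csvsql`'s actual behaviour or the fix proposed in the issue (the excerpt is cut off before saying what that is); the function name `normalize_header` and the fallback name `column` are assumptions made for this example.

```python
import re
import unicodedata


def normalize_header(name: str) -> str:
    """Turn a free-form column header into a lowercase, underscore-separated name.

    Hypothetical helper for illustration only; not part of csvsql.
    """
    # Decompose accented characters, then drop whatever is still non-ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", ascii_name).strip("_").lower()
    # Fall back to a placeholder (an assumption) if nothing usable is left.
    return slug or "column"


print(normalize_header("Some manually entered - header (TM)"))
# prints: some_manually_entered_header_tm
```

Two different headers can collapse to the same name under a scheme like this, so a real implementation would also need to handle collisions.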

Final changes to #415, fixes #412

Hey all. I've just pushed dataset 1.6.0, which pins the sqlalchemy dependency for this library to >= 1.3.2, < 2.0.0. This is meant as a hotfix to prevent people from...

Columns:
* EntityID
* Featured Properties
* Sources
* Temporal Extent
* Ignore?

Address

Keep non-matchable fields as an array?

Persons

enhancement

We have a few data sources where we use the `Sanction` schema to describe non-sanction adverse information, such as a procurement debarment, a criminal record or a regulatory penalty. We...

schema

We want to be able to produce a data file that just contains the changed entities day to day. The file also needs to give information regarding entities that have...

pipeline
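The day-to-day changes file described in the last item can be sketched roughly as follows, assuming each daily export is available as a mapping from entity ID to a plain dictionary of properties. That data layout, the function name `changed_entities` and the sample records are all assumptions made for this example, not the project's actual pipeline. The excerpt is cut off before saying what else the file must report, so the sketch covers only entities that are new or modified.

```python
import json

_MISSING = object()  # sentinel: the entity ID was not in the previous export


def changed_entities(previous: dict, current: dict) -> dict:
    """Return the entities in `current` that are new or differ from `previous`."""
    changed = {}
    for entity_id, entity in current.items():
        # Plain dict comparison is deep, so any property change is detected.
        if previous.get(entity_id, _MISSING) != entity:
            changed[entity_id] = entity
    return changed


# Made-up sample data for two consecutive daily exports.
yesterday = {
    "e1": {"schema": "Person", "name": "Example One"},
    "e2": {"schema": "Company", "name": "Example Two"},
}
today = {
    "e1": {"schema": "Person", "name": "Example One"},
    "e2": {"schema": "Company", "name": "Example Two Ltd"},
    "e3": {"schema": "Person", "name": "Example Three"},
}

# Emit one JSON object per changed entity, one per line.
for entity_id, entity in changed_entities(yesterday, today).items():
    print(json.dumps({"id": entity_id, "entity": entity}, sort_keys=True))
```

Comparing whole records keeps the sketch short; for large datasets, comparing a stored hash per entity would avoid holding both exports in memory.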