Ivan Begtin
- [x] Added category GS1
- [x] Included GTIN and GTIN of contained items
[sherlock_datatypes.xlsx](https://github.com/apicrafter/metacrafter-registry/files/8998573/sherlock_datatypes.xlsx) Sherlock data types
Hi! Not yet. For now, only the Metacrafter registry has been converted to a Datahub business glossary: https://github.com/apicrafter/metacrafter-registry/blob/main/data/datahub/metacrafter.yml I had plans to add Datahub ingestion, but I'm not yet sure about the best way how...
Current state:
- Added basic support of XML files
- Added XML files to README.md

Next steps: collect examples and write tests
Added automatic detection of XML tags
Replaced the XML reader with pyiterable; it should now support huge XML files
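To illustrate why a streaming reader copes with huge XML files, here is a minimal sketch that uses the standard library's `xml.etree.ElementTree.iterparse` as a stand-in. It is not pyiterable's API; the function name `iter_records` and the record tag `item` are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield one dict per <tag> element without loading the whole document.

    `source` is a filename or a binary file-like object.
    `tag` is the (hypothetical) name of the repeating record element.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            # Flatten direct children into a {child_tag: text} mapping.
            yield {child.tag: child.text for child in elem}
            # Drop the element's children and text so memory stays bounded.
            elem.clear()


# Usage with an in-memory document; a real file path works the same way.
sample = io.BytesIO(
    b"<root><item><id>1</id><name>a</name></item>"
    b"<item><id>2</id><name>b</name></item></root>"
)
for record in iter_records(sample, "item"):
    print(record)
```

Only one record is materialized at a time, so memory use depends on the size of a record rather than the size of the file.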
[Presidio](https://github.com/microsoft/presidio) looks like a possible NER engine. Possible ways to implement it:
- support analysis of a list of fields
- support analysis of any string fields with length greater than the `max_len` parameter....
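The second option could start from a field-selection step like the sketch below. It only picks the candidate fields; the call into the NER engine is deliberately left out. The function name `long_text_fields`, the record layout (a list of dicts), and the default value of `max_len` are assumptions for illustration.

```python
def long_text_fields(records, max_len=100):
    """Return the names of fields that hold at least one string longer than max_len.

    `records` is an iterable of dicts; non-string values are ignored.
    """
    selected = set()
    for record in records:
        for name, value in record.items():
            if isinstance(value, str) and len(value) > max_len:
                selected.add(name)
    return sorted(selected)


# Usage: only "note" contains a string longer than 10 characters.
rows = [
    {"id": "1", "note": "a fairly long free-text comment", "count": 5},
    {"id": "2", "note": "short", "count": 7},
]
print(long_text_fields(rows, max_len=10))
```

The selected field names could then be passed to whichever engine is chosen, which keeps short code-like fields out of the slower NER pass.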
It's caused by column names with non-UTF-8 encoding, which is possible with some SQLite databases. A possible solution is here: https://stackoverflow.com/questions/22751363/sqlite3-operationalerror-could-not-decode-to-utf-8-column it could resolve errors by it will improve data types detection...
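A minimal sketch of the kind of workaround discussed in that thread: setting `text_factory` on the `sqlite3` connection so that TEXT values with invalid UTF-8 are decoded leniently instead of raising `OperationalError`. The helper name `connect_lenient` is an assumption, and whether this also covers undecodable column names (as opposed to column values) has not been verified here.

```python
import sqlite3


def connect_lenient(path):
    """Open a SQLite database that tolerates TEXT values with invalid UTF-8.

    Invalid byte sequences are replaced with U+FFFD instead of raising
    sqlite3.OperationalError ("Could not decode to UTF-8 column ...").
    """
    conn = sqlite3.connect(path)
    # text_factory receives the raw bytes of every TEXT value.
    conn.text_factory = lambda raw: raw.decode("utf-8", errors="replace")
    return conn


# Usage: the blob X'FF41' is not valid UTF-8 when cast to TEXT.
conn = connect_lenient(":memory:")
row = conn.execute("SELECT CAST(X'FF41' AS TEXT) AS v").fetchone()
print(row)
conn.close()
```

Replacing bad bytes keeps the scan running, at the cost of losing the original characters in the affected values.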
@cipher387 Hi! Is this after installing with "pip install metawarc", or is it the latest code from GitHub?