Ivan Begtin

Results 195 issues of Ivan Begtin

Consider adding qddate https://github.com/ivbeg/qddate patterns as different data types

improve semantic types

The Egeria project includes a glossary area (https://egeria-project.org/types/3/) that is similar to semantic types. More research is needed to analyze the key differences and Egeria's approach.

research

Dataprep supports cleaning dozens of country-specific identifiers (https://docs.dataprep.ai/user_guide/clean/introduction.html), such as Australian Business Numbers, Australian Tax Numbers, Belgian VAT Numbers, etc. These identifiers and basic pattern...

improve semantic types
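As an illustration of what a rule for one of these identifiers could look like, here is a minimal sketch of the publicly documented Australian Business Number checksum (11 digits, subtract 1 from the first digit, weighted sum divisible by 89). The function name `is_valid_abn` is hypothetical; this is not the Dataprep or Metacrafter API.

```python
# Weights from the published ABN checksum algorithm.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(value: str) -> bool:
    """Return True if value looks like a valid ABN (hypothetical helper)."""
    # Drop spaces so both "51824753556" and "51 824 753 556" are accepted.
    compact = "".join(value.split())
    if len(compact) != 11 or not (compact.isascii() and compact.isdigit()):
        return False
    digits = [int(ch) for ch in compact]
    # The algorithm subtracts 1 from the first digit before weighting.
    digits[0] -= 1
    total = sum(d * w for d, w in zip(digits, ABN_WEIGHTS))
    return total % 89 == 0


print(is_valid_abn("51 824 753 556"))
```

A checksum like this can complement a plain regular expression: the pattern narrows candidates to 11-digit values, and the checksum filters out most false positives.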

This PR was automatically created by Snyk using the credentials of a real user. Snyk has created this PR to fix one or more vulnerable packages in the `pip` dependencies of...

Add support for the following NoSQL databases and search engines: MongoDB, ArangoDB, Milvus, ArcadeDB, ElasticSearch, OpenSearch, MeiliSearch, Apache Cassandra, StarGate (MongoDB-like API over NoSQL databases). The current state of database...

enhancement

The error `Object of type bytes is not JSON serializable` is caused by table fields of bytes type. Better type detection is needed, along with serialization of bytes values in the JSON report. Error...

bug
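One possible way to handle bytes values when writing a JSON report is a `default` hook for `json.dumps`. The sketch below assumes base64 text is an acceptable representation; the helper name `encode_bytes` and the sample row are hypothetical and not taken from the project.

```python
import base64
import json


def encode_bytes(value):
    """Fallback for json.dumps: represent binary values as base64 text."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        return base64.b64encode(bytes(value)).decode("ascii")
    # Keep the standard behaviour for every other unsupported type.
    raise TypeError(
        f"Object of type {type(value).__name__} is not JSON serializable"
    )


# Hypothetical row containing a binary column.
row = {"name": "logo", "payload": b"\x89PNG"}
report_json = json.dumps(row, default=encode_bytes)
print(report_json)
```

Base64 keeps the value lossless. If the report only needs to show that a field is binary, a short placeholder or a hex preview could be used instead.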

Error processing a SQLite database with non-Unicode field names. Example: [000012_world.zip](https://github.com/apicrafter/metacrafter/files/9306204/000012_world.zip) `Traceback (most recent call last): File "C:\Users\ibegt\AppData\Roaming\Python\Python310\site-packages\sqlalchemy\engine\result.py", line 1284, in fetchall l = self.process_rows(self._fetchall_impl()) File "C:\Users\ibegt\AppData\Roaming\Python\Python310\site-packages\sqlalchemy\engine\result.py", line 1230, in...

bug
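The traceback above is truncated, so the exact cause is unknown. Assuming the failure comes from text stored in the database that is not valid UTF-8, the sketch below shows how the standard `sqlite3` module can be told to decode such text leniently through `text_factory`. The table and the sample value are made up for illustration, and this uses `sqlite3` directly rather than SQLAlchemy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT)")
# Store Latin-1 encoded bytes as TEXT to simulate a non-UTF-8 database.
conn.execute("INSERT INTO city VALUES (CAST(? AS TEXT))", (b"Li\xe8ge",))

# With the default text_factory (str), fetching this row raises
# sqlite3.OperationalError. A lenient decoder replaces invalid bytes instead.
conn.text_factory = lambda raw: raw.decode("utf-8", errors="replace")
rows = conn.execute("SELECT name FROM city").fetchall()
print(rows)
```

Setting `text_factory = bytes` is an alternative when the caller wants to detect the real encoding itself before decoding.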

Right now the JSON file of the metadata scanning report is not structured well enough. Improvements should include:
- [ ] Add Cerberus schema (more info: https://docs.python-cerberus.org)
- [ ] Add...

enhancement

Error processing several SQLite files: `(sqlite3.OperationalError) no such tokenizer: PSITokenizer`. Example file: [001607_psi.zip](https://github.com/apicrafter/metacrafter/files/9313446/001607_psi.zip)

bug

Right now the report includes only: field name, data type, tags, semantic type id and registry URL. Sometimes additional information is required, and it is collected during the matching process. Consider adding to...

enhancement