Ivan Begtin

Results 195 issues of Ivan Begtin

When I process multiple files, it often requires command-line tools to build a CSV/JSON pipeline. Right now the hachoir-metadata tool can't be used to produce CSV or JSON. It will be very...

Some crawlers can create multiple WARC files. This is important when we have to upload WARC files to storage that limits the size of a single file. I have a lot of archives...

I've created a WACZ file from a warc.gz with the latest py-wacz package, 0.4.5. Original file: https://cdn1.ruarxive.org/public/webcollect2022/ngo2022/cafrussia.ru/cafrussia.ru.warc.gz (179MB). Produced WACZ file: https://cdn1.ruarxive.org/public/webcollect2022/ngo2022/cafrussia.ru/cafrussia.ru.wacz (179MB). I open the WACZ file with ReplayWeb.page and I don't...

Please add support for JSON lines files (https://jsonlines.org/). A lot of such files are published and used. Sometimes they are huge and hard to convert to JSON.
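To illustrate why JSON lines is easier to handle than one large JSON document, here is a minimal sketch that reads and writes records one line at a time, using only the Python standard library. The function names `iter_jsonl` and `write_jsonl` are hypothetical helpers made up for this example; they are not part of any tool mentioned in these issues.

```python
import io
import json
from typing import Any, Iterable, Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[Any]:
    """Yield one parsed record per non-empty line of a JSON lines stream.

    Hypothetical helper: only one line is held in memory at a time,
    so the size of the whole file does not matter.
    """
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}: {exc}") from exc


def write_jsonl(records: Iterable[Any], stream: TextIO) -> int:
    """Write each record as one JSON document per line; return the count written."""
    count = 0
    for record in records:
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demo; a real file opened with open(path, encoding="utf-8") works the same way.
    source = io.StringIO('{"a": 1}\n\n{"b": [2, 3]}\n')
    target = io.StringIO()
    written = write_jsonl(iter_jsonl(source), target)
    print(written, "records copied")
```

Because both functions work on any text stream, the same sketch can sit in a command-line pipeline by passing `sys.stdin` and `sys.stdout` instead of the in-memory buffers.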

python

Right now the only options to get data from splitgraph are CSV via "sgr csv export" or SQL via "sgr dump". Very often data include JSON records that more comfortably...

Add integration of Awesome Opendata Rus with the DataCatalogs.ru Airtable database. Right now they are separate entities with no sync. We need a script linked with a GitHub action to extract list of data...

enhancement

Hi! I am working on an ETL and data platform with NoSQL as the primary data format. Most data is JSON lines, BSON and Parquet converted from XML, JSON and extracted from...

Hi! The last publication about Github Explorer was in December 2020, and I am just curious: has the dataset been updated since then? I do regular research about government open-source...

I can't find how a business glossary or semantic data types are defined in the specification. Maybe I just missed it, or it's not defined yet? There is a good blog post about business glossary...

Hi! I see there is already the Open Metadata standard (https://docs.open-metadata.org/metadata-standard/schemas/overview) and the Egeria Open Metadata types (https://egeria-project.org/types/). How is the Open Data Discovery Specification linked to them, and is it compatible with them?