csvapi

Kafka integration

Open geoffreyaldebert opened this issue 2 years ago • 1 comment

  • [x] Kafka integration (consumer only)
  • [x] Read messages from udata-analysis-service
  • [x] Parse the file (could be read from MinIO instead of downloading the resource again)
  • [x] Add csv-detective type detection to help agate store the resource in SQLite
  • [x] Add a minimal pandas profiling analysis and generate a JSON report
  • [x] Store the new info in SQLite, in new tables:
    • general_infos: basic info on the resource
    • column_infos: basic info on each column of the resource
    • categorical_infos: categorical values for each column (limited to 10)
    • top_infos: top values for each column (limited to 10)
    • numeric_infos: basic info on each numeric column of the resource (mean, std, min, max)
    • numeric_plot_infos: distribution of the values of each numeric column, for plotting
  • [x] Update the API to list this new info when it is available
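
To illustrate the storage step in the checklist above, here is a minimal sketch of how per-column statistics could be written to the `numeric_infos` and `top_infos` tables. Only the table names and the limit of 10 come from this issue. The function name `store_profile`, the column layout of both tables, and the use of the standard library instead of pandas profiling are assumptions made for the example, not csvapi's actual implementation.

```python
import sqlite3
import statistics
from collections import Counter


def store_profile(conn, column_name, values, limit=10):
    """Store summary stats and top values for one numeric column.

    Hypothetical helper: the table names follow the issue, but the
    column layout below is an assumption made for illustration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS numeric_infos "
        "(column_name TEXT PRIMARY KEY, mean REAL, std REAL, min REAL, max REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS top_infos "
        "(column_name TEXT, value TEXT, count INTEGER)"
    )

    # Sample standard deviation needs at least two data points.
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    conn.execute(
        "INSERT OR REPLACE INTO numeric_infos VALUES (?, ?, ?, ?, ?)",
        (column_name, statistics.fmean(values), std, min(values), max(values)),
    )

    # Replace any previous top values, keeping at most `limit` entries.
    conn.execute("DELETE FROM top_infos WHERE column_name = ?", (column_name,))
    conn.executemany(
        "INSERT INTO top_infos VALUES (?, ?, ?)",
        [
            (column_name, str(value), count)
            for value, count in Counter(values).most_common(limit)
        ],
    )
    conn.commit()
```

Calling `store_profile(sqlite3.connect(":memory:"), "price", [1, 2, 2, 3, 4])` would create both tables and fill them for a hypothetical `price` column. An API endpoint could then list this info by checking whether these tables contain rows for the requested resource.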

geoffreyaldebert, Jun 10 '22 19:06

This branch is now published on PyPI: https://app.circleci.com/pipelines/github/etalab/csvapi/91/workflows/09dba6e2-b91f-4cf2-af03-71a9daee9bbb/jobs/605

⚠️ Remove this publication once the branch is merged into master.

abulte, Aug 26 '22 09:08