
Improve upload and general data handling (e.g. via pandas)

Open KMouratidis opened this issue 5 years ago • 0 comments

  1. Correctly read various formats:
  • [x] csv
  • [x] xls / xlsx
  • [x] json
  • [x] feather
  • [ ] html
  1. Convert columns to appropriate data types (possibly using heuristics). Focus on improved handling of:
  • [x] Dates, timestamps, etc
  • [x] Categorical (Ordinal & Nominal)
  • [ ] Missing values (interface with some sort of preprocessing pipeline? algorithm dependent?)
  • [ ] Interface for custom objects (?)
  1. Handling of files, storage, etc
  • [x] Review use of Redis, discuss alternatives or improved usage, etc
  • [x] Multiple file uploads
  • [ ] Large file uploads (e.g. Resumable upload)
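
For the file-format item above, one possible shape is a small dispatch table keyed on file extension. This is only a sketch: `read_upload` and `READERS` are hypothetical names, not existing EDA_miner code, and the Excel, feather, and HTML readers each need an optional pandas dependency (e.g. `openpyxl`, `pyarrow`, `lxml`) to be installed.

```python
import io
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to the pandas reader for it.
READERS = {
    ".csv": pd.read_csv,
    ".xls": pd.read_excel,
    ".xlsx": pd.read_excel,
    ".json": pd.read_json,
    ".feather": pd.read_feather,
}


def read_upload(filename, buffer):
    """Read an uploaded file-like object into a DataFrame.

    The reader is chosen from the (case-insensitive) file extension.
    """
    ext = Path(filename).suffix.lower()

    if ext in (".html", ".htm"):
        # pd.read_html returns a list of DataFrames, one per <table>.
        # Assumption: only the first table is of interest.
        return pd.read_html(buffer)[0]

    try:
        reader = READERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext!r}") from None
    return reader(buffer)


if __name__ == "__main__":
    csv_file = io.StringIO("a,b\n1,2\n3,4\n")
    print(read_upload("example.csv", csv_file))
```

Keeping the mapping in one place means a new format only needs one extra entry, and unsupported uploads fail early with a clear error.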

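For the data-type item, a rough heuristic could try to parse text columns as dates and fall back to a categorical type when the number of distinct values is small. The function name `infer_column_types` and both thresholds are assumptions made for illustration; ordinal categories and missing values would still need separate handling.

```python
import pandas as pd


def infer_column_types(df, max_categories=20, min_date_fraction=0.9):
    """Return a copy of df with text columns converted where it looks safe.

    A text column becomes datetime if at least `min_date_fraction` of its
    non-null values parse as dates; otherwise it becomes categorical if it
    has at most `max_categories` distinct values. Other columns are left alone.
    """
    out = df.copy()
    for col in out.columns:
        series = out[col]
        is_text = (pd.api.types.is_object_dtype(series)
                   or pd.api.types.is_string_dtype(series))
        if not is_text:
            continue

        non_null = series.notna().sum()
        if non_null == 0:
            continue

        # Unparseable values become NaT instead of raising.
        parsed = pd.to_datetime(series, errors="coerce")
        if parsed.notna().sum() / non_null >= min_date_fraction:
            out[col] = parsed
        elif series.nunique(dropna=True) <= max_categories:
            out[col] = series.astype("category")
    return out


if __name__ == "__main__":
    raw = pd.DataFrame({
        "when": ["2019-04-03", "2019-04-04", None],
        "color": ["red", "green", "red"],
        "n": [1, 2, 3],
    })
    print(infer_column_types(raw).dtypes)
```

A heuristic like this can misfire (for example on numeric strings that happen to parse as dates), so letting the user override the inferred type is probably still necessary.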
KMouratidis, Apr 03 '19 23:04