EDA_miner
Improve upload and general data handling (e.g. via pandas)
- Correctly read various formats:
  - [x] csv
  - [x] xls / xlsx
  - [x] json
  - [x] feather
  - [ ] html
- Convert columns to appropriate data types (possibly using heuristics). Focus on improved handling of:
  - [x] Dates, timestamps, etc
  - [x] Categorical (Ordinal & Nominal)
  - [ ] Missing values (interface with some sort of preprocessing pipeline? algorithm dependent?)
  - [ ] Interface for custom objects (?)
- Handling of files, storage, etc
  - [x] Review use of Redis, discuss alternatives or improved usage, etc
  - [x] Multiple file uploads
  - [ ] Large file uploads (e.g. Resumable upload)
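
For discussion, here is a rough sketch of how the format reading and data-type heuristics above could look with pandas. It is not the project's actual code: `READERS`, `read_upload`, `infer_dtypes`, and the `max_categories` / `min_parsed` thresholds are hypothetical names and values chosen for illustration. Only the pandas calls themselves (`read_csv`, `read_excel`, `read_json`, `read_feather`, `read_html`, `to_datetime`, `Categorical`) are real.

```python
import io
from pathlib import Path

import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

# Hypothetical mapping from file extension to a pandas reader.
# read_excel and read_feather need optional dependencies
# (e.g. openpyxl for xlsx, pyarrow for feather).
READERS = {
    ".csv": pd.read_csv,
    ".xls": pd.read_excel,
    ".xlsx": pd.read_excel,
    ".json": pd.read_json,
    ".feather": pd.read_feather,
}


def read_upload(filename, buffer):
    """Pick a reader from the file extension and parse the uploaded buffer."""
    ext = Path(filename).suffix.lower()
    if ext == ".html":
        # read_html returns one DataFrame per <table>; this sketch keeps the first.
        return pd.read_html(buffer)[0]
    if ext not in READERS:
        raise ValueError(f"Unsupported file type: {ext!r}")
    return READERS[ext](buffer)


def infer_dtypes(df, ordinal=None, max_categories=10, min_parsed=0.9):
    """Heuristically convert text columns to datetime or categorical.

    ordinal: optional {column: [levels, lowest first]} for ordered categories.
    max_categories / min_parsed: assumed thresholds, to be tuned.
    """
    ordinal = ordinal or {}
    out = df.copy()
    for col in out.columns:
        series = out[col]
        if col in ordinal:
            # Ordinal: the caller supplies the level order explicitly.
            out[col] = pd.Categorical(series, categories=ordinal[col], ordered=True)
            continue
        if not (is_object_dtype(series.dtype) or is_string_dtype(series.dtype)):
            continue  # numeric, datetime, etc. are left untouched
        # Dates: accept the column if most values parse as timestamps.
        parsed = pd.to_datetime(series, errors="coerce")
        if parsed.notna().mean() >= min_parsed:
            out[col] = parsed
        # Nominal: few distinct values suggests a categorical column.
        elif series.nunique(dropna=True) <= max_categories:
            out[col] = series.astype("category")
    return out


csv_text = (
    "when,colour,size,value\n"
    "2020-01-01,red,small,1.5\n"
    "2020-02-01,green,large,2.5\n"
    "2020-03-01,red,medium,3.5\n"
)
df = read_upload("data.csv", io.StringIO(csv_text))
typed = infer_dtypes(df, ordinal={"size": ["small", "medium", "large"]})
print(typed.dtypes)
```

Dispatching on the extension keeps each new format a one-line addition, and asking the caller for the ordinal level order avoids guessing an ordering that cannot be inferred from the data alone. Missing-value handling is deliberately left out, since that item is still open above.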