heidi
heidi : tidy data in Haskell
The latest generic-trie [1] is a more sustainable bet, so let's import it and remove its vendored source. [1] https://hackage.haskell.org/package/generic-trie-0.3.2/changelog
I added the `Day` type from the `time` library, which represents a calendar date. This is a first attempt; if this is OK, I can add more types from the...
Add a few useful date/time types from `time` (https://hackage.haskell.org/package/time), e.g.
- [ ] POSIXTime
- [ ] Date
etc. A checklist for where to add things:
- [...
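As a rough illustration of what carrying `time` types in a cell could look like, here is a minimal sketch. `Cell`, its constructors, `cellDay` and `daysBetween` are hypothetical names made up for this example and are not heidi's actual value type or API; only `Day`, `fromGregorian`, `diffDays` and `POSIXTime` come from the `time` package.

```haskell
import Data.Time.Calendar (Day, diffDays, fromGregorian)
import Data.Time.Clock.POSIX (POSIXTime)

-- | Hypothetical cell type for illustration only (not heidi's real one).
data Cell
  = CellInt Int
  | CellText String
  | CellDay Day          -- calendar date from Data.Time.Calendar
  | CellPosix POSIXTime  -- seconds since the Unix epoch
  deriving (Eq, Ord, Show)

-- | Extract a calendar date if the cell holds one.
cellDay :: Cell -> Maybe Day
cellDay (CellDay d) = Just d
cellDay _           = Nothing

-- | Whole days from the first cell's date to the second's,
-- or Nothing if either cell is not a date.
daysBetween :: Cell -> Cell -> Maybe Integer
daysBetween a b = do
  da <- cellDay a
  db <- cellDay b
  pure (diffDays db da)

-- | A sample date value.
exampleDay :: Day
exampleDay = fromGregorian 2020 2 29
```

Because `Day` and `POSIXTime` already have `Eq` and `Ord` instances, a sum type like this can derive both, which is what sorting or grouping on a date column would need.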
In-memory typed data -> dataframe, i.e. _after_ parsing, validation, etc.
e.g. as in https://elbersb.com/public/posts/tidylog100/ ``` filtered filter: removed 21 rows (66%), 11 rows remaining joined left_join: added 9 columns (temp, dewp, humid, wind_dir, wind_speed, …) #> > rows only in...
e.g. `ascii` : ``` +-------------+-----------------+ | Person | House | +-------+-----+-------+---------+ | Name | Age | Color | Price | +-------+-----+-------+---------+ | David | 63 | Green | $170000 |...
* A Header is uniquely determined by the type of the input data
* once data are encoded in a Frame, we compute a Header from the type (with `header`)...
Frames are not supposed to be constructed directly by the user (they should only be 'encode'd from data). Currently we say 'hdr = mempty' in a few places for convenience...
- [ ] mutate() adds new variables that are functions of existing variables
- [x] select() picks variables based on their names.
- see the `text`, `int` etc. lenses
-...
It seems essential to me that the library be able to join using multiple columns as the join key. I don't know if the underlying Trie makes that simpler. The...
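To make the request concrete, here is a minimal sketch of an inner join keyed on several columns at once. It does not use heidi's Frame or Trie types: rows are modelled as plain `Data.Map` values from the `containers` package, and `Row`, `keyOf` and `innerJoinOn` are hypothetical names introduced only for this illustration.

```haskell
import qualified Data.Map.Strict as M

-- | Simplified row: column name -> value (an assumption for this sketch).
type Row = M.Map String String

-- | Composite key: the values of the given columns, in order.
-- Returns Nothing if any key column is missing from the row.
keyOf :: [String] -> Row -> Maybe [String]
keyOf cols row = traverse (`M.lookup` row) cols

-- | Inner join of two tables on all the listed key columns.
-- Rows lacking any key column are dropped; on a column name clash
-- the left row's value wins (M.union is left-biased).
innerJoinOn :: [String] -> [Row] -> [Row] -> [Row]
innerJoinOn cols left right =
  [ M.union l r
  | l <- left
  , Just k <- [keyOf cols l]
  , r <- M.findWithDefault [] k index
  ]
  where
    -- Index the right table by composite key, keeping the original row order.
    index = M.fromListWith (flip (++))
      [ (k, [r]) | r <- right, Just k <- [keyOf cols r] ]
```

The point of the design is that the key is a list of column values rather than a single value, so a single-column join is just the special case of a one-element column list.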