analyze: add usage example(s)
Possibly a binary in the app/ folder with an end-to-end workflow. Then we can fold anything useful that comes out of it back into the main library.
One possible use case (from https://www.reddit.com/r/haskell/comments/a50xpr/datahaskell_solve_this_small_problem_to_fill_some/ )
The problem
Averaged across persons, excluding legal fees, how much money had each person spent by time 6?
item , price
----------
computer , 1000
car , 5000
legal fees (1 hour) , 400

date , person , item-bought , units-bought
------------------------------------
7 , bob , car , 1
5 , alice , car , 1
4 , bob , legal fees (1 hour) , 20
3 , alice , computer , 2
1 , bob , computer , 1
It would be extra cool if you provided both an in-memory and a streaming solution.
Principles|operations it illustrates
Predicate-based indexing|filtering. Merging (called "joining" in SQL). Within- and across-group operations. Sorting. Accumulation (what Data.List calls "scanning"). Projection (both the "last row" and the "mean" operations). Statistics (the "mean" operation).
Solution and proposed algorithm (it's possible you don't want to read this)
The answer is $4000. That's because by time 6, Bob had bought 1 computer ($1000) and 20 hours of legal work (excluded), while Alice had bought a car ($5000) and two computers ($2000). In total they had spent $8000, so the across-persons average is $4000.
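The arithmetic above can be checked with a single pass over the purchase rows, which is also roughly the shape a streaming solution could take. This is only a sketch using `base` and `containers`; the `Row` alias and the names `step`, `totalsBy` and `mean` are made up for illustration and are not part of `analyze` or dh-core.

```haskell
import           Data.List       (foldl')
import qualified Data.Map.Strict as M

-- (date, person, item-bought, units-bought); a hypothetical row type.
type Row = (Int, String, String, Int)

prices :: M.Map String Int
prices = M.fromList
  [ ("computer", 1000)
  , ("car", 5000)
  , ("legal fees (1 hour)", 400)
  ]

rows :: [Row]
rows =
  [ (7, "bob",   "car",                 1)
  , (5, "alice", "car",                 1)
  , (4, "bob",   "legal fees (1 hour)", 20)
  , (3, "alice", "computer",            2)
  , (1, "bob",   "computer",            1)
  ]

-- One fold step: skip legal fees and rows after the cutoff,
-- otherwise add the row's cost to that person's running total.
step :: Int -> M.Map String Int -> Row -> M.Map String Int
step cutoff acc (d, who, what, n)
  | what == "legal fees (1 hour)" = acc
  | d > cutoff                    = acc
  | otherwise = case M.lookup what prices of
      Nothing -> acc                        -- unknown item: ignore the row
      Just p  -> M.insertWith (+) who (n * p) acc

-- Per-person totals, consuming the rows once, in any order.
totalsBy :: Int -> [Row] -> M.Map String Int
totalsBy cutoff = foldl' (step cutoff) M.empty

-- Mean over the persons present in the map (NaN if the map is empty).
mean :: M.Map String Int -> Double
mean m = fromIntegral (sum m) / fromIntegral (M.size m)

main :: IO ()
main = do
  let totals = totalsBy 6 rows
  print (M.toList totals)  -- [("alice",7000),("bob",1000)]
  print (mean totals)      -- 4000.0
```

Because addition is commutative, the rows do not need to arrive sorted by date. One caveat of this sketch: a person whose purchases are all skipped never enters the map, so they are not counted in the mean.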
One way to compute that would be to:
- Delete any purchase of legal fees.
- Merge price and purchase data.
- Compute a new column, "money-spent" = units-bought * price.
- Group by person.
- Within each group:
  - Sort by date in increasing order.
  - Compute a new column, "accumulated-spending" = running total of money spent.
  - Keep the last row with a date no greater than 6; drop all others.
- Across groups, compute the mean of accumulated spending.
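The steps above could be sketched in memory roughly as follows, using plain lists and `Data.Map` from `containers` rather than the `analyze` API. The `Purchase` record and the function `meanSpentBy` are hypothetical names invented for this example, and returning 0 for a person with no row at or before the cutoff is an assumption the problem statement does not settle.

```haskell
module Main where

import           Data.List       (sortOn)
import qualified Data.Map.Strict as M
import           Data.Maybe      (mapMaybe)

-- Hypothetical record type for this sketch; not an analyze/dh-core type.
data Purchase = Purchase
  { date   :: Int
  , person :: String
  , item   :: String
  , units  :: Int
  } deriving Show

prices :: M.Map String Int
prices = M.fromList
  [ ("computer", 1000)
  , ("car", 5000)
  , ("legal fees (1 hour)", 400)
  ]

purchases :: [Purchase]
purchases =
  [ Purchase 7 "bob"   "car"                 1
  , Purchase 5 "alice" "car"                 1
  , Purchase 4 "bob"   "legal fees (1 hour)" 20
  , Purchase 3 "alice" "computer"            2
  , Purchase 1 "bob"   "computer"            1
  ]

-- Mean accumulated spending per person up to and including the cutoff
-- date, ignoring the excluded item.
meanSpentBy :: Int -> String -> M.Map String Int -> [Purchase] -> Double
meanSpentBy cutoff excluded priceTable ps =
  let -- Delete any purchase of the excluded item.
      kept = filter ((/= excluded) . item) ps
      -- Merge with prices (inner join) and compute money spent.
      withCost p = do
        price <- M.lookup (item p) priceTable
        pure (person p, [(date p, units p * price)])
      -- Group by person.
      groups = M.fromListWith (++) (mapMaybe withCost kept)
      -- Within a group: sort by date, take the running total, and
      -- keep the last row dated no later than the cutoff.
      lastTotal rs =
        let sorted  = sortOn fst rs
            running = zip (map fst sorted) (scanl1 (+) (map snd sorted))
        in case takeWhile ((<= cutoff) . fst) running of
             [] -> 0                 -- assumption: nothing spent yet
             xs -> snd (last xs)
      totals = M.map lastTotal groups
  in -- Across groups: mean of accumulated spending (NaN if no groups).
     fromIntegral (sum (M.elems totals)) / fromIntegral (M.size totals)

main :: IO ()
main = print (meanSpentBy 6 "legal fees (1 hour)" prices purchases)
-- prints 4000.0
```

Keeping the cutoff, the excluded item and the price table as parameters makes it easy to try other dates or exclusions against the same data.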
Started addressing this with some generic conversion machinery in #34
Currently writing an example, will commit soon
Wrote the code! Don't know how to open a pull request, however.
@UnkDevE you can open a PR from your fork's page by clicking "Compare" to see your changes in context:
https://github.com/DataHaskell/dh-core/compare/master...UnkDevE:master
then you can press "Create pull request"
Thanks! Made the pull request.
@UnkDevE I was too quick in merging your previous PR; a number of things still needed to be fixed. For the future, could you add your tests to the main test group, so that Travis runs them together and we see if anything is broken? Thanks!
no problem! will get started on that tomorrow