project-m36
DataFrame for orderBy, limit, and offset
Thanks to @agentm for the suggestion. To make relational algebra more practical to use, we may need
a new type and processing engine for converting Relations to DataFrames, akin to pandas or R DataFrames, which do have an ordering.
Converting to a DataFrame would emphasize the finality of that processing to the user: further relational algebra operators cannot be applied, though the DataFrame could potentially be converted back to a Relation. Normally, this conversion would be the final step in a data retrieval pipeline.
A reference: "Accessing Postgres in a DataFrame in Haskell"
Yeah, it would be nice to be able to use a DataFrame engine from an existing Haskell project, but, from what I have seen, they are typically typed at compile time. We will need to support arbitrarily generated types at runtime in order to support conversion to and from relations.
I considered exporting to SQLite, but there is no way to represent relation-valued attributes or ADTs there, so the data type impedance mismatch would be quite painful.
Oh, I hadn't thought that far ahead. I was just looking for some code to use as a reference while thinking about the implementation.
I just want a simple syntax, like beam's, for using project-m36 and relational algebra in Haskell.
And yes, relation-valued attributes and ADTs feel too valuable to lose. It would be interesting to use them both in Haskell and in a relational database.
I'm happy to discuss any future features. From my perspective, I think a good first step would be to implement a Relation -> DataFrame engine. That could be implemented independently of a compile-time type safe interface.
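To make the proposed Relation -> DataFrame step concrete, here is a minimal sketch. Every type and function name in it (`Atom`, `Relation`, `DataFrame`, `toSortedDataFrame`, `offsetRows`, `limitRows`) is a simplified, hypothetical stand-in invented for illustration; none of it is project-m36's actual API. It only shows the idea that ordering, offset, and limit become meaningful after the conversion, and that the heading is checked at runtime rather than at compile time.

```haskell
import Data.List (elemIndex, sortBy)
import Data.Ord (comparing)

-- Hypothetical, simplified atom type; not project-m36's Atom.
data Atom = IntAtom Integer | TextAtom String
  deriving (Eq, Ord, Show)

type AttributeName = String

-- A relation: a heading plus a body whose order carries no meaning.
data Relation = Relation [AttributeName] [[Atom]]
  deriving Show

-- A data frame: the same heading, but the row order is significant.
data DataFrame = DataFrame [AttributeName] [[Atom]]
  deriving (Eq, Show)

-- Convert a relation to a data frame sorted ascending by one attribute.
-- The attribute is looked up at runtime; Nothing means it is not in the heading.
toSortedDataFrame :: AttributeName -> Relation -> Maybe DataFrame
toSortedDataFrame attr (Relation attrs tuples) = do
  idx <- elemIndex attr attrs
  pure (DataFrame attrs (sortBy (comparing (!! idx)) tuples))

-- Skip the first n rows; only meaningful once an order exists.
offsetRows :: Int -> DataFrame -> DataFrame
offsetRows n (DataFrame attrs rows) = DataFrame attrs (drop n rows)

-- Keep at most n rows.
limitRows :: Int -> DataFrame -> DataFrame
limitRows n (DataFrame attrs rows) = DataFrame attrs (take n rows)
```

With these definitions, something like `fmap (limitRows 2 . offsetRows 1) (toSortedDataFrame "city" rel)` would express "order by city, skip one row, take two". Because `offsetRows` and `limitRows` accept only a `DataFrame`, they cannot be applied to an unordered `Relation` by accident.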
Thanks! I have implemented a simple feature now.
TutorialD (master/main): :importexample date
TutorialD (master/main): :showexpr s
┌──────────┬────────┬───────────┬───────────────┐
│city::Text│s#::Text│sname::Text│status::Integer│
├──────────┼────────┼───────────┼───────────────┤
│"Paris"   │"S2"    │"Jones"    │10             │
│"Athens"  │"S5"    │"Adams"    │30             │
│"Paris"   │"S3"    │"Blake"    │30             │
│"London"  │"S1"    │"Smith"    │20             │
│"London"  │"S4"    │"Clark"    │20             │
└──────────┴────────┴───────────┴───────────────┘
TutorialD (master/main): :showsorteddataframe s s#
┌──────────┬────────┬───────────┬───────────────┐
│city::Text│s#::Text│sname::Text│status::Integer│
├──────────┼────────┼───────────┼───────────────┤
│"London"  │"S1"    │"Smith"    │20             │
│"Paris"   │"S2"    │"Jones"    │10             │
│"Paris"   │"S3"    │"Blake"    │30             │
│"London"  │"S4"    │"Clark"    │20             │
│"Athens"  │"S5"    │"Adams"    │30             │
└──────────┴────────┴───────────┴───────────────┘
TutorialD (master/main): :showsorteddataframe s city
┌──────────┬────────┬───────────┬───────────────┐
│city::Text│s#::Text│sname::Text│status::Integer│
├──────────┼────────┼───────────┼───────────────┤
│"Athens"  │"S5"    │"Adams"    │30             │
│"London"  │"S1"    │"Smith"    │20             │
│"London"  │"S4"    │"Clark"    │20             │
│"Paris"   │"S2"    │"Jones"    │10             │
│"Paris"   │"S3"    │"Blake"    │30             │
└──────────┴────────┴───────────┴───────────────┘
It only supports ascending order for now. I left the Ord instances for RelationAtom and CustomizedAtom undefined because I don't see a meaningful ordering for them.
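One possible alternative to leaving an `Ord` instance undefined is a partial comparison that reports "no ordering" explicitly, combined with an order direction. The sketch below is an assumption-laden illustration, not the code in the pull request: `Order`, `compareAtoms`, `sortAtoms`, and this cut-down `Atom` type are hypothetical names made up here.

```haskell
import Data.List (sortBy)
import Data.Maybe (fromMaybe, isJust)

-- Hypothetical sort direction.
data Order = Ascending | Descending
  deriving (Eq, Show)

-- Hypothetical, simplified atom type with one unorderable case.
data Atom
  = IntAtom Integer
  | TextAtom String
  | RelationAtom [[Atom]]
  deriving (Eq, Show)

-- Partial comparison: Nothing when the two atoms have no meaningful order
-- (relation-valued atoms, or atoms of different kinds).
compareAtoms :: Atom -> Atom -> Maybe Ordering
compareAtoms (IntAtom a) (IntAtom b) = Just (compare a b)
compareAtoms (TextAtom a) (TextAtom b) = Just (compare a b)
compareAtoms _ _ = Nothing

-- Sort atoms in the requested direction, or fail with a message
-- instead of hitting an undefined Ord instance at runtime.
sortAtoms :: Order -> [Atom] -> Either String [Atom]
sortAtoms _ [] = Right []
sortAtoms order atoms@(x : _)
  -- If every atom is comparable with the first one, they all share one
  -- orderable kind, so every pair in the list is comparable.
  | all (isJust . compareAtoms x) atoms = Right (sortBy cmp atoms)
  | otherwise = Left "these atoms cannot be ordered"
  where
    base a b = fromMaybe EQ (compareAtoms a b)
    cmp a b = case order of
      Ascending -> base a b
      Descending -> base b a
```

The design choice here is that an attempt to sort by a relation-valued attribute produces a `Left` that can be shown to the user, rather than a runtime exception from an undefined instance method.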
I am happy to discuss anything, too.
Let me make a pull request for this first.