SQL interface to DataFrame?
Has anyone thought about using SQL as a query interface for data frames?
For example, doing something like this:
julia> df = DataFrame(x = rand(1:3, 10), y = rand(10))
10×2 DataFrame
 Row │ x      y
     │ Int64  Float64
─────┼──────────────────
   1 │     2  0.589515
   2 │     1  0.72848
   3 │     2  0.344321
   4 │     1  0.900515
   5 │     2  0.667395
   6 │     2  0.0749464
   7 │     2  0.916652
   8 │     1  0.460028
   9 │     3  0.284116
  10 │     3  0.662666
julia> sql("select x, count(*) from df group by 1", (:df => df))
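For comparison, here is a minimal sketch of what that query corresponds to in plain DataFrames.jl today. The `sql` function above is hypothetical; the code below uses only `groupby`, `combine`, and `nrow` from DataFrames.jl, and the output column name `count` is my own choice.

```julia
using DataFrames

df = DataFrame(x = rand(1:3, 10), y = rand(10))

# Rough equivalent of: select x, count(*) from df group by 1
# groupby splits the rows by the value of x; nrow counts the rows in each group.
counts = combine(groupby(df, :x), nrow => :count)
```

The result has one row per distinct value of `x`, so a SQL front end would essentially need to translate the parsed query into calls like these.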
This should be doable, and I was thinking about it some time ago. Someone would need to write an SQL parser. That said, I think most people nowadays use the DuckDB interface for writing SQL queries in Julia (but I might be wrong here).
SQLdf.jl exists - I haven't tried it yet.
A parser that converted DataFrames.jl code to SQL would be amazing, and would lead me to rewrite TidierDB. I've toyed with writing a Tables.jl-to-SQL interface, but there's some nuance to SQL that has made it tricky.
As for writing SQL to run directly on DataFrames.jl, I wonder what the benefit would be over registering the table in DuckDB and using its existing engine?
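To make the DuckDB route concrete, here is a rough sketch of running the query from the original post against a registered data frame. It assumes the DuckDB.jl package is installed; the table name `"df"`, the column alias `n`, and the added `ORDER BY` are illustrative choices, not part of the original query.

```julia
using DataFrames, DuckDB

df = DataFrame(x = rand(1:3, 10), y = rand(10))

# Open an in-memory DuckDB database.
con = DBInterface.connect(DuckDB.DB, ":memory:")

# Make the data frame visible to DuckDB under the table name "df".
DuckDB.register_data_frame(con, df, "df")

# Run the SQL and materialize the result as a DataFrame again.
result = DBInterface.execute(con, "SELECT x, count(*) AS n FROM df GROUP BY 1 ORDER BY x")
counts = DataFrame(result)

DBInterface.close!(con)
```

With this approach the parsing, planning, and execution are all handled by DuckDB, so nothing SQL-specific has to be written on the Julia side.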