ibis
ibis copied to clipboard
feat: Collect MSSQL object into memory as an ibis object
Is your feature request related to a problem?
From my research and going through the documentation, if I want to bring my ibis expr into memory, I need to execute() it. But I want to continue to use ibis expression after collecting. I wish there was a simple way to collect the ibis object into memory and keep it as an ibis object without having to execute (which makes it pandas) and then convert it back to an ibis object via memtable.
If there is already a straight way to do this, I apologize for missing it and would appreciate an example on how to do this. Currently I made a function that will execute and convert it to an ibis object.
What is the motivation behind your request?
I am either missing something or there are multiple steps I need to take to accomplish this task. I have not dug into if its extra compute to go from MSSQL -> pandas -> ibis (which I think is duckdb?).
Describe the solution you'd like
This would be ideal:
con = con_ibis(env)
# Selecting columns
my_table = (
con.table("my_table", database="dbo")
.select(
_.my_id,
_.some_column
)
)
my_table is server side at the moment, for my use case/project I want to bring it into memory, but retain the ibis properties so I can keep using the ibis API
my_table_in_memory = my_table.collect()
What version of ibis are you running?
9.5.0
What backend(s) are you using, if any?
mssql
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
@KoderKow I think what you are looking for is .cache(), you can find an example in the docs https://ibis-project.org/reference/expression-tables.html#ibis.expr.types.relations.Table.cache
Would this do what you are looking for?
Interesting I have not tried that. When I cache an expression, where is that saved? I see it makes SELECT * FROM [tempdb].[dbo].[##ibis_cached_d6rcpzztrfhcbaqpqijetubzdu], is that a temporary table in the database? I noticed once the connection closes the cache no longer works.
Is there a way to turn my con object that's mssql into an ibis polars by any chance? Currently I am collecting the data into memory from my ibis mssql object then creating an ibis polar object from that. I couldn't see if there is a way to update it with out the 2-3 extra steps.
Hey @KoderKow -- when you cache an expression, it gets saved to a temporary table on the backend-engine, and those don't survive past the lifetime of the session (if you want an intermediate expression to persist, you can pass the expr to create_table).
Can you explain a little bit more about what you are trying to accomplish here? I think we're missing some context and it would help us to give you better advice.
@KoderKow Depending on what you want to do, here are your options:
cache: persist in the database for the lifetime of the connection
Use this if you want to keep everything in the database. The cached table will be dropped on connection close.
to_polars: compute in the backend and return a polars object
Use this if you want to bring the results to the client.
To continue using the Ibis API, you'd call ibis.memtable(con.to_polars(some_expr)).