Expose the internal default connection and `sql` function?

Open eitsupi opened this issue 1 year ago • 4 comments

The sql function used internally would be useful for running queries through DuckDB (e.g., reading Parquet files).

https://github.com/duckdb/duckdb-r/blob/d243b5301bdca5e642a4579ced6fd7f7b9ded04e/R/sql.R#L1-L15

Would you consider exporting this with a name like duckdb_query?

eitsupi avatar Oct 15 '23 14:10 eitsupi

This function and the associated default_connection() are not really used, even internally. We're free to design them however we like, but I'd like to understand the motivation and use case. Can't most of that be achieved today with DBI too?
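For reference, a minimal sketch of the DBI route mentioned here, assuming the DBI and duckdb packages are installed. The data frame `my_df` and the table name `my_table` are illustrative only; the connection has to be created and passed explicitly, and the data frame has to be registered by hand.

```r
library(DBI)

# Illustrative data frame (not from the issue itself)
my_df <- data.frame(a = 42)

# Open an in-memory DuckDB database through DBI
con <- dbConnect(duckdb::duckdb())

# Make the data frame visible to SQL as a virtual table
duckdb::duckdb_register(con, "my_df", my_df)

# Statements that return no rows go through dbExecute()
dbExecute(con, "CREATE TABLE my_table AS SELECT * FROM my_df")
dbExecute(con, "INSERT INTO my_table SELECT * FROM my_df")

# Queries that return rows go through dbGetQuery()
res <- dbGetQuery(con, "SELECT * FROM my_table")

# Close the connection and shut down the database instance
dbDisconnect(con, shutdown = TRUE)
```

Every call takes `con` as its first argument, which is the part a default connection would remove.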

krlmlr avatar Dec 02 '23 10:12 krlmlr

Thank you for your reply. I think it would be great if I could eventually do in R something like the following Python example.

https://duckdb.org/docs/archive/0.9.2/guides/python/import_pandas

import duckdb
import pandas

# Create a Pandas dataframe
my_df = pandas.DataFrame.from_dict({'a': [42]})

# create the table "my_table" from the DataFrame "my_df"
# Note: duckdb.sql connects to the default in-memory database connection
duckdb.sql("CREATE TABLE my_table AS SELECT * FROM my_df")

# insert into the table "my_table" from the DataFrame "my_df"
duckdb.sql("INSERT INTO my_table SELECT * FROM my_df")

Here, specifying the connection and registering the pandas.DataFrame with the database are both done automatically. The R client currently does not automatically register data frames (duckdb/duckdb#6771), but I think that, for now, it would be useful to be able to omit the connection, as the sql function allows.
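A rough sketch of what such a helper could look like if built on top of DBI. The names `duckdb_query`, `default_con`, and `.duckdb_default` are hypothetical (the first is taken from the proposal above); this is not the package's internal implementation, only one possible shape for it.

```r
library(DBI)

# Hypothetical cache that holds the default connection
.duckdb_default <- new.env(parent = emptyenv())

# Hypothetical helper: create the in-memory connection on first use
# and reuse it afterwards
default_con <- function() {
  con <- .duckdb_default$con
  if (is.null(con) || !dbIsValid(con)) {
    con <- dbConnect(duckdb::duckdb())
    .duckdb_default$con <- con
  }
  con
}

# Hypothetical exported function: run a query without naming a connection
duckdb_query <- function(query, conn = default_con()) {
  dbGetQuery(conn, query)
}

# The caller never creates or passes a connection
out <- duckdb_query("SELECT 42 AS a")
```

Data frames would still need to be registered explicitly, for example with `duckdb::duckdb_register(default_con(), "my_df", my_df)`, since this sketch does not attempt automatic registration.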

eitsupi avatar Dec 03 '23 00:12 eitsupi

Can we do a draft/pilot in a separate package, on top of DBI, before committing here?

krlmlr avatar Mar 21 '24 06:03 krlmlr

I don't have time to work on this right away, but I'm imagining something like what polarssql already offers, shown below.

https://rpolars.github.io/r-polarssql/reference/polarssql_query.html

polarssql_register(mtcars = mtcars)

query <- "SELECT * FROM mtcars LIMIT 5"

# Returns a polars LazyFrame
polarssql_query(query)

# Clean up
polarssql_unregister("mtcars")

eitsupi avatar Mar 24 '24 03:03 eitsupi

I noticed the internal duckdb:::sql() function being used in today's DuckDB blog post, which also made me wonder whether it might be worth exporting.

stephhazlitt avatar Oct 10 '24 03:10 stephhazlitt