MonetDBLite-R icon indicating copy to clipboard operation
MonetDBLite-R copied to clipboard

Insert vector from R into existing table column

Open gravesee opened this issue 7 years ago • 3 comments

I would like to use monetDB lite as an alternative to SAS for my medium data problems. To that end I need to add new columns consisting of transformations of existing columns done in R. I have searched the internet and documentation for a solution to this and have not found an answer.

I would like to do something like this:

con <- dbConnect( MonetDBLite::MonetDBLite() , ":memory:" )
dbWriteTable(con, "mtcars", mtcars)

dbSendQuery(con, "ALTER TABLE mtcars ADD COLUMN mpg_per_cyl double")
dbListFields(con, "mtcars")

To update the new column I have tried using databinding, but it seems this only works for one value:

## ?? how to populate new column from R?
insert <- dbSendQuery(con, 'PREPARE INSERT INTO mtcars (mpg_per_cyl) VALUES (?)')

> dbBind(insert, list(mpg_per_cyl=mtcars$mpg/mtcars$cyl))
Error in vapply(params, function(x) { : values must be length 1,
 but FUN(X[[1]]) result is length 32
In addition: Warning message:
In if (is.na(x)) "NULL" else if (is.numeric(x) || is.logical(x)) { :
  the condition has length > 1 and only the first element will be used

The documentation for dbSendUpdate has the only reference to placeholders in the MonetDBLite package I could find. I am a bit new to working with DBs so forgive me if I've overlooked something obvious. Does this capability already exist?

gravesee avatar Feb 18 '18 16:02 gravesee

After adding the column using ALTER TABLE you can of course do something like dbExecute(con, "UPDATE mtcars SET mpg_per_cyl=mpg/cyl"). But if I understand you correctly that value is computed by R? So what you would like to do is to extend your table with an R-computed column?

hannes avatar Feb 18 '18 19:02 hannes

Yes, precisely! I would like to do some computations in R and then either add or update a column using the R vector. I've seen examples that do this by using paste to effectively update row by row, but the column-store nature of MonetDB seems better suited to this model. I hope there is an idiomatic way to accomplish this. This is a fairly common use case for my group where our data is larger than memory but not larger than HDD.

gravesee avatar Feb 18 '18 20:02 gravesee

The stand-alone MonetDB version can already do this with R UDFs (see https://www.monetdb.org/content/embedded-r-monetdb). Does this help?

hannes avatar Feb 19 '18 09:02 hannes