sergeant
sergeant copied to clipboard
Implement write table methods
It would be nice if the seargeant
package could create new tables as I can do with other database drivers. Something like this:
library(sergeant)
library(dplyr)
conn = dbConnect(Drill())
# Load original data
original_data = tbl(conn, "`dfs.downloads`.`original_data `")
# Do some operation
edited_data = original_data %>%
filter(name = "John")
# Write the result to a table without loading it into R
dbWriteTable(conn, table = "`dfs.downloads`.`edited_data`", values = edited_data )
Agreed, but even the ODBC interface has issues with this. i.e. just try doing:
library(DBI)
library(odbc)
library(tidyverse)
DBI::dbConnect(
odbc::odbc(),
driver = "/Library/mapr/drill/lib/libdrillodbc_sbu.dylib",
ConnectionType = "Zookeeper",
AuthenticationType = "No Authentication",
ZKCLusterID = "drillbits1",
ZkQuorum = "localhost:2181",
AdvancedProperties = "CastAnyToVarchar=true;HandshakeTimeout=30;QueryTimeout=180;TimestampTZDisplayTimezone=utc;
ExcludedSchemas=sys,INFORMATION_SCHEMA;NumberOfPrefetchBuffers=5;"
) -> drill_con
dbCreateTable(drill_con, "dfs.tmp.mtcars", mtcars)
One reason for the ODBC interface error is the syntax Drill needs.
One of the only ways I initially went down a path on was to save off a data frame as JSON then use CREATE TABLE x AS SELECT cols FROM thejsonfile
but that wld assume Drill is running on the local host. Trying to generalize that (e.g. having an option to automatically scp the JSON to a remote Drill server, having an option to point the CTAS
call to a specific path assuming it was, say an NFS share, etc) would require doing quite a bit of condition handling and hasn't made it back to the priority list.
I also went down a path of using CTAS
and then adding VALUES()
but the REST interface rly does not like it when you do that with a ton of columns and values.
I'll keep the issue open tho in the event I get some time or others find it and have other ideas on how to make it work.