tidyquery
Support disk.frame objects
See https://github.com/xiaodaigh/disk.frame/issues/196
Blocked by https://github.com/xiaodaigh/disk.frame/issues/197 (using a Mac dev environment)
xiaodaigh/disk.frame#197 is resolved. Now blocked by xiaodaigh/disk.frame#217
Also blocked by https://github.com/xiaodaigh/disk.frame/issues/250
Hey, all blockers are resolved and it is working, but with some bugs. See:
library(disk.frame)
library(tidyquery)
library(nycflights13)  # assumed source of the airports data set
setup_disk.frame()
airports.df <- as.disk.frame(airports)
# this works
airports.df %>%
  query("SELECT name as name1, lat as lat1, lon as lon1 ORDER BY lat DESC") %>%
  collect()
but this doesn't
airports.df %>%
  query("SELECT name, lat, lon as lon1 ORDER BY lat DESC LIMIT 5") %>%
  collect()
complaining about
Error: The SELECT list includes two or more long expressions with no aliases assigned to them. You must assign aliases to these expressions
In addition: There were 17 warnings (use warnings() to see them)
and the warnings()
Warning messages:
1: In readChar(rc, nchars) : truncating string with embedded nuls
2: In readChar(rc, nchars) : truncating string with embedded nuls
3: In readChar(rc, nchars) : truncating string with embedded nuls
4: In readChar(rc, nchars) : truncating string with embedded nuls
5: In readChar(rc, nchars) : truncating string with embedded nuls
6: In readChar(rc, nchars) : truncating string with embedded nuls
7: In readChar(rc, nchars) : truncating string with embedded nuls
8: In readChar(rc, nchars) : truncating string with embedded nuls
9: In readChar(rc, nchars) : truncating string with embedded nuls
10: In readChar(rc, nchars) : truncating string with embedded nuls
11: In readChar(rc, nchars) : truncating string with embedded nuls
12: In readChar(rc, nchars) : truncating string with embedded nuls
13: In readChar(rc, nchars) : truncating string with embedded nuls
14: In readChar(rc, nchars) : truncating string with embedded nuls
15: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
16: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
17: In readChar(rc, 1L, useBytes = TRUE) : truncating string with embedded nuls
18: In arrange.disk.frame(., ...) :
`arrange.disk.frame` is now deprecated. Please use `chunk_arrange` instead. This is in preparation for a more powerful `arrange` that sorts the whole disk.frame
Thanks @xiaodaigh—I'll take a look at this soon
@xiaodaigh this error is happening because colnames() is returning NULL on a disk.frame object. Should I be using names(collect(get_chunk(df, 1))) to get the column names, as you suggest at https://diskframe.com/reference/colnames.html?
I see. The design of disk.frame is a little odd at this stage, so names(get_chunk(df, 1)) should suffice. But it's kind of weird to make you run this disk.frame-specific code. Let me fix colnames for disk.frame.
See https://github.com/xiaodaigh/disk.frame/issues/299
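Until colnames is fixed, the fallback discussed above could be sketched roughly as follows. This is a base R illustration only: chunk_colnames and its first_chunk argument are hypothetical names, and a plain list of data frames stands in for a disk.frame.

```r
# Hypothetical helper: use colnames() when it works, otherwise
# fall back to the names of the first chunk.
chunk_colnames <- function(x, first_chunk = function(x) x[[1]]) {
  nms <- colnames(x)
  if (is.null(nms)) {
    # colnames() gave nothing, so inspect the first chunk instead
    nms <- names(first_chunk(x))
  }
  nms
}

# A plain list of data frames stands in for a chunked object here;
# colnames() returns NULL for it, so the fallback is used.
chunks <- list(mtcars[1:2, ], mtcars[3:4, ])
chunk_colnames(chunks)

# An ordinary data frame takes the direct path.
chunk_colnames(iris)
```

With a real disk.frame, the fallback would presumably be passed as first_chunk = function(x) get_chunk(x, 1), matching the names(get_chunk(df, 1)) suggestion above.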
Another approach, which I think might be better, is to make query an S3 generic, so this would work:
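# generic: dispatches on the class of `data`
query <- function(data, ...) {
  UseMethod("query")
}
# method for ordinary data frames; keeps `...` to match the generic
query.data.frame <- function(data, sql, ...) {
  query_(data, sql, TRUE)
}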
Then on the {disk.frame} side, I can do something like this:
query.disk.frame = create_chunk_mapper(tidyquery::query)
airports.df %>%
  query("SELECT name, lat, lon as lon1") %>%
  collect()
To test, this should definitely work:
airports.df %>%
  query.disk.frame("SELECT name, lat, lon as lon1") %>%
  collect()
This is already on a branch on the {disk.frame} side.
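For anyone following along, here is a self-contained base R sketch of the S3 idea above. It needs neither package. run_query, make_chunk_mapper and the chunked_frame class are all made-up stand-ins for query, create_chunk_mapper and disk.frame, and a column vector stands in for the SQL string. Unlike the real create_chunk_mapper, this toy version applies the function eagerly.

```r
# Toy generic standing in for query(): dispatches on class(data)
run_query <- function(data, ...) {
  UseMethod("run_query")
}

# data.frame method; `cols` is a stand-in for a SQL SELECT list
run_query.data.frame <- function(data, cols, ...) {
  data[, cols, drop = FALSE]
}

# Hypothetical helper: lift a per-data-frame function so that it
# is applied to every chunk of a chunked object
make_chunk_mapper <- function(fn) {
  function(data, ...) {
    out <- lapply(unclass(data), fn, ...)
    structure(out, class = "chunked_frame")
  }
}

# The chunked method is just the generic mapped over the chunks;
# each chunk is a data.frame, so it reaches run_query.data.frame
run_query.chunked_frame <- make_chunk_mapper(run_query)

# Two chunks of mtcars stand in for a disk.frame
chunks <- structure(
  list(head(mtcars, 3), tail(mtcars, 2)),
  class = "chunked_frame"
)

res <- run_query(chunks, c("mpg", "cyl"))

# Rough equivalent of collect(): bind the chunks back together
combined <- do.call(rbind, unclass(res))
```

The point of the design is that the data.frame method stays unaware of chunking. Only the one-line chunked method knows how to fan the call out.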
Closing because {disk.frame} has been soft-deprecated.