disk.frame icon indicating copy to clipboard operation
disk.frame copied to clipboard

Invalid first argument with data.table syntax

Open matthewgson opened this issue 4 years ago • 8 comments

It seems there's a conflict on the recent version of disk.frame. When i argument is not provided within [], it generates error.

> iris.df = as.disk.frame(iris)
> iris.df[,.N]
Error in exists(v, envir = env, inherits = FALSE) : 
  invalid first argument
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
> packageVersion('disk.frame')
[1] ‘0.3.7’
> packageVersion('data.table')
[1] ‘1.13.7’

matthewgson avatar Dec 30 '20 16:12 matthewgson

I'm not experiencing this issue with my windows machine, however.

matthewgson avatar Dec 30 '20 20:12 matthewgson

Seems to be a bug, for now data.table syntax is not well supported. Try iris.df[TRUE, .N]

xiaodaigh avatar Dec 31 '20 07:12 xiaodaigh

@matthewgson I see there's a new version of disk.frame on CRAN. Do you still receive an error?

If you still get an error, does loading the package and running this help? It's a slightly different approach and passes the current tests involving [.disk.frame]

`[.disk.frame` <- function(df, ..., future.globals = NULL, keep = NULL, rbind = TRUE, use.names = TRUE, fill = FALSE, idcol = NULL) {

    res = future.apply::future_lapply(df[[1L]], function(chunk_file) {
        fst::read_fst(chunk_file, as.data.table = TRUE, columns = keep)[...]
    }, future.globals = c("keep", future.globals), future.packages = c("data.table","disk.frame"),
    future.seed=TRUE
    )
    
    if(rbind)
        if (all(vapply(res, is.data.frame, TRUE))) 
            rbindlist(res, use.names = use.names, fill = fill, idcol = idcol)
        else
            unlist(res)
    else 
        res
}

ColeMiller1 avatar Feb 25 '21 02:02 ColeMiller1

@ColeMiller1 Thanks for suggestion, I'll try and reply soon.

matthewgson avatar Feb 25 '21 02:02 matthewgson

`[.disk.frame` <- function(df, ..., future.globals = NULL, keep = NULL, rbind = TRUE, use.names = TRUE, fill = FALSE, idcol = NULL) {

    res = future.apply::future_lapply(df[[1L]], function(chunk_file) {
        fst::read_fst(chunk_file, as.data.table = TRUE, columns = keep)[...]
    }, future.globals = c("keep", future.globals), future.packages = c("data.table","disk.frame"),
    future.seed=TRUE
    )
    
    if(rbind)
        if (all(vapply(res, is.data.frame, TRUE))) 
            rbindlist(res, use.names = use.names, fill = fill, idcol = idcol)
        else
            unlist(res)
    else 
        res
}

Should I review this and consider using it in the package? I am reworking the whole NSE system which is currently quite messy.

xiaodaigh avatar Feb 25 '21 11:02 xiaodaigh

No need to review now. What I am working on is already a bit different.

ColeMiller1 avatar Feb 26 '21 23:02 ColeMiller1

@ColeMiller1 I checked, and it works fine now. Version update 0.3.6 -> 0.4.0. @xiaodaigh should this be closed?

matthewgson avatar Mar 05 '21 10:03 matthewgson

I would need further testing to close it. I am working to increase test coverage.

xiaodaigh avatar Mar 16 '21 22:03 xiaodaigh