anor
Known data.table fread limitation
The skip parameter of the data.table fread function cannot take a value greater than 2500000000. If the database file has more than 2500000000 lines, you need to split the raw database file first.
For example:
/usr/bin/split -l 2499999999 hg19_eigen.txt hg19_eigen.txt_split
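# If the first 2499999999 lines have already been written to the SQLite file, you can start from "ab"
for (i in c("aa", "ab", "ac", "ad")) {
  # Temporarily rename the current chunk to the original file name
  system(sprintf("mv hg19_eigen.txt_split%s hg19_eigen.txt", i))
  new.colnames <- c("#Chr", "Start", "End", "Ref", "Alt", "Eigen")
  # Append this chunk to the existing SQLite database
  annovarR::sqlite.auto.build('eigen', database.dir = './', append = TRUE, new.colnames = new.colnames)
  # Restore the chunk's split name before moving on to the next one
  system(sprintf("mv hg19_eigen.txt hg19_eigen.txt_split%s", i))
}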
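
Before splitting, it can help to check whether a file is actually over the limit. The following is only a minimal sketch: the helper name split_if_needed and its arguments are hypothetical and not part of the package. It counts the lines with wc -l and runs split only when the count exceeds the chunk size, using the same "_split" prefix as the example above.

```shell
# Hypothetical helper (not part of the package): split FILE into chunks of
# at most MAX_LINES lines, but only if FILE has more than MAX_LINES lines.
split_if_needed() {
    file=$1
    max_lines=${2:-2499999999}   # default chunk size taken from the example above

    # Count lines; strip the padding that some wc implementations print
    n_lines=$(wc -l < "$file" | tr -d '[:space:]')

    if [ "$n_lines" -gt "$max_lines" ]; then
        # Produces ${file}_splitaa, ${file}_splitab, ...
        split -l "$max_lines" "$file" "${file}_split"
        echo "split"
    else
        echo "unchanged"
    fi
}
```

For example, split_if_needed hg19_eigen.txt prints "split" and writes the hg19_eigen.txt_split* chunks when the file is too long, and prints "unchanged" otherwise.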