filehash
Unable to dbInsert a .txt file
Dear Roger,
I am trying to dbInsert a large .txt file as a data frame using the read_fwf function from the readr package. The file comes from OECD's PISA 2012 and its size is 1.1GB. It contains the responses to the student questionnaire. I work on a laptop with 4GB of RAM under Arch Linux (64-bit) and have about 250GB of free space on the hard drive. The size of the swap partition is 2GB. Here is the code that I use:
library(filehash)
library(readr)

setwd("/media/work")
dbCreate("tmpDB")
DB <- dbInit("tmpDB")
dbInsert(DB, "x", data.frame(read_fwf(
  file = "/media/PISA_2012/INT_STU12_DEC03.txt",
  fwf_positions(start = ranges.start, end = ranges.end,
                col_names = var.names), progress = FALSE)))
ranges.start, ranges.end and var.names are taken from the .sps file provided with the .txt data file.
The tmpDB file is created and DB is initialized in the R environment. dbInsert runs without any error or warning messages, but after it finishes the size of the tmpDB file is still 0B, dbList(DB) returns character(0), and the key x does not seem to exist.
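The checks described above can be written as a short script that keeps the insert step separate from the verification. This is only a minimal sketch: the small data frame `dat` and the temporary database path `db_path` are hypothetical stand-ins for the PISA data and the real tmpDB, so it shows what a successful insert looks like, not what happens with the 1.1GB file.

```r
library(filehash)

# Hypothetical stand-in for the data frame returned by read_fwf()
dat <- data.frame(id = 1:200, score = rnorm(200))

# Hypothetical database path in R's temporary directory
db_path <- tempfile("checkDB")
dbCreate(db_path)
DB <- dbInit(db_path)

# Insert the already-built data frame under the key "x"
dbInsert(DB, "x", dat)

# The same checks as described above: key present, key listed, file not empty
key_present <- dbExists(DB, "x")
keys <- dbList(DB)
db_size <- file.size(db_path)

print(key_present)
print(keys)
print(db_size)
```

Reading the file into a variable first and calling dbInsert on that variable would at least show whether the read step or the insert step is the one that fails silently.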
I tried with smaller files from the same or previous cycles, and with files of about 500MB it works. I also tried taking just 200 lines from the file I have trouble with, and that works too. I thought this might be due to the size limit of my /tmp folder, which is the system's temporary folder and is limited to 1.8GB. I then installed the unixtools package and used the following to change R's temporary folder and check that it had changed:
> set.tempdir("/media/temp")
> tempdir()
[1] "/media/temp"
> tempfile()
[1] "/media/temp/file8fc7d43a8d6"
I ran the dbInsert code above again. However, the result is the same: tmpDB is still 0B and the x key does not exist.
What would be the reason for this behavior?
Regards