sqlite-lines
sqlite-lines copied to clipboard
The output seems slow to me. Am I doing something wrong?
Hi,
I'm using lines0.so v0.1.1.
If I run this command on this input csv file
sqlite3 tmp.sqlite "select count(*) from lines_read('input.csv');"
I have the output in 0.229 seconds.
If I apply the same to your example ndjson file
sqlite3 tmp.sqlite "select count(*) from lines_read('calendar.ndjson');"
I have the output in 51.493 seconds. Probably I'm doing something wrong.
Using wc <calendar.ndjson wc -l I have the output in 1.053 seconds.
Thank you
Hey @aborruso , that does seem odd - can you run the following script as a file called debug.sql?
.bail on
.timer on
.load ./lines0
select lines_debug();
select count(*) from lines_read('calendar.ndjson');
And then run it with sqlite3 :memory: '.read debug.sql'?
$ sqlite3 :memory: '.read debug.sql'
Version: v0.1.1
Date: 2022-06-22T14:35:59Z+0000
Source: 37c8d2dde4c97395b6af837ec7bd6f7af639e79f
Run Time: real 0.000 user 0.000000 sys 0.000138
321981
Run Time: real 0.126 user 0.066360 sys 0.059196
It runs in ~100ms for me on a digital ocean droplet with 8gb of RAM.
Thank you @asg017
I have used your script, but I have removed .load ./lines0 because I have .load /home/aborruso/library/lines0.so in my ~/.sqliterc file.
This is the result
-- Loading resources from /home/aborruso/.sqliterc
Version: v0.1.1
Date: 2022-06-22T14:35:59Z+0000
Source: 37c8d2dde4c97395b6af837ec7bd6f7af639e79f
Run Time: real 0.000 user 0.000000 sys 0.000172
321981
Run Time: real 52.250 user 0.617902 sys 6.183273
I use the lines0 I have in lines0-linux-amd64.zip
Woah, that's pretty wild to see... Could you try compiling the library yourself and see if that changes anything? It should be straightforward to do, no external deps
git clone [email protected]:asg017/sqlite-lines.git
cd sqlite-lines
mkdir -p dist/
make loadable
sqlite3 :memory: '.timer on' '.load dist/lines0' 'select count(*) from lines_read("calendar.ndjson");'
I'm guessing the pre-compiled library might be way slower for some reason, and possibly compiling yourself may fix it? Weird how it only seems to get tripped up on calendar.json, too
Could you try compiling the library yourself and see if that changes anything? It should be straightforward to do, no external deps
It's the same :(
