sqlite-lines icon indicating copy to clipboard operation
sqlite-lines copied to clipboard

The output seems slow to me. Am I doing something wrong?

Open aborruso opened this issue 3 years ago • 5 comments
trafficstars

Hi, I'm using lines0.so v0.1.1.

If I run this command on this input csv file

sqlite3  tmp.sqlite "select count(*) from lines_read('input.csv');"

I have the output in 0.229 seconds.

If I apply the same to your example ndjson file

sqlite3  tmp.sqlite "select count(*) from lines_read('calendar.ndjson');"

I have the output in 51.493 seconds. Probably I'm doing something wrong. Using wc <calendar.ndjson wc -l I have the output in 1.053 seconds.

Thank you

aborruso avatar Jul 31 '22 09:07 aborruso

Hey @aborruso , that does seem odd - can you run the following script as a file called debug.sql?

.bail on
.timer on

.load ./lines0

select lines_debug();

select count(*) from lines_read('calendar.ndjson');

And then run it with sqlite3 :memory: '.read debug.sql'?

$ sqlite3 :memory: '.read debug.sql'
Version: v0.1.1
Date: 2022-06-22T14:35:59Z+0000
Source: 37c8d2dde4c97395b6af837ec7bd6f7af639e79f
Run Time: real 0.000 user 0.000000 sys 0.000138
321981
Run Time: real 0.126 user 0.066360 sys 0.059196

It runs in ~100ms for me on a digital ocean droplet with 8gb of RAM.

asg017 avatar Jul 31 '22 16:07 asg017

Thank you @asg017 I have used your script, but I have removed .load ./lines0 because I have .load /home/aborruso/library/lines0.so in my ~/.sqliterc file.

This is the result

-- Loading resources from /home/aborruso/.sqliterc
Version: v0.1.1
Date: 2022-06-22T14:35:59Z+0000
Source: 37c8d2dde4c97395b6af837ec7bd6f7af639e79f
Run Time: real 0.000 user 0.000000 sys 0.000172
321981
Run Time: real 52.250 user 0.617902 sys 6.183273

aborruso avatar Jul 31 '22 18:07 aborruso

I use the lines0 I have in lines0-linux-amd64.zip

aborruso avatar Jul 31 '22 21:07 aborruso

Woah, that's pretty wild to see... Could you try compiling the library yourself and see if that changes anything? It should be straightforward to do, no external deps

git clone [email protected]:asg017/sqlite-lines.git
cd sqlite-lines
mkdir -p dist/
make loadable
sqlite3 :memory: '.timer on' '.load dist/lines0' 'select count(*) from lines_read("calendar.ndjson");'

I'm guessing the pre-compiled library might be way slower for some reason, and possibly compiling yourself may fix it? Weird how it only seems to get tripped up on calendar.json, too

asg017 avatar Jul 31 '22 21:07 asg017

Could you try compiling the library yourself and see if that changes anything? It should be straightforward to do, no external deps

It's the same :(

image

aborruso avatar Aug 01 '22 05:08 aborruso