build-your-own-sqlite icon indicating copy to clipboard operation
build-your-own-sqlite copied to clipboard

Index is not required to pass SQLite #nz8 Retrieve data using an index

Open andy1li opened this issue 8 months ago • 4 comments

https://forum.codecrafters.io/t/nz8-test-pass-without-an-index/7811

For the last stage I have not yet implemented the index lookup. Something seems amiss because my code is taking ~7s yet it passes the final stage test…

[tester::#NZ8] Running tests for Stage #NZ8 (Retrieve data using an index) [tester::#NZ8] $ ./your_program.sh test.db "SELECT id, name FROM companies WHERE country = 'montserrat'" [your_program] 4472846|university of science, arts & technology [your_program] 5316703|the abella group llc [your_program] 288999|government of montserrat [your_program] INFO SQLCommand - Query time=7319ms [tester::#NZ8] $ ./your_program.sh test.db "SELECT id, name FROM companies WHERE country = 'north korea'" [your_program] 986681|isn network company limited [your_program] 1573653|initial innovation limited [your_program] 2828420|beacon point ltd [your_program] 3485462|pyongyang university of science & technology (pust) [your_program] 3969653|plastoform industries ltd [your_program] 4271599|korea national insurance corporation [your_program] INFO SQLCommand - Query time=7060ms [tester::#NZ8] Test passed.

andy1li avatar Mar 30 '25 16:03 andy1li

Thanks @HowieTKL for highlighting the issue!

andy1li avatar Mar 30 '25 16:03 andy1li

We picked the time value & database size here based on then performance we were observing at the time, looks like this might've changed w/ the recent upgrades to our test infrastructure.

The real solution here is to somehow restrict or meter the amount of filesystem IO a program makes. We could either do this using cgroups or by creating a Fuse filesystem. We should be able to predict a reasonable upper-bound to how much data needs to be read for a query to be served via an index.

rohitpaulk avatar Apr 01 '25 02:04 rohitpaulk

There's also a huge 10x difference depending on use of Java's memory-mapped ByteBuffer (~3s on my local, ~7s on test machine) vs RandomAccessFile (~30s on my local, timeout on test machine).

Perhaps constraining available RAM for this challenge to ensure we cannot fit the 1GB file into memory will prevent such brute force solutions.

HowieTKL avatar Apr 01 '25 03:04 HowieTKL

@HowieTKL ah, good observation - constraining RAM will actually be easier for us to implement, will keep this in mind when we get to this!

rohitpaulk avatar Apr 03 '25 20:04 rohitpaulk