
Tutorial or how-to on importing large CSVs

Open simonw opened this issue 2 years ago • 4 comments

This came up on Discord: https://discord.com/channels/823971286308356157/823971286941302908/1017006129831739432

If you have a large (e.g. 9.5GB) CSV file it's not obvious how best to import it.

csvs-to-sqlite tries to load the whole thing into RAM, which isn't ideal. sqlite-utils insert can stream it, which is better, but it's still quite slow. The best option is actually to create the table manually and then use sqlite3 .import to import the CSV, as described here: https://til.simonwillison.net/sqlite/import-csv - but it's not exactly obvious!
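A rough sketch of that last approach, driven from Python: create the table yourself, then hand the CSV to the sqlite3 command-line tool. The function names, table name and column list here are hypothetical, and it assumes a sqlite3 CLI on your PATH that is recent enough (3.32 or later, as far as I know) to support the --csv and --skip options of .import. Treat it as an illustration rather than the exact recipe from the TIL.

```python
import shutil
import sqlite3
import subprocess


def create_table(db_path, table, columns):
    """Create the target table up front so the column types are explicit.

    `columns` is a list of (name, sql_type) pairs, e.g. [("id", "INTEGER")].
    """
    column_sql = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_sql})')
        conn.commit()
    finally:
        conn.close()


def build_import_command(db_path, csv_path, table):
    """Build the sqlite3 CLI invocation.

    --csv parses the file as CSV; --skip 1 drops the header row, since the
    table already exists. Assumes paths and table names without spaces.
    """
    return ["sqlite3", db_path, f".import --csv --skip 1 {csv_path} {table}"]


def import_csv(db_path, csv_path, table, columns):
    """Create the table, then let the sqlite3 CLI read the CSV itself."""
    create_table(db_path, table, columns)
    if shutil.which("sqlite3") is None:
        raise RuntimeError("sqlite3 command-line tool not found on PATH")
    subprocess.run(build_import_command(db_path, csv_path, table), check=True)
```

Something like import_csv("data.db", "big.csv", "rows", [("id", "INTEGER"), ("name", "TEXT")]) would then run the import. Python only creates the table and launches the CLI here; the CSV itself is read by sqlite3, not by Python.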

Documentation could help here.

This is actually more of a "how-to" in the https://diataxis.fr/ framework as opposed to a tutorial.

simonw, Sep 07 '22 20:09