node-sqlite3 icon indicating copy to clipboard operation
node-sqlite3 copied to clipboard

RFC: Adding an import method for fast import of CSV

Open antonycourtney opened this issue 7 years ago • 5 comments

I'm the developer of Tad, a fast, free CSV file viewer and pivot-table based UI for exploring and analyzing tabular data, implemented in JavaScript. Tad works by immediately importing target CSV files into sqlite, and using sqlite (via node-sqlite3 of course) for all subsequent analytical operations.

In order to achieve high performance for large CSV files, I extracted the import code implemented in C in the sqlite3 shell, and wrapped it as a library routine in my fork of node-sqlite3. The result is an easy way to import CSV files into sqlite from JavaScript, with great performance.

Usage:

var db = new sqlite3.Database(':memory:');
...
var options = { columnIds: ['colA', 'colB', 'colC'], delimiter: ',' };
db.import(testPath,'staging', options, function (err, res) {
  ...
});

I'm happy with the functionality and implementation, but it's a bit unfortunate that I'm maintaining my own fork of node-sqlite3. I'd love to either upstream this code or move it to a separate library.

Since this may be generally useful to a large number of node-sqlite3 users, would you consider accepting this as a PR?

antonycourtney avatar Nov 26 '18 01:11 antonycourtney

Hi, Thanks for the pull request, I think it is an interesting feature. However, this would require tests to be included before it can go through. Would you be kind enough add tests for this functionality? Please remove the lock files also.

kewde avatar Nov 26 '18 12:11 kewde

Hi @kewde - Thanks for the reply (and sorry for delay in mind). I'd be happy to add tests and remove the lock files. Will do.

antonycourtney avatar Dec 07 '18 00:12 antonycourtney

Hi @kewde - I added a handful of tests and removed the lock files from this PR. Unfortunately I'm having no end of difficult getting this to pass CI. On Linux with NODE_VERSION="9" and earlier it seems to be crashing when running node test/support/createdb.js which is extremely perplexing since that shouldn't be executing any of my code. On the Mac builds, the build seems to fail with for node versions <= 6 because my new code uses std::regex.

Any ideas or suggestions on how I might resolve this? At this point I'm fairly tempted to just move this into its own package (that perhaps only works with more recent versions of node) or go back to maintaining a private fork...

Thanks in advance for any help!

antonycourtney avatar Dec 13 '18 02:12 antonycourtney

The createdb. js file is a pretest. It runs before the tests and set up the environment. It's the first time the newly compiled sqlite3 lib is loaded in JS, it segfaulted and aborted.

kewde avatar Dec 14 '18 02:12 kewde

Still seeing core dumps..

terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)

kewde avatar Jan 28 '19 13:01 kewde