duckdb-wasm
Inserting 0 row JSON or Arrow data causes fatal error
What happens?
If I insert an empty JSON dataset or an empty Arrow dataset, DuckDB fails with an internal error, and then with a fatal error the next time I attempt to use the database. This means that even if I catch the error, I have to restart the database. It would be better if this failed without causing a fatal error. A simple workaround is to add an if statement to my client code before inserting the data, but I wanted to document this issue.
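The client-side check mentioned above could look roughly like the sketch below. Both `hasInsertableRows` and `insertIfNotEmpty` are hypothetical helper names, not part of duckdb-wasm, and the sketch assumes that a dataset with no rows, or with only key-less rows such as `[{}]`, is what triggers the error.

```javascript
// Hypothetical helper (not part of duckdb-wasm): returns true only when the
// dataset is an array in which at least one row has at least one key.
function hasInsertableRows(rows) {
  return (
    Array.isArray(rows) &&
    rows.some(
      (row) => row !== null && typeof row === 'object' && Object.keys(row).length > 0
    )
  );
}

// Hypothetical wrapper: `insert` is whatever performs the real insert. It is
// skipped when there is nothing to insert, so the failing call is never made.
async function insertIfNotEmpty(rows, insert) {
  if (!hasInsertableRows(rows)) {
    return false; // nothing was inserted
  }
  await insert(rows);
  return true;
}

// Possible usage with the names from the repro (conn, arrow, tableName):
//   await insertIfNotEmpty(dataArray, (rows) =>
//     conn.insertArrowTable(arrow.tableFromJSON(rows),
//       {name: tableName, schema: 'main', create: true}));
```

Note that when the insert is skipped the table is not created, so a later query against it only works if the table already exists.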
To Reproduce
import * as duckdb from 'js/node_modules/@duckdb/duckdb-wasm';
import * as arrow from 'js/node_modules/apache-arrow';
const MANUAL_BUNDLES = {
    mvp: {
        mainModule: '/js/node_modules/@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm',
        mainWorker: '/js/node_modules/@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js',
    },
    eh: {
        mainModule: '/js/node_modules/@duckdb/duckdb-wasm/dist/duckdb-eh.wasm',
        mainWorker: '/js/node_modules/@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js',
    },
};
// Select a bundle based on browser checks
const bundle = await duckdb.selectBundle(MANUAL_BUNDLES);
// Instantiate the asynchronous version of DuckDB-wasm
const worker = new Worker(bundle.mainWorker);
let logger = new duckdb.ConsoleLogger();
let db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
//Bigints are not well handled in JavaScript, so we want them to come back as doubles
let dbOpen = await db.open({
    query: {
        castBigIntToDouble: true
    }
});
let conn = await db.connect();
let dataArray = [];
let tableName = 'test';
var arrowTable = arrow.tableFromJSON([{}]);
/* The next line fails with this error:
Uncaught (in promise) Error: INTERNAL Error: Failed to bind "arrow_scan": Table function must return at least one column
at Go.insertArrowFromIPCStream (duckdb-browser-eh.worker.js:11:19758)
at lc.onMessage (duckdb-browser-eh.worker.js:10:58189)
at Zu.globalThis.onmessage (duckdb-browser-eh.worker.js:24:10927)
*/
await conn.insertArrowTable(arrowTable, {name: tableName, schema: 'main', create: true});
// (We never get here since we have already failed, but this is another failure example)
await db.registerFileText('rows.json', JSON.stringify(dataArray));
/* The next line fails with this error:
Uncaught (in promise) Error: INTERNAL Error: Failed to bind "arrow_scan": Table function must return at least one column
at Go.insertJSONFromPath (duckdb-browser-eh.worker.js:11:20446)
at lc.onMessage (duckdb-browser-eh.worker.js:10:58423)
at Zu.globalThis.onmessage (duckdb-browser-eh.worker.js:24:10927)
*/
await conn.insertJSONFromPath('rows.json', {name: tableName});
let result = await conn.query(`select * from test`);
console.log('DuckDB Test Result:',result.toArray());
Browser/Environment:
Chrome Version 112.0.5615.138 (Official Build) (64-bit)
Device:
Windows Laptop
DuckDB-Wasm Version:
1.25.0
DuckDB-Wasm Deployment:
Hosted in house
Full Name:
Alex Monahan
Affiliation:
Intel and DuckDB Labs
Is this related to https://github.com/duckdb/duckdb/issues/7200?
My hunch is that it is not since that error relates to 0 row output, and this is 0 row input. Thanks though!
Hi! Thanks @Alex-Monahan for the report.
I think the solution here will be to rely more on the underlying DuckDB capabilities, reducing the surface for incompatibilities like in this case.
For example, the end point would be having insertJSONFromPath be a wrapper around the JSON extension's read_json_auto, but that's for the near future.
Thanks again for raising this.
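As a rough illustration of the direction described above, the sketch below builds the SQL that a read_json_auto-based insert might run. `buildReadJsonInsert` and the two quoting helpers are hypothetical names, not duckdb-wasm APIs, and the sketch assumes the JSON extension is available in the build and that the file has been registered under the given path. How read_json_auto behaves on an empty array is not covered here.

```javascript
// Hypothetical helpers (not part of duckdb-wasm) for safe SQL quoting.
function quoteIdentifier(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

function quoteLiteral(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Builds a CREATE TABLE ... AS statement that lets DuckDB's own JSON reader
// infer the schema, instead of converting the JSON to Arrow on the client.
function buildReadJsonInsert(path, tableName) {
  return (
    'CREATE TABLE ' + quoteIdentifier(tableName) +
    ' AS SELECT * FROM read_json_auto(' + quoteLiteral(path) + ')'
  );
}

// Possible usage with the names from the repro (conn, tableName):
//   await conn.query(buildReadJsonInsert('rows.json', tableName));
```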