sql.js
sql.js copied to clipboard
Add support for creating SQLite aggregate functions
Currently sql.js
supports creating custom SQL functions via the exposed create_function
function. This allows you to use a locally-defined function from within SQLite queries, such as SELECT custom_func(col) FROM ...
.
SQLite also supports user-defined aggregate functions. This is supported via the same native C SQLite function that create_function
uses, namely sqlite3_create_function
; however, it requires the definition of two additional function pointers to be passed in to the function which are called at certain points during the aggregate calculation process.
This is all document on the SQLite page for sqlite3_create_function
:
The sixth, seventh and eighth parameters, xFunc, xStep and xFinal, are pointers to C-language functions that implement the SQL function or aggregate. A scalar SQL function requires an implementation of the xFunc callback only; NULL pointers must be passed as the xStep and xFinal parameters. An aggregate SQL function requires an implementation of xStep and xFinal and NULL pointer must be passed for xFunc. To delete an existing SQL function or aggregate, pass NULL pointers for all three function callbacks.
Currently we are only defining the first of these (xFunc
) for scalar function support. xStep
and xFinal
would also have to be defined. When a custom aggregate function is used in SQLite, xFunc
is first called then xStep
is called for each value being aggregated, and finally xFinal
is called to finalize the results and return a result.
The definitions for xFunc
and xStep
are identical, while xFinal
does not take a parameter besides the SQLite3 context (since it is a finalizer):
void (*xFunc)(sqlite3_context*,int,sqlite3_value**),
void (*xStep)(sqlite3_context*,int,sqlite3_value**),
void (*xFinal)(sqlite3_context*)
The create_function
method in sql.js
can only support custom scalar functions, a different definition would be needed to support creating aggregate functions.
In Python, scalar function support is implemented in the same way as sql.js
with a single create_function
method. Aggregate function support was implemented via a create_aggregate
function that takes a class with methods step
and finalize
which are called accordingly; for sql.js
I would propose a method defined along the lines of:
create_aggregate(name, func, step, final)
Where func
, step
, and final
are function callbacks. It should be straightforward to understand and use for anyone that is remotely familiar with SQLite custom functions in C or any other language.
Thanks for considering!
This is a very interesting feature request ! I currently don’t have the time to work on this, but I would be happy to review a pull request for this.
I recommend reading the code for create_function
https://github.com/kripken/sql.js/blob/master/coffee/api.coffee#L439-L483
Implementing this shouldn't be hard
Hey @lovasoa I'm trying to implement this, and I was a bit confused by what the addFunction
does here. It doesn't seem to be defined anywhere in the repo? Thanks!
@lovasoa I know you're a busy person but I figured I'd put this here anyway. I saw you recently removed CoffeeScript from the repo. Thank you for that! CoffeeScript was a pain to work with =(
If you change create_function
to,
function setFunctionResult (cx, result) {
switch (typeof result) {
case "boolean":
sqlite3_result_int(cx, result ? 1 : 0);
break;
case "number":
sqlite3_result_double(cx, result);
break;
case "string":
sqlite3_result_text(cx, result, -1, -1);
break;
case "object":
if (result === null) {
sqlite3_result_null(cx);
} else if (result.length != null) {
var blobptr = allocate(result, "i8", ALLOC_NORMAL);
sqlite3_result_blob(cx, blobptr, result.length, -1);
_free(blobptr);
} else {
sqlite3_result_error(cx, (
"Wrong API use : tried to return a value "
+ "of an unknown type (" + result + ")."
), -1);
}
break;
default:
sqlite3_result_null(cx);
}
}
function parseFunctionArguments (argc, argv) {
function extract_blob(ptr) {
var size = sqlite3_value_bytes(ptr);
var blob_ptr = sqlite3_value_blob(ptr);
var blob_arg = new Uint8Array(size);
for (var j = 0; j < size; j += 1) blob_arg[j] = HEAP8[blob_ptr + j];
return blob_arg;
}
var args = [];
for (var i = 0; i < argc; i += 1) {
var value_ptr = getValue(argv + (4 * i), "i32");
var value_type = sqlite3_value_type(value_ptr);
var arg;
if (value_type === SQLITE_INTEGER || value_type === SQLITE_FLOAT) {
arg = sqlite3_value_double(value_ptr);
} else if (value_type === SQLITE_TEXT) {
arg = sqlite3_value_text(value_ptr);
} else if (value_type === SQLITE_BLOB) {
arg = extract_blob(value_ptr);
} else arg = null;
args.push(arg);
}
return args;
}
function invokeWithFunctionArguments (cx, func, args) {
try {
return func.apply(null, args);
} catch (error) {
if (error instanceof Error) {
sqlite3_result_error(cx, customErrorMessage, -1);
} else {
sqlite3_result_error(cx, (typeof error == "string") ? error : String(error), -1);
return undefined;
}
}
}
/** Register a custom function with SQLite
@example Register a simple function
db.create_function("addOne", function (x) {return x+1;})
db.exec("SELECT addOne(1)") // = 2
@param {string} name the name of the function as referenced in SQL statements.
@param {function} func the actual function to be executed.
@return {Database} The database object. Useful for method chaining
*/
Database.prototype["create_function"] = function create_function(name, func, options) {
var func_ptr;
function wrapped_func(cx, argc, argv) {
var args = parseFunctionArguments(argc, argv);
var result = invokeWithFunctionArguments(cx, func, args);
setFunctionResult(cx, result);
}
if (Object.prototype.hasOwnProperty.call(this.functions, name)) {
removeFunction(this.functions[name]);
delete this.functions[name];
}
// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db, int argc, sqlite3_value **argv)
func_ptr = addFunction(wrapped_func, "viii");
this.functions[name] = func_ptr;
this.handleError(sqlite3_create_function_v2(
this.db,
name,
//nArg
//If this parameter is -1,
//then the SQL function or aggregate may take any number of arguments between
//0 and the limit set by sqlite3_limit(SQLITE_LIMIT_FUNCTION_ARG)
(
options.isVarArg === true ?
-1 :
func.length
),
SQLITE_UTF8,
0,
func_ptr,
0,
0,
0
));
return this;
};
You get a few benefits,
- varArg functions are now supported
- Parsing function arguments is abstracted away
- Invoking functions is abstracted away
- Setting the result of a function is abstracted away
Then, to implement create_aggregate
,
Database.prototype["create_aggregate"] = function create_aggregate(name, init, step, finalize) {
let state = undefined;
const wrapped_step = function(cx, argc, argv) {
if (state === undefined) {
state = init();
}
const args = parseFunctionArguments(argc, argv);
invokeWithFunctionArguments(cx, step, [state, ...args]);
};
const wrapped_finalize = function (cx) {
const result = invokeWithFunctionArguments(cx, finalize, [state]);
setFunctionResult(cx, result);
state = undefined;
}
if (Object.prototype.hasOwnProperty.call(this.functions, name)) {
removeFunction(this.functions[name]);
delete this.functions[name];
}
if (Object.prototype.hasOwnProperty.call(this.functions, name + "__finalize")) {
removeFunction(this.functions[name + "__finalize"]);
delete this.functions[name + "__finalize"];
}
// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db, int argc, sqlite3_value **argv)
const step_ptr = addFunction(wrapped_step, "viii");
// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db, int argc, sqlite3_value **argv)
const finalize_ptr = addFunction(wrapped_finalize, "viii");
this.functions[name] = step_ptr;
this.functions[name + "__finalize"] = finalize_ptr;
this.handleError(sqlite3_create_function_v2(
this.db,
name,
//nArg
//If this parameter is -1,
//then the SQL function or aggregate may take any number of arguments between
//0 and the limit set by sqlite3_limit(SQLITE_LIMIT_FUNCTION_ARG)
/**
* @todo Implement vararg aggregate function
*/
step.length - 1,
SQLITE_UTF8,
//pApp
0,
//xFunc
0,
//xStep
step_ptr,
//xFinal
finalize_ptr,
//xDestroy
0
));
return this;
};
The TypeScript signature for create_aggregate
would be,
create_aggregate<StateT> (
functionName : string,
init : () => StateT,
step : (state : StateT, ...args : unknown[]) => void,
finalize : (state : StateT) => unknown
) : this;
You can use it to implement STDDEV_POP
,
create_aggregate(
"STDDEV_POP",
() => {
return {
values : [] as number[],
};
},
(state, x) => {
if (x === null) {
return;
}
if (typeof x == "number") {
state.values.push(x);
} else {
throw new Error(`STDDEV_POP(${typeof x}) not implmented`);
}
},
(state) => {
if (state == undefined) {
return null;
}
if (state.values.length == 0) {
return null;
}
const sum = state.values.reduce(
(sum, value) => sum + value,
0
);
const count = state.values.length;
const avg = sum/count;
const squaredErrors = state.values.map(value => {
return Math.pow(value - avg, 2);
});
const sumSquaredErrors = squaredErrors.reduce(
(sumSquaredErrors, squaredError) => sumSquaredErrors + squaredError,
0
);
return Math.sqrt(
sumSquaredErrors / count
);
}
);
If someone wants to make a PR for it, go for it! I'm kind of pre-occupied with other stuff at the moment.
Instead of addFunction
, it might be nice to have something like,
function addOrReplaceFunction (name, wrapped) {
if (Object.prototype.hasOwnProperty.call(this.functions, name)) {
removeFunction(this.functions[name]);
delete this.functions[name];
}
// The signature of the wrapped function is :
// void wrapped(sqlite3_context *db, int argc, sqlite3_value **argv)
const func_ptr = addFunction(wrapped, "viii");
this.functions[name] = func_ptr;
}
So, it can be re-used in create_function
and create_aggregate
This sounds great @AnyhowStep ! I really encourage you to make a pull request ! You did most of the work anyway. You would just have to add a test and a documentation comment (and maybe mention it in the README). I'm sure this feature would benefit others !
Hi guys! Why https://github.com/sql-js/sql.js/pull/407 is not merged yet? Can I help? :)
If I took #407, updated it, and finished the two open tasks on it, would it have any chance of landing?
I'm willing to take a whack at it but I'd like to confirm that it's the lack of tests and docs holding it up
Hi ! yes, the feature is still welcome ! I can make no promise as to my responsiveness since I don't have a lot of time on my hands for sql.js, but if the code is good, tested, and documented, it will end up merged :)