Michael Jungmair
Currently, IN clauses over a list of scalar values are executed naively: even if the list contains hundreds of values, we generate hundreds of comparisons. This not only hurts...
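To illustrate the difference, here is a minimal sketch that contrasts a comparison chain with a hash-set probe. The names `inListNaive` and `InListSet` are hypothetical and not part of LingoDB; the sketch assumes 64-bit integer values and says nothing about how the generated code would actually be structured.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Naive evaluation: one comparison per list element for every probed value.
bool inListNaive(std::int64_t value, const std::vector<std::int64_t>& list) {
    for (std::int64_t candidate : list) {
        if (value == candidate) return true;
    }
    return false;
}

// Hypothetical alternative: build a hash set once, then probe it per row.
class InListSet {
public:
    explicit InListSet(const std::vector<std::int64_t>& list)
        : values(list.begin(), list.end()) {}

    // Expected constant-time membership test, independent of list length.
    bool contains(std::int64_t value) const {
        return values.find(value) != values.end();
    }

private:
    std::unordered_set<std::int64_t> values;
};
```

The set is built once per query, so the per-row cost no longer grows with the number of values in the list. For very short lists, a comparison chain may still be cheaper.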
Currently, char types (fixed-width strings) are treated as follows in LingoDB: for lengths of up to 8 bytes/chars, integers of appropriate width are used to represent chars below the...
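As a rough illustration of the idea of storing a short fixed-width string in an integer, the sketch below packs up to 8 bytes into a `std::uint64_t`. The function `packChar` is hypothetical, and the byte order and the use of a single 64-bit width are assumptions made for this example, not a description of LingoDB's actual encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: pack at most 8 bytes into one integer.
// Assumption: earlier characters end up in more significant bytes, so
// values of equal length compare in the same order as their bytes.
std::uint64_t packChar(std::string_view s) {
    std::uint64_t packed = 0;
    for (std::size_t i = 0; i < s.size() && i < 8; ++i) {
        packed = (packed << 8) | static_cast<unsigned char>(s[i]);
    }
    return packed;
}
```

With such a representation, equality and ordering checks on short char values reduce to single integer comparisons.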
It would be great to have documentation for LingoDB's MLIR dialects similar to that of [MLIR](https://mlir.llvm.org/docs/Dialects/), and to automate the process split into two parts. 1....
Problems:
- Too many passes/iterations over the IR, which increases optimization time
- Some of the passes are executed multiple times, but are not idempotent. This leads to problems further...
For joins, we currently do not take the number of distinct values into account. Especially for categorical data stored e.g. in strings, our estimates are completely off. Also: we could...
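For reference, one common textbook way to bring distinct-value counts into an equi-join estimate is |R| * |S| / max(d(R.a), d(S.b)). The sketch below implements that formula. The function name `estimateJoinCardinality` and its parameters are hypothetical, and this is not necessarily the estimator LingoDB would adopt.

```cpp
#include <algorithm>

// Textbook equi-join estimate: |R| * |S| / max(d(R.a), d(S.b)).
// The distinct counts are clamped to at least 1 to avoid dividing by zero,
// which degrades the estimate to the cross-product size.
double estimateJoinCardinality(double leftRows, double rightRows,
                               double leftDistinct, double rightDistinct) {
    double maxDistinct = std::max({leftDistinct, rightDistinct, 1.0});
    return leftRows * rightRows / maxDistinct;
}
```

For a categorical column with only a handful of distinct values, the divisor is small and the estimate is correspondingly large, which is the case that estimates ignoring distinct counts tend to get wrong.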
The current implementation returns `int64_t` (epoch nanoseconds) instead of preserving the input timestamp type as PostgreSQL does. The function should (probably) return the same timestamp type as its input argument.
At the moment, erroneous queries are sometimes not rejected by the frontend and usually fail later in the compilation. Example: `select l_shipmode, count(*) from lineitem` This should be fixed. Additionally,...
**1. directly compute hashes for column values in runtime** Currently, hashes are calculated using an embedded SQL query: https://github.com/lingo-db/lingo-db/blob/aa3a3610c503aa8deb6ae88646448474f9f9683b/src/runtime/LingoDBHashIndex.cpp#L54 This introduces quite some overhead... **2. don't use arrow function to...