opteryx
🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
~~~sql
SELECT EXTRACT(YEAR FROM hire_date) AS hire_year, COUNT(*)
FROM employees
GROUP BY hire_year;
~~~
~~~sql
SELECT COUNT(*) AS total_rows,
       COUNT(*) FILTER (WHERE status = 'active') AS active_rows,
       AVG(salary) FILTER (WHERE department = 'engineering') AS avg_engineering_salary
FROM employees;
~~~
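To make the semantics of the `FILTER` clause concrete, here is a minimal Python sketch of what the query above computes. The sample rows and the `filtered_aggregates` helper are hypothetical and only illustrate the expected results; this is not how the engine evaluates the query.

```python
# Hypothetical sample rows standing in for the employees table.
employees = [
    {"status": "active", "department": "engineering", "salary": 120.0},
    {"status": "inactive", "department": "engineering", "salary": 100.0},
    {"status": "active", "department": "sales", "salary": 80.0},
]


def filtered_aggregates(rows):
    """Compute each aggregate over only the rows that pass its own predicate."""
    # COUNT(*) has no filter, so every row counts.
    total_rows = len(rows)
    # COUNT(*) FILTER (WHERE status = 'active')
    active_rows = sum(1 for row in rows if row["status"] == "active")
    # AVG(salary) FILTER (WHERE department = 'engineering'); NULL salaries are skipped.
    salaries = [
        row["salary"]
        for row in rows
        if row["department"] == "engineering" and row["salary"] is not None
    ]
    # AVG over zero rows is NULL in SQL, represented here as None.
    avg_salary = sum(salaries) / len(salaries) if salaries else None
    return {
        "total_rows": total_rows,
        "active_rows": active_rows,
        "avg_engineering_salary": avg_salary,
    }


result = filtered_aggregates(employees)
```

Each aggregate sees a different subset of the same scan, which is what lets one query replace several separately filtered passes.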
This would reduce some of the most complex Python code.
~~~sql
FROM a INNER HASH JOIN b ON a.id = b.id
FROM a INNER NESTED JOIN b ON a.id = b.id
FROM a INNER MERGE JOIN b ON a.id =...
~~~
Do we need a Unicode or similar type for the times we really need Unicode? Could we use NORMALIZE as a modifier at all?
We may be able to use latches to hold an item in the buffer until the read is complete. Reads appear to have a delay due to making copies. We...
https://www.postgresql.org/docs/current/functions-array.html
Test for overlap of two arrays.
Keep @> for literal and column matching only.
Implement
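As a rough illustration of the two tests mentioned here, the sketch below models an overlap check (the `&&` operator in PostgreSQL) and a containment check (`@>`) on plain Python lists. The function names are hypothetical, elements are assumed to be hashable, and NULL handling is simplified by treating `None` as never matching.

```python
def arrays_overlap(left, right):
    """Overlap sketch: True if the two arrays share at least one element."""
    right_values = {value for value in right if value is not None}
    return any(value in right_values for value in left if value is not None)


def array_contains(left, right):
    """Containment sketch: True if every element of right appears in left."""
    left_values = {value for value in left if value is not None}
    # A None element on the right never matches, so it makes the test fail.
    return all(value is not None and value in left_values for value in right)


overlap_hit = arrays_overlap([1, 2, 3], [3, 4])
overlap_miss = arrays_overlap([1, 2], [3, 4])
contains_hit = array_contains([1, 2, 3], [2, 3])
contains_miss = array_contains([1, 2], [2, 3])
```

Overlap needs only one shared element, while containment needs all of the right-hand elements, so the two cannot share an operator without changing meaning.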
This is generally used as a containment test for the LEFT table; rewriting it as a SEMI JOIN will be faster.
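A minimal sketch of why a semi join suits a containment test: it only needs to know whether a matching key exists on the right, so it can build a set of right-hand keys once and emit each left row at most once. The `semi_join` helper and sample rows are hypothetical, not the engine's implementation.

```python
def semi_join(left_rows, right_rows, key):
    """Return the left rows whose key appears in right_rows, each at most once."""
    # Build the key set once; duplicate right-hand keys collapse here.
    right_keys = {row[key] for row in right_rows if row[key] is not None}
    # A left row is kept on the first (and only) membership test, with no row expansion.
    return [row for row in left_rows if row[key] in right_keys]


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"id": 2}, {"id": 2}, {"id": 3}]

kept = semi_join(left, right, "id")
kept_ids = [row["id"] for row in kept]
```

Unlike an inner join, the duplicate `id = 2` on the right does not duplicate the left row, and no right-hand columns need to be carried through.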
The Int64 bloom filter is less accurate than the text bloom filter but is faster. Where we have multiple join columns we should merge the bloom filters - this should improve...
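One possible reading of merging the filters for multiple join columns is to hash the tuple of join values into a single filter, so a probe row passes only if the whole key combination may exist on the build side. The sketch below assumes that reading; the `BloomFilter` class, its sizes, and its hashing scheme are illustrative and unrelated to the engine's actual filters.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter sketch that stores its bits in a Python int."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        data = repr(key).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(data, digest_size=8, salt=bytes([seed])).digest()
            yield int.from_bytes(digest, "little") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> position) & 1 for position in self._positions(key))


# Build side: one filter over the composite key (a, b) rather than one per column.
build_rows = [(1, "x"), (2, "y"), (3, "z")]
composite_filter = BloomFilter()
for row in build_rows:
    composite_filter.add(row)

all_build_keys_pass = all(composite_filter.might_contain(row) for row in build_rows)
```

A Bloom filter never produces false negatives, so every build-side key passes; probe rows that fail the check can be dropped before the join.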
~~~python
import pyarrow as pa
import multiprocessing.shared_memory as shm
import numpy as np
import multiprocessing


def create_shared_arrow_array():
    """Creates a large Arrow array and stores it in shared memory."""
    data =...
~~~