redash
redash copied to clipboard
Added Duckdb query runner
What type of PR is this?
- New Query Runner (Data Source)
Description
This PR adds support for duckdb query runner. DuckDB is an OLAP database used by data professionals, such as data scientists and analysts, to analyze data in a fast and efficient manner . This supports querying tables from duckdb files.
How is this tested?
- Manually
This query runner is tested with redash docker based setup and environment.For sample .duckdb files, it fetches tables and sample queries are also checked.
Related Tickets & Documents
Mobile & Desktop Screenshots/Recordings (if there are UI changes)
I am facing issue while loading schema in schema dashboard. Getting schema refresh failed. For some query runners that works. So how can I do that for this one?
I actually like DuckDB. I think it's worth adding it now as a QR. But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.
@ashutosh6500 hI, think there's a description here. https://redash.io/help/open-source/dev-guide/write-a-query-runner .
You need implement get_schema https://github.com/getredash/redash/blob/master/redash/query_runner/init.py#L230
I actually like DuckDB. I think it's worth adding it now as a QR. But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.
yes
@ashutosh6500 hI, think there's a description here. https://redash.io/help/open-source/dev-guide/write-a-query-runner . You need implement
get_schemahttps://github.com/getredash/redash/blob/master/redash/query_runner/init.py#L230
thanks!
Then we are waiting for you to implement get_schema and tests.
Then we are waiting for you to implement get_schema and tests.
I have implemented get_schema now. What about tests? I didn't get actually.
You can take the test as an example, I think from this folder and do it by analogy https://github.com/getredash/redash/tree/master/tests/query_runner
It is advisable for you to make such a setting for yourself https://github.com/getredash/redash/wiki/Local-development-setup#configuring-pre-commit . And run make format
I could do it for you, but then the lines of code will belong to me, not to you. I would like to avoid this !
I've just added a new commit to this PR, which should (in theory) fix the formatting. Hopefully the CI tests all pass ok now. :smile:
Codecov Report
Attention: 51 lines in your changes are missing coverage. Please review.
Comparison is base (
d333660) 60.78% compared to head (3c7722c) 60.59%. Report is 126 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #6356 +/- ##
==========================================
- Coverage 60.78% 60.59% -0.20%
==========================================
Files 153 154 +1
Lines 12527 12596 +69
Branches 1701 1712 +11
==========================================
+ Hits 7614 7632 +18
- Misses 4687 4738 +51
Partials 226 226
| Files | Coverage Δ | |
|---|---|---|
| redash/settings/__init__.py | 98.65% <ø> (ø) |
|
| redash/query_runner/duckdb.py | 26.08% <26.08%> (ø) |
Well, my commit fixed the formatting ok. It didn't fix the other stuff though. :innocent:
the problem is in ci/cd because it can't find duckdb
Traceback (most recent call last):
File "/app/manage.py", line 6, in <module>
from redash.cli import manager
File "/app/redash/__init__.py", line 53, in <module>
import_query_runners(settings.QUERY_RUNNERS)
File "/app/redash/query_runner/__init__.py", line 436, in import_query_runners
__import__(runner_import)
File "/app/redash/query_runner/duckdb.py", line 3, in <module>
import duckdb
ModuleNotFoundError: No module named 'duckdb'
the problem is in ci/cd because it can't find duckdb
Traceback (most recent call last): File "/app/manage.py", line 6, in <module> from redash.cli import manager File "/app/redash/__init__.py", line 53, in <module> import_query_runners(settings.QUERY_RUNNERS) File "/app/redash/query_runner/__init__.py", line 436, in import_query_runners __import__(runner_import) File "/app/redash/query_runner/duckdb.py", line 3, in <module> import duckdb ModuleNotFoundError: No module named 'duckdb'
will take a look at this. why its not working
@ashutosh6500 Any interest in finishing this off? :smile:
Closing for now - @ashutosh6500 feel free to reopen
I'm interested in picking this up & finishing it off but a few questions:
But maybe you should start saving query results to a local instance of DuckDB, and not to main PG.
Not sure I understand this comment (at least in the context of the PR as it now stands). In my use case, I need to open the duckdb file in read-only mode (since there maybe multiple processes accessing it). Any problem doing so?
I'm not sure what @konnectr meant either, I was under the impression all query results were saved to the Redash internal postgres db.
For now, I'd be comfortable moving this forward if you merged master in / fixed conflicts and added tests. Thanks for picking this back up, I really appreciate it!