Support a python-table-object string in query_results query-runner
Allow adding a table in Python-string format to the SQLite query in order to extend the possibilities of the query_results runner, especially in combination with the Python query runner.
What type of PR is this?
- [x] Feature
Description
The query_results runner is modified to accept Python stringified result tables. This should be especially useful when executing from the Python query runner, where you have intermediate result tables that you can then hand to the query_results runner for further refinement.
The table string has to be preceded by "query_table_start_" and followed by "_query_table_end".
Example:
select * from query_table_start_{'rows': [{'id': 1040}, {'id': 4711}], 'columns': [{'name': 'id', 'friendly_name': 'id', 'type': 'integer'}]}_query_table_end where id = 4711
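A rough, illustrative sketch of what the runner-side handling could look like is below; this is not the actual patch. The marker names come from the format described above, while the regex, helper name, and temporary table names are assumptions:

```python
import re
import sqlite3

# Illustrative sketch only: extract each marked python-table-object string,
# materialize it as a SQLite table, and substitute the table name back into
# the query before it is executed against the in-memory database.
TABLE_STRING_RE = re.compile(r"query_table_start_(.*?)_query_table_end", re.DOTALL)


def replace_table_strings(connection, query):
    for index, match in enumerate(TABLE_STRING_RE.finditer(query)):
        table_name = "tmp_table_{}".format(index)
        # eval() mirrors the behaviour described in this PR; ast.literal_eval
        # would be a safer choice for plain dict/list literals.
        results = eval(match.group(1))
        columns = [col["name"] for col in results["columns"]]
        connection.execute(
            "CREATE TABLE {} ({})".format(table_name, ", ".join(columns))
        )
        placeholders = ", ".join("?" for _ in columns)
        rows = [tuple(row.get(c) for c in columns) for row in results["rows"]]
        connection.executemany(
            "INSERT INTO {} VALUES ({})".format(table_name, placeholders), rows
        )
        query = query.replace(match.group(0), table_name, 1)
    return query


connection = sqlite3.connect(":memory:")
query = (
    "select * from query_table_start_"
    "{'rows': [{'id': 1040}, {'id': 4711}], "
    "'columns': [{'name': 'id', 'friendly_name': 'id', 'type': 'integer'}]}"
    "_query_table_end where id = 4711"
)
print(connection.execute(replace_table_strings(connection, query)).fetchall())
# -> [(4711,)]
```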
I assume that this is not a secure feature, as the table string might contain content that:
- is interpreted as a table end and then injects additional SQL commands,
- creates dangerous non-table objects when the Python eval() command is used to convert the string to an object.
Because of these security concerns, the feature is not enabled by default. It can be enabled by setting the environment variable REDASH_QUERY_RESULTS_ALLOW_PYTHON_TABLE_STRING to "true".
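For reference, the opt-in check could look roughly like this inside query_results.py. The variable name follows the description above; reading it via os.environ is only to keep the sketch self-contained (Redash's own settings helpers may do this differently):

```python
import os

# Illustrative only: the feature stays off unless the environment variable
# described above is explicitly set to "true".
ALLOW_PYTHON_TABLE_STRING = (
    os.environ.get("REDASH_QUERY_RESULTS_ALLOW_PYTHON_TABLE_STRING", "false")
    .strip()
    .lower()
    == "true"
)
```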
I assume it has no special Redash version requirements. It was tested on the redash:10.1.0.b50633 container.
How is this tested?
- [ ] Unit tests (pytest, jest)
- [ ] E2E Tests (Cypress)
- [X] Manually
- [ ] N/A
Ran some requests with and without the new string table, also joined with another query_X table. Tested on the redash:10.1.0.b50633 container by replacing the query_results.py file.
Related Tickets & Documents
Mobile & Desktop Screenshots/Recordings (if there are UI changes)
Hi @K-Lohse, thanks for the PR!
I'm curious why this functionality is useful for you - could you explain what's different between this and using the query results runner with a query like below?
with custom_table(id) as (
values (1040), (4711)
)
select * from custom_table where id = 4711
Additionally, codecov won't let us merge this PR in unless we add tests to cover this patch. Finally, we'd also have to test this against the current tip of master, since we've updated a lot of things since 10.1 came out.
Hi @guidopetri, thanks for your review. To be honest, I did not know of the custom_table SQL functionality.
My use case is more or less described in this discussion. We have a lot of limited JSON APIs that require parameters and whose results need to be combined afterwards. As it is currently not possible to pass parameters from a query_results runner to a subquery, my workaround tries to utilize the Python query runner with the execute_query() function so that all base queries can be run with parameters. These results then need to be combined and refined. To do this with minimum effort, I want to pass the execute_query results directly to the query_results runner to make use of the SQLite functionality.
I assume this approach is still more convenient than creating a custom_table string from the results. Please let me know if you agree and whether I should spend some time getting my head around writing the necessary tests.
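If we go ahead, a unit test along these lines could cover the new code path. This is only a sketch: replace_table_strings is a stand-in name for whatever helper the final patch actually exposes in redash/query_runner/query_results.py:

```python
import sqlite3

# Hypothetical test sketch; the imported helper name is a placeholder for
# whatever the final patch actually exposes.
from redash.query_runner.query_results import replace_table_strings


def test_table_string_is_materialized():
    connection = sqlite3.connect(":memory:")
    query = (
        "select * from query_table_start_"
        "{'rows': [{'id': 1040}, {'id': 4711}], "
        "'columns': [{'name': 'id', 'friendly_name': 'id', 'type': 'integer'}]}"
        "_query_table_end where id = 4711"
    )
    rewritten = replace_table_strings(connection, query)
    assert connection.execute(rewritten).fetchall() == [(4711,)]
```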
I think I still don't see why you'd use this python-table-object string anywhere. Your Python query runner can call execute_query on both (parameterized) queries; then you'll have both results and can join them.
JSON_query_1 --|
               |-> Python query runner -> query results query runner (which uses SQLite)
JSON_query_2 --|
Am I misunderstanding something?
> Am I misunderstanding something?
Your basic thinking is correct, but I assume you are missing that there is no feature to pass a parameter from the query_results runner to the Python runner. If I used the query_results query runner on top of my Python query runner, I could not pass the parameters through to the Python query runner and therefore could not include the parameter choice in the dashboard.
When I instead use the Python query runner, I can use the parameters in each execute_query() call (not in a get_query_result() call) and therefore include the parameters used in all of the queries in the dashboard.
In pseudocode my use case looks something like this:
R1 = execute_query(json-query, "my query string using PARAM1")
R2 = execute_query(json-query, "my query string using PARAM2")
R3 = execute_query(query-results, "my query on R1, joined with R2, using PARAM3")
Embed the graph of R3 with choices for PARAM1, PARAM2 and PARAM3 in the dashboard.
So basically, this is a workaround for the missing feature of the query_results runner to pass parameters to the sub-queries.
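Spelled out a bit more concretely, the Python query runner script for this use case could look roughly like the sketch below. The data source names, column names, and PARAM placeholders are made up, and it assumes the table-string feature from this PR; execute_query() is the helper the Python query runner already provides and returns a dict with 'rows' and 'columns':

```python
# Sketch only: run the parameterized base queries against the JSON data source.
r1 = execute_query("json-query", "my query string using PARAM1")
r2 = execute_query("json-query", "my query string using PARAM2")

# Embed both intermediate results as python-table-object strings and let the
# query_results data source (SQLite) join and filter them.
sql = (
    "select a.id, b.value "
    "from query_table_start_{}_query_table_end a "
    "join query_table_start_{}_query_table_end b on a.id = b.id "
    "where a.id = PARAM3"
).format(r1, r2)

# The Python query runner returns whatever ends up in `result`.
result = execute_query("query-results", sql)
```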
In which one of your queries would you use this functionality where you couldn't use the values in a CTE?
It could also be achieved with the values in a CTE, but it would add more effort for the user in each Python query to escape and transform the results into an SQL VALUES table.
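To illustrate that extra effort, something like the following helper would be needed in every such script to turn an intermediate result into a VALUES CTE by hand (a sketch with simplified quoting that only handles integers and strings):

```python
# Sketch of the CTE alternative: build a "with ... as (values ...)" prefix
# from a result dict, escaping string values along the way.
def result_to_cte(name, results):
    columns = [col["name"] for col in results["columns"]]

    def literal(value):
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        return str(value)

    rows = ", ".join(
        "(" + ", ".join(literal(row.get(c)) for c in columns) + ")"
        for row in results["rows"]
    )
    return "with {}({}) as (values {})".format(name, ", ".join(columns), rows)


r1 = {"rows": [{"id": 1040}, {"id": 4711}],
      "columns": [{"name": "id", "friendly_name": "id", "type": "integer"}]}
print(result_to_cte("custom_table", r1))
# -> with custom_table(id) as (values (1040), (4711))
```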