django-clickhouse-backend
[BUG] - Package taking time to render response.
Describe the bug When the query string is long, the package takes much longer to render the data. I ran a profiler and confirmed that the database returns the result almost instantly (in less than 500 ms), but Django takes 30-35 seconds.
The query string is 575316 characters long (this is the raw SQL length).
With a shorter query string it works fine.
To Reproduce Trigger the API with a long query string (roughly 575316 characters or more).
Expected behavior Django should return a response within 1-2 seconds.
Versions
- ClickHouse server version (can be obtained by running the SELECT version() query): 23.12.1.956
- Python version: 3.11
- Clickhouse-driver version: 0.26
- Django version: 4.2
- Django clickhouse backend version: 1.14
Is it caused by a SELECT query filtering JSON field?
No, we don't have a JSON field. @jayvynl
Let us know if you need more info, we can share. @jayvynl
@vigneshshettyin What does the query look like?
WITH "cte" AS (
SELECT "test_data"."CMO_ID",
uniq("test_data"."CLP_ID") AS "no_clp",
uniq("test_data"."PATIENT_ID") AS "no_clp"
FROM "test_data"
WHERE "test_data"."CODE" IN (..............)
GROUP BY "test_data"."CMO_ID"
) SELECT "entity_ui_ch_entityagg"."id",
"entity_ui_ch_entityagg"."address",
"entity_ui_ch_entityagg"."city",
"entity_ui_ch_entityagg"."dimid_c",
"entity_ui_ch_entityagg"."dx_list",
"entity_ui_ch_entityagg"."list_filter",
"entity_ui_ch_entityagg"."full_name",
"entity_ui_ch_entityagg"."cmo_id",
"entity_ui_ch_entityagg"."dimid_m",
"entity_ui_ch_entityagg"."state",
"entity_ui_ch_entityagg"."data_tax",
COALESCE("cte"."no_clp", 0) AS "no_clp",
COALESCE("cte"."no_clp", 0) AS "no_clp"
FROM "entity_ui_ch_entityagg"
LEFT OUTER JOIN "cte"
ON "entity_ui_ch_entityagg"."cmo_id" = ("cte"."CMO_ID")
ORDER BY 12 DESC
LIMIT 10
Please find the query above, where CODE is the dynamic filter (example codes: PLO92, PLO393, HJO93, ...). With around 10,000 codes the Django app performs well, but with more than 40,000 codes we run into this issue.
@jayvynl
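For anyone trying to reproduce the query size without the real data, here is a minimal sketch that builds an IN filter from synthetic codes and reports its length. The helper names make_codes and build_in_filter are hypothetical, and the code shape (three letters followed by digits) is only an assumption based on the examples above; real codes may differ, so the resulting lengths will not match 575316 exactly.

```python
import random
import string


def make_codes(n, seed=0):
    """Return n unique synthetic codes shaped like 'PLO92' (assumed format)."""
    rng = random.Random(seed)
    codes = set()
    while len(codes) < n:
        prefix = "".join(rng.choices(string.ascii_uppercase, k=3))
        codes.add(f"{prefix}{rng.randint(10, 99999)}")
    return sorted(codes)


def build_in_filter(column, codes):
    """Render a quoted IN (...) filter as it would appear in raw SQL.

    Codes are assumed to be plain alphanumerics, so no escaping is done here.
    """
    quoted = ", ".join(f"'{code}'" for code in codes)
    return f"{column} IN ({quoted})"


if __name__ == "__main__":
    # Compare the filter size at the two code counts mentioned in this thread.
    for n in (10_000, 40_000):
        clause = build_in_filter('"test_data"."CODE"', make_codes(n))
        print(n, "codes ->", len(clause), "characters")
```

The printed lengths only give a rough sense of how the raw SQL grows with the number of codes; they say nothing about where the time is spent.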
As shown in your profiling, clickhouse_driver.varint.read_varint takes most of the execution time, so the problem is not caused by this project.
- Check whether the Django queryset generates the expected SQL query and whether Django returns the correct result.
- Test the same query with another ClickHouse client, for example the ClickHouse HTTP endpoint or the ClickHouse Go driver. If they are significantly faster than clickhouse_driver, you can open an issue with clickhouse_driver.
- I noticed that clickhouse_driver.varint.read_varint was called 17098 times, which is unusual because the query result is limited to 10 rows. I tested a query with a 10-row result, and the function was called only 106 times.
- I also noticed that clickhouse_driver.varint.read_varint spent 4 ms per call in your environment. In my local test environment it spends only 0.04 ms per call. This difference may be caused by network delay, so you should take network delay into account as well.
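To compare call counts and per-call time between environments, something like the following sketch may help. It uses only the standard library cProfile and pstats modules; the helper name count_calls is hypothetical, and the stand-in workload below replaces the real query so that the snippet runs without a ClickHouse server.

```python
import cProfile
import pstats


def count_calls(func, name_fragment):
    """Run func under cProfile and return (calls, own_time_seconds)
    summed over every profiled function whose name contains name_fragment."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    stats = pstats.Stats(profiler)
    calls = 0
    own_time = 0.0
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (_file, _line, funcname), entry in stats.stats.items():
        if name_fragment in funcname:
            calls += entry[1]
            own_time += entry[2]
    return calls, own_time


def read_varint_stub():
    """Stand-in for the driver function; not the real read_varint."""
    return 1


def workload():
    # Stand-in for executing the query; replace with the real call.
    for _ in range(106):
        read_varint_stub()


if __name__ == "__main__":
    calls, own_time = count_calls(workload, "read_varint_stub")
    per_call_ms = 1000 * own_time / calls if calls else 0.0
    print(f"{calls} calls, {per_call_ms:.4f} ms per call")
```

In a real run, workload would be replaced by a callable that evaluates the queryset, and the name fragment would be "read_varint". Running it on both the server and a local machine should show whether the call count, the per-call time, or both differ.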
OK, let me raise an issue at https://github.com/mymarilyn/clickhouse-driver