possible filter and components query needlessly fires twice - and takes a very long time

Open Rdataflow opened this issue 1 year ago • 5 comments

@bprusinowski very similar results are fetched twice - most of which aren't even needed (non-key) - thanks for taking a look

Describe the bug The rework in https://github.com/visualize-admin/visualization-tool/pull/1487 has a small glitch that ends up making a big difference on the milk cubes.

To Reproduce Steps to reproduce the behavior:

  1. Go to https://int.visualize.admin.ch/browse?dataset=https%3A%2F%2Fagriculture.ld.admin.ch%2Ffoag%2Fcube%2FMilkDairyProducts%2FConsumption_Price_Month&dataSource=Int-uncached
  2. Enable Debug mode
  3. Create visualization
  4. See queryplan and GraphQL waterfall

Expected behavior

  • possible filters are fetched once
  • overall cubeComponents are fetched once using a speedy query

Actual behavior

  • possible filters are fetched twice
  • overall cubeComponents are fetched twice, the second query is very slow (and asks for information about every single non-keyDimension which becomes expensive and takes > 20s)
  • OTOH the chart already appears perfectly a few seconds from the start
  • so it's just the left panel (and not the filters) which fires that expensive re-query

Screenshots or video [image]

Environment (please complete the following information):

  • Visualize environment and version: INT 4.7.2

Additional context The query that needlessly re-requests information about all dimensions: get.components.expensive.txt

curl 'https://int.visualize.admin.ch/api/graphql' -X POST -H 'content-type: application/json' --data-raw '{"operationName":"DataCubeComponents","variables":{"locale":"de","sourceType":"sparql","sourceUrl":"https://lindas.admin.ch/query","cubeFilter":{"iri":"https://agriculture.ld.admin.ch/foag/cube/MilkDairyProducts/Consumption_Price_Month","filters":{"https://agriculture.ld.admin.ch/foag/dimension/product":{"type":"single","value":"https://agriculture.ld.admin.ch/foag/product/193"},"https://agriculture.ld.admin.ch/foag/dimension/value-chain-detail":{"type":"single","value":"https://agriculture.ld.admin.ch/foag/value-chain-detail/18"},"https://agriculture.ld.admin.ch/foag/dimension/key-indicator-type":{"type":"single","value":"https://agriculture.ld.admin.ch/foag/key-indicator-type/1"},"https://agriculture.ld.admin.ch/foag/dimension/production-system":{"type":"single","value":"https://agriculture.ld.admin.ch/foag/production-system/3"}},"loadValues":true}},"query":"query DataCubeComponents($sourceType: String!, $sourceUrl: String!, $locale: String!, $cubeFilter: DataCubeComponentFilter!) {\n  dataCubeComponents(\n    sourceType: $sourceType\n    sourceUrl: $sourceUrl\n    locale: $locale\n    cubeFilter: $cubeFilter\n  )\n}\n"}'
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/cost-component> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/currency> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/data-method> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/data-source> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/date> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/foreign-trade> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/key-indicator-type> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/market> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/product> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/product-group> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/production-system> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/product-origin> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/product-properties> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/product-subgroup> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/sales-region> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/unit> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/usage> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/value-chain> } 
VALUES ?dimensionIri { <https://agriculture.ld.admin.ch/foag/dimension/value-chain-detail> }

Rdataflow avatar Jul 09 '24 06:07 Rdataflow

Hey @Rdataflow, thanks for identifying and reporting the issue. I am not 100% sure it's a regression from before the refactor of the query, as there are specific reasons why we need the flow described as problematic above:

  • first query fetches all dimensions with values, including non-key dimensions,
  • we need this information to be able to create initial chart config and a list of possible chart types,
  • we need to access dimension values to derive the initial filters and check if we need to fire possibleFiltersQuery in case default filters result in no observations,
  • once the chart is initialized, we no longer need to send an unfiltered components query; instead we send filtered ones, specifically for the left panel.
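
The initialization flow in the list above can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the visualization-tool code base: the `Dimension`, `SingleFilters` and `Observation` types, the function names, and the assumption that default filters are built from the first value of each key dimension. In the real application the "no observations" check would go through a query rather than a local array.

```typescript
// Hypothetical shapes, for illustration only.
type Dimension = { iri: string; isKeyDimension: boolean; values: string[] };
type SingleFilters = Record<string, string>;
type Observation = Record<string, string>;

// Assumption: the default filter for a key dimension is its first value.
function deriveInitialFilters(dimensions: Dimension[]): SingleFilters {
  const filters: SingleFilters = {};
  for (const dim of dimensions) {
    const first = dim.values[0];
    if (dim.isKeyDimension && first !== undefined) {
      filters[dim.iri] = first;
    }
  }
  return filters;
}

// True when at least one observation matches every filter.
function hasObservations(
  observations: Observation[],
  filters: SingleFilters
): boolean {
  return observations.some((obs) =>
    Object.entries(filters).every(([iri, value]) => obs[iri] === value)
  );
}

// The possible-filters query is only needed when the default filters
// would otherwise lead to an empty chart.
function needsPossibleFiltersQuery(
  observations: Observation[],
  filters: SingleFilters
): boolean {
  return !hasObservations(observations, filters);
}
```

The sketch only shows why dimension values are needed up front: without them, neither the default filters nor the decision about the possible-filters query can be made before the first render.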

To sum up, it looks like the left panel has some duplicated logic, but in fact the first queries are sent when initializing the chart from the cube and are not related to the queries we send from the left filter panel. I wouldn't treat this as a bug, as we introduced this behavior to make sure we show a "correct" chart as soon as it loads, rather than showing a no-data screen first and reloading the queries only afterwards.

I modified the logic a bit to re-use the preview query in https://github.com/visualize-admin/visualization-tool/pull/1697. The downside is that it always fires the possible filters query, unlike the conditional firing in the old logic. Let me know if that explains and improves the situation :)

bprusinowski avatar Aug 29 '24 10:08 bprusinowski

After discussing with @Rdataflow, we'll not merge #1697, but rather exclude non-key dimensions from the most expensive query fired when initializing a chart from cube. The problem with #1697 is that we can no longer easily determine "preferred dimension values", which means that e.g. when opening an NFI: Change cube, we do not select Schweiz anymore (as the top-root hierarchy value), but rather Fribourg.
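
A rough sketch of what "excluding non-key dimensions" could look like when building the initial request. The `DimensionMeta` and `ComponentsRequest` types, the `componentIris` field and the function name are assumptions made for illustration; they are not confirmed to match the actual GraphQL schema or the eventual fix.

```typescript
// Hypothetical shapes, for illustration only.
type DimensionMeta = { iri: string; isKeyDimension: boolean };
type ComponentsRequest = {
  cubeIri: string;
  componentIris: string[]; // assumed field restricting which components are loaded
  loadValues: boolean;
};

// Build the expensive initial request so that it asks for values of key
// dimensions only; non-key dimensions are left out of the request.
function buildInitComponentsRequest(
  cubeIri: string,
  dimensions: DimensionMeta[]
): ComponentsRequest {
  const componentIris = dimensions
    .filter((dim) => dim.isKeyDimension)
    .map((dim) => dim.iri);
  return { cubeIri, componentIris, loadValues: true };
}
```

With a cube like the one in this issue, the request would then list only the key dimensions instead of all 19 dimension IRIs shown in the query plan above.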

bprusinowski avatar Sep 04 '24 13:09 bprusinowski

@bprusinowski can this issue be closed (as the PR will not be merged), or should we keep it open?

sosiology avatar Sep 25 '24 13:09 sosiology

Hi @sosiology, I think there's still one thing connected to this issue that we should improve (excluding non-key dimensions from the most expensive query). I'd keep it open for now 👍

bprusinowski avatar Sep 25 '24 14:09 bprusinowski

got it! Thanks @bprusinowski

sosiology avatar Sep 25 '24 14:09 sosiology