
Duplication of dimension queries

Open joshtemple opened this issue 6 years ago • 3 comments

With the one-query-per-dimension approach, it's much more likely that we'll query the same dimension more than once if it appears in multiple explores. This has the potential to add a lot of overhead to runtime.

I propose that we track the dimension names (namespaced by models, maybe?) that have been queried and skip over anything we've already queried as we loop through other explores and models.
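A minimal sketch of that tracking, assuming the project can be represented as a nested mapping of model name to explore name to dimension names. The `Project` type and `dimensions_to_query` function are hypothetical names for illustration, not existing code in this repo:

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical layout: model name -> explore name -> dimension names.
Project = Dict[str, Dict[str, List[str]]]


def dimensions_to_query(project: Project) -> Iterator[Tuple[str, str, str]]:
    """Yield (model, explore, dimension) once per (model, dimension) pair."""
    seen: Set[Tuple[str, str]] = set()  # names are namespaced by model
    for model, explores in project.items():
        for explore, dimensions in explores.items():
            for dimension in dimensions:
                key = (model, dimension)
                if key in seen:
                    continue  # already queried through an earlier explore
                seen.add(key)
                yield model, explore, dimension
```

Namespacing the key by model means the same dimension name in two different models is still queried once per model.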

joshtemple avatar Jun 17 '19 19:06 joshtemple

This makes sense. A few things come to mind:

  • We would have to start tracking the view that a field comes from, right?
  • Would we want to display an 'error' against all views that the field is in, even if we are only testing it in one of them?
  • What if the join of an explore is broken? There is a risk here that we miss the error if we test all of one view in one explore, and then skip it in subsequent explores.

DylanBaker avatar Jun 18 '19 07:06 DylanBaker

> We would have to start tracking the view that a field comes from, right?

The dimension name returned by the API is of the form view_name.dimension_name so I think that should be easy enough to track.

> Would we want to display an 'error' against all views that the field is in, even if we are only testing it in one of them?

Yes, I think we should.
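One way this could work, as a rough sketch: build an index of everywhere a dimension appears, then fan a single failure out to each location. This assumes "everywhere the field is" can be approximated by the (model, explore) pairs that expose the dimension name; `build_locations` and `fan_out_error` are hypothetical helpers, not existing functions:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical layout: model name -> explore name -> dimension names.
Project = Dict[str, Dict[str, List[str]]]


def build_locations(project: Project) -> Dict[str, List[Tuple[str, str]]]:
    """Map each dimension name to every (model, explore) that exposes it."""
    locations: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for model, explores in project.items():
        for explore, dimensions in explores.items():
            for dimension in dimensions:
                locations[dimension].append((model, explore))
    return dict(locations)


def fan_out_error(
    dimension: str, message: str, locations: Dict[str, List[Tuple[str, str]]]
) -> List[str]:
    """Report one failure against every place the dimension appears."""
    return [
        f"{model}.{explore}: {dimension}: {message}"
        for model, explore in locations.get(dimension, [])
    ]
```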

> What if the join of an explore is broken? There is a risk here that we miss the error if we test all of one view in one explore, and then skip it in subsequent explores.

So this would be a case where the dimension itself is valid, the join in the explore we're testing is valid, but the dimension is being used in a join for another explore... and that join is not valid?

That's true; in that case we would miss the error. What if we check the actual query SQL and skip the query if the same SQL has already been run? You're right, checking the same dimension isn't enough, because the SQL for a dimension could vary depending on which explore it's being run through.
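A rough sketch of SQL-based deduplication, assuming we can get the generated SQL for each query as a string before running it. `sql_fingerprint` and `SqlDeduplicator` are hypothetical names; the whitespace normalization is a simplification that would also collapse whitespace inside string literals:

```python
import hashlib
from typing import Set


def sql_fingerprint(sql: str) -> str:
    """Hash the SQL after collapsing runs of whitespace."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SqlDeduplicator:
    """Remembers which SQL statements have already been run."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def should_run(self, sql: str) -> bool:
        """Return True the first time a given statement is seen."""
        fingerprint = sql_fingerprint(sql)
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True
```

Because the same dimension produces different SQL when it goes through a different join, it would still be run once per distinct join path, which covers the broken-join case.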

joshtemple avatar Jun 18 '19 21:06 joshtemple

> So this would be a case where the dimension itself is valid, the join in the explore we're testing is valid, but the dimension is being used in a join for another explore... and that join is not valid?

It could be trying to join across incompatible data types. But that's a pretty small edge case.

> The dimension name returned by the API is of the form view_name.dimension_name so I think that should be easy enough to track.

I think if the view is aliased in an explore, the field comes back with the alias, not the original view name, which could add a bit of complexity. I think we can get around it though.
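For example, if we had a per-explore mapping from alias to original view name, resolving a field back to its source view could look something like this. It is only a sketch: the `aliases` mapping and `resolve_alias` are hypothetical, and where that mapping would come from is an open question:

```python
from typing import Dict


def resolve_alias(field: str, aliases: Dict[str, str]) -> str:
    """Rewrite alias.dimension_name to view_name.dimension_name."""
    view, separator, dimension = field.partition(".")
    # Fall back to the name as given when the view is not aliased.
    return f"{aliases.get(view, view)}{separator}{dimension}"
```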

> What if we check the actual query SQL and skip it if the same SQL has already been run?

That makes sense to me.

DylanBaker avatar Jun 19 '19 14:06 DylanBaker