dbt-core icon indicating copy to clipboard operation
dbt-core copied to clipboard

[CT-1023] [SPIKE] Use real `ref` in metric compilation logic

Open jtcohen6 opened this issue 1 year ago • 0 comments

Current State

As @jtcohen6 raised in the above issue, there are problems with how the metrics package is currently handling model references. For the curious, this is because metrics can have nested dependencies that resolve as models 1+n nodes upstream. For example, if you have an expression metric that is built on a chain of 4 other metrics, we need to resolve the model(s) reference at the end of that chain.

Right now we construct that information from graph. We can find the edges of the metric dependency, get the model name from the metric, construct that as a relation and then use it in our sql generation. This produces issues, however, when it comes to state deferral.

Desired End State

As most of this will happen behind the scenes, there are few constraints when it comes to resolving this issue. The end goal is that the relationship would be correctly picked up and understood . @jtcohen6 raised a few ideas on how to resolve this:

  • Add dependencies at parse time. Whenever dbt-core detects a call to metric(), it also tries to look up that metric's model dependencies, to record them in depends_on.nodes, too. This is potentially very slow, and I'm not even sure it's possible — the metrics haven't been parsed yet.
  • More complex check in the RuntimeRefResolver. We adjust the logic in RuntimeRefResolver (code linked above) so it checks to ensure that either:
    • the ref'd model is in self.model.depends_on.nodes, OR
    • the ref'd model is upstream of this model, because it's upstream of a metric that is in self.model.depends_on.nodes / self.model.metrics. This feels sorta gross, and like it requires information that's not in scope here.
  • Resolve the ref for the metric, rather than the downstream model. If there were a way to "compile" a metric so that it's the one resolving the ref, everything's golden. We'd need to store the resolved form of the metric's model reference ... in the graph somewhere ... or make it accessible as another Python method.

One thing I do want to call out is that we'd need this functionality to be able to support n+1 node depth. IE not just understanding metric --> model but also understanding metric --> ... --> model.

Would love to hear thoughts about solutions from the languages team!

Originally posted by @callum-mcdata in https://github.com/dbt-labs/dbt-core/issues/5525#issuecomment-1208425135

jtcohen6 avatar Aug 10 '22 13:08 jtcohen6