emma
Add exists() unnesting rule.
The exists unnesting rule should be added to the normalization engine.
This seems like an old issue that is no longer related. Can we close it?
Yes.
Obsolete due to c3130222e33cc7ea24805d44a6c07dd3f958cac4.
@joroKr21 just to clarify, this is still on the TODO list and did not become obsolete due to c313022, it's just not so easy to add it at the moment as we need to integrate some notion of keys and identity in order to make the transformation safe while keeping it within the Bag (and not the Set) monad.
I hope this gets resolved in the future as part of a different line of work.
Ok, I'm confused, I thought we were talking about exists as a fold.
No, we're talking about rewriting expressions like
val dataEngineers = for {
  s <- students
  if studentCourses.withFilter(_.sid == s.id).exists(_.major == "DataScience")
} yield s
into equivalent expressions (which can be translated to joins) of the form
val dataEngineers = (for {
  s <- students
  c <- studentCourses
  if c.sid == s.id
  if c.major == "DataScience"
} yield s).distinct()
This transformation is sound only if the original outer comprehension is without duplicates.
TPC-H Q4 gives a good example of that transformation in SQL (see p. 34 in the TPC-H specification).
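To make the soundness condition concrete, here is a minimal sketch using plain Scala collections as a stand-in for bags. It is not Emma code: the case classes, the object name, and the sample data are assumptions made for illustration, and `filter` replaces `withFilter` because the standard library's `WithFilter` has no `exists`.

```scala
// Hypothetical data model; a Seq stands in for a bag (duplicates allowed).
case class Student(id: Int, name: String)
case class StudentCourse(sid: Int, major: String)

object ExistsUnnesting {
  val studentCourses: Seq[StudentCourse] = Seq(
    StudentCourse(1, "DataScience"),
    StudentCourse(1, "DataScience"), // student 1 matches twice
    StudentCourse(2, "Physics")
  )

  // Original form: exists() nested inside the guard.
  def nested(students: Seq[Student]): Seq[Student] =
    for {
      s <- students
      if studentCourses.filter(_.sid == s.id).exists(_.major == "DataScience")
    } yield s

  // Rewritten form: flat join followed by duplicate elimination.
  def unnested(students: Seq[Student]): Seq[Student] =
    (for {
      s <- students
      c <- studentCourses
      if c.sid == s.id
      if c.major == "DataScience"
    } yield s).distinct

  def main(args: Array[String]): Unit = {
    val unique = Seq(Student(1, "Ann"), Student(2, "Bob"))
    val withDup = Seq(Student(1, "Ann"), Student(1, "Ann"), Student(2, "Bob"))

    // Without duplicates in the outer collection both forms agree.
    println(nested(unique) == unnested(unique))
    // With a duplicated student the nested form keeps both copies,
    // while distinct collapses them, so the results differ.
    println(nested(withDup) == unnested(withDup))
  }
}
```

The second comparison is the problematic case: the final distinct removes not only the duplicates introduced by the join but also those that were already present in students.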
In this case we should reopen.
it's just not so easy to add it at the moment as we need to integrate some notion of keys and identity in order to make the transformation safe while keeping it within the bag (and not set) monad.
In the meantime, we have the field annotation pk, which is exactly this, if I understand correctly, right?
Yes, the next step is to develop an analysis pass over the Emma Core representation that infers key constraints for intermediate results from their inputs. This is actually the main part of the work to close this issue, as we need this information in order to decide whether the rewrite is sound. Implementing the actual rewrite can probably be done in a day or two.
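As a rough illustration of what such an analysis could look like, here is a sketch over a toy plan representation. Everything in it is hypothetical: the Plan type, its operators, and the duplicateFree function are invented for this example and do not reflect the Emma Core representation or any existing Emma API. The analysis is deliberately conservative and answers true only when the absence of duplicates is guaranteed.

```scala
// Hypothetical toy IR, NOT the Emma Core representation.
sealed trait Plan
// hasKey models a source whose element type carries a key field (e.g. one annotated as pk).
case class Source(name: String, hasKey: Boolean) extends Plan
case class Filter(in: Plan) extends Plan
case class Distinct(in: Plan) extends Plan
case class Union(left: Plan, right: Plan) extends Plan // bag union

object KeyInference {
  // Returns true only if the result is guaranteed to contain no duplicates.
  def duplicateFree(p: Plan): Boolean = p match {
    case Source(_, hasKey) => hasKey            // a unique key rules out duplicate elements
    case Filter(in)        => duplicateFree(in) // filtering never introduces duplicates
    case Distinct(_)       => true              // duplicates are removed explicitly
    case Union(_, _)       => false             // bag union may repeat elements
  }
}
```

Under these assumptions, the exists unnesting rewrite would be applied only when duplicateFree holds for the outer generator, for example KeyInference.duplicateFree(Filter(Source("students", hasKey = true))).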