ConstraintStreams: accumulate or outer right join
This pattern is common:
f.forEach(MyEntity)
.groupBy(MyEntity::myGrouping, count())
.join(MinMaxRule, ... same myGrouping ...)
It's also buggy. If nothing is assigned to the grouping, the MinMaxRule isn't checked.
This is because join() is an inner join in SQL talk. We need an outer right join instead. Some models have helper code for that:
@SafeVarargs
private <A, B> BiConstraintStream<A, B> joinOuterRight(UniConstraintStream<A> sourceStream,
Class<B> joinedClass, BiJoiner<A, B>... joiners) {
return sourceStream.join(joinedClass, joiners)
.concat(sourceStream.ifNotExists(joinedClass, joiners));
}
Let's standardize it.
Proposal A)
Add syntactic sugar + tests for Uni/Bi/Tri/QuadConstraintStream.joinOuterRight(...), as shown above.
Later implementations can potentially optimize the implementation, but for now the facade is enough.
Naming to be discussed.
Later we might want to implement a left outer join (.joinOuterLeft()) and a full outer join (.joinOuterFull()) too.
Can we use complement instead?
f.forEach(Lesson)
.groupBy(lesson.dayOfWeek, lesson.teacher, count()) // Tri<DayOfWeek, Teacher, int>
.join(MinLessonsPerDayPerTeacherRule,
equal((d, t, c) -> d, rule.dayOfWeek),
equal((d, t, c) -> t, rule.teacher)) // Quad <DayOfWeek, Teacher, int, Rule>
.complement(... Rule, ...) ???
Proposal C) forEach(Rule).join(Tri<...>) // Rejected
Proposal D)
f.forEach(Lesson)
.groupBy(lesson.dayOfWeek, lesson.teacher, count()) // Tri<DayOfWeek, Teacher, int>
.joinOuterRight(Rule,
equal((d, t, c) -> d, rule.dayOfWeek),
equal((d, t, c) -> t, rule.teacher),
rule -> MONDAY,
rule -> "No teacher",
rule -> 0)
Proposal E)
forEach(Rule)
.join(Lesson.class, ... same dayOfweek+teacher ...)
.complement(Rule, null)
.groupBy(Rule::dayOfWeek, Rule::teacher, count())
This has a buggy count:
Rules:
- Monday Ann min 5
- Monday Beth min 3 Lessons:
- Monday Ann 9:00
- Monday Ann 13:00
=> Monday Ann count is 2 Monday Beth count is 1, but should be 0
Fix: E1)
forEach(Rule)
.join(Lesson.class, ... same dayOfweek+teacher ...)
.complement(Rule, null)
.groupBy(Rule::dayOfWeek, Rule::teacher, countIf(... lesson not null ...))
Con:
- 100 rules. 20k lessons
- => 20k Bi<Rule, Lesson>
- => 100 Tri<DayOfWeek, Teacher, ...>
record Rule(teacher, day, minimumCount) record Lesson(teacher, startDatetime, ...)
F)
forEach(Lesson) .groupBy(lesson.teacher, lesson.dayOfWeek, count()) .complementBi(Teacher.class, DayOfWeek.class, (teacher, dayOfWeek) -> 0) .join(rule, ... equals ...)
Risky: this presumes that the compelentBi is a complete set, such that inner join with rule will not rule out any rules (pun intended)
G) Accumulate
forEach(Rule)
.accumulate(Lesson, count(), ... equals ...) // Bi<Rule, lessonCount>
Philosophy: the user does not have to define the default (zero) value for a collector.
G1) Builder pattern
forEach(Rule)
.accumulateBuilder(Lesson).withJoiners(... equals ...).withCollectors(count()).build())
Also fixes the groupBy compiler struggles:
forEach(Lesson)
.groupByBuilder(lesson.teacher, lesson.dayOfWeek).withCollectors(count())).build()
Currently Lukas and me like the accumulate idea a lot more because it is a more useful building block