Join is Cartesian product?
I'm confused (again?) about the semantics of the join operator. The Rascal interpreter currently computes a flat Cartesian product for it, which does not match my expectation of the "join" concept in relational calculus.
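To make the mismatch concrete, here is a minimal sketch in Python (not Rascal, and not the interpreter's code) that models a relation as a set of tuples. The names `flat_product` and `equi_join` are made up for illustration: the first mirrors the behaviour described above, the second is roughly what I would expect from a join on one pair of columns.

```python
def flat_product(r, s):
    """Concatenate every tuple of r with every tuple of s (Cartesian product)."""
    return {a + b for a in r for b in s}


def equi_join(r, s, i, j):
    """Concatenate only those pairs where column i of r equals column j of s."""
    return {a + b for a in r for b in s if a[i] == b[j]}


r = {(1, 2), (10, 20)}
s = {(2, 3), (20, 30)}

product = flat_product(r, s)    # pairs every tuple with every other tuple
joined = equi_join(r, s, 1, 0)  # keeps only pairs that agree on the join column

print(sorted(product))
print(sorted(joined))
```

With these inputs the product has four tuples, including unrelated pairings such as (1, 2, 20, 30), while the join on the second column of r and the first column of s keeps only the two matching ones.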
Subquestion: compose (o) does implement relational join semantics, on the last column of the first relation and the first column of the second. What is, or should be, the difference between join and compose?
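One possible distinction, sketched in Python under the assumption that relations are sets of tuples: composition matches on the shared column and then drops it, whereas a join would keep it. Both helper names are hypothetical, and `join_last_first` is only one candidate definition, not what Rascal does today.

```python
def compose(r, s):
    """Match the last column of r with the first of s, then drop the shared column."""
    return {a[:-1] + b[1:] for a in r for b in s if a[-1] == b[0]}


def join_last_first(r, s):
    """Same matching rule as compose, but keep the shared column (once)."""
    return {a + b[1:] for a in r for b in s if a[-1] == b[0]}


r = {(1, 10), (2, 20)}
s = {(10, 100), (20, 200)}

composed = compose(r, s)
joined = join_last_first(r, s)

print(sorted(composed))
print(sorted(joined))
```

Under this reading compose yields binary tuples such as (1, 100), while the join yields (1, 10, 100), so compose would be a join followed by a projection.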
Subquestion: in relational calculus the columns on which to join are parameters of the join operation; otherwise the "natural join" joins on the columns with equal names (which also implies the existence of a column-renaming operator). Should we not allow a similar kind of parametrisation? This would be in line with the semantics of the subscript operator on relations, which uses _ to selectively join on certain columns.
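As a rough illustration of what such a parametrisation could look like, the Python sketch below takes the join columns as an explicit list of index pairs. The function name `join_on` and the pair-list parameter are assumptions for this example only; they do not correspond to any existing Rascal syntax.

```python
def join_on(r, s, pairs):
    """Join r and s on the given (column of r, column of s) index pairs.

    An empty list of pairs imposes no condition, so the result is the
    plain Cartesian product.
    """
    return {
        a + b
        for a in r
        for b in s
        if all(a[i] == b[j] for i, j in pairs)
    }


r = {(1, "x", 7), (2, "y", 8)}
s = {("x", 7, True), ("y", 9, False)}

on_one = join_on(r, s, [(1, 0)])          # match on one column pair
on_two = join_on(r, s, [(1, 0), (2, 1)])  # match on two column pairs
on_none = join_on(r, s, [])               # no pairs: Cartesian product

print(len(on_one), len(on_two), len(on_none))
```

Seen this way, the current behaviour is the degenerate case with no join columns at all.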
As a sidestep: join is not very robust against non-wellformed operands:
rascal>{1,2} join {<1,2>}
java.lang.ArrayIndexOutOfBoundsException: 2
(internal error)
at $root$(|main://$root$|)
java.lang.ArrayIndexOutOfBoundsException: 2
at io.usethesource.vallang.type.TupleType.getFieldType(TupleType.java:124)
at org.rascalmpl.interpreter.result.RelationResult.joinSet(RelationResult.java:344)
at org.rascalmpl.interpreter.result.SetResult.join(SetResult.java:51)
at org.rascalmpl.semantics.dynamic.Expression$Join.interpret(Expression.java:1384)
at org.rascalmpl.semantics.dynamic.Command$Expression.interpret(Command.java:61)
at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:1088)
at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:958)
at org.rascalmpl.interpreter.Evaluator.eval(Evaluator.java:913)
at org.rascalmpl.repl.RascalInterpreterREPL.evalStatement(RascalInterpreterREPL.java:137)
at org.rascalmpl.eclipse.repl.RascalTerminalConnector$2.evalStatement(RascalTerminalConnector.java:337)
at org.rascalmpl.repl.BaseRascalREPL.handleInput(BaseRascalREPL.java:112)
at org.rascalmpl.eclipse.repl.RascalTerminalConnector$2.handleInput(RascalTerminalConnector.java:292)
at org.rascalmpl.repl.BaseREPL.handleInput(BaseREPL.java:163)
at org.rascalmpl.repl.BaseREPL.run(BaseREPL.java:324)
at org.rascalmpl.eclipse.repl.RascalTerminalConnector$1.run(RascalTerminalConnector.java:134)
Ouch. Maybe this signals that the current implementation of join is not that useful, so nobody has run into bugs like this with it.
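For comparison, here is a sketch of how a product-style join could tolerate a plain set as an operand instead of crashing. This is Python, not the code in RelationResult.java, and treating a plain value as a one-column tuple is an assumption about the desired behaviour; rejecting the operand with a clear error would be another option. The helper names are hypothetical.

```python
def as_tuple(x):
    """Treat a plain value as a 1-column tuple; leave tuples unchanged."""
    return x if isinstance(x, tuple) else (x,)


def safe_product(r, s):
    """Cartesian product that accepts sets of plain values as well as sets of tuples."""
    return {as_tuple(a) + as_tuple(b) for a in r for b in s}


# Python analogue of the operands from the failing REPL input above.
result = safe_product({1, 2}, {(1, 2)})
print(sorted(result))
```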
I still think we should change join to the natural join from relational calculus, based on the static column names. That would make sense.
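A minimal sketch of that proposal, again in Python with relations modelled as a tuple of column names plus a set of rows. The representation and the name `natural_join` are assumptions for illustration; in Rascal the column names would come from the static relation type.

```python
def natural_join(r_names, r, s_names, s):
    """Join on all columns whose names occur in both relations.

    Shared columns appear once in the result, followed by the
    columns that only s has.
    """
    shared = [n for n in r_names if n in s_names]
    s_extra = [n for n in s_names if n not in r_names]
    r_idx = [r_names.index(n) for n in shared]
    s_idx = [s_names.index(n) for n in shared]
    s_keep = [s_names.index(n) for n in s_extra]
    rows = {
        a + tuple(b[k] for k in s_keep)
        for a in r
        for b in s
        if all(a[i] == b[j] for i, j in zip(r_idx, s_idx))
    }
    return tuple(r_names) + tuple(s_extra), rows


emp_names = ("emp", "dept")
emp = {("ann", "d1"), ("bob", "d2")}
loc_names = ("dept", "floor")
loc = {("d1", 3), ("d3", 5)}

names, rows = natural_join(emp_names, emp, loc_names, loc)
print(names, sorted(rows))
```

When the two relations share no column names, the match condition is empty and the result is the Cartesian product, so the current behaviour would remain available as a special case.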