tremor-runtime
tremor-runtime copied to clipboard
Missing validation on pipeline and connector connection
Problem
It is possible to create a invalid and non obviously wrong configuration when connecting to non existing ports on sources, sinks or pipelines.
Steps
- connect the
inport of a source to theinport of a pipeline - observe no error
- observe no data flowing
Possible Solution(s)
We should validate the choice of ports when connecting different artifacts.
Upon the binding process in https://github.com/tremor-rs/tremor-runtime/blob/main/src/repository/artefact.rs#L487 we should verify that we only ever link
- from an output port (not
in) AND - to an input port (
in)
and return with an error if we don't.
@mfelsche I picked up this issue before I got a chance to read your comment. This patch should do the trick as well. Please let me know, if this isn't what we are looking for :)
@aryan9600 nice patch. This is in principle a good approach, but it will only catch the cases where both ports are the same. We need to check the conditions described above.
I think to really solve this it will require for artifacts to share what input/output ports are available on them so it can be validated not only that they exist, but also that they are going in the right direction right
that we only ever link from an output port (not in) AND to an input port (in)
@mfelsche do you mean that a upstream entity can only have an /out port and a downstream entity an /in port?
An upstream entity can have any output port, except the reserved in port. There is only one in port for input. We might revisit this at some point, but for now this is how it is.
So rather the validation should be something like any upstream entity can have any output port except in and make sure that all downstream entities have a in port? I guess if we validate this we implicitly guarantee that upstream and downstream entities don't have the same ports.
I think it will be a bit more complicated. We need to track all valid input/output ports and check that connections are done in the correct direction.
If we allow or forbid naming input and output ports the same isn't strict as of today. Right now we could have a output port called in and it would work correctly as long as we connect it up correctly.
I think there are arguments for both unique names and unique for input and output and allowing them to exist side by side. For the sake of preventing 'footguns' I would opt for unique names (i.e. we can't have an input and output port with the same name) but that's my personal bias for explicitly not founded in any facts other than "this kind of things have bitten me in the past".