metafacture-core
Modules blocking on `closeStream()` may not forward their data in scripts with wormholes
The behaviour you observed is indeed a bug, @ah641054.
Using wormholes, users can combine data from multiple flows:
Flow A -> @wormhole;
Flow B -> @wormhole;
@wormhole -> Flow C;
Flux first executes flow A and then flow B. Flow C is executed indirectly by receiving data from A and B. Once execution of A and B is finished, closeStream() is called on both flows. After having called closeStream() on flow A, all modules in A and C are closed. Calling closeStream() on flow B then only closes the remaining modules in B. Modules in C ignore the second call.
This works well until a module in flow B attempts to send data before forwarding the closeStream() event (as count-triples does, for instance). That data is then sent to the already closed modules in flow C.
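The ordering problem can be reproduced in a small standalone model. The sketch below does not use the Metafacture API: `Sink` and `Counter` are hypothetical stand-ins for a module in flow C and for a module that only emits its result on closeStream() (in the spirit of count-triples), and the assumption that a closed sink silently drops late records is made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WormholeCloseDemo {

    /** Hypothetical stand-in for a module in flow C (not a Metafacture class). */
    static class Sink {
        final List<String> received = new ArrayList<>();
        final List<String> dropped = new ArrayList<>();
        boolean closed;

        void process(String record) {
            // Assumption: records arriving after close are lost.
            if (closed) {
                dropped.add(record);
            } else {
                received.add(record);
            }
        }

        void closeStream() {
            // The first call closes the sink; later calls change nothing.
            closed = true;
        }
    }

    /** Hypothetical module that emits its result only when it is closed. */
    static class Counter {
        final String name;
        final Sink downstream;
        int count;

        Counter(String name, Sink downstream) {
            this.name = name;
            this.downstream = downstream;
        }

        void process(String record) {
            count++;
        }

        void closeStream() {
            // Data is sent BEFORE the close event is forwarded.
            downstream.process(name + ":" + count);
            downstream.closeStream();
        }
    }

    /** Runs flow A, then flow B, then closes them in that order. */
    static Sink run() {
        Sink flowC = new Sink();
        Counter flowA = new Counter("A", flowC);
        Counter flowB = new Counter("B", flowC);

        flowA.process("x");
        flowA.process("y");
        flowB.process("z");

        flowA.closeStream(); // closes A and, through the wormhole, C
        flowB.closeStream(); // B's result reaches an already closed C
        return flowC;
    }

    public static void main(String[] args) {
        Sink flowC = run();
        System.out.println("received: " + flowC.received);
        System.out.println("dropped:  " + flowC.dropped);
    }
}
```

In this model only the result of flow A reaches flow C; the result of flow B arrives after the sink has been closed.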
Possible solutions for this problem are:
- @mgeipel suggested extending the `Lifecycle` interface with a `flush()` method which is called before `closeStream()`.
- @cboehme suggested adding a wormhole module which acts as a barrier for `closeStream()` calls and forwards them only once each incoming flow has been closed.
Both solutions require significant framework changes. For the time being, a quick fix should be implemented (in version 1.1).
Provided a quick fix in 49f24df by adding wait-for-inputs(N) to Flux.
Example:
data1 | open-file | as-lines | @Y;
data2 | open-file | as-lines | @Y;
@Y | wait-for-inputs("2") | write(out);
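The idea behind waiting for N inputs can be sketched as a counting barrier: records pass through unchanged, while closeStream() is forwarded only after it has been received the expected number of times. This is a minimal standalone model, not the code from the commit; `Receiver`, `RecordingSink` and `CloseBarrier` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseBarrierDemo {

    /** Hypothetical minimal receiver interface (not the Metafacture API). */
    interface Receiver {
        void process(String record);
        void closeStream();
    }

    /** Records what it receives and how often it was closed. */
    static class RecordingSink implements Receiver {
        final List<String> received = new ArrayList<>();
        int closeCalls;
        boolean closed;

        @Override
        public void process(String record) {
            if (!closed) {
                received.add(record);
            }
        }

        @Override
        public void closeStream() {
            closed = true;
            closeCalls++;
        }
    }

    /** Forwards records at once, but closeStream() only on the N-th call. */
    static class CloseBarrier implements Receiver {
        private final int expectedInputs;
        private final Receiver downstream;
        private int closesSeen;

        CloseBarrier(int expectedInputs, Receiver downstream) {
            this.expectedInputs = expectedInputs;
            this.downstream = downstream;
        }

        @Override
        public void process(String record) {
            downstream.process(record);
        }

        @Override
        public void closeStream() {
            closesSeen++;
            if (closesSeen == expectedInputs) {
                downstream.closeStream();
            }
        }
    }

    /** Two inputs feed one barrier, mirroring the two flows into @Y. */
    static RecordingSink run() {
        RecordingSink out = new RecordingSink();
        CloseBarrier barrier = new CloseBarrier(2, out);

        barrier.process("line from data1");
        barrier.closeStream();               // first input finished, held back
        barrier.process("line from data2");  // still accepted downstream
        barrier.closeStream();               // second input finished, forwarded
        return out;
    }

    public static void main(String[] args) {
        RecordingSink out = run();
        System.out.println(out.received + ", close calls: " + out.closeCalls);
    }
}
```

With a barrier of 2, the sink in this model receives both lines and sees exactly one closeStream() call, after the second input has finished.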
See also https://github.com/culturegraph/metafacture-core/tree/master/examples/beacon/create