Increase default value for connection limit (from 10)
This is currently done here
But perhaps decoupling this value from parallelism makes more sense than simply raising parallelism? \cc @jtfmumm
There is also a default value set in the Python SourceConnectorConfig class in the source-migration work branch that should be updated (or allowed to be null and populated by the Pony default). \cc @JONBRWN @nisanharamati
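To make the "allowed to be null" option concrete, here is a minimal Python sketch. It is not the actual SourceConnectorConfig class from the work branch: the class name `SourceConnectorConfigSketch`, the `max_connections` field, and the `RUNTIME_DEFAULT_LIMIT` constant are all hypothetical. The only value taken from this issue is the current default of 10.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the default owned by the Pony runtime.
# The issue states the current default is 10.
RUNTIME_DEFAULT_LIMIT = 10


@dataclass
class SourceConnectorConfigSketch:
    """Illustrative only; not the real SourceConnectorConfig."""

    name: str
    # None means "not set on the Python side, defer to the runtime default".
    max_connections: Optional[int] = None

    def effective_limit(self, runtime_default: int = RUNTIME_DEFAULT_LIMIT) -> int:
        """Return the explicit limit if one was given, else the runtime default."""
        if self.max_connections is None:
            return runtime_default
        if self.max_connections < 1:
            raise ValueError("max_connections must be a positive integer")
        return self.max_connections


if __name__ == "__main__":
    print(SourceConnectorConfigSketch("unset").effective_limit())
    print(SourceConnectorConfigSketch("busy", max_connections=40).effective_limit())
```

With this shape the default lives in one place, so raising it on the Pony side would not require a matching change in the Python class.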
@SeanTAllen @jtfmumm @slfritchie any thoughts on what a reasonable default value would be for this?
In a recent project I used the following, where we knew we wanted at least 40 source connections for a high-throughput pipeline:
.new_pipeline[Array[U8] val, None](
  "Busy pipeline",
  TCPSourceConfig[Array[U8] val]
    .from_options(InboundDecoder,
      TCPSourceConfigCLIParser(env.args)?(0)?
      where parallelism' = 512))
Does that add a significant overhead in the cases where you don't have that many active sources?
Aside from the memory allocations, no, I haven't noticed any overhead ... though none of the tests I've tried so far have been anywhere close to a stress/load test.
From a very quick discussion between John and Nisan:
- Currently, parallelism is used to preset the static topology DAG for barriers. This sets a hard limit on the number of concurrent connections per source.
- John thinks the best approach, when we do address this, is to work out support for a dynamic topology graph for the barriers, which would allow this number to grow dynamically.
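The difference between the two points above can be illustrated with a toy model. This is not Wallaroo code and says nothing about how barriers are actually implemented; `StaticSlots` and `DynamicSlots` are hypothetical names used only to contrast a limit fixed at startup with one that grows on demand.

```python
class StaticSlots:
    """Toy model: capacity is fixed at startup, like a preset parallelism value."""

    def __init__(self, parallelism: int) -> None:
        self.capacity = parallelism
        self.active = set()

    def connect(self, conn_id: str) -> bool:
        """Accept the connection only if a preallocated slot is free."""
        if conn_id not in self.active and len(self.active) >= self.capacity:
            return False  # hard limit reached
        self.active.add(conn_id)
        return True

    def disconnect(self, conn_id: str) -> None:
        self.active.discard(conn_id)


class DynamicSlots(StaticSlots):
    """Toy model: capacity grows when a new connection would exceed it."""

    def connect(self, conn_id: str) -> bool:
        if conn_id not in self.active and len(self.active) >= self.capacity:
            # Assumption: a real system would have to update its barrier
            # topology at this point; here we only bump a counter.
            self.capacity += 1
        self.active.add(conn_id)
        return True


if __name__ == "__main__":
    static = StaticSlots(parallelism=2)
    print([static.connect(c) for c in ("a", "b", "c")])

    dynamic = DynamicSlots(parallelism=2)
    print([dynamic.connect(c) for c in ("a", "b", "c")], dynamic.capacity)
```

In the static model the only way to admit more connections is to start with a larger number, which is what raising the default amounts to. The dynamic model removes that trade-off at the cost of extra work whenever capacity grows.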