Increase default value for connection limit (from 10)

Open nisanharamati opened this issue 6 years ago • 4 comments

This is currently done here

But perhaps decoupling this value from parallelism makes more sense than simply raising parallelism? \cc @jtfmumm

There is also a default value set in the Python SourceConnectorConfig class in the source-migration work branch that should be updated (or allowed to be null and populated by the Pony default). \cc @JONBRWN @nisanharamati
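
For illustration, here is a rough sketch of what the "allow null and let the Pony default fill it in" option might look like on the Python side. The names `ConnectorConfigSketch`, `effective_parallelism`, and `PONY_DEFAULT_PARALLELISM` are hypothetical and are not taken from the actual SourceConnectorConfig class in the branch; the value 10 is simply the current default mentioned in the issue title.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: mirrors the current default of 10 from the issue title.
PONY_DEFAULT_PARALLELISM = 10


@dataclass
class ConnectorConfigSketch:
    """Hypothetical stand-in for a source connector config (not the real class)."""

    name: str
    # None means "not set on the Python side; defer to the Pony default".
    parallelism: Optional[int] = None

    def effective_parallelism(
        self, pony_default: int = PONY_DEFAULT_PARALLELISM
    ) -> int:
        """Return the explicit value if one was given, else the fallback default."""
        if self.parallelism is None:
            return pony_default
        if self.parallelism < 1:
            raise ValueError("parallelism must be a positive integer")
        return self.parallelism
```

With this shape, raising the default later only needs a change in one place, and a pipeline that needs more connections can still pass an explicit value such as `ConnectorConfigSketch("busy", parallelism=512)`.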

@SeanTAllen @jtfmumm @slfritchie any thoughts on what a reasonable default value would be for this?

nisanharamati, Feb 19 '19 23:02

In a recent project, I've used the following, when we knew that we wanted at least 40 source connections for a high-throughput pipeline.

          .new_pipeline[Array[U8] val, None](
            "Busy pipeline",
            TCPSourceConfig[Array[U8] val]
              .from_options(InboundDecoder,
                            TCPSourceConfigCLIParser(env.args)?(0)?
                            where parallelism' = 512))

slfritchie, Feb 20 '19 04:02

Does that add a significant overhead in the cases where you don't have that many active sources?

nisanharamati, Feb 20 '19 17:02

Aside from the memory allocations, no, I haven't noticed any overhead ... though none of the tests I've tried so far have been anywhere close to a stress/load test.

slfritchie, Feb 23 '19 03:02

From a very quick discussion between John and Nisan:

  • Currently, parallelism is used to preset the static topology DAG for barriers. This sets a hard limit on the number of concurrent connections per source.
  • John thinks the best approach, when we do address this, is to figure out support for a dynamic topology graph for the barriers, which would allow this number to grow dynamically.
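
To make the difference between the two points above concrete, here is a toy model in Python. It is not Wallaroo code and says nothing about how barriers are actually wired; `StaticSlots` and `DynamicSlots` are hypothetical names, and the "grow by one slot" policy is only an assumption for illustration.

```python
class StaticSlots:
    """Toy model: the number of connection slots is fixed at startup."""

    def __init__(self, parallelism: int) -> None:
        self.limit = parallelism
        self.active = set()

    def accept(self, conn_id: str) -> bool:
        # Once every preset slot is in use, new connections are refused.
        if len(self.active) >= self.limit:
            return False
        self.active.add(conn_id)
        return True

    def close(self, conn_id: str) -> None:
        # Closing a connection frees its slot for reuse.
        self.active.discard(conn_id)


class DynamicSlots(StaticSlots):
    """Toy model: same interface, but the limit grows when all slots are busy."""

    def accept(self, conn_id: str) -> bool:
        if len(self.active) >= self.limit:
            # Hypothetical growth policy: add one slot on demand.
            self.limit = len(self.active) + 1
        self.active.add(conn_id)
        return True
```

In the static model the only way to admit a 41st connection is to have started with parallelism of at least 41, which matches the workaround of setting parallelism' = 512 up front; in the dynamic model the starting value is just an initial size.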

nisanharamati avatar Apr 25 '19 15:04 nisanharamati