
Improve stability of virtual threads at scale

Open bentsherman opened this issue 1 month ago • 0 comments

See https://nextflow.slack.com/archives/C02T98A23U7/p1715351870737659

When a workflow publishes many files using S3-to-S3 copy, virtual threads are needed to maximize request throughput, since each publish task is simply waiting on an HTTP request and doesn't need to occupy an OS thread the whole time.

However, enabling virtual threads currently creates two problems:

  • The AWS Java SDK imposes its own limit on the maximum number of HTTP connections (aws.client.maxConnections). If this limit isn't also increased, Nextflow will likely fail with a message like "Timeout waiting for connection from pool"

  • Virtual threads have no queue size limit by default, but in practice we might still want to limit the number of in-flight HTTP requests to avoid overloading the network

To this end, I propose the following improvements:

  • Allow virtual threads to be used with a queue size (similar to executor.queueSize) to control the number of concurrent publish tasks

  • Automatically increase the AWS connection limit to match the virtual threads queue size so that the user doesn't have to remember to raise it
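
As a rough illustration of the two proposals above, here is a minimal Java sketch (requires Java 21+). It is not Nextflow's actual implementation: the class name `BoundedVirtualPublisher`, the `queueSize` field and the `maxConnections()` helper are all hypothetical, and the publish tasks are plain `Runnable`s standing in for S3-to-S3 copy requests. Each task gets its own virtual thread, a semaphore caps how many run at once, and the connection limit is derived from the same number instead of being configured separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: virtual threads with a bounded number of in-flight tasks.
class BoundedVirtualPublisher {
    // Hypothetical setting, analogous in spirit to executor.queueSize
    private final int queueSize;
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    BoundedVirtualPublisher(int queueSize) {
        this.queueSize = queueSize;
        this.permits = new Semaphore(queueSize);
    }

    // Assumption: the value that would be fed into aws.client.maxConnections,
    // so the connection pool is never smaller than the number of in-flight tasks.
    int maxConnections() {
        return queueSize;
    }

    // Runs every task on its own virtual thread, with at most queueSize
    // running concurrently. Returns the peak concurrency that was observed.
    int publishAll(List<Runnable> tasks) {
        // close() at the end of the try block waits for all submitted tasks
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (Runnable task : tasks) {
                pool.execute(() -> {
                    // Parked virtual threads are cheap, so blocking here is fine
                    permits.acquireUninterruptibly();
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        task.run(); // stand-in for the S3-to-S3 copy request
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        BoundedVirtualPublisher publisher = new BoundedVirtualPublisher(8);
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            tasks.add(() -> {
                try {
                    Thread.sleep(10); // simulate waiting on an HTTP response
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int observed = publisher.publishAll(tasks);
        System.out.println("peak in-flight tasks: " + observed
                + ", connection limit: " + publisher.maxConnections());
    }
}
```

The semaphore is just one way to do this; the point is that the concurrency cap and the connection limit come from a single setting, so they can't drift apart. Error handling for failed tasks is left out of the sketch.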

bentsherman, May 13 '24 14:05