
Submitting large batch of jobs fails on slow network

Open randomir opened this issue 5 years ago • 2 comments

The default batch size is 20 problems. Submitting 20 full-size problems to an Advantage system/solver amounts to a single SAPI (POST) request with a 9 MB payload. That's almost double the part size in the multipart upload scheme we use. Also, on a typical ADSL line, that's over a minute of upload time.
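For a rough sense of the upload time, here is a back-of-the-envelope sketch. The 1 Mbit/s uplink is an assumed figure for a typical ADSL line, not a measurement, and `upload_seconds` is a hypothetical helper, not part of the client:

```python
def upload_seconds(payload_bytes: int, uplink_bits_per_s: float) -> float:
    """Idealized transfer time: ignores latency, TLS and protocol overhead."""
    return payload_bytes * 8 / uplink_bits_per_s


payload = 9 * 10**6   # 9 MB batch of 20 full-size problems (from the issue)
uplink = 1 * 10**6    # ASSUMPTION: 1 Mbit/s ADSL uplink

print(upload_seconds(payload, uplink))  # 72.0 seconds
```

Real uploads will be slower than this ideal, since the estimate leaves out connection setup and any retransmissions.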

Such a slow upload request has been observed to fail with:

OSError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

randomir avatar Nov 02 '20 20:11 randomir

Proposed fix:

  • [ ] decrease default batch size to 10 (~5 MiB for 10 full Advantage-size problems)
  • [ ] ideally, limit batch size by payload bytes (default to 5 MiB)
  • [ ] decouple connect from read timeout (#440), and increase the default read timeout to 600 s
  • [x] implement retry strategy from #414
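The byte-based limit from the second item could look roughly like the sketch below. It is only an illustration: `batch_by_bytes` and its parameters are hypothetical names, not the client's actual API, and it assumes each problem is already serialized to `bytes`.

```python
from typing import Iterable, Iterator, List


def batch_by_bytes(
    payloads: Iterable[bytes],
    max_bytes: int = 5 * 2**20,  # 5 MiB default, as suggested above
    max_items: int = 10,         # proposed default batch size
) -> Iterator[List[bytes]]:
    """Group serialized problems into batches capped by size and count.

    A single payload larger than max_bytes is still emitted, alone in
    its own batch, so nothing is silently dropped.
    """
    batch: List[bytes] = []
    size = 0
    for payload in payloads:
        too_big = size + len(payload) > max_bytes
        too_many = len(batch) >= max_items
        if batch and (too_big or too_many):
            yield batch
            batch, size = [], 0
        batch.append(payload)
        size += len(payload)
    if batch:
        yield batch


# Usage: payload sizes 3, 3, 3, 7, 1 with a 6-byte cap
problems = [b"a" * 3, b"b" * 3, b"c" * 3, b"d" * 7, b"e"]
print([len(b) for b in batch_by_bytes(problems, max_bytes=6)])  # [2, 1, 1, 1]
```

Capping by both bytes and item count keeps small problems from piling up into very long requests, while large problems are bounded by payload size.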

randomir avatar Nov 02 '20 22:11 randomir