dwave-cloud-client

Job Fails Even with Small Batches

Open tcoulvert opened this issue 4 years ago • 1 comments

I've been running an algorithm for the last couple of weeks and keep getting a 'RemoteDisconnected' error. The linked issue attributes the error mainly to large batch sizes. Through the logger I've confirmed that my posts go out only one at a time, and the total size is less than 105KB (~1KiB). After reading through the other linked issues, I'm unsure whether they would help either, or exactly how to implement them.

Is the connection error simply something that must be solved by fast, stable internet?

Original Issue:

Verified fix:

  • [ ] decrease default batch size to 10 (~5 MiB for 10 full Advantage-size problems)
  • [ ] ideally, limit batch size by payload bytes (default to 5 MiB)
  • [ ] decouple connect from read timeout (#440), and increase the default read timeout to 600 s
  • [x] implement retry strategy from #414

Originally posted by @randomir in https://github.com/dwavesystems/dwave-cloud-client/issues/439#issuecomment-720752001
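
For readers unsure what "limit batch size by payload bytes" means in practice, here is a minimal sketch of the idea. It is not the client's actual implementation: `batch_by_bytes` and its parameters are hypothetical names, and the defaults (5 MiB, 10 items) are simply taken from the checklist above.

```python
from typing import Iterable, Iterator, List


def batch_by_bytes(
    payloads: Iterable[bytes],
    max_bytes: int = 5 * 1024 * 1024,  # 5 MiB, the default proposed in the checklist
    max_items: int = 10,               # batch size proposed in the checklist
) -> Iterator[List[bytes]]:
    """Group encoded problems into batches capped by total bytes and item count.

    Hypothetical helper for illustration only; not part of dwave-cloud-client.
    """
    batch: List[bytes] = []
    size = 0
    for payload in payloads:
        # Close the current batch if adding this payload would break a limit.
        if batch and (size + len(payload) > max_bytes or len(batch) >= max_items):
            yield batch
            batch, size = [], 0
        batch.append(payload)
        size += len(payload)
    if batch:
        yield batch
```

A payload that is larger than `max_bytes` on its own is still sent, alone in its batch, rather than being dropped.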

tcoulvert, Aug 27 '21 16:08

Once we implement #440, extending the read timeout would be worth trying in cases of low bandwidth and/or high latency. We'll have that available for you to try soon (if you don't mind installing from source from master).
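
In the meantime, a caller-side retry around the failing request may be worth trying. The sketch below is a generic retry with exponential backoff and jitter, using only the standard library. It is not the retry strategy from #414: `retry_on_disconnect`, `send`, and the default values are hypothetical. It relies on `http.client.RemoteDisconnected` being a subclass of the built-in `ConnectionError`. If the error reaches your code wrapped by an HTTP library (for example `requests.exceptions.ConnectionError`), you would need to catch that exception type instead.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_disconnect(
    send: Callable[[], T],
    retries: int = 3,          # assumed default, not taken from the client
    base_delay: float = 1.0,   # seconds; assumed default
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying on connection-level failures.

    Hypothetical helper for illustration only; not part of dwave-cloud-client.
    """
    for attempt in range(retries + 1):
        try:
            return send()
        except (ConnectionError, TimeoutError):
            # http.client.RemoteDisconnected is a ConnectionError subclass.
            if attempt == retries:
                raise  # out of retries: surface the original error
            # Exponential backoff plus jitter so retries do not synchronize.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

Here `send` would be a zero-argument callable that wraps whatever submit call is failing. Passing `sleep` as a parameter makes the helper easy to test without real delays.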

randomir, Aug 27 '21 16:08