GetCifar10Data operation (in CIFAR10 example) fails with GME storage

Open brollb opened this issue 4 years ago • 7 comments

There are really two issues here. One is that the following error occurs when using GME storage:

[screenshot: DeepinScreenshot_select-area_20200818085310]

The second issue is that this error isn't handled well: when running the pipeline, execution just seems to stop. Errors thrown when uploading the resulting data should report the operation as failed and show the error.
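The behavior requested above can be sketched as follows. This is only an illustration in Python, not DeepForge's actual (JavaScript) code: `OperationResult`, `run_upload_step`, and `failing_upload` are hypothetical names, and the simulated error stands in for whatever the storage backend throws.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OperationResult:
    """Hypothetical record of how one pipeline operation ended."""
    name: str
    status: str  # "success" or "failed"
    error: Optional[str] = None


def run_upload_step(name: str, upload: Callable[[], None]) -> OperationResult:
    """Run the upload and turn any exception into a visible failure."""
    try:
        upload()
    except Exception as exc:
        # Keep the error type and message so they can be shown to the user.
        return OperationResult(name, "failed", f"{type(exc).__name__}: {exc}")
    return OperationResult(name, "success")


def failing_upload() -> None:
    # Simulated storage error; a stand-in for the real backend failure.
    raise ConnectionResetError("connection reset while uploading")


result = run_upload_step("GetCifar10Data", failing_upload)
print(result.status, "-", result.error)
# prints: failed - ConnectionResetError: connection reset while uploading
```

The point of the sketch is that the upload error ends up attached to the operation instead of leaving the pipeline stalled with no explanation.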

brollb, Aug 18 '20 13:08

For me, using GMEStorage with server execution, the file results.json is somehow never written, which causes the LocalExecutor to fail because it cannot find the file.

umesh-timalsina, Aug 18 '20 20:08

This appears to be an issue with the BlobClient's putFile method failing silently when uploading a stream. Generally, when the job shows up as a success but results.json is not written, it means that something failed when uploading the results to their corresponding storage backends.
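One way to make this kind of silent failure visible is to check for results.json explicitly before treating the job as successful. The snippet below is a minimal sketch in Python under stated assumptions: `load_results` and the working-directory layout are hypothetical, and it does not reproduce the BlobClient or LocalExecutor code.

```python
import json
import tempfile
from pathlib import Path


def load_results(work_dir: Path) -> dict:
    """Read results.json, failing loudly if it was never written."""
    results_path = work_dir / "results.json"
    if not results_path.is_file():
        # A "successful" job without results.json suggests an upload failed.
        raise FileNotFoundError(
            f"{results_path} was not written; "
            "an upload to the storage backend probably failed"
        )
    return json.loads(results_path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)

    # Case 1: the job "succeeded" but results.json is missing.
    try:
        load_results(work_dir)
    except FileNotFoundError as err:
        missing_message = str(err)

    # Case 2: results.json exists and can be read back.
    (work_dir / "results.json").write_text(json.dumps({"data": "ok"}))
    loaded = load_results(work_dir)
```

A check like this only reports the symptom; the underlying upload error would still need to be surfaced by the storage client itself.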

brollb, Aug 19 '20 14:08

As this is caused by a dependency, I am going to remove this from the milestone so it doesn't block the other bug fixes marked for v2.4.1.

brollb, Aug 20 '20 15:08

This is failing for me even using the sciserver-files service.

umesh-timalsina, Sep 04 '20 19:09

Is this resolved?

umesh-timalsina, Oct 20 '20 20:10

Unfortunately not. The first portion of it (error handling) was fixed in the referenced PR from webgme-engine, but the ECONNRESET errors still happen for me.

I had deprioritized this since it wasn't an issue for the deployment and was specific to the GME storage backend, but I will make an issue on webgme-engine now.

brollb, Oct 20 '20 20:10

Just opened an issue: https://github.com/webgme/webgme-engine/issues/240

brollb, Oct 20 '20 20:10