deepforge
GetCifar10Data operation (in CIFAR10 example) fails with GME storage
There are really two issues here. One is that the following error occurs when using GME storage:
The second issue is that this error isn't handled well when running the pipeline: execution just seems to stop. Errors thrown when uploading the resulting data should mark the operation as failed and show the error.
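To illustrate the behavior being asked for, here is a minimal sketch, not DeepForge's actual code. The names `runOperation` and `uploadResults` and the status strings are hypothetical; the only point is that a rejected upload ends up as a reported failure rather than a stalled pipeline.

```javascript
// Hypothetical sketch: `runOperation`, `uploadResults`, and the status
// strings are illustrative names, not DeepForge's real API.
async function runOperation(uploadResults) {
  try {
    const results = await uploadResults();
    return { status: "success", results };
  } catch (err) {
    // Surface the upload failure instead of letting the pipeline stall.
    const message = err instanceof Error ? err.message : String(err);
    return { status: "failed", error: message };
  }
}
```

With this shape, the caller can always inspect `status` and display `error` to the user, whether the upload failed immediately or partway through.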
For me, using GMEStorage with server execution, the file results.json is never written, which causes the LocalExecutor to fail because it cannot find the file.
This appears to be an issue with the BlobClient's putFile method failing silently when uploading a stream. Generally, when the job shows up as a success but results.json is not written, it means that something failed when uploading the results to their corresponding storage backends.
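One way to make this failure mode easier to diagnose is to check for results.json explicitly before reading it. The sketch below is only an assumption about how such a guard could look: `readResults` is a hypothetical helper, and the working-directory layout and error text are invented for illustration rather than taken from the LocalExecutor.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper; the directory layout and error text are assumptions.
function readResults(workDir) {
  const resultsPath = path.join(workDir, "results.json");
  if (!fs.existsSync(resultsPath)) {
    // A missing file usually points at a failed upload, so say so directly.
    throw new Error(
      `results.json not found in ${workDir}; ` +
      "uploading the results to the storage backend may have failed"
    );
  }
  return JSON.parse(fs.readFileSync(resultsPath, "utf8"));
}
```

The guard does not fix the underlying upload problem; it only replaces a generic "file not found" failure with a message that points at the likely cause.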
As this is caused by a dependency, I am going to remove this from the milestone so it doesn't block the other bug fixes marked for v2.4.1.
This is failing for me even when using the sciserver-files service.
Is this resolved?
Unfortunately not. The first part of it (error handling) was fixed in the referenced PR from webgme-engine, but the CONNRESET errors still happen for me.
I had deprioritized this since it wasn't an issue for the deployment and was specific to the GME storage backend, but I will open an issue on webgme-engine now.
Just opened an issue: https://github.com/webgme/webgme-engine/issues/240