deepforge

Data retrieval error for sciserver-files

Open umesh-timalsina opened this issue 4 years ago • 3 comments

Retrieving a file of around 3.6 GiB (3,839,691,267 bytes) from sciserver-files fails with the following error:

Data retrieval failed for data: FetchError: Could not create Buffer from response body for https://apps.sciserver.org/fileservice/api/file/Storage/utimalsina/AstroResearch/OriginalDataset/sdss-data-50000.npz: The value of "length" is out of range. It must be >= 0 && <= 2147483647. Received 3839691267

umesh-timalsina avatar May 12 '20 13:05 umesh-timalsina

The most direct fix would probably be to update the storage adapter API so that getFile returns a ReadableStream rather than a Buffer. Similarly, putFile should probably accept a ReadableStream as the file input (or at least support one without requiring it). From a quick look at the codebase, the following files would need to be updated:

  • [ ] common/plugin/LocalExecutor.js
  • [x] common/storage/index.js
  • [ ] common/plugin/GeneratedFiles.js
  • [x] plugins/UploadArtifact/UploadArtifact.js
  • [ ] common/storage/backends/StorageClient.js
  • [ ] plugins/GenerateJob/templates/run-debug.js
  • [ ] plugins/GenerateJob/templates/start.js
  • [ ] routers/InteractiveCompute/job-files/start-session.js
  • [x] gme client
  • [x] sciserver-files client
  • [x] s3 client
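
As a rough illustration of the proposed API change, here is a minimal sketch of an adapter whose getFile returns a stream and whose putFile accepts either a Buffer or a stream. `MemoryStorageClient` is a hypothetical in-memory stand-in, not one of the real adapters, and "ReadableStream" is assumed to mean Node's `stream.Readable`.

```javascript
const {Readable} = require('stream');

// Hypothetical in-memory adapter, used only to illustrate the API shape.
class MemoryStorageClient {
    constructor() {
        this.files = new Map();  // filename -> array of Buffer chunks
    }

    // Accepts a Buffer or a Readable, so existing Buffer callers keep working.
    async putFile(filename, content) {
        const stream = Buffer.isBuffer(content) ? Readable.from([content]) : content;
        const chunks = [];
        for await (const chunk of stream) {
            chunks.push(Buffer.from(chunk));
        }
        this.files.set(filename, chunks);
    }

    // Returns a Readable instead of concatenating everything into one Buffer.
    async getFile(filename) {
        if (!this.files.has(filename)) {
            throw new Error(`File not found: ${filename}`);
        }
        return Readable.from(this.files.get(filename));
    }
}
```

Callers that only need to move the data somewhere else can then pipe the result of getFile along without ever holding the whole file in memory.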

brollb avatar May 12 '20 14:05 brollb

For the SciServer Files adapter, this will likely mean returning response.body instead of a Buffer, as described at https://developer.mozilla.org/en-US/docs/Web/API/Body/body.
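
A minimal sketch of what that could look like, assuming the adapter uses a node-fetch style `fetch` whose `response.body` is a Node.js Readable. The function names `fetchFileStream` and `downloadTo` are hypothetical, and `fetch` is passed in as a parameter so the sketch has no dependencies.

```javascript
const {pipeline} = require('stream');
const {promisify} = require('util');
const pipelineAsync = promisify(pipeline);

// Hypothetical helper: resolve to the response body stream rather than
// buffering the whole response in memory.
async function fetchFileStream(fetch, url, options = {}) {
    const response = await fetch(url, options);
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }
    return response.body;
}

// Hypothetical helper: copy the response body into any Writable,
// for example one created with fs.createWriteStream.
async function downloadTo(fetch, url, writable) {
    const body = await fetchFileStream(fetch, url);
    await pipelineAsync(body, writable);
}
```

Because the data is handed over chunk by chunk, the size limit on a single Buffer reported in the error above should no longer apply.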

brollb avatar May 12 '20 14:05 brollb

This issue should be handled as follows:

  1. Extend the BlobClient to handle streams (to be upstreamed to webgme-engine)
  2. Extend the Storage Clients to handle streams. One catch here is using non-file streams with the S3 Storage Client (see https://github.com/aws/aws-sdk-js/issues/1157); this should be resolved by using the s3.upload() method.
  3. Change the Plugins to use getStreams whenever possible.
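
For step 2, a minimal sketch of a stream upload to S3, assuming `s3` is an `AWS.S3` instance from aws-sdk v2 and that the linked issue concerns streams whose length cannot be determined up front. `putFileStream` is a hypothetical name, and the client is passed in so the sketch has no dependencies.

```javascript
// Hypothetical helper: upload a Readable of unknown length.
// `s3` is assumed to be an AWS.S3 client from aws-sdk v2.
function putFileStream(s3, bucket, key, stream) {
    // upload() returns a managed upload; .promise() resolves when it completes.
    return s3.upload({Bucket: bucket, Key: key, Body: stream}).promise();
}
```

With the real SDK this would be called as `putFileStream(new AWS.S3(), bucketName, filename, stream)`, where the bucket name and filename come from the storage client's configuration.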

umesh-timalsina avatar May 17 '20 07:05 umesh-timalsina