Add dcc.Upload Limitations section and workarounds

Open tobinngo opened this issue 3 years ago • 2 comments

Addresses #1022


tobinngo avatar Apr 27 '21 21:04 tobinngo

Let's say the limitations are more like 5-10MB rather than 100MB. 100MB is a lot of data to transfer over the network.

I would expand the first paragraph to include a few more notes. Basically, large files will slow things down in a few ways:

  1. Callback network costs: Every time the upload data is used as an input or state to a callback, the data is transferred from the client's browser to the server over the network.
  2. Client's browser memory cost: The uploaded content is held in memory in the browser for the session. Large files can consume a lot of the client's memory and can slow down the responsiveness of the UI interactions.
  3. Server memory cost: Dash is stateless. This means that the uploaded content held in the client's browser session is transferred to the server every time the callback is executed. The file is not saved on the server by default; it is only loaded into memory "when needed" for the callbacks, which keeps the server's baseline memory usage low.
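
To make the network cost concrete, here is a minimal sketch of how the size of an upload could be checked inside a callback. It relies on `dcc.Upload` passing `contents` as a base64-encoded string with a content-type prefix (`data:<mime>;base64,<payload>`). The names `MAX_UPLOAD_BYTES`, `decoded_size`, and `is_within_limit` are hypothetical helpers, not part of Dash, and the 10MB cap is just the upper end of the range suggested above.

```python
import base64

# Hypothetical cap, taken from the 5-10MB guidance in this thread.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def decoded_size(contents):
    """Return the size in bytes of the file behind a `contents` string."""
    # Everything after the first comma is the base64 payload.
    _, _, payload = contents.partition(",")
    # Each 4 base64 characters encode 3 bytes; '=' padding encodes none.
    padding = len(payload) - len(payload.rstrip("="))
    return len(payload) * 3 // 4 - padding


def is_within_limit(contents, limit=MAX_UPLOAD_BYTES):
    """True if the decoded file is no larger than `limit` bytes."""
    return decoded_size(contents) <= limit


# Build a string shaped like what dcc.Upload would hand to a callback.
raw = b"x" * 1000
contents = "data:text/plain;base64," + base64.b64encode(raw).decode("ascii")
print(decoded_size(contents), len(contents))
```

Note that the string sent with each callback is roughly a third larger than the file itself, because of the base64 encoding.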

These can be worked around by:

  1. Architecting your application to rely on small uploads (<5-10MB)
  2. Uploading your data in advance to a separate file service like S3 and allowing the user to select files via a dropdown. This would only work if you know which files would be uploaded in advance.
  3. Using the caching & signalling method in https://dash.plotly.com/sharing-data-between-callbacks so that the upload data is saved on the server in a Redis database and subsequent callbacks that rely on the uploaded data load it from the server instead of having it transferred over the network. This would reduce the "Callback network costs" but wouldn't reduce the "Client's browser memory cost", as the file will still be loaded within the dcc.Upload component in the browser.
  4. Using the community-managed dash_uploader to save the file to the server. This would reduce the "Callback network costs" and the "Client's browser memory cost" but may introduce some more complexity with managing sessions (i.e. which uploaded file corresponds to which user). In Docker-based deployment systems like Dash Enterprise or Heroku, the Docker file system is also ephemeral (wiped clean on every deployment), so these files won't persist between deploys.
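
As a rough illustration of the server-side caching idea in workaround 3, the sketch below stores the decoded upload once and hands back a short token that later callbacks could pass around (for example via a `dcc.Store`) instead of the full file. It is not the method from the linked docs page: a temp directory stands in for Redis, and `CACHE_DIR`, `save_upload`, and `load_upload` are hypothetical names. The random token also gives one simple way to tell different users' uploads apart.

```python
import base64
import tempfile
import uuid
from pathlib import Path

# Hypothetical cache location; a Redis database would play this role in production.
CACHE_DIR = Path(tempfile.gettempdir()) / "dash_upload_cache"


def save_upload(contents, cache_dir=CACHE_DIR):
    """Decode a dcc.Upload `contents` string, store it, and return a token."""
    # `contents` looks like "data:<mime>;base64,<payload>".
    _, _, payload = contents.partition(",")
    cache_dir.mkdir(parents=True, exist_ok=True)
    token = uuid.uuid4().hex  # 32 lowercase hex characters
    (cache_dir / token).write_bytes(base64.b64decode(payload))
    return token


def load_upload(token, cache_dir=CACHE_DIR):
    """Return the stored bytes for a token produced by save_upload."""
    # Accept only uuid4-style hex tokens so a crafted value cannot escape cache_dir.
    if len(token) != 32 or any(c not in "0123456789abcdef" for c in token):
        raise ValueError("invalid upload token")
    return (cache_dir / token).read_bytes()


# The first callback would call save_upload once; later callbacks only see the token.
example = "data:text/csv;base64," + base64.b64encode(b"a,b\n1,2\n").decode("ascii")
token = save_upload(example)
print(len(token), load_upload(token))
```

On platforms with an ephemeral file system, a file-based cache like this one would be wiped on each deploy, which is one reason the thread suggests Redis.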

chriddyp avatar Apr 30 '21 19:04 chriddyp

Also, let's deploy a simple example using dash-uploader to Dash Enterprise just to make sure that it works.

chriddyp avatar Apr 30 '21 19:04 chriddyp