livebook
livebook copied to clipboard
Support file type of `Kino.Input`
This PR supports file type of Kino.Input
.
This PR is related to https://github.com/livebook-dev/kino/pull/153.
Hey @nallwhy, thanks for the PR :) I think we shouldn't keep the file contents in memory, because the files may be arbitrarily large (especially that Livebook operates in data-related domain). We can use path as input value instead. The file would initially be uploaded using LV uploads, this gives us a path within the local file system. However, ideally we don't want to rely on that path within the runtime, because in the future we may support runtimes on a different machine and that wouldn't work. On the other hand, in such case we would need to copy the file to the runtime whenever it's started. We haven't settled as for how we should address file inputs yet. @josevalim feel free to add your thoughts :)
@jonatanklosko I think file input is not that different from text input. Test input also can take large input and cause memory problems. It is just another interface of input.
@nallwhy we want to support uploading potentially 100MBs, which is not very common for text area. So we need to indeed discuss how to upload directly to disk. We also want to make sure it returns a path to a file because we may want to show the user how working with some file-related APIs work, and that would be hard to do without a file input that mimics a file.
I will discuss with @jonatanklosko what are the potential approaches here and follow up. :)
I talked to @chrismccord and we want to allow the LiveView to stream directly to the runtime, but we need to add the proper APIs to LiveView first. :)
I talked to @jonatanklosko and we came up with an approach that should work:
- We will use LiveView uploads for the upload
- When consuming, we will transfer the file to the session process
- The session process should remove the file once the input is removed
- On
Input.read
, we will call a function in the runtime to transfer the file (if necessary). This should be a no-op for some runtimes as we can assume it runs on the same machine -
file_input
will then receive the filename from step 4
This may work for small uploads of a few hundred MB but for larger data I would think LiveBook would want to allow for uploads to go to S3 and/or pull from S3 or other cloud providers. For example, the liveview app I'm working on (https://github.com/mgwidmann/kart_vids) will process potentially GB+ files (these are high resolution 360 videos about 5 minutes long and around 3-4 GB each). I may not end up using Nx and LiveBook to develop the models because I'm still learning and need the greater resources available but I do want to come back and refactor to a pure Elixir approach later.
@mgwidmann the idea is to upload files into disk, then Kino.Input.read(input)
will return just the path, not file content, so you can read or stream the content yourself. With this approach there should be no issue with regard to file size, since we are not storying it in the memory.
For cloud storage, such as S3, you just can upload the files directly and then use some S3 client to pull the file into Livebook.
@nallwhy thanks for the effort! I've just implemented the approach described above in #1622 :)