kirby
kirby copied to clipboard
Chunked file uploads
This PR …
Enhancements
- Panel uploads can now exceed the
upload_max_filesizelimit as they're uploaded in chunks. If you want to restrict the upload size, please use the file blueprintacceptmaxsizeoption: https://getkirby.com/docs/reference/panel/blueprints/file#accept
Ready?
- [x] In-code documentation (wherever needed)
- [x] Unit tests for fixed bug/feature
- [x] Tests and checks all pass
For review team
- [x] Squash commits
- [ ] Add changes & docs to release notes draft in Notion
I've tried to upload a EXE file. Kirby 4.2.0 throws a error The extension "exe" is not allowed but nothing happens for this PR. Seems something broken.
Also I realized that file is still remains when I cancel the upload. Should be deleted. But should be careful when file replacing.
@afbora We cannot delete the tmp file on cancel. But we clean up the files with a garbage collection.
Adding the "two-person review" label as I realized we need to be super careful about the security implications here. There's a lot that could go horribly wrong if we miss anything. I think we are heading in a great direction though 👍
@lukasbestle makes sense. And as written in the initial post, I think we should have a few review rounds as well. Now getting the concept right, later adding unit tests etc. before final review.
We cannot delete the tmp file on cancel. But we clean up the files with a garbage collection.
@distantnative What if we upload to the temp folder, does it make sense? If upload successful, we can move the file to the model root, if not garbage collector take care.
@afbora That's what we are doing here. Just not the system temp folder as that doesn't persist across requests. So the first parts aren't there anymore when the last chunk is received.
Is it theoretically possible to add a config option that will limit the file size?
I've found few issues, could you confirm that?
- File not uploaded without error or log (dialog closed and no file uploaded,) with big files. For me, ~100 MB success, ~300 MB fails.
- I'm getting
A file with the name "filename.zip" already existserror when big files upload from textarea field
@afbora the PR currently is not functional as it's in the middle of implementing changes based on Lukas' feedback and some troubles I encountered (see our comments discussion). Sorry that you tried, but I think any upload fails currently that needs more than one chunk.
@distantnative No problem. I'll test again after PR is ready.
Is it theoretically possible to add a config option that will limit the file size?
Good point. I think it would make sense to make this an option for the file blueprint.
@lukasbestle @tobimori are you thinking of something different than the existing accept maxsize option?
I didn't know there was an option already. Consider my request as solved 😀
Also forgot about maxsize. 😆
@lukasbestle added some more validations – skipping MIME types (will still checked for the full file), but at least basic ones as allowed extensions.
About maxsize: We need to ensure that we check for that in three ways:
- The frontend should check for it before even uploading the first chunk (avoids user frustration). Maybe this already happens.
- The backend should also ensure it by checking the
Upload-Lengthagainst it. - Also there should be a validation for the actual uploaded chunk size. So the size of the current tmp file + the size of the temporary chunk file provided by PHP should be compared to
maxsizebefore streaming over the chunk to our tmp file. Otherwise one could provide anUpload-Lengthwith an allowed size (e.g. 5 MB) but send a chunk with 100 MB. The file would then already take up double that size on the server before Kirby finally validates the size in$page->createFile().
@lukasbestle I am not sure we can follow your suggestions for maxsize. This would require us parsing the blueprints on each of those requests (Panel view, chunk upload). And given that our blueprints currently are always fully resolved (sadly still), this could create real performance issues. Because of that, I think until we resolve blueprints more lazily, we have to stick to the (not ideal) behavior that the accept rules only get evaluated at the end of the full upload.
About maxsize: You are right that it won't work out to do the first check in the frontend as we would need to provide all possible maximum sizes for all allowed file blueprints before we even know if a file will be uploaded at all. However during an active upload, this is different: There I feel like the performance impact of checking the maxsize is negligible compared to the duration of the upload (unless the uploader is somewhere in Switzerland with a symmetric 1 Gbit/s connection 😆).
However with these checks we could prevent attackers from stuffing our custom temp dir with insanely large partially uploaded files that are so far prevented by the blueprint. At the moment, PHP will accept the file first, store it to the system temp folder and then Kirby decides not to copy it elsewhere. So the damage is limited to the upload_max_filesize for a brief moment in time (PHP auto-cleans uploaded files that were not copied during the request). With chunked uploads, an attacker could however initiate an upload of a 1 GB file and upload 999 MB of that. Those 999 MB will reside on the server until cleanup. This could lead to denial of service if done until the disk on the server or quota is full.
@lukasbestle I added a check on the backend for maxsize: both, checking the total length from the header as well as existing tmp plus chunk size.
Would be great indeed. I've rebased and squashed the commits already.
@distantnative I'm getting A file with the name "filename.zip" already exists error when big files upload from textarea field. Same file uploads without error from files section.
@afbora good catch. Can reproduce myself. Somehow already after the first chunk they get moved to the content folder. But need a fresher mind to investigate why.
Maybe a different route processes these so the Upload-Length header is ignored and the new chunk logic on the backend doesn't get executed?
Indeed it is. So I am thinking to actually move the Upload::chunk() call here: https://github.com/getkirby/kirby/blob/main/src/Api/Api.php#L697
So any place using $api->upload() would work with it.
Furthermore, I think I would like to move most of the Code from Api::upload() to the new Upload class. I think that could als help us to simplify that code into several single-purpose methods, e.g. Upload::response(), Upload::error(), Upload::filename() ...
Makes sense to me. The Api::upload() method is indeed a spaghetti monster at the moment.
Behavior would be that Api::upload() would check for chunks in the place you linked to and only call the callback on the last chunk when the upload is complete, right?
Yes, exactly. Have a version working locally. It really helps breaking this up but also having it all in the one class then.
Now only need to add unit tests to all new parts of the Upload class. Bit of work but also is nice cause we'll cover most of this with proper unit tests now.
Works great for me. I couldn't see any issue 🎉
I think I have encountered an issue: Uploading a large file and canceling while it's uploading, it still shows up in the file list after a reload.
Turns out: When one cancels the upload dialog, all running uploads still run and finish in the background. This is not new in this PR but a lot more visible/likely to happen with a different expectation by the user.