pub-dev

Validate tar-encoding

Open jonasfj opened this issue 6 years ago • 3 comments

We should decode and re-encode all tarballs we serve. This would ensure that tweaks to a tarball that tar ignores cannot be used to create a file that the browser interprets as an HTML file.

Since we serve from a different domain, I'm not particularly worried. But it would also ensure that decoding errors experienced on the client are consistent.
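To make the idea concrete, here is a minimal sketch of a decode/re-encode pass. It uses Python's standard `tarfile` module purely for illustration; pub.dev itself is not written in Python, and the function names `reencode_tarball` and `sha256_hex` are hypothetical. It assumes archives contain only regular files and directories, and it drops anything tar would ignore, such as bytes placed after the end-of-archive marker.

```python
import gzip
import hashlib
import io
import tarfile


def reencode_tarball(data: bytes) -> bytes:
    """Decode a .tar.gz archive and write it back out with fresh headers."""
    out = io.BytesIO()
    # mtime=0 keeps the gzip header itself deterministic.
    with gzip.GzipFile(fileobj=out, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as src, \
                tarfile.open(fileobj=gz, mode="w", format=tarfile.PAX_FORMAT) as dst:
            for member in src:
                # Assumption: only regular files and directories are allowed.
                if not (member.isfile() or member.isdir()):
                    raise ValueError(f"unsupported entry type: {member.name}")
                # Build a new header so only the fields copied below survive.
                clean = tarfile.TarInfo(name=member.name)
                clean.type = member.type
                clean.size = member.size if member.isfile() else 0
                clean.mode = member.mode & 0o777
                clean.mtime = int(member.mtime)
                payload = src.extractfile(member) if member.isfile() else None
                dst.addfile(clean, payload)
    return out.getvalue()


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, used to compare archives before and after."""
    return hashlib.sha256(data).hexdigest()
```

Two uploads that differ only in ignored bytes would come out identical after this pass, but `sha256_hex(reencode_tarball(data))` will generally differ from `sha256_hex(data)`, which is the hash change to watch for.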

This is probably a non-trivial project. We would have to:

Stage 1:
A) Write a tool that decodes and re-encodes all existing tarballs and stores them in a new bucket.
B) Serve tarballs from the new bucket.
C) Ensure that new uploads are written to both the old and the new bucket.

Stage 2:
A) Stop creating tarballs in the old bucket (which uses the original encoding).
B) Remove the tool that re-encodes existing tarballs and stores them in the new bucket.

Stage 3:
A) Remove the old bucket (which uses the original encoding).
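The staged rollout can be sketched roughly as follows. This is only an illustration under simplifying assumptions: buckets are modelled as in-memory dictionaries, and `backfill`, `store_upload`, `serve`, and the `stage` parameter are hypothetical names, not part of the pub.dev codebase. The `reencode` argument stands for whatever decode/re-encode function is chosen.

```python
from typing import Callable, Dict

Bucket = Dict[str, bytes]            # hypothetical stand-in for a storage bucket
Reencode = Callable[[bytes], bytes]  # any decode/re-encode function


def backfill(old: Bucket, new: Bucket, reencode: Reencode) -> int:
    """Stage 1: re-encode every existing tarball into the new bucket."""
    copied = 0
    for key, data in old.items():
        if key not in new:  # safe to re-run; already migrated keys are skipped
            new[key] = reencode(data)
            copied += 1
    return copied


def store_upload(key: str, data: bytes, old: Bucket, new: Bucket,
                 reencode: Reencode, stage: int) -> None:
    """Stage 1 writes to both buckets; from stage 2 only the new one is written."""
    if stage == 1:
        old[key] = data  # original encoding, kept as a fallback
    new[key] = reencode(data)


def serve(key: str, new: Bucket) -> bytes:
    """Downloads are always served from the new bucket."""
    return new[key]
```

Keeping the dual write in stage 1 means the old bucket stays complete, so switching back to it remains possible if the re-encoded tarballs cause problems.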

We should probably wait a few weeks between stage 1 and stage 2 to see if anyone reports bugs, since the hash of every tarball will change. Canonicalization might also cause decoding issues on some platforms.


We should also check whether the hash of the tarballs is stored somewhere in PUB_CACHE, i.e. whether this change would have any impact.

jonasfj, Sep 24 '19 11:09

Related issue: #4440.

isoos, Jan 28 '21 09:01

Now that we have content hashes in pubspec.lock, I don't think we can do this retroactively.

But we should be able to do it for new uploads.

sigurdm, Feb 01 '24 12:02

We decided to keep the integrity of the uploaded tarball.

We should probably consider whether there are more consistency checks we could do at upload time.
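As one possible starting point, upload-time checks could look something like the sketch below. The specific rules (only regular files and directories, no paths escaping the archive root, no duplicate entries) are assumptions chosen for illustration, not the checks pub.dev actually performs, and `check_tarball` is a hypothetical name. Python's standard `tarfile` module is used only as an example. The uploaded bytes are inspected but never rewritten, so the content hash is unaffected.

```python
import io
import posixpath
import tarfile
import zlib
from typing import List


def check_tarball(data: bytes) -> List[str]:
    """Return a list of problems found in a .tar.gz upload (empty if none)."""
    problems: List[str] = []
    seen = set()
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
            for member in archive:
                name = member.name
                # Assumed rule: links, devices and other special entries are rejected.
                if not (member.isfile() or member.isdir()):
                    problems.append(f"{name}: not a regular file or directory")
                # Assumed rule: entries must stay inside the archive root.
                if name.startswith("/") or ".." in name.split("/"):
                    problems.append(f"{name}: path escapes the archive root")
                # Assumed rule: each normalized path may appear only once.
                normalized = posixpath.normpath(name)
                if normalized in seen:
                    problems.append(f"{name}: duplicate entry")
                seen.add(normalized)
    except (tarfile.TarError, OSError, EOFError, zlib.error) as error:
        problems.append(f"archive could not be decoded: {error}")
    return problems
```

An upload would be accepted only when the returned list is empty; otherwise the messages can be shown to the uploader.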

sigurdm, Feb 01 '24 12:02