Upload not working because ETag is blocked by this extension

Open liangxiwei opened this issue 2 years ago • 14 comments

The upload uses ETag to support multipartUpload, but this extension blocks the ETag header!

liangxiwei avatar Feb 26 '22 11:02 liangxiwei

Which side is broken? Do you have an example URL?

KevinRoebert avatar Feb 26 '22 13:02 KevinRoebert

Our product is similar to Notion, but aimed at Chinese users. We use the ali-oss SDK to upload files, and the SDK relies on the ETag header for multipartUpload. Here is the relevant SDK source:

https://github.com/ali-sdk/ali-oss/blob/HEAD/lib/common/multipart.js#L250

liangxiwei avatar Feb 28 '22 03:02 liangxiwei

Maybe I need to detect when this error occurs and fall back to a single-part upload instead of multipartUpload.
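
A minimal sketch of that fallback idea, under stated assumptions: `missingETagError`, `Uploader`, and `uploadWithFallback` are hypothetical names, and the two uploaders are passed in rather than taken from ali-oss, so nothing here reflects the SDK's real API.

```typescript
// Hypothetical helper: builds the error an uploader would throw when a
// part response carries no readable ETag header.
function missingETagError(partNumber: number): Error {
  const err = new Error(`No readable ETag for part ${partNumber}`);
  err.name = "MissingETagError";
  return err;
}

// Assumed shape: an uploader takes the file bytes and resolves to some result string.
type Uploader = (data: Uint8Array) => Promise<string>;

// Try the multipart path first. Fall back to a single-request upload only
// when the failure is the missing-ETag case; any other error is re-thrown.
async function uploadWithFallback(
  data: Uint8Array,
  multipart: Uploader,
  singlePart: Uploader,
): Promise<string> {
  try {
    return await multipart(data);
  } catch (err) {
    if (err instanceof Error && err.name === "MissingETagError") {
      return singlePart(data);
    }
    throw err;
  }
}
```

The catch is deliberately narrow, so that network or permission errors still surface instead of silently triggering a second upload.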

liangxiwei avatar Feb 28 '22 03:02 liangxiwei

Chiming in here, S3 based multi-part file upload APIs are not usable if ETag headers are stripped from responses. Additionally using ETag for cache validation is a pretty common and reasonable use case, which also wouldn't work.

We also use an S3-compatible object storage, meaning uploads from our users with this extension don't work. Most of the users I've talked to hadn't realized ETag headers were being stripped by this extension at all, as neither the name nor the headline really implies it to be a tracker removal tool, but rather a URL clearing tool.

I'd suggest making it more clear that some valid use cases (such as s3 protocol based file uploads) don't work with the ETag stripping option enabled. There are several object storage providers that use the S3 protocol, so I'd imagine broken direct uploads to them being a fairly common issue.
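
To illustrate why a stripped ETag is fatal here: in the S3 protocol, the final "complete multipart upload" request has to list every part with its part number and the ETag returned when that part was uploaded. The sketch below shows only that bookkeeping step. `PartResponse`, `CompletedPart`, and `buildCompletedParts` are hypothetical names, and the headers are assumed to be available as a plain object keyed by lower-cased name.

```typescript
// Assumed shape of what the page's script sees after uploading one part.
interface PartResponse {
  partNumber: number;
  headers: Record<string, string>; // keyed by lower-cased header name
}

// One entry of the part list sent with the completion request.
interface CompletedPart {
  PartNumber: number;
  ETag: string;
}

// Collect the ETag of every uploaded part, in ascending part order.
// If any response has no ETag, the upload cannot be completed.
function buildCompletedParts(responses: PartResponse[]): CompletedPart[] {
  return responses
    .map((res) => {
      const etag = res.headers["etag"];
      if (!etag) {
        throw new Error(`Part ${res.partNumber}: no ETag in response, cannot complete upload`);
      }
      return { PartNumber: res.partNumber, ETag: etag };
    })
    .sort((a, b) => a.PartNumber - b.PartNumber);
}
```

When the header has been removed before the script can read it, the lookup yields nothing and the client has no value to send for that part.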

MythicManiac avatar Mar 14 '22 06:03 MythicManiac

@liangxiwei @MythicManiac I have the same problem as you: my JavaScript application couldn't read ETag headers in a PUT response from S3, despite having the correct Access-Control-Expose-Headers CORS header set.
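
For context, a rough sketch of the two conditions a cross-origin script needs before it can read ETag: the header must be exposed through CORS (ETag is not one of the CORS-safelisted response headers), and it must still be present when the response reaches the page. The function names are hypothetical, and this is a simplified model rather than a browser's actual implementation.

```typescript
// Response headers a cross-origin script may always read (CORS-safelisted).
const CORS_SAFELISTED_RESPONSE_HEADERS = [
  "cache-control", "content-language", "content-length",
  "content-type", "expires", "last-modified", "pragma",
];

// Would CORS let a script read this header, given the value of
// Access-Control-Expose-Headers? "*" only counts for requests without credentials.
function isExposedToScript(
  headerName: string,
  exposeHeaders: string,
  withCredentials: boolean = false,
): boolean {
  const name = headerName.trim().toLowerCase();
  if (CORS_SAFELISTED_RESPONSE_HEADERS.indexOf(name) !== -1) return true;
  const exposed = exposeHeaders
    .split(",")
    .map((h) => h.trim().toLowerCase())
    .filter((h) => h.length > 0);
  if (!withCredentials && exposed.indexOf("*") !== -1) return true;
  return exposed.indexOf(name) !== -1;
}

// The ETag is readable only if it is exposed AND still present in the response.
function readETag(
  responseHeaders: Record<string, string>, // keyed by lower-cased name
  exposeHeaders: string,
): string | undefined {
  if (!isExposedToScript("ETag", exposeHeaders)) return undefined;
  return responseHeaders["etag"];
}
```

With a correct CORS configuration the first check passes, so an undefined result points at the header having been removed somewhere between the server and the page.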

I hope that #214 will be accepted as a way to allow our use case, while still preventing ETag-based tracking.

brianhelba avatar Jun 03 '22 01:06 brianhelba

@KevinRoebert was this closed on purpose, or was the issue perhaps closed due to an automatic trigger from https://github.com/ClearURLs/Addon/commit/783f1fc99ad2e32d692be0a5626f1184e84fdc20?

It might be worth keeping the issue open as the problem exists, even if the proposed solution isn't viable.

MythicManiac avatar Jun 07 '22 16:06 MythicManiac

> @KevinRoebert was this closed on purpose, or was the issue perhaps closed due to an automatic trigger from https://github.com/ClearURLs/Addon/commit/783f1fc99ad2e32d692be0a5626f1184e84fdc20?

It was the automatic trigger. Re-opened

KevinRoebert avatar Jun 07 '22 16:06 KevinRoebert

fwiw, this is an absurd behavior that sent us down a fun debugging spiral on UploadThing. Absurd enough I didn't actually believe it at first. Super unintuitive that a "clear url" extension entirely breaks etag, and as such almost all implementations of multi-part upload

t3dotgg avatar Dec 16 '23 02:12 t3dotgg

Also, ETag is part of the web's caching strategy; it's not intrusive at all.

vasilvestre avatar Dec 16 '23 08:12 vasilvestre

> Also, ETag is part of the web's caching strategy; it's not intrusive at all.

The problem is that it can be used to track users, and it's probably more common than we realise: https://levelup.gitconnected.com/no-cookies-no-problem-using-etags-for-user-tracking-3e745544176b https://www.secjuice.com/etag-entity-tag-tracking/

proevilz avatar Dec 16 '23 13:12 proevilz

> > Also, ETag is part of the web's caching strategy; it's not intrusive at all.
>
> The problem is that it can be used to track users, and it's probably more common than we realise: https://levelup.gitconnected.com/no-cookies-no-problem-using-etags-for-user-tracking-3e745544176b https://www.secjuice.com/etag-entity-tag-tracking/

I'm not sure how the extension works, but an opt-in would be the best option IMHO. If people want privacy they can opt in, and you can warn users that it may break some websites. Wdyt?

vasilvestre avatar Dec 18 '23 09:12 vasilvestre

The privacy view is valid, but clearly something is wrong if this continues to be a common issue for valid use cases of ETag headers. In our case we ended up adding special-case error handling for missing ETag headers, which instructs our users to disable the ClearURLs addon, but it's absurd we had to go to those lengths.

To my understanding, the privacy issue with ETag tracking is that the browser automatically sends the stored ETag value back on outgoing requests (in the If-None-Match header), not that it receives ETag headers on inbound responses. Would it be possible to modify the feature to strip the outgoing request header rather than the incoming one? It would still break caching, but at least it shouldn't break S3-compatible uploads.
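
A sketch of what request-side stripping could look like, assuming headers arrive as an array of `{ name, value }` objects, which is the shape the WebExtensions `webRequest.onBeforeSendHeaders` event passes to listeners. `stripValidatorRequestHeaders` is a hypothetical name, and this is not how ClearURLs is implemented.

```typescript
// Header shape assumed to match the webRequest API's request header entries.
interface HttpHeader {
  name: string;
  value?: string;
}

// Drop the request header that echoes a stored ETag back to the server.
// Response headers are left untouched, so scripts can still read ETag.
function stripValidatorRequestHeaders(requestHeaders: HttpHeader[]): HttpHeader[] {
  return requestHeaders.filter(
    (header) => header.name.toLowerCase() !== "if-none-match",
  );
}
```

A blocking `onBeforeSendHeaders` listener could return the filtered array as its `requestHeaders`. The trade-off is the one noted above: without If-None-Match the server can no longer answer with 304 Not Modified, so revalidation stops working.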

MythicManiac avatar Dec 18 '23 10:12 MythicManiac

Hello,

To be honest, I'm not quite sure how to better implement ETag Filtering.

For instance, Privacy Possum checks whether an ETag changes upon reloading a resource and then blocks it. However, this method requires storing every visited URL, including its ETag, in a cache (LRU), leading to significantly higher RAM usage by the addon. Moreover, the entire idea relies on the assumption that the (tracking) ETag will change after a subsequent request.
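
To make the memory cost concrete, here is a simplified sketch of that kind of check: a bounded, least-recently-used map from URL to the last ETag seen, flagging a resource when its ETag differs from the remembered one. `ETagObserver` is a hypothetical name, and this is an illustration of the idea as described above, not Privacy Possum's code.

```typescript
class ETagObserver {
  // Map keeps insertion order, so the first key is the least recently used.
  private readonly seen = new Map<string, string>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Record the ETag for a URL. Returns true when a different ETag was
  // remembered for the same URL, i.e. the value changed between loads.
  observe(url: string, etag: string): boolean {
    const previous = this.seen.get(url);
    this.seen.delete(url); // re-insert so this URL becomes most recently used
    this.seen.set(url, etag);
    if (this.seen.size > this.capacity) {
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return previous !== undefined && previous !== etag;
  }
}
```

Every visited URL occupies an entry until it is evicted, so a larger capacity catches more changes at the price of more memory, and an evicted URL is treated as never seen.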

Another implementation was proposed for Privacy Badger, which verifies whether the ETag was correctly generated according to the algorithm of nginx or Apache. If not, it would be blocked. However, this method would only work for Apache and nginx servers. Additionally, it might end up blocking various elements if the server sets the ETag in a different manner.
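
A rough sketch of such a format check. The pattern is an approximation written from memory of the common defaults (nginx: quoted hex modification time and hex length joined by a hyphen; Apache: two or three hyphen-joined hex fields), so treat it as an assumption rather than a specification. `looksServerGenerated` is a hypothetical name.

```typescript
// Approximate shape of file-based ETags: optional weak prefix, then a quoted
// value of two or three hexadecimal fields separated by hyphens.
const SERVER_STYLE_ETAG = /^(W\/)?"[0-9a-f]+(-[0-9a-f]+){1,2}"$/i;

// True when the ETag matches the assumed nginx/Apache layout.
function looksServerGenerated(etag: string): boolean {
  return SERVER_STYLE_ETAG.test(etag.trim());
}
```

A content-hash ETag such as `"d41d8cd98f00b204e9800998ecf8427e"` does not match this pattern, which illustrates the drawback mentioned above: servers that build ETags differently would be blocked.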

Another option could be to completely remove ETag Filtering from ClearURLs and rely on the "Network Isolation Key" for cache strategies in Chrome 96 and Firefox 85. This approach aims to prevent tracking across multiple sites. However, it won't prevent recognition after a session on the same site.
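
A toy model of why a partitioned cache limits cross-site ETag tracking: the cache entry is keyed by the top-level site as well as the resource URL, so a validator stored while visiting one site is not found while visiting another. `PartitionedETagCache` is a hypothetical name, and real browsers use a more detailed key than this sketch does.

```typescript
class PartitionedETagCache {
  private readonly entries = new Map<string, string>();

  // The key combines the top-level site with the resource URL.
  private key(topLevelSite: string, resourceUrl: string): string {
    return JSON.stringify([topLevelSite, resourceUrl]);
  }

  store(topLevelSite: string, resourceUrl: string, etag: string): void {
    this.entries.set(this.key(topLevelSite, resourceUrl), etag);
  }

  // Returns the ETag to revalidate with, if one was stored in this partition.
  lookup(topLevelSite: string, resourceUrl: string): string | undefined {
    return this.entries.get(this.key(topLevelSite, resourceUrl));
  }
}
```

Within a single top-level site the stored ETag is still found on the next visit, which matches the limitation noted above.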

I'm open to suggestions from the community.

By the way, the ETag Filtering has been disabled by default since version 1.25.0 (2022-07-27).

#220 #321

KevinRoebert avatar Dec 18 '23 16:12 KevinRoebert

This does seem like a rather complicated issue to solve, perhaps even impossible without accepting some tradeoffs. That being the case, would it make more sense to approach this by adding exceptions to the filtering for known valid use cases (such as S3 multi-part uploads)?
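
One way such an exception might be recognized, as a sketch: in the S3 protocol a part upload is a PUT request whose query string carries both `partNumber` and `uploadId`. `looksLikeS3PartUpload` is a hypothetical name, and the heuristic is an assumption that could match unrelated requests using the same parameter names.

```typescript
// Heuristic: PUT request with both partNumber and uploadId query parameters.
function looksLikeS3PartUpload(method: string, url: string): boolean {
  if (method.toUpperCase() !== "PUT") return false;

  // Ignore the fragment, then isolate the query string.
  const hashStart = url.indexOf("#");
  const withoutFragment = hashStart === -1 ? url : url.slice(0, hashStart);
  const queryStart = withoutFragment.indexOf("?");
  if (queryStart === -1) return false;

  // Extract parameter names only; values are irrelevant for this check.
  const names = withoutFragment
    .slice(queryStart + 1)
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      return eq === -1 ? pair : pair.slice(0, eq);
    });

  return names.indexOf("partNumber") !== -1 && names.indexOf("uploadId") !== -1;
}
```

A filter could skip ETag removal for responses to requests that pass this check, at the cost of maintaining a list of such patterns per protocol.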

I'd also be interested in the philosophy behind this filtering for requests sent from JavaScript: does a scenario exist where the JavaScript that sends the request and has access to the response could not simply use some other means of tracking, such as encoding the tracker in the response payload directly rather than in the headers?

Or, in simpler terms, does a scenario exist where filtering out ETag headers for requests created by JavaScript that's already being executed in the browser realistically makes a difference?
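
If the answer turns out to be "no", one possible shape for such a rule is sketched below. It relies on the WebExtensions `webRequest` API reporting requests made through XMLHttpRequest or fetch with the resource type `"xmlhttprequest"`. `filterETag` is a hypothetical name, and whether this exemption is safe from a privacy standpoint is exactly the open question, so this only illustrates the proposal.

```typescript
// Header shape assumed to match the webRequest API's response header entries.
interface ResponseHeader {
  name: string;
  value?: string;
}

// Remove ETag from responses, except for script-initiated requests,
// where the page's own code needs to read the header.
function filterETag(
  resourceType: string,
  responseHeaders: ResponseHeader[],
): ResponseHeader[] {
  if (resourceType === "xmlhttprequest") return responseHeaders;
  return responseHeaders.filter((header) => header.name.toLowerCase() !== "etag");
}
```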

MythicManiac avatar Dec 19 '23 11:12 MythicManiac