reqwest

Allow setting `no_gzip` / `gzip` for Request

Open Xuanwo opened this issue 9 months ago • 9 comments

Background

GCS has a feature called Decompressive Transcoding, which can decompress gzip-encoded files based on specific conditions.

For example:

The file is gzip-compressed when stored in Cloud Storage. The object's metadata includes Content-Encoding: gzip.

To avoid this, users have to rely on one of the following:

There are two ways to prevent decompressive transcoding from occurring for an object that is otherwise eligible:

If the request for the object includes an Accept-Encoding: gzip header, the object is served as-is in that specific request, along with a Content-Encoding: gzip response header.

If the Cache-Control metadata field for the object is set to no-transform, the object is served as a compressed object in all subsequent requests, regardless of any Accept-Encoding request headers.

Unfortunately, this feature can conflict with reqwest's automatic gzip handling. The result is the behavior matrix described in https://github.com/apache/opendal/issues/5070

Image

The only workaround so far is to set no_gzip at the client level, but that can cause https://github.com/apache/opendal/issues/5897, because GCS returns gzip-encoded responses from other APIs.

Proposal

My current idea is to allow setting no_gzip / gzip on a Request directly, so that users like opendal can control whether reqwest's automatic gzip behavior is disabled for that request.

I have checked the related logic, and it seems easy to add without introducing big changes:

https://github.com/seanmonstar/reqwest/blob/03d1635347cbfe979bd5a7f4ba7ad2cdc73ef68c/src/async_impl/client.rs#L2910-L2917

We can add Accepts to Request and merge it with the client's settings before constructing the Response.

What do you think?

Xuanwo avatar Mar 27 '25 14:03 Xuanwo
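
The proposal above (a per-request setting merged with the client's settings) could look roughly like the following. This is only a minimal sketch under stated assumptions: `Accepts` here is a simplified stand-in with a single `gzip` flag, and `RequestOverrides` and `effective_accepts` are hypothetical names that do not exist in reqwest.

```rust
/// Simplified stand-in for the client's decoding settings (assumption:
/// only the gzip flag is modeled here).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Accepts {
    gzip: bool,
}

/// Hypothetical per-request override: `None` means "use the client's setting".
#[derive(Default)]
struct RequestOverrides {
    gzip: Option<bool>,
}

/// Merge the request-level override with the client-level setting.
/// The request wins whenever it specifies a value.
fn effective_accepts(client: Accepts, request: &RequestOverrides) -> Accepts {
    Accepts {
        gzip: request.gzip.unwrap_or(client.gzip),
    }
}

fn main() {
    // The client keeps automatic gzip handling enabled for most APIs.
    let client = Accepts { gzip: true };

    // A request that reads a gzip-encoded object opts out for itself only.
    let object_read = RequestOverrides { gzip: Some(false) };
    // Any other request leaves the override unset.
    let other_api = RequestOverrides::default();

    println!("object read: {:?}", effective_accepts(client, &object_read));
    println!("other API:   {:?}", effective_accepts(client, &other_api));
}
```

With this shape, the client-level default stays untouched, so other APIs that return gzip-encoded responses are still decoded automatically.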

If you manually set .header("accept-encoding", "identity") on that specific request, does it all work?

seanmonstar avatar Mar 27 '25 16:03 seanmonstar

If you manually set .header("accept-encoding", "identity") on that specific request, does it all work?

Hi, this isn't working as expected. In this case, GCS performs Decompressive Transcoding, which results in the following two behaviors.

Image

"too much data" means that the request returns more data than the size of its object.

Xuanwo avatar Mar 28 '25 08:03 Xuanwo

Really? That seems weird, indeed. I read the doc you linked, it says to transcoding will happen if you send accept-encoding: gzip, which reqwest sends if the gzip feature is enabled. However, if it sees an existing accept-encoding header, it won't set one. Sending accept-encoding: identity is a normal thing that many servers understand to mean don't encode the content.

seanmonstar avatar Mar 28 '25 13:03 seanmonstar
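
The rule described in the comment above (a default accept-encoding is only added when the request does not already carry one) can be sketched as follows. This is a simplified model, not reqwest's actual code: the function name `apply_default_accept_encoding` is hypothetical, headers are modeled as a plain map with lowercase names, and only gzip is considered.

```rust
use std::collections::BTreeMap;

/// Add `accept-encoding: gzip` when automatic gzip handling is enabled,
/// but never overwrite a value the caller has already set.
/// Assumption: header names are stored in lowercase.
fn apply_default_accept_encoding(headers: &mut BTreeMap<String, String>, gzip_enabled: bool) {
    if gzip_enabled {
        headers
            .entry("accept-encoding".to_string())
            .or_insert_with(|| "gzip".to_string());
    }
}

fn main() {
    // No header set by the caller: the default is filled in.
    let mut plain = BTreeMap::new();
    apply_default_accept_encoding(&mut plain, true);
    println!("{:?}", plain.get("accept-encoding"));

    // Caller set `identity` explicitly: it is left untouched.
    let mut explicit = BTreeMap::new();
    explicit.insert("accept-encoding".to_string(), "identity".to_string());
    apply_default_accept_encoding(&mut explicit, true);
    println!("{:?}", explicit.get("accept-encoding"));
}
```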

I read the doc you linked, it says to transcoding will happen if you send accept-encoding: gzip, which reqwest sends if the gzip feature is enabled.

Hi, GCS's behavior for objects with content-encoding: gzip is:

  • If the request has accept-encoding: gzip or cache-control: no-transform, the content is sent as-is (that is, still gzip-compressed).
  • If not (for example, it only has accept-encoding: identity), GCS transcodes the content, i.e. decompresses it before sending.

Xuanwo avatar Mar 28 '25 14:03 Xuanwo
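
The two rules in the comment above can be written down as a small decision function. This is only an illustrative model of the behavior as described in this thread, not GCS code: the names `Served` and `gcs_serve` are made up, and the header parsing is simplified (q-values such as `gzip;q=0` are not handled).

```rust
/// How an object stored with `Content-Encoding: gzip` is served.
#[derive(Debug, PartialEq)]
enum Served {
    /// Bytes are sent as stored, still gzip-compressed.
    AsStoredGzip,
    /// Decompressive transcoding: bytes are decompressed before sending.
    Decompressed,
}

/// Model of the rules described in this thread.
/// `accept_encoding` is the request header value, if any;
/// `no_transform` stands for the `cache-control: no-transform` condition.
fn gcs_serve(accept_encoding: Option<&str>, no_transform: bool) -> Served {
    // Simplified parsing: comma-separated tokens, no q-value handling.
    let accepts_gzip = accept_encoding
        .map(|value| {
            value
                .split(',')
                .any(|token| token.trim().eq_ignore_ascii_case("gzip"))
        })
        .unwrap_or(false);

    if accepts_gzip || no_transform {
        Served::AsStoredGzip
    } else {
        Served::Decompressed
    }
}

fn main() {
    println!("{:?}", gcs_serve(Some("gzip"), false));
    println!("{:?}", gcs_serve(Some("identity"), false));
    println!("{:?}", gcs_serve(None, true));
}
```

Under this model, sending accept-encoding: identity does not prevent transcoding, which matches the behavior reported earlier in the thread.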

Maybe this can be solved by not sending the accept-encoding header at the request level? That would be similar to the attohttpc HTTP client, which allows skipping decompression when building the request:

https://github.com/sbstp/attohttpc/blob/8500cda02d5075736143c10434e2ca52190a07e3/src/request/builder.rs#L382

If possible, I can make a PR for this

0x676e67 avatar Apr 02 '25 06:04 0x676e67

Hi, @seanmonstar what do you think? I'm open to either option.

Xuanwo avatar Apr 14 '25 09:04 Xuanwo

I wrote up #2641 that relates to this.

seanmonstar avatar Apr 14 '25 18:04 seanmonstar

On a somewhat related note: does reqwest set accept-encoding automatically based on the feature flags enabled at compile time, or do we need to set it manually when creating a request?

pronebird avatar Apr 21 '25 14:04 pronebird

Yes, if the header doesn't already exist: https://docs.rs/reqwest/latest/reqwest/struct.ClientBuilder.html#method.gzip

seanmonstar avatar Apr 21 '25 14:04 seanmonstar