azure-sdk-for-net icon indicating copy to clipboard operation
azure-sdk-for-net copied to clipboard

[BUG] Repeat calls DownloadFileAsync fails when using an intermediate proxy due to caching

Open Lizzy-Gallagher opened this issue 2 years ago • 10 comments

Library name and version

Azure.Storage.Blobs 12.12

Describe the bug

We run all of our production usages of Azure.Storage.Blobs through an intermediate proxy.

When we repeat a call to download a file, we get the following error

Azure.RequestFailedException: Response x-ms-client-request-id '<GUID>' does not match the original expected request id, '<OTHER-GUID>'.

We determined that this was because the response was being cached by our intermediate proxy and we could workaround the issue by doing the following:

  • creating an HttpClientTransport with a DelegatingHandler that looks like:
private sealed class PreventCachingHandler : DelegatingHandler
{
	public PreventCachingHandler(HttpMessageHandler innerHandler) : base(innerHandler) { }

	protected override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken)
	{
		request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true, NoStore = true };
		return base.Send(request, cancellationToken);
	}

	protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
	{
		request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true, NoStore = true };
		return base.SendAsync(request, cancellationToken);
	}
}
  • creating a ScopedBlobserviceClient with the "ownedTransport" parameter

Expected behavior

Although we have an intermediate proxy, we would have expected Azure Blobs to add cache headers on the response (so that our intermediate proxy would not have cached).

Actual behavior

Azure Blobs did not set any no caching headers, so we had to debug and do this ourselves.

Reproduction Steps

You can examine the return response or the request from Azure Blobs and see that there are no headers requesting the response to not be cached.

Environment

$ dotnet --info .NET SDK (reflecting any global.json): Version: 6.0.402 Commit: 6862418796

Visual Studio 17.3.6

Lizzy-Gallagher avatar Oct 24 '22 18:10 Lizzy-Gallagher

Thank you for your feedback. This has been routed to the support team for assistance.

ghost avatar Oct 24 '22 21:10 ghost

@Lizzy-Gallagher Thanks for reaching out to us and reporting this issue. We have seen similar such issue reported in the past where the fix suggested was to disable the caching at the proxy end. If this workaround is not feasible for you to implement, then try the below action plan to avoid the caching :

  1. Navigate to you Azure Storage Account.
  2. Open the concerned Container.
  3. Navigate to each of the blobs within this container and add the no-cache response header on each of the blobs as shown below:

image

Note: The existing cache on the proxy might persist. So you need to clear it first from the proxy and observe for the subsequent request after implementing the above changes.

Awaiting your reply.

navba-MSFT avatar Oct 25 '22 06:10 navba-MSFT

Hi @Lizzy-Gallagher. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text “/unresolve” to remove the “issue-addressed” label and continue the conversation.

ghost avatar Oct 27 '22 03:10 ghost

Hi @Lizzy-Gallagher, since you haven’t asked that we “/unresolve” the issue, we’ll close this out. If you believe further discussion is needed, please add a comment “/unresolve” to reopen the issue.

ghost avatar Nov 03 '22 04:11 ghost

@navba-MSFT disabling caching on the proxy or in Azure itself doesn't feel like a good long-term answer here. The proxy is just complying with the HTTP caching headers Azure is sending which should be valid. I don't think the client should assume that no proxy caching is happening in order to function correctly.

madelson avatar Nov 04 '22 14:11 madelson

/unresolve

Lizzy-Gallagher avatar Nov 04 '22 15:11 Lizzy-Gallagher

@madelson @Lizzy-Gallagher In the issue description, I saw the below expectation:

Expected behavior Although we have an intermediate proxy, we would have expected Azure Blobs to add cache headers on the response (so that our intermediate proxy would not have cached).

So this action plan to add the response header to disable the caching was suggested.

navba-MSFT avatar Nov 07 '22 06:11 navba-MSFT

Sohttps://github.com/Azure/azure-sdk-for-net/issues/31975#issuecomment-1290045858 to add the response header to disable the caching was suggested.

@navba-MSFT but this is a configuration change to the Azure account, which requires access/control over that account. I think the Azure client should "just work" without needing to override behavior.

madelson avatar Nov 08 '22 01:11 madelson

This seems similar to this issue #24635

See this comment. To summarize simply we have a feature that verifies the request by echoing the x-ms-client-request-id. Caching the headers in the proxy would not only disable this feature but other features of the SDK.

Although we have an intermediate proxy, we would have expected Azure Blobs to add cache headers on the response (so that our intermediate proxy would not have cached).

Actually in .NET Core the feature to cache the HTTP Request headers was not carried forward from .NET Framework (See comment). So this would be an incorrect assumption of the .NET Core design (this is outside the scope of the SDK).

Usage of such a global cache can be source of many errors in the app. We recommend against using it.

What's the reason behind requiring the headers to be cached and why is it necessary for the intermediate proxy to cache the HTTP headers? I could see other issues with this such as accidentally corrupting http header values such as metadata values that are changed between requests (but I could be wrong, I don't know the extent of the proxy).

amnguye avatar Nov 09 '22 23:11 amnguye

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

Issue Details

Library name and version

Azure.Storage.Blobs 12.12

Describe the bug

We run all of our production usages of Azure.Storage.Blobs through an intermediate proxy.

When we repeat a call to download a file, we get the following error

Azure.RequestFailedException: Response x-ms-client-request-id '<GUID>' does not match the original expected request id, '<OTHER-GUID>'.

We determined that this was because the response was being cached by our intermediate proxy and we could workaround the issue by doing the following:

  • creating an HttpClientTransport with a DelegatingHandler that looks like:
private sealed class PreventCachingHandler : DelegatingHandler
{
	public PreventCachingHandler(HttpMessageHandler innerHandler) : base(innerHandler) { }

	protected override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken)
	{
		request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true, NoStore = true };
		return base.Send(request, cancellationToken);
	}

	protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
	{
		request.Headers.CacheControl = new CacheControlHeaderValue { NoCache = true, NoStore = true };
		return base.SendAsync(request, cancellationToken);
	}
}
  • creating a ScopedBlobserviceClient with the "ownedTransport" parameter

Expected behavior

Although we have an intermediate proxy, we would have expected Azure Blobs to add cache headers on the response (so that our intermediate proxy would not have cached).

Actual behavior

Azure Blobs did not set any no caching headers, so we had to debug and do this ourselves.

Reproduction Steps

You can examine the return response or the request from Azure Blobs and see that there are no headers requesting the response to not be cached.

Environment

$ dotnet --info .NET SDK (reflecting any global.json): Version: 6.0.402 Commit: 6862418796

Visual Studio 17.3.6

Author: Lizzy-Gallagher
Assignees: amnguye
Labels:

Storage, Service Attention, Client, needs-author-feedback, customer-reported, question

Milestone: -

ghost avatar Nov 09 '22 23:11 ghost

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

ghost avatar Nov 17 '22 02:11 ghost

@amnguye I'm not sure I understand why using an intermediate cache is invalid. The cache is just respecting the HTTP caching headers that Azure is returning. In my view the SDK is at fault here. Azure is returning headers saying that the response can be cached, the proxy is caching it, and the SDK is then failing because it assumes that caching won't ever happen.

Can't the SDK just skip request ID verification for responses with caching headers? Why is the ID verification even necessary?

madelson avatar Nov 17 '22 04:11 madelson

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

ghost avatar Nov 24 '22 14:11 ghost

Please see my latest comment.

madelson avatar Nov 26 '22 02:11 madelson

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

ghost avatar Dec 03 '22 08:12 ghost

See https://github.com/Azure/azure-sdk-for-net/issues/31975#issuecomment-1318037594

madelson avatar Dec 03 '22 13:12 madelson

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

ghost avatar Dec 10 '22 14:12 ghost

See https://github.com/Azure/azure-sdk-for-net/issues/31975#issuecomment-1318037594

madelson avatar Dec 10 '22 20:12 madelson

Here's some documentation explaining why the client side request id echoing is necessary.

https://learn.microsoft.com/en-us/rest/api/storageservices/client-side-logging-with-the-.net-storage-client-library#client-side-log-schema-and-sample

Short summary, it helps with network traces and storage service side logs. This feature helps correlate the REST API requests with the corresponding requests. Also on the client side we have to verify that the correct client request id is being echoed back to us, to ensure the response is the response we expected. It's a part of ensuring data integrity, disabling the feature would put all requests that come back at a higher risk of data corruption.

the SDK is then failing because it assumes that caching won't ever happen.

I think there's some confusion here. We're expected the request headers to be returned in a certain format. But when the cache alters those headers, that's when the error is occurring. You can cache the headers, but once you start mucking around with the header values, that's when the SDK throws this error.

I think to reiterate from this post: "Instead we recommend to cache results of the SDK calls in memory if that's something you desire. Alternatively, if you still pursue the global request caching, I'd recommend to use raw HttpClient or WebRequest to download resources - as I mentioned we can't guarantee correctness of the SDK implementation if global system.net cache is enabled (for example when we chunk the downloads)."

What's the reason behind requiring the headers to be cached and why is it necessary for the intermediate proxy to cache the HTTP headers?

amnguye avatar Dec 12 '22 17:12 amnguye

Hi @amnguye thanks for responding.

We're expected the request headers to be returned in a certain format. But when the cache alters those headers, that's when the error is occurring

I'm not sure I follow. In my understanding, the proxy is not altering the headers; rather it is respecting the Cache-Control header that Azure is putting on the response and then serving up a cached response based on that. This results in a "stale" header, but only because Azure told the proxy that was OK (per it's cache control directive).

What's the reason behind requiring the headers to be cached and why is it necessary for the intermediate proxy to cache the HTTP headers?

I don't need the proxy to do caching, I just don't want the fact that the proxy is performing valid caching behaviors (per the HTTP protocol) to break my application.

In my case, I'm using these libraries in an environment where access to the internet must go through a proxy, and we have limited control over said proxy. I can anticipate having to run the same code in an environment where we have no control over the proxy.

If you don't want to disable the request ID validation (which as you explained has value), then I think a reasonable fix here would be to have the SDK add something like Cache-Control: max-age=0 to all requests it makes, or at least the ones that are not compatible with caching. That way the SDK isn't susceptible to intermediate caches being in place. We're currently using this trick to work around the problem (injecting the header via a custom handler), but ideally the SDK would work out of the box. Thoughts?

madelson avatar Dec 13 '22 02:12 madelson

The x-ms-client-request-id is getting altered after the request is being sent. So between after the request has been sent between the SDK and before the SDK receives it back, a header is being altered. We've seen this happen when customers put proxies or have intermediate web applications that require changing the x-ms-client-request-id.

The error you received back

Azure.RequestFailedException: Response x-ms-client-request-id '' does not match the original expected request id, ''. Occurs only when the x-ms-client-request-id is different from when the original value when the request was first sent. The check is made here.

You may have to investigate the end to end scenario, because this error would not be thrown if the x-ms-client-request-id isn't getting changed.

From the original post I saw:

We run all of our production usages of Azure.Storage.Blobs through an intermediate proxy.

When we repeat a call to download a file, we get the following error

We determined that this was because the response was being cached by our intermediate proxy

It sounds like after you repeated "a call to download a file" that the client request id is being cached from the first download request, and then when you go to make the "repeat" is changed the x-ms-client-request-id to the one cached in the proxy.

I don't think setting Cache-Control: max-age=0 is a fix we should be sending with all requests... that sounds like a breaking change for customers who don't expect that and also customers who need to set that header value as well.

Another workaround that could be utilized (for those who have a proxy who's altering the x-ms-client-request-id but know when a repeat download request needs to be called, but knows it will get cached) is to manually set the value. So you know the repeat download requests are correlated, because they have the same x-ms-client-request-id. However I wouldn't say this is my favorite fix either.

Sample here: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/core/Azure.Core/samples/Diagnostics.md#setting-x-ms-client-request-id-value-sent-with-requests

@JoshLove-msft @jsquire Is there any other workaround we can provide that might help out this customer?

amnguye avatar Dec 13 '22 19:12 amnguye

The x-ms-client-request-id is getting altered after the request is being sent.

@amnguye I'm not sure I understand what you mean by "altered". Here's my understanding of what happens with proxy caching:

  1. Client makes first request (ID = 1)
  2. Proxy sees cache miss, forwards request to Azure
  3. Azure sends back a response with headers indicating that the response is cacheable
  4. The proxy caches the response (ID = 1) and also forwards it back to the client
  5. Client makes second request (ID = 2)
  6. Proxy sees cache hit and responds with the cached response (ID = 1)

As I understand it the proxy isn't altering the ID; it is just serving up a cached response per HTTP protocol and the cached response happens to have the old ID. Is that your understanding as well?

madelson avatar Dec 14 '22 12:12 madelson

If you are receiving this error, the x-ms-client-request-id value is different from the value the value that it originally sent out. Which means that between after the request has been sent between the SDK and before the SDK receives it back, a header value is being altered.

  1. Proxy sees cache hit and responds with the cached response (ID = 1)

The SDK will send the request with ID = 2, the proxy sees the cache hit and responds with a cached response with ID = 1. The SDK compares the IDs and then see's there's a mismatch (e.g. expected value of 2, but got value of 1 instead). It expected x-ms-client-request-id to be 2, not 1. Thus throwing the error you see.

To the SDK the first download call and the second download call, are two completely different calls that are unrelated. This is by design that these calls are different calls. Which is why the x-ms-client-request-id have different values.

amnguye avatar Dec 14 '22 17:12 amnguye

The SDK will send the request with ID = 2, the proxy sees the cache hit and responds with a cached response with ID = 1. The SDK compares the IDs and then see's there's a mismatch (e.g. expected value of 2, but got value of 1 instead). It expected x-ms-client-request-id to be 2, not 1. Thus throwing the error you see.

Great I think we are 100% on the same page about why this error is happening.

I think where we perhaps disagree is about whether the proxy is doing something wrong or the SDK is doing something wrong.

My view is that HTTP caching a standard part of the protocol and we should assume that it could happen.

If the SDK wants to assume that its requests will never receive a cached response, then it should specify Cache-Control: max-age=0 to prevent this. Another potential path would be to add the request ID to GET requests as an extra "cache buster" query parameter to prevent caching that way.

Alternatively, the request ID checking could be modified to be more resilient to caching. For example the check could be conditioned to skip if the request comes back with cache headers, or the ID could be generated in a "stable" way (e.g. hash of request contents).

I'm not sure if any of these are great ideas; but I do think it would be nice to find a way for the SDK to be more resilient to different setups. Especially where the request ID check doesn't feel like it adds that much value.

madelson avatar Dec 15 '22 00:12 madelson

My view is that HTTP caching a standard part of the protocol and we should assume that it could happen.

I think catering to every single scenario would leave a lot of other customers unhappy. In your scenario, yes defaulting to Cache-Control: max-age=0 would resolve your scenario, since this proxy has to be used. For other scenarios this would not be fitting.

Making a change to start sending Cache-Control: max-age=0 would be a big breaking change for other customers who either A. Are expecting this value not to be set B. They are setting it themselves We are not allowed to be making a breaking change like this.

Alternatively, the request ID checking could be modified to be more resilient to caching. For example the check could be conditioned to skip if the request comes back with cache headers, or the ID could be generated in a "stable" way (e.g. hash of request contents).

The SDK storage clients are stateless. To cache each request that the client sends, would break the stateless strategy, and to do this just for this specific scenario, doesn't seem fitting.

but I do think it would be nice to find a way for the SDK to be more resilient to different setups. Especially where the request ID check doesn't feel like it adds that much value.

The point of the SDK is to allow customers to have enough knobs and options to adjust to their scenario or needs. Which is why you're allowed to set things like the Cache-Control header or manually setting the x-ms-client-request-id for when you know you're repeating a request.

Yes, for some customers they don't find value for checking the x-ms-client-request-id. For others this is essential for data integrity.

Closing the issue, as there's nothing actionable. We can revisit this if there's more noise for this specific scenario, and what the best solution (maybe the solution is not in an SDK form).

amnguye avatar Dec 15 '22 17:12 amnguye

Fair enough. Thanks for taking the time to understand our pain point and I appreciate your consideration of the options.

madelson avatar Dec 16 '22 21:12 madelson

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

ghost avatar Dec 24 '22 02:12 ghost