async-http-client
async-http-client copied to clipboard
Provide a way to have request level control over CookieStore (currently only global)
Hi,
I am using AsyncHttpClient version 2.5.2 for my application. I notice that CookieStore will add the cookie when HttpResponse come https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Interceptors.java#L73-L82 and use cookie from CookieStore for redirect https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Redirect30xInterceptor.java#L129-L136 and execute new request https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/DefaultAsyncHttpClient.java#L193-L207.
When we send multiple requests with the same URI:
- request 1 send to server
- request 2 send to server
- server returns response 2 with redirect cookie 2
- server returns response 1 with redirect cookie 1
It's possible for the redirect request 2 will be sent with cookie 1 -> it's incorrect. Also, I don't want to add cookie to the request automatically when it's in cookie store. I can disable CookieStore, but doing that will make redirect won't work.
I found this PR https://github.com/AsyncHttpClient/async-http-client/pull/1495 added cookie store. I hope we can revert it and don't use cookie store.
@slandelle @unoexperto Can you help to look at this issue?
Please explain the functionality you're trying to achieve. Are you trying to crawl a website will multiple concurrent virtual users?
I am working at Wego(https://www.wego.com) - a travel search engine based in Singapore. Our backend will send http requests to our partners to get best flights/hotels deals and display on website and mobile applications. We are using the old version 1.9.40 and now want to upgrade to the latest version 2.5.2. Obviously, it's possible for us to send multiple requests having the same URI to 1 partner. However, with cookie store, it doesn't work correctly because:
- The later cookie can override the sooner cookie with the same URI -> the redirect will be incorrect.
- Don't want to set cookie back to server with new request even cookie is not expired -> the subsequent requests should be independent with the previous request.
@tranchitam I'm not sure I understand the requirement. You want http client to distinguish between different calls to same domain or do you want client to initialize cookie store only once ?
It looks like you want separate CookieStore per request. It's possible to implement that was but it's not how majority of browsers and http clients work. You could create new client per request but I suspect there will be performance penalty.
What I'd do if I were you is create pool of clients and set/reset cookie store before making new request.
@unoexperto I want http client to distinguish between different calls, the next request shouldn't add cookie from the previous http response.
@tranchitam But do you need to remember cookie from previous calls ? If not you can clear cookie store before making new call
asyncHttpClient.getConfig.getCookieStore.clear()
@unoexperto it's not safe to use asyncHttpClient.getConfig.getCookieStore.clear() before making new call because it's possible after you clear, some http responses (with the same URI) come and cookie will be added again.
@tranchitam I see. Then pool of clients is probably the way to go. Alternatively you can implement custom org.asynchttpclient.cookie.CookieStore.
@unoexperto You mean 1 client for 1 new request? Pool of clients will cause performance issue as it will create a lot of threads, event loop... From the design, 1 async http client can be used globally and shared by multiple threads in 1 application. Customized org.asynchttpclient.cookie.CookieStore also won't help because you cannot avoid cookie overriding here in multithreading environment https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Interceptors.java#L73-L82
One more thing, if I disable cookie store, redirect will not work.
Our backend will send http requests to our partners to get best flights/hotels deals and display on website and mobile applications.
Looks a bit weird to me :) Usually, partners get private API with token base security. Cookies are usually for human users with browsers.
I see the following possibilities:
- disabling followRedirect and handle redirects and cookies yourself
- provide a PR that allows to pass a CookieStore in the RequestBuilder to override the global one from the config. This way, you'll be able to manage the scope of each CookieStore for each concurrent virtual user.
@slandelle Yeah, we have many partners. Most of them use API with token, but other partners are using cookies as their credentials information.
Could you look at the logic here https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Interceptors.java#L73-L82? As you can see, in multithreading environment, the later cookie will override the sooner cookie with the same URI.
With these lines of code https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Redirect30xInterceptor.java#L129-L136, it's possible we set wrong cookie for the next redirect request.
Thanks for your recommendations, but
- disabling followRedirect and handle redirects and cookies yourself -> redirect can work well without cookie store, just need to set cookie to the redirect request from response headers(Set-Cookie).
- provide a PR that allows to pass a CookieStore in the RequestBuilder to override the global one from the config. This way, you'll be able to manage the scope of each CookieStore for each concurrent virtual user -> I think it doesn't make sense to pass CookieStore for each RequestBuilder because from the response we can get cookies directly.
We have the option to disable cookie store by setting cookie store = null. What do you think about adding cookies from response headers to redirect request when cookie store is null? https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/handler/intercept/Redirect30xInterceptor.java#L129-L136
As you can see, in multithreading environment, the later cookie will override the sooner cookie with the same URI.
That's not a problem of multi-threading, it's a problem of your current logic trying to use a shared CookieStore with multiple concurrent virtual users trying to get and set cookies for the same domain.
It's fine as is and serves its purpose, typically crawling multiple website concurrently with one single virtual user for each.
What do you think about adding cookies from response headers to redirect request when cookie store is null?
No. First, the logic of domain and path matching is not trivial and you can't simply assume the redirect request must have the cookies from the redirected response. Then, what about the following non-redirect requests? They will be missing the cookie.
As I said, the best solution is to introduce the ability to let user pass a CookieStore at request level, so you can implement your own CookieStore isolation logic that would match your concurrent virtual users one.
@slandelle Got your idea, thanks. But if we introduce CookieStore at request level, will be cookies set back from global cookie store when we send a new request? https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/DefaultAsyncHttpClient.java#L193-L207
No. You’ll be the one in charge of handling lifecycle
Do you want me to keep global CookieStore or remove it completely?
Hi @slandelle,
I just created the PR https://github.com/AsyncHttpClient/async-http-client/pull/1567. Please help to take a look.
@tranchitam Hi, I had the same problem with you. Do you have solve this problem of global cookie?
I need cookie store per request too. I know most of HTTP clients don't support it (Apache does though), but in practice it's a fairly common case.
Any backend will make requests to different servers and you just don't want cookies to mix. Another case is you have multiple users, each should have their own cookie store.