okhttp
java.net.SocketTimeoutException from HTTP/2 connection leaves dead okhttp clients in pool
Tried writing a unit test w/ TestButler on Android w/ no luck, so I'll write up the steps to reproduce this and include some sample code. This happens if you connect to an HTTP/2 server and your network goes down while the okhttp client is connected to it:
- create an okhttp client
- tell it to read from the HTTP/2 server
- bring the network down
- tell it to read from the HTTP/2 server (it'll get a SocketTimeoutException)
- bring the network back up
- tell it to read from the HTTP/2 server again (it'll be stuck w/ SocketTimeoutExceptions)
- if you create new http clients at this point, it'll work, but the dead http client will eventually come back in the pool and fail.
okhttp client should attempt to reopen the HTTP/2 connection instead of being stuck in this state
Code sample for Android (create a trivial view w/ a button and a textview):
public class MainActivity extends AppCompatActivity {
    OkHttpClient okhttpClient = new OkHttpClient();

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        Button loadButton = (Button) findViewById(R.id.loadButton);
        TextView outputView = (TextView) findViewById(R.id.outputView);
        loadButton.setOnClickListener(view -> Observable.fromCallable(() -> {
                    Request request = new Request.Builder()
                            .url(<INSERT URL TO YOUR HTTP/2 SERVER HERE>)
                            .build();
                    Response response = okhttpClient.newCall(request).execute();
                    return response.body().string();
                })
                .subscribeOn(Schedulers.io())
                .observeOn(AndroidSchedulers.mainThread())
                .subscribe(outputView::setText, t -> outputView.setText(t.toString()))
        );
    }
}
FYI, we found a workaround: set connectionPool in the builder to a new connection pool w/ a size of zero, and also turn off HTTP/2 support by giving the builder a new protocols list that contains only HTTP/1.1.
You’re using 3.6.0?
yep...3.6.0 unfortunately. Thought about rolling back to pre-http/2 support but that would mean 2.2 which is too far back because of all the okhttp3 dependencies :-(
Oh that's terrible. We've had problems with similar failures before but I thought we'd fixed ’em all. If you can make a test case that'd be handy, otherwise I'll try to look soon.
In the interim you can disable HTTP/2 with the protocols list in the OkHttpClient.Builder.
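For reference, the two-part workaround discussed above (a pool that keeps no idle connections, plus an HTTP/1.1-only protocol list) might look like the sketch below. It assumes OkHttp 3.x on the classpath; the WorkaroundClientFactory class and its create() method are hypothetical names used only for illustration, and the pool arguments mirror what people in this thread reported using rather than any recommended values.

```java
import java.util.Collections;
import java.util.concurrent.TimeUnit;

import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;
import okhttp3.Protocol;

// Hypothetical helper class; only the builder calls come from OkHttp itself.
final class WorkaroundClientFactory {
    private WorkaroundClientFactory() {}

    /** Builds a client that keeps no idle connections and only offers HTTP/1.1. */
    static OkHttpClient create() {
        return new OkHttpClient.Builder()
                // maxIdleConnections = 0 so connections are not kept around once idle.
                // The keep-alive duration must be positive, so use a tiny value.
                .connectionPool(new ConnectionPool(0, 1, TimeUnit.NANOSECONDS))
                // Offer only HTTP/1.1 so HTTP/2 is never negotiated.
                .protocols(Collections.singletonList(Protocol.HTTP_1_1))
                .build();
    }
}
```

The trade-off is that every request pays for a fresh connection (and TLS handshake), so this is probably best treated as a stopgap.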
Correct me if I'm wrong, but I think part of this is working as expected. HTTP/2 connections can carry N outstanding requests. If one of those requests times out and the HTTP/2 connection is closed, then the other N - 1 requests are also lost. I think the intent is that for HTTP/2 connections, a timeout does not necessarily mean the connection is bad.
Is it surprising to 'bring the network down' and not receive any sort of socket exception reading or writing?
N-1 requests being lost is fine if the connection is down. The issue is that it doesn't recover when you bring the network back up...i.e., the broken idle connection objects in the pool stay there, and when you try connecting again, you can't connect until the user kills off your app to restart everything...
@swankjesse : I couldn't figure out how to write a test for this because disconnecting all the sockets was happening at an OS level. Tried to write an Android Test Butler one (to flip the network switch on/off on an Android emulator) but the current version of that has issues and probably wouldn't work in this code base :-)
So our attempts to write to the socket are failing silently? Might need to steal the automatic pings that we added for web sockets.
Essentially...not that they're failing silently, but they're dead sockets and they're stuck in the pool. We traced through a bit of the code and saw some code that was pulling a dead socket out of the pool each time it tried to use one, which should have cleared things up after 5 dead sockets were pulled out, but the network layer still appeared stuck unless we purged the pool w/ evictAll() or waited for the 5 min eviction timeout. Wasn't obvious what a proper fix was... HTTP/2 essentially behaves like web sockets so you're probably on the right track...
Pretty sure this issue is another manifestation of this one:
https://github.com/square/okhttp/issues/3118
I'm sure it's not. We don't see any SSL Handshake exceptions.
This bug is actually probably two bugs because we had to disable the connection pool and the HTTP/2 support. #3118 might be affected by the connection pool bug (it doesn't clear the broken idle connection objects in the pool).
I've seen what you've described but then also the ssl exceptions. Same steps to reproduce as you outlined.
Any updates? I have the same issue.
The workaround I described works in our QA testing so far :-)
@kenyee setting new pool works, but I wonder when an update will arrive?
Is this issue resolved? I ran into the same issue using 3.5.0. I am using OkHttp to send push notifications to Apple over HTTP/2. Yesterday I had this issue, resulting in almost 80k push messages not getting delivered.
Caused by: java.net.SocketTimeoutException: timeout
at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:587) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:595) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Stream.getResponseHeaders(Http2Stream.java:140) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:115) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179) ~[okhttp-3.5.0.jar:?]
at okhttp3.RealCall.execute(RealCall.java:63) ~[okhttp-3.5.0.jar:?]
After I got this error, none of my other requests succeeded.
Code:
KeyStore ks = KeyStore.getInstance("PKCS12");
// Load the PKCS#12 keystore from the file at this path.
ks.load(new FileInputStream("/foo/bar/mycert"), password.toCharArray());
KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
kmf.init(ks, password.toCharArray());
KeyManager[] keyManagers = kmf.getKeyManagers();
SSLContext sslContext = SSLContext.getInstance("TLS");
final TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
tmf.init((KeyStore) null);
sslContext.init(keyManagers, tmf.getTrustManagers(), null);
TrustManager[] trustManagers = tmf.getTrustManagers();
if (trustManagers != null && (trustManagers.length != 1 || !(trustManagers[0] instanceof X509TrustManager))) {
    throw new IllegalStateException("Unexpected default trust managers:"
            + Arrays.toString(trustManagers));
}
final X509TrustManager trustManager = (X509TrustManager) trustManagers[0];
final SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
OkHttpClient.Builder builder = new OkHttpClient.Builder();
builder.connectTimeout(5, TimeUnit.SECONDS).writeTimeout(10, TimeUnit.SECONDS).readTimeout(10, TimeUnit.SECONDS);
builder.connectionPool(new ConnectionPool(3, 10, TimeUnit.MINUTES));
builder.sslSocketFactory(sslSocketFactory, trustManager);
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
builder.proxy(proxy);
OkHttpClient client = builder.build();
Since SocketTimeoutException is a subclass of IOException, I am not sure if the following approach will work. Can one of you pls get back to me?
I am calling evictAll() in the catch block of IOException.
try {
    response = client.newCall(request).execute();
    statusCode = response.code();
    responseBody = response.body().string();
} catch (IOException ioe) {
    client.connectionPool().evictAll();
} finally {
    if (response != null) {
        response.body().close();
    }
}
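On the narrower question of whether a catch (IOException) block sees the timeout at all: java.net.SocketTimeoutException extends java.io.InterruptedIOException, which extends IOException, so it does. The stdlib-only sketch below just demonstrates that catch ordering; TimeoutCatchDemo and its methods are hypothetical names, and nothing here touches OkHttp or says whether evictAll() is the right reaction.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical demo class: shows which catch branch handles which exception.
final class TimeoutCatchDemo {
    private TimeoutCatchDemo() {}

    /** Returns true if a SocketTimeoutException lands in a plain IOException handler. */
    static boolean caughtByIOExceptionHandler() {
        try {
            throw new SocketTimeoutException("timeout");
        } catch (IOException e) {
            return true;
        }
    }

    /** Reports which branch handles the given exception when timeouts are split out. */
    static String classify(IOException toThrow) {
        try {
            throw toThrow;
        } catch (SocketTimeoutException e) {
            // The more specific type must be caught before IOException.
            return "timeout";
        } catch (IOException e) {
            return "other I/O failure";
        }
    }
}
```

If you only want to purge the pool on timeouts rather than on every I/O error, the second form (a SocketTimeoutException branch ahead of the IOException branch) is one way to separate the two cases.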
Also how do we check if a connection is stale or not?
With Apache HttpClient, there is a way to do it to set a flag for checking stale connections. Wondering how OkHttp3 checks for it internally before it uses the connection.
CloseableHttpClient client = HttpClients.custom()
        .setDefaultRequestConfig(
                RequestConfig.custom().setStaleConnectionCheckEnabled(true).build())
        .setConnectionManager(connManager)
        .build();
Any updates? I have the same issue too. :(
Same issue here!
We're still experiencing the same issue :-(
I think I'm seeing another manifestation of this on 3.5.0, when the server forcibly closes the connection.
We try to establish both an h2 and an http/1.1 connection. The server responds with 200 to both:
06-26 15:07:55.286 22094 22380 I okhttp3.OkHttpClient: --> GET<url> http/1.1
06-26 15:07:55.524 22094 22380 I okhttp3.OkHttpClient: --> GET<url> h2
06-26 15:07:55.596 22094 22380 I okhttp3.OkHttpClient: <-- 200 <url> (71ms)
06-26 15:07:55.597 22094 22380 I okhttp3.OkHttpClient: <-- 200 <url> (303ms)
Then at some point we try to read from the http2 connection, which fails in checkNotClosed and throws a StreamResetException
06-26 15:06:01.560 22094 22126 I MyProject: Caused by: okhttp3.internal.http2.StreamResetException: stream was reset: PROTOCOL_ERROR
06-26 15:06:01.560 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$FramedDataSource.checkNotClosed(Http2Stream.java:428)
06-26 15:06:01.560 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$FramedDataSource.read(Http2Stream.java:330)
06-26 15:06:01.560 22094 22126 I MyProject: at okio.ForwardingSource.read(ForwardingSource.java:35)
06-26 15:06:01.560 22094 22126 I MyProject: at okio.RealBufferedSource$1.read(RealBufferedSource.java:409)
06-26 15:06:01.560 22094 22126 I MyProject: at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:699)
06-26 15:06:01.560 22094 22126 I MyProject: at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:424)
Then, since this is media, we do something that causes a seek to 0 in the media, which needs to reopen the request from the beginning. At this point, we see the same exception as is posted above:
06-26 15:08:39.387 22094 22126 I MyProject: Caused by: java.net.SocketTimeoutException: timeout
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:587)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:595)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream.getResponseHeaders(Http2Stream.java:140)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:115)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.RealCall.execute(RealCall.java:63)
This seems to be very similar to the other cases here, which all seem to be related to an ungraceful shutdown of the connection while it remains pooled.
I've also confirmed that disabling the ConnectionPool "works around" this issue:
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder()
        .connectTimeout(connectTimeoutMillis, TimeUnit.MILLISECONDS)
        .retryOnConnectionFailure(true)
        .readTimeout(readTimeoutMillis, TimeUnit.MILLISECONDS)
        .connectionPool(new ConnectionPool(0, 1, TimeUnit.NANOSECONDS));
Is there any update on this issue?
Same issue here
any idea if/when this will be fixed? We're seeing the same issue.
@jpearl
I can confirm that disabling the ConnectionPool stops the StreamResetException when HTTP/2 is in use.
I'm also using ExoPlayer with OkHttp. In my case it happened when my app went to the background: even with Battery Optimizations turned off, the connection was closed after a few minutes in the background, and I got a SocketTimeoutException when the player tried to play the next track.
I was thinking of using DefaultHttpDataSource for the ExoPlayer requests, because it also works without throwing SocketTimeoutException, but disabling the ConnectionPool is the better option for me at the moment.
I'll keep it disabled for now until I find a better solution.
OkHttp: 3.8.1
Model: Huawei P9 Lite
OS: 7.1.2
Thanks for sharing this!
@alessandrojp do you know if the ExoPlayer team is aware of this issue? We've run into it only with exoplayer as well.
I'm also seeing SocketTimeout / dead client issues with OkHttp and Exoplayer.
Evicting all connections from the connection pool resolves the SocketTimeoutException: if (throwable instanceof SocketTimeoutException) { okHttpClient.connectionPool().evictAll(); }
I am also facing a similar issue. When my app goes to the background, the internet goes off and comes back, and I then return to the app, nothing loads; the request doesn't even go out and I get a timeout. After that, all requests behave the same way. Please help me solve this issue.
For now, after reading the thread above, I have put in a hack: whenever I get an IOException, I evict all connections from the connection pool. This solves the problem, but the failure still happens at least once and the user sees a retry/reload screen.
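One possible way to hide that first failure from the user is to evict and retry once before surfacing the error. The sketch below is stdlib-only and purely illustrative: EvictAndRetry and execute are hypothetical names, the Callable is assumed to build and execute a fresh call each time (something like client.newCall(request).execute(), since an OkHttp Call can only be executed once), and the evict callback is assumed to be something like client.connectionPool().evictAll().

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper: retry a failed I/O operation once after purging pooled state.
final class EvictAndRetry {
    private EvictAndRetry() {}

    /**
     * Runs {@code call}. If it throws an IOException, runs {@code evict} and
     * tries exactly one more time; a second failure propagates to the caller.
     */
    static <T> T execute(Callable<T> call, Runnable evict) throws Exception {
        try {
            return call.call();
        } catch (IOException first) {
            evict.run();         // drop the (possibly dead) pooled connections
            return call.call();  // single retry, expected to open a new connection
        }
    }
}
```

A blind retry like this is only reasonable for idempotent requests such as GET; retrying a POST this way could send it twice if the first attempt actually reached the server.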
@swankjesse Any news on this?