okhttp
okhttp copied to clipboard
OkHttpClient POST request fails when Request body size beyond certain limit
I am making a sequence of POST requests with each subsequent request passing the next token received the previous response in the request body of the new request. I am noticing that if POST HTTP request body size is beyond a certain limit, the request times out after certain number of requests in the sequence. Typical request body size is about 14KB. With OkHttpClient event listener I notice that (for the failing call)
override fun requestBodyStart(call: Call)
is invoked but not
override fun requestBodyEnd(call: Call, byteCount: Long)
Instead override fun requestFailed(call: Call, ioe: IOException) is called with Socket timeout exception.
It works if make this call before each request (but looks like a hack)
okHttpClient.connectionPool.evictAll()
Not sure if OkHTTPClient maintains some state within. Charles is showing Failure status with IO: Stream cancelled by CLIENT message.
I am using
implementation "com.squareup.okhttp3:okhttp:4.9.3"
Backend is not in my control and I cannot optimize the request body size. Apparently, a truncated-body request reaches server which doesn't response and hence the timeout. Is there a way to fix this issue. Note: If I replace OkHttp with Volley I do not see this issue.
Hmmm... not enough information to suggest any fixes here. I'm curious if your calling code is blocked waiting for the server to read?
Can you isolate this into a standalone test case?
@swankjesse thanks. Here is my test code. Possibly I can try using a fake server to create a standalone test.
object NetworkModule {
private lateinit var webservice: Webservice
private lateinit var okHttpClient: OkHttpClient
fun initRetrofit() {
val loggingInterceptor = HttpLoggingInterceptor()
loggingInterceptor.setLevel(HttpLoggingInterceptor.Level.BODY)
val builder = OkHttpClient.Builder()
builder.addInterceptor(loggingInterceptor)
//builder.eventListener(MyEventListener())
okHttpClient = builder.build()
val retrofit = Retrofit.Builder()
.addConverterFactory(GsonConverterFactory.create())
.baseUrl(BASE_URL)
.client(okHttpClient)
.build()
webservice = retrofit.create(Webservice::class.java)
}
private var next: String? = ""
suspend fun getData() {
do {
val data = webservice.getData(getHeaders(), makeRequest(next))
next = data.items?.result?.get(0)?.meta?.next
val itemCount = data.items?.result?.get(0)?.items?.size
if (next?.isEmpty() == true) {
break
}
/*CoroutineScope(Dispatchers.IO).launch {
okHttpClient.connectionPool.evictAll() /// <-- THIS WILL PREVENT THE ERROR/TIMEOUT CRASH
}*/
delay(2000)
} while (next != null && next?.isNotEmpty() == true)
}
private fun getHeaders(): Map<String, String> {
val headers = mutableMapOf<String,String>()
headers["Content-Type"] = "application/json"
headers["Cookie"] = COOKIE
headers["User-Agent"] = UA_STR
return headers
}
private fun makeRequest(next: String?): MyRequest {
return MyRequest(next = next, topics = listOf(Topic()))
}
private const val TAG = "NetworkModule"
private const val BASE_URL = "https://<host>"
private const val UA_STR = "..."
private const val COOKIE = "..."
}
interface Webservice {
@POST("... url ....")
suspend fun getData(@HeaderMap headerMap: Map<String, String>, @Body body: MyRequest): MyResponse
}
class MyApplication: Application() {
override fun onCreate() {
super.onCreate()
NetworkModule.initRetrofit()
CoroutineScope(Dispatchers.IO).launch {
NetworkModule.getData()
}
}
}
Yeah this looks completely reasonable. Any chance you can post a thread dump when it stalls? I’m wondering if it’s stalling in OkHttp or possibly in coroutines somewhere?
+1 to a thread dump.
Also compare the working and non working case with an event listener printing all events including connections. https://square.github.io/okhttp/features/events/
Yeah this looks completely reasonable. Any chance you can post a thread dump when it stalls? I’m wondering if it’s stalling in OkHttp or possibly in coroutines somewhere?
@swankjesse attached the thread dump okhttp-thread-dump.txt
+1 to a thread dump.
Also compare the working and non working case with an event listener printing all events including connections. https://square.github.io/okhttp/features/events/
Here is the log for not-working case. okhttp-events.txt
Thread dump shows it's an issue with HTTP/2 flow control. It's blocking on this line:
[email protected]() // Wait until we receive a WINDOW_UPDATE.
Somehow OkHttp and your HTTP/2 server don't agree on flow control things, because the server has not granted OkHttp permission to send more bytes.
If you do an HTTP/2 frame dump we should be able to assign blame to either OkHttp or your HTTP/2 server.
https://square.github.io/okhttp/contribute/debug_logging/#http2-frame-logging
@swankjesse attaching the frame dump okhttp.timeout.log
You can see the timeout happing after the log with timestamp 15:27:38.620 1950-2863
Will this help?
@saravr if you’ll forgive me for sitting on this for 2 months, I’ve got a PR out to add even more information to aid debugging. https://github.com/square/okhttp/pull/7325
I’ve also got a hypothesis for what’s happening...
A few streams (0x00000005, 0x00000013) are canceled by the client:
...
2022-05-04 15:27:19.341 1950-2971/com.myapp.debug W/System.err: ODL: [2022-05-04 15:27:19] >> 0x00000005 4 RST_STREAM
...
2022-05-04 15:27:48.650 1950-2857/com.myapp.debug W/System.err: ODL: [2022-05-04 15:27:48] >> 0x00000013 4 RST_STREAM
Later the server does a connection-wide window update:
2022-05-04 15:27:48.698 1950-2851/com.myapp.debug W/System.err: ODL: [2022-05-04 15:27:48] << 0x00000000 4 WINDOW_UPDATE
But is this window update big enough? In particular does it account for the received bytes of the canceled streams? The additional debugging in the PR should tell us.
Relevant section from the HTTP/2 spec
Flow-controlled frames (i.e., DATA) received after sending
RST_STREAM are counted toward the connection flow-control window.
Even though these frames might be ignored, because they are sent
before the sender receives the RST_STREAM, the sender will
consider the frames to count against the flow-control window.
One other gotcha – if you’re canceling a stream but not closing the ResponseBody, you will have flow control errors. Make sure you’re closing the response body.
@swankjesse Any chance this is fixed by https://github.com/square/okhttp/pull/7801
or likely needs a similar writer side change?
#7801 is the right fix.