onload
RTO timer being removed
Hi,
I am debugging an issue with packets not being retransmitted. I can see an RTO timer created in ci_tcp_ds_done(), yet when ci_ip_timer_poll() runs a few milliseconds later it is no longer there. I instrumented all the ci_tcp_rto_* functions that manipulate the timer, and none of them is called. I also instrumented the actual RTO callback; it is not called either.
Can someone please tell me what removes the RTO timer? Thanks
Two obvious ways to remove the RTO timer are:
- incoming ACK;
- TCP connection status change (shutdown, reset, etc).
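To make the two cases above concrete, here is a minimal sketch of the usual TCP rule (as in RFC 6298): the retransmission timer stays armed while data is outstanding and is stopped once everything is acknowledged or the connection goes away. The `demo_*` names and the struct are hypothetical and only illustrate the rule; they are not Onload's actual data structures or code paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; NOT Onload's real structures. */
struct demo_sender {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to be sent */
    bool     rto_armed; /* true while the retransmission timer is pending */
};

/* Sending data arms the timer if it is not already running. */
static void demo_send(struct demo_sender *s, uint32_t len)
{
    assert(s != NULL);
    s->snd_nxt += len;
    if (!s->rto_armed)
        s->rto_armed = true;
}

/* Case 1: an incoming ACK. If it covers all outstanding data the timer
 * is stopped; if data is still in flight the timer stays armed. */
static void demo_ack(struct demo_sender *s, uint32_t ack)
{
    assert(s != NULL);
    /* Ignore ACKs that do not advance snd_una or that ack unsent data. */
    if ((int32_t)(ack - s->snd_una) <= 0 || (int32_t)(s->snd_nxt - ack) < 0)
        return;
    s->snd_una = ack;
    s->rto_armed = (s->snd_una != s->snd_nxt);
}

/* Case 2: a connection status change (shutdown, reset, ...) drops the timer. */
static void demo_reset(struct demo_sender *s)
{
    assert(s != NULL);
    s->rto_armed = false;
}
```

In this model, sending 100 bytes arms the timer, an ACK for 50 leaves it armed, and an ACK for 100 or a call to `demo_reset()` clears it.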
I instrumented all the ci_tcp_rto_* manipulating methods
Just to be sure: do you understand that such functions exist in both the libonload.so library and the onload.ko kernel module? If "instrumenting" means ci_log(), then you should recompile both (and reload the module). If you use other means, you should again remember this and apply the instrumentation twice.
I am debugging just the userspace libonload.so. I can see the RTO timer there (dumped by ci_ip_timer_state_dump()), yet the next call (on the same netif) from ci_ip_timer_poll() doesn't list the RTO timer. I am not sure why or how the kernel module could affect those.
Regarding ACKs: am I wrong in assuming that an ACK removes the RTO timers only in ci_tcp_rx_free_acked_bufs()? That is not called in my case. Or is there some other place that I missed?
Let me repeat: all these functions can be called from onload.ko. It is completely useless to "debug just the userspace". For example:
- I can see the RTO timer there (dumped by ci_ip_timer_state_dump())
- The timer is handled from onload.ko
- Next call (on the same netif) from ci_ip_timer_poll() doesn't list the RTO timer.
Am I wrong in assuming that an ACK removes the RTO timers only in ci_tcp_rx_free_acked_bufs()? That is not called in my case.
How can you say that when you have no idea of what's going on in the kernel module?
So what you're implying is that the kernel and the userspace library both access the same data structures for the same TCP connection? That would explain why I am not seeing the ACK.
Could you give me a few pointers on where to look to see how this userspace/kernel cooperation is implemented?
So what you're implying is that the kernel and the userspace library both access the same data structures for the same TCP connection?
Yes.
Could you give me a few pointers on where to look to see how this userspace/kernel cooperation is implemented?
"netif state", ni->state, is the shared memory. It is mapped into both the kernel and userland.
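As a rough analogy for why instrumenting only one side misses the change, here is a small sketch in which two contexts share one mapping. It assumes a Linux/POSIX system, and a forked child process stands in for the kernel module; all `demo_*` names are hypothetical and nothing here is Onload's real layout or API. The "timer" lives in shared memory, while the instrumentation counter is private to each process.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for state mapped into two contexts. */
struct demo_shared_state {
    int rto_pending;            /* 1 while the "timer" is armed */
};

/* "Instrumentation": counts calls made in THIS process only. */
static int demo_clear_calls;

static void demo_timer_clear(struct demo_shared_state *st)
{
    assert(st != NULL);
    ++demo_clear_calls;
    st->rto_pending = 0;
}

/* Arms the timer in the parent, lets the child clear it, and returns the
 * value the parent then observes (or -1 on error). */
static int demo_run(void)
{
    struct demo_shared_state *st =
        mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    st->rto_pending = 1;        /* the parent arms the timer */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(st, sizeof(*st));
        return -1;
    }
    if (pid == 0) {
        /* The other context clears the timer through its own copy of the
         * function, so only the child's private counter is incremented. */
        demo_timer_clear(st);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(st, sizeof(*st));
        return -1;
    }

    int seen = st->rto_pending; /* the write went through the shared mapping */
    munmap(st, sizeof(*st));
    return seen;
}
```

Calling `demo_run()` from the parent returns 0 (the timer is gone) while the parent's `demo_clear_calls` is still 0, which mirrors the symptom in this thread: the state changed, yet none of the locally instrumented functions ran.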