mptcp_net-next
scheduler: react when subflow-level events pop up (ACK/RTO)
When discussing the packet scheduler API at the last meeting, it sounded very likely that the current packet scheduler will not react by queuing more packets when subflow-only events are emitted, e.g. new TCP ACKs that only ack data at the TCP level, not at the MPTCP level.
The scheduler should probably be called when such events happen.
This can be checked with packetdrill: an ACK is received at the TCP level, and the scheduler might not send anything while it should (more room is available).
Hints:
- hooking into TCP window-level changes might be enough:
  - maybe too frequent?
  - might affect performance?
- or hook into the MPTCP options handling to do more checks there?
  - quite heavy: the core should filter events first
How about invoking __mptcp_push_pending() right after mptcp_pm_nl_work() in mptcp_worker()? Something like:

```diff
 	mptcp_pm_nl_work(msk);

+	__mptcp_push_pending(sk, 0);
+
 	mptcp_check_send_data_fin(sk);
 	mptcp_check_data_fin_ack(sk);
 	mptcp_check_data_fin(sk);
```
> How about invoking __mptcp_push_pending() right after mptcp_pm_nl_work() in mptcp_worker()?
I forgot to reply to this one: we talked about this suggestion at the weekly meeting on the 19th of September.
It doesn't seem OK:
- we don't want to do such a thing from the MPTCP worker (we need to limit the work done there)
- the scheduler might need to react each time an ACK is received or an RTO fires: more frequently than the worker is invoked
From the last meeting:
- What is important before sending any changes to netdev is to have a "performance environment":
  - to measure the differences, see the impact, etc.
  - it might be good to add such tests to the selftests:
    - we can depend on external tools, e.g. the BIG TCP selftests use netperf
    - but maybe not on bigger tools like transperf, even if it would already analyse a few things for us
    - udpgso_bench_[rt]x.c might be extended
    - @matttbe is going to look at that
  - it should run on HW devices and on virtual ones:
    - the CPU utilisation is not the same
    - some ideas: use RPS on the RX side and add a netem delay on the TX side, to avoid the same context being reused and to vary the CPU usage
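As a sketch of such a performance environment, the RPS/netem ideas above could look like the script below. The interface name, delay, CPU mask and peer address are placeholders to adapt to the machine; `mptcpize` comes from mptcpd. By default (`DRY_RUN=1`) it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of a "performance environment": netem delay on TX, RPS on RX,
# then a netperf run over MPTCP. Placeholders, not a finished setup.
IFACE="${IFACE:-eth0}"
DELAY="${DELAY:-2ms}"
RPS_MASK="${RPS_MASK:-f}"  # CPUs 0-3; adjust to the machine
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands

run() {
	echo "+ $*"
	[ "$DRY_RUN" = 1 ] || "$@"
}

# TX: add a small netem delay so completions come back in another context
run tc qdisc replace dev "$IFACE" root netem delay "$DELAY"

# RX: enable RPS so receive processing is spread across CPUs
run sh -c "echo $RPS_MASK > /sys/class/net/$IFACE/queues/rx-0/rps_cpus"

# Measure MPTCP throughput with netperf (mptcpize forces IPPROTO_MPTCP)
run mptcpize run netperf -H 192.0.2.1 -t TCP_STREAM -l 30
```

Running the same script on a HW device and in a VM would expose the CPU-utilisation differences mentioned above.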
- It looks like this work on the scheduler is challenging, and it would be better to prepare it, split it, etc.:
  - so if someone wants to work on that, it is best to come up with a plan: the different steps, what will change in the architecture, etc.
  - the goal is not to spend a long time in reviews, completely reworking the code each time, etc.