Random Premature/Late Transmissions with Taprio Traffic Controller
Hey,
I'm currently using isochron to measure the delay from one I210 to another using a taprio traffic controller over VLAN. Unfortunately, I seem to be getting highly variable results. Sometimes the program runs perfectly fine for over a minute; other times it reports late/premature transmissions within seconds and discards all other timestamps.
The taprio traffic controller configuration is as follows:
qdisc taprio 100: root refcnt 9 tc 3 map 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0
queues offset 0 count 1 offset 1 count 1 offset 2 count 1
clockid TAI base-time 0 cycle-time 10000000 cycle-time-extension 0
index 0 cmd S gatemask 0x4 interval 2000000
index 1 cmd S gatemask 0x2 interval 2000000
index 2 cmd S gatemask 0x1 interval 6000000
isochron uses socket priority 6, which the map above assigns to traffic class 1.
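To make the schedule easier to read, here is a small Python sketch that turns the gatemask entries into per-traffic-class open windows within one cycle. The numbers are copied from the qdisc output above; the names `PRIO_TO_TC`, `SCHEDULE`, and `gate_windows` are made up for illustration, and the sketch assumes that bit n of the gatemask opens traffic class n.

```python
# Values copied from the `tc qdisc show` output above.
PRIO_TO_TC = [0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0]
# (gatemask, interval in nanoseconds) for schedule entries 0, 1, 2.
SCHEDULE = [(0x4, 2_000_000), (0x2, 2_000_000), (0x1, 6_000_000)]
NUM_TC = 3


def gate_windows(schedule, num_tc):
    """Return {tc: [(open_ns, close_ns), ...]} as offsets from the cycle start.

    Assumption: bit n of the gatemask opens the gate of traffic class n.
    """
    windows = {tc: [] for tc in range(num_tc)}
    offset = 0
    for gatemask, interval in schedule:
        for tc in range(num_tc):
            if gatemask & (1 << tc):
                windows[tc].append((offset, offset + interval))
        offset += interval
    return windows


windows = gate_windows(SCHEDULE, NUM_TC)
isochron_tc = PRIO_TO_TC[6]  # socket priority 6, as passed with -p 6
print(f"priority 6 -> traffic class {isochron_tc}")
for tc, spans in windows.items():
    for start, end in spans:
        print(f"tc {tc}: open {start / 1e6:.1f} ms to {end / 1e6:.1f} ms")
```

Under that assumption, traffic class 1 is open from 2 ms to 4 ms into each 10 ms cycle, which matches the 0.002 values passed as -S and -w in the send command below.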
The isochron send command I am using is as follows:
sudo isochron send -i enp1s0.5 -s 64 --client 10.0.0.1 -c 0.01 -t 1 -w 0.002 -F ${fileprefix}-isochron.dat -n 6500 -o -O 37 --cpu-mask $((1 << 1)) --sched-fifo --sched-priority 99 -4 -J 10.0.0.1 -p 6 -S 0.002 -Q
and the following is the isochron receive:
taskset -a $((1 << 1)) sudo isochron rcv -i enp5s0.5 -t 1 -O 37 --sched-fifo --sched-priority 99 -4
The resulting isochron output on the send side is:
Now: 1692646016.985340048
First wakeup: 1692646017.485900000
Base time: 1692646017.486200000
Cycle time: 0.001000000
Late transmission by 1 cycles detected for seqid 57 scheduled for 1692646017.542200000: TX hwts 1692646017.543384569
Timed out waiting for TX timestamps, 64944 timestamps unacknowledged
seqid 58 missing timestamps: hw,
...
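For reference, the lateness in that message can be worked out from the two timestamps it prints. The sketch below is plain arithmetic on the logged values, not necessarily how isochron computes it internally; `to_ns` is a hypothetical helper, and the cycle time is the 0.001 s printed in the output above rather than the -c 0.01 from the command.

```python
def to_ns(timestamp: str) -> int:
    """Convert a 'seconds.nanoseconds' string to integer nanoseconds."""
    seconds, fraction = timestamp.split(".")
    return int(seconds) * 1_000_000_000 + int(fraction.ljust(9, "0")[:9])


# Values copied from the send-side output above.
base_time = to_ns("1692646017.486200000")
cycle_time = to_ns("0.001000000")
scheduled = to_ns("1692646017.542200000")
tx_hwts = to_ns("1692646017.543384569")

# How far after its scheduled time the frame actually left the NIC.
delay_ns = tx_hwts - scheduled
late_cycles = delay_ns // cycle_time

# How many cycles after the base time this frame was scheduled.
cycles_since_base = (scheduled - base_time) // cycle_time

print(f"delay: {delay_ns} ns, whole cycles late: {late_cycles}")
print(f"scheduled {cycles_since_base} cycles after base time")
```

With these values the hardware TX timestamp is 1184569 ns after the scheduled time, which is one whole 1 ms cycle and agrees with the "Late transmission by 1 cycles" message.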
I've noticed that increasing the window size of isochron causes ptp4l (which is running on traffic class 2) to fault:
ptp4l[4324.490]: timed out while polling for tx timestamp
ptp4l[4324.490]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[4324.490]: port 1 (enp1s0.5): send delay request failed
ptp4l[4324.490]: port 1 (enp1s0.5): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Decreasing the window size any further, on the other hand, results in consistent premature/late transmissions from isochron.
I was mainly wondering whether this is common behavior with isochron, whether any configuration changes could help stabilize the results, and whether there are any other known causes that may be interfering with the isochron processes.