Running an iiwa at 1kHz

Open matthias-mayr opened this issue 2 years ago • 2 comments

We are currently evaluating a move from a 500 Hz control rate to 1 kHz, and I am interested in whether anyone else has achieved a setup in which this works reliably.

Without #47 and #48 we had issues running at anything higher than 200 Hz; with them we can work reliably at 500 Hz.

However even though we use

  • a dedicated and recent 6-core i5 machine with the PREEMPT_RT patch set,
  • the highest RT priority assigned to the driver (see #47),
  • real network ports, and
  • a direct cable connection,

we are experiencing interruptions at 1 kHz. Our controller takes about 800 microseconds to evaluate, and it does not look like it is causing the interruptions.
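To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch; the function name `cycle_budget` is made up for illustration and is not part of iiwa_ros.

```python
from typing import Tuple


def cycle_budget(rate_hz: float, compute_us: float) -> Tuple[float, float]:
    """Return (period, slack) in microseconds for one control cycle.

    rate_hz    -- control rate, e.g. 1000.0 for 1 kHz
    compute_us -- time the controller needs per cycle, in microseconds
    """
    period_us = 1e6 / rate_hz
    slack_us = period_us - compute_us
    return period_us, slack_us


if __name__ == "__main__":
    # 800 us is the controller evaluation time quoted above.
    for rate in (500.0, 1000.0):
        period, slack = cycle_budget(rate, 800.0)
        print(f"{rate:.0f} Hz: period {period:.0f} us, slack {slack:.0f} us")
```

At 1 kHz an 800 microsecond controller leaves about 200 microseconds per cycle for the driver, the network round trip, and any scheduling jitter, compared with 1200 microseconds at 500 Hz.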

One observation is that as soon as the controller publishes more data (which does not make it slower, we profiled that), the interruptions happen much sooner. So right now the working assumption is that it's related to OS + network.

A second observation is that we never get a connection quality of 3 (excellent), but only 2 (good) and then after some time it immediately drops to 0 (poor) and interrupts.

Does anyone else run this robot at 1 kHz? If yes, which other tricks did you apply?

matthias-mayr avatar Nov 14 '22 21:11 matthias-mayr

Hi Matthias, we're running at 1 kHz, but with an Orocos-based driver and controllers. We're only using ROS2 for non-hard-RT communication, through an Orocos-ROS2 bridge. We're also using a dedicated machine with PREEMPT_RT, RT priority for the driver, and a direct cable connection.

However, it's worth mentioning that, even though it works most of the time (with an excellent reported connection quality), we're still experiencing random connection losses during long runs (roughly 10-20 min) and when switching controllers (the latter is probably related to Orocos, though).
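For reference, giving a process RT priority on Linux comes down to a `sched_setscheduler` call. Below is a minimal sketch, written in Python only for brevity (a C++ driver would use the equivalent POSIX call). The function name and the priority value are illustrative assumptions and are not taken from either driver.

```python
import os


def request_fifo_priority(priority: int, pid: int = 0) -> bool:
    """Try to switch `pid` (0 = calling process) to SCHED_FIFO.

    Returns True on success and False if the platform lacks the API or
    the OS refuses the request (typically missing privileges).
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # not a Linux-like platform
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    # Clamp the request into the range the kernel accepts.
    priority = max(lowest, min(highest, priority))
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # 80 is an arbitrary example value, not a recommendation.
    if request_fifo_priority(80):
        print("running with SCHED_FIFO")
    else:
        print("could not get RT priority (insufficient privileges?)")
```

The call usually needs root, the CAP_SYS_NICE capability, or a suitable rtprio limit for the user, so checking the return value is worthwhile: a driver that silently falls back to the default scheduler is easy to miss.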

Things we tried that could help you (in no particular order):

  • Reduce as much as possible the computational load on the update loop, obviously, but also avoid memory allocation and logging inside it.
  • Only give RT priority to the driver.
  • Modify the boot parameters of the RT kernel by adding the following to the "linux" line between "quiet splash" and "$vt_handoff":
    nosmt noefi intel_idle.max_cstate=1 pstate_driver=no_hwp intel_pstate=no_hwp clocksource=tsc tsc=reliable rcu_nocb_poll nmi_watchdog=0 nosoftlockup nosmap audit=0 irqaffinity=<NOT cpuID(s)> isolcpus=<cpuID(s)> rcu_nocbs=<cpuID(s)> nohz_full=<cpuID(s)>
    Essentially, this disables hyper-threading and isolates the CPU(s) (isolcpus=<cpuID(s)>) to which you should then assign your driver process. We're not using this anymore since it didn't seem to work well with Orocos, but it's definitely worth trying.
  • Instead of letting the blocking FRI read call define the rate, schedule an update whenever there is activity on the socket used by FRI. In theory this induces a slight delay, but it can be worth trying as it prevents your CPU from being blocked by the FRI read.
  • Just in case, prefer a desktop computer over a laptop. We also have a dedicated high-speed Ethernet card in it, but I'm not sure it's necessary.
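The socket-activity idea from the list can be sketched as follows. This is a generic illustration on a plain UDP socket, in Python for brevity. It does not speak the FRI protocol, and `run_on_activity` and the loopback demo are hypothetical names and setup, not part of the FRI SDK or of any driver mentioned here.

```python
import selectors
import socket


def run_on_activity(sock, on_packet, max_packets, timeout_s=1.0):
    """Call on_packet(data) each time `sock` becomes readable.

    Returns the number of packets handled. Stops early if nothing
    arrives within `timeout_s`, which a real driver might treat as a
    lost connection.
    """
    sock.setblocking(False)
    handled = 0
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        while handled < max_packets:
            if not sel.select(timeout=timeout_s):
                break  # timed out without any socket activity
            try:
                data, _addr = sock.recvfrom(2048)
            except BlockingIOError:
                continue  # spurious wake-up, wait again
            on_packet(data)  # this is where the control update would run
            handled += 1
    return handled


if __name__ == "__main__":
    # Loopback demo: three datagrams stand in for robot state messages.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(3):
        tx.sendto(bytes([i]), rx.getsockname())
    received = []
    count = run_on_activity(rx, received.append, max_packets=3)
    print(count, received)
    tx.close()
    rx.close()
```

The update is triggered by the readiness notification rather than by a blocking read, so the thread sleeps in `select` between cycles and a missing packet shows up as a timeout instead of a stalled call.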

I'll edit my comment if I think of something else.

Hope it helps!

AntoineHX avatar Nov 17 '22 15:11 AntoineHX

Thanks a lot for the tips. A quick update:

We noticed that catkin workspaces do not set a build type by default, and this seems to have been the main reason for our interruptions. It turned out that we had built both the driver and the controller without any compiler optimizations. Hence https://github.com/matthias-mayr/Cartesian-Impedance-Controller/commit/9427a198a068e6b745fd9107e46b4eaa2cb02383 and #94, so that we do not miss this again.
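One way to check for this is to look at the `CMAKE_BUILD_TYPE` entry that CMake records in `CMakeCache.txt`. The helper below is a small sketch with made-up function names. The cache path is an assumption: it depends on the build tool and workspace layout (for example `build/CMakeCache.txt` with `catkin_make`, or one cache per package with catkin tools).

```python
from pathlib import Path
from typing import Optional

# Standard CMake build types that enable compiler optimizations.
OPTIMIZED_TYPES = {"Release", "RelWithDebInfo", "MinSizeRel"}


def build_type_from_cache(text: str) -> Optional[str]:
    """Extract CMAKE_BUILD_TYPE from the contents of a CMakeCache.txt.

    Returns None if the entry is absent and "" if it is present but
    empty, which is what an unset build type looks like.
    """
    for line in text.splitlines():
        if line.startswith("CMAKE_BUILD_TYPE:"):
            return line.split("=", 1)[1].strip()
    return None


def is_optimized(build_type: Optional[str]) -> bool:
    """True if the build type is one of the optimized standard types."""
    return build_type in OPTIMIZED_TYPES


if __name__ == "__main__":
    cache = Path("build/CMakeCache.txt")  # assumed location, adjust as needed
    if cache.is_file():
        build_type = build_type_from_cache(cache.read_text())
        print(f"build type: {build_type!r}, optimized: {is_optimized(build_type)}")
    else:
        print(f"{cache} not found")
```

An empty or missing value means the build used only the default compiler flags. Passing `-DCMAKE_BUILD_TYPE=Release` at configure time is one way to set it explicitly.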

I only had time to test it for a couple of minutes, but at least that worked at 1 kHz. We already do 1., 2. & 5. of your list. If we should encounter issues, I can check out number 3 & 4 as well. Thanks again!

matthias-mayr avatar Nov 23 '22 22:11 matthias-mayr