klipper Move from straight Ziegler-Nichols To Tyreus-Luyben variant

The Ziegler-Nichols method generates aggressive gain.

This is a well-known and well-understood issue, to the degree that it's even mentioned on the Wikipedia page ("It yields an aggressive gain and overshoot – some applications wish to instead minimize or eliminate overshoot, and for these this method is inappropriate").

For Klipper, this tends to result in overshoot on heater beds (especially large ones) and causes larger-than-needed oscillation in hotend temperatures.

This moves to a slightly different set of constants that does better on both counts, without causing regressions for existing hotends (as far as I can tell).
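For reference, here is a minimal sketch of how the two rules differ once the ultimate gain `Ku` and oscillation period `Tu` are known. It uses the textbook Ziegler-Nichols and Tyreus-Luyben PID constants, which are an assumption here: the exact constants in this PR may differ, and the function names and sample values are hypothetical.

```python
def ziegler_nichols(Ku, Tu):
    """Textbook Ziegler-Nichols PID rule: returns (Kp, Ki, Kd)."""
    Kp = 0.6 * Ku
    Ti = 0.5 * Tu
    Td = 0.125 * Tu
    return Kp, Kp / Ti, Kp * Td


def tyreus_luyben(Ku, Tu):
    """Textbook Tyreus-Luyben PID rule: returns (Kp, Ki, Kd)."""
    Kp = Ku / 2.2
    Ti = 2.2 * Tu
    Td = Tu / 6.3
    return Kp, Kp / Ti, Kp * Td


# Hypothetical relay-test results, chosen only for illustration.
Ku, Tu = 10.0, 20.0
zn = ziegler_nichols(Ku, Tu)
tl = tyreus_luyben(Ku, Tu)
print("Ziegler-Nichols Kp, Ki, Kd:", zn)
print("Tyreus-Luyben   Kp, Ki, Kd:", tl)
```

With the same `Ku` and `Tu`, the Tyreus-Luyben rule gives a lower proportional gain and a much lower integral gain, which is where the reduced overshoot would be expected to come from.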

I tested this on a number of different hotends and heated beds. On my Magneto X, this consistently reduces initial overshoot of the heatbed (1000 W) to 1C or less, whereas it was around 3-5C before. For the hotend, the steady-state temperature range is about +-0.2C, where before it was about +-0.6C. This is representative of other results, and there were no regressions.

This is about what you would expect: the less aggressive gain produces less initial overshoot and better settling behavior, which in turn means smaller oscillation.

Not sure what the normal method of experimentation is here, but happy to make it a config option or whatever if folks are interested.

Depending on how much people care, I could also implement one of the newer NN-based PID techniques, which are robust across basically everything we would care about (the biggest issue here seems to be time-delayed thermal masses, which they handle remarkably well). That would eliminate the need for this kind of PID tuning, but it would be more computationally expensive (though a lot of the newer MCU cores are starting to get accelerators for this).

Jul 08 '24 18:07 dberlin

Thank you for submitting a PR. Please refer to point 3 in "What to expect in a review" in https://github.com/Klipper3d/klipper/blob/master/docs/CONTRIBUTING.md and provide a Signed-off-by line.

Thanks, James

Jul 09 '24 08:07 JamesH1978

Thank you for submitting a PR. Please refer to point 3 in "What to expect in a review" in https://github.com/Klipper3d/klipper/blob/master/docs/CONTRIBUTING.md and provide a Signed-off-by line.

Thanks, James

Fixed, sorry I missed that.

Jul 10 '24 15:07 dberlin

Depending how much people care,

I do! This is great, let's hope @KevinOConnor can take a look at this PR.

Jul 16 '24 23:07 thijstriemstra

There is also an internal restriction on heater PWM output that reduces the update frequency: within a 5 s window, a power change of less than 5% is not applied. So some fluctuation will happen regardless. The current PID could produce less oscillation; it is just held back by this restriction, which is there because the resulting variation is considered fine.
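A minimal sketch of the kind of update suppression described above, using the 5 s window and 5% threshold from this comment. The class and attribute names are hypothetical, and this is not Klipper's actual heater code.

```python
class PwmThrottle:
    """Suppress small PWM changes inside a hold window (illustrative only)."""

    def __init__(self, min_change=0.05, hold_time=5.0):
        self.min_change = min_change    # ignore changes smaller than 5%
        self.hold_time = hold_time      # ...for this many seconds
        self.last_value = 0.0
        self.next_update_time = 0.0

    def update(self, now, value):
        """Return True if the value is sent to the heater, False if suppressed."""
        small_change = abs(value - self.last_value) < self.min_change
        if now < self.next_update_time and small_change:
            return False
        self.last_value = value
        self.next_update_time = now + self.hold_time
        return True


throttle = PwmThrottle()
# (time in seconds, requested PWM duty) pairs, made up for illustration.
samples = [(0.0, 0.50), (1.0, 0.52), (2.0, 0.60), (8.0, 0.61)]
sent = [throttle.update(t, v) for t, v in samples]
print(sent)
```

Under these assumptions the 2% change at t=1 is dropped, so the heater keeps running at the old duty until either a larger change is requested or the window expires.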

Jul 29 '24 02:07 nefelim4ag

Thanks.

I guess my main feedback would be that it is really hard to judge if this change would be a net improvement across the entire range of 3d printers (for both hotends and beds). So, it's hard to make a change like this without getting feedback from many users running many hardware variants that indicates a consistent improvement.

For what it is worth, it has also been identified that Klipper's implementation of Astrom-Hagglund is flawed because the heating and cooling cycles are often unbalanced on high power heaters. There have been proposals to automatically decrease the power level to attempt to better balance heating and cooling times during the PID calibration. I mention this because fixing that flaw might be important prior to optimizing the resulting coefficient calculations (Ziegler-Nichols or Tyreus-Luyben).
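As a rough illustration of why balance matters: the standard relay-test (describing function) estimate of the ultimate gain assumes a roughly symmetric oscillation. The sketch below shows that estimate alongside a simple heating-fraction measure. The function names, the balance measure, and the sample numbers are all hypothetical, and this is not Klipper's calibration code.

```python
import math


def relay_ultimate_gain(relay_amplitude, temp_amplitude):
    """Describing-function estimate Ku = 4*d / (pi*a).

    d is the relay (power) amplitude and a is the amplitude of the
    resulting temperature oscillation. The estimate assumes the
    oscillation is roughly symmetric.
    """
    return 4.0 * relay_amplitude / (math.pi * temp_amplitude)


def heating_fraction(heat_time, cool_time):
    """Fraction of one relay cycle spent heating; 0.5 means balanced."""
    return heat_time / (heat_time + cool_time)


# Hypothetical high-power heater: heats for 4 s, then cools for 36 s.
Ku = relay_ultimate_gain(relay_amplitude=0.5, temp_amplitude=2.0)
balance = heating_fraction(heat_time=4.0, cool_time=36.0)
print("Ku estimate:", Ku)
print("heating fraction:", balance)
```

A heating fraction far from 0.5 suggests the symmetric assumption behind the `Ku` estimate does not hold well, which is the situation that lowering the power level during calibration would try to avoid.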

-Kevin

Aug 05 '24 03:08 KevinOConnor

@KevinOConnor I understand your point of view/goal/responsibility as maintainer of a large/important project, but not your conclusion:

I guess my main feedback would be that it is really hard to judge if this change would be a net improvement across the entire range of 3d printers (for both hotends and beds). So, it's hard to make a change like this without getting feedback from many users running many hardware variants that indicates a consistent improvement.

To me it doesn't seem like it is, actually.

Wouldn't the Wikipedia description of the two algorithms support a conclusion opposite to yours, since it highlights general issues with the current algorithm?

In other words, unless experimental/real world data show that the current algorithm performs better, wouldn't the expected behaviour of the new algorithm make more people happier than the current one? See also the statement of the author:

I tested this on a number of different hotends and hotbeds. On my magneto x, this consistently reduces initial overshoot of the heatbed (1000w heatbed) to 1C or less, whereas it was around 3-5C before. For the hotend, the temperature range is about +-0.2C, and before it was about +-0.6C steady state. This is representative of other results, and there were no regressions.

While I'm not an expert, it seems you are discounting measured benefits (of course, some more details about the configurations tested are expected) and "mathematical" benefits in favor of hypothetical regressions/worsening. Which configurations would perform worse?

Of course, you say that it's better to postpone this change until after the other one you mentioned has been taken care of, and that's another topic. I just found it really strange to dismiss a change that was tested, and that theory also predicts will give better results, without data supporting such an unlikely scenario.

Aug 05 '24 14:08 dewi-ny-je

So, it's hard to make a change like this without getting feedback from many users running many hardware variants that indicates a consistent improvement.

but happy to make it a config option

I agree there should be an option to configure the variant so that we don't break existing setups and can gather feedback from users that are willing to test the new variant. If it turns out that it's a superior variant, it could become the default in the future. The PR as is should not "throw away" the existing variant. My 2 cents.

Aug 05 '24 15:08 thijstriemstra

In order to satisfy all the views here, I just rewrote the PID controller to be online adaptive and adjust the P/I/D terms in response to error. This requires no offline tuning/calibration. So I'm going to close this PR and post that one instead.

Aug 15 '24 14:08 dberlin

I would have liked to understand the reasoning behind the statements I quoted from @KevinOConnor, since I really don't see how to get to the conclusions brought forward, but it was more out of curiosity.

Aug 15 '24 14:08 dewi-ny-je

One thing I'll flag as I close this out: our current PID controller is not really a PID controller, or at least, it does not function like a normal one.

Assuming e(t) is the error in temperature vs. the setpoint at time t, and dT(t) is the change in temperature at time t, it actually is:

P = based on e(t)
I = based on e(t)
D = based on dT(t)

instead of all of them being based on e(t) (i.e., D is normally de(t)/dt, not dT(t)/dt).

This is actually one reason (but not the only reason) we end up with some crazy derivative terms after tuning: the other terms account for error, while the derivative term is trying to account for temperature change instead of error change.
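To make the distinction concrete, here is a minimal sketch that computes the derivative term both ways for a single time step. It assumes the sign convention e(t) = setpoint - temperature. The function name and sample numbers are hypothetical, and this does not reproduce Klipper's control code.

```python
def derivative_terms(Kd, setpoint_prev, setpoint, temp_prev, temp, dt):
    """Return (derivative on error, derivative on measurement) for one step."""
    err_prev = setpoint_prev - temp_prev
    err = setpoint - temp
    d_on_error = Kd * (err - err_prev) / dt
    # Negated so that both forms push in the same direction.
    d_on_measurement = -Kd * (temp - temp_prev) / dt
    return d_on_error, d_on_measurement


# Constant setpoint: temperature rises from 190 to 191 in one second.
same = derivative_terms(2.0, 200.0, 200.0, 190.0, 191.0, 1.0)
# Setpoint stepped from 200 to 210 during that same second.
step = derivative_terms(2.0, 200.0, 210.0, 190.0, 191.0, 1.0)
print("constant setpoint:", same)
print("setpoint step:    ", step)
```

Under this convention the two forms agree whenever the setpoint is constant, and they diverge on the step in which the setpoint changes, where the error-based form sees the jump in e(t) and the temperature-based form does not.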

Aug 15 '24 17:08 dberlin

Wouldn't the Wikipedia description of the two algorithms support a conclusion opposite to yours, since it highlights general issues with the current algorithm?

No. The PID algorithm is a heuristic and there is no one "right" or "wrong" way to configure it. The Ziegler-Nichols method is one of the oldest and likely still one of the most popular ways of performing tuning. I don't doubt that a new mechanism may perform better on 3d printer heaters, but I also don't doubt that it could perform worse. That's why it is hard to make changes like this without getting feedback from lots of people on lots of different printers.

In other words, unless experimental/real world data show that the current algorithm performs better, wouldn't the expected behaviour of the new algorithm make more people happier than the current one?

No. We commit to the main Klipper repo after we identify an improvement - we don't commit every change so that we can identify which ones are improvements. That would be chaos.

Cheers, -Kevin

Aug 17 '24 02:08 KevinOConnor