Marlin Make Dwell immune to millis() overflow.

Description

Fixes the possible issue when millis() timer overflows during Dwell command. The dwell period was to extend to 50days... so basically a halt machine. This PR makes it overflow proof and prevents possible very long Dwell. The probability of the related bug manifesting itself is small. Millis overflow happens once every 50days, and the Dwell command would need to run during this exactly. Anyway it would give a more peace of mind for users who run their printers constantly online, so why not fix it.

Requirements

Applies to all boards..

Benefits

Users who like to run threir machines constantly online would not have to worry about the need for restarting the printer regularily.

Configurations

Applies to all configurations.

Related Issues

This is one PR in a series resulting from analysing against possible issues related to #26545

Dec 22 '23 05:12 TBAMax

The PENDING macro should already handle this correctly since it always starts by creating an unsigned difference value, then looks for a negative result. So it can handle any interval up to 24.85 days. If we really want to make sure that there's no weird overflow, we just need to constrain values given to G4 to be less than or equal to 2147483647.

Dec 24 '23 04:12 thinkyhead

The PENDING macro should already handle this correctly since it always starts by creating an unsigned difference value, then looks for a negative result. So it can handle any interval up to 24.85 days. If we really want to make sure that there's no weird overflow, we just need to constrain values given to G4 to be less than or equal to 2147483647.

Hi! I think there is misunderstanging here. The maximum allowable dwell period is not the issue here (Unlikely that there is use case for such a long dwell time anyway). The issue is what happens at the time around the point where millis counter reaches its maximum value. The millis() function acutally returns value of the counter that counts up at every millisecond. Unfortunately there is a limit of maximum number it can count up to. So when it reaches its maximum value, it jumps from there back to 0 and starts counting from 0 again. This is where the trick is, that time comparisons get confused.

When calculating future time value and it is larger than its holding variable can take it also overflows and returns to near 0 value. This instantly confuses the time comparison because it waits for some exra large value, but instead there is near 0.

Dec 24 '23 05:12 TBAMax

By the way there is also a even more efficient way of using the millis(). Provided that the interval is shorter that 2ˇ16 ms ( about 65.5s). So for intervals that are limited to below that, it is possible to use. It relies on using only the 16bits of the 32bit millis variable, so it uses half the ram and half the flash and half the computing power, compared to the standard version. I can look up the examples if needed. Downside was that it was not received well because of hard to understand rules where there must be certain casts and where not (standard usage works most cases even when You use no casts at all), and the fear of unknown about how certain compiler interprets the casts(althoug standard behaviour mostly by todays standards).

Dec 24 '23 05:12 TBAMax

Currently I can see no way of getting the overflow proof behaviour when pre-calculating the future value, and trying to compare this to the current <millis()> value, no matter how to cast the variables. Are there some more good articles on the topic?

Dec 24 '23 07:12 TBAMax

Darn it, now you're making me have to think…. So I broke out the Simulator and did some tests and –indeed– these macros need a bit more work to be robust. I added the following tests to see how they fared. The last test predictably doesn't do so well.

constexpr millis_t erly = 0x0000FFFF, late = 0x7FFFFF00,
                    erly2 = erly + MIN_TO_MS(1), late2 = late + MIN_TO_MS(1), huge = erly + 0x7FFFFFFF0;
SERIAL_ECHOLN(F("PENDING("), int32_t(erly), AS_CHAR(','), int32_t(erly2), F(") is "), PENDING(erly, erly2) ? F("OK") :  F("BAD"));
SERIAL_ECHOLN(F("PENDING("), int32_t(late), AS_CHAR(','), int32_t(late2), F(") is "), PENDING(late, late2) ? F("OK") :  F("BAD"));
SERIAL_ECHOLN(F("PENDING("), int32_t(erly), AS_CHAR(','), int32_t(huge), F(") is "), PENDING(erly, huge) ? F("OK") :  F("BAD"));

I propose two versions of PENDING / ELAPSED. One takes a "now" and "soon" value while the other takes "now", "start", and "interval."

constexpr bool PENDING(const millis_t &now, const millis_t &soon) { return (now < soon); }
constexpr bool PENDING(const millis_t &now, const millis_t &start, const millis_t &interval) { return ((now - start) < interval); }
#define ELAPSED(V...) !PENDING(V)

…and then I guess refactor all the code that uses PENDING / ELAPSED to favor the version with 3 parameters.

Dec 25 '23 03:12 thinkyhead

I looked at the new template MTimeout. It seems promicing. In the regard of properly and most importantly effectively handling shorter variable types for the shorter delays.

Although here might be some caveats to this. According to my experience the for 16 bit timing the cast to unsigned 16bit variable is reguired like this:

if((uint16_t)(currentMillis - lastmillis) >= interval)

But for 8bit timing (max delay 255ms) the AVR-GCC unfortunately requires following style to work correctly

      if((uint8_t)(lowByte(currentMillis) - lastmillis) >= interval)

Without the "lowbyte" it unfortunately still forces 16bit math internally, and the overflow behaviour is not as intended

I would propose some optimizations to ensure the math is done most efficient way possible. This is aiming to have all variables as short as possible. For example there is no need to store the full original length start_ms when we are actually using less.

template<typename T>
struct MTimeout {
  T start_ms = 0;
  T delay_ms = 0;
  MTimeout() {}
  MTimeout(T interval) { start_ms = (T)millis(); delay_ms = interval; }
  void prime(const millis_t ms=millis()) { start_ms = ms; }
  void clear(const millis_t ms=millis()) { delay_ms = 0; }
  void start(const T interval, const millis_t ms=millis()) { delay_ms = interval; prime((T)ms); }
  bool pending(const millis_t ms=millis()) const { return ((T)((T)ms - start_ms) < delay_ms); }
  bool elapsed(const millis_t ms=millis()) const { return ((T)((T)ms - start_ms) > delay_ms); }
  bool on_pending(const millis_t ms=millis()) const { return delay_ms && pending(ms); }
  bool on_elapsed(const millis_t ms=millis()) const { return delay_ms && elapsed(ms); }
  void idle() const { if (delay_ms) { while(pending()) ::idle(false); } }
  millis_t remaining(const millis_t ms=millis()) const { return pending() ? (millis_t)(start_ms + delay_ms - ms) : 0; }
};

This should work with the unsigned 32bit or unsigned 16bit, and be computationally and recource efficient. Probably would not work with unsigned 8bit delay, for the reasons mentioned earlier. Do not know if the small delays of less than 255ms are of practical use anyway. If they are, there must be some kind of manual version provided with template to deal with this case.

Dec 26 '23 00:12 TBAMax

there is no need to store the full original length start_ms when we are actually using less

Makes sense, and we love our size optimizations. For ARM we prefer speed over size, so "fast" types should be used. This results in the same fetch and store time but requires fewer instructions (to mask the unused bits).

Size optimization of MTimeout being determined by the delay value type is good. We can just define the delay as a fast type. For size optimization of MDelay<const int> (which has a fixed delay duration) the storage type can be defined as uvalue_t(D) which allocates just enough storage to hold the unsigned value D. I can also add uvalue_fast_t and then it can always allocate the fastest size.

Dec 26 '23 04:12 thinkyhead

@InsanityAutomation — What do you make of this old IDEX code from way back in the day (9547fb9dfb), which is called in manage_inactivity?

  #if defined(DUAL_X_CARRIAGE)
    // handle delayed move timeout
    if (delayed_move_time != 0 && (millis() - delayed_move_time) > 1000 && Stopped == false) {
      // travel moves have been received so enact them
      delayed_move_time = 0xFFFFFFFFUL; // force moves to be done
      memcpy(destination,current_position,sizeof(destination));
      prepare_move(); 
    }
  #endif

The same code (basically) exists today, and the apparent effect is to apply a single delayed move and then to set delayed_move_time to 0xFFFFFFFF so that from then on the elapsed check will "always" be true (according to isolated testing)…

const millis_t delayed_move_time = 0xFFFFFFFF, start_ms = 1000;
const bool Stopped = false;

if (delayed_move_time != 0 && (start_ms - delayed_move_time) > 1000 && Stopped == false)
  cout << "Time " << to_hex(delayed_move_time, 8) << " is ELAPSED compared to " << start_ms << ENDL;
else
  cout << "Time " << to_hex(delayed_move_time, 8) << " is PENDING compared to " << start_ms << ENDL;

if (delayed_move_time != 0 && ELAPSED(start_ms, delayed_move_time, 1000) && Stopped == false)
  cout << "Time " << to_hex(delayed_move_time, 8) << " is ELAPSED compared to " << start_ms << ENDL;
else
  cout << "Time " << to_hex(delayed_move_time, 8) << " is PENDING compared to " << start_ms << ENDL;

…meaning that after any single delayed move there will be constant calls to prepare_move in manage_inactivity until there is a call to idex_set_parked.

A "delayed move" results from a call to dual_x_carriage_unpark when the machine is in DXC_AUTO_PARK_MODE (1).

It seems to me that the logic should be to do a single move, but then set delayed_move_time to 0 and be done with it.

Maybe the original author @buildrob can also provide some insight to help us make sense of this code.

Dec 28 '23 22:12 thinkyhead

Ok, I see it now.

First prepare_line_to_destination calls dual_x_carriage_unpark which calls idex_set_parked which sets delayed_move_time to 0. So that seems to be the intent. A single move added, a flag for dual_x_carriage_unpark to skip the delayed move check. The code is a tad "brittle" in that prepare_line_to_destination must be used or it breaks this aspect of IDEX, so we'll see what we can do to make it more clear.

Dec 28 '23 22:12 thinkyhead

Size optimization of MTimeout being determined by the delay value type is good. We can just define the delay as a fast type. For size optimization of MDelay<const int> (which has a fixed delay duration) the storage type can be defined as uvalue_t(D) which allocates just enough storage to hold the unsigned value D. I can also add uvalue_fast_t and then it can always allocate the fastest size.

Yeah, the possibility to use 'fast' integer types would be good. But I guess it requires some work to research how to use it properly. So the safe option for the moment is to use standard unsigned types, with the exeption of staying away from the 8bit variants.

@thinkyhead In general MDelay tempate was quite hard to understand. And it did not work any more when I changed the start_ms type in the MTimeout template. Because there are functions for modifying the start time, and when start time is made constant those result in error. This is why I redesigned it at the moment. I would suggest using the initializer list for initializing constants. By definition the initializer list style do not allocate the ram for variable, same as template style. But maybe the initializer list style is a bit easier to follow?

Dec 29 '23 06:12 TBAMax

Ok, I have been thinking about the topic here. Maybe it has been a little overheated. I did read the more info form here(https://arduino.stackexchange.com/questions/12587/how-can-i-handle-the-millis-rollover/12588#12588)

As it was with the cast to signed int and then compare to 0 ((int32_t)(NOW-(SOON))<0) Comes out the following things only with this:

It is not portable by definition.
And that it can not handle intervals longer than half of the timer maximim value.
Very popular open source project seemingly comparing timestamps may lead some programmers to the wrong path with the millis() or micors() usage (promoting bad habits unintentionally).

so to address those

Not portable can be maybe addressed simpler, by making startup tests that run on every platform and catch non compliant behaviour this way, and write custom platform specific macros for those where something being wrong is detected.
Now looking at the test results the problem really was only there when the time interval is too large (about 25days then). But in reality we never use such long interval. So would need to pay attention only in those places where so long interval is timed (currently there is none). It would be more realistic issue if we would be dealing with the micros(), but we are talking about millis() only here. So it actually is no problem at all...
No idea why it bothers me... it certainly is no point of discussion against the need for maximum performance.

So in general the current state of this PR is that macros have instead gone worse -- have lost the needed cast to signed int and thus not operable. I have been too eager to work on things most likely.... So a little unclear at the moment what to take and what to leave from this. Needs a new fresh take on things. Feels like to discard all would be a loss too... There is some potential for performance gain from using smaller data types in places where needed time inteval is short. The question reduces to weather to go for this or not (MTimeout template). MTimeout template needs some more work too. Need to figure out the template inheritation there to use the fixed delay constant without the ram being allocated to it. Turns out the struct initializer list still allocates ram for delay value in the struct(it can not change the struct size!) so it can indeed be only done by template as original there was.

Jan 02 '24 16:01 TBAMax

Marlin Marlin copied to clipboard

Make Dwell immune to millis() overflow.

Description

Requirements

Benefits

Configurations

Related Issues

Marlin
Marlin copied to clipboard