
Slow raster performance

Open farbefreak opened this issue 2 years ago • 15 comments

Hello Guys,

I compiled grblHAL for an Arduino Due (SAM3X8E). I used grbl on an Arduino Uno (328p) before, but found the Uno too slow for raster-scanning images, so I was hoping for better performance on a Due.

Sadly I still can't go above 7000mm/min engraving speed (motion gets stuttery/jerky above that). My DIY engraver/cutter is physically capable of 20000mm/min movement speed and 4000mm/s² acceleration, so I was hoping to get to these speeds.

I used the native USB serial interface with LightBurn on a Mac, with LightBurn set to 2Mbaud. I could verify that the serial connection is not the issue, as the maximum achievable speed didn't change with a lower serial baud rate.

I tried increasing BLOCK_BUFFER_SIZE in grbl/config.h to 512.

I also tried increasing/decreasing ACCELERATION_TICKS_PER_SECOND (50 to 200), and finally tried changing SEGMENT_BUFFER_SIZE.

I could not reach higher speeds while engraving grayscale.

Any ideas how I could improve the speeds?

Thanks for your support.

farbefreak avatar Mar 09 '22 19:03 farbefreak

Do not increase BLOCK_BUFFER_SIZE too much as the processor has no FPU (Floating Point Unit) - 100-200 is probably better.

IMO leave SEGMENT_BUFFER_SIZE and ACCELERATION_TICKS_PER_SECOND at the default values. I have not checked the impact of ACCELERATION_TICKS_PER_SECOND; AFAICT SEGMENT_BUFFER_SIZE does not matter much.

What is your step/mm setting for the X-axis?

If you want high speeds nothing beats the Teensy 4.1...

terjeio avatar Mar 09 '22 19:03 terjeio

I currently use 20-tooth pulleys with 16 microsteps. This equates to 80steps/mm.

With 7000mm/min speed, this should equate to a step frequency of (7000mm/min * 80steps/mm) / 60 = 9.33kHz, so actually quite "slow". I would guess that the step pulse rate is not the bottleneck (as I can drive the machine around at 20m/min without stutter), but rather the PWM update time?
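
The step-frequency arithmetic above can be written as a small helper. This is only a sketch of the calculation; the function name `step_rate_hz` is made up for illustration and is not part of grblHAL.

```c
/* Hypothetical helper (not a grblHAL function): step pulse rate in Hz
 * for a feed rate in mm/min and an axis resolution in steps/mm. */
double step_rate_hz(double feed_mm_per_min, double steps_per_mm)
{
    /* mm/min * steps/mm = steps/min; divide by 60 to get steps/s. */
    return feed_mm_per_min * steps_per_mm / 60.0;
}
```

Called with 7000mm/min and 80steps/mm this gives roughly 9.33kHz, and with 20000mm/min roughly 26.7kHz.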

I did not think that raw CPU performance would be the problem, as I found that enabling/disabling dynamic power scaling does not affect the speed. I would have guessed that the extra calculations needed for the power scaling would slow things down if CPU power were the bottleneck.

I have an ESP32 lying around that I could test. It features an FPU; do you think it would be faster than a SAM3X8E? It should be, as it offers 240MHz on 2 cores vs 84MHz, so in theory roughly 6 times faster. This would equate to 42m/min raster scan speed, which would be faster than my mechanics.

Do you have any information about what your software combined with a Teensy/ESP32 is capable of (raster-scanning speed)?

farbefreak avatar Mar 09 '22 20:03 farbefreak

> I used the native USB serial interface with LightBurn on a Mac.

Try the programming port as well - I get high feed rates with that, but I do not have motors connected so cannot tell if it stutters or not. Stuttering could be due to USB peripheral overhead/IRQ priority.

> I have an ESP32 lying around that I could test. It features an FPU; do you think it would be faster than a SAM3X8E?

It should be as it has an FPU - a rather slow one compared to ARM M4F/M7 cores though.

> Do you have any information about what your software combined with a Teensy/ESP32 is capable of (raster-scanning speed)?

I have pushed the Teensy 4.1 driver to above 70000mm/min, again without motors. I have not profiled the ESP32 driver so I have no idea what it is capable of.

terjeio avatar Mar 09 '22 20:03 terjeio

I checked again with the Due. I actually get around 5 times the performance of the Uno running grbl, so a big improvement. That makes sense, as the Uno runs at 16MHz and the Due at 84MHz, which is roughly 5 times faster, disregarding the architectural changes.

I also tried using the serial port; the performance with this is actually a little worse. I can only get the port to run at 115200, so that might be part of it. I tried adding "#define BAUD_RATE 500000" inside the /stream.h file, however it didn't make a difference.

> Try the programming port as well - I get high feed rates with that, but I do not have motors connected so cannot tell if it stutters or not. Stuttering could be due to USB peripheral overhead/IRQ priority.

What do you mean by high feed rates? I get around 5200mm/min with your ioSender application with aggressive buffering on, and 3600mm/min with aggressive buffering off. That's with the native USB port.

With the serial port (which I can't get above 115200 baud) I only get a 2500mm/min feed rate. Seems like the slow port is the limit here.

Seems like I should try the ESP32

farbefreak avatar Mar 10 '22 10:03 farbefreak

@terjeio Something is puzzling me..

Just out of curiosity, I grabbed the greyscale_raster_test.nc file from the issue linked above - to try on bare Teensy and H743 boards (both set to 50,000mm/min max and 1,000mm/sec² accel). I was only able to test over USB connection - and as you say in the other issue, the planner buffer is starved on both boards.

However, the receive buffer appears to be filled okay (generally has between 20-40 bytes free at any point). What I can't get my head around, if it was a USB data transfer issue, then surely the receive buffer would be starved also? Or is there additional overhead on the MCU, causing a bottleneck in getting the data out of the receive buffer fast enough?

(Incidentally, my run times were a bit under 10mins for the H743 & a bit under 20mins for the Teensy4.1, so that looks promising for the H743 port).

dresco avatar Mar 12 '22 12:03 dresco

> @terjeio Something is puzzling me..

Ah no, ignore me, I'm an idiot :) I'd forgotten to enable laser mode, so it was stopping motion with each power change! Is now taking ~40s for the Teensy and ~22s for the H743.

(Has uncovered an issue with the H7 port though, as it sometimes stalls for several seconds part way through, will look into that).

dresco avatar Mar 12 '22 23:03 dresco

I have done some more testing of the raster performance. I can get feed rate overshoot with certain combinations of feed rate and acceleration - my guess is that this is due to too high a processor load, as I do not see this with the Teensy driver. When a slowdown happens the planner buffer is starved (not full).

> Has uncovered an issue with the H7 port though, as it sometimes stalls for several seconds part way through

Does the delay roughly correspond to the time needed for a complete count (2^32) of the main stepper timer?

terjeio avatar Mar 15 '22 20:03 terjeio

> Does the delay roughly correspond to the time needed for a complete count (2^32) of the main stepper timer?

Aha, yes it does. I hadn't noticed, but is always 17.89s, which corresponds to a timer clock rate of 240MHz.
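
As a quick check of the numbers above, the wrap-around time of a free-running 32-bit counter is 2^32 divided by its clock rate. The helpers below are an illustrative sketch only; the function names are invented here and do not come from the driver.

```c
/* Hypothetical helpers (not driver code). 4294967296.0 is 2^32, the
 * number of counts before a 32-bit timer wraps around. */

/* Seconds for a 32-bit counter clocked at timer_clock_hz to wrap. */
double timer_wrap_seconds(double timer_clock_hz)
{
    return 4294967296.0 / timer_clock_hz;
}

/* Inverse: the clock rate implied by an observed wrap-around time. */
double clock_hz_from_wrap(double wrap_seconds)
{
    return 4294967296.0 / wrap_seconds;
}
```

At 240MHz the first function gives about 17.9 seconds, which matches the observed stall; feeding the observed 17.89s into the second gives a clock very close to 240MHz.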

Btw, while looking this up, I noticed that the stepper timer wasn't necessarily calculated correctly. I needed to add the check for the x2 multiplier to driver_init() - may be the same for other STM32 devices also..?

diff --git a/Src/driver.c b/Src/driver.c
index 9a99a99..9a4e23d 100644
--- a/Src/driver.c
+++ b/Src/driver.c
@@ -1758,13 +1758,18 @@ bool driver_init (void)
     __HAL_RCC_GPIOF_CLK_ENABLE();
     __HAL_RCC_GPIOG_CLK_ENABLE();
 
+    RCC_ClkInitTypeDef clock;
+    uint32_t latency;
+
+    HAL_RCC_GetClockConfig(&clock, &latency);
+
     hal.info = "STM32H743";
     hal.driver_version = "211211";
 #ifdef BOARD_NAME
     hal.board = BOARD_NAME;
 #endif
     hal.driver_setup = driver_setup;
-    hal.f_step_timer = HAL_RCC_GetPCLK2Freq();
+    hal.f_step_timer =  HAL_RCC_GetPCLK2Freq() * (clock.APB2CLKDivider == 0 ? 1 : 2);
     hal.rx_buffer_size = RX_BUFFER_SIZE;
     hal.delay_ms = &driver_delay;
     hal.settings_changed = settings_changed;
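
The rule the patch relies on can be sketched without the HAL: on STM32 parts with the default timer clock configuration, a timer on an APB bus runs at the bus clock when the APB prescaler is 1 and at twice the bus clock otherwise. The function below is a standalone illustration under that assumption; `timer_clock_hz` is a made-up name, and its `apb_prescaler` argument is the plain division factor rather than the register encoding the patch compares against 0.

```c
#include <stdint.h>

/* Hypothetical illustration (not driver code): timer kernel clock for a
 * timer on an APB bus, assuming the default timer prescaler selection.
 * apb_prescaler is the division factor itself: 1, 2, 4, 8 or 16. */
uint32_t timer_clock_hz(uint32_t pclk_hz, uint32_t apb_prescaler)
{
    /* Undivided bus: timer runs at PCLK. Divided bus: timer runs at 2 x PCLK. */
    return apb_prescaler == 1u ? pclk_hz : pclk_hz * 2u;
}
```

For example, a 120MHz PCLK2 with a prescaler of 2 gives a 240MHz timer clock. The 120MHz figure is an assumption chosen to match the 240MHz observed above; it is not stated in the thread.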

dresco avatar Mar 15 '22 23:03 dresco

Re-tested this morning with hal.f_step_timer set to the actual timer clock frequency, and not seeing the hangs any more - although it has doubled the run time from ~22 secs to 45.

Edit: After checking some timings, I can see the run time change makes sense, as it was previously moving at 2x the expected/reported speeds.
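
The 2x effect follows from how step timing is derived: the firmware converts a step rate into timer ticks using the clock it believes it has, so if the real clock is faster, each step interval is shorter than intended by the same ratio. A minimal sketch of that relationship, with an invented function name and illustrative clock values:

```c
/* Hypothetical illustration (not driver code): the feed rate the machine
 * really moves at when the firmware's assumed step timer clock differs
 * from the actual one. Ticks per step are computed from assumed_hz but
 * counted down at actual_hz, so speed scales by actual_hz / assumed_hz. */
double actual_feed_mm_min(double commanded_mm_min,
                          double assumed_hz, double actual_hz)
{
    return commanded_mm_min * actual_hz / assumed_hz;
}
```

With an assumed clock of half the actual one, a commanded 10000mm/min comes out as 20000mm/min, which is why correcting the clock value doubled the run time.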

dresco avatar Mar 16 '22 08:03 dresco

In testing for my upcoming laser rebuild, I've found this issue at 18000mm/min at 256 steps/mm on a Teensy 4.1 board. I'd like to hit around 36000mm/min, but we'll see how that goes once the hardware is in the laser this weekend. I'm using LightBurn to generate G-Code and either LightBurn or ioSender as my sender.

Using vertical boxes in rastering as my test file, I change their width and spacing closer until the controller seems to start lagging. Once I reach a spacing of about 2x2mm (width/gap), the commanded speed of the motor both audibly falls and is reported as 17119mm/min by ioSender. This also happens with Lightburn, though I can't see the feed rate. Once the spacing reaches 1mmx1mm, the speed drops to 12119mm/min and ioSender starts stuttering without aggressive buffering on. Down to 0.5mm spacing and feeds are down to 8569mm/min. I recognise that this is an extreme case for the g-code planner, but it is an interesting edge case. As long as the laser power behaves properly during slowdowns, I would consider this a graceful 'failure.'

Thoughts for testing tomorrow when I have more time:
- Changing microstepping to see if step pulse rate changes this behavior.
- Modifying the size of the planning/streaming buffers.
- Streaming over Ethernet instead of Teensy USB.
- Attaching something to the spindle output to monitor 'laser' power.

SRFirefox avatar Mar 31 '22 17:03 SRFirefox

@SRFirefox Out of interest, would you be able to share your test file? Thanks!

dresco avatar Mar 31 '22 18:03 dresco

Yup. Each of these is set at 18000mm/min. Have fun! Also: Changing the step/mm setting did nothing.

raster_testfile.zip

SRFirefox avatar Mar 31 '22 19:03 SRFirefox

What is your $0 setting (step pulse time)? With $0=3 and the planner buffer set to 500, step pulses look good on my scope at 36000mm/min. test05x05.nc is transferred via USB in 00:18 in check mode, ~200Kb/s.

Tip: you can reduce file size with File > Transform > Compress in ioSender.

terjeio avatar Apr 01 '22 04:04 terjeio

Very cool. The pulse time might be set to 8µs, but more importantly I compiled from source and neglected to change the planner buffer to something higher than 36. I've changed my source to a buffer size of 1024 and I'll change the pulse time to 3 when I get back to the hardware. Also glad to know the native USB is blasting through the data in check mode. One less thing to worry about. I'll be interested to see how it does in Ethernet mode; some galvanic isolation might be nice.

Thanks for checking the files out. I'll report back in the morning - I'm very excited to have a much more hackable machine!

SRFirefox avatar Apr 01 '22 05:04 SRFirefox

Changed the planner buffer up to 1024 segments and everything is running smoothly. My laser rasters comfortably at 48000mm/min and if I'm careful it can accelerate up to 60000mm/min. This should handle everything I can throw at it, though I'm sure one of the other laser users here at the hackerspace will prove me wrong. So far it looks even better than the old controller.
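
One way to reason about why a larger planner buffer helps with very short raster moves, assuming a grbl-style planner that always plans the last queued move to end at rest: the queued moves together must span at least the distance needed to brake from the target feed rate. The sketch below only does that arithmetic; the function names are invented, and the one-move-per-power-change assumption is mine rather than something stated in the thread.

```c
/* Hypothetical helpers (not grblHAL code). */

/* Distance in mm needed to brake from feed_mm_min to rest at a constant
 * deceleration of accel_mm_s2, using v^2 / (2a). */
double stopping_distance_mm(double feed_mm_min, double accel_mm_s2)
{
    double v = feed_mm_min / 60.0;            /* mm/min -> mm/s */
    return v * v / (2.0 * accel_mm_s2);
}

/* Number of queued moves of block_len_mm needed to span distance_mm,
 * rounded up. */
unsigned long blocks_to_cover(double distance_mm, double block_len_mm)
{
    unsigned long n = (unsigned long)(distance_mm / block_len_mm);
    if ((double)n * block_len_mm < distance_mm)
        n++;
    return n;
}
```

For instance, 20000mm/min at 4000mm/s² (the limits quoted in the first post) needs about 13.9mm to stop, or 28 queued moves of 0.5mm each; shorter moves or lower acceleration push that count up quickly.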

SRFirefox avatar Apr 05 '22 02:04 SRFirefox