Pixelorama icon indicating copy to clipboard operation
Pixelorama copied to clipboard

Flood fill issue

Open Distortions81 opened this issue 2 years ago • 5 comments

Pixelorama version: 0.9.1 stable

OS/device including version: Ubuntu 21

Issue description: Bucket/Flood fill is insanely slow

Steps to reproduce: Use flood fill on a blank 4096x4096 image.

Distortions81 avatar Jan 22 '22 16:01 Distortions81

I have attempted to implement the floodfill routine from the Allegro library (https://www1.udel.edu/CIS/software/dist/allegro-4.2.1/src/flood.c). It seems to be somewhat faster than the current implementation in Bucket.gd it would replace.

I can provide a PR, but I'm not 100% confident I tested everything. Let me know if you are interested anyway, and if anyone can guide me making sure it's properly tested.

Some approximate common timings for flood fill of an entire picture on my machine for the different implementations follow. The columns labeled (no color) correspond to code where I commented out the calls to _set_pixel in order to only compare the algorithms.

WxH Bucket.gd (current) Bucket.gd (no color) Allegro (new) Allegro (no color)
512x512 px 1570 ms 1075ms 910 ms 330 ms
1024x1024 px 6250 ms 4290ms 3590 ms 1300 ms
2048x2048 px 25000 ms 17200 ms 14400 ms 5200 ms

It's probably not a game changer. It's likely that looking at my code someone more experienced with gdscript might find better ways to optimize it.

MatteoPiovanelli-Laser avatar Mar 30 '22 21:03 MatteoPiovanelli-Laser

@MatteoPiovanelli-Laser That sounds very interesting! Feel free to open a PR and we can review the code and test if it's working properly.

OverloadedOrama avatar Mar 30 '22 22:03 OverloadedOrama

In case of specific image formats, it's possible to replace repeated calls to image.set_pixel by writing directly to the image's underlying buffer. That should speed things up further.

For example, see the set data column in the following table. The test are done for images in the RGBA8 format, which it's my understanding is the default in Pixelorama, but I'm not sure whether it's the only one that' used.

WxH Bucket.gd (current) Bucket.gd (no color) Allegro (new) Allegro (set data) Allegro (no color)
512x512 px 1570 ms 1075ms 910 ms 690 ms 330 ms
1024x1024 px 6250 ms 4290ms 3590 ms 2750 ms 1300 ms
2048x2048 px 25000 ms 17200 ms 14400 ms 10800 ms 5200 ms

Unfortunately that involves a steep memory cost, because it basically requires creating a duplicate of the image in memory.

Once my original PR gets through, I'll also add this code if you are interested.

MatteoPiovanelli-Laser avatar Apr 02 '22 20:04 MatteoPiovanelli-Laser

I'm investigating a bit more the reason why this bucket fill is still so slow compared to implementations of the same algorithm in other languages. This screenshot is a snapshot from Godot's profiler showing the functions relevant to the bucket fill process, taken while filling a 1024x1024 image. image

Cutting down on the calls to project.can_pixel_get_drawn would help, as that is currently checked twice for every point in the area being painted, yields the timing in the Allegro (refactor) column.

WxH Allegro (new) Allegro (refactor)
512x512 px 910 ms 200 ms
1024x1024 px 3590 ms 805 ms
2048x2048 px 14400 ms 3100 ms

What I did was basically take advantage of the fact that the test project.can_pixel_get_drawn is only really significant when there is a selection to refactor some of those calls away. This, "happens" to be right, but on principle it may not be, afaik, so I'm not really sold on this. The other thing I refactored was the _set_pixel function, by shortcircuiting some code paths so that common cases now may skip several checks. Basically, within the bucket fill operation, I can make more assumptions on the "safety" of some data, because it's been checked already before.

See #672 for the code corresponding to these tests.

MatteoPiovanelli-Laser avatar Apr 19 '22 20:04 MatteoPiovanelli-Laser

I did a final refactor in #681 so it's easier to profile what's costing in terms of performances now. I have to say, I gave the code one more look and I couldn't see more corners to cut to make it faster right now.

MatteoPiovanelli-Laser avatar Apr 27 '22 22:04 MatteoPiovanelli-Laser