Paint development
This is continuation of Paint discussion started in #2269
Notably, Windows 3.0 in MartyPC performs better in this regard — the cursor is noticeably more responsive, even at 60Hz. A more faithful mouse implementation in the emulator should help, and it will be done eventually. At the same time, this might hint at some suboptimal behavior in ELKS itself, which could be worth exploring and optimizing.
The fun part of this little adventure in computer archaeology is being able to compare MS Paintbrush with ELKS Paint side by side — and I have to say, the former looks pretty pathetic in comparison. :)
Line drawing
https://github.com/user-attachments/assets/cf1a130c-f7e4-4d9f-949d-88ef715edfb3
Circle drawing
https://github.com/user-attachments/assets/e4a86836-1d48-4e25-9ba2-94dcd696674b
Originally posted by @Vutshi in #2269
What does @dbalsom have to say about that?
Yesterday he fixed VGA emulation and at some point he will add 40 Hz mouse.
It is possible that Windows 3.0 has a better internal mouse filter/accelerator. Have you tried Paint with USE_MOUSE_ACCEL both on and off?
I am not sure mouse acceleration has anything to do with it. USE_MOUSE_ACCEL was disabled on both the real hardware and in MartyPC.
Now Paint also runs on Book8088 VGA edition:
P. S. The ‘F’ key test performed before connecting the mouse is interesting — on computer N1, the flood fill starts from the top of the screen!
I have to say, the former looks pretty pathetic in comparison.
Wow, our Paint is definitely superior in terms of line drawing, also using the circular disk instead of square pixel cursor!
It seems Paint is quicker and MSPaint on 8088 also, true?
I do like the thick circle draws and the ability to draw ellipses on MSPaint though.
I have to say, the former looks pretty pathetic in comparison.
Nice! I think the reason for the fill starting in differing spots might have something to do with the initial cursor location, as that's used for the fill start pixel.
It seems Paint is quicker and MSPaint on 8088 also, true?
Yes, our drawing is much faster.
I do like the thick circle draws and the ability to draw ellipses on MSPaint though.
Ellipses are easy to implement, I wonder how to control it. Draw an ellipse by default and constrain to a circle when a control key is pressed?
Nice! I think the reason for the fill starting in differing spots might have something to do with the initial cursor location, as that's used for the fill start pixel.
Initial cursor location should be the same if mouse is untouched/unconnected. But for some reason I saw that initial position can jump between different setups EGA VGA QEMU MartyPC. Maybe it has something to do with the initial position being defined in events.c using hardcoded screen size parameters.
@ghaerr, how can I fully remove remnants of OWC and C86? I’ve already reset the environment and run make clean, but I still can’t get it to fit on a 1.44MB floppy disk.
Remove them from roots_template/bin.
Remove them from roots_template/bin.
Thanks, that worked, especially after I also removed roots_template/usr :)
Circles competition
In the left corner of the ring – Bresenham's circles, and in the right corner, Arithmetical circles.
Arithmetical circles are defined as (r-1/2)^2 <= x^2 + y^2 <= (r+1/2)^2. Bresenham's circle is the thinnest possible circle approximation.
Ellipses are easy to implement, I wonder how to control it. Draw an ellipse by default and constrain to a circle when a control key is pressed?
I would suggest the idea of using the top left corner (just like rectangle) as the anchor point for drawing a circle, if you want to also include the ability to draw ellipses. When the mouse forms a square from the top left, a circle, otherwise when its rectangular, an ellipse is drawn. Using a control key has the problem that it's not user-intuitive and also there's currently no way to get the keyboard shift information from the kernel, so a new API or "super-raw" keyboard termios mode would have to be implemented.
Thanks, that worked, especially after I also removed roots_template/usr :)
I've made a note to remove cpaint, opaint, and /usr during "make clean" from their respective directories.
Initial cursor location should be the same if mouse is untouched/unconnected.
event_open() sets them to SCREENWIDTH/2, SCREENHEIGHT/2, so that should be called after graphics_open(), I'll bet that's the problem.
In the left corner of the ring – Bresenham's circles, and in the right corner, Arithmetical circles.
Interesting - I like the left circles better from normal viewing distance, but the right circles look pretty good up close. I'd say I prefer the Bresenham circles - at least from the screenshot displayed on my laptop.
More colors for the glory of ELKS Paint!
@ghaerr, I found some bug in flood fill, but I have no idea what is going on in this function. If you apply this diff:
diff --git a/elkscmd/gui/app.c b/elkscmd/gui/app.c
index 8cce8022..b02b5440 100644
--- a/elkscmd/gui/app.c
+++ b/elkscmd/gui/app.c
@@ -267,6 +267,20 @@ int main(int argc, char* argv[])
// Highlight Brush Button
R_HighlightActiveButton();
+ int y00 = 15;
+
+ for (int i = 0; i<16; i++) {
+ for (int j = i; j<16; j++) {
+ int x = y00 + I*30;
+ int y = y00 + j*30;
+ R_DrawCircle(x, y, 12, WHITE);
+ currentMainColor = i;
+ currentAltColor = j;
+ }
+ }
+ int x = 400;
+ int y = 100;
+ R_LineFloodFill(x, y, currentMainColor, readpixel(x, y));
initcursor();
and run paint it will be stuck at this point:
You can still keep drawing after attaching mouse but on exit it is stuck forever.
One can also initiate the flood fill manually, then it is stuck such that you cannot exit at all.
and run paint it will be stuck at this point:
Try increasing FLOOD_FILL_STACK to a larger number. The flood fill function keeps a stack of pixels to check while it looks in each direction from the current pixel. It seems that the FF_Stack_Push function checks for stack overflow by not adding more items to the stack, but perhaps the algorithm still crashes when too many pixel locations are traversed with a 100 deep stack.
Try increasing FLOOD_FILL_STACK to a larger number.
Increasing to 200 helped. What I couldn't achieve is some kind of graceful exit from the function. Paint might work ok after stopping flood fill prematurely but after exit from the program I see strange behaviour ranging from "stack underflow" to weird symbols or no text mode at all.
Increasing to 200 helped.
I think that indicates that the internal stack that R_LineFloodFill keeps is part of, and probably all of, the problem.
after exit from the program I see strange behaviour ranging from "stack underflow" to weird symbols or no text mode at all.
Memory corruption for sure. Try increasing the stack to 400, or way more than that. I also think there's an off-by-one error in the stack push/pop function. Code the below, which fixes off-by-one and also will indicate when stack overflow is present:
static void FF_StackPush(transform2d_t stack[], int x, int y, int* top)
{
if (*top < FLOOD_FILL_STACK - 1) { // <-- add "- 1" to fix off-by-one
(*top)++;
stack[*top].x = x;
stack[*top].y = y;
} else __dprintf("stack overflow at %d\n", *top); // <-- display stack overflow to QEMU console
}
Given that you have many circles, all with different x,y points along their circumference, I'm thinking that filling your "More colors for the glory of ELKS Paint!" is actually running the flood fill routine in worst case!
I've spent a fair bit of time exploring various flood fill algorithms. The effort paid off: I achieved a 3× speed improvement.
Benchmark of the original R_LineFloodFill in MartyPC:
time paint
Real 3m6.030s
https://github.com/user-attachments/assets/78a50976-1cce-4dbb-9d46-91a55420669e
To improve performance, I experimented with SeedFill.c from Graphics Gems, which brought the time down to 1m22.03s. Another algorithm from the same source, using a smaller stack, gave 1m25.32s. Eventually, I settled on the Front_Fill, which employs a tricky stack structure and allows for filling entire segments with drawhline.
The new R_FrontFill:
time paint
Real 1m5.650s
it also looks pretty good:
https://github.com/user-attachments/assets/5c02e7a9-f8d9-43d8-967b-3f1721ade2c0
That said, MS Paint remains a flood fill beast, still about 2× faster than my current implementation:
https://github.com/user-attachments/assets/b2ac3d8f-7227-4ffb-8414-eb77e44eece0
At this point, I suspect that readpixel might be the current bottleneck.
Very nice @Vutshi, you're showing a mastery of some cool graphics algorithms! That's a great improvement 3x the original. Is the FrontFill PDF paper available anywhere? The link you sent wants an account and purchase to read it.
I suspect that readpixel might be the current bottleneck.
Well, readpixel has to loop through each of four planes to get the pixel color, of which I'm not sure we can get around. But looking at vga-ia16.S::readpixel, it seems that lots of time is spent calculating the video buffer offset in BX from the passed X an Y parameters:
mov arg1+2(%bp), %ax// AX := y
mov arg1(%bp), %bx // BX := x
shl $1, %ax // AX := [y * 80] (= y*64 + y*16)
shl $1, %ax
shl $1, %ax
shl $1, %ax
mov %ax, %dx
shl $1, %dx
shl $1, %dx
add %dx, %ax
mov %bl, %cl // save low order column bits
shr $1, %bx // BX := [x / 8]
shr $1, %bx
shr $1, %bx
add %ax, %bx // BX := [y * BYTESPERLN] + [x / 8]
That's a lot of code duplicated every readpixel, just to get the video offset in BX. (BTW, drawpixel does the same thing). If we could track the video buffer offset (address) separately in render.c, then both readpixel and drawpixel would get sped up by quite a bit. What we'd want would be something like readpixel_offset(unsigned offset) instead.
This would require some recoding of the flood fill algorithm to keep an offset instead of tracking X, Y separately. What do you think?
Another idea would be to use rewrite readpixel as a macro which would try to get rid of the CALL/PUSH BP/POP BP/RET etc, which would likely speed things up, which would basically insert the whole routine into every call to read/drawpixel. We could call that readpixel_fast etc so as to not have to inline everywhere.
I can write the macros or special functions depending on what you think would be best.
Is the FrontFill PDF paper available anywhere?
I guess I can put it here.
What do you think?
I’ll open a PR later today—once that’s in, we can better assess whether readpixel is the actual performance bottleneck. I don’t have any profiling data; just a strong hint: switching from drawpixel to drawhline alone cut the runtime by around 25 seconds.
Well, readpixel has to loop through each of four planes to get the pixel color, of which I'm not sure we can get around.
ChatGPT suggests to either keep a “shadow” buffer in system RAM (the fastest read) or to use the VGA hardware “latch” trick to get all four bit‑planes in one byte read. Probably MS Paint does the former. Do you know how to do the latter?
ChatGPT suggests to either keep a “shadow” buffer in system RAM (the fastest read) or to use the VGA hardware “latch” trick to get all four bit‑planes in one byte read. Probably MS Paint does the former. Do you know how to do the latter?
you can't read the latches from the CPU, they're just for copying stuff around in VRAM.
a shadow buffer is sensible, but a simple optimization to cache 8 pixels at a time when you read a byte from each plane would mean only 1/8 calls to readpixel would be slow assuming they read lineaerly.
a simple optimization to cache 8 pixels at a time when you read a byte from each plane would mean only 1/8 calls to readpixel would be slow assuming they read lineaerly.
This sounds good. I was thinking of reading a whole row of pixels, but 8 at a time is probably better.
Also I found that apparently there is a hardware feature for color comparison in VGA:
Read Mode 1 is used to perform comparisons against a reference color, specified by the Color Compare field. If a bit is set in the Color Don't Care field then the corresponding color plane is considered for by the comparison, otherwise it is ignored. Each bit in the returned result represents one comparison between the reference color from the Color Compare field, with the bit being set if the comparison is true. This mode is mainly used by flood fill algorithms that fill an area of a specific color, as it requires 1/4 the number of reads to determine the area that needs to be filled in addition to the additional work done by the comparison. Also an efficient "search and replace" operation that replaces one color with another can be performed when this mode is combined with Write Mode 3.
Maybe this is what we need?
Maybe this is what we need?
considering they mention flood fill it seems promising, but i have no clue how it is actually used.
i implemented things like the color don't care register without really having a clue as to what they were for :D
@ghaerr
something strange is going on with colors in MS Paint as can be seen in the video above and in the screenshot:
It doesn't distinguish brown and dark red. Maybe our "optimized" VGA palette is confusing for 40 years old software?)
UPDATE: What I am doing here is opening export.bmp generated by ELKS Paint in MS Paint
UPDATE: What I am doing here is opening export.bmp generated by ELKS Paint in MS Paint
Is this in MartyPC again?
i'd check the colors in 86box before assuming it isn't my fault. Although i'm not sure what would cause that.
UPDATE: What I am doing here is opening export.bmp generated by ELKS Paint in MS Paint
Is this in MartyPC again?
i'd check the colors in 86box before assuming it isn't my fault. Although i'm not sure what would cause that.
Yes it is MartyPC. I haven’t check 86box, but ELKS Paint in the same MartyPC shows colors as expected
Yes it is MartyPC. I haven’t check 86box, but ELKS Paint in the same MartyPC shows colors as expected
Can you get a shot with the palette open in the Video Card Viewer from both?
Can you get a shot with the palette open in the Video Card Viewer from both?
Tomorrow
@dbalsom
Here are the shots MS Paint
ELKS Paint
It doesn't distinguish brown and dark red. Maybe our "optimized" VGA palette is confusing for 40 years old software?)
@Vutshi: remember that neither ELKS nor Paint actually rewrites any of the VGA palette registers - we're currently just assuming the indices of the "standard" CGA palette, which according to some links I sent previously, is the palette which the hardware and most emulators use.
We could write RGB colors ourselves to the palette if you think that's needed, but that might complicate the emulator environment depending on how the VGA palette hardware is emulated, and/or the initial defaults when not running Paint.
Yeah, windows has just remapped the palette.
Before everything became 32-bit RGBA and people stopped having to deal with it, you had to deal with device dependent bitmaps. If you're targeting windows you'd need to try to match the windows palette.