How to performantly draw many rectangles
Hi,
Thanks for developing macroquad. In terms of usability, the project has been really fantastic to get started with.
I have an application I am considering rewriting in macroquad. Essentially, I need to draw a bunch of rectangles on the screen. This is a profiler, so it is very much a goal to scale this as far as possible. So the more rectangles I can draw, the better. (Obviously, there are many things that go into making a scalable profiler, but if the base rendering framework is not fast, then everything else suffers. This is something we learned the hard way the first time we wrote this project.)
I set up a very, very basic stress test to get a ballpark sense of how far I can push macroquad before it falls over. You can find the test code here (instructions in README):
https://github.com/elliottslaughter/test-macroquad
The relevant part is a loop that basically does:
for r in &mut rects {
    draw_rectangle(r.x, r.y, r.w, r.h, r.color);
}
On my 2020 13-inch MacBook with Intel i7 quad-core (2.3 GHz), I'm able to hit about:
- Natively: 3,000 rectangles at 30 FPS
- In browser via wasm32: 1,000 rectangles at 30 FPS
Are there any tricks I'm missing to improve performance? I was really hoping to hit higher rectangle counts with this code.
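To put these numbers in perspective, here is a small back-of-the-envelope sketch of the time budget per rectangle that they imply. The function name `micros_per_rect` is made up for illustration; the inputs are the rectangle counts and frame rate quoted above.

```rust
/// Time available per rectangle, in microseconds, when `rects`
/// rectangles are drawn every frame at `fps` frames per second.
fn micros_per_rect(rects: u32, fps: f64) -> f64 {
    let frame_micros = 1_000_000.0 / fps;
    frame_micros / rects as f64
}

fn main() {
    // Figures from the measurements above.
    println!("native: {:.1} us per rectangle", micros_per_rect(3_000, 30.0));
    println!("wasm32: {:.1} us per rectangle", micros_per_rect(1_000, 30.0));
}
```

That works out to roughly 11 microseconds per rectangle natively and 33 microseconds in the browser.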
I think you could improve performance by drawing textures instead of rectangles. Here is an example based on your code:
use macroquad::prelude::*;

struct Rectangle {
    x: f32,
    y: f32,
    w: f32,
    h: f32,
    vx: f32,
    vy: f32,
    texture: Texture2D,
}

#[macroquad::main("BasicShapes")]
async fn main() {
    let mut rects: Vec<Rectangle> = Vec::new();
    const N: i32 = 1000;
    for _ in 0..N {
        // TODO: build a texture atlas to improve performance
        let w = rand::gen_range(0.0, screen_width() / 4.0);
        let h = rand::gen_range(0.0, screen_height() / 4.0);
        let color = Color::from_rgba(
            rand::gen_range(0, 255),
            rand::gen_range(0, 255),
            rand::gen_range(0, 255),
            255, //rand::gen_range(0, 255),
        );

        // Render the rectangle once into its own off-screen texture.
        let mut render_target_camera = Camera2D::from_display_rect(Rect::new(0., 0., w, h));
        render_target_camera.render_target = Some(render_target(w as u32, h as u32));
        set_camera(&render_target_camera);
        clear_background(WHITE);
        draw_rectangle(0.0, 0.0, w, h, color);
        let texture = render_target_camera.render_target.unwrap().texture;

        rects.push(Rectangle {
            x: rand::gen_range(0.0, screen_width() - w),
            y: rand::gen_range(0.0, screen_height() - h),
            w,
            h,
            vx: rand::gen_range(-1.0, 1.0),
            vy: rand::gen_range(-1.0, 1.0),
            texture,
        });
        set_default_camera();
    }

    loop {
        clear_background(WHITE);
        for r in &mut rects {
            draw_texture(r.texture, r.x, r.y, WHITE);

            // Move the rectangle and bounce it off the window edges.
            r.x += r.vx;
            r.y += r.vy;
            if r.x < 0.0 {
                r.vx = -r.vx;
            }
            if r.y < 0.0 {
                r.vy = -r.vy;
            }
            if r.x + r.w > screen_width() {
                r.vx = -r.vx;
            }
            if r.y + r.h > screen_height() {
                r.vy = -r.vy;
            }
        }
        draw_text(
            &format!("FPS: {}", macroquad::time::get_fps()),
            20.0,
            20.0,
            30.0,
            DARKGRAY,
        );
        next_frame().await
    }
}
When working with this many textures, you should also combine them into a TextureAtlas to reduce draw calls. That is not part of the example, but it should drastically improve performance.
@macnelly can you give an example of using a texture atlas? I don't see any examples for how to utilize this feature.
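For readers wondering what an atlas involves, here is a minimal, library-independent sketch of the core idea: pack many small images into one large texture and remember where each one landed. It uses a simple "shelf" packer and is not macroquad's own atlas implementation; `Slot` and `pack_shelves` are hypothetical names invented for this illustration.

```rust
/// Where one sub-image lives inside the atlas, in pixels.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Slot {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

/// Packs `sizes` (width, height) left to right into rows ("shelves") of an
/// atlas that is `atlas_width` pixels wide. Returns one slot per input, in
/// input order, plus the total atlas height. Returns `None` if an image is
/// wider than the atlas.
fn pack_shelves(sizes: &[(u32, u32)], atlas_width: u32) -> Option<(Vec<Slot>, u32)> {
    let mut slots = Vec::with_capacity(sizes.len());
    let (mut x, mut y, mut shelf_h) = (0u32, 0u32, 0u32);
    for &(w, h) in sizes {
        if w > atlas_width {
            return None;
        }
        if x + w > atlas_width {
            // Current shelf is full: start a new one below it.
            y += shelf_h;
            x = 0;
            shelf_h = 0;
        }
        slots.push(Slot { x, y, w, h });
        x += w;
        shelf_h = shelf_h.max(h);
    }
    Some((slots, y + shelf_h))
}

fn main() {
    let sizes = [(64, 32), (64, 48), (100, 16), (20, 20)];
    let (slots, height) = pack_shelves(&sizes, 128).expect("image wider than atlas");
    for (i, s) in slots.iter().enumerate() {
        println!("image {}: x={} y={} w={} h={}", i, s.x, s.y, s.w, s.h);
    }
    println!("atlas size: 128x{}", height);
}
```

With macroquad, one plausible way to use such slots would be to render every rectangle into its slot of a single shared render target, then draw each one with `draw_texture_ex`, passing the slot as the `source` rectangle in `DrawTextureParams`. Because every draw then samples the same texture, the renderer has a better chance of batching them into fewer draw calls.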
I can't really see why textures should be faster?
I got ~50 FPS with 10,000 rects using your example on an Intel UHD 620 (an old integrated GPU), so 30 FPS with 1,000 on a MacBook suggests that something is wrong.
Maybe double check that you are running it in release mode?
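One way to rule this out is to have the benchmark report how it was built. The sketch below relies on `cfg!(debug_assertions)`, which is on by default for a plain `cargo run` and off by default for `cargo run --release`. A custom Cargo profile can change that, so treat it as a hint rather than proof. The helper name `build_profile` is made up for this example.

```rust
/// Returns "debug" when the binary was compiled with debug assertions
/// (Cargo's default dev profile) and "release" otherwise.
fn build_profile() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    println!("build profile: {}", build_profile());
    if cfg!(debug_assertions) {
        eprintln!("warning: debug build, frame rates will not be representative");
    }
}
```

Printing this next to the FPS counter makes it easy to see afterwards which kind of build a measurement came from.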
For what it's worth, I moved on to egui for my use case, so I don't have an immediate need for this. Feel free to keep this issue open or close it as you see fit.
In case it helps, from my testing, the 3k limit seems to have been closely associated with drawing rectangles on the CPU. It appears that the macOS-provided GUI library is CPU-only. I don't know whether macroquad uses an OpenGL backend by default, but I found that for other Rust libraries, if they used the macOS UI toolchain, all of them were limited to 3k rectangles (with reasonable FPS) across the board. Only egui was able to meaningfully go past that, and I was easily hitting 80k-160k rectangles without significant issues.
I'm 99.9% sure I ran all my comparisons in release mode, but it's been a while so I don't have direct documentation of that anymore.