
draw.line() 250k times uses 3.25GB memory

Open SMUsamaShah opened this issue 3 years ago • 4 comments

I have a huge list of (command, coords) pairs. Based on the command, I draw lines between certain coords with different colors and weights.

This piece of code runs around 250k times

fn draw_line(draw: &Draw, brush: &mut Brush) {
    draw.line()
        .rgb(brush.color[0], brush.color[1], brush.color[2])
        //.start(brush.v_px_py) // shared Vector2
        //.end(brush.v_x_y)
        .start(pt2(brush.px, brush.py)) // new Vector2
        .end(pt2(brush.x, brush.y)) 
        .weight(brush.size as f32)
        .caps_round();
}

Whether I use pt2(x, y) or the shared brush.v_x_y, it ends up taking around 3.25GB of memory.

What am I doing wrong here?

EDIT: Omitting the start and end calls keeps the program at less than 100MB.

PS: I am using nannou to learn rust.

SMUsamaShah avatar May 15 '21 18:05 SMUsamaShah

You mean 250k times per frame? Like 250000 lines at once?

If that's the case, what happens is that nannou does not compute the geometry for one line and draw it immediately (that would be very inefficient). It creates an inner structure with all your information, then asks the rasterizer to transform all that data into a list of triangles, then renders it with one draw call, so that you get the best performance out of your GPU.

Now a line without caps needs 4 vertices (2 tris), a line with rounded caps ~20 vertices (~40 tris) maybe? That's a lot of vertices to compute/store (position + color information + index). I would not be surprised if that is the case. Just that 250K lines is a lot of data, probably stored twice in different formats. Maybe try without the round caps to see how big of an impact that is?
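To put rough numbers on that estimate, here is a minimal back-of-the-envelope sketch. The function name, the 28-byte vertex layout (3 x f32 position plus 4 x f32 color), the u32 indices, and the index counts per line are all assumptions for illustration, not values taken from nannou's internals.

```rust
/// Rough size in bytes of a tessellated mesh for `lines` line segments.
/// Every parameter is an assumption supplied by the caller.
fn estimated_mesh_bytes(
    lines: u64,
    vertices_per_line: u64,
    indices_per_line: u64,
    bytes_per_vertex: u64,
    bytes_per_index: u64,
) -> u64 {
    lines * (vertices_per_line * bytes_per_vertex + indices_per_line * bytes_per_index)
}

fn main() {
    let lines = 250_000;
    // Assumed vertex layout: position (3 x f32) + color (4 x f32) = 28 bytes.
    let bytes_per_vertex = 28;
    // Assumed index type: u32.
    let bytes_per_index = 4;

    // No caps: 4 vertices, 2 triangles = 6 indices.
    let plain = estimated_mesh_bytes(lines, 4, 6, bytes_per_vertex, bytes_per_index);
    // Round caps: ~20 vertices; 54 indices assumes 18 triangles (hypothetical).
    let round = estimated_mesh_bytes(lines, 20, 54, bytes_per_vertex, bytes_per_index);

    println!("no caps:    {} MB", plain / 1_000_000);
    println!("round caps: {} MB", round / 1_000_000);
}
```

Under these assumptions a single copy of the mesh stays in the hundreds of megabytes at most, so a figure in the gigabytes would suggest more vertices per cap than guessed here, several copies of the data, or allocator overhead.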

Now if you are not drawing all the lines at once, then my guess is that maybe the buffers storing the data might not be cleared/re-used at each frame (I believe that allocating the buffers for the vertices data is usually on the side of wgpu, and the memory allocator there kind of does its own thing, hopefully something efficient).

As to whether it's wrong, I'd say nothing that I see here is wrong. Do you care that much about memory usage when you have some to spare? I would be more interested in the behavior when you have only 1GB available, for example. Maybe then the memory usage fits into what's available?

MacTuitui avatar May 23 '21 08:05 MacTuitui

Thank you for the explanation.

I only noticed the memory usage when I saw it was much slower than my JavaScript implementation, and on the second/third draw call it crashed with a memory allocation error.

I am doing it all at once, I believe, and I do need the caps to be round. The image is a text file of around 200KB. The JavaScript implementation does it with minimal CPU and memory usage.

It's like this

fn view(app: &App, frame: Frame) {
  ...
  while i < img_file.len() { // ~= 250_000
    draw.line().start(pt2(x1, y1)).end(pt2(x2, y2)).caps_round();
  }
  draw.to_frame(app, &frame).unwrap();
}

Experiments based on suggestions

No round caps, low mem usage, good performance

Removed the round line caps. This time memory only went up to 80MB, even though it's still a single draw.to_frame() call after 250k draw.line() calls.

I do need round caps, otherwise end result looks crappy.

Send to frame after every line

fn view(app: &App, frame: Frame) {
  ...
  while i < img_file.len() { // ~= 250_000
    draw.line().start(pt2(x1, y1)).end(pt2(x2, y2)).caps_round();
    draw.to_frame(app, &frame).unwrap();
  }
  draw.to_frame(app, &frame).unwrap();
}

Doing draw.to_frame(app, &frame).unwrap(); after each draw.line() within the loop crashes it with thread 'main' panicked at 'called Result::unwrap() on an Err value: AllocationError(OutOfMemory(Device))', no matter whether I use round caps or not.

SMUsamaShah avatar May 23 '21 15:05 SMUsamaShah

Thanks for the details!

So a few things to consider here from my perspective: You want to render once as you are not animating the lines (right?), so there is no need for you to redraw everything. One way to do this might be to incrementally render 5% of the lines per frame for 20 frames, then just do nothing (as long as you do not draw a background, your image will not change). However, you should not do this by calling to_frame multiple times within view: that only duplicates all the data without clearing it, as the draw structure (well, actually the inner representation of the data) is probably not dropped until the end of view, hence the OutOfMemory error. You need to spread those calls over multiple frames (that is, over multiple calls to view). I would use app.elapsed_frames() here and iterate over the lines in huge chunks. If splitting the lines over 20 frames makes it work fine, I'd say it's good enough.
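The chunking part of that idea can be sketched without any nannou types. The helper name chunk_for_frame and its signature are hypothetical. Inside view, the frame argument would presumably come from app.elapsed_frames() cast to usize, and only the lines in the returned range would be passed to draw.line() before the single to_frame call.

```rust
use std::ops::Range;

/// Index range of the lines to draw on frame number `frame` when `total`
/// lines are spread over `frames` frames. Returns an empty range once all
/// chunks have been handed out.
fn chunk_for_frame(total: usize, frames: usize, frame: usize) -> Range<usize> {
    if frames == 0 || frame >= frames {
        return total..total;
    }
    // Ceiling division so that the last chunk absorbs any remainder.
    let chunk = (total + frames - 1) / frames;
    let start = (frame * chunk).min(total);
    let end = (start + chunk).min(total);
    start..end
}

fn main() {
    let total = 250_000;
    let frames = 20;
    let mut drawn = 0;
    // Simulate a few more frames than needed to show the "do nothing" phase.
    for frame in 0..frames + 2 {
        let range = chunk_for_frame(total, frames, frame);
        drawn += range.len();
        println!("frame {:2}: lines {:?} ({} drawn so far)", frame, range, drawn);
    }
}
```

Each frame then tessellates only one twentieth of the lines, which is the point of the suggestion. Whether that keeps memory low enough in practice would still need to be measured.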

As to why it is slow compared to a JS implementation, I'd say it really is a question of making those draw calls efficient, and your use case is really at the edge of what is a prime candidate for optimization. Different use cases might want different optimization targets...

MacTuitui avatar May 24 '21 00:05 MacTuitui

You may want to try creating a new Draw after calling to_frame() on it. I've had some issues with reusing a Draw. Not that I think it'll solve your problem, but who knows.

ErikNatanael avatar Jun 01 '21 07:06 ErikNatanael