fs2open.github.com icon indicating copy to clipboard operation
fs2open.github.com copied to clipboard

optimize collision code a bit

Open Goober5000 opened this issue 1 year ago • 13 comments

This was motivated by the recent Coverity report. Coverity noticed that a few of the copy assignments could be replaced by move assignments, but on further investigation one copy each in ship_weapon_check_collision() and beam_collide_ship() can be entirely removed, another either/or copy in each can be replaced with an either/or pointer assignment, and a few copies can be done only when needed. Another vec3d copy can be replaced with a vec3d pointer since there is no longer a need to keep a local copy.

Goober5000 avatar Jul 08 '24 18:07 Goober5000

Are there any profiling results to see what sort of difference this makes?

qazwsxal avatar Jul 08 '24 20:07 qazwsxal

I haven't run any.

EDIT: @EatThePath feel like doing some profiling tests?

Goober5000 avatar Jul 08 '24 20:07 Goober5000

Ok, so just to be clear, this hasn't been tested for performance regressions either? I've had pretty unintuitive results when it comes to collision code performance in the past.

qazwsxal avatar Jul 08 '24 22:07 qazwsxal

One change needs further clarification; additionally, this needs performance testing to ensure it actually is an optimization, or at least not a regression.

Since I was pinged, I'll chime in to concur. Even changes to this area that aren't motivated by optimization really should have some benchmark comparisons to ensure there isn't a regression, it's just above one of our hottest loops and as @qazwsxal points out it's often unpredictable what will affect it, and how.

That said my gut is that this is will be an effectively null impact either way. This is near enough the hot loop to warrant caution but the real heat is inside model_collide, if memory serves. A few small copies more or less outside of it seems likely negligible.

I'm not really up for doing the testing myself, at the moment. If someone wants tips on how to do it, I can probably help with that, but I don't have free time I'd want to allocate to this in the immediate future.

EatThePath avatar Jul 09 '24 00:07 EatThePath

I haven't tested for performance regressions but I have tested this in mission. I have no objection to performance testing but this is a pretty straightforward reduction in the number of copies. It doesn't change any algorithms and it doesn't involve adding or removing loops.

I presume the impact of this PR will be small but positive. There are no drastic changes, but since these functions are run every time collisions are checked between a ship and a weapon/beam, reducing the number of copies for each check has gotta be worth something.

I don't really have the cycles for performance testing myself. @qazwsxal since you brought it up are you willing to do it?

Goober5000 avatar Jul 09 '24 03:07 Goober5000

I haven't got bandwidth either, here's my python scripts for profiling though, run_icarus.py does 5 runs of BP's icarus mission at a low resolution to avoid rendering overhead then dumps results to a numpy blob. Then plot_files.py generates a couple of graphs showing individual frametimes and averages, then compares the two on a third graph.

qazwsxal avatar Jul 09 '24 10:07 qazwsxal

A quick profiling run shows that in the collision-heavy segments, performance is effectively identical. I'm less sure about the non-collision-heavy segements (like the first two cuts of icarus), as I see significant variance between runs in thse 300+FPS sections.

BMagnu avatar Jul 09 '24 12:07 BMagnu

A quick profiling run shows that in the collision-heavy segments, performance is effectively identical. I'm less sure about the non-collision-heavy segements (like the first two cuts of icarus), as I see significant variance between runs in thse 300+FPS sections.

From an optimisation perspective, my general feeling with the 300+FPS sections is to practically ignore them. The main thrust of optimisation is to make the same missions playable on less powerful machines, and to make more intensive missions possible. 300FPS is as playable as 500FPS or 200FPS for almost everyone out there, so I don't think it's worth caring about.

That said, if performance is effectively identical, then what's the point of this PR? If it's making the code more readable, then sure, but it appears to be bogged down in a discussion about whether to add a comment or not.

qazwsxal avatar Jul 19 '24 09:07 qazwsxal

There are several goals here, in no particular order:

  1. Address the Coverity issues
  2. Improve performance to the extent possible in these functions
  3. Apply polishing where appropriate

It's already had one approval, and not really bogged down, as the remaining review point is about something rather trivial. Hopefully we can get that resolved soon.

Goober5000 avatar Jul 20 '24 02:07 Goober5000

Thank you for adding the comment; now, where are your performance tests?

MageKing17 avatar Jul 24 '24 17:07 MageKing17

Thank you for adding the comment; now, where are your performance tests?

@BMagnu ran a profiling run a few posts up which should cover that.

Goober5000 avatar Jul 25 '24 18:07 Goober5000

@BMagnu ran a profiling run a few posts up which should cover that.

No it shouldn't; no data was provided. I say again, where are your performance tests to prove that your optimization isn't a regression?

MageKing17 avatar Jul 25 '24 19:07 MageKing17

I'm happy to take him at his word, but based on your comment I've asked him to upload his data. But he's busy for the short term so it may take several days.

Goober5000 avatar Jul 28 '24 04:07 Goober5000

Okay, finally had time to rerun the benchmarks (icarus, 10 runs per option) properly with a lot of repetitions and influence of the nvidia driver minimized. image The first plot / the blue line in the third plot shows the pre-PR performance. The second plot / the orange line in the third plot shows the post-PR performance. There seems to be little difference overall, if not a tiiiiiiny speedup post-PR.

BMagnu avatar Nov 05 '24 22:11 BMagnu