l5kit
Speedup semantic rasterizer
I achieved a ~7-8% speedup with these modifications.
render_semantic_map is the bottleneck when using a low number of history_frames. This PR speeds up the render_semantic_map function by ~200%. Even for a higher number of history_frames the change is still significant, and I suggest accepting it.
did you compare this against #196?
Thanks for pointing to that PR. It does even better once the caching is done:
no changes: 52 ms ± 4.77 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
this PR #140: 24.8 ms ± 777 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
PR #196: 32.9 ms ± 17.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each). The slowest run took 4.23 times longer than the fastest. This could mean that an intermediate result is being cached.
PR #196 after caching: 17.5 ms ± 880 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Timings are for the full dataset loading (448x224 raster, 0 history frames), so the speedups also include all other operations, such as box_rasterizer.
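For reference, the relative speedups implied by the mean timings in this thread can be worked out with a few lines of Python. This is only a rough sketch: the dictionary labels and the `speedup` helper are made up for illustration, only the mean values are taken from the numbers reported here, and the standard deviations are ignored.

```python
# Mean per-loop times in milliseconds, copied from the timings in this thread.
# The labels are illustrative only.
mean_ms = {
    "no changes": 52.0,
    "this PR #140": 24.8,
    "PR #196 (before caching)": 32.9,
    "PR #196 (after caching)": 17.5,
}


def speedup(baseline_ms, candidate_ms):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


baseline = mean_ms["no changes"]

# Speedup factor of every variant relative to the unmodified code.
speedups = {name: speedup(baseline, ms) for name, ms in mean_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

Since these are end-to-end loading times, the factors understate the gain inside render_semantic_map itself.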