Speed up all programs until they are 50Hz or better
- [x] Doublebuffer the protocol stream
- [ ] Increase GPU clocks
- [ ] Increase width of GPU memory writes
- [ ] Skip more pixels when outside triangle
- [ ] Batch together multiple triangles into one DRAW
- [ ] -ffast-math
- [ ] Profile - where are hotspots? Will C++ optimization help any further?
- [ ] Lighting optimizations
- [ ] Matrix optimizations
- [x] Break up long thin diagonal triangles
- [x] Reduce per-triangle rasterization overhead (issue #49)
Double-buffering of the command stream was added in 5e3abfe. Arena still runs at 25 FPS in some orientations. Buttonfly drops to 25 FPS when the button fills the screen.
With SKIP_FPGA_WORK, arena reports 223.15 frames per second and buttonfly reports 261.68 frames per second.
I take that to mean the CPU work is under 5 ms per frame for both programs (about 4.5 ms for arena and 3.8 ms for buttonfly), well inside the 20 ms frame period at 50 Hz, so optimizing the C++ code won't increase the frame rate any further.
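The frame-time estimate above is just the reciprocal of the reported frame rate. A minimal sketch of that arithmetic, where `frame_ms` and `kBudgetMs50Hz` are hypothetical names used only for illustration and are not from the project:

```cpp
// Convert a frames-per-second reading into milliseconds per frame.
// frame_ms and kBudgetMs50Hz are illustrative names, not project code.
constexpr double frame_ms(double fps) {
    return 1000.0 / fps;
}

// Time available per frame when targeting 50 Hz.
constexpr double kBudgetMs50Hz = frame_ms(50.0);  // 20 ms

// Readings reported with SKIP_FPGA_WORK enabled.
constexpr double kArenaCpuMs = frame_ms(223.15);      // roughly 4.5 ms
constexpr double kButtonflyCpuMs = frame_ms(261.68);  // roughly 3.8 ms
```

Both values sit far below `kBudgetMs50Hz`, which is the basis for concluding that the remaining time goes to the FPGA side rather than the C++ code.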
I imagine the chosen button in buttonfly causes a lot of pixels to be tested. A faster clock would probably help here.
I'm not so sure about arena. It may be that the maze has lots of thin triangles, which cause a lot of pixel testing for not much result. Maybe a faster clock would help here, too.
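To illustrate why thin diagonal triangles are expensive, here is a rough sketch that counts how many pixels a naive bounding-box rasterizer tests versus how many it actually covers. This is an assumption about the general shape of the problem, not a description of the actual FPGA rasterizer; `Point`, `Coverage`, `edge`, and `count_coverage` are hypothetical names.

```cpp
#include <algorithm>

struct Point { int x, y; };

struct Coverage {
    long long tested;   // pixels visited inside the bounding box
    long long covered;  // pixels that passed the inside test
};

// Edge function: its sign tells which side of the line a->b the point p is on.
inline long long edge(Point a, Point b, Point p) {
    return static_cast<long long>(b.x - a.x) * (p.y - a.y) -
           static_cast<long long>(b.y - a.y) * (p.x - a.x);
}

// Visit every pixel in the triangle's bounding box and count the hits.
// Assumes a simple bounding-box walk; the real hardware may skip pixels.
Coverage count_coverage(Point v0, Point v1, Point v2) {
    const int min_x = std::min({v0.x, v1.x, v2.x});
    const int max_x = std::max({v0.x, v1.x, v2.x});
    const int min_y = std::min({v0.y, v1.y, v2.y});
    const int max_y = std::max({v0.y, v1.y, v2.y});

    Coverage c{0, 0};
    for (int y = min_y; y <= max_y; ++y) {
        for (int x = min_x; x <= max_x; ++x) {
            const Point p{x, y};
            const long long w0 = edge(v1, v2, p);
            const long long w1 = edge(v2, v0, p);
            const long long w2 = edge(v0, v1, p);
            ++c.tested;
            // Accept either winding order; pixels on an edge count as inside.
            const bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                                (w0 <= 0 && w1 <= 0 && w2 <= 0);
            if (inside) ++c.covered;
        }
    }
    return c;
}
```

For a sliver such as `count_coverage({0, 0}, {99, 99}, {99, 95})`, the bounding box holds 10,000 pixels but only a small fraction pass the inside test, which is the kind of wasted work that splitting long thin diagonal triangles or skipping outside pixels is meant to reduce.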
The following run at 50 FPS:
- [x] logo
- [x] insect
- [x] bounce
- [x] jello
- [ ] arena
- [x] buttonfly
It's all at 50 FPS except for arena, which drops to 25 FPS when you go inside the maze.
Martini and candlestick in bounce don't hit 50 FPS. Brad thinks it may be lighting.
Bounce Martini with lighting if(0)’d:

| total frame ms | cpu | copy | raster wait | est total raster | swap wait | skipped |
|---|---|---|---|---|---|---|
| 27.72 | 26.51 | 1.20 | 0.00 | 27.71 | 0.00 | 0 |
| WARNING: Skipped frame | | | | | | |
| 19.80 | 11.65 | 1.06 | 5.90 | 18.62 | 1.18 | 1 |
| 19.84 | 11.64 | 1.06 | 5.91 | 18.60 | 1.23 | 0 |
| 19.84 | 11.64 | 1.06 | 5.00 | 17.69 | 2.14 | 0 |
| 19.84 | 11.64 | 1.07 | 4.97 | 17.69 | 2.15 | 0 |
| 19.84 | 11.64 | 1.06 | 5.90 | 18.60 | 1.23 | 0 |
| 19.84 | 11.63 | 1.06 | 5.95 | 18.64 | 1.19 | 0 |
| 19.84 | 11.63 | 1.06 | 5.59 | 18.28 | 1.55 | 0 |
| 19.84 | 11.64 | 1.06 | 4.69 | 17.39 | 2.45 | 0 |
| 19.84 | 11.66 | 1.06 | 5.65 | 18.37 | 1.46 | 0 |

Bounce Candlestick with lighting if(0)’d:

| total frame ms | cpu | copy | raster wait | est total raster | swap wait | skipped |
|---|---|---|---|---|---|---|
| 29.54 | 27.94 | 1.59 | 0.00 | 29.53 | 0.00 | 0 |
| WARNING: Skipped frame | | | | | | |
| WARNING: Skipped frame | | | | | | |
| 20.25 | 14.23 | 1.47 | 2.45 | 18.15 | 2.10 | 2 |
| 19.84 | 14.22 | 1.47 | 2.67 | 18.37 | 1.46 | 0 |
| 19.84 | 14.21 | 1.47 | 2.46 | 18.14 | 1.69 | 0 |
| 19.84 | 14.21 | 1.47 | 2.86 | 18.54 | 1.29 | 0 |
| 19.84 | 14.28 | 1.47 | 2.60 | 18.35 | 1.48 | 0 |
| 19.84 | 14.22 | 1.47 | 2.65 | 18.34 | 1.49 | 0 |
| 19.84 | 14.22 | 1.47 | 2.61 | 18.31 | 1.53 | 0 |
| 19.84 | 14.20 | 1.47 | 1.66 | 17.34 | 2.50 | 0 |
| 19.84 | 14.20 | 1.47 | 1.57 | 17.24 | 2.59 | 0 |
| 19.84 | 14.20 | 1.47 | 1.97 | 17.64 | 2.19 | 0 |
| 19.84 | 14.21 | 1.47 | 1.61 | 17.30 | 2.54 | 0 |

Bounce Martini with lighting if(1)’d:

| total frame ms | cpu | copy | raster wait | est total raster | swap wait | skipped |
|---|---|---|---|---|---|---|
| 31.48 | 30.29 | 1.18 | 0.00 | 31.47 | 0.00 | 0 |
| 27.95 | 19.88 | 1.07 | 0.00 | 20.95 | 6.99 | 4 |
| 28.15 | 20.35 | 1.06 | 0.00 | 21.42 | 6.73 | 16 |
| 26.76 | 20.22 | 1.06 | 0.00 | 21.28 | 5.48 | 13 |
| 29.24 | 20.55 | 1.06 | 0.00 | 21.61 | 7.62 | 19 |
| 29.75 | 20.66 | 1.06 | 0.00 | 21.73 | 8.02 | 27 |
| 30.01 | 20.20 | 1.06 | 0.00 | 21.26 | 8.74 | 16 |
| 29.76 | 20.63 | 1.07 | 0.00 | 21.69 | 8.06 | 17 |
| 29.76 | 20.50 | 1.06 | 0.00 | 21.56 | 8.19 | 17 |
| 27.63 | 20.52 | 1.06 | 0.00 | 21.58 | 6.04 | 15 |

Bounce Candlestick with lighting if(1)’d:

| total frame ms | cpu | copy | raster wait | est total raster | swap wait | skipped |
|---|---|---|---|---|---|---|
| 38.89 | 37.21 | 1.60 | 0.00 | 38.81 | 0.00 | 0 |
| 29.80 | 27.39 | 1.47 | 0.00 | 28.86 | 0.93 | 17 |
| 29.76 | 27.15 | 1.46 | 0.00 | 28.62 | 1.14 | 34 |
| 29.73 | 27.14 | 1.46 | 0.00 | 28.60 | 1.12 | 33 |
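
In the steady-state rows above, total frame time is roughly cpu + copy + raster wait + swap wait, so cpu + copy acts as a floor on the frame time. A small sketch of checking that floor against the 20 ms period of a 50 Hz display follows. It assumes the CPU and copy stages run back to back within a frame, which the logs suggest but do not prove; `FrameSample`, `kFramePeriodMs`, and `fits_50hz` are hypothetical names.

```cpp
// One steady-state row from the timing log (values in milliseconds).
// Illustrative type; not part of the project.
struct FrameSample {
    double cpu_ms;
    double copy_ms;
};

// Frame period at 50 Hz.
constexpr double kFramePeriodMs = 1000.0 / 50.0;  // 20 ms

// True if CPU work plus the copy could fit in a single 50 Hz frame.
// Assumes cpu and copy are serialized and rasterization overlaps them.
constexpr bool fits_50hz(FrameSample s) {
    return s.cpu_ms + s.copy_ms <= kFramePeriodMs;
}

// Representative steady-state rows taken from the logs above.
constexpr FrameSample kMartiniLightingOff{11.64, 1.06};
constexpr FrameSample kCandlestickLightingOff{14.22, 1.47};
constexpr FrameSample kMartiniLightingOn{20.50, 1.06};
constexpr FrameSample kCandlestickLightingOn{27.15, 1.46};
```

Under that assumption, both models fit the budget with lighting disabled and neither fits with lighting enabled, which is consistent with lighting being the bottleneck for these two models.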