OpenTESArena
Software renderer redesign for 0.15.0
The existing software renderer is a naive 2.5D ray caster and retains several poor design choices from the past few years. Although it has a couple of optimizations like multi-threading and per-pixel-column occlusion culling, it was meant only for prototyping new features. I am rewriting it for 0.15.0 so it is actually decent (more importantly: optimizable) and can match Arena's appearance more closely by using 8-bit palette colors instead of true color.
Part 1
- Scrap 2.5D ray caster, salvaging important functions
- Implement scene graph which feeds the renderer geometry and lights and keeps gameplay details (voxels/entities/sky/etc.) away. The renderer stores only frame buffers and allocated textures; it does not own geometry
- Implement 3D triangle rasterization (projection, clipping, barycentric coordinates, debug RGB colors, perspective correctness)
- Implement all pixel shaders using 8-bit textures and light tables, supporting material types (e.g. opaque/alpha-tested/ghost)
- Optimization and features like dithering or multi-threading are mostly not needed yet
- Some color/palette bugs are okay
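
The barycentric and perspective-correctness steps listed above can be sketched as follows. This is a minimal illustration rather than the project's code: `Vec2`, `edgeFunction`, `barycentricWeights`, and `interpolatePerspective` are hypothetical names, and it assumes each vertex carries a positive clip-space w after clipping.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical types and names for illustration; not taken from the OpenTESArena source.
struct Vec2
{
    double x, y;
};

// Twice the signed area of triangle (a, b, c). The sign encodes winding order.
double edgeFunction(const Vec2 &a, const Vec2 &b, const Vec2 &c)
{
    return ((b.x - a.x) * (c.y - a.y)) - ((b.y - a.y) * (c.x - a.x));
}

// Writes the screen-space barycentric weights of p into 'weights'.
// Returns false for a degenerate (zero-area) triangle.
bool barycentricWeights(const Vec2 &v0, const Vec2 &v1, const Vec2 &v2, const Vec2 &p,
    std::array<double, 3> &weights)
{
    const double area = edgeFunction(v0, v1, v2);
    if (std::abs(area) < 1e-12)
    {
        return false;
    }

    weights[0] = edgeFunction(v1, v2, p) / area;
    weights[1] = edgeFunction(v2, v0, p) / area;
    weights[2] = edgeFunction(v0, v1, p) / area;
    return true;
}

// Interpolates a per-vertex attribute with perspective correction.
// 'clipW' holds each vertex's clip-space w (assumed positive after near-plane clipping).
double interpolatePerspective(const std::array<double, 3> &weights,
    const std::array<double, 3> &attribute, const std::array<double, 3> &clipW)
{
    double numerator = 0.0;
    double denominator = 0.0;
    for (int i = 0; i < 3; i++)
    {
        assert(clipW[i] > 0.0);
        const double weightOverW = weights[i] / clipW[i];
        numerator += attribute[i] * weightOverW;
        denominator += weightOverW;
    }

    return numerator / denominator;
}
```

When all three w values are equal, this reduces to plain affine interpolation. A vertex with a larger w (farther from the camera) contributes less to the result, which is what keeps textures from swimming on walls seen at an angle.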
Part 2
- Prepare for 0.15.0 release, fix bugs
- All features needed to look identical to Arena are in (e.g. dithering)
- More scene graph optimizations (antiportal occlusion culling, multi-threading)
- See how close to 2160p 60 fps we can get
Branch naming convention: sw-renderer-redesign-part-#
Revised roadmap as of 1/13/2024.
Part 1
- [x] Implement 3D triangle rasterization (projection, clipping, barycentric coordinates, perspective correctness)
- [x] Implement vertex buffers, attribute buffers, and index buffers
- [x] Implement 8-bit texturing
- [x] Implement basic scene graph which feeds the renderer draw calls every frame for visible geometry
- [x] Implement one pixel shader for all geometry just to get everything on-screen
- [x] Various engine design clean-ups to support renderer development as needed (scene management, etc.)
- [x] Get to a geometry-complete state with the chunk system, meaning all voxels and entities are appearing in-world
Part 2
- [x] Get to a geometry-complete state with sky objects, particles, and screen-space fog (fog shading not required)
- [x] Implement pixel shader per material (opaque, alpha-tested, ghost, chasm wall, etc.)
- [x] Implement sky gradient using dither textures
- [x] Implement lights using light tables and basic forward rendering, supporting up to 8 lights per draw call
- [x] Get to a complete state with lighting voxels and entities using the light levels from the original game
- [x] Get appearance to be nearly identical to Arena while supporting any screen resolution
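
As a rough illustration of shading with 8-bit textures and light tables, the sketch below remaps a texel's palette index through one row of a light table. The layout is an assumption made for the example (a table of `levelCount` rows with 256 palette indices each), and `LightTable` and `shadeTexel` are hypothetical names. Which end of the table is brightest depends on the table data and is not assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int PALETTE_SIZE = 256;

// Hypothetical layout: 'levelCount' rows, each mapping a palette index to a shaded palette index.
struct LightTable
{
    int levelCount;
    std::vector<std::uint8_t> entries; // levelCount * PALETTE_SIZE values
};

// Maps an 8-bit texel through the light table row selected by lightPercent in [0, 1].
// The result is still a palette index, so the frame stays 8-bit until final presentation.
std::uint8_t shadeTexel(const LightTable &table, std::uint8_t texel, double lightPercent)
{
    assert(table.levelCount > 0);
    assert(table.entries.size() == static_cast<std::size_t>(table.levelCount) * PALETTE_SIZE);

    const double clampedPercent = std::clamp(lightPercent, 0.0, 1.0);
    const int level = std::min(static_cast<int>(clampedPercent * table.levelCount), table.levelCount - 1);
    const std::size_t index = (static_cast<std::size_t>(level) * PALETTE_SIZE) + texel;
    return table.entries[index];
}
```

Because the shader only ever produces palette indices, lighting cannot introduce colors outside the original palette, which is the property that true-color lighting loses.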
Part 3
- [x] Implement voxel visibility manager using quadtree
- [x] Optimize voxel draw call generation
- [x] Implement entity visibility manager using bounding boxes
- [x] Optimize entity draw call generation
- [x] Implement sky object visibility manager
- [x] Optimize sky object draw call generation
- [x] Change triangle clipping from world space to clip space
- [x] Implement lighting using clip space -> world space transform in rasterizer
- [x] Render sky and weather near the camera and without depth testing so far plane can be closer
- [x] Sort entities by distance (fixes ghost rendering)
- [x] Prioritize lights by distance (fixes shading with tightly-grouped lights)
- [x] Implement puddle shader again
- [x] Optimize rasterizer and pixel shading for higher pixels per second
- [x] Add multi-threaded pixel shading
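
Prioritizing lights by distance under an 8-light limit per draw call could look something like the sketch below. It assumes squared distance from a reference point (for example a mesh center) is the priority key; `Light`, `Vec3`, and `selectNearestLightIds` are hypothetical names, not the project's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3
{
    double x, y, z;
};

struct Light
{
    int id;
    Vec3 position;
};

constexpr std::size_t MAX_LIGHTS_PER_DRAW_CALL = 8;

double distanceSquared(const Vec3 &a, const Vec3 &b)
{
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return (dx * dx) + (dy * dy) + (dz * dz);
}

// Returns the IDs of up to 'maxCount' lights closest to 'point', nearest first.
std::vector<int> selectNearestLightIds(const std::vector<Light> &lights, const Vec3 &point,
    std::size_t maxCount = MAX_LIGHTS_PER_DRAW_CALL)
{
    assert(maxCount <= MAX_LIGHTS_PER_DRAW_CALL);

    std::vector<Light> sorted = lights;
    const std::size_t count = std::min(maxCount, sorted.size());

    // Only the first 'count' elements need to be ordered, so a partial sort is enough.
    std::partial_sort(sorted.begin(), sorted.begin() + count, sorted.end(),
        [&point](const Light &a, const Light &b)
    {
        return distanceSquared(a.position, point) < distanceSquared(b.position, point);
    });

    std::vector<int> ids;
    ids.reserve(count);
    for (std::size_t i = 0; i < count; i++)
    {
        ids.push_back(sorted[i].id);
    }

    return ids;
}
```

With tightly-grouped lights, picking an arbitrary 8 can drop the one light that matters most to a surface, which is presumably why distance ordering fixes the shading. A similar distance key, sorted far to near, is one plausible way to order entities so that ghosts blend over what is behind them.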
#252 is a big enough issue with rain/snow that it should be considered a blocker for finishing part 3.
Done with #252.
Going to skip harder optimizations like occlusion culling and just optimize the rasterizer and shaders in a more conventional way, with fewer special cases, algorithms, etc.
Ideally opaque meshes would draw front to back without depth reads, and then the sky would draw after them with a depth < INFINITY check, but that's too hard to implement, so instead I'm going to draw the sky first and then everything else after it.
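
The sky-first ordering can be illustrated with a tiny frame buffer. This is only a sketch under stated assumptions: `FrameBuffer`, `drawSkyPixel`, and `drawGeometryPixel` are hypothetical names, and the depth buffer is assumed to be cleared to infinity each frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct FrameBuffer
{
    int width, height;
    std::vector<std::uint8_t> colors; // 8-bit palette indices
    std::vector<double> depths;       // cleared to infinity

    FrameBuffer(int width, int height)
        : width(width), height(height),
        colors(static_cast<std::size_t>(width) * height, 0),
        depths(static_cast<std::size_t>(width) * height, std::numeric_limits<double>::infinity()) { }

    std::size_t index(int x, int y) const
    {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return (static_cast<std::size_t>(y) * width) + x;
    }
};

// Sky pass: drawn first, with no depth read and no depth write.
void drawSkyPixel(FrameBuffer &frameBuffer, int x, int y, std::uint8_t color)
{
    frameBuffer.colors[frameBuffer.index(x, y)] = color;
}

// Geometry pass: drawn after the sky with a normal depth test.
// Returns true if the pixel passed the test and was written.
bool drawGeometryPixel(FrameBuffer &frameBuffer, int x, int y, double depth, std::uint8_t color)
{
    const std::size_t i = frameBuffer.index(x, y);
    if (depth >= frameBuffer.depths[i])
    {
        return false;
    }

    frameBuffer.depths[i] = depth;
    frameBuffer.colors[i] = color;
    return true;
}
```

Since the sky never touches the depth buffer, any geometry fragment overwrites it, and pixels that no geometry covers keep the sky color. The cost relative to the ideal ordering is that sky pixels hidden behind geometry are shaded and then discarded.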
Deciding how to handle tiled rendering with puddles and ghosts, because these two shaders have to read from the frame buffer. I think trying to do multi-pass shading is a good idea instead of coming up with a complex tile scheduler to keep threads from blocking on unfinished tiles.
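
One possible reading of the multi-pass idea is to split a frame's draw calls by whether their shader reads the frame buffer. All tiles finish the first pass before any read-back shader runs, so a single barrier between passes replaces per-tile dependencies. The names below (`Material`, `DrawCall`, `splitIntoPasses`) are hypothetical, and the sketch covers only the split, not the threading.

```cpp
#include <cassert>
#include <vector>

enum class Material
{
    Opaque,
    AlphaTested,
    Ghost,
    Puddle
};

struct DrawCall
{
    int id;
    Material material;
};

// Ghosts and puddles sample what is already in the frame buffer.
bool readsFrameBuffer(Material material)
{
    return (material == Material::Ghost) || (material == Material::Puddle);
}

struct Passes
{
    std::vector<DrawCall> writeOnly; // pass 1: can be shaded per tile with no cross-tile waits
    std::vector<DrawCall> readBack;  // pass 2: starts only after every tile finished pass 1
};

// Splits draw calls into two passes, preserving submission order within each pass.
Passes splitIntoPasses(const std::vector<DrawCall> &drawCalls)
{
    Passes passes;
    for (const DrawCall &drawCall : drawCalls)
    {
        if (readsFrameBuffer(drawCall.material))
        {
            passes.readBack.push_back(drawCall);
        }
        else
        {
            passes.writeOnly.push_back(drawCall);
        }
    }

    assert(passes.writeOnly.size() + passes.readBack.size() == drawCalls.size());
    return passes;
}
```

The trade-off is that the second pass cannot begin until the slowest tile of the first pass is done, but that is one synchronization point per frame instead of a scheduler tracking which tiles are finished.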
Added multi-threading today, but performance is still poor even with a very high thread count. I think the binning design will need to change so that bins store much more work for each thread, instead of needing threads to synchronize every 8 or so draw calls.
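
A coarser binning scheme might look like the sketch below: every draw call in the frame is binned per screen tile up front, then each worker claims whole tiles from a shared atomic counter, so threads synchronize once per tile rather than once per small batch of draw calls. This is an assumption about one workable design, not the change that was actually made. `TileGrid`, `binDrawCalls`, and `runWorker` are hypothetical names, and rasterization is replaced by a counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect
{
    int minX, minY, maxX, maxY; // inclusive pixel bounds
};

struct DrawCall
{
    int id;
    Rect bounds; // screen-space bounding box of the draw call's triangles
};

struct TileGrid
{
    int tileSize, tilesX, tilesY;
    std::vector<std::vector<int>> bins; // draw call IDs per tile, in submission order
};

// Bins every draw call of the frame into the tiles its bounding box overlaps.
TileGrid binDrawCalls(const std::vector<DrawCall> &drawCalls, int width, int height, int tileSize)
{
    assert(width > 0 && height > 0 && tileSize > 0);

    TileGrid grid;
    grid.tileSize = tileSize;
    grid.tilesX = (width + tileSize - 1) / tileSize;
    grid.tilesY = (height + tileSize - 1) / tileSize;
    grid.bins.resize(static_cast<std::size_t>(grid.tilesX) * grid.tilesY);

    for (const DrawCall &drawCall : drawCalls)
    {
        // Clamp to the screen and skip draw calls that are entirely off-screen.
        const int minX = std::max(drawCall.bounds.minX, 0);
        const int minY = std::max(drawCall.bounds.minY, 0);
        const int maxX = std::min(drawCall.bounds.maxX, width - 1);
        const int maxY = std::min(drawCall.bounds.maxY, height - 1);
        if ((minX > maxX) || (minY > maxY))
        {
            continue;
        }

        for (int tileY = minY / tileSize; tileY <= maxY / tileSize; tileY++)
        {
            for (int tileX = minX / tileSize; tileX <= maxX / tileSize; tileX++)
            {
                grid.bins[(static_cast<std::size_t>(tileY) * grid.tilesX) + tileX].push_back(drawCall.id);
            }
        }
    }

    return grid;
}

// Worker loop: claims the next unprocessed tile and handles its entire bin before claiming another.
// Returns the number of (tile, draw call) pairs this worker handled; real code would rasterize here.
int runWorker(const TileGrid &grid, std::atomic<int> &nextTile)
{
    const int tileCount = static_cast<int>(grid.bins.size());
    int processed = 0;
    for (int tile = nextTile.fetch_add(1); tile < tileCount; tile = nextTile.fetch_add(1))
    {
        processed += static_cast<int>(grid.bins[tile].size());
    }

    return processed;
}
```

Each worker thread would call `runWorker` with the same `nextTile` counter. Because a tile's pixels are only ever touched by the worker that claimed it, the only shared state during shading is that counter.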