nGL
nGL copied to clipboard
Update gl.cpp
Support for SDL2, and fixed some divide by 0 errors with Fix<>, changed some other things for simplicity.
Some questions:
-
why use this to create the depth buffer instead of
new
:z_buffer = new std::remove_reference<decltype(*z_buffer)>::type[SCREEN_WIDTH*SCREEN_HEIGHT];
-
why use <signal.h>?
-
why were all the beginning variables static?
-
lastly, the #define cluster around line 477, why so much/complicated?:
//I hate code duplication more than macros and includes
#ifdef TEXTURE_SUPPORT
#define TRANSPARENCY
#include "triangle.inc.h"
#undef TRANSPARENCY
#undef TEXTURE_SUPPORT
#define FORCE_COLOR
#include "triangle.inc.h"
#define TEXTURE_SUPPORT
#undef FORCE_COLOR
#endif
#include "triangle.inc.h"
A lot of the code will appear as changed because MSVC beautified it. I spent a lot of time wondering why I couldn't get a triangle to draw, so I went ahead to learn about matrices and stuff to try and backtrack the problem. I thought of enabling wireframe, so it would draw a 3d line which was pretty basic, then I got the div by 0 errors... Well it's fixed now, I hope my changes can be used.
Yay!
Support for SDL2, and fixed some divide by 0 errors with Fix<>, changed some other things for simplicity.
Please split all that into separate commits, especially the reformatting. I see that there are various changes which aren't just reformatting but also change the meaning of code (e.g. replacing decltype
with float
or removing static
), those have to be undone.
It also changed e.g const VERTEX *v1
to const VERTEX* v1
, which is a regression. It makes it clearer that with VERTEX *v1, v2;
for instance v2
is not a pointer.
Some questions:
- why use this to create the depth buffer instead of
new
:z_buffer = new std::remove_reference<decltype(*z_buffer)>::type[SCREEN_WIDTH*SCREEN_HEIGHT];
At some point I was experimenting with using different precisions for the Z buffer, so I used this way of getting the right type without having to change it everywhere. Meanwhile it could be replaced by hardcoding GLFix
though.
- why use <signal.h>?
It's used for installing a SIGINT handler to quit SDL. Without that, you can't terminate the program properly. I guess MSVC doesn't implement this? In that case, you can wrap it in #ifndef __WIN32__
or similar.
- why were all the beginning variables static?
They're not supposed to be visible externally/globally, which also allows for more optimizations. See also https://stackoverflow.com/questions/1856599/when-to-use-static-keyword-before-global-variables for more info.
- lastly, the #define cluster around line 477, why so much/complicated?:
triangle.inc.h
contains just a single function for drawing triangles, but it has several different modes which require different code (interpolating U/V coords, transparency, combination thereof, etc.). Doing that with plain if(transparency)
or such would be too slow, so it uses the preprocessor instead to have separate functions without having to duplicate all the code manually.
//I hate code duplication more than macros and includes #ifdef TEXTURE_SUPPORT #define TRANSPARENCY #include "triangle.inc.h" #undef TRANSPARENCY #undef TEXTURE_SUPPORT #define FORCE_COLOR #include "triangle.inc.h" #define TEXTURE_SUPPORT #undef FORCE_COLOR #endif #include "triangle.inc.h"
A lot of the code will appear as changed because MSVC beautified it. I spent a lot of time wondering why I couldn't get a triangle to draw, so I went ahead to learn about matrices and stuff to try and backtrack the problem. I thought of enabling wireframe, so it would draw a 3d line which was pretty basic, then I got the div by 0 errors... Well it's fixed now, I hope my changes can be used.
Yeah, the line drawing code wasn't really used beyond the initial bringup, it's older than the triangle drawing code and uses a naive, unoptimized and broken way of drawing.
I'll leave some comments about the code changes in the review section.
Ok I made the split commits, all organized now. For some reason when I disable WIREFRAME, nothing gets rendered.
I just realized something about the transformation matrices, not all of the nodes are used for translations/rotations/scaling,
. In the nglMultMatMat, and maybe nglMultMatVectRes functions could have those specific indexes removed from the operations since it will be equivalent to adding 0.
I don't know if this will make a difference but it seems like something that could be trimmed for performance.
__builtin_expect macro for gnu isn't recognized by msvc, so it doesn't work as expected when I add the custom macro #define __builtin_expect(exp, c) exp != c
. I'm not sure whether this is correct either.
I just realized something about the transformation matrices, not all of the nodes are used for translations/rotations/scaling,
. In the nglMultMatMat, and maybe nglMultMatVectRes functions could have those specific indexes removed from the operations since it will be equivalent to adding 0. I don't know if this will make a difference but it seems like something that could be trimmed for performance.
You noticed that correctly. The unused part is for the W coordinate, which isn't really needed for 3D affine transformations which always have 0 0 0 1
as last column. Except for the perspective calculation, all transformations in nGL so far are affine, and so this can be optimized away. Vectors are stored as struct VECTOR3
without the W component and it's always treated as 1 in calculations. You can see that for instance in nglMultMatVectRes
: There is no *w
for the last column. https://en.wikipedia.org/wiki/Transformation_matrix#Affine_transformations explains this approach as well.
This makes nglMultMatVectRes
unsuitable for non-affine transformations though, and so perspective calculation in nglPerspective
is not done using a projection matrix, but rather manually. That way also some additional calculations can be avoided which aren't needed for this projection.
A optimization similar to VECTOR3
could be done to struct MATRIX
by not storing the last column and treating it as 0 0 0 1
and ignoring writes. That makes MATRIX
less general-purpose though and wouldn't actually be a huge benefit: Matrix multiplication is only performed to change the world transformation , e.g. by calling glScale3f
/glTranslatef
/nglRotateX
. That doesn't happen often during a frame, is independent of the total vertex/primitive count and doesn't take significant frame time.
Much more time is spent during drawing of primitives, so avoiding overdraw (writing pixels which aren't visible in the final frame) would have a much bigger impact on performance. nGL can't do any visibility calculations depending on scenery, so that's up to the application.
Another topic is that currently nGL uses a 32-bit Z buffer, which has to be read (and written, if in front) for each pixel of a triangle. A different approach is to depth-sort primitives before drawing them, which avoids that access, but increases overdraw as everything behind it which is just partially visible is fully drawn and then just overwritten again.
__builtin_expect macro for gnu isn't recognized by msvc, so it doesn't work as expected when I add the custom macro #define
__builtin_expect(exp, c) exp != c
. I'm not sure whether this is correct either.
It's not - __builtin_expect
is to tell the compiler what the "most likely" result of the expression is. So the correct replacement would be just #define
__builtin_expect(exp, c) (exp)`.