ezquake-source
-fno-tree-vectorize no longer needed. Maybe...
Currently the Makefile disables the tree-vectorize optimization. I believe this is no longer necessary: the compiler bug that made you disable it has, I think, been fixed for quite some time now. I built ezQuake without -fno-tree-vectorize and the client seemed to work just fine.
Now, whether that flag helps or hinders is something that will require further testing, preferably with the help of many more people and different machines. On my machine, tree-vectorize actually seemed to make things a little bit slower. Imperceptibly, but slower: something like 1000 fps with -fno-tree-vectorize and 990 fps without it (i.e. with vectorization enabled).
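For anyone willing to help test, a rough recipe might look like the one below. This is only a sketch: the binary name, demo name, and make invocation are placeholders and will differ per platform and build setup; timedemo is just the usual QuakeWorld way of getting a comparable fps number.

```sh
# Hypothetical comparison recipe -- adjust binary, demo and make invocation to your setup.

# 1) Build with the current Makefile (vectorization disabled via -fno-tree-vectorize)
make clean && make
./ezquake-linux-x86_64 +timedemo somedemo     # note the reported fps

# 2) Remove -fno-tree-vectorize from the Makefile, then rebuild
make clean && make
./ezquake-linux-x86_64 +timedemo somedemo     # compare against the first run
```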
Speaking of which, it might be a good idea to enable link-time optimization (LTO) in the Makefile for release builds at least. ezQuake is pretty fast to build by today's standards, so the impact on build time and memory usage during the build should be limited.
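As a rough illustration of what I mean (not a patch against the actual ezQuake Makefile; the variable names here are just assumptions), enabling LTO for a release build usually boils down to something like:

```make
# Hypothetical Makefile fragment -- variable names are illustrative,
# not the ones used by the real ezQuake build system.
ifdef RELEASE
    # -flto has to be passed both when compiling and when linking
    CFLAGS  += -O3 -flto
    LDFLAGS += -flto
endif
```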
"Think, believe, maybe, might, seemed, should" is not yet a convincing case :)
It was a long time ago now, but I/we disabled these for a reason back then. I would like to turn the question around and ask: why should they be enabled? With a fast computer you can get thousands of fps today, meaning that the time to render a frame is in the hundreds of microseconds (e.g. 2000 fps works out to 500 µs per frame), for everything; that is crazy fast. Otherwise it's usually the GPU rendering that takes the time.
Compiler flags and optimizations are complex; I think there is good reason to be cautious and to require a better understanding of why things should be changed, if they should be changed at all.
@Calinou Good point. I forced -O3 and LTO when I built it. The client worked flawlessly.
@jite I totally agree about being cautious and testing.
I disagree with some of what you said in the second paragraph though. You are right when you say that, nowadays, the GPU will be pretty much the only significant factor, but I don't like it when people start going "today's CPUs are very fast, so let's be lazy here and there, no one will notice..." Well... not exactly what you said :), but I think you got the idea.
But your last paragraph was perfect and I wholeheartedly agree. I'm not saying "let's blindly enable everything because it will be better." In fact, in the OP I said that, after a couple of quick tests, enabling that optimization seemed to have detrimental results! Almost imperceptible, but detrimental. I think part of the confusion is that we are talking about a -fno flag here, and when I said that I "enabled it", I meant that I enabled the optimization the -fno flag was disabling. And since I think I just made things even more confusing, let me try again. Here we go:
I removed the -fno-tree-vectorize flag from the Makefile, thus allowing the compiler to use that optimization if it felt it would generate faster code. And the result was slightly worse performance.
And if that is exactly what you understood in the first place, then I am sorry. But at least now I think we are all on the same page.
Anyway, I'm pretty sure the reason -fno-tree-vectorize is being used in the Makefile has less to do with performance than with a known bug GCC used to have when that optimization was enabled. That bug was fixed, so now it is time to test whether the optimization should still be disabled. I'm sorry if I gave the wrong impression, but the whole idea here was not to ask for it to be enabled, but to point out that removing the flag didn't cause crashes and, because of that, maybe to gather enough people willing to benchmark and reach a final conclusion.
You are correct in that the reason for it being disabled in the first place is no longer an issue with GCC 10+ (or Clang 9+, iirc). It was because GCC would vectorize unaligned pointer accesses when loading maps; maps that weren't "properly" created (dark-terror-ffa was the go-to example of this) would crash ezQuake. I also have no issue with LTO'ing things, though quite honestly the benefit is very minimal so long as you're using a recent compiler anyway.
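To illustrate the class of bug being described, here is a made-up sketch (not the actual ezQuake map-loading code; names and types are invented for the example):

```c
/* Illustrative sketch only -- NOT real ezQuake code.
 * Map lumps are raw byte buffers; reading them through a wider pointer type
 * gives the compiler no real alignment guarantee. Older GCC versions could
 * auto-vectorize a loop like this assuming the pointer was at least
 * element-aligned, emitting aligned SIMD loads that crashed on map files
 * whose data wasn't "properly" aligned. */
#include <stdint.h>
#include <stddef.h>

float sum_coords(const uint8_t *lump, size_t nfloats)
{
    /* The cast promises 4-byte alignment that a hand-built map file
       may not actually provide. */
    const float *v = (const float *)lump;
    float sum = 0.0f;

    for (size_t i = 0; i < nfloats; i++)
        sum += v[i];            /* candidate for auto-vectorization */

    return sum;
}
```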
@jite I totally agree about being cautious and testing.
I disagree with some of what you said in the second paragraph though. You are right when you say that, nowadays, the GPU will be pretty much the only significant factor, but I don't like it when people start going "today's CPUs are very fast, so let's be lazy here and there, no one will notice..." Well... not exactly what you said :), but I think you got the idea.
Sure, I definitely agree; however, it's not really what I meant. What I meant was that there's not much to optimize at all, it's already well optimized. Because of that, the potential gains are quite limited, but the potential problems that might arise are unknown. Changing things, especially compiler flags, away from a known working state normally requires good motivation and reason, IMO.
Basically what triggered me was the "I built it and it seems to work" comment; yes, absolutely, that might be the case, but without knowledge of the initial problem and a conclusion that it can no longer be reproduced with newer compiler versions, it needs more testing.
I was the admin/main dev for ezQuake years ago and worked quite hard on stabilizing things. The lesson learned was that even the best of intentions might have side effects :) Don't get me wrong, it might be the right approach to try and get rid of the old "workarounds", just keep in mind that it might require more testing.
For reference I work in the embedded world, so no resources wasted here :)