melonDS
melonDS copied to clipboard
Hi-res software rendering
A cleaner redo of #618. The code modifications are less intrusive and performance is better.
There are conflicts between this and #975 but they are minor. I can rebase and force-push either that or this after one of them is merged if desired.
Apparently the compiler used by the macOS build check doesn't support my use of alignas in GPUSoft_2D.cpp:22:
BGOBJLine = new alignas(8) u32[len];
When I compile it locally with MSYS2 on Windows following compile instructions in the README it works. I tested the actual address alignment and it seems to always be 8-byte aligned (16, actually) regardless of whether or not I use alignas. So I tested alignas(64) and that did work for consistently getting 64-byte alignment.
Is there a nice solution here?
The macOS builds use Clang, whereas all the others use GCC. (I changed the name to make it more clear.)
From the failed build check: "error: aligned allocation function of type 'void *(std::size_t, std::align_val_t)' is only available on macOS 10.14 or newer"
Uhg. Can it be upgraded to macOS 10.14? Alternatively, it looks like from my Google search that passing -fno-aligned-allocation to the compiler should fix the error, although I guess it won't necessarily be aligned as we want. EDIT: Or maybe there's a macro we can check with #ifdef? Then users could compile on older versions of macOS without having to mess with that.
The CI is macOS 10.15, we target macOS 10.9 so lower versions of macOS can use the build. (I should probably change that though, I don't think Qt is available for macOS versions that low.)
oldest osx version supported by qt 5.12 is 10.12
oh looks like brew gives 5.15 not 5.12
so oldest supported is 10.13
btw support ended for 10.9 in 2016. systems that run 10.9 are all able to run up to 10.11
Yep, macOS 10.9 was the lowest I could set it to, without CMake throwing some error.
so oldest supported is 10.13
Which is just below the required 10.14 for aligned allocation. So if the targeted build is updated to 10.13 we still need a change to this code. Another potential option not mentioned in my previous post, would be to just not align it. I made a test build to test performance with bad alignment. I did not see any difference in FPS when dispmode=1 (where BGOBJLine is cast to a u64*). Of course I don't know enough about these things to know if that will be true of all systems.
so oldest supported is 10.13
Looks like Homebrew says 10.12 is required for Qt, though idk if it actually builds (they only provide packages for the last 3 macOS versions, so macOS 10.14+.
idk qt docs for the version of qt supplied by homebrew stats that 10.13 is the oldest supported.
According to the QT formulae page, bottle installation support for Intel x86_64 is limited to Big Sur, Catalina and Mojave. I think it can be taken that anything lower would have limited support at the very most.
Ultimately, you could potentially go lower by using an older version of Xcode and the CLTs.
Qt 6 seems to only have support for macOS 10.14+. https://doc.qt.io/qt-6/macos.html
Restricting to Mojave greatly reduces support, however. Only Macs with graphics cards that support metal can run 10.14. Here's Macs compatible with 10.13:
- iMac: Late 2009 or later
- MacBook: Late 2009 or later
- MacBook Pro: Mid 2010 or later
- MacBook Air: Late 2010 or later
- Mac Mini: Mid 2010 or later
- Mac Pro: Mid 2010 or later
Here's the list for 10.14:
- MacBook: Early 2015 or newer
- MacBook Air: Mid 2012 or newer
- MacBook Pro: Mid 2012 or newer, Retina display not needed
- Mac Mini: Late 2012 or newer
- iMac: Late 2012 or newer
- iMac Pro
- Mac Pro: Late 2013 or newer
Your list is a little off there. Mac pros from Mid 2010 & Mid 2012 also can use Mojave with a Metal Compatible card.
In general that still leaves systems up to 9 years old as working with Mojave. Hardly reducing support. The Minimum OS restriction is more of an artifact of the new Universal Binary design as well as changes to programming languages than it is computer hardware itself.
Given that I was running Mojave and Catalina on a Surface Pro 7 with decent performance, there are always ways around restrictions like that as well.
I force-pushed a version that has been rebaesd onto master. (If you saw my force-push from earlier today ignore it, I was being dumb.)
I'm still waiting for a decision on if the targeted version of MacOS will be updated. I can update the code to not rely on the aligned new[] and delete[] operators if wanted. Changing the type of the new-ed array to u64 should provide the desired alignment I think. It can then be cast to a u32*, and back again for deletion.
@SuuperW I have changed the macOS deployment target to macOS 10.14. This PR should build now on macOS.
Any updates on this?
@SuuperW any updates on this pr 🙏
No, I've been very busy lately. Is there anything that needs to be done before this is merged? If it's going to be merged I can try to find time for resolving conflicts or whatever.
we've already stated why we don't want to merge this
@Koas9000 maybe bother to read the comment which is conveniently placed above yours.
I don't see the thing stating why you don't want to merge this.
The explanation is in the old pull request #618
I remember that explanation, but haven't the issues with that PR been fixed by this new PR?
I remember that explanation, but haven't the issues with that PR been fixed by this new PR?
no, not really. Though I have to say that my previous reply was indeed not that helpful, because looking back into the old PR, it's not that easy to read the reasons from there, so let me reiterate:
We prefer to keep the accuracy focussed code paths (software rasteriser) clearly separated from the from the enhancement code paths (OpenGL renderer). Of course this is not always possible or reasonable, but there needs to be a good reason to mix them.
But rendering at higher resolutions on the cpu is not that useful, as performance quickly tanks without special optimisations like vectorisation and more advanced multithreading. And even than GPUs will most likely will still be better suited for this.
I know that the classic OpenGL renderer currently has weaknesses, some of which can be solved, some of which just cannot be solved, but there's also the option of a special software renderer running on the GPU in compute shaders (which already exists for Switch, I just need to port it to OpenGL).
some of which just cannot be solved, but there's also the option of a special software renderer running on the GPU in compute shaders (which already exists for Switch, I just need to port it to OpenGL).
Is there any progress on the compute shader renderer?