melonDS Hi-res software rendering

A cleaner redo of #618. The code modifications are less intrusive and performance is better.

There are conflicts between this and #975 but they are minor. I can rebase and force-push either that or this after one of them is merged if desired.

Feb 20 '21 21:02 SuuperW

Apparently the compiler used by the macOS build check doesn't support my use of alignas in GPUSoft_2D.cpp:22: BGOBJLine = new alignas(8) u32[len];

When I compile it locally with MSYS2 on Windows following compile instructions in the README it works. I tested the actual address alignment and it seems to always be 8-byte aligned (16, actually) regardless of whether or not I use alignas. So I tested alignas(64) and that did work for consistently getting 64-byte alignment.

Is there a nice solution here?

Feb 20 '21 21:02 SuuperW

The macOS builds use Clang, whereas all the others use GCC. (I changed the name to make it more clear.)

Feb 20 '21 21:02 RayyanAnsari

From the failed build check: "error: aligned allocation function of type 'void *(std::size_t, std::align_val_t)' is only available on macOS 10.14 or newer"

Uhg. Can it be upgraded to macOS 10.14? Alternatively, it looks like from my Google search that passing -fno-aligned-allocation to the compiler should fix the error, although I guess it won't necessarily be aligned as we want. EDIT: Or maybe there's a macro we can check with #ifdef? Then users could compile on older versions of macOS without having to mess with that.

Feb 20 '21 23:02 SuuperW

The CI is macOS 10.15, we target macOS 10.9 so lower versions of macOS can use the build. (I should probably change that though, I don't think Qt is available for macOS versions that low.)

Feb 21 '21 12:02 RayyanAnsari

oldest osx version supported by qt 5.12 is 10.12

Feb 21 '21 16:02 nathanielcwm

oh looks like brew gives 5.15 not 5.12

so oldest supported is 10.13

btw support ended for 10.9 in 2016. systems that run 10.9 are all able to run up to 10.11

Feb 21 '21 16:02 nathanielcwm

Yep, macOS 10.9 was the lowest I could set it to, without CMake throwing some error.

Feb 21 '21 18:02 RayyanAnsari

so oldest supported is 10.13

Which is just below the required 10.14 for aligned allocation. So if the targeted build is updated to 10.13 we still need a change to this code. Another potential option not mentioned in my previous post, would be to just not align it. I made a test build to test performance with bad alignment. I did not see any difference in FPS when dispmode=1 (where BGOBJLine is cast to a u64*). Of course I don't know enough about these things to know if that will be true of all systems.

Feb 21 '21 21:02 SuuperW

so oldest supported is 10.13

Looks like Homebrew says 10.12 is required for Qt, though idk if it actually builds (they only provide packages for the last 3 macOS versions, so macOS 10.14+.

Feb 21 '21 22:02 RayyanAnsari

idk qt docs for the version of qt supplied by homebrew stats that 10.13 is the oldest supported.

Feb 22 '21 03:02 nathanielcwm

According to the QT formulae page, bottle installation support for Intel x86_64 is limited to Big Sur, Catalina and Mojave. I think it can be taken that anything lower would have limited support at the very most.

Ultimately, you could potentially go lower by using an older version of Xcode and the CLTs.

Feb 22 '21 03:02 ghost

Qt 6 seems to only have support for macOS 10.14+. https://doc.qt.io/qt-6/macos.html

Feb 22 '21 13:02 RayyanAnsari

Restricting to Mojave greatly reduces support, however. Only Macs with graphics cards that support metal can run 10.14. Here's Macs compatible with 10.13:

iMac: Late 2009 or later
MacBook: Late 2009 or later
MacBook Pro: Mid 2010 or later
MacBook Air: Late 2010 or later
Mac Mini: Mid 2010 or later
Mac Pro: Mid 2010 or later

Here's the list for 10.14:

MacBook: Early 2015 or newer
MacBook Air: Mid 2012 or newer
MacBook Pro: Mid 2012 or newer, Retina display not needed
Mac Mini: Late 2012 or newer
iMac: Late 2012 or newer
iMac Pro
Mac Pro: Late 2013 or newer

Feb 23 '21 05:02 nathanielcwm

Your list is a little off there. Mac pros from Mid 2010 & Mid 2012 also can use Mojave with a Metal Compatible card.

In general that still leaves systems up to 9 years old as working with Mojave. Hardly reducing support. The Minimum OS restriction is more of an artifact of the new Universal Binary design as well as changes to programming languages than it is computer hardware itself.

Given that I was running Mojave and Catalina on a Surface Pro 7 with decent performance, there are always ways around restrictions like that as well.

Feb 23 '21 06:02 ghost

I force-pushed a version that has been rebaesd onto master. (If you saw my force-push from earlier today ignore it, I was being dumb.)

I'm still waiting for a decision on if the targeted version of MacOS will be updated. I can update the code to not rely on the aligned new[] and delete[] operators if wanted. Changing the type of the new-ed array to u64 should provide the desired alignment I think. It can then be cast to a u32*, and back again for deletion.

Feb 28 '21 23:02 SuuperW

@SuuperW I have changed the macOS deployment target to macOS 10.14. This PR should build now on macOS.

Mar 12 '21 17:03 RayyanAnsari

Any updates on this?

May 01 '21 01:05 autofire372

@SuuperW any updates on this pr 🙏

Jul 18 '21 23:07 ghost

No, I've been very busy lately. Is there anything that needs to be done before this is merged? If it's going to be merged I can try to find time for resolving conflicts or whatever.

Jul 19 '21 22:07 SuuperW

we've already stated why we don't want to merge this

Jul 19 '21 22:07 RSDuck

@Koas9000 maybe bother to read the comment which is conveniently placed above yours.

Aug 27 '21 22:08 RayyanAnsari

I don't see the thing stating why you don't want to merge this.

Aug 28 '21 00:08 poudink

The explanation is in the old pull request #618

Aug 28 '21 07:08 Brankale

I remember that explanation, but haven't the issues with that PR been fixed by this new PR?

Aug 28 '21 19:08 poudink

I remember that explanation, but haven't the issues with that PR been fixed by this new PR?

no, not really. Though I have to say that my previous reply was indeed not that helpful, because looking back into the old PR, it's not that easy to read the reasons from there, so let me reiterate:

We prefer to keep the accuracy focussed code paths (software rasteriser) clearly separated from the from the enhancement code paths (OpenGL renderer). Of course this is not always possible or reasonable, but there needs to be a good reason to mix them.

But rendering at higher resolutions on the cpu is not that useful, as performance quickly tanks without special optimisations like vectorisation and more advanced multithreading. And even than GPUs will most likely will still be better suited for this.

I know that the classic OpenGL renderer currently has weaknesses, some of which can be solved, some of which just cannot be solved, but there's also the option of a special software renderer running on the GPU in compute shaders (which already exists for Switch, I just need to port it to OpenGL).

Aug 28 '21 22:08 RSDuck

some of which just cannot be solved, but there's also the option of a special software renderer running on the GPU in compute shaders (which already exists for Switch, I just need to port it to OpenGL).

Is there any progress on the compute shader renderer?

Dec 10 '22 13:12 fsmid

melonDS melonDS copied to clipboard

Hi-res software rendering

melonDS
melonDS copied to clipboard