moonlight-qt
Moonlight L4T Video decode unit queue overflow
terminal_log.txt ^^ Log file ^^
Describe the bug: When streaming at resolutions higher than 1440p, video decode latency jumps from ~12 ms to ~150 ms.
Steps to reproduce: Stream at 4K resolution on Nvidia Jetson TX2 and Xavier NX devices.
Client PC details (please complete the following information)
- OS: Ubuntu 18.04
- Moonlight Version: v3.1.0
- Nvidia Jetson TX2
- Nvidia Xavier NX
Devices that work with 4k
- Intel laptop with UHD 630, ubuntu 18.04
- Samsung galaxy note 8
- Phone with Snapdragon 865
Additional context: On the newer version of Jetpack, both devices throw a segmentation fault when launching Moonlight, right after these log lines:
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
At resolutions that work, decode latency never drops below 10 ms, compared to the 2 ms that other Moonlight users report on the Jetson Nano.
TL;DR
Want to know where to search for the problem.
The same problem happens when I try a higher resolution with a low bitrate (4K at 20 Mbps).
For comparison, 1440p at 80 Mbps works great, with network usage sitting in the 60 to 80 Mbps range.
Confirmed on my Jetson Nano. I will need to bisect to find the regression.
This doesn't actually look like a Moonlight regression. It appears something has changed with the L4T kernel or system libraries that is causing the performance reduction, because this worked before with identical Moonlight software.
I tried going back to v3.0.0 or even v2.2.0 and the performance is still bad at 4K. You can try it yourself using apt install moonlight-qt=3.0.0-1 or apt install moonlight-qt=2.2.0-1 to revert to older packages.
The segfaults with newer Jetpack versions are probably due to Nvidia making breaking changes to their system libraries. I have no clue why they would do that, but it prevents a single build of Moonlight from supporting pre-4.3 and post-4.3 builds.
I think I may be running into this issue too. I'm using a Jetson Nano 2GB running Jetpack 4.4.1 and the latest Moonlight Qt. Anytime I try to push any higher than 60 fps @ 1080p, I start getting decode unit overflow in the terminal. Even at 60 fps my decode time is around 15 ms, which is what my 2nd gen Fire Stick manages. I tried Jetpack 4.2, which is the earliest version that claims to support the Nano, and the performance is exactly the same. Theoretically, shouldn't this board be able to handle 1080p @ 120 fps? Is there a known working version that will allow 1080p @ 120 fps?
An interesting note is that forcing software decoding has much better performance until too much stuff on screen starts changing. I tried recompiling Moonlight using Jetson ffmpeg, but I couldn't get it to work due to some differences between the standard ffmpeg and Jetson ffmpeg.
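For anyone reading along, the numbers above can be checked with some quick arithmetic. This is only a rough sketch: it assumes frames are decoded strictly one at a time with no pipelining, which may not match what the hardware decoder actually does, and the function names are made up for illustration. Under that assumption, a decode time above the per-frame interval means frames arrive faster than they are decoded, which would fit the queue overflow messages.

```shell
#!/bin/sh
# Hypothetical helper: per-frame time budget in milliseconds for a frame rate.
frame_budget_ms() {
    awk -v fps="$1" 'BEGIN { printf "%.1f\n", 1000 / fps }'
}

# Hypothetical helper: succeeds (exit 0) if the decode time fits the budget.
# Usage: fits_budget <decode_ms> <fps>
fits_budget() {
    awk -v d="$1" -v fps="$2" 'BEGIN { exit !(d <= 1000 / fps) }'
}

# Compare the ~15 ms decode time reported above against 60 and 120 fps.
for fps in 60 120; do
    if fits_budget 15 "$fps"; then
        verdict="fits"
    else
        verdict="exceeds the budget"
    fi
    echo "$fps fps: budget $(frame_budget_ms "$fps") ms, 15 ms decode $verdict"
done
```

So a 15 ms decode time only just fits at 60 fps and cannot keep up at 120 fps, whatever the resolution.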
My solution was to use my own hardware decoder, which I wrote based on the Jetson decode example. With it I could get all the performance that Nvidia advertises.
The jetson-ffmpeg implementation seemed to be fine, so the decode unit handling on the Moonlight side might be the problem.
Mind sharing some details? Specifically, what files did you modify and what example did you base it off of? I'm willing to give it a shot, but it's been years since I've touched anything other than Python.
Edit: I just found the example from nvidia. NvVideoDecoder, correct?
https://docs.nvidia.com/jetson/l4t-multimedia/l4t_mm_00_video_decode.html This is the example. As for the code, you need to find the place where decode units are sent to the ffmpeg decoder.
The long term approach is to get rid of the special Tegra-specific Moonlight packages by using a standard interface like https://github.com/cyndis/vaapi-tegra-driver and shipping a single arm64 package for every device.
Performance will be better with VAAPI too, since it can avoid extra copies needed by the nvmpi+SDL2 backend by mapping the decoded frame as a texture and rendering it via GLES. Moonlight already has this code today, and it should "just work" when L4T has a suitable VAAPI driver.
Got a short term solution? I just spent a day failing to even get Moonlight Qt to build from source (lots of errors from ffmpeg, of which I am using v4.2). I managed to build the embedded version, but my skill level isn't high enough in C and C++ to try and adapt a decoder to either of them.
Edit: figured out my compile issue. Turns out my Qt SDK messed up while installing. Gonna give modifying Qt another go.
@janis8008 Is there any way you can make your decoder available? I've tried to wrap my head around everything to get this working, but I was hoping decode example would be more cut, paste, and pipe everything in.
Edit: just a note to anybody that happens upon this thread while searching. The above VAAPI driver, in its current state, doesn't seem to play nice with Moonlight. I tried it for giggles, and the test decode fails. That may be my fault though, I don't know.
You may also try building Moonlight with the new official Nvidia ffmpeg package that comes with recent versions of Jetpack rather than the third-party jetson-ffmpeg library.
I'm not sure how Nvidia has implemented the new decoders and Moonlight may need some minor changes to use them. If you post output from ffmpeg -decoders and ffmpeg -hwaccels, I can hopefully sort that out quickly.
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/multimedia.html#wwpID0EQHA
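A small sketch of how that output could be gathered and narrowed down to the relevant entries. The helper name is hypothetical, and the usage lines are left as comments because they assume ffmpeg is on the PATH; -hide_banner just suppresses the version banner.

```shell
#!/bin/sh
# Hypothetical helper: keep only the H.264/HEVC lines of `ffmpeg -decoders`.
# Reads the decoder listing on stdin.
h264_hevc_decoders() {
    grep -E 'h264|hevc'
}

# Usage (assumes ffmpeg is installed):
#   ffmpeg -hide_banner -decoders | h264_hevc_decoders
#   ffmpeg -hide_banner -hwaccels
```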
So, when I try to build using the official Nvidia ffmpeg, I get
/home/koolguy007/moonlight-qt/app/streaming/video/ffmpeg.h:44:38: error: ‘AVCodecHWConfig’ does not name a type; did you mean ‘AVCodecContext’?
I don't know why exactly. avcodec.h from the 3rd party ffmpeg and avcodec.h from Nvidia are identical according to the diff command. As for the outputs, I'll attach them. HWAccelsOutput.txt
Make sure it's using the proper FFmpeg headers. I think that's the error you'd get if it was using the headers that come with the stock libavcodec-dev on Ubuntu 18.04.
It looks like Nvidia is using a new decoder rather than a hwaccel. When you run Moonlight, you'll need to run with some environment variables set: H264_DECODER_HINT=h264_nvv4l2dec and HEVC_DECODER_HINT=hevc_nvv4l2dec
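For reference, one way to set those two variables for a single run is a small wrapper like the sketch below. The wrapper name is made up, and the binary name moonlight-qt in the usage line is an assumption based on the package name.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given command with both decoder hints set
# only for that command, leaving the rest of the shell session untouched.
run_with_nvv4l2_hints() {
    H264_DECODER_HINT=h264_nvv4l2dec \
    HEVC_DECODER_HINT=hevc_nvv4l2dec \
    "$@"
}

# Usage (binary name assumed):
#   run_with_nvv4l2_hints moonlight-qt
```

The same thing can be done inline by putting both assignments in front of the command on one line.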
So, I managed to get past that point in compiling, but I've hit another roadblock. So far I've compiled Nvidia's ffmpeg from source and installed it without installing Ubuntu's libavcodec-dev. Now I'm running into this:
/usr/local/lib/libavcodec.so: undefined reference to 'v4l2_close'
/usr/local/lib/libavcodec.so: undefined reference to 'v4l2_ioctl'
/usr/local/lib/libavcodec.so: undefined reference to 'v4l2_open'
collect2: error: ld returned 1 exit status
Google has pointed me to adding -lv4l1 -lv4l2 to the compiler flags, so I tried adding QMAKE_CXXFLAGS += -lv4l1 -lv4l2 in app.pro. I haven't seen the flags appear in the compiler output, so I guess I'm not quite sure where to add them.
Hmm, seems like you might need that on the FFmpeg build itself? Libavcodec.so should have been linked to all the libraries it needs when it was compiled.
The ffmpeg build "seemed" to compile fine. I got the -lv4l1 -lv4l2 flags working in Qt Creator, I can see them in the compile output right before the error, but no change to error status. I'm getting a dreadful feeling that something is amiss between libv4l-dev and Nvidia's ffmpeg. I'll try recompiling ffmpeg again and watch the output closely, but I would expect the compile to fail if all libraries were not ok.
Hi,
On my side, to get it to work with the official Nvidia ffmpeg, I had to add -lv4l2 at the end of the line starting with LIBS = in the Makefile.Debug or Makefile.Release in the app directory.
Passing the option using qmake didn't help, although I see -lv4l2 in the makefile files.
For info, I didn't compile ffmpeg, just used the one provided by nvidia for Jetson.
Basically:
with qmake "LIBS+=-lv4l2"
in app/Makefile.Debug, I see: LIBS = $(SUBLIBS) -lv4l2 -ldl -L(...) -lEGL
--> it doesn't work undefined reference to 'v4l2_close'
with manual modification in app/Makefile.Debug
LIBS = $(SUBLIBS) -ldl -L(...) -lEGL -lv4l2
--> it works and moonlight is compiled
I'm not familiar with Qt and I don't know how the Makefile.Debug/Release are generated and I cannot really say why it works with the second option?! any idea?
My guess is that it's linking to static libraries and their libavcodec.pc doesn't include the appropriate -lv4l2 flag
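Until the packaging is sorted out, the manual Makefile edit described above can be scripted. This is only a sketch: the helper name is hypothetical, it relies on GNU sed (-i and \b), and it has to be re-run whenever qmake regenerates the Makefile. A possible reason the position matters, offered as an assumption rather than something confirmed in this thread, is that the linker processes libraries left to right, so a -lv4l2 listed before the library that needs its symbols may not be used to resolve them.

```shell
#!/bin/sh
# Hypothetical helper: move -lv4l2 to the end of the "LIBS =" line of a
# qmake-generated Makefile, mirroring the manual edit described above.
# Usage: append_v4l2 app/Makefile.Release
append_v4l2() {
    makefile="$1"
    # First drop any existing -lv4l2 from the LIBS line (but not longer
    # names that merely start with it), then append it once at the end.
    sed -i \
        -e '/^LIBS *=/ s/ -lv4l2\b//g' \
        -e '/^LIBS *=/ s/$/ -lv4l2/' \
        "$makefile"
}
```

After running it on app/Makefile.Debug or app/Makefile.Release, run make again without re-running qmake.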
Nvidia recently updated their ffmpeg decoder. I haven't had time to test it yet, but maybe they fixed all the problems.