gstreamer shadow capture

Open totaam opened this issue 2 years ago • 7 comments

Based on #3706, provide a codec that uses a gstreamer pipeline to capture the display(s). This should work for:

Main difficulty is going to be the integration of events with the pull interface currently used by the base shadow server class.
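
One possible way to bridge a push-style source with a pull-style consumer is a single-slot buffer that only ever keeps the newest frame. This is a minimal sketch, not xpra's actual implementation: `LatestFrameSlot` and its method names are hypothetical, and the frames are opaque objects so the example runs without GStreamer.

```python
import threading
from typing import Any, Optional


class LatestFrameSlot:
    """Keeps only the most recent frame pushed by a capture thread (hypothetical helper)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._frame: Optional[Any] = None
        self._dropped = 0

    def push(self, frame: Any) -> None:
        # Producer side: would be called from the streaming thread,
        # for example from an appsink "new-sample" handler.
        with self._cond:
            if self._frame is not None:
                # The consumer has not collected the previous frame: replace it.
                self._dropped += 1
            self._frame = frame
            self._cond.notify()

    def pull(self, timeout: float = 0.0) -> Optional[Any]:
        # Consumer side: called whenever the server is ready for a new frame.
        # Returns None if nothing arrived within `timeout` seconds.
        with self._cond:
            if self._frame is None and timeout > 0:
                self._cond.wait_for(lambda: self._frame is not None, timeout)
            frame, self._frame = self._frame, None
            return frame

    @property
    def dropped(self) -> int:
        with self._cond:
            return self._dropped
```

With this shape the capture side can run at whatever rate the pipeline produces, while the consumer decides when to take a frame; stale frames are simply overwritten and counted.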

Later on we can consider doing the whole pipeline in GStreamer, which should be a lot quicker than extracting the data (assuming we can even get zero-copy buffers out of GStreamer via appsink) and then compressing it in xpra. The downside is that we would no longer control the encoder flow rate.

totaam avatar Jan 29 '23 13:01 totaam

MS Windows: dx9screencapsrc or gdiscreencapsrc

For Windows, you should use d3d11screencapturesrc. Both others are officially legacy. Not sure about Mac.

ehfd avatar Jan 30 '23 07:01 ehfd

@ehfd yes, that's the one I meant. For gdi, we might as well use our existing capture code, and dx9 offers nothing that d3d11 does not have. I have kept it in anyway, but I don't think it will be used.

So far, this seems to work pretty well, just needs more work:

  • pass display options to the capture factory, so users can specify what to capture: which monitor / geometry (and try window handle?)
  • there's a bug: the pipeline isn't stopped when the last client disconnects, wasting huge amounts of CPU
  • we set the capture framerate when creating the pipeline, which is not what we want with xpra: we want a pull model, not a push one! (not easy to fix AFAICT)
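
For the second item (the pipeline that keeps running with no clients), one option is to tie the capture lifecycle to the set of connected clients. The sketch below is only an illustration under assumptions: `CaptureLifecycle`, `add_client` and `remove_client` are hypothetical names, and the `start` / `stop` callables stand in for whatever actually starts and stops the pipeline.

```python
from typing import Callable, Set


class CaptureLifecycle:
    """Starts capture for the first client and stops it after the last one (hypothetical helper)."""

    def __init__(self, start: Callable[[], None], stop: Callable[[], None]) -> None:
        self._start = start
        # With GStreamer, `stop` would typically set the pipeline state to NULL.
        self._stop = stop
        self._clients: Set[str] = set()

    def add_client(self, client_id: str) -> None:
        first = not self._clients
        self._clients.add(client_id)
        if first:
            self._start()

    def remove_client(self, client_id: str) -> None:
        if client_id not in self._clients:
            return  # unknown or already removed: nothing to do
        self._clients.discard(client_id)
        if not self._clients:
            self._stop()


# Usage with stand-in callbacks that just record what happened:
events = []
lifecycle = CaptureLifecycle(lambda: events.append("start"), lambda: events.append("stop"))
lifecycle.add_client("client-1")
lifecycle.add_client("client-2")
lifecycle.remove_client("client-1")
lifecycle.remove_client("client-2")
```

After the calls above, the capture has been started once and stopped once, regardless of how many clients came and went in between.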

Perhaps we should use the same gstreamer pipeline for compression (at least as an option), if that gives us some zero-copy benefits. We would then lose all sorts of benefits exclusive to xpra, like video region detection, scroll encoding, etc., but it may be worth it when a hardware video encoder is available.

totaam avatar Feb 03 '23 12:02 totaam

We now have the ability to screencast existing (wayland / X11) desktop sessions that support org.freedesktop.portal.ScreenCast - via gstreamer / pipewire. Running xpra shadow should now capture the wayland desktop session / monitor selected.

To get a full remote wayland desktop (#387), we need a lot more APIs from org.freedesktop.portal.RemoteDesktop or libei.

totaam avatar Feb 07 '23 13:02 totaam

The difficulty is generating pipelines that:

pipeline warning: Internal GStreamer error: code not implemented.  Please file a bug at https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/new.
                   ../gst-libs/gst/video/gstvideofilter.c(296)
                    gst_video_filter_transform ()
                    /GstPipeline
                   pipeline8/GstVideoConvert
                   videoconvert0
invalid video buffer received

It's not clear which buffer uses what format - GST_DEBUG=4 didn't help.


Other issues:

  • framerate - perhaps use a videorate element and tune it based on packet acks?
  • don't use video mode if mmap is available!
  • add option to enable the video mode with x11 shadow server via ximagesrc, using syntax like xpra shadow :100,stream=true?
  • propagate settings and client options: bitrate, b-frames?, fps
  • enable error-resiliency flags when quic is used: #3376
  • colorspace is wrong - nesting the view shows it is too dark
  • x264enc doesn't honour the quality we set? (no such problems with our own x264 encoder..)
  • handle client decoding errors by restarting the pipeline and / or changing to a different / safer encoder?
  • the pipelines take time to start - perhaps send a start-of-stream draw packet to the client?
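
The `:100,stream=true` syntax suggested above could be split into a display name and a dictionary of options along these lines. This is a sketch only: `parse_display_options` and `is_enabled` are hypothetical helpers, not existing xpra functions, and the real option parsing may well differ.

```python
from typing import Dict, Tuple


def parse_display_options(arg: str) -> Tuple[str, Dict[str, str]]:
    """Split 'DISPLAY,key=value,...' into the display and its options (hypothetical helper)."""
    display, *rest = arg.split(",")
    options: Dict[str, str] = {}
    for item in rest:
        if not item:
            continue  # tolerate a trailing comma
        key, sep, value = item.partition("=")
        # A bare key such as "stream" is treated as "stream=true".
        options[key.strip()] = value.strip() if sep else "true"
    return display, options


def is_enabled(options: Dict[str, str], key: str, default: bool = False) -> bool:
    """Interpret an option value as a boolean flag."""
    value = options.get(key)
    if value is None:
        return default
    return value.lower() in ("1", "true", "yes", "on")


display, options = parse_display_options(":100,stream=true")
use_video_stream = is_enabled(options, "stream")
```

The same dictionary could also carry the other settings listed above (bitrate, fps, ...), so that they reach the capture factory in one place.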

Here are some useful links to tune this further:

  • gstpipewire source: https://github.com/PipeWire/pipewire/blob/master/src/gst/gstpipewiresrc.c
  • if available, selkies uses nvh264enc: https://github.com/selkies-project/selkies-gstreamer/blob/11006e393b9b7af81ab2b01d82a0092167da3e87/src/selkies_gstreamer/gstwebrtc_app.py#LL200C30-L200C39 via a cudaupload and cudaconvert - are there any cases where pipewire can give us a GPU buffer?
  • otherwise they use x264enc: https://github.com/selkies-project/selkies-gstreamer/blob/11006e393b9b7af81ab2b01d82a0092167da3e87/src/selkies_gstreamer/gstwebrtc_app.py#LL309C32-L309C32 With: threads=4 bframes=0 key-int-max=0 byte-stream=True tune=zerolatency speed-preset=veryfast bitrate={} and high profile.
  • or even vp8 / vp9 vpxenc: https://github.com/selkies-project/selkies-gstreamer/blob/11006e393b9b7af81ab2b01d82a0092167da3e87/src/selkies_gstreamer/gstwebrtc_app.py#L348 With: threads=4 cpu-used=8 deadline=1 error-resilient=partitions keyframe-max-dist=10 auto-alt-ref=True target-bitrate={}
  • gnome-shell uses these pipelines: https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/c57f4a1c73413fcf3f84de448b46f66e9cb186a7/js/dbusServices/screencast/screencastService.js#L31
capsfilter caps=video/x-raw(memory:DMABuf),max-framerate=%F/1 ! \
             glupload ! glcolorconvert ! gldownload ! \
             queue ! \
             vp8enc cpu-used=16 max-quantizer=17 deadline=1 keyframe-mode=disabled threads=%T static-threshold=1000 buffer-size=20000 ! \
             queue ! \
             webmmux
capsfilter caps=video/x-raw,max-framerate=%F/1 ! \
             videoconvert chroma-mode=none dither=none matrix-mode=output-only n-threads=%T ! \
             queue ! \
             vp8enc cpu-used=16 max-quantizer=17 deadline=1 keyframe-mode=disabled threads=%T static-threshold=1000 buffer-size=20000 ! \
             queue ! \
             webmmux

We don't need the muxer, or the queue?

  • desktopcast uses: https://github.com/seijikun/desktopcast/blob/9ae61739cedce078d197011f770f8e94d9a9a8b2/src/stream_server.rs#L162
videoconvert !
queue leaky=2 !
x264enc threads={} tune=zerolatency speed-preset=2 bframes=0 !
video/x-h264,profile=high !
queue !
rtph264pay name=pay0 pt=96
  • other projects hardcode pipelines similar to: pipewiresrc do-timestamp=true ! vaapipostproc ! queue ! vaapih264enc

totaam avatar May 11 '23 11:05 totaam

@totaam Our configuration for the encoder isn't really optimized. Any findings on your side?

ehfd avatar Aug 04 '23 10:08 ehfd

@ehfd not yet optimized either, see https://github.com/Xpra-org/xpra/issues/3706#issuecomment-1665252432
The current thinking is that some kind of manual setup / testing / validation may be needed to get the best out of hardware encoders, because of how much variation there is between OS / drivers / GPUs / ... We will try to make it easier to do that.

totaam avatar Aug 04 '23 11:08 totaam

@m1k1o Adding Neko to the discussion.

ehfd avatar Aug 04 '23 11:08 ehfd