rpi-ffmpeg
Could I extract 10bit luma data from AV_PIX_FMT_YUV420P10 HEVC?
Hello!
I have started a new project and am looking for a way to transmit 10- (or 12-)bit monochromatic depth camera data. One way I thought of would be 10-bit HEVC using avcodec.
I am sending from a Ubuntu-PC and receiving on a RPi4b.
On the sender side I am using AV_PIX_FMT_YUV420P10 at the moment. This gives me:
ffprobe video_dump.h265
Stream #0:0: Video: hevc (Rext), yuv420p10le(tv), 512x424, 30 fps, 30 tbr, 1200k tbn, 30 tbc
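For reference, a minimal sketch of what the sender-side encoder setup producing a stream like the one above could look like (libx265 is an assumption on my part; error handling omitted):
/* Hypothetical sender-side setup for a 10-bit HEVC stream matching the
 * ffprobe output above. Assumes libx265; error checks omitted. */
#include <libavcodec/avcodec.h>

const AVCodec *codec = avcodec_find_encoder_by_name("libx265");
AVCodecContext *enc = avcodec_alloc_context3(codec);
enc->width     = 512;
enc->height    = 424;
enc->time_base = (AVRational){1, 30};
enc->framerate = (AVRational){30, 1};
enc->pix_fmt   = AV_PIX_FMT_YUV420P10; /* 10 bits per sample; depth data goes in the luma plane */
avcodec_open2(enc, codec, NULL);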
But I'm unsure how I can extract the luma channel on the receiving end on the RPi. I looked in the avcodec source code and found a function:
int av_rpi_sand_to_planar_frame(AVFrame * const dst, const AVFrame * const src)
But that crashes with a segfault when receiving the stream described above.
I am not fixed on getting this straight into OpenGL; I'm hoping I'll be OK taking the luma pixels out to CPU memory, as the resolution is not super high.
Cheers!
Wait! I just realized I had forgotten to do
tmpDst = av_frame_alloc();
on the destination AVFrame!
Now the call goes through and reports success! :)
int av_rpi_sand_to_planar_frame(AVFrame * const dst, const AVFrame * const src)
Exciting! Fingers crossed, I will take a look at the data now!
Cheers!
I have extracted the YUV planes from the HEVC P030 format using av_hwframe_transfer_data(). This function in hwcontext_drm.c calls drm_transfer_data_from(), which, if SAND is configured for ffmpeg, will call av_rpi_sand30_to_planar_y16(). I have written reversed av_rpi_sand30_to_planar_y16() and av_rpi_sand30_to_planar_c16() functions to yield back a planar16 format, and used jc-kynesim's drmu package to load the pixels directly into the framebuffer. It's sloooow. Transferring back using av_hwframe_transfer_data() is not possible since no HEVC encoder is available; following examples like vaapi_transcode and vaapi_encode also doesn't work, as there is no code to handle pool generation for DRMPRIME. I added some rudimentary code which does this, but then you run into needing code to reverse planar16 back to SAND30, which does not exist. It's easier to load the pixels into the framebuffer.
The reason for all this was to see whether, if I ported the libplacebo dovi code to drmu and added the ffmpeg dovi bits to rpi-ffmpeg 4.3.1/drmprime1 and rpi-ffmpeg/dev/4.4/rpi-import1, it would work. It works, but is super slow. Even the libplacebo dovi code, which uses shaders, is slow on the Pi 4. It would be nice if shaders for SAND30 existed, or if MESA could finish that support, if it's even possible. I suspect, though, that the Pi 4 hardware isn't up to handling this.
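For anyone following along, here is a minimal sketch of the receive-side extraction path described above (dec_ctx is an assumed, already-opened rpi-ffmpeg HEVC decoder returning hardware frames; error handling omitted):
/* Sketch: pull a decoded SAND frame and let av_hwframe_transfer_data()
 * convert it to an ordinary planar frame; the 10-bit luma then sits in
 * dst->data[0] as one uint16_t per pixel. Note dst must be allocated
 * with av_frame_alloc() first (the segfault mentioned earlier). */
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

AVFrame *src = av_frame_alloc();
AVFrame *dst = av_frame_alloc();

if (avcodec_receive_frame(dec_ctx, src) == 0) {
    /* leaving dst->format unset picks the transfer path's default
     * output format (planar16 for SAND30, as described above) */
    if (av_hwframe_transfer_data(dst, src, 0) == 0) {
        const uint16_t *luma = (const uint16_t *)dst->data[0];
        int stride = dst->linesize[0] / 2;      /* linesize is in bytes */
        /* ... luma[y * stride + x] holds one 10-bit sample ... */
    }
    av_frame_unref(src);
}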
Well the good news is that someone is looking into enabling GL SAND30 texture import - whether or not that will turn out to be successful or fast enough I don't know, but I wouldn't hold your breath. Don't use the 4.3.1/drmprime_1 branch - it's obsolete - current is test/4.3.4/rpi_main. I've lost understanding of why you want to construct SAND30 frames (it's a terrible format for most CPU-driven ops)? Given current h/w capabilities the only reason I can think of would be if you want HDR output via DRM? I don't think our GL allows HDR output currently???
Well the good news is that someone is looking into enabling GL SAND30 texture import - whether or not that will turn out to be successful or fast enough I don't know, but I wouldn't hold your breath.
That is good news.
Don't use the 4.3.1/drmprime_1 branch - it's obsolete - current is test/4.3.4/rpi_main.
Thank you.
I've lost understanding of why you want to construct SAND30 frames (it's a terrible format for most CPU-driven ops)? Given current h/w capabilities the only reason I can think of would be if you want HDR output via DRM?
I want to output Dolby-encoded video through DRM, not using shaders. I've managed to grab the planar output from av_hwframe_transfer_data()
and run it through the ported Dolby code. It works, but as I mentioned, CPU processing is slow.
I don't think our GL allows HDR output currently???
Ah but it can. I've modded mesa code and added the HDR bits.
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 87ea2bdb3de..a10cf09693a 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -912,7 +912,7 @@ dri2_setup_screen(_EGLDisplay *disp)
1 << __DRI_API_GLES2 |
1 << __DRI_API_GLES3;
}
-
+
disp->ClientAPIs = 0;
if ((api_mask & (1 <<__DRI_API_OPENGL)) && _eglIsApiValid(EGL_OPENGL_API))
disp->ClientAPIs |= EGL_OPENGL_BIT;
@@ -941,7 +941,19 @@ dri2_setup_screen(_EGLDisplay *disp)
if (dri2_renderer_query_integer(dri2_dpy,
__DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB))
disp->Extensions.KHR_gl_colorspace = EGL_TRUE;
-
+ if (disp->Extensions.KHR_gl_colorspace) {
+ disp->Extensions.EXT_gl_colorspace_bt2020_linear = EGL_TRUE;
+ disp->Extensions.EXT_gl_colorspace_bt2020_pq = EGL_TRUE;
+ disp->Extensions.EXT_gl_colorspace_display_p3 = EGL_TRUE;
+ disp->Extensions.EXT_gl_colorspace_display_p3_linear = EGL_TRUE;
+ disp->Extensions.EXT_gl_colorspace_scrgb = EGL_TRUE;
+ disp->Extensions.EXT_gl_colorspace_scrgb_linear = EGL_TRUE;
+
+ disp->Extensions.EXT_surface_SMPTE2086_metadata = EGL_TRUE;
+ disp->Extensions.EXT_surface_CTA861_3_metadata = EGL_TRUE;
+ }
if (dri2_dpy->image_driver ||
(dri2_dpy->dri2 && dri2_dpy->dri2->base.version >= 3) ||
(dri2_dpy->swrast && dri2_dpy->swrast->base.version >= 3)) {
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index decf1182067..2b4abbb7674 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -475,7 +475,7 @@ _eglAppendExtension(char **str, const char *ext)
* the driver's Extensions string.
*/
static void
-_eglCreateExtensionsString(_EGLDisplay *disp)
+_eglCreateExtensionsString(_EGLDisplay *disp)
{
#define _EGL_CHECK_EXTENSION(ext) \
do { \
@@ -510,12 +510,18 @@ _eglCreateExtensionsString(_EGLDisplay *disp)
_EGL_CHECK_EXTENSION(KHR_cl_event2);
_EGL_CHECK_EXTENSION(KHR_config_attribs);
_EGL_CHECK_EXTENSION(KHR_create_context);
_EGL_CHECK_EXTENSION(KHR_create_context_no_error);
_EGL_CHECK_EXTENSION(KHR_fence_sync);
_EGL_CHECK_EXTENSION(KHR_get_all_proc_addresses);
_EGL_CHECK_EXTENSION(KHR_gl_colorspace);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_bt2020_linear);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_bt2020_pq);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_display_p3);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_display_p3_linear);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_scrgb);
+ _EGL_CHECK_EXTENSION(EXT_gl_colorspace_scrgb_linear);
_EGL_CHECK_EXTENSION(KHR_gl_renderbuffer_image);
_EGL_CHECK_EXTENSION(KHR_gl_texture_2D_image);
_EGL_CHECK_EXTENSION(KHR_gl_texture_3D_image);
diff --git a/src/egl/main/egldefines.h b/src/egl/main/egldefines.h
index c925e0ca553..89eb58d02d8 100644
--- a/src/egl/main/egldefines.h
+++ b/src/egl/main/egldefines.h
@@ -38,7 +38,7 @@
extern "C" {
#endif
-#define _EGL_MAX_EXTENSIONS_LEN 1000
+#define _EGL_MAX_EXTENSIONS_LEN 1500
/* Hardcoded, conservative default for EGL_LARGEST_PBUFFER,
* this is used to implement EGL_LARGEST_PBUFFER.
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index aee9f86a699..3f5ba19e6e3 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -126,6 +126,12 @@ struct _egl_extensions
EGLBoolean KHR_fence_sync;
EGLBoolean KHR_get_all_proc_addresses;
EGLBoolean KHR_gl_colorspace;
+ EGLBoolean EXT_gl_colorspace_bt2020_linear;
+ EGLBoolean EXT_gl_colorspace_bt2020_pq;
+ EGLBoolean EXT_gl_colorspace_display_p3;
+ EGLBoolean EXT_gl_colorspace_display_p3_linear;
+ EGLBoolean EXT_gl_colorspace_scrgb;
+ EGLBoolean EXT_gl_colorspace_scrgb_linear;
EGLBoolean KHR_gl_renderbuffer_image;
EGLBoolean KHR_gl_texture_2D_image;
EGLBoolean KHR_gl_texture_3D_image;
@@ -145,7 +151,7 @@ struct _egl_extensions
EGLBoolean MESA_query_driver;
EGLBoolean NOK_swap_region;
EGLBoolean NV_post_sub_buffer;
diff --git a/src/egl/main/eglsurface.c b/src/egl/main/eglsurface.c
index 9167b9b7eed..2fc6f099f08 100644
--- a/src/egl/main/eglsurface.c
+++ b/src/egl/main/eglsurface.c
@@ -78,6 +78,42 @@ _eglParseSurfaceAttribList(_EGLSurface *surf, const EGLint *attrib_list)
break;
}
switch (val) {
+ case EGL_GL_COLORSPACE_BT2020_LINEAR_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_bt2020_linear) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
+ case EGL_GL_COLORSPACE_BT2020_PQ_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_bt2020_pq) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
+ case EGL_GL_COLORSPACE_DISPLAY_P3_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_display_p3) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
+ case EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_display_p3_linear) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
+ case EGL_GL_COLORSPACE_SCRGB_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_scrgb) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
+ case EGL_GL_COLORSPACE_SCRGB_LINEAR_EXT:
+ if (!disp->Extensions.EXT_gl_colorspace_scrgb_linear) {
+ err = EGL_BAD_MATCH;
+ break;
+ }
+ break;
case EGL_GL_COLORSPACE_SRGB_KHR:
case EGL_GL_COLORSPACE_LINEAR_KHR:
break;
This works and is 10-bit bit-perfect with our pattern generator software.
void ofxRPI4Window::HDRWindowSetup()
{
if (!DestroyWindow())
{
ofLogError() << "GBM: Failed to deinitialize GBM";
}
gbmDevice = gbm_create_device(device);
if (!gbmDevice)
{
ofLogError() << "GBM: - failed to create device: " << gbmDevice;
}
if (ofxRPI4Window::bit_depth == 10) {
if ((strcmp(mode.name, "4096x2160") == 0 || strcmp(mode.name, "3840x2160") == 0) && mode_vrefresh(&mode) >= 30) {
//for (int i=0;i<connector->count_modes;i++) {
// mode = connector->modes[i];
// if (strcmp(mode.name, "3840x2160") == 0 && mode_vrefresh(&mode) == 30) {
// ofxRPI4Window::mode_idx = i;
//break;
//}
//}
mode = MODE_4K_10bit;// mode_3840x2160_30;
//mode = mode_4096x2160_30;
ofLogError() << "DRM: - Detected 4k mode > 30Hz...changed resolution to " << mode.hdisplay << "x" << mode.vdisplay << "@" << mode_vrefresh(&mode) <<"Hz";
}
}
#if 1
#if defined(HAS_GBM_MODIFIERS)
if (num_modifiers > 0)
{
gbmSurface = gbm_surface_create_with_modifiers(gbmDevice, (uint32_t)mode.hdisplay, (uint32_t)mode.vdisplay, GBM_FORMAT_ABGR2101010, modifiers,
num_modifiers);
}
#endif
if (!gbmSurface)
{
gbmSurface = gbm_surface_create(gbmDevice, (uint32_t)mode.hdisplay, (uint32_t)mode.vdisplay,GBM_FORMAT_ABGR2101010,
GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
}
if (!gbmSurface)
{
ofLogError() << "GBM: - failed to create surface: " << strerror(errno);
} else {
ofLog() << "GBM: - created surface with size " << mode.hdisplay << "x" << mode.vdisplay << " and " << ((*modifiers >= 0) ? "modifier " : "no modifier ") << hex << ((*modifiers >= 0) ? *modifiers : 0);
}
free(modifiers);
#else
gbmSurface = gbm_surface_create(gbmDevice, (uint32_t)mode.hdisplay, (uint32_t)mode.vdisplay, GBM_FORMAT_ABGR2101010, GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
if (!gbmSurface)
{
ofLogError() << "GBM: - failed to create surface: " << strerror(errno);
} else {
ofLog() << "GBM: - created surface with size " << mode.hdisplay << "x" << mode.vdisplay;
}
#endif
display = gbm_get_display(gbmDevice);
if (!display)
{
auto error = eglGetError();
ofLogError() << "display ERROR: " << eglErrorString(error);
}
int major, minor;
if (!eglInitialize(display, &major, &minor))
{
auto error = eglGetError();
ofLogError() << "initialize ERROR: " << eglErrorString(error);
}
eglBindAPI(EGL_OPENGL_ES_API);
EGLint count = 0;
EGLint matched = 0;
int config_index = -1;
if (!eglGetConfigs(display, NULL, 0, &count) || count < 1)
{
ofLogError() << "No EGL configs to choose from";
}
ofLog() <<"EGL has " << count << " configs";
EGLConfig *configs = (EGLConfig *)malloc(count * sizeof *configs);
// EGLConfig configs[count];
EGLint configAttribs[] = {
EGL_RED_SIZE,10,
EGL_GREEN_SIZE,10,
EGL_BLUE_SIZE,10,
EGL_ALPHA_SIZE,2,
EGL_DEPTH_SIZE,24,
EGL_BUFFER_SIZE,32,
EGL_STENCIL_SIZE,8,
EGL_SAMPLES,0,
EGL_SAMPLE_BUFFERS,0,
// EGL_BIND_TO_TEXTURE_RGBA,EGL_TRUE,
// EGL_BIND_TO_TEXTURE_RGB,EGL_FALSE,
// EGL_CONFIG_CAVEAT,EGL_NON_CONFORMANT_CONFIG,
EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT_KHR, //| EGL_OPENGL_ES3_BIT,
EGL_COLOR_COMPONENT_TYPE_EXT, EGL_COLOR_COMPONENT_TYPE_FIXED_EXT, //EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT,
EGL_NONE
};
EGLint visualId = GBM_FORMAT_ABGR2101010;
if (ofGetLogLevel() == 0) PrintConfigs(display);
EGLConfig config = NULL;
if (!eglChooseConfig(display, configAttribs, configs, count, &matched) || !matched)
{
printf("No EGL configs with appropriate attributes.\n");
}
if (config_index == -1)
{
config_index = match_config_to_visual(display,
visualId,
configs,
matched);
}
if (config_index != -1)
{
config = configs[config_index];
}
free(configs);
const EGLint contextAttribs[] = {
EGL_CONTEXT_MAJOR_VERSION, 3, //update to version 3.0, previously 2
EGL_CONTEXT_MINOR_VERSION, 1,
EGL_NONE
};
if(config)
{
context = eglCreateContext(display, config, EGL_NO_CONTEXT, contextAttribs);
if (!context)
{
auto error = eglGetError();
ofLogError() << "context ERROR: " << eglErrorString(error);
}
const char *client_extensions = eglQueryString(display, EGL_EXTENSIONS);
if (strstr(client_extensions, "EGL_EXT_gl_colorspace_bt2020_pq"))
{
ofLog() << "EGL_GL_COLORSPACE_BT2020_PQ_EXT available\n";
} else {
ofLogError() << "EGL_GL_COLORSPACE_BT2020_PQ_EXT not available\n";
}
if (strstr(client_extensions, "EGL_KHR_gl_colorspace")) {
ofLog() << "EGL_GL_COLORSPACE_KHR available\n";
} else {
ofLogError() << "EGL_GL_COLORSPACE_KHR not available\n";
}
if (hdr_primaries == 1) {
if (static_cast<int>(eotf) == 2) {
EGLint attribs[] = {EGL_GL_COLORSPACE_KHR,EGL_GL_COLORSPACE_BT2020_PQ_EXT,EGL_NONE };
EGL_create_surface(attribs, config);
} else {
EGLint attribs[] = {EGL_GL_COLORSPACE_KHR,EGL_GL_COLORSPACE_BT2020_LINEAR_EXT,EGL_NONE };
EGL_create_surface(attribs, config);
}
}
if (hdr_primaries == 2 || hdr_primaries == 0) {
EGLint attribs[] = {EGL_GL_COLORSPACE_KHR,EGL_GL_COLORSPACE_DISPLAY_P3_LINEAR_EXT,EGL_NONE }; //linear Display-P3 color space is assumed, with a corresponding GL_FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING value of GL_LINEAR
// EGLint attribs[] = {EGL_GL_COLORSPACE_KHR,EGL_GL_COLORSPACE_DISPLAY_P3_EXT,EGL_NONE }; //non-linear, sRGB encoded Display-P3 color space is assumed, with a corresponding GL_FRAME-BUFFER_ATTACHMENT_COLOR_ENCODING value of GL_SRGB.
EGL_create_surface(attribs, config);
}
#if 1
eglSurfaceAttrib(display, surface, SurfaceAttribs[0],EGLint(DisplayChromacityList[hdr_primaries].RedX * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[1],EGLint(DisplayChromacityList[hdr_primaries].RedY * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[2],EGLint(DisplayChromacityList[hdr_primaries].GreenX * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[3],EGLint(DisplayChromacityList[hdr_primaries].GreenY * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[4],EGLint(DisplayChromacityList[hdr_primaries].BlueX * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[5],EGLint(DisplayChromacityList[hdr_primaries].BlueY * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[6],EGLint(DisplayChromacityList[hdr_primaries].WhiteX * EGL_METADATA_SCALING_EXT));
eglSurfaceAttrib(display, surface, SurfaceAttribs[7],EGLint(DisplayChromacityList[hdr_primaries].WhiteY * EGL_METADATA_SCALING_EXT));
// eglSurfaceAttrib(display, surface, SurfaceAttribs[8],EGLint(10000.0f * 10000.0f));
// eglSurfaceAttrib(display ,surface, SurfaceAttribs[9],EGLint(0.001f * 10000.0f));
#endif
eglSurfaceAttrib(display, surface, SurfaceAttribs[8], hdr_metadata.hdmi_metadata_type1.max_display_mastering_luminance); //EGL_SMPTE2086_MAX_LUMINANCE_EXT
eglSurfaceAttrib(display, surface, SurfaceAttribs[9], hdr_metadata.hdmi_metadata_type1.min_display_mastering_luminance); //EGL_SMPTE2086_MIN_LUMINANCE_EXT
eglSurfaceAttrib(display, surface, SurfaceAttribs[10], hdr_metadata.hdmi_metadata_type1.max_cll); //EGL_CTA861_3_MAX_CONTENT_LIGHT_LEVEL_EXT
eglSurfaceAttrib(display, surface, SurfaceAttribs[11], hdr_metadata.hdmi_metadata_type1.max_fall); //EGL_CTA861_3_MAX_FRAME_AVERAGE_LEVEL_EXT
if (!surface)
{
auto error = eglGetError();
ofLogError() << "surface ERROR: " << eglErrorString(error);
}
currentRenderer.reset();
currentRenderer = make_shared<ofGLProgrammableRenderer>(this);
makeCurrent();
static_cast<ofGLProgrammableRenderer*>(currentRenderer.get())->setup(3,1);
if (avi_info.output_format != 0 && shader_init) {
rgb2ycbcr_shader();
}
if (is_std_DoVi && shader_init) {
if (colorspace_on) {
dovi_pattern_shader();
} else {
dovi_image_shader();
}
}
EGL_info();
ofLog() << "GBM: - initialized GBM";
} else {
ofLogError() << "RIP";
}
}
If you are outputting Dolby Video to the TV my understanding is that you need dynamic metadata attached to the frame - as it stands (unless you've done more mods) our DRM layer is capable of static HDR metadata, but not dynamic. (I've got non-Dolby HDR working via direct DRM)
If you want example code that outputs HDR directly via DRM look at my drmu project (hello_drmu is a trivial video player).
If you want example code that outputs HDR directly via DRM look at my drmu project (hello_drmu is a trivial video player).
I already output HDR metadata through DRM, plus dovi standard and LLDV infoframes (with a modded kernel DRM vc4). I had to add code to the kernel to parse the Dolby VSDB, dynamic HDR, and Dolby vendor-specific infoframes.
static void drm_parse_cea_ext(struct drm_connector *connector,
const struct edid *edid)
{
struct drm_display_info *info = &connector->display_info;
const u8 *edid_ext;
int i, start, end;
edid_ext = drm_find_cea_extension(edid);
if (!edid_ext)
return;
info->cea_rev = edid_ext[1];
/* The existence of a CEA block should imply RGB support */
info->color_formats = DRM_COLOR_FORMAT_RGB444;
/* CTA DisplayID Data Block does not have byte #3 */
if (edid_ext[0] == CEA_EXT) {
if (edid_ext[3] & EDID_CEA_YCRCB444)
info->color_formats |= DRM_COLOR_FORMAT_YCBCR444;
if (edid_ext[3] & EDID_CEA_YCRCB422)
info->color_formats |= DRM_COLOR_FORMAT_YCBCR422;
}
if (cea_db_offsets(edid_ext, &start, &end))
return;
for_each_cea_db(edid_ext, i, start, end) {
const u8 *db = &edid_ext[i];
if (cea_db_is_hdmi_vsdb(db))
drm_parse_hdmi_vsdb_video(connector, db);
if (cea_db_is_hdmi_forum_vsdb(db))
drm_parse_hdmi_forum_vsdb(connector, db);
if (cea_db_is_microsoft_vsdb(db))
drm_parse_microsoft_vsdb(connector, db);
if (cea_db_is_y420cmdb(db))
drm_parse_y420cmdb_bitmap(connector, db);
if (cea_db_is_vcdb(db))
drm_parse_vcdb(connector, db);
if (cea_db_is_hdmi_hdr_metadata_block(db))
drm_parse_hdr_metadata_block(connector, db);
if (cea_db_is_hdmi_hdr_dynamic_metadata_block(db))
drm_parse_hdmi_hdr_dynamic_metadata_block(connector, db);
if (cea_db_is_hdmi_hdr10_plus_vsdb(db))
drm_parse_hdmi_hdr10_plus_vsdb(connector, db);
if (cea_db_is_hdmi_dolby_vsdb(db))
drm_parse_hdmi_dolby_vsdb(connector, db);
// if (cea_db_is_hdmi_colorimetry_data_block(db))
// drm_parse_colorimetry_data_block(connector, db);
}
}
void ofxRPI4Window::updateHDR_Infoframe(hdmi_eotf eotf, int idx)
{
bool ok;
uint64_t blob_id = 0;
ofLog() << "DRM: Setting HDR infoframe";
/*
ok = drm_mode_get_property(device, connectorId, DRM_MODE_OBJECT_CONNECTOR, "DOVI_OUTPUT_METADATA", &prop_id, &blob_id, &prop);
if (!ok) {
ofLogError() << "Unable to find DOVI_OUTPUT_METADATA";
} else {
if (blob_id) {
drmModeDestroyPropertyBlob(device, blob_id);
blob_id = 0;
}
}
*/
ok = drm_mode_get_property(device, connectorId, DRM_MODE_OBJECT_CONNECTOR, "HDR_OUTPUT_METADATA", &prop_id, &blob_id, &prop);
if (!ok) {
ofLogError() << "Unable to find HDR_OUTPUT_METADATA";
} else {
if (blob_id)
drmModeDestroyPropertyBlob(device, blob_id);
blob_id = 0;
struct drm_hdr_output_metadata meta;
if (static_cast<int>(eotf) == 3) {
meta.metadata_type = HDMI_STATIC_METADATA_TYPE1;
meta.hdmi_metadata_type1.eotf = eotf;
meta.hdmi_metadata_type1.metadata_type = HDMI_STATIC_METADATA_TYPE1;
meta.hdmi_metadata_type1.display_primaries[0].x = 0;
meta.hdmi_metadata_type1.display_primaries[0].y = 0;
meta.hdmi_metadata_type1.display_primaries[1].x = 0;
meta.hdmi_metadata_type1.display_primaries[1].y = 0;
meta.hdmi_metadata_type1.display_primaries[2].x = 0;
meta.hdmi_metadata_type1.display_primaries[2].y = 0;
meta.hdmi_metadata_type1.white_point.x = 0;
meta.hdmi_metadata_type1.white_point.y = 0;
meta.hdmi_metadata_type1.max_display_mastering_luminance = 0;
meta.hdmi_metadata_type1.min_display_mastering_luminance = 0;
meta.hdmi_metadata_type1.max_fall = 0;
meta.hdmi_metadata_type1.max_cll = 0;
} else {
meta.metadata_type = HDMI_STATIC_METADATA_TYPE1;
meta.hdmi_metadata_type1.eotf = eotf;
meta.hdmi_metadata_type1.metadata_type = HDMI_STATIC_METADATA_TYPE1;
meta.hdmi_metadata_type1.display_primaries[0].x = std::round(DisplayChromacityList[idx].GreenX * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.display_primaries[0].y = std::round(DisplayChromacityList[idx].GreenY * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.display_primaries[1].x = std::round(DisplayChromacityList[idx].BlueX * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.display_primaries[1].y = std::round(DisplayChromacityList[idx].BlueY * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.display_primaries[2].x = std::round(DisplayChromacityList[idx].RedX * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.display_primaries[2].y = std::round(DisplayChromacityList[idx].RedY * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.white_point.x = std::round(DisplayChromacityList[idx].WhiteX * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.white_point.y = std::round(DisplayChromacityList[idx].WhiteY * EGL_METADATA_SCALING_EXT);
meta.hdmi_metadata_type1.max_display_mastering_luminance = (uint16_t)((float)hdr_metadata.hdmi_metadata_type1.max_display_mastering_luminance);// * 10000.0f);//(uint16_t)(10000.0f * 10000.0f);
meta.hdmi_metadata_type1.min_display_mastering_luminance = (uint16_t)((float)(hdr_metadata.hdmi_metadata_type1.min_display_mastering_luminance/10000.0f) * 10000.0f);//(uint16_t)(0.001f * 10000.0f);
meta.hdmi_metadata_type1.max_fall = (float)hdr_metadata.hdmi_metadata_type1.max_fall;
meta.hdmi_metadata_type1.max_cll = (float)hdr_metadata.hdmi_metadata_type1.max_cll;
}
drmModeCreatePropertyBlob(device, &meta, sizeof(meta), (uint32_t*)&blob_id);
first_req = 1; // allocate for atomic requests
last_req = 1; // commit previous atomic requests
drm_mode_atomic_set_property(device, req, "HDR_OUTPUT_METADATA", connectorId, prop_id, blob_id, prop, DRM_MODE_ATOMIC_ALLOW_MODESET);
}
}
void ofxRPI4Window::updateDoVi_Infoframe(int enable, int dv_interface)
{
bool ok;
uint64_t blob_id = 0;
ofLog() << "DRM: Setting DoVi infoframe";
ok = drm_mode_get_property(device, connectorId, DRM_MODE_OBJECT_CONNECTOR, "DOVI_OUTPUT_METADATA", &prop_id, &blob_id, &prop);
if (!ok) {
ofLogError() << "Unable to find DOVI_OUTPUT_METADATA";
} else {
if (blob_id)
drmModeDestroyPropertyBlob(device, blob_id);
blob_id = 0;
if (!enable && !dv_interface) {
first_req = 1; // allocate for atomic requests
last_req = 1; // commit previous atomic requests
drm_mode_atomic_set_property(device, req, "DOVI_OUTPUT_METADATA", connectorId, prop_id, blob_id, prop, DRM_MODE_ATOMIC_ALLOW_MODESET);
return;
}
struct dovi_output_metadata dovi;
if (dv_interface == 1)
dovi.oui = 0x000C03;
else if (dv_interface == 2)
dovi.oui = 0x00D046;
dovi.dv_status = enable; //set to 1 to enable dovi infoframe
dovi.dv_interface = dv_interface;
dovi.backlight_metadata = 0;
dovi.backlight_max_luminance = 0;
dovi.aux_runmode = 0;
dovi.aux_version = 0;
dovi.aux_debug = 0;
drmModeCreatePropertyBlob(device, &dovi, sizeof(dovi), (uint32_t*)&blob_id);
first_req = 1; // allocate for atomic requests
last_req = 1; // commit previous atomic requests
drm_mode_atomic_set_property(device, req, "DOVI_OUTPUT_METADATA", connectorId, prop_id, blob_id, prop, DRM_MODE_ATOMIC_ALLOW_MODESET);
}
}
│ │ └───Properties
│ │ ├───"EDID" (immutable): blob = 346
│ │ ├───"DPMS": enum {On, Standby, Suspend, Off} = On
│ │ ├───"link-status": enum {Good, Bad} = Good
│ │ ├───"non-desktop" (immutable): range [0, 1] = 0
│ │ ├───"TILE" (immutable): blob = 0
│ │ ├───"CRTC_ID" (atomic): object CRTC = 103
│ │ ├───"Colorspace": enum {Default, SMPTE_170M_YCC, BT709_YCC, XVYCC_601, XVYCC_709, SYCC_601, opYCC_601, opRGB, BT2020_CYCC, BT2020_RGB, BT2020_YCC, DCI-P3_RGB_D65, DCI-P3_RGB_Theater} = BT2020_RGB
│ │ ├───"available output formats" (immutable): bitmask {RGB444, YCbCr444, YCbCr422, YCbCr420} = (RGB444 | YCbCr444 | YCbCr422)
│ │ ├───"output format": enum {RGB444, YCbCr444, YCbCr422, YCbCr420} = YCbCr444
│ │ ├───"left margin": range [0, 100] = 0
│ │ ├───"right margin": range [0, 100] = 0
│ │ ├───"top margin": range [0, 100] = 0
│ │ ├───"bottom margin": range [0, 100] = 0
│ │ ├───"max bpc": range [8, 12] = 10
│ │ ├───"active pixel rate": range [0, UINT32_MAX] = 185625000
│ │ ├───"active color format": enum {RGB444, YCbCr444, YCbCr422, YCbCr420} = YCbCr444
│ │ ├───"HDR_OUTPUT_METADATA": blob = 350
│ │ │ │ ├───"type" = 0
│ │ │ │ ├───"eotf" = 2
│ │ │ │ ├───"metadata_type" = 0
│ │ │ │ ├───"display_primaries_r_x" = 34000
│ │ │ │ ├───"display_primaries_r_y" = 16000
│ │ │ │ ├───"display_primaries_g_x" = 13250
│ │ │ │ ├───"display_primaries_g_y" = 34500
│ │ │ │ ├───"display_primaries_b_x" = 7500
│ │ │ │ ├───"display_primaries_b_y" = 3000
│ │ │ │ ├───"white_point_x" = 15635
│ │ │ │ ├───"white_point_y" = 16450
│ │ │ │ ├───"max_display_mastering_luminance" = 4014
│ │ │ │ ├───"min_display_mastering_luminance" = 1
│ │ │ │ ├───"max_cll" = 1000
│ │ │ │ └───"max_fall" = 400
│ │ ├───"DOVI_OUTPUT_METADATA": blob = 0
│ │ └───"Broadcast RGB": enum {Automatic, Full, Limited 16:235} = Full
Dolby video needs the reshaping algorithms and ICtCp transform matrices to get proper color output; the coefficients are gleaned from the RPU metadata in the AVFrame. For drmu I wrote a dovi_rpu module that grabs the dovi RPU metadata from modded rpi-ffmpeg (with the dovi bits added) and processes the metadata to reshape/decode the Dolby video.
drmu_dovi.h
/**
* Dolby Vision metadata description
*/
enum dovi_reshape_method_t
{
DOVI_RESHAPE_POLYNOMIAL = 0,
DOVI_RESHAPE_MMR = 1,
};
enum dovi_nlq_method_t
{
DOVI_NLQ_NONE = -1,
DOVI_NLQ_LINEAR_DZ = 0,
};
struct matrix3x3 {
float m[3][3];
};
// Represents an affine transformation, which is basically a 3x3 matrix
// together with a column vector to add onto the output.
struct transform3x3 {
struct matrix3x3 mat;
float c[3];
};
typedef struct video_dovi_metadata_t
{
/* Common header fields */
uint8_t coef_log2_denom;
uint8_t bl_bit_depth;
uint8_t el_bit_depth;
enum dovi_nlq_method_t nlq_method_idc;
/* Colorspace metadata */
float nonlinear_offset[3];
struct matrix3x3 nonlinear_matrix;//[9];
struct matrix3x3 linear_matrix;//[9];
uint16_t source_min_pq; /* 12-bit PQ values */
uint16_t source_max_pq;
/**
* Do not reorder or modify the following structs, they are intentionally
* specified to be identical to AVDOVIReshapingCurve / AVDOVINLQParams.
*/
struct dovi_reshape_t {
uint8_t num_pivots;
float pivots[9];
enum dovi_reshape_method_t mapping_idc[8];
uint8_t poly_order[8];
float poly_coeffs[8][3];
uint8_t mmr_order[8];
float mmr_constant[8];
float mmr_coeffs[8][3][7];
} curves[3];
struct dovi_nlq_t {
uint8_t offset_depth; /* bit depth of offset value */
uint16_t offset;
uint64_t hdr_in_max;
uint64_t dz_slope;
uint64_t dz_threshold;
} nlq[3];
} video_dovi_metadata_t;
void map_dovi_metadata(video_dovi_metadata_t *out, const AVDOVIMetadata *data);
void dovi_decode(AVFrame *tmp_frame, const video_dovi_metadata_t *data, int x, int y);
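A rough sketch of how these two entry points are meant to be driven per frame (the side-data lookup assumes the dovi bits added to rpi-ffmpeg expose AV_FRAME_DATA_DOVI_METADATA the way upstream FFmpeg does; this per-pixel loop is exactly what makes CPU processing so slow):
/* Sketch only: map the frame's DoVi RPU side data once, then reshape
 * every pixel in place using the functions declared above. */
#include <libavutil/frame.h>
#include <libavutil/dovi_meta.h>
#include "drmu_dovi.h"

static void dovi_process_frame(AVFrame *frame)
{
    AVFrameSideData *sd =
        av_frame_get_side_data(frame, AV_FRAME_DATA_DOVI_METADATA);
    if (!sd)
        return; /* no Dolby Vision metadata on this frame */

    video_dovi_metadata_t meta;
    map_dovi_metadata(&meta, (const AVDOVIMetadata *)sd->data);

    for (int y = 0; y < frame->height; y++)
        for (int x = 0; x < frame->width; x++)
            dovi_decode(frame, &meta, x, y);
}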
It may be an exercise in futility, but I wanted to see if it was possible to bypass GL and pump decoded Dolby video directly into DRM.
Libplacebo proved GL works for Dolby video using shaders: it grabs the AVFrame pixel data and creates a texture which is then processed in shaders. Slow on the Pi 4, but it works. I want to do the same but through DRM.
BTW I like what you did with drmu.
As I understand it, if you can simply extract the dolby metadata from the frame (which you've done) and attach it to the frame in the right bit of HDMI metadata, it should all just work - or are you saying you also need to rewrite the frame on the fly as well as outputting the appropriate metadata? If the latter then you/we are probably stuffed - you don't have enough time to do even the simplest transform on 4kp60 video on the CPU.
If the former then conceptually this is trivial - either reuse the existing DRM HDR metadata property or create a new one, add it to the commit with the rest of the frame, and everything works. However, as it stands, changing HDR metadata causes a modeset (I think in the DRM framework code), which is definitely not wanted for Dolby, so some tweaking would be required there. The driver also needs to get it attached to the right frame, which I understand may not be trivial (from when I asked about this).
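Conceptually, the per-frame attach with plain libdrm atomic calls might look like the sketch below (drm_fd, connector_id and prop_id are assumed handles, with prop_id the connector's HDR_OUTPUT_METADATA property found earlier via drmModeObjectGetProperties(); untested, and the modeset issue above still applies):
/* Sketch, not tested: attach HDR metadata to the same atomic commit as
 * the video frame, so driver and frame stay in step. */
#include <xf86drm.h>
#include <xf86drmMode.h>

struct hdr_output_metadata meta = { 0 }; /* fill from the frame's side data */
uint32_t blob_id = 0;

drmModeCreatePropertyBlob(drm_fd, &meta, sizeof(meta), &blob_id);
drmModeAtomicReq *req = drmModeAtomicAlloc();
drmModeAtomicAddProperty(req, connector_id, prop_id, blob_id);
/* ... AddProperty() calls for the plane/framebuffer of this frame ... */
drmModeAtomicCommit(drm_fd, req, DRM_MODE_ATOMIC_NONBLOCK, NULL);
drmModeAtomicFree(req);
drmModeDestroyPropertyBlob(drm_fd, blob_id);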
Do keep me updated with your progress - if we can get Dolby Video working that would be super.
Having looked at your patches a bit harder - isn't it just a matter of dropping the dovi header into the dovi blob? I assume not as you wouldn't be asking me all this if it was that simple?
Having looked at your patches a bit harder - isn't it just a matter of dropping the dovi header into the dovi blob? I assume not as you wouldn't be asking me all this if it was that simple?
I wish, but no. The dovi blob merely sends a vendor-specific infoframe to set the Dolby Vision badge on the display. An LLDV infoframe will set the DV badge, but any Dolby content still has to be decoded, which is extremely slow on the Pi 4 CPU (and GPU for that matter, with shaders). A Dolby STANDARD infoframe requires the content to have the embedded RPU (VDR) color metadata inserted in the first 2 scanlines, as described in the Dolby patent available online. With this VSIF, the correct AVI infoframe, these dovi RPU pixels embedded, and the correct even/odd pixel value format, you get video - this is known as "RGB tunneling". It is how we generate Dolby Vision patterns for our pattern generator to do Dolby display calibration.
If the latter then you/we are probably stuffed - you don't have enough time to do even the simplest transform on 4kp60 video on the CPU.
I think this will end up being the case, unfortunately.
We can turn on the Dolby Vision badge by sending the infoframes with the modded DRM code, but correctly viewing Dolby content still requires decoding the metadata in the AVFrame to reshape the color of each frame.
For pattern generation, using the STANDARD dolby infoframe, we can embed the dovi color RPU; I haven't tried that on video though.
I'm thinking it would still need the color reshaping, plus it only works for 1080p patterns, and 8-bit, not 4K AFAIK. I've gotten 10-bit patterns to work, but you can see the dolby RPU metadata on the top 2 scanlines since it is in 8-bit.
Dolby documentation is very limited, so most of this has been done by reverse engineering.
I'll keep you posted though.
but any Dolby content still has to be decoded, which is extremely slow on the Pi 4 CPU
Can you explain roughly what this involves? Does it require processing per pixel for the decoded video frame? If so roughly how many operations, and what is it doing this for?
Can you explain roughly what this involves? Does it require processing per pixel for the decoded video frame? If so roughly how many operations, and what is it doing this for?
It is per-pixel processing. You can see it requires multiple operations; it is basically a rudimentary software Dolby metadata decoder, required to get the correct colors displayed from Dolby content.
This function pulls the dovi metadata from the AVFrame side data:
void map_dovi_metadata(video_dovi_metadata_t *out,
const AVDOVIMetadata *data )
{
const AVDOVIRpuDataHeader *hdr = av_dovi_get_header( data );
const AVDOVIDataMapping *vdm = av_dovi_get_mapping( data );
const AVDOVIColorMetadata *color = av_dovi_get_color( data );
out->bl_bit_depth = hdr->bl_bit_depth;
out->el_bit_depth = hdr->el_bit_depth;
out->coef_log2_denom = hdr->coef_log2_denom;
out->nlq_method_idc = (enum dovi_nlq_method_t) vdm->nlq_method_idc;
for( size_t i = 0; i < ARRAY_SIZE( out->nonlinear_offset ); i++ )
out->nonlinear_offset[i] = av_q2d( color->ycc_to_rgb_offset[i] );
for( size_t i = 0; i < 9; i++ ) {
float *nonlinear_matrix = &out->nonlinear_matrix.m[0][0];
nonlinear_matrix[i] = av_q2d( color->ycc_to_rgb_matrix[i] );
float *linear_matrix = &out->linear_matrix.m[0][0];
linear_matrix[i] = av_q2d( color->rgb_to_lms_matrix[i] );
}
out->source_min_pq = color->source_min_pq;
out->source_max_pq = color->source_max_pq;
for (int c = 0; c < 3; c++) {
const AVDOVIReshapingCurve *csrc = &vdm->curves[c];
struct dovi_reshape_t *cdst = &out->curves[c];
cdst->num_pivots = csrc->num_pivots;
for (int i = 0; i < csrc->num_pivots; i++) {
const float scale = 1.0f / ((1 << hdr->bl_bit_depth) - 1);
cdst->pivots[i] = scale * csrc->pivots[i];
}
for (int i = 0; i < csrc->num_pivots - 1; i++) {
const float scale = 1.0f / (1 << hdr->coef_log2_denom);
cdst->mapping_idc[i] = csrc->mapping_idc[i];
switch (csrc->mapping_idc[i]) {
case DOVI_RESHAPE_POLYNOMIAL:
for (int k = 0; k < 3; k++) {
cdst->poly_coeffs[i][k] = (k <= csrc->poly_order[i])
? scale * csrc->poly_coef[i][k]
: 0.0f;
}
break;
case DOVI_RESHAPE_MMR:
cdst->mmr_order[i] = csrc->mmr_order[i];
cdst->mmr_constant[i] = scale * csrc->mmr_constant[i];
for (int j = 0; j < csrc->mmr_order[i]; j++) {
for (int k = 0; k < 7; k++)
cdst->mmr_coeffs[i][j][k] = scale * csrc->mmr_coef[i][j][k];
}
break;
}
}
}
// assert(sizeof(out->curves) == sizeof(vdm->curves));
assert(sizeof(out->nlq) == sizeof(vdm->nlq));
// memcpy(out->curves, vdm->curves, sizeof(out->curves));
memcpy(out->nlq, vdm->nlq, sizeof(out->nlq));
for( size_t i = 0; i < ARRAY_SIZE( out->curves ); i++)
assert( out->curves[i].num_pivots <= ARRAY_SIZE( out->curves[i].pivots ));
}
This takes the dovi metadata and does the color reshaping.
void dovi_decode(AVFrame *out_frame, const video_dovi_metadata_t *data, int x, int y)
{
double mul[3] = { 0.0, 0.0, 0.0 };
double black[3] = { 0.0, 0.0, 0.0 };
float color[4] = {0.0,0.0,0.0,1.0};
int xpos=0, ypos=0;
xpos = x;
ypos = y;
// Y component
color[0] = (out_frame->data[0][out_frame->linesize[0]*ypos + xpos] << 8 | out_frame->data[0][out_frame->linesize[0]*ypos + xpos + 1] & 0xff);///65535;
// U, V components
xpos /= 2;
ypos /= 2;
color[1] = (out_frame->data[1][out_frame->linesize[1]*ypos + xpos] << 8 | out_frame->data[1][out_frame->linesize[1]*ypos + xpos + 1] & 0xff);///65535;
color[2] = (out_frame->data[2][out_frame->linesize[2]*ypos + xpos] << 8 | out_frame->data[2][out_frame->linesize[2]*ypos + xpos + 1] & 0xff);///65535;
dovi_reshape(data,color);
// Represents an affine transformation, which is basically a 3x3 matrix
// together with a column vector to add onto the output.
struct transform3x3 tr = { .mat = data->nonlinear_matrix };
double scale = (1LL << 10) / ((1LL << 10) - 1.0);
for (int i = 0; i < 3; i++) {
mul[i] = 1.0;
black[i] = data->nonlinear_offset[i] * scale;
}
// Multiply in the texture multiplier and adjust `c` so that black[j] keeps
// on mapping to RGB=0 (black to black)
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
tr.mat.m[i][j] *= mul[j];
tr.c[i] -= tr.mat.m[i][j] * black[j];
}
}
transform3x3_apply(&tr,color);
// Dolby Vision always outputs BT.2020-referred HPE LMS, so hard-code
// the inverse LMS->RGB matrix corresponding to this color space.
struct matrix3x3 dovi_lms2rgb = {{
{ 3.06441879, -2.16597676, 0.10155818},
{-0.65612108, 1.78554118, -0.12943749},
{ 0.01736321, -0.04725154, 1.03004253},
}};
matrix3x3_mul(&dovi_lms2rgb, &data->linear_matrix);
// PQ EOTF
EOTF(color);
// LMS matrix
matrix3x3_apply(&dovi_lms2rgb, color);
// PQ OETF
OETF(color);
xpos = x;
ypos = y;
out_frame->data[0][out_frame->linesize[0]*ypos + xpos] = ((int)(color[0]*65535) >> 8 )&0xff;
out_frame->data[0][out_frame->linesize[0]*ypos + xpos + 1] = (int)(color[0]*65535)&0xff;
xpos /=2;
ypos /=2;
out_frame->data[1][out_frame->linesize[1]*ypos + xpos] = ((int)(color[1]*65535) >> 8 )&0xff;
out_frame->data[1][out_frame->linesize[1]*ypos + xpos + 1] = (int)(color[1]*65535)&0xff;
out_frame->data[2][out_frame->linesize[2]*ypos + xpos] = ((int)(color[2]*65535) >> 8 )&0xff;
out_frame->data[2][out_frame->linesize[2]*ypos + xpos + 1] = (int)(color[2]*65535)&0xff;
}
These are the functions that do the reshaping. Some are GLSL built-ins I had to recreate in C.
float dot_product(float v[], float u[], int n)
{
float result = 0.0;
for (int i = 0; i < n; i++)
result += v[i]*u[i];
return result;
}
void matrix3x3_apply(const struct matrix3x3 *mat, float vec[3])
{
float x = vec[0], y = vec[1], z = vec[2];
for (int i = 0; i < 3; i++)
vec[i] = mat->m[i][0] * x + mat->m[i][1] * y + mat->m[i][2] * z;
}
void matrix3x3_mul(struct matrix3x3 *a, const struct matrix3x3 *b)
{
float a00 = a->m[0][0], a01 = a->m[0][1], a02 = a->m[0][2],
a10 = a->m[1][0], a11 = a->m[1][1], a12 = a->m[1][2],
a20 = a->m[2][0], a21 = a->m[2][1], a22 = a->m[2][2];
for (int i = 0; i < 3; i++) {
a->m[0][i] = a00 * b->m[0][i] + a01 * b->m[1][i] + a02 * b->m[2][i];
a->m[1][i] = a10 * b->m[0][i] + a11 * b->m[1][i] + a12 * b->m[2][i];
a->m[2][i] = a20 * b->m[0][i] + a21 * b->m[1][i] + a22 * b->m[2][i];
}
}
float *vec_mul(float *a, float *b, int size, float *result)
{
//static float result[] = {0.0};
for (int i=0; i < size; i++) {
result[i] = a[i] * b[i];
}
return result;
}
void transform3x3_apply(const struct transform3x3 *t, float vec[3])
{
matrix3x3_apply(&t->mat, vec);
for (int i = 0; i < 3; i++)
vec[i] += t->c[i];
}
void mix ( float x[4], float y[4], float a, int n, float result[4] )
{
//static float result[4] = {0.0};
for (int i = 0; i < n; i++)
result[i]=(1-a)*x[i] + a*y[i];
// return result;
}
double fsel( double a, double b, double c ) {
return a >= 0 ? b : c;
}
static inline double clamp ( double a, double min, double max )
{
a = fsel( a - min , a, min );
return fsel( a - max, max, a );
}
static inline void reshape_mmr(float *s, float sig[3], float coeffs[4], float mmr[6][4], bool single,
int min_order, int max_order)
{
unsigned int order;
unsigned int mmr_idx;
float *sig2;//sig2[3];
sig2=(float[]){0.0,0.0,0.0};
float sigX[4];
float *sigX2;//sigX2[4];
sigX2=(float[]){0.0,0.0,0.0,0.0};
float tempsig[3];
//float sig[3] = {0.0,0.0,0.0};
if (single) {
mmr_idx = 0u;
} else {
mmr_idx = (unsigned int)coeffs[1];
}
assert(min_order <= max_order);
if (min_order < max_order)
order = (unsigned int)coeffs[3];
sigX[0] = sig[0] * sig[1];
sigX[1] = sig[0] * sig[2];
sigX[2]= sig[1] * sig[2];
sigX[3] = sigX[0] * sig[2];
*s = coeffs[0];
*s += dot_product(mmr[mmr_idx + 0],sig,3);
*s += dot_product(mmr[mmr_idx + 1],sigX,4);
if (max_order >= 2) {
// if (min_order < 2) {
// if (order >= 2) {
vec_mul(sig,sig,3, sig2);
vec_mul(sigX,sigX,4, sigX2);
*s += dot_product(mmr[mmr_idx + 2],sig2,3);
*s += dot_product(mmr[mmr_idx + 3],sigX2,4);
if (max_order == 3) {
*s += dot_product(mmr[mmr_idx + 4],vec_mul(sig2,sig,3,tempsig),3);
*s += dot_product(mmr[mmr_idx + 5],vec_mul(sigX2,sigX,4,tempsig),4);
// if (min_order < 3)
// }
}
// if (min_order < 2)
// }
}
}
static inline void reshape_poly(float* s, float coeffs[4])
{
*s = (coeffs[2]* *s + coeffs[1]) * *s + coeffs[0];
}
void dovi_reshape(const video_dovi_metadata_t *data, float color[4])
{
if (!data)
return;
float sig[3];
//float color[4] = {Y/1023., U/1023., V/1023.,1.0};
float s=0.0;
sig[0] = clamp(color[0], 0.0, 1.0);
sig[1] = clamp(color[1], 0.0, 1.0);
sig[2] = clamp(color[2], 0.0, 1.0);
//float pivots[9] ={0.0};
float coeffs_data[8][4] = {{0.0}};
float mmr_packed_data[8*6][4] = {{0.0}};
float *coeffs;
coeffs=(float[]){0.0,0.0,0.0,0.0};
for (int c = 0; c < 3; c++) {
const struct dovi_reshape_t *curves = &data->curves[c];
if (!curves->num_pivots)
continue;
assert(curves->num_pivots >= 2 && curves->num_pivots <= 9);
s = sig[c];
// Prepare coefficients
bool has_poly = false, has_mmr = false, mmr_single = true;
int mmr_idx = 0, min_order = 3, max_order = 1;
memset(coeffs_data, 0, sizeof(coeffs_data));
for (int i = 0; i < curves->num_pivots - 1; i++) {
const float scale = 1.0f / (1 << data->coef_log2_denom);
switch (curves->mapping_idc[i]) {
case DOVI_RESHAPE_POLYNOMIAL: // polynomial
has_poly = true;
coeffs_data[i][3] = 0.0; // order=0 signals polynomial
for (int k = 0; k < 3; k++)
coeffs_data[i][k] = curves->poly_coeffs[i][k];
break;
case DOVI_RESHAPE_MMR:
min_order = MIN(min_order, curves->mmr_order[i]);
max_order = MAX(max_order, curves->mmr_order[i]);
mmr_single = !has_mmr;
has_mmr = true;
coeffs_data[i][3] = (float) curves->mmr_order[i];
coeffs_data[i][0] = scale * curves->mmr_constant[i];
coeffs_data[i][1] = (float) mmr_idx;
for (int j = 0; j < curves->mmr_order[i]; j++) {
// store weights per order as two packed vec4s
float *mmr = &mmr_packed_data[mmr_idx][0];
mmr[0] = curves->mmr_coeffs[i][j][0];
mmr[1] = curves->mmr_coeffs[i][j][1];
mmr[2] = curves->mmr_coeffs[i][j][2];
mmr[3] = 0.0; // unused
mmr[4] = curves->mmr_coeffs[i][j][3];
mmr[5] = curves->mmr_coeffs[i][j][4];
mmr[6] = curves->mmr_coeffs[i][j][5];
mmr[7] = curves->mmr_coeffs[i][j][6];
mmr_idx += 2;
}
break;
default:
unreachable();
break;
}
}
if (curves->num_pivots > 2) {
// Skip the (irrelevant) lower and upper bounds
float pivots_data[7] = {0.0};
memcpy(pivots_data, curves->pivots + 1,
(curves->num_pivots - 2) * sizeof(pivots_data[0]));
// Fill the remainder with a quasi-infinite sentinel pivot
for (size_t i = curves->num_pivots - 2; i < ARRAY_SIZE(pivots_data); i++)
pivots_data[i] = 1e9f;
// memcpy(pivots,pivots_data,sizeof(pivots_data));
// memcpy(coeffs,coeffs_data,sizeof(coeffs_data));
coeffs = coeffs_data[7];
mix(coeffs, coeffs_data[6], (s < pivots_data[6]),4, coeffs);
mix(coeffs, coeffs_data[5], (s < pivots_data[5]),4, coeffs);
mix(coeffs, coeffs_data[4], (s < pivots_data[4]),4, coeffs);
mix(coeffs, coeffs_data[3], (s < pivots_data[3]),4, coeffs);
mix(coeffs, coeffs_data[2], (s < pivots_data[2]),4, coeffs );
mix(coeffs, coeffs_data[1], (s < pivots_data[1]),4, coeffs);
mix(coeffs, coeffs_data[0], (s < pivots_data[0]),4, coeffs);
} else {
// No need for a single pivot, just set the coeffs directly
memcpy(coeffs,coeffs_data[0],sizeof(coeffs_data[0]));
}
if (has_mmr) {
// memcpy(mmr, mmr_packed_data,sizeof(mmr_packed_data[0]));
}
if (has_mmr && has_poly) {
if (coeffs[3] == 0.0) { // coeffs[3] holds the MMR order; 0 signals polynomial
reshape_poly(&s,coeffs);
} else {
reshape_mmr(&s, sig, coeffs, mmr_packed_data, mmr_single, min_order, max_order);
}
} else if (has_poly) {
reshape_poly(&s,coeffs);
} else {
assert(has_mmr);
reshape_mmr(&s, sig, coeffs, mmr_packed_data, mmr_single, min_order, max_order);
}
float lo = curves->pivots[0];
float hi = curves->pivots[curves->num_pivots - 1];
color[c] = clamp(s, lo, hi);
}
}
void EOTF(float color[3])
{
// Common constants for SMPTE ST.2084 (PQ)
static const double m1 = 2610.0 / 4096 * 1./4,
m2 = 2523.0 / 4096 * 128,
c1 = 3424.0 / 4096,
c2 = 2413.0 / 4096 * 32,
c3 = 2392.0 / 4096 * 32;
for (int i=0; i < 3; i++) {
double Em2 = pow(MAX(0, color[i]), 1 / m2);
color[i] = pow(MAX(0, Em2 - c1) / (c2 - c3 * Em2), 1 / m1);
}
}
void OETF(float color[3])
{
// Common constants for SMPTE ST.2084 (PQ)
static const double m1 = 2610.0 / 4096 * 1./4,
m2 = 2523.0 / 4096 * 128,
c1 = 3424.0 / 4096,
c2 = 2413.0 / 4096 * 32,
c3 = 2392.0 / 4096 * 32;
for (int i=0; i < 3; i++) {
if (color[i] > 0) {
double Ym1 = pow(color[i], m1);
color[i] = pow((c1 + c2 * Ym1) / (1 + c3 * Ym1), m2);
} else {
color[i] = 0;
}
}
}
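In math terms, the two loops above implement the SMPTE ST.2084 (PQ) transfer pair. A LaTeX rendering of what they compute, with E' the PQ-coded signal, Y the normalized linear value, and the constants m1, m2, c1, c2, c3 as defined at the top of each function:
Y = \left( \frac{\max\left( E'^{1/m_2} - c_1,\; 0 \right)}{c_2 - c_3 \, E'^{1/m_2}} \right)^{1/m_1} \quad \text{(EOTF)}

E' = \left( \frac{c_1 + c_2 \, Y^{m_1}}{1 + c_3 \, Y^{m_1}} \right)^{m_2} \quad \text{(OETF, the inverse EOTF)}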
Again, this is code mostly ported from libplacebo: I took the shaders it creates for dolby processing and converted them to C. I ran the code against a shader debugger using the libplacebo shaders for dovi, and the pixel values come out the same for the different dovi profiles I tested.
Thanks for that
Am I right that if you output a Dolby standard header & encode the appropriate data in the picture you don't need to manipulate the rest of the data or is some other processing still required? Sorry if I'm failing to understand what you've said - you know a lot more about this than me and things that are obvious to you are still new to me.
Looks like Dolby are deliberately making it hard for people to do easy implementations. - I can't see any way that we could setup our h/w to get all the bottom bits right as is required by that patent. Do you happen to have timings for how long it takes Pi4 shaders to process a frame (ignoring all other frame import/export related sloth)?
[Having said the above I wonder if we could in fact characterise the HVS conversions s.t. we could manipulate the input data to get known bottom bits in the output at the expense of some minor fidelity in the top 3 lines - we could afford quite a lot of processing to get 3 lines right if it saved us processing the whole frame]
Sorry for the late response.
Am I right that if you output a Dolby standard header & encode the appropriate data in the picture you don't need to manipulate the rest of the data or is some other processing still required?
Short answer: no. The metadata embedded in a NAL unit of a dolby video stream tells the dolby decoder how to reshape the picture. The input metadata bitstream contains:
Parsing RPU file...
{
"dovi_profile": 8,
"header": {
"rpu_nal_prefix": 25,
"rpu_type": 2,
"rpu_format": 18,
"vdr_rpu_profile": 1,
"vdr_rpu_level": 0,
"vdr_seq_info_present_flag": true,
"chroma_resampling_explicit_filter_flag": false,
"coefficient_data_type": 0,
"coefficient_log2_denom": 23,
"vdr_rpu_normalized_idc": 1,
"bl_video_full_range_flag": false,
"bl_bit_depth_minus8": 2,
"el_bit_depth_minus8": 2,
"vdr_bit_depth_minus_8": 4,
"spatial_resampling_filter_flag": false,
"reserved_zero_3bits": 0,
"el_spatial_resampling_filter_flag": false,
"disable_residual_flag": true,
"vdr_dm_metadata_present_flag": true,
"use_prev_vdr_rpu_flag": false,
"prev_vdr_rpu_id": 0,
"vdr_rpu_id": 0,
"mapping_color_space": 0,
"mapping_chroma_format_idc": 0,
"num_pivots_minus_2": [
0,
0,
0
],
"pred_pivot_value": [
[
0,
1023
],
[
0,
1023
],
[
0,
1023
]
],
"nlq_method_idc": null,
"nlq_num_pivots_minus2": null,
"nlq_pred_pivot_value": null,
"num_x_partitions_minus1": 0,
"num_y_partitions_minus1": 0
},
"rpu_data_mapping": {
"mapping_idc": [
[
0
],
[
0
],
[
0
]
],
"mapping_param_pred_flag": [
[
false
],
[
false
],
[
false
]
],
"num_mapping_param_predictors": [
[
0
],
[
0
],
[
0
]
],
"diff_pred_part_idx_mapping_minus1": [
[],
[],
[]
],
"poly_order_minus1": [
[
0
],
[
0
],
[
0
]
],
"linear_interp_flag": [
[
false
],
[
false
],
[
false
]
],
"pred_linear_interp_value_int": [
[],
[],
[]
],
"pred_linear_interp_value": [
[],
[],
[]
],
"poly_coef_int": [
[
[
0,
1
]
],
[
[
0,
1
]
],
[
[
0,
1
]
]
],
"poly_coef": [
[
[
0,
0
]
],
[
[
0,
0
]
],
[
[
0,
0
]
]
],
"mmr_order_minus1": [
[],
[],
[]
],
"mmr_constant_int": [
[],
[],
[]
],
"mmr_constant": [
[],
[],
[]
],
"mmr_coef_int": [
[],
[],
[]
],
"mmr_coef": [
[],
[],
[]
]
},
"vdr_dm_data": {
"compressed": false,
"affected_dm_metadata_id": 0,
"current_dm_metadata_id": 0,
"scene_refresh_flag": 1,
"ycc_to_rgb_coef0": 9574,
"ycc_to_rgb_coef1": 0,
"ycc_to_rgb_coef2": 13802,
"ycc_to_rgb_coef3": 9574,
"ycc_to_rgb_coef4": -1540,
"ycc_to_rgb_coef5": -5348,
"ycc_to_rgb_coef6": 9574,
"ycc_to_rgb_coef7": 17610,
"ycc_to_rgb_coef8": 0,
"ycc_to_rgb_offset0": 16777216,
"ycc_to_rgb_offset1": 134217728,
"ycc_to_rgb_offset2": 134217728,
"rgb_to_lms_coef0": 7222,
"rgb_to_lms_coef1": 8771,
"rgb_to_lms_coef2": 390,
"rgb_to_lms_coef3": 2654,
"rgb_to_lms_coef4": 12430,
"rgb_to_lms_coef5": 1300,
"rgb_to_lms_coef6": 0,
"rgb_to_lms_coef7": 422,
"rgb_to_lms_coef8": 15962,
"signal_eotf": 65535,
"signal_eotf_param0": 0,
"signal_eotf_param1": 0,
"signal_eotf_param2": 0,
"signal_bit_depth": 12,
"signal_color_space": 0,
"signal_chroma_format": 0,
"signal_full_range_flag": 1,
"source_min_pq": 62,
"source_max_pq": 3696,
"source_diagonal": 42,
"cmv29_metadata": {
"num_ext_blocks": 5,
"ext_metadata_blocks": [
{
"Level1": {
"min_pq": 2,
"max_pq": 3383,
"avg_pq": 819
}
},
{
"Level2": {
"target_max_pq": 2081,
"trim_slope": 2048,
"trim_offset": 2048,
"trim_power": 2048,
"trim_chroma_weight": 1462,
"trim_saturation_gain": 2132,
"ms_weight": 512
}
},
{
"Level4": {
"anchor_pq": 0,
"anchor_power": 0
}
},
{
"Level5": {
"active_area_left_offset": 0,
"active_area_right_offset": 0,
"active_area_top_offset": 0,
"active_area_bottom_offset": 0
}
},
{
"Level6": {
"max_display_mastering_luminance": 4000,
"min_display_mastering_luminance": 50,
"max_content_light_level": 0,
"max_frame_average_light_level": 0
}
}
]
}
},
"rpu_data_crc32": 3537106722
}
The metadata is broken into the following parts:
• Dolby Vision specific metadata (i.e. composer (220) prediction coefficients)
"coefficient_log2_denom": 23,
"vdr_dm_metadata_present_flag": true,
"num_pivots_minus_2": [
0,
0,
0
],
"pred_pivot_value": [
[
0,
1023
],
[
0,
1023
],
[
0,
1023
]
],
"nlq_method_idc": null,
"nlq_num_pivots_minus2": null,
"nlq_pred_pivot_value": null,
"num_x_partitions_minus1": 0,
"num_y_partitions_minus1": 0
},
"rpu_data_mapping": {
"mapping_idc": [
[
0
],
[
0
],
[
0
]
],
"mapping_param_pred_flag": [
[
false
],
[
false
],
[
false
]
],
"num_mapping_param_predictors": [
[
0
],
[
0
],
[
0
]
],
"diff_pred_part_idx_mapping_minus1": [
[],
[],
[]
],
"poly_order_minus1": [
[
0
],
[
0
],
[
0
]
],
"linear_interp_flag": [
[
false
],
[
false
],
[
false
]
],
"pred_linear_interp_value_int": [
[],
[],
[]
],
"pred_linear_interp_value": [
[],
[],
[]
],
"poly_coef_int": [
[
[
0,
1
]
],
[
[
0,
1
]
],
[
[
0,
1
]
]
],
"poly_coef": [
[
[
0,
0
]
],
[
[
0,
0
]
],
[
[
0,
0
]
]
],
"mmr_order_minus1": [
[],
[],
[]
],
"mmr_constant_int": [
[],
[],
[]
],
"mmr_constant": [
[],
[],
[]
],
"mmr_coef_int": [
[],
[],
[]
],
"mmr_coef": [
[],
[],
[]
]
},
"rgb_to_lms_coef0": 7222,
"rgb_to_lms_coef1": 8771,
"rgb_to_lms_coef2": 390,
"rgb_to_lms_coef3": 2654,
"rgb_to_lms_coef4": 12430,
"rgb_to_lms_coef5": 1300,
"rgb_to_lms_coef6": 0,
"rgb_to_lms_coef7": 422,
"rgb_to_lms_coef8": 15962,
• Static metadata as defined in SMPTE 2086 in Ref. [7].
"ycc_to_rgb_coef0": 9574,
"ycc_to_rgb_coef1": 0,
"ycc_to_rgb_coef2": 13802,
"ycc_to_rgb_coef3": 9574,
"ycc_to_rgb_coef4": -1540,
"ycc_to_rgb_coef5": -5348,
"ycc_to_rgb_coef6": 9574,
"ycc_to_rgb_coef7": 17610,
"ycc_to_rgb_coef8": 0,
"ycc_to_rgb_offset0": 16777216,
"ycc_to_rgb_offset1": 134217728,
"ycc_to_rgb_offset2": 134217728,
**IPTPQ_YCCtoRGB_coeff Matrix**        **normalized**                           **dovi profile 8.1 BT2020 coeff**
 9574/9574     0/9574  13802/9574       1    0            1.53948193             0.262711   0.677998   0.059291
 9574/9574 -1540/9574  -5348/9574  ===> 1   -0.183100063 -0.457697932  ===>     -0.14283   -0.36861    0.511434
 9574/9574 17610/9574      0/9574       1    1.814184249  0                      0.511434  -0.47031   -0.04113
**IPTPQ_YCCtoRGB_offset Matrix** (profile 8 uses a 1<<28 scalar)       **dovi profile 8.1 offsets**
 16777216/268435456   134217728/268435456   134217728/268435456  ===>  0.0625, 0.5, 0.5
• Dynamic scene-based metadata (e.g., as may be defined in WD SMPTE ST 2094).
"cmv29_metadata": {
"num_ext_blocks": 5,
"ext_metadata_blocks": [
{
"Level1": {
"min_pq": 2,
"max_pq": 3383,
"avg_pq": 819
}
},
{
"Level2": {
"target_max_pq": 2081,
"trim_slope": 2048,
"trim_offset": 2048,
"trim_power": 2048,
"trim_chroma_weight": 1462,
"trim_saturation_gain": 2132,
"ms_weight": 512
}
},
{
"Level4": {
"anchor_pq": 0,
"anchor_power": 0
}
},
{
"Level5": {
"active_area_left_offset": 0,
"active_area_right_offset": 0,
"active_area_top_offset": 0,
"active_area_bottom_offset": 0
}
},
{
"Level6": {
"max_display_mastering_luminance": 4000,
"min_display_mastering_luminance": 50,
"max_content_light_level": 0,
"max_frame_average_light_level": 0
}
For profile 8.1 dovi metadata, no polynomial reshaping is done; only unity values are applied to the luma and chroma. The static metadata, as you can see, is BT2020 for profile 8.1; for profile 8.2 it is BT709. This matrix is applied to luma and chroma. Profile 5 does apply significant polynomial reshaping.
So just sending the dolby VSIF is not enough.
Now, for what we do with display calibration: we inject only the static metadata (1024 bits) into the first 2 scanlines of an image (not video) and apply algorithms to the pixels so as to pack them as dolby RGB, basically "RGB tunneling". This only works for an image. When the display is in dolby vision mode, only patterns with the injected metadata and pixels packed as dolby RGB will be visible.
For video, however, it does require reshaping all the pixel values in a frame to get correct color reproduction, which on the Pi 4, even at 1080p, is very, very slow, even for profiles 8.1 and 8.2.
I've used CSC coefficient matrices in vc4_hdmi.c corresponding to the static metadata values, but that only works with profiles 8.1 and 8.2, since polynomial reshaping doesn't really change anything in those profiles. For any dovi content created with the other profiles, I don't see how the hardware could handle each of the steps delineated in the code I posted.
As an aside, and just FYI, the YUV coefficients in vc4_hdmi.c are not bit perfect. We've tested with RGB analyzers, and there are rounding errors: no matter what I did to adjust the matrices, the values never came out bit perfect. The workaround was to use shaders and set only an RGB full-range unity matrix for CSC for each colorspace, at least for our software implementation. This was the only way we could get bit-perfect 8-bit and 10-bit RGB, YCC444, YCC422 and dolby patterns.
Do you happen to have timings for how long it takes Pi4 shaders to process a frame (ignoring all other frame import/export related sloth)?
Lastly, eliminating all import/export latency, the compile/processing time for the dovi shader in libplacebo is the following (running in plplay, the libplacebo video player):
vec4 _main_18_2() {
    vec4 color = vec4(0.0, _const_0_2, _const_0_2, 1.0);
    // pass_read_image
    {
        vec4 tmp;
        tmp = _sub_1_2();
        color[0] = tmp[0];
        tmp = _sub_2_2();
        color[1] = tmp[0];
        color[2] = tmp[1];
    }
    // pl_shader_decode_color
    {
        color.rgb *= vec3(_const_3_2);
        // pl_shader_reshape
        {
            vec3 sig;
            vec4 coeffs;
            float s;
            sig = clamp(color.rgb, 0.0, 1.0);
            s = sig[0];
            coeffs = _coeffs_4_2;
            s = (coeffs.z * s + coeffs.y) * s + coeffs.x;
            color[0] = clamp(s, _const_5_2, _const_6_2);
            s = sig[1];
            coeffs = _coeffs_7_2;
            s = (coeffs.z * s + coeffs.y) * s + coeffs.x;
            color[1] = clamp(s, _const_8_2, _const_9_2);
            s = sig[2];
            coeffs = _coeffs_10_2;
            s = (coeffs.z * s + coeffs.y) * s + coeffs.x;
            color[2] = clamp(s, _const_11_2, _const_12_2);
        }
        color.rgb = _cmat_13_2 * color.rgb + _cmat_c_14_2;
        color.rgb = pow(max(color.rgb, 0.0), vec3(1.0/78.84375000000000000000));
        color.rgb = max(color.rgb - vec3(0.83593750000000000000), 0.0)
                  / (vec3(18.85156250000000000000) - vec3(18.68750000000000000000) * color.rgb);
        color.rgb = pow(color.rgb, vec3(1.0/0.15930175781250000000));
        color.rgb = _lms2rgb_15_2 * color.rgb;
        color.rgb = pow(max(color.rgb, 0.0), vec3(0.15930175781250000000));
        color.rgb = (vec3(0.83593750000000000000) + vec3(18.85156250000000000000) * color.rgb)
                  / (vec3(1.0) + vec3(18.68750000000000000000) * color.rgb);
        color.rgb = pow(color.rgb, vec3(78.84375000000000000000));
    }
    // pl_shader_linearize
    color.rgb = max(color.rgb, 0.0);
    color.rgb = pow(color.rgb, vec3(1.0/78.84375000000000000000));
    color.rgb = max(color.rgb - vec3(0.83593750000000000000), 0.0)
              / (vec3(18.85156250000000000000) - vec3(18.68750000000000000000) * color.rgb);
    color.rgb = pow(color.rgb, vec3(1.0/0.15930175781250000000));
    color.rgb *= vec3(49.26108374384236298501);
    return color;
}

void main() {
    out_color = _main_18_2();
}
vk->CreateDescriptorSetLayout(vk->dev, &dinfo, PL_VK_ALLOC, &pass_vk->dsLayout)
vk->CreatePipelineLayout(vk->dev, &linfo, PL_VK_ALLOC, &pass_vk->pipeLayout)
vk_compile_glsl(gpu, tmp, GLSL_SHADER_VERTEX, params->vertex_shader, &vert)
shaderc compile status 'success' (0 errors, 0 warnings)
**Spent 352.064 ms translating SPIR-V (slow!)**
vk_compile_glsl(gpu, tmp, GLSL_SHADER_FRAGMENT, params->glsl_shader, &frag)
shaderc compile status 'success' (0 errors, 0 warnings)
**Spent 1519.914 ms translating SPIR-V (slow!)**
The shader processing latency is in bold.
Spent 352.064 ms translating SPIR-V (slow!)
Spent 1519.914 ms translating SPIR-V (slow!)
I'm not sure how much time is spent pixel processing in the shader.
So, the dolby reshaping process in short is this:
- Import the Luma and 2 chroma pixel values
- Normalize the values to [0,1]
- Polynomial reshaping, i.e. polynomial coeffs used on Luma, MMR coeffs used on chroma
- Apply non-linear matrix(ycc_to_rgb coeffs, BT2020 or BT709) and ycc_rgb_offsets //this can be done in CSC
- Apply PQ
- Apply RGB_to_LMS matrix //this can be done in CSC
- Apply inverse PQ
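Condensed into code (a rough summary only, reusing the helpers posted earlier; the black-level offset compensation and the LMS-to-RGB composition from dovi_decode() above are elided):
/* Per-pixel pipeline summary; see dovi_decode() earlier in the thread
 * for the full version with offsets and the composed LMS->RGB matrix. */
static void dovi_pixel(const video_dovi_metadata_t *md, float color[4])
{
    /* color[] holds Y'CbCr already normalized to [0,1] */
    dovi_reshape(md, color);                    /* poly/MMR reshaping   */
    struct transform3x3 tr = { .mat = md->nonlinear_matrix };
    transform3x3_apply(&tr, color);             /* ycc_to_rgb + offsets */
    EOTF(color);                                /* PQ EOTF              */
    matrix3x3_apply(&md->linear_matrix, color); /* RGB -> LMS           */
    OETF(color);                                /* inverse PQ           */
}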
I don't think the pi4 is up to this unfortunately.
Many thanks for the comprehensive answer. I have to agree that it doesn't look in any way likely that a Pi is going to be up for that. I admit I was hoping that it was going to be TV-side processing rather than source-side. Ah well - at least when I say "we can't do Dolby Video" I am better informed about why. I guess this way maximises the number of licenses they get to sell.
We'd already noticed the lack of perfection in YUV translation when doing other HDR work but I don't think there's anything we can do about it - on the other hand for non-conformance grade work they are close enough.
Thanks again
You're welcome.
I was wondering if the HVS has a 12-bit or 16-bit format. Mesa has exposed 16-bit. I've written a utility to write to the HVS directly from userspace and have not been able to find a format other than what is already coded in the kernel.
I think the only >8 bit formats in HVS are 10-bit XRGB packed in 32-bit words and 10-bit YUV 4:2:0 in SAND30. But I'm not an expert on the h/w
I think the only >8 bit formats in HVS are 10-bit XRGB packed in 32-bit words and 10-bit YUV 4:2:0 in SAND30. But I'm not an expert on the h/w
I don't see any other high bit depth formats in the spec.
Thanks again.
@docdude sorry for stirring up this old issue, but I wonder if you can provide some more information on the general topic of Dolby Vision? This is mainly for my understanding more than anything else!
- I understand that in order to support DV over HDMI 1.4, we need to "tunnel" the 12-bit DV image + metadata over an RGB 8-bit video signal. Your above code essentially does this. Can we signal DV content natively in HDMI 2.0 without the need to do this tunnelling? If yes, does the DV metadata get sent through an AV infoframe stream?
- Could you encode a HEVC stream directly in DV ICtCp format? If so, would there be any per-pixel operations required (like you described above) to send this over HDMI 2.0 in DV format? If yes, can you provide a brief idea of what's required?
- Does DV require ICtCp encoding, or can we use the more common YCbCr encoding? Perhaps this is based on the levels/profiles used?