RadeonImageFilter icon indicating copy to clipboard operation
RadeonImageFilter copied to clipboard

Quality degradation of AI denoising when albedo/normals/depth are backed by D3D12 resoruces

Open BartSiwek opened this issue 3 years ago • 18 comments

While trying to integrate the RIF AI denoiser into the existing D3D12 pipeline I have stumbled upon the issue where if the albedo/normals/depth RIF images are backed by a D3D12 resource (created using the rifContextCreateImageFromDirectX12Memory) the resulting image degrades in quality. At the same time if RIF images are not resource backed (created using the rifContextCreateImage) and the same contents is copied into them via the rifImageMap/rifImageUnmap the quality is good.

Example of good quality output: goof quality Example of bad quality output (note the acne like artifacts on the bottom part of the ball and close to the plane edges): bad quality

Notes:

  • The RIF image descriptors are exactly the same in both cases
  • In both cases the context is created using rifCreateContextFromDirectX12Context(RIF_API_VERSION, d3d12_device, d3d12_queue, nullptr, &context)
  • In both cases the queue is created using the rifContextCreateCommandQueue(context, &queue)
  • This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed
  • This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed
  • The type of the backing resource does not matter - it happens both with textures and buffers
  • The "useHDR" filter parameter is set to true

BartSiwek avatar Oct 14 '21 12:10 BartSiwek

Hi @BartSiwek , Thanks. We'll solve this problem.

givinar avatar Oct 15 '21 13:10 givinar

Hi @BartSiwek, I'm unable to reproduce the problem at the moment. Could you explain some of the points? First of all, how do you get the resource memory pointer for albedo/normal/depth? I mean, are you using the DX12 capability (ID3D12Resource :: Map/Unmap) or the RIF methods (rifImageMap/rifImageUnmap). Further, I am afraid that I don't understand some of the points regarding "colorIng" and filter output.

  • "This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed"

  • "This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed"

Does this mean that to reproduce the problem, "colorImg" and filter output images must be created using the RIF API method (rifContextCreateImage)?

givinar avatar Dec 09 '21 11:12 givinar

Hi @givinar

In an ideal setup I never need to get the resource memory pointer for any of the inputs. I have D3D12 Texture Resources for all of them and I create the corresponding rifImage using the rifContextCreateImageFromDirectX12Memory. Doing so I noticed the acne like artifacts.

In the process of discovering what is causing the artifacts I switched to reading the D3D12 resources from the device back to the host memory using ID3D12Resource :: Map/Unmap, creating the rifImage's using the rifContextCreateImageFilter and providing them with data via the rifImageMap/rifImageUnmap. In this scenario acne does not appear.

My remark "This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rigImage I provide as the colorImg parameter can be either created via the rifContextCreateImageFromDirectX12Memory or via the rifContextCreateImageFilter followed by the rifImageMap/rifImageUnmap and that does not affect the output. This is not the case for albedo/depth/normals.

My remark "This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rifImage used for the output can be either created with rifContextCreateImageFromDirectX12Memory or rifContextCreateImage and that does not affect the output.

Hope this helps.

BartSiwek avatar Dec 09 '21 11:12 BartSiwek

Hi @BartSiwek May I ask you to send me some traces? To do this, you need to set your environment variables (RIF_TRACING_ENABLE to 1 and RIF_TRACING_PATH to ${SAVE_PATH}) image

After that, just run the filter twice. First, where is the bad quality (rifContextCreateImageFromDirectX12Memory). Second, with good quality (rifContextCreateImage). You should get two folders in $ {SAVE_PATH}. Could you send them to me? Thanks!

givinar avatar Dec 14 '21 08:12 givinar

Hi,

Sorry for the wait - it took me some time to dust off the code. I have the zip files with traces ready, but each one of them is about 70mb. Where do you want me to send them?

Additionally - I noticed there is a change in behavior depending on if I have the RIF_TRACING_ENABLE set to 1 or not in a scenario when the RIF images are backed by D3D12. Without RIF_TRACING_ENABLE set at all or with it set to 0 my code would run fine. When I set RIF_TRACING_ENABLE=1 I suddenly got an error from the DX12 debug layers stating that the transition from UAV state was invalid since that is not the resource current state. This was easy to fix on my side and it did not change the resulting image, but it seems you are doing different things under the hood depending on that flag being set or not.

BartSiwek avatar Dec 21 '21 18:12 BartSiwek

Hi, We can change the backend depending on the API. This can cause different behaviors and different results. Therefore, I asked you to send traces. Could you upload traces to google drive? https://drive.google.com/drive/folders/115GuyUCUiTiJPzlrl-luOHpx5Lc1vtji?usp=sharing Thanks

givinar avatar Dec 22 '21 09:12 givinar

Hi,

I uploaded the traces.

So what you are saying is that you switch back-ends based on the value of RIF_TRACING_ENABLE? Because in my case I did not change the code. The D3D12 debug layers gave me an error if RIF_TRACING_ENABLE=1 and did not give me one when RIF_TRACING_ENABLE=0.

BartSiwek avatar Dec 22 '21 12:12 BartSiwek

Hi, Thanks for the traces. I mean the backend might change depending on the API rifContextCreateImage or rifContextCreateImageFromDirectX12Memory. Enabling tracing does not affect this.

givinar avatar Dec 23 '21 06:12 givinar

Hi @BartSiwek, I found some issues in the traces.

  1. When you create an image description (rif_image_desc), you are not initializing it properly. Row pitch and slide pitch are random values. You can try something like this: rif_image_desc inputDesc = { 0 }; inputDesc.image_width = 1962; inputDesc.image_height = 1270; inputDesc.num_components = 4; inputDesc.type = RIF_COMPONENT_TYPE_FLOAT32;

  2. rifSyncronizeQueue - function is redundant.

  3. Make sure normals and depth are normalized. For this we use additional filters: ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstLo", 0.0f), RIF_SUCCESS); ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstHi", 1.0f), RIF_SUCCESS); ASSERT_EQ(rifCommandQueueAttachImageFilter(queue, remapNormalsFilter, normalsImg, normalsImg), RIF_SUCCESS);

    ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstLo", 0.0f), RIF_SUCCESS); ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstHi", 1.0f), RIF_SUCCESS); ASSERT_EQ(rifCommandQueueAttachImageFilter(queue_, remapDepthFilter, depthImg, depthImg), RIF_SUCCESS);_

  4. And the last, the input image for "rifContextCreateImageFilter" version: image_0

The input image for "rifContextCreateImageFromDirectX12Memory" version: image_0

This also applies to normal, albedo and depth.

givinar avatar Dec 23 '21 14:12 givinar

Hi,

I will try 1) and 2) and get back to you.

Regarding the others: 3) The depth and normal's are normalized since this is how we accumulate them. Depth is naturally in the [0,1] range and normal's we remap from [-1,1] to [0,1] during accumulation. 4) Yes - I noticed that too, but image corruption seems to be coming from RIF since in PIX captures these DirectX resources look fine.

BartSiwek avatar Dec 23 '21 14:12 BartSiwek

Hi,

I just double checked some thing and: Regarding 2) - removing rifSyncronizeQueue did not do anything Regarding 1) - depth, row pitch and slice pitch should not be random as I fill them in (comments added for clarity):

	const auto& textureDesc = texture.getDesc();

        // For RGBA float32 getPixelFormatBitSize returns 32
	const auto rowSize = ( getPixelFormatBitSize( textureDesc.format ) * textureDesc.width ) / 8;

	rif_image_desc rifImageDesc = {};
	rifImageDesc.image_width = textureDesc.width;
	rifImageDesc.image_height = textureDesc.height;
	rifImageDesc.image_depth = textureDesc.depth;
	rifImageDesc.image_row_pitch = rowSize;
	rifImageDesc.image_slice_pitch = rowSize * textureDesc.height;

        // For RGBA Float32 returns 4
	rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );

        // For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
	rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );  

	ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );

	rif_image rifImage = nullptr;
	CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
	return rifImage;

I also checked the values in the debugger and they are correct.

BartSiwek avatar Dec 25 '21 19:12 BartSiwek

Hi @BartSiwek As I can see in the trace log (DX12Backend) the row and slice values seem to be random numbers. API call: RIF_CALL(rifContextCreateImageFromDirectX12Memory((rif_context)(context_0), AsPtr<rif_image_desc>((uint32_t)(1962), (uint32_t)(1270), (uint32_t)(1), (uint32_t)(31392), (uint32_t)(39867840), (uint32_t)(4), RIF_COMPONENT_TYPE_FLOAT32).get(), (void*)(0x1b3a5a49640), &image_3)); Row value is 31392 and slice is 39867840. I'm not sure if these are the correct numbers. Can you set these parameters explicitly (for example 0) and rerun the DX12 Backend with tracing again?

givinar avatar Dec 30 '21 08:12 givinar

The values certainly make sense since width * component_count * size_of_component is 1920 * 4 * 4 == 31392 And rowSize * height is 1270 * 31392 == 39867840.

That I think might be the issue is this does not account for padding introduced by DirectX. I tried to switch the code to the following:

	const auto& textureDesc = texture.getDesc();

	ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );

	D3D12_RESOURCE_DESC d = d3d12res->GetDesc();

	D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint = {};
	UINT numRows = 0;
	UINT64 rowSize = 0;
	UINT64 totalBytes = 0;
	d3d12_device->GetCopyableFootprints( &d, 0, 1, 0, &footprint, &numRows, &rowSize, &totalBytes );

	rif_image_desc rifImageDesc = { 0 };
	rifImageDesc.image_width = textureDesc.width;
	rifImageDesc.image_height = textureDesc.height;
	rifImageDesc.image_depth = textureDesc.depth;
	rifImageDesc.image_row_pitch = footprint.Footprint.RowPitch;
	rifImageDesc.image_slice_pitch = textureDesc.height * footprint.Footprint.RowPitch;

        // For RGBA Float32 returns 4
	rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );

        // For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
	rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );

	rif_image rifImage = nullptr;
	CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
	return rifImage;

but that leads to the following error from RIF

[DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270)
[ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314

Which also makes sense. In this case (note the different resolution form previous examples) the row pitch which does not account for padding is 982 * 4 * 4 == 15712, while with padding it's 15872. This leads to the expected size of 9961408 without padding and 10062848 with padding.

BartSiwek avatar Jan 10 '22 18:01 BartSiwek

Hi @BartSiwek, Sorry for wait. [DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270) [ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314 This is the correct behaviour. The reason for this is that ID3D12Resource was allocated with a smaller size than rif_image_desc.

I still haven't been able to reproduce the error. So I prepared a simple example. To build, use CMake with options: -DRIF_INCLUDE_DIR = PATH_TO_RIF_HEADERS and -DRIF_LIBRARY_DIR = PATH_TO_RIB_LIBRARIES https://drive.google.com/file/d/1mxZlXs9i18pi8ObfVmTkuMb_ZJDyqDo-/view?usp=sharing If you can help me modify it to reproduce the error, that would be great.

givinar avatar Jan 18 '22 10:01 givinar

Hi,

I ran your sample and I hit an issue befere it did anything that could be related to what is in this bug report.

When executing a call to rifContextExecuteCommandQueue on line 268 of the sample it crashes.

The D3D Debug Output states:

D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE]

In the mean time in the console window the app opens I can see this:

[MSG  @RIF]: Loaded RIF API version: 1.7.1.0xfe53c43a
[MSG  @RIF]: Loading model: D:/Work/marmosetco.toolbag4/RifDenoiserFinal/data/render/rif-models/denoise_c9_ldr.pb
INFO: Memory: required: 1525088256, single node max: 1275854848, requested sum: 4034199552
Fatal error: cannot create pipeline state

BartSiwek avatar Feb 06 '22 01:02 BartSiwek

Hi @BartSiwek, I've added a print debug message based on HRESULT for CreateComputePipelineState. https://drive.google.com/file/d/1TxaYmcNVOBdpjMarbG8HAsBb5wxVLK_T/view?usp=sharing Perhaps we can get more information. This error is not related to the RIF itself. Let's try to understand what's going on. I'm interested in ID3D12Device::CreateComputeShader I didn't find any mention of this call for DX12 only for DX11.

givinar avatar Feb 07 '22 11:02 givinar

Hi @BartSiwek, D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE] Also, this error is the reason for the absence of the dxcompiler library or its incorrect version. Here is the correct version https://drive.google.com/file/d/1H1oCECq33b_tCYjGv-fp31MnEBalfcdK/view?usp=sharing

givinar avatar Feb 15 '22 12:02 givinar

Hi,

I finally got some time and I am trying your example. I was able to run it this time (even without replacing the d3d compiler dll).

This standalone app seems to not be causing the acne I shown in the example. I'll try to compare the device, queue and resource setup to figure out what could be different.

In the mean time however I do see a bit of a halo in the results: output

BartSiwek avatar Mar 03 '22 00:03 BartSiwek