RadeonImageFilter
RadeonImageFilter copied to clipboard
Quality degradation of AI denoising when albedo/normals/depth are backed by D3D12 resoruces
While trying to integrate the RIF AI denoiser into the existing D3D12 pipeline I have stumbled upon the issue where if the albedo/normals/depth RIF images are backed by a D3D12 resource (created using the rifContextCreateImageFromDirectX12Memory
) the resulting image degrades in quality. At the same time if RIF images are not resource backed (created using the rifContextCreateImage
) and the same contents is copied into them via the rifImageMap
/rifImageUnmap
the quality is good.
Example of good quality output:
Example of bad quality output (note the acne like artifacts on the bottom part of the ball and close to the plane edges):
Notes:
- The RIF image descriptors are exactly the same in both cases
- In both cases the context is created using
rifCreateContextFromDirectX12Context(RIF_API_VERSION, d3d12_device, d3d12_queue, nullptr, &context)
- In both cases the queue is created using the
rifContextCreateCommandQueue(context, &queue)
- This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed
- This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed
- The type of the backing resource does not matter - it happens both with textures and buffers
- The "useHDR" filter parameter is set to
true
Hi @BartSiwek , Thanks. We'll solve this problem.
Hi @BartSiwek, I'm unable to reproduce the problem at the moment. Could you explain some of the points? First of all, how do you get the resource memory pointer for albedo/normal/depth? I mean, are you using the DX12 capability (ID3D12Resource :: Map/Unmap) or the RIF methods (rifImageMap/rifImageUnmap). Further, I am afraid that I don't understand some of the points regarding "colorIng" and filter output.
-
"This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed"
-
"This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed"
Does this mean that to reproduce the problem, "colorImg" and filter output images must be created using the RIF API method (rifContextCreateImage)?
Hi @givinar
In an ideal setup I never need to get the resource memory pointer for any of the inputs. I have D3D12 Texture Resources for all of them and I create the corresponding rifImage
using the rifContextCreateImageFromDirectX12Memory
. Doing so I noticed the acne like artifacts.
In the process of discovering what is causing the artifacts I switched to reading the D3D12 resources from the device back to the host memory using ID3D12Resource :: Map/Unmap
, creating the rifImage
's using the rifContextCreateImageFilter
and providing them with data via the rifImageMap/rifImageUnmap
. In this scenario acne does not appear.
My remark "This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rigImage
I provide as the colorImg
parameter can be either created via the rifContextCreateImageFromDirectX12Memory
or via the rifContextCreateImageFilter
followed by the rifImageMap/rifImageUnmap
and that does not affect the output.
This is not the case for albedo/depth/normals.
My remark "This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rifImage
used for the output can be either created with rifContextCreateImageFromDirectX12Memory
or rifContextCreateImage
and that does not affect the output.
Hope this helps.
Hi @BartSiwek
May I ask you to send me some traces?
To do this, you need to set your environment variables (RIF_TRACING_ENABLE to 1 and RIF_TRACING_PATH to ${SAVE_PATH})
After that, just run the filter twice. First, where is the bad quality (rifContextCreateImageFromDirectX12Memory). Second, with good quality (rifContextCreateImage). You should get two folders in $ {SAVE_PATH}. Could you send them to me? Thanks!
Hi,
Sorry for the wait - it took me some time to dust off the code. I have the zip files with traces ready, but each one of them is about 70mb. Where do you want me to send them?
Additionally - I noticed there is a change in behavior depending on if I have the RIF_TRACING_ENABLE set to 1 or not in a scenario when the RIF images are backed by D3D12. Without RIF_TRACING_ENABLE set at all or with it set to 0 my code would run fine. When I set RIF_TRACING_ENABLE=1 I suddenly got an error from the DX12 debug layers stating that the transition from UAV state was invalid since that is not the resource current state. This was easy to fix on my side and it did not change the resulting image, but it seems you are doing different things under the hood depending on that flag being set or not.
Hi, We can change the backend depending on the API. This can cause different behaviors and different results. Therefore, I asked you to send traces. Could you upload traces to google drive? https://drive.google.com/drive/folders/115GuyUCUiTiJPzlrl-luOHpx5Lc1vtji?usp=sharing Thanks
Hi,
I uploaded the traces.
So what you are saying is that you switch back-ends based on the value of RIF_TRACING_ENABLE? Because in my case I did not change the code. The D3D12 debug layers gave me an error if RIF_TRACING_ENABLE=1 and did not give me one when RIF_TRACING_ENABLE=0.
Hi, Thanks for the traces. I mean the backend might change depending on the API rifContextCreateImage or rifContextCreateImageFromDirectX12Memory. Enabling tracing does not affect this.
Hi @BartSiwek, I found some issues in the traces.
-
When you create an image description (rif_image_desc), you are not initializing it properly. Row pitch and slide pitch are random values. You can try something like this: rif_image_desc inputDesc = { 0 }; inputDesc.image_width = 1962; inputDesc.image_height = 1270; inputDesc.num_components = 4; inputDesc.type = RIF_COMPONENT_TYPE_FLOAT32;
-
rifSyncronizeQueue - function is redundant.
-
Make sure normals and depth are normalized. For this we use additional filters: ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstLo", 0.0f), RIF_SUCCESS); ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstHi", 1.0f), RIF_SUCCESS); ASSERT_EQ(rifCommandQueueAttachImageFilter(queue, remapNormalsFilter, normalsImg, normalsImg), RIF_SUCCESS);
ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstLo", 0.0f), RIF_SUCCESS); ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstHi", 1.0f), RIF_SUCCESS); ASSERT_EQ(rifCommandQueueAttachImageFilter(queue_, remapDepthFilter, depthImg, depthImg), RIF_SUCCESS);_
-
And the last, the input image for "rifContextCreateImageFilter" version:
The input image for "rifContextCreateImageFromDirectX12Memory" version:
This also applies to normal, albedo and depth.
Hi,
I will try 1) and 2) and get back to you.
Regarding the others: 3) The depth and normal's are normalized since this is how we accumulate them. Depth is naturally in the [0,1] range and normal's we remap from [-1,1] to [0,1] during accumulation. 4) Yes - I noticed that too, but image corruption seems to be coming from RIF since in PIX captures these DirectX resources look fine.
Hi,
I just double checked some thing and:
Regarding 2) - removing rifSyncronizeQueue
did not do anything
Regarding 1) - depth, row pitch and slice pitch should not be random as I fill them in (comments added for clarity):
const auto& textureDesc = texture.getDesc();
// For RGBA float32 getPixelFormatBitSize returns 32
const auto rowSize = ( getPixelFormatBitSize( textureDesc.format ) * textureDesc.width ) / 8;
rif_image_desc rifImageDesc = {};
rifImageDesc.image_width = textureDesc.width;
rifImageDesc.image_height = textureDesc.height;
rifImageDesc.image_depth = textureDesc.depth;
rifImageDesc.image_row_pitch = rowSize;
rifImageDesc.image_slice_pitch = rowSize * textureDesc.height;
// For RGBA Float32 returns 4
rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );
// For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );
ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );
rif_image rifImage = nullptr;
CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
return rifImage;
I also checked the values in the debugger and they are correct.
Hi @BartSiwek As I can see in the trace log (DX12Backend) the row and slice values seem to be random numbers. API call: RIF_CALL(rifContextCreateImageFromDirectX12Memory((rif_context)(context_0), AsPtr<rif_image_desc>((uint32_t)(1962), (uint32_t)(1270), (uint32_t)(1), (uint32_t)(31392), (uint32_t)(39867840), (uint32_t)(4), RIF_COMPONENT_TYPE_FLOAT32).get(), (void*)(0x1b3a5a49640), &image_3)); Row value is 31392 and slice is 39867840. I'm not sure if these are the correct numbers. Can you set these parameters explicitly (for example 0) and rerun the DX12 Backend with tracing again?
The values certainly make sense since width * component_count * size_of_component
is 1920 * 4 * 4 == 31392
And rowSize * height
is 1270 * 31392 == 39867840
.
That I think might be the issue is this does not account for padding introduced by DirectX. I tried to switch the code to the following:
const auto& textureDesc = texture.getDesc();
ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );
D3D12_RESOURCE_DESC d = d3d12res->GetDesc();
D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint = {};
UINT numRows = 0;
UINT64 rowSize = 0;
UINT64 totalBytes = 0;
d3d12_device->GetCopyableFootprints( &d, 0, 1, 0, &footprint, &numRows, &rowSize, &totalBytes );
rif_image_desc rifImageDesc = { 0 };
rifImageDesc.image_width = textureDesc.width;
rifImageDesc.image_height = textureDesc.height;
rifImageDesc.image_depth = textureDesc.depth;
rifImageDesc.image_row_pitch = footprint.Footprint.RowPitch;
rifImageDesc.image_slice_pitch = textureDesc.height * footprint.Footprint.RowPitch;
// For RGBA Float32 returns 4
rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );
// For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );
rif_image rifImage = nullptr;
CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
return rifImage;
but that leads to the following error from RIF
[DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270)
[ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314
Which also makes sense. In this case (note the different resolution form previous examples) the row pitch which does not account for padding is 982 * 4 * 4 == 15712
, while with padding it's 15872
. This leads to the expected size of 9961408 without padding and 10062848 with padding.
Hi @BartSiwek,
Sorry for wait.
[DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270) [ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314
This is the correct behaviour. The reason for this is that ID3D12Resource was allocated with a smaller size than rif_image_desc.
I still haven't been able to reproduce the error. So I prepared a simple example. To build, use CMake with options: -DRIF_INCLUDE_DIR = PATH_TO_RIF_HEADERS and -DRIF_LIBRARY_DIR = PATH_TO_RIB_LIBRARIES https://drive.google.com/file/d/1mxZlXs9i18pi8ObfVmTkuMb_ZJDyqDo-/view?usp=sharing If you can help me modify it to reproduce the error, that would be great.
Hi,
I ran your sample and I hit an issue befere it did anything that could be related to what is in this bug report.
When executing a call to rifContextExecuteCommandQueue
on line 268 of the sample it crashes.
The D3D Debug Output states:
D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE]
In the mean time in the console window the app opens I can see this:
[MSG @RIF]: Loaded RIF API version: 1.7.1.0xfe53c43a
[MSG @RIF]: Loading model: D:/Work/marmosetco.toolbag4/RifDenoiserFinal/data/render/rif-models/denoise_c9_ldr.pb
INFO: Memory: required: 1525088256, single node max: 1275854848, requested sum: 4034199552
Fatal error: cannot create pipeline state
Hi @BartSiwek,
I've added a print debug message based on HRESULT for CreateComputePipelineState.
https://drive.google.com/file/d/1TxaYmcNVOBdpjMarbG8HAsBb5wxVLK_T/view?usp=sharing
Perhaps we can get more information.
This error is not related to the RIF itself. Let's try to understand what's going on.
I'm interested in ID3D12Device::CreateComputeShader
I didn't find any mention of this call for DX12 only for DX11.
Hi @BartSiwek,
D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE]
Also, this error is the reason for the absence of the dxcompiler library or its incorrect version.
Here is the correct version
https://drive.google.com/file/d/1H1oCECq33b_tCYjGv-fp31MnEBalfcdK/view?usp=sharing
Hi,
I finally got some time and I am trying your example. I was able to run it this time (even without replacing the d3d compiler dll).
This standalone app seems to not be causing the acne I shown in the example. I'll try to compare the device, queue and resource setup to figure out what could be different.
In the mean time however I do see a bit of a halo in the results: