[Bug]: Crash on AWS when calling `InitDX12` on an `amf::AMFContext2Ptr` context

Open DenisTensorWorks opened this issue 2 years ago • 11 comments

Describe the bug: Crash on Windows Server 2022 on AWS with an AMD Radeon Pro V520 MxGPU when calling `InitDX12` on an `amf::AMFContext2Ptr` context. This is observed in Unreal Engine as well as in the AMD AMF sample code. The simplest way to reproduce it is to run the AMD AMF sample project.

To reproduce:

  1. Clone the https://github.com/GPUOpen-LibrariesAndSDKs/AMF repository;
  2. Open AMF/amf/public/samples/CPPSamples_vs2019.sln in Visual Studio 2019;
  3. The project checks D3D11 by default, so one line needs to be modified to run D3D12;
  4. In CapabilityManager.cpp, change line 590 from `std::wstring capability = L"DX11";` to `std::wstring capability = L"DX12";`
  5. Change the solution platform to x64 (useful for remote debugging);
  6. Right-click on CapabilityManager and select Build;
  7. Copy CapabilityManager.exe (found in "AMF\amf\bin\vs2019x64Debug") to an AWS instance that matches the specs below;
  8. Start PowerShell and run CapabilityManager.exe. It will print a few lines, "DX12: List of adapters"... and "DX12 : Chosen Device 0: Device ID: 7362 [AMD Radeon Pro V520 MxGPU]", but will then terminate. It should instead go on to print the encoding and decoding capabilities;
  9. If a debugger is attached to the process, it can be observed that the application crashes at line 678: "deviceInit = (pContext2->InitDX12(deviceDX12.GetDevice()) == AMF_OK);"

Setup:

  • AWS instance type: g4ad.xlarge
  • OS: Microsoft Windows Server 2022 Datacenter
  • OS version: 10.0.20348 Build 20348
  • GPU: AMD Radeon Pro V520 MxGPU
  • Driver: 22.10.01.12-220930a-384126e-WHQLCert_
  • Driver source: the ec2-amd-windows-drivers AWS bucket
  • Driver date: 09/30/2022
  • Driver version: 30.0.21001.12042

Note: if the project is run with D3D11 set, then "InitDX11" will not crash and the device capabilities will be printed.

Screenshot: image

DenisTensorWorks avatar Aug 02 '23 07:08 DenisTensorWorks
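
Step 4 of the reproduction requires editing the source and rebuilding to switch between D3D11 and D3D12. A minimal sketch of one alternative, assuming the sample can be modified locally, is to pick the capability string from a command-line flag. `PickCapability` and the `--dx11`/`--dx12` flags are hypothetical names introduced here for illustration; they are not part of the AMF samples.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the AMF samples): chooses the capability
// string from the program arguments instead of a hard-coded literal.
// The first recognised flag wins; with no flag it returns L"DX11", the
// value quoted in step 4 as the sample's original setting.
std::wstring PickCapability(const std::vector<std::wstring>& args)
{
    for (const std::wstring& arg : args)
    {
        if (arg == L"--dx12") return L"DX12";
        if (arg == L"--dx11") return L"DX11";
    }
    return L"DX11";
}
```

With something like this in place, the literal from step 4 could be replaced by a call to `PickCapability`, so the same binary can exercise both the working `InitDX11` path and the crashing `InitDX12` path on the AWS instance without a rebuild.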

The driver crash has been prevented and the fix will be publicly available in the upcoming G4ad Windows driver release for AWS.

rhutsAMD avatar Dec 21 '23 18:12 rhutsAMD

Fantastic, thank you @rhutsAMD 🎉

DenisTensorWorks avatar Jan 07 '24 21:01 DenisTensorWorks

I checked, and the current driver in the AWS S3 bucket is still the same as in the original post. Is there any information on when the driver will be released?

abivolmv avatar Feb 20 '24 07:02 abivolmv

AWS drivers are not part of the regular AMD driver update sequence. Please ask AWS when they plan to update the driver.

MikhailAMD avatar Feb 20 '24 14:02 MikhailAMD

Hi @rhutsAMD, would you be able to please check whether the fix was rolled out in the 31.0.21002.2017 version, as it appears to still not be working on that version. Many thanks for your help!

DenisTensorWorks avatar Oct 23 '24 04:10 DenisTensorWorks

Any update on when the correction for this will be released?

ndbs-tsb avatar Feb 28 '25 09:02 ndbs-tsb

The fix for the crash has been verified on G4AD with the updated driver 31.0.21002.10002.

rhutsAMD avatar Mar 05 '25 15:03 rhutsAMD
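
Several driver versions are mentioned in this thread (30.0.21001.12042, 31.0.21002.2017, 31.0.21002.10002), and comparing them as plain strings gives the wrong order because "2017" sorts after "10002" character by character. A minimal sketch of a numeric comparison follows; `DriverVersion`, `ParseDriverVersion` and `IsAtLeast` are hypothetical names, and the sketch makes no claim about which version actually contains the fix.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Four numeric fields, e.g. {31, 0, 21002, 10002}.
using DriverVersion = std::array<unsigned long, 4>;

// Hypothetical helper: parses "a.b.c.d" into four numbers.
// Returns false for anything else (missing fields, suffixes, signs).
// On failure the contents of `out` are unspecified.
bool ParseDriverVersion(const std::string& text, DriverVersion& out)
{
    std::istringstream in(text);
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        // Every field after the first must be preceded by a dot.
        if (i > 0 && in.get() != '.')
            return false;
        // Require a digit so that signs and whitespace are rejected.
        if (!std::isdigit(in.peek()))
            return false;
        if (!(in >> out[i]))
            return false;
    }
    // Reject trailing characters such as "-220930a".
    return in.peek() == std::char_traits<char>::eof();
}

// True when `installed` is the same as or newer than `required`.
// std::array compares lexicographically, field by field.
bool IsAtLeast(const DriverVersion& installed, const DriverVersion& required)
{
    return installed >= required;
}
```

A caller would parse the installed version string (for example the "Driver version" value shown in Device Manager) and the version it wants to require, then gate the D3D12 path on `IsAtLeast`.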

> The fix for the crash has been verified on G4AD with the updated driver 31.0.21002.10002.

Are you sure this is correct? This is the exact driver version on which I have tested and reproduced the issue.

ndbs-tsb avatar Mar 05 '25 16:03 ndbs-tsb

@rhutsAMD I am still experiencing this error. Specifically, it occurs when running Unreal pixel streaming. Unreal runs ok on its own, but as soon as a pixel streaming client connects, the error occurs.

Perhaps this is a different but related error? I am including some log output below.

[UNREAL-ENGINE][stdout.data]: LogAudioMixer: Display: Sending SubmixBufferListener 'PixelStreaming:SubmixCapturer' register command...

[UNREAL-ENGINE][stdout.data]: LogAudioMixer: Display: Submix buffer listener 'PixelStreaming:SubmixCapturer' registered with submix 'MasterSubmixDefault'

[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: CurrentQueue.Fence.D3DFence->GetCompletedValue() failed 
 at D:\build\++UE5\Sync\Engine\Source\Runtime\D3D12RHI\Private\D3D12Submission.cpp:1013 
 with error DXGI_ERROR_DEVICE_REMOVED with Reason: DXGI_ERROR_DRIVER_INTERNAL_ERROR


[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: [GPUBreadCrumb] Last tracked GPU operations:

[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: [GPUBreadCrumb]	3D Queue 0
Breadcrumbs: > Frame 195 [Active]
  Breadcrumbs: | BufferPoolCopyOps [Finished]
  Breadcrumbs: | TexturePoolCopyOps [Finished]
  Breadcrumbs: | WorldTick [Finished]
  Breadcrumbs: | SendAllEndOfFrameUpdates [Finished]
  Breadcrumbs: | MorphUpdate_Jen_FaceMesh_LOD0 LodVertices=35181 Batches=7436 [Finished]
  Breadcrumbs: | NiagaraGpuComputeDispatch [Finished]
  Breadcrumbs: | HairStrandsIndVoxelPageClear [Finished]
  Breadcrumbs: | HairStrandsVoxelize [Finished]
  Breadcrumbs: | HairStrandsDensityMipGen [Finished]
  Breadcrumbs: > FRDGBuilder::Execute [Active]
    Breadcrumbs: | GPUSkinCache_UpdateSkinningBatches [Finished]
    Breadcrumbs: | GPUSkinCache_RecomputeTangentsBatches [Finished]
    Breadcrumbs: | ClearGPUMessageBuffer [Finished]
    Breadcrumbs: | UpdateAllPrimitiveSceneInfos [Finished]
    Breadcrumbs: | VirtualTextureClear [Finished]
    Breadcrumbs: | ShaderPrint::ClearCounters [Finished]
    Breadcrumbs: | ShaderPrint::UploadParameters [Finished]
    Breadcrumbs: > Scene [Active]
      Breadcrumbs: | HairGuideInterpolation [Finished]
      Breadcrumbs: | AccessModePass[Graphics] (Textures: 0, Buffers: 2) [Finished]
      Breadcrumbs: | Niagara::GPUProfiler_BeginFrame [Finished]
      Breadcrumbs: | NiagaraGpuComputeDispatch [Finished]
      Breadcrumbs: | AccessModePass[Graphics] (Textures: 0, Buffers: 1) [Finished]
      Breadcrumbs: | FFXSystem::PreRender [Finished]
      Breadcrumbs: | FGPUSortManager::OnPreRender [Finished]
      Breadcrumbs: | GPUScene.UploadDynamicPrimitiveShaderDataForView [Finished]
      Breadcrumbs: | BuildRenderingCommandsDeferred(Culling=On) [Finished]
      Breadcrumbs: | ClearDepthStencil (SceneDepthZ) [Finished]
      Breadcrumbs: | PrePass DDM_AllOpaque (Forced by Nanite) [Finished]
      Breadcrumbs: | FVirtualShadowMapArray::UpdatePhysicalPageAddresses [Finished]
      Breadcrumbs: | ComputeLightGrid [Finished]
      Breadcrumbs: | LightFunctionAtlas Generation [Finished]
      Breadcrumbs: | BeginOcclusionTests [Finished]
      Breadcrumbs: | BuildHZB(ViewId=0) [Finished]
      Breadcrumbs: | SubmitCommands [Finished]
      Breadcrumbs: | FenceOcclusionTests [Finished]
      Breadcrumbs: | GBufferClear [Finished]
      Breadcrumbs: | BasePass [Finished]
      Breadcrumbs: | HairInterpolation(Strands) [Finished]
      Breadcrumbs: | HairStrandsAABB [Finished]
      Breadcrumbs: | HairStrandsVoxelization [Finished]
      Breadcrumbs: | HairStrandsVisibility [Finished]
      Breadcrumbs: | FVirtualShadowMapArray::BuildPageAllocation [Finished]
      Breadcrumbs: | ShadowDepths [Finished]
      Breadcrumbs: | DispatchToRHI [Finished]
      Breadcrumbs: | LightCompositionTasks_PreLighting [Finished]
      Breadcrumbs: | ClearStencil (SceneDepthZ) [Finished]
      Breadcrumbs: > DiffuseIndirectAndAO [Active]
        Breadcrumbs: > HairStrands::AO 788x1400 [Active]
        Breadcrumbs: > DiffuseIndirectComposite(DiffuseIndirect=Disabled ApplyAOToSceneColor) 788x1400 [Active]
      Breadcrumbs:   ClearTranslucencyLightingVolumeCompute 64 [Not started]
      Breadcrumbs:   Lights [Not started]
      Breadcrumbs:   FilterTranslucentVolume 64x64x64 Cascades:2 [Not started]
      Breadcrumbs:   ReflectionIndirect [Not started]
      Breadcrumbs:   SubsurfaceScattering(ViewId=0) [Not started]
      Breadcrumbs:   HairStrands::EnvLightingPS(SceneScatter) [Not started]
      Breadcrumbs:   TSR MeasureFlickeringLuma 788x1400 [Not started]
      Breadcrumbs:   SetSceneTexturesUniformBuffer [Not started]
      Breadcrumbs:   FFXSystem::PostRenderOpaque [Not started]
      Breadcrumbs:   AsyncGpuTraceHelper::PostRenderOpaque [Not started]
      Breadcrumbs:   Niagara::PostRenderFinish [Not started]
      Breadcrumbs:   UnsetSceneTexturesUniformBuffer [Not started]
      Breadcrumbs:   FGPUSortManager::OnPostRenderOpaque [Not started]
      Breadcrumbs:   HairStrandsComposition [Not started]
      Breadcrumbs:   Translucency [Not started]
      Breadcrumbs:   VirtualTextureFeedbackCopy [Not started]
      Breadcrumbs:   PostProcessing [Not started]
      Breadcrumbs:   ReleaseRayTracingResources [Not started]
      Breadcrumbs:   RenderFinish [Not started]
      Breadcrumbs:   ExtractUniformBuffer [Not started]
    Breadcrumbs:   EnqueueCopy(GPUMessageManager.MessageBuffer) [Not started]
    Breadcrumbs:   AccessModePass[Graphics] (Textures: 0, Buffers: 82) [Not started]
  Breadcrumbs:   SlateUI Title = QC_CTRLHuman (64-bit Development PCD3D_SM6)  [Not started]


[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: [GPUBreadCrumb]	Copy Queue 0


[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: [GPUBreadCrumb]	Compute Queue 0
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: | Scene [Finished]
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: | Scene [Finished]
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: | Scene [Finished]
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: | Scene [Finished]
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: | Scene [Finished]
Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
Breadcrumbs: > Scene [Active]
  Breadcrumbs: | AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Finished]
  Breadcrumbs: > PostProcessing [Active]
    Breadcrumbs: > TemporalSuperResolution(sg.AntiAliasingQuality=3) 788x1400 -> 1080x1920 [Active]
      Breadcrumbs: | TSR ClearPrevTextures 788x1400 [Finished]
      Breadcrumbs: > TSR DilateVelocity(#1 MotionBlurDirections=1) 788x1400 [Active]
      Breadcrumbs:   TSR DecimateHistory(#6  ReprojectResurrection 16bit) 788x1400 [Not started]
    Breadcrumbs:   Bloom 540x960 [Not started]
Breadcrumbs:   AccessModePass[AsyncCompute] (Textures: 0, Buffers: 1) [Not started]
Breadcrumbs:   Scene [Not started]


[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: DRED: No breadcrumb head found.

[UNREAL-ENGINE][stdout.data]: LogD3D12RHI: Error: DRED: No PageFault data.
LogD3D12RHI: Error: Memory Info from frame ID 197:
LogD3D12RHI: Error: 	Budget:	7259.65 MB
LogD3D12RHI: Error: 	Used:	2326.38 MB

[UNREAL-ENGINE][stdout.data]: LogAVCodecs: Error: Error Creating: Failed to initialize the D3D12 encoder context [AMF 18]

[UNREAL-ENGINE][stdout.data]: LogAVCodecs: Error: Error Creating: Failed to initialize the D3D12 encoder context [AMF 18]

[UNREAL-ENGINE][stdout.data]: LogPixelStreaming: Error: Could not create encoder. Check encoder config or perhaps you used up all your HW encoders.

[UNREAL-ENGINE][stdout.data]: LogAVCodecs: Error: Error Creating: Failed to create RHI child platform encoder [RHI]
LogAVCodecs: Error: Error Creating: Failed to create RHI child platform encoder [RHI]

[UNREAL-ENGINE][stdout.data]: LogPixelStreaming: Error: Could not create encoder. Check encoder config or perhaps you used up all your HW encoders.

[UNREAL-ENGINE][stdout.data]: LogAVCodecs: Error: Error Creating: Failed to initialize the D3D12 encoder context [AMF 18]

[UNREAL-ENGINE][stdout.data]: LogAVCodecs: Error: Error Creating: Failed to initialize the D3D12 encoder context [AMF 18]
LogAVCodecs: Error: Error Creating: Failed to create RHI child platform encoder [RHI]
LogAVCodecs: Error: Error Creating: Failed to create RHI child platform encoder [RHI]

ndbs-tsb avatar Apr 08 '25 07:04 ndbs-tsb
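
In the breadcrumb dump above, entries that were still executing when the device was removed are marked with `>` and end in `[Active]`. A minimal sketch for pulling those entries out of a saved log, assuming the line format shown in this comment; `ActiveBreadcrumbs` is a hypothetical helper, not an Unreal Engine or AMF function.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of all breadcrumb entries marked
// as active, in the order they appear in the log. It matches lines of the
// form "Breadcrumbs: > <name> [Active]" at any indentation.
std::vector<std::string> ActiveBreadcrumbs(const std::string& log)
{
    static const std::string prefix = "Breadcrumbs: > ";
    static const std::string suffix = " [Active]";

    std::vector<std::string> result;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line))
    {
        // Drop trailing whitespace and carriage returns from Windows logs.
        while (!line.empty() &&
               (line.back() == '\r' || line.back() == ' ' || line.back() == '\t'))
            line.pop_back();

        const std::size_t start = line.find(prefix);
        if (start == std::string::npos || line.size() < suffix.size())
            continue;
        const std::size_t nameEnd = line.size() - suffix.size();
        if (line.compare(nameEnd, suffix.size(), suffix) != 0)
            continue;

        const std::size_t nameStart = start + prefix.size();
        if (nameEnd <= nameStart)
            continue; // no name between the marker and the status
        result.push_back(line.substr(nameStart, nameEnd - nameStart));
    }
    return result;
}
```

Because nested entries are listed after their parents, the last active name within each queue's section is the innermost operation that had started but not finished, which is usually the most useful line to quote in a report like this one.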

Hard to tell if this is the same or a different issue. A few things:

  • Which exact driver do you use?
  • Can you try a bare-metal local setup?
  • Do you have the ability to rebuild UE and enable detailed AMF logging?
  • I would need to check the D3D12 integration between UE and AMF. Last time I looked into it, it was D3D11 only.
  • Can you run AMF samples with DX12 as described in the initial issue?

MikhailAMD avatar Apr 08 '25 17:04 MikhailAMD

> Hard to tell if this is the same or a different issue. A few things:
>
>   • Which exact driver do you use?

31.0.21002.10002

>   • Can you try a bare-metal local setup?

I don't have hardware for this.

>   • Do you have the ability to rebuild UE and enable detailed AMF logging?

Not with my current tool setup, unfortunately.

>   • I would need to check the D3D12 integration between UE and AMF. Last time I looked into it, it was D3D11 only.
>   • Can you run AMF samples with DX12 as described in the initial issue?

I don't have the tooling just now.

ndbs-tsb avatar Apr 22 '25 07:04 ndbs-tsb