python_tutorial_lesson_12_using_the_gpu is failing for D3D12Compute
The test recently started failing on some (but not all) Windows systems; the failure appears to be somewhat GPU-specific. See https://buildbot.halide-lang.org/master/#/builders/8/builds/45 for an example.
Cannot repro on Windows 10 2004 (build 19041.388) with GTX 1080 Ti and driver version 451.67.
More debugging info from Gitter:
Entering Pipeline curved
Target: x86-64-windows-avx-avx2-d3d12compute-debug-f16c-fma-jit-sse41-user_context
Input Buffer b0: buffer(0, 0x0, 0x1b7f6f53040, 1, uint8, {0, 1280, 2304}, {0, 768, 3}, {0, 3, 1})
Input (void *) __user_context: 0xb0d6fee1b8
Output Buffer curved: buffer(0, 0x0, 0x1b7ffaf3080, 0, uint8, {0, 1280, 1}, {0, 768, 1280}, {0, 3, 983040})
[@] halide_d3d12compute_initialize_kernels
[@] halide_d3d12compute_acquire_context
Time for halide_d3d12compute_initialize_kernels: 2.000000e-04 ms
[@] halide_d3d12compute_release_context
[@] halide_d3d12compute_device_interface
[@] halide_d3d12compute_device_malloc
(user_context: 0xb0d6fee1b8, buf: 0xb0d6fec760)
allocating buffer(0, 0x0, 0x0, 0, uint8, {0, 65536, 1})
[@] halide_d3d12compute_acquire_context
[@] new_buffer
[@] new_device_buffer
[@] new_buffer_resource
[@] D3DError
ID3D12Resource object created: 0x1b7f76a9a20
[@] malloct
[@] d3d12_malloc
allocated 80 bytes @ 0x1b7dd34aba0
Time: 1.158000e+00 ms
[@] halide_d3d12compute_release_context
[@] halide_d3d12compute_run
[@] halide_d3d12compute_acquire_context
[@] new_command_allocator
[@] D3DError
ID3D12CommandAllocator object created: 0x1b7f78ce0d0
[@] new_compute_command_list
[@] new_command_list
[@] D3DError
ID3D12GraphicsCommandList object created: 0x1b7f7686b50
[@] malloct
[@] d3d12_malloc
allocated 16 bytes @ 0x1b7dd3435d0
[@] new_function_with_name
groupshared memory size before modification: 0
groupshared memory size: 16 bytes.
numthreads( 16, 1, 1 )
SUCCESS while compiling D3D12 compute shader with entry name 'kernel_lut_s0_i_block___block_id_x'!
[@] malloct
[@] d3d12_malloc
allocated 16 bytes @ 0x1b7dd34ac00
halide_memoization_cache_store
[@] d3d12_malloc
allocated 72 bytes @ 0x1b7dd34ac20
[@] d3d12_malloc
allocated 56 bytes @ 0x1b7dd34ac70
Exiting halide_memoization_cache_store
[@] new_descriptor_binder
[@] D3DError
ID3D12DescriptorHeap object created: 0x1b7f7944060
descriptor handle increment size: 32
[@] malloct
[@] d3d12_malloc
allocated 64 bytes @ 0x1b7dd34acb0
descriptor heap base for CPU: 3
descriptor heap base for GPU: 10848671113412608
Segmentation fault
(I'm disabling this test on the buildbots; let's be sure to re-enable it when we think we have a fix.)
Also does not repro on my machine: Radeon 560X, Windows 10 Enterprise Version 2004 (19041.388), Radeon Driver Packaging Version 19.30.01.33-191121a-349047C, Direct3D® [driver] Version 9.14.10.01410, Python 3.8.2
This should be "closeable" by now. Can someone verify?
@steven-johnson -- can you re-enable on the buildbots? @slomp -- which PR should have fixed this?
See https://github.com/halide/build_bot/pull/97
Buildbot changed; restart the Windows tests as appropriate.