Xiang Li

Results 29 issues of Xiang Li

This change adds Arch::dx12 for DirectX 12. rhi, runtime, and codegen are added for dx12 with minimal support to run AOT. One AOT test is added to test DXIL container generation....

cbuffer CB { float a; float b:packoffset(c1.y); } float main() : SV_Target { return a + b; } FXC reports the error "cannot mix packoffset elements with nonpackoffset elements...

Transform Tmp = uav[i]; // UavLdCpy ... Update Tmp; // partial update. ... uav[i] = Tmp; // UavStCpy into Tmp = uav[i]; // UavLdCpy ... Update Tmp; // partial update....

For the repro, only u[i].b needs to be loaded and stored. struct S { float4x4 a[10]; float4 b; }; RWStructuredBuffer< S > u; void foo(inout S s) { s.b +=...

For LLVM IR like this ``` target triple = "nvptx64-nvidia-cuda" ; Function Attrs: nounwind define void @foo(i64* nocapture readonly byval(i64) %0, i64* nocapture readonly byval(i64) %itop) local_unnamed_addr #0 { entry:...

new issue

Only make sure the pipeline generates something. No real DXIL is generated yet. Move the DX12 build to GPU CI, which will run the AOT test. Issue: #5276

For a shader like this: groupshared float a[10]; [numthreads(8,8,1)] void main() { a[0] = 1; } the output DXIL is this: @"\01?a@@3PAMA" = external addrspace(3) global [10 x float], align 4...

performance