antonysigma
Moving the discussions from the draft PR here. Feel free to clarify if I misquoted you. > maaz139 : Given that we now have an assembly tab when using...
> mcourteaux : [The presence of buffer `cuda_gpu_source_kernels` print out in the IR tab] boils down to the other comment of @maaz139. Because that [PTX code] really is the lowered...
> I don't know how you ever got to that screenshot (supposedly in Halide 10?), because the GPU-specific Stmt IR got already offloaded in the Lowering passes before the Generator...
> Making the PTX available in the stmt file made sense to me, as that is really what gets compiled. I am with you on the PTX printout requirement. I...
Thanks @mcourteaux . I am looking forward to the PR. My web development skills are 20 years out of date (XHTML 1.0, Backbone.js, MVC-based architecture). But I can help review...
Re: `.0`, `.1` notations. Do you mean the Tuple datatype? I also saw these suffixes in the IR of the complex number datatype implemented in `apps/fft/complex.h`. See also the pinned issue...
Excuse me, is this the right place to add to the tutorial wishlist? I have a few vendor-specific tutorials I wish to be included in the documentation: * [ ]...
> Several bot failures with:
>
> ```
> /home/halidenightly/build_bot/worker/halide-testbranch-main-llvm18-x86-32-linux-make/halide-source/src/autoschedulers/mullapudi2016/AutoSchedule.cpp:2830:21: error: unused variable ‘types’ [-Werror=unused-variable]
> ```

Done removing the offending line. I also rebased the changes on top of...
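For readers unfamiliar with this class of failure, here is a minimal sketch of how an unused local becomes a hard error under `-Werror=unused-variable`, and two common ways to resolve it. The function `count_nonempty` and the variable `total` are hypothetical and are not taken from `AutoSchedule.cpp`; the actual fix in the PR was simply deleting the offending line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not code from Halide.
std::size_t count_nonempty(const std::vector<std::string> &names) {
    // Without the attribute, GCC reports "unused variable 'total'"
    // (-Wunused-variable), and -Werror promotes that to a build failure.
    // Option 1: delete the variable outright (what the PR did).
    // Option 2: keep it and mark it [[maybe_unused]] (C++17), which is
    // useful when the variable is only read in some build configurations.
    [[maybe_unused]] const std::size_t total = names.size();

    std::size_t n = 0;
    for (const auto &name : names) {
        if (!name.empty()) {
            ++n;  // Count only the non-empty entries.
        }
    }
    return n;
}
```

Deleting the variable is usually the cleaner choice when nothing reads it; the attribute is better reserved for values that are only consumed by assertions or debug-only code.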
@steven-johnson and @abadams , thank you for testing the PR on the CI. Yes, the failure is triggered by the CMake build option `-DHalide_TARGET=host-[metal|gpu]`. I didn't know we could do...
Update: The GPU scheduling extension for Mullapudi2016 passes all Buildbot tests except for `autograd_grad.generator` and `local_laplacian_generator`. 1. `autograd_grad` passes the Buildbot tests, but the unnamed `Var x` triggers `basic_string::_M_construct ==...