Steven Johnson
+1, it seems like there is a pretty clear desire for this. It would be really great to get this as a configurable option in the 'official' actions/cache.
@abadams commented in chat: "We really need a compiler pass that can take a global view of what loads are happening in a stmt... StageStridedLoads or some such thing, then...
Question for schedule estimation: after this PR lands, do we expect this to be a functional (although perhaps alpha- or beta-quality) backend? Or are there more things we know...
You will need to sync this PR to `main` in order to build with top-of-tree LLVM -- ordinarily I'd just do that instead of suggesting it but there are enough...
(I took the liberty of syncing this to main, since the conflicts in test/generator were a little subtle)
There's a subtle but critical mistake in a lot of the runtime code (which is unfortunately easy to make): When an error occurs in any Halide runtime code, you must...
> @steven-johnson Please take a look at the error handling fixes when you get a chance, and let me know if I missed anything! I am on vacation next week...
> in my brief testing the performance penalty due to the extra memory bandwidth far outweighs the extra compute to unpack the pixels

Understandable, but... if a device doesn't have...
> @steven-johnson I don't have write access to do the merge ... could you approve this PR when you get a chance?

Sure, but did the other reviewers approve it?...
Please get the buildbots green before requesting a review :-)