Andrew Adams
A minimal baseline for the purpose of discussion and comparison with #6887
This is a minimum viable zero-cost way to load vectors of one type as larger or smaller vectors of a narrower or wider type respectively. It does this using two...
Let A, B, C, D be the bytes of a 32-bit x, in big-endian order, so: ``` x = A * (1
A mux is equivalent to a select tree, but the skip-stages pass doesn't understand that it means some values are unused. Consider the following: ``` #include "Halide.h" using namespace Halide;...
This single 2x downsample program: ``` #include "Halide.h" using namespace Halide; int main(int argc, char **argv) { Func f, g; Var x, y; ImageParam im(UInt(8), 1); g(x) = im(2 *...
This code: ``` #include "Halide.h" using namespace Halide; int main(int argc, char **argv) { Func f, g; Var x, y; ImageParam im(UInt(8), 1); g(x) = im(2 * x) + im(2...
In various places it has memcpy calls that assume buf->host is the lowest address in memory for that buffer. Negative strides are not tested, so it's not surprising that they...
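To illustrate the host-pointer assumption above, here is a minimal sketch of how the lowest address of a buffer could be found when strides may be negative. `SimpleBuffer`, `Dim`, `lowest_address`, and `end_address` are hypothetical stand-ins written for this example, not Halide's actual runtime types or functions, and the sketch assumes every extent is at least 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified buffer descriptor (not Halide's actual struct).
struct Dim {
    int32_t extent;  // number of elements along this dimension (assumed >= 1)
    int32_t stride;  // step in elements between neighbours; may be negative
};

struct SimpleBuffer {
    uint8_t *host;          // address of the element at the minimum coordinate
    size_t elem_size;       // bytes per element
    std::vector<Dim> dims;  // one entry per dimension
};

// Lowest address occupied by any element. With a negative stride, the last
// element along that dimension sits below host, so host is not the start.
inline uint8_t *lowest_address(const SimpleBuffer &b) {
    ptrdiff_t offset = 0;  // in elements, always <= 0
    for (const Dim &d : b.dims) {
        if (d.stride < 0) {
            offset += static_cast<ptrdiff_t>(d.extent - 1) * d.stride;
        }
    }
    return b.host + offset * static_cast<ptrdiff_t>(b.elem_size);
}

// One past the highest address occupied by any element.
inline uint8_t *end_address(const SimpleBuffer &b) {
    ptrdiff_t offset = 0;  // in elements, always >= 0
    for (const Dim &d : b.dims) {
        if (d.stride > 0) {
            offset += static_cast<ptrdiff_t>(d.extent - 1) * d.stride;
        }
    }
    return b.host + (offset + 1) * static_cast<ptrdiff_t>(b.elem_size);
}
```

A copy that covers the whole allocation would then span `lowest_address(b)` to `end_address(b)` rather than starting at `b.host`.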
Avoid making a DAG of Halide Funcs, and instead just produce a Halide Stmt directly. Should have no observable effect on anything, other than producing slightly simpler variable names (hence...
We currently incorrectly treat all tensors as dense. Need to add support for strided tensors. This can be done by adding the strides as extra params, and either constructing the...
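As a rough sketch of the difference between dense and strided indexing: a dense tensor's strides can be derived from its shape, while a strided tensor must supply them explicitly. The helpers `dense_strides` and `element_offset` below are hypothetical names for this example, and the row-major (innermost dimension last) layout is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Strides (in elements) of a dense, row-major tensor with the given shape.
// Assumption: the last dimension is innermost, with stride 1.
inline std::vector<int64_t> dense_strides(const std::vector<int64_t> &shape) {
    std::vector<int64_t> strides(shape.size());
    int64_t running = 1;
    for (size_t i = shape.size(); i-- > 0;) {
        strides[i] = running;
        running *= shape[i];
    }
    return strides;
}

// Element offset of a coordinate, given explicit per-dimension strides.
// This works for dense and non-dense tensors alike.
inline int64_t element_offset(const std::vector<int64_t> &index,
                              const std::vector<int64_t> &strides) {
    int64_t offset = 0;
    for (size_t i = 0; i < index.size(); i++) {
        offset += index[i] * strides[i];
    }
    return offset;
}
```

For instance, a transposed view of a dense 2x3 tensor has shape {3, 2} and strides {1, 3}; computing its offsets with `dense_strides({3, 2})` would read the wrong elements.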