tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
This PR fixes the shuffle rewrite pass to handle the case where the vector lanes are larger than the data type of the input vector.
This allows existing files to be updated. cc @tqchen
This commit introduces rewrite rules for indices which can arise from splitting axes by scalable factors (e.g. `xo, xi = sch.split(x, factors = [None, 8 * T.vscale()])`): ``` (v_x_o *...
What configuration causes the generated file types to differ?
This PR fixes a bug in the PagedKVCache which may happen when the sequence removal order is not consistent with the reverse order of sequence add/fork order. With this fix,...
How should the following issues be resolved when running demo_static? demo_static.c: In function ‘main’: demo_static.c:52:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result] (void)fread(input_storage, 3 * 224...
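For the one warning visible in the excerpt above, a common approach is to check the value `fread` returns rather than cast it to `void`. GCC does not treat a `(void)` cast as a use of a result marked `warn_unused_result`, so the cast alone leaves the warning in place. The sketch below is only an illustration: `read_exact` is a hypothetical helper, not part of TVM or of `demo_static.c`, and it assumes the demo wants to treat a short read as an error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from TVM): read exactly `size` bytes from `fp`
 * into `buf`. Returns 0 on success, -1 on a short read or I/O error.
 * Using the return value of fread() is what removes -Wunused-result. */
int read_exact(FILE *fp, void *buf, size_t size) {
  assert(fp != NULL && buf != NULL);

  /* Element size 1, so fread() returns the number of bytes actually read. */
  size_t got = fread(buf, 1, size, fp);
  if (got != size) {
    fprintf(stderr, "short read: expected %zu bytes, got %zu\n", size, got);
    return -1;
  }
  return 0;
}
```

In the demo, the flagged call could then become something like `if (read_exact(stdin_or_file, input_storage, nbytes) != 0) return 1;`, where `stdin_or_file` and `nbytes` stand in for the stream and byte count the original call already passes.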
### Actual behavior  What actually happened ### Environment Any environment details, such as: Operating System, TVM version, etc ### Steps to reproduce ``` import tvm import tvm.relay as relay...
This commit extends the SME conv2d NHWC schedule to support convolutions with float16 inputs (data and kernel) and a float32 output using the tensor intrinsics added in #16981. cc @ekalda...