Open deep learning compiler stack for cpu, gpu and specialized accelerators

Results 636 tvm issues

> The [last release, v0.17.0](https://github.com/apache/tvm/issues/17122), was proposed at the end of July, with a release day of 25 July; for more detail, refer to the [v0.17.0 release schedule](https://github.com/apache/tvm/issues/17122). It has been almost **three months**...

# Introduction The TVM community has worked since the last release to deliver the following exciting new improvements! The main tags are below (**bold text marks areas with lots of progress**):...

The pass `LowerThreadAllreduce` enables efficient block reduction. However, block reduction often requires a large amount of shared memory space. The current implementation of `LowerThreadAllreduce` only enables static shared memory reduce...

Support [torch.index_fill_](https://pytorch.org/docs/stable/generated/torch.Tensor.index_fill.html)

Hello, I am currently using auto_scheduler to automatically tune a naive gemm operator. However, after the tuning is completed, I checked the corresponding assembly code and found that the registers...

type: bug
needs-triage

I encountered a segmentation fault when applying the `PartitionTransformParams` pass to a Relax IR module that performs tensor concatenation and transposition operations. The segmentation fault occurs during the execution of...

type: bug
needs-triage

When loading from database_tuning_record.json in Meta Schedule (this line: `B_reindex_pad_shared_dyn[v0, v1] = T.if_then_else(v0 < 1, B[v1, v0], T.float16(0))`), the parameter dtype of the primitive pad_einsum is read as int64, causing...

TVM is built with USE_MRVL=ON and the TVM compiler is invoked with only a default (LLVM) target. The command-line processor emits the following error: "Error: Passed --target-mrvl-accelerator_config but did not specify...

type: bug
needs-triage

As discussed in #17439, the ThreadSync injection phase should be applied once all memory allocations are deterministic.

Leads to suboptimal shared memory reuse. PR #9341, which introduced liveness analysis to merge the shared memory allocations, places touched buffer records at the outermost scope (e.g., outer loops) rather...

type: bug
needs-triage