TensorComprehensions
A domain specific language to express machine learning workloads.
See, e.g., https://ci.pytorch.org/jenkins/job/tensorcomp-builds/job/tc-cuda9.0-cudnn7.1-ubuntu16.04-devel-build-test/259/consoleText I was able to reproduce it on my system. top10 contains exactly one element, but it is different from top1. The element in top10 corresponds to the...
#489 introduces more code that needs the mapping to be represented as both a mupa and a `union_set`, depending on the context of use. We now have (almost) duplicate code...
- [x] teach parser about `bool, int8, int16, int32, int64, float16, float64`
- [x] make sure Halide types are properly constructed
- [ ] make sure polyhedral dependence analysis behaves...
Implements a timeout for the CUDA backend using a mapping option. Also adds a flag to change the default mapping option of the timeout. The aim of this PR is to allow the...
Hi, I'm interested in running a TC-generated CUDA kernel outside of PyTorch. Currently, I'm using the TC options to specify grid and block dim3. E.g., with ``` .mapToThreads(320) .mapToBlocks(32, 320)...
In some cases (I don't know why this happens), autotuning never finishes. The autotuner freezes at "100/100" and the job never completes. In that case I try to use Ctrl+C...
We currently incorrectly treat all tensors as dense. Need to add support for strided tensors. This can be done by adding the strides as extra params, and either constructing the...
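To illustrate the dense-versus-strided distinction raised above, here is a minimal Python sketch. It is not TC code: `dense_strides` and `linear_offset` are hypothetical helper names, and strides are assumed to be counted in elements rather than bytes. It shows that a dense tensor's strides follow from its shape alone, while a strided view (here, a transpose) needs its strides supplied as extra parameters.

```python
def dense_strides(shape):
    """Row-major strides (in elements) implied by a dense tensor's shape."""
    strides = [1] * len(shape)
    # Walk from the second-to-last dimension outwards.
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def linear_offset(index, strides):
    """Offset of a multi-dimensional index given explicit strides."""
    return sum(i * s for i, s in zip(index, strides))


shape = [4, 6]
dense = dense_strides(shape)   # strides derived purely from the shape

# Hypothetical strided case: a transposed view of a dense 6x4 tensor.
# It also has shape [4, 6], but its strides cannot be derived from the shape.
transposed = [1, 4]

same_index = (2, 3)
offset_dense = linear_offset(same_index, dense)
offset_transposed = linear_offset(same_index, transposed)
```

The same index maps to different memory offsets in the two layouts, so code that assumes dense strides reads the wrong elements from a strided view.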
Thanks for your great library. I just started to play around with TC. I have noticed that some of the initial jobs created by the autotuner in generation 0 run for...
Hi, so I had to slightly modify the autotuner_parallel.sh script, and I still have a few questions:
1. Does the script work in srun mode as well?
2. What exactly...