TensorComprehensions
A domain specific language to express machine learning workloads.
#354 exhibits potential issues with size selection. Revisit after #307 and tinker with the tuner.
Currently TC requires compiling a new, fully specialized version for each new tensor size. #327 has some context about usage in the C2 case: > What if dimensions changed? You...
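To make the cost concrete, here is a minimal, hypothetical sketch (plain Python, not the real TC API; `compile_specialized` and the cache are illustrative stand-ins) of why per-size specialization means a fresh compilation whenever a new shape shows up:

```python
# Hypothetical sketch: a kernel cache keyed by concrete tensor sizes.
# Every unseen shape pays the full specialized-compilation cost.

compiled_cache = {}

def compile_specialized(shape):
    # Stand-in for an expensive, shape-specialized compilation step.
    return f"kernel<{','.join(map(str, shape))}>"

def run(shape):
    key = tuple(shape)
    if key not in compiled_cache:          # cache miss: recompile
        compiled_cache[key] = compile_specialized(shape)
    return compiled_cache[key]

run((32, 64))   # first call for this shape: triggers a compilation
run((32, 64))   # cache hit, no recompilation
run((32, 128))  # new size -> another full compilation
print(len(compiled_cache))  # -> 2
```

Under this model, any workload with dynamic shapes (e.g. variable batch sizes) keeps hitting the compile path, which is what the C2 discussion in #327 is about.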
I got this error message on a run of `test.sh` at commit 806e3b066a6a841786be4dde38357c972816dcaf:

```
[----------] 7 tests from CompilationCacheTest
[ RUN      ] CompilationCacheTest.ExpectQuerySuccess
[ OK       ] CompilationCacheTest.ExpectQuerySuccess (3558 ms)
[ RUN...
```
For parallel reductions, CUB requires a block of shared memory, which is wrapped in an opaque object. The shared memory promotion mechanism needs to know the amount of available shared memory....
The reason for having the LHS indirection: suppose someone has a lookup table:

```
def lut(float(B, R) M, int32(B, N) I) -> (O) {
  O(b, n) +=!...
```
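For intuition, a minimal plain-Python sketch (not TC) of the two kinds of indirection: index arithmetic on the right-hand side is a gather, as in the `lut` example above, while indexing through `I` on the left-hand side is a scatter. The tensors `M`, `I`, `O`, and `S` below are illustrative:

```python
# Illustrative sketch: RHS indirection = gather, LHS indirection = scatter.

M = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # float(B, R), B=2, R=3
I = [[2, 0],
     [1, 1]]                   # int32(B, N), N=2

# RHS indirection (gather): O(b, n) = M(b, I(b, n))
O = [[M[b][I[b][n]] for n in range(2)] for b in range(2)]
print(O)  # [[3.0, 1.0], [5.0, 5.0]]

# LHS indirection (scatter): S(b, I(b, n)) += 1
S = [[0, 0, 0] for _ in range(2)]
for b in range(2):
    for n in range(2):
        S[b][I[b][n]] += 1
print(S)  # [[1, 0, 1], [0, 2, 0]]
```

The scatter case is what makes LHS indirection harder to support: several iterations may write to the same output element, so the updates need reduction semantics to stay well-defined.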
As discussed in Slack, support for argmin/argmax operations should be added in the future.
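What makes argmin/argmax different from the reductions TC already has (`+`, `min`, `max`) is that the combine step carries a *pair* of values. A plain-Python sketch of that pairwise reduction (`argmax_reduce` is a hypothetical name, not a TC function):

```python
# Sketch: argmax as a reduction over (value, index) pairs --
# the combine operator such support would need to implement.

def argmax_reduce(xs):
    best_val, best_idx = xs[0], 0
    for i, v in enumerate(xs[1:], start=1):
        if v > best_val:            # combine step: keep the larger pair
            best_val, best_idx = v, i
    return best_val, best_idx

print(argmax_reduce([0.5, 2.5, 1.0]))  # -> (2.5, 1)
```

Because the combine operator is associative, the same reduction can be parallelized the way `max` already is; the extra work is threading the index through the reduction.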
```python
import tensor_comprehensions as tc
import math as ma
import torch
import torch.nn as nn
import torch.nn.functional as F
import random
import time
import os
from torch.autograd import Function

LANG...
```
I am experimenting with a simple TC op; the calculation finishes quickly, but the program then hangs and never exits. Both GPU and CPU remain at full utilization even after...
@ttheodor I autotuned the function on a server with 4 GPUs and got a cache file. I then tried to reuse the generated cache file on a different server with 4 GPUs, but it throws an error...
Hi all. When playing around with large convolution filters, I discovered unexpected behavior: with a large filter, if the input image size exceeds 32, a standard convolution outputs gibberish...