TensorComprehensions
A domain specific language to express machine learning workloads.
This is a WIP experiment; please do not review. I am looking for some feedback on how to best propagate vector types through Halide, following up on the discussion from...
[DO NOT MERGE] Repro maybe uninitialized variable warning triggering error appearing with TC_CHECK
For repro purposes only (@ftynse @skimo-openhub). This reverts commit 02b7b057c986430b35c7a214a7e665f944ae535b.
#543 removed the notion of schedule tree element: now specific node types inherit directly from `ScheduleTree`, which is simpler and offers more type safety. This refactoring uncovered several technical and...
Similarly to shared memory promotion, we may want to limit the number of elements promoted to registers. In particular, it should be less than the number of available registers of...
With #537, it is possible to promote to shared memory at disjoint subtrees. The `maxSharedMemory` option controls the _total_ amount of shared memory used by _all_ subtrees, whereas the same...
If a tensor reference group is promoted to shared memory at some scope, it may be interesting to promote it to registers at some deeper scope. There are two possibilities:...
Currently, TC makes extensive use of command line flags (provided by gflags) for debugging or configuration purposes. These flags are essentially global variables, and global variables are generally discouraged. In...
I noticed that in some cases the first statement after variable definitions in the kernel is a `__syncthreads();` which, if I am not mistaken, makes no sense. For example, in...
This is a WIP allowing grid synchronizations to be produced. It also changes the mapping algorithm, to be able to map bands to blocks even if there is no...