accelerated-scan

Accelerated First Order Parallel Associative Scan

Results 4 accelerated-scan issues

Running the Triton implementation with torch 2.2 on inputs of type float16 and bfloat16 results in the following error: ``` File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None,...

@sustcsonglin has suggested that float accumulation might improve the stability of the implementation. The test I'm currently using to check this is: ``` python -m pytest tests -s -v -k...
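To illustrate why float accumulation could help, here is a minimal sketch. It is not the library's Triton kernel and not the test referenced above. It assumes the first-order recurrence has the form h[t] = gates[t] * h[t-1] + tokens[t] and evaluates it sequentially in pure Python, emulating the accumulator precision with `struct` rounding. The helper names `round_to` and `scan`, the sequence length, and the input ranges are all hypothetical choices made for this example.

```python
import random
import struct


def round_to(fmt, value):
    """Round a Python float to the precision of a struct format code."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def scan(gates, tokens, fmt):
    """Sequential h[t] = gates[t] * h[t-1] + tokens[t].

    The accumulator is rounded to `fmt` after every multiply and add
    ('e' = float16, 'f' = float32, 'd' = float64).
    """
    h = 0.0
    out = []
    for a, x in zip(gates, tokens):
        h = round_to(fmt, round_to(fmt, a * h) + x)
        out.append(h)
    return out


rng = random.Random(0)
T = 2048  # hypothetical sequence length

# Inputs are stored in float16 in every case; only the accumulator differs.
gates = [round_to("e", rng.uniform(0.9, 0.999)) for _ in range(T)]
tokens = [round_to("e", rng.uniform(-1.0, 1.0)) for _ in range(T)]

reference = scan(gates, tokens, "d")
err_half = max(abs(r - h) for r, h in zip(reference, scan(gates, tokens, "e")))
err_single = max(abs(r - h) for r, h in zip(reference, scan(gates, tokens, "f")))

print(f"max abs error, float16 accumulator: {err_half:.3e}")
print(f"max abs error, float32 accumulator: {err_single:.3e}")
```

With gates close to 1, rounding errors from many earlier steps stay in the running state, so the float16 accumulator is expected to drift further from the float64 reference than the float32 one. Whether the same holds for the parallel kernels is exactly what the test in this issue would need to confirm.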

@proger Awesome work! Always appreciate the wonderful contributions of OSS advancing the frontiers of research. I know you've done a number of experiments comparing various scan implementations in your other...