accelerated-scan
Accelerated First Order Parallel Associative Scan
Results
4
accelerated-scan issues
Running the triton implementation with torch 2.2 on inputs of type float16 and bfloat16 results in the following error: ``` File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None,...
@sustcsonglin has suggested that float accumulation might improve stability of the implementation. The current test I'm using to check this is: ``` python -m pytest tests -s -v -k...
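As a rough illustration of why the accumulation dtype can matter, here is a minimal NumPy sketch. It is not the library's kernel or its test suite. It assumes the scan computes the first-order recurrence `h[t] = gates[t] * h[t-1] + tokens[t]`, and the helper name `scan_reference`, the input ranges, and the sequence length are all hypothetical choices made for this example.

```python
import numpy as np

def scan_reference(gates, tokens, acc_dtype):
    """Sequential first-order scan (assumed form): h[t] = gates[t] * h[t-1] + tokens[t].

    Inputs keep their storage dtype; only the running state uses acc_dtype.
    """
    h = np.zeros((), dtype=acc_dtype)
    out = np.empty(len(tokens), dtype=acc_dtype)
    for t in range(len(tokens)):
        # Cast each input element to the accumulator dtype before the update.
        h = gates[t].astype(acc_dtype) * h + tokens[t].astype(acc_dtype)
        out[t] = h
    return out

# Hypothetical half-precision inputs, with gates close to 1 so state persists.
rng = np.random.default_rng(0)
n = 4096
gates = rng.uniform(0.9, 1.0, n).astype(np.float16)
tokens = rng.standard_normal(n).astype(np.float16)

# float64 accumulation serves as the reference for measuring error.
exact = scan_reference(gates, tokens, np.float64)
err16 = np.abs(scan_reference(gates, tokens, np.float16) - exact).max()
err32 = np.abs(scan_reference(gates, tokens, np.float32) - exact).max()
print(f"max abs error, float16 accumulation: {err16:.3e}")
print(f"max abs error, float32 accumulation: {err32:.3e}")
```

In this sketch the inputs stay in float16 in both runs, so any difference in error comes only from the precision of the running state. A parallel kernel combines elements in a different order, so its error behaviour may differ from this sequential loop.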
@proger Awesome work! Always appreciate the wonderful contributions of OSS advancing the frontiers of research. I know you've done a number of experiments comparing various scan implementations in your other...