Jay Zhuang

Results 56 comments of Jay Zhuang

Hi @cheshmi! I implemented the sequential, levelset and lbc versions of csr_ltsolve, and verified the lbc executor with HDagg partitioning. Let me post the executor kernels here. If they...
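For readers unfamiliar with the levelset approach mentioned above, here is a minimal sketch of how a level schedule can be derived from a lower-triangular CSR pattern. It is not the code from the comment: the function name `compute_levels_sketch` and the use of `std::vector` are assumptions made for illustration. Rows that share a level have no dependencies on each other, so an executor could process them in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the sympiler repo): assign each row of a
// lower-triangular CSR matrix to a level. A row with no off-diagonal entries
// gets level 0; otherwise it gets 1 + the largest level among the rows it
// depends on (the columns j < i that appear in row i).
std::vector<int> compute_levels_sketch(int n,
                                       const std::vector<int>& rowptr,
                                       const std::vector<int>& colidx) {
    assert(static_cast<int>(rowptr.size()) == n + 1);
    std::vector<int> level(n, 0);
    for (int i = 0; i < n; ++i) {
        for (int p = rowptr[i]; p < rowptr[i + 1]; ++p) {
            const int j = colidx[p];
            if (j < i) {  // strictly lower entry: row i needs x[j] first
                level[i] = std::max(level[i], level[j] + 1);
            }
        }
    }
    return level;
}
```

Because row `i` only references columns `j < i`, a single forward pass is enough: every `level[j]` it reads is already final.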

Is it better to send pull requests to this sympiler repo or to the [aggregation](https://github.com/sympiler/aggregation) repo? Both have sptrsv implementations ([this](https://github.com/sympiler/sympiler/blob/master/sparse_blas/sptrsv.cpp) and [that](https://github.com/sympiler/aggregation/blob/master/example/sptrsv_src/sptrsv.cpp)) and associated unit tests ([this](https://github.com/sympiler/sympiler/blob/master/Catch_tests/sptrsv_tests.cpp) and [that](https://github.com/sympiler/aggregation/blob/master/Catch_tests/sptrsv_tests.cpp))....

For completeness, I also implemented the CSR `usolve`. The sequential version is the same as CSC [`ltsolve`](https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/stable/CSparse/Source/cs_ltsolve.c). The parallel version shares the same level schedule with `lsolve` and `ltsolve` (more like...
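The equivalence noted above follows from the storage layout: the CSC arrays of a lower-triangular `L` are exactly the CSR arrays of the upper-triangular `U = L^T`. A minimal sketch of a sequential in-place CSR upper solve, modeled on the loop structure of CSparse's `cs_ltsolve`, might look like the following. The name `usolve_csr_sketch` and the `std::vector` signature are assumptions, and the sketch assumes the diagonal entry is stored first in each row, as `cs_ltsolve` does for each column.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: solve U x = b in place for an upper-triangular CSR
// matrix U. On entry x holds b; on exit x holds the solution.
// Assumption: the diagonal entry is the first stored entry of every row.
void usolve_csr_sketch(int n,
                       const std::vector<int>& rowptr,
                       const std::vector<int>& colidx,
                       const std::vector<double>& val,
                       std::vector<double>& x) {
    for (int i = n - 1; i >= 0; --i) {  // backward substitution
        const int diag = rowptr[i];
        assert(colidx[diag] == i && val[diag] != 0.0);
        double sum = x[i];
        for (int p = diag + 1; p < rowptr[i + 1]; ++p) {
            sum -= val[p] * x[colidx[p]];  // x[colidx[p]] is already solved
        }
        x[i] = sum / val[diag];
    }
}
```

For example, with `U = [[2, 1], [0, 4]]` and `b = [4, 8]`, the call overwrites `x` with `[1, 2]`.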

> I hope the new parallel code makes your code faster. Please let me know if it is not the case.

`usolve` takes similar time as `lsolve`, and sometimes even...

FYI, I am able to run this code repo using `torch==2.1.2+cu121` (current stable release, not nightly), by just commenting out `torch._inductor.config.fx_graph_cache = True` [in generate.py](https://github.com/pytorch-labs/gpt-fast/blob/c5d345462c8d455c7c5f0fc96d175d4e2142501e/generate.py#L18) which is not available in...

This should solve the problem😄 https://github.com/huggingface/transformers/issues/28075 https://github.com/huggingface/transformers/pull/27931

Sorry, I think the above implementations are wrong. Their results depend on the initial values of array `x`, which doesn't make sense. The correct one should be: ```cpp void sptrsv_csr(int...
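The corrected code in the comment is truncated here, so the following is not a reconstruction of it. It is only a minimal sketch of the property being described: an out-of-place CSR lower solve in which every `x[i]` is written before it is read, so the result cannot depend on what `x` held on entry. The name `lower_solve_csr_sketch`, the signature, and the assumption that the diagonal is the last stored entry of each row are all illustrative choices.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: solve L x = b out of place for a lower-triangular CSR
// matrix L. x is treated purely as output; its initial contents are ignored.
// Assumption: the diagonal entry is the last stored entry of every row.
void lower_solve_csr_sketch(int n,
                            const std::vector<int>& rowptr,
                            const std::vector<int>& colidx,
                            const std::vector<double>& val,
                            const std::vector<double>& b,
                            std::vector<double>& x) {
    assert(static_cast<int>(x.size()) >= n);
    for (int i = 0; i < n; ++i) {  // forward substitution
        const int diag = rowptr[i + 1] - 1;
        assert(colidx[diag] == i && val[diag] != 0.0);
        double sum = b[i];  // start from b, never from the old x[i]
        for (int p = rowptr[i]; p < diag; ++p) {
            sum -= val[p] * x[colidx[p]];  // columns < i were written earlier
        }
        x[i] = sum / val[diag];
    }
}
```

Calling it twice with the same `L` and `b` but differently initialized `x` arrays yields identical results, which is a quick check for the kind of bug described above.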

Also implemented the out-of-place version of parallel levelset and LBC executors, and with the [`usolve` variant](https://github.com/sympiler/sympiler/issues/7#issuecomment-1352999869). **All verified correctness against the sequential in-place version.** Expand code: ```cpp void sptrsv_csr_usolve(int n,...

> I plan to include these in/out place versions in the sparse BLAS.

Sure, please feel free to take those. As it involves organizing similar functions' naming and signature, should...

> I can give you an example.

Great, thanks very much!