Yan Xu issues

Results 84 issues of Yan Xu

To support aten::copy_ operator

TorchDisc

To support aten::_softmax operator

TorchDisc

To support aten::zero_ operator

TorchDisc

Survey nvfuser to complete the PyTorch training acceleration design doc

training

Add DiscCache in TorchDisc to cache dynamic shape executable object

feature

TorchDisc
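A minimal sketch of the caching idea named in the DiscCache title. It assumes a dynamic-shape executable depends only on each input's dtype and rank, not on concrete dimension sizes, so those form the cache key. `ExecutableCache`, `compile_fn`, and the key layout are hypothetical names for illustration, not the TorchDisc API.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical cache key: (dtype name, rank) per input.
# Concrete dimension sizes are deliberately left out of the key.
Signature = Tuple[Tuple[str, int], ...]


class ExecutableCache:
    """Illustrative cache holding one compiled object per input signature."""

    def __init__(self, compile_fn: Callable[[Signature], object]) -> None:
        self._compile_fn = compile_fn
        self._entries: Dict[Signature, object] = {}
        self.hits = 0
        self.misses = 0

    def get(self, inputs: Sequence[Tuple[str, Sequence[int]]]) -> object:
        """Return the cached executable, compiling only on a new signature."""
        key: Signature = tuple((dtype, len(shape)) for dtype, shape in inputs)
        if key in self._entries:
            self.hits += 1
        else:
            self.misses += 1
            self._entries[key] = self._compile_fn(key)
        return self._entries[key]
```

Under this assumption, two calls with shapes `[2, 3]` and `[8, 5]` of the same dtype share one entry, while a rank-3 input triggers a new compilation.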

Support Scalar type as the Disc module input/output type

For a module to be compilable by BladeDISC, its inputs and outputs must be of Tensor type. This works well in the TensorFlow world but is insufficient in the PyTorch world, because a considerable number of inputs/outputs is...

need discussion

Speed up CI for markdown-only pull requests

The CI system can skip the build and test step if a pull request contains markdown files only.
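The skip condition can be sketched as a check over the list of changed paths. This is an illustration only: `is_markdown_only` and the suffix set are hypothetical, and how the changed-file list is obtained (for example from the CI provider or a git diff) is left out.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Assumed set of suffixes that count as markdown.
MARKDOWN_SUFFIXES = {".md", ".markdown"}


def is_markdown_only(changed_files: Iterable[str]) -> bool:
    """Return True when every changed path is a markdown file."""
    files = list(changed_files)
    # An empty change list is treated as "not markdown only" so CI still runs.
    if not files:
        return False
    return all(
        PurePosixPath(path).suffix.lower() in MARKDOWN_SUFFIXES for path in files
    )
```

A CI script could call this first and exit early before the build and test steps when it returns `True`.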

Enhance shape analysis pass

This PR adds shape analysis for some ops. To make this pass more stable, it is better to add a unit test that checks the static shape and dynamic shape for a new...

Add scalar reduction codegen schedule

Add a scalar-reduction codegen template; the algorithm comes from https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf
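As a rough illustration of a tree-style scalar reduction, the sketch below simulates one thread block in plain Python. It assumes the scheme meant is the shared-memory, sequential-addressing tree reduction, and it is not the codegen template from this PR. `tree_reduce_sum` is a hypothetical name.

```python
from typing import List


def tree_reduce_sum(values: List[float]) -> float:
    """Sum values with a log-step tree, mimicking one thread block."""
    if not values:
        return 0.0
    # Pad to a power of two with the additive identity, like a fixed block size.
    size = 1
    while size < len(values):
        size *= 2
    shared = list(values) + [0.0] * (size - len(values))
    stride = size // 2
    while stride > 0:
        # Each simulated thread tid < stride adds the element stride slots away.
        for tid in range(stride):
            shared[tid] += shared[tid + stride]
        stride //= 2
    return shared[0]
```

On a GPU the inner loop runs in parallel across threads with a barrier between strides; here it runs sequentially, which gives the same result because each step reads and writes disjoint slots.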

support collective operators

To optimize distributed training graphs (DP, FSDP), DISC needs to support collective ops as a preliminary step:
- [ ] support collective ops compilation and execution (all_reduce, all_gather, reduce_scatter) @Yancey1989...
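For readers unfamiliar with the three collectives named above, their semantics can be sketched in a single process, with one Python list standing in for each rank's buffer. This is a reference model with sum as the reduction, not DISC's implementation; the function names mirror the op names but the signatures are assumptions.

```python
from typing import List

Buffers = List[List[float]]  # one equally sized buffer per rank


def all_reduce(buffers: Buffers) -> Buffers:
    """Every rank ends up with the elementwise sum over all ranks."""
    total = [sum(column) for column in zip(*buffers)]
    return [list(total) for _ in buffers]


def all_gather(buffers: Buffers) -> Buffers:
    """Every rank ends up with all buffers concatenated in rank order."""
    gathered = [value for buffer in buffers for value in buffer]
    return [list(gathered) for _ in buffers]


def reduce_scatter(buffers: Buffers) -> Buffers:
    """Rank r ends up with the r-th chunk of the elementwise sum.

    Assumes the buffer length is divisible by the number of ranks.
    """
    world_size = len(buffers)
    total = [sum(column) for column in zip(*buffers)]
    chunk = len(total) // world_size
    return [total[r * chunk:(r + 1) * chunk] for r in range(world_size)]
```

In this model, applying `reduce_scatter` and then `all_gather` reproduces the result of `all_reduce`.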