Tian, Feng comments

Results: 52 comments by Tian, Feng

Noise in NMS threshold filtering (GetMaxScoreIndex)

I have made the corresponding changes locally to pass the tests. Your PR will be merged in the next release. Thanks.

Faster processing speed using Intel caffe

You need to upgrade your machine to a Xeon server for faster speed.

support autoTP with weight only quantization in DS inference path

> @ftian1 if accelerator other than CUDA want to support AutoTP WOQ, which set of OpBuilder/kernels needs to be implemented? Can you provide a link to kernel usage in the...

[BUG] The latest master code doesn't work with pydantic 2.0a2

PS: after manually switching the pydantic version from 2.0a2 to 1.10.7, the latest code works. I had a quick look at it: it's because pydantic 2.0a2 has removed \_\_field\_\_ attributes in...

Add snip_momentum structured pruning which supports higher sparse ratio

@microsoft-github-policy-service agree company="Intel"

Add snip_momentum structured pruning which supports higher sparse ratio

@xiaoxiawu-microsoft Sorry for the late response due to the PRC holiday, and thanks for your review. I have fixed the yapf scan issue, but in my local environment, the detection of destroyed...

Add snip_momentum structured pruning which supports higher sparse ratio

@xiaoxiawu-microsoft Those pre-CI errors are not related to my changes. Could you please take a look?

[REQUEST] Add more device-agnostic compression algorithms

@yaozhewei Thanks for the valuable feedback. We are evaluating whether we can remove or enhance the callbacks as you suggested, and will get back to you soon. As for post-training quantization support,...

[REQUEST] Add more device-agnostic compression algorithms

@yaozhewei Per our investigation, it is feasible to remove those explicit callbacks. We are preparing a PR for further review and will ping you when it's ready. As for calibration based...

[REQUEST] Add more device-agnostic compression algorithms

>> For the quantization proposal, post-training quantization is in some sense already implemented. Users can use the static activation quantization method with a few batches of inference to get the calibration. See here...
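
For readers unfamiliar with the idea, calibration with a few batches can be sketched roughly as follows. This is a generic min/max illustration in plain Python, not the DeepSpeed implementation discussed in the thread; the function names (`calibrate_range`, `affine_qparams`, `quantize`, `dequantize`) and the 8-bit unsigned scheme are assumptions made for the example.

```python
from math import inf
from typing import Iterable, Sequence, Tuple


def calibrate_range(batches: Iterable[Sequence[float]]) -> Tuple[float, float]:
    """Track the min and max activation seen over a few calibration batches."""
    lo, hi = inf, -inf
    for batch in batches:
        for x in batch:
            lo = min(lo, x)
            hi = max(hi, x)
    return lo, hi


def affine_qparams(lo: float, hi: float, num_bits: int = 8) -> Tuple[float, int]:
    """Derive a fixed (static) scale and zero point from the calibrated range."""
    qmax = 2 ** num_bits - 1
    # Widen the range so that 0.0 is always exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / qmax
    zero_point = int(round(-lo / scale))
    return scale, max(0, min(qmax, zero_point))


def quantize(x: float, scale: float, zero_point: int, num_bits: int = 8) -> int:
    """Map a float to an unsigned integer, clamping to the valid range."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(2 ** num_bits - 1, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


# Hypothetical activations collected from two small calibration batches.
calib_batches = [[-1.0, 0.5], [2.0, 0.0]]
lo, hi = calibrate_range(calib_batches)
scale, zero_point = affine_qparams(lo, hi)
```

Because the scale and zero point are fixed after calibration, inference does not need to recompute activation statistics per batch, which is what makes the scheme "static".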