Lu Fang

Results 9 issues of Lu Fang

https://github.com/pytorch/pytorch/issues/2892

Summary: Without weight lists, the script is broken. The issue was probably introduced in D30824731 (https://github.com/pytorch/FBGEMM/commit/10c3f6a4f9238ffad9e30479f959f5f5a452d388). Differential Revision: D31436538

fb-exported
cla signed

Otherwise, the GPU build and tests are not covered.

This is for testing, please don't merge. To run it, please build your Caffe2 with -DUSE_ATEN=ON

Follow the implementation in vllm/entrypoints/openai/logits_processors.py. The idea is straightforward: add a [batch_size x vocab_size] mask tensor and use a list of bools to determine whether to do the in-place masked...
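
A minimal sketch of the masking idea described above, assuming PyTorch. It is not the code from the pull request: the function name `mask_disallowed_tokens` and the `allowed_token_ids` layout (one list of token ids per request, or `None` for no restriction) are hypothetical and chosen only for illustration.

```python
import torch


def mask_disallowed_tokens(logits, allowed_token_ids):
    """Mask logits in place so that only allowed tokens stay finite.

    logits: float tensor of shape [batch_size, vocab_size].
    allowed_token_ids: one entry per request; a list of token ids, or
        None when the request has no restriction (hypothetical layout).
    """
    batch_size, vocab_size = logits.shape
    # One bool per request: does this row need masking at all?
    needs_mask = [ids is not None for ids in allowed_token_ids]
    if not any(needs_mask):
        return logits  # fast path: skip the in-place fill entirely

    # True marks a position that will be filled with -inf.
    mask = torch.zeros(
        batch_size, vocab_size, dtype=torch.bool, device=logits.device
    )
    for row, ids in enumerate(allowed_token_ids):
        if ids is not None:
            mask[row] = True        # block every token in this row...
            mask[row, ids] = False  # ...except the allowed ones
    logits.masked_fill_(mask, float("-inf"))
    return logits


# Request 0 may only produce tokens 1 and 3; request 1 is unrestricted.
logits = torch.zeros(2, 5)
mask_disallowed_tokens(logits, [[1, 3], None])
```

Note the polarity: `masked_fill_` writes the fill value wherever the mask is `True`, so `True` has to mean "disallowed" here. Getting that backwards masks exactly the tokens that should survive.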

v1

Black uses a default line length of 88 characters; the reasoning is explained here: https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#line-length. To be compatible with Black, using 88 seems a wise choice.

Swap the ones and zeros so that masked_fill_ is applied appropriately.

v1

It may improve the qk norm efficiency.

TODO
- [ ] Polish the code
- [ ] Accuracy test
- [ ] Perf test

needs-rebase

### 🚀 The feature, motivation and pitch

For simple models, we may not need fusion from torch.compile, and the piecewise approach may be slow. So we would like to enable this...

feature request
torch.compile