Yang Yang comments

Results: 5 comments by Yang Yang

RNN-T: enable batch decoder

Hi, I suggest holding off on the integration because of the issue below. This PR aims to provide a batched version of the decoder so that the model can run end-to-end inference under...

Enable Intel GPU

Converting to draft for now, pending [#129919](https://github.com/pytorch/pytorch/pull/129919) being ready.

> @dbyoung18, may I know why the change is [torchao/_models/llama/generate.py](https://github.com/pytorch/ao/pull/753/files#diff-608f69e373105b539411379f7c1930589b600fd0f0c89ce5890934eda727b233) only?

Hi, @EikanWang. We plan to gradually support torch-ao on Intel GPU with different models (llama2, llama3, sam, etc.) and...

Enable Intel GPU

> @dbyoung18 does this one support int4 woq?

Currently, it doesn't support int4 woq on Intel GPU. We are working upstream to support the INT4 XPU backend in...

Enable Intel GPU

Closed as a duplicate of PR [ao#1259](https://github.com/pytorch/ao/pull/1259). Thanks for the review comments above.