
Tutorial "run_vit_b_quant.py" cannot run

Open relaxtheo opened this issue 1 year ago • 13 comments

Running the tutorial "run_vit_b_quant.py" reports the following error in 'benchmark_model(model, 20, inputs)': torch._dynamo.exc.InternalTorchDynamoError: 'PlainAQTLayout' object has no attribute 'layout_type'

What's the potential reason for this? I hope someone can give me some advice, thank you.
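
For reference, a minimal sketch of the flow that triggers the error, assuming the tutorial's quantize-then-compile structure; the exact model setup and the benchmark_model import path below are assumptions, not a copy of the tutorial:

```python
# Minimal sketch, assuming the tutorial's quantize-then-compile structure.
# The model setup and the benchmark_model import path are assumptions.
import torch
import torchvision.models.vision_transformer as models
from torchao.quantization import quantize_, int8_weight_only
from torchao.utils import benchmark_model  # assumed import location

model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
model = model.eval().to(device="cuda", dtype=torch.bfloat16)
inputs = (torch.randn(1, 3, 224, 224, dtype=torch.bfloat16, device="cuda"),)

# Quantize the weights in place, then compile the model.
quantize_(model, int8_weight_only())
model = torch.compile(model, mode="max-autotune")

# The InternalTorchDynamoError is raised from inside this call.
benchmark_model(model, 20, inputs)
```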

relaxtheo avatar Aug 14 '24 13:08 relaxtheo

@jerryzh168 I can repro this failure on both torch stable and torch nightlies: https://gist.github.com/msaroufim/2e764de83ab98f337e8e0d093c08f705

Can we fix it and make this example a test as well?

msaroufim avatar Aug 14 '24 16:08 msaroufim

will take a look

jerryzh168 avatar Aug 14 '24 16:08 jerryzh168

I did run into this error in the beginning, but after I uninstalled and reinstalled torchao it works. I ran with torchao nightly against both torch 2.4.0 and torch nightly, and both seem to work.

jerryzh168 avatar Aug 14 '24 20:08 jerryzh168

After uninstalling and reinstalling torchao with pip, I still have the same problem.

My environment: torch 2.4.0+cu121, torchao 0.4.0

relaxtheo avatar Aug 15 '24 08:08 relaxtheo

Yeah, I can repro the issue with the same torch and torchao versions, but it seems to be fixed in torchao nightly. We are doing a 0.5 release in 3 weeks (https://github.com/pytorch/ao/issues/667), so it probably doesn't make sense to patch the 0.4 version now. I'm wondering whether this surfaced in the 0.4 testing. cc @jcaip @msaroufim

jerryzh168 avatar Aug 17 '24 04:08 jerryzh168

I faced this problem a few times before. I also faced it again when trying to do

quantize_(model, torch.compile(int8_weight_only()))

What worked for me was to add @torch._dynamo.disable to __new__() and __init__() of the tensor subclass, following #544. Adding it to __torch_dispatch__ doesn't seem necessary.

https://github.com/pytorch/ao/blob/b523f9f9e15b6fb80d10f585d9cf45e0c5e4d10e/torchao/prototype/quantized_training/int8.py#L29-L40

If there are no downsides, maybe we should add this decorator to all tensor subclasses?
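
For reference, a minimal sketch of that workaround pattern; the class and field names below are illustrative, not the actual torchao subclass:

```python
# Sketch of the workaround: keep dynamo from tracing the tensor subclass's
# constructors by decorating __new__ and __init__ with torch._dynamo.disable.
# MyInt8Tensor and its fields are illustrative names, not torchao's actual class.
import torch

class MyInt8Tensor(torch.Tensor):
    @staticmethod
    @torch._dynamo.disable
    def __new__(cls, int_data, scale):
        # Wrapper subclass that stores quantized data and scale as attributes.
        return torch.Tensor._make_wrapper_subclass(
            cls, int_data.shape, dtype=scale.dtype, device=int_data.device
        )

    @torch._dynamo.disable
    def __init__(self, int_data, scale):
        self.int_data = int_data
        self.scale = scale
```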

gau-nernst avatar Aug 17 '24 09:08 gau-nernst

@jerryzh168 All I did was run pytest/test and run the API examples mentioned in the doc, so it's entirely possible that this slipped through if there wasn't a test. Could we add a step to the runbook to run the tutorials?

Although I think the better solution is to add a test, maybe just a generic one that runs all the .py files in the tutorial folder and makes sure there are no obvious errors, as sketched below.
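
For example, a rough sketch of such a test; the tutorials/ path, test location, and subprocess-per-script approach are assumptions:

```python
# Rough sketch of a generic tutorial smoke test: run every .py file under a
# tutorials/ directory in a subprocess and fail on a non-zero exit code.
# The directory layout and test location are assumptions.
import pathlib
import subprocess
import sys

import pytest

TUTORIAL_DIR = pathlib.Path(__file__).resolve().parents[1] / "tutorials"

@pytest.mark.parametrize(
    "script", sorted(TUTORIAL_DIR.rglob("*.py")), ids=lambda p: p.name
)
def test_tutorial_runs(script):
    result = subprocess.run(
        [sys.executable, str(script)], capture_output=True, text=True
    )
    assert result.returncode == 0, result.stderr
```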

jcaip avatar Aug 19 '24 20:08 jcaip

> I faced this problem a few times before. I also faced it again when trying to do
>
> quantize_(model, torch.compile(int8_weight_only()))
>
> What worked for me was to add @torch._dynamo.disable to __new__() and __init__() of the tensor subclass, following #544. Adding it to __torch_dispatch__ doesn't seem necessary.
>
> https://github.com/pytorch/ao/blob/b523f9f9e15b6fb80d10f585d9cf45e0c5e4d10e/torchao/prototype/quantized_training/int8.py#L29-L40
>
> If there are no downsides, maybe we should add this decorator to all tensor subclasses?

Oh, actually I don't think we expect people to use our API in this way. What would be the benefit of compiling int8_weight_only()? Does it make quantization run faster?

jerryzh168 avatar Aug 20 '24 17:08 jerryzh168

> @jerryzh168 All I did was run pytest/test and run the API examples mentioned in the doc, so it's entirely possible that this slipped through if there wasn't a test. Could we add a step to the runbook to run the tutorials?

I see. I think we can add it to the release runbook; some of the tests would require models to be downloaded, so they're probably not easy to run in CI.

jerryzh168 avatar Aug 20 '24 17:08 jerryzh168

@jerryzh168 we can download models in CI; we have Hugging Face credentials for that built in. I kinda agree that if this is an important enough tutorial, it should be tested. I'd rather not make the release process more manual, since we already do releases monthly, which is quite frequent.

msaroufim avatar Aug 20 '24 17:08 msaroufim

> > @jerryzh168 All I did was run pytest/test and run the API examples mentioned in the doc, so it's entirely possible that this slipped through if there wasn't a test. Could we add a step to the runbook to run the tutorials?
>
> I see. I think we can add it to the release runbook; some of the tests would require models to be downloaded, so they're probably not easy to run in CI.

I see, maybe it would make sense to make this a button? Downloading large models on each commit seems a bit overkill.

jerryzh168 avatar Aug 20 '24 17:08 jerryzh168

GitHub Actions workflows can be triggered by workflow_dispatch (a button) or a cron schedule. If you need any advice setting that up, lmk, but ChatGPT should be able to give you a fully functional snippet that does this.

msaroufim avatar Aug 20 '24 20:08 msaroufim

> Oh, actually I don't think we expect people to use our API in this way. What would be the benefit of compiling int8_weight_only()? Does it make quantization run faster?

Yeah, ignore the compiling int8_weight_only() part. I was just testing it to try to improve quantization speed (in the past, when I worked on FP6-LLM, the speedup was quite significant on both CPU and GPU, but we don't have a way to expose it right now). Anyway, it's a separate issue.

My point is that usually, when errors similar to the OP's torch._dynamo.exc.InternalTorchDynamoError: 'PlainAQTLayout' object has no attribute 'layout_type' happen, adding torch._dynamo.disable to __new__ and __init__ will solve it.

gau-nernst avatar Aug 21 '24 06:08 gau-nernst