Bing Xu comments

Results: 72 comments of Bing Xu

fix resnet50 example

This change will break the AMD MI250 benchmark. A better way is to re-enable hint-based dynamic shape support in this version at compile time, then at benchmark time we can...

Any plans for official releases?

We will make it this week after making the v0.1 tag. On Mon, Oct 10, 2022 at 20:33 Mark Saroufim ***@***.***> wrote: > See title, would be nice to pip...

Upgrade compiler to ROCM 5.3 version

Thanks @illsilin! Please sign the CLA. Also could you sign off on the performance & correctness on all examples?

Memory usage increases when StableDiffusionAITPipeline is run repeatedly.

This sounds like it could be a torch caching allocator issue. Will investigate today. Thanks for letting us know.

Memory usage increases when StableDiffusionAITPipeline is run repeatedly.

https://github.com/facebookincubator/AITemplate/pull/43

Memory usage increases when StableDiffusionAITPipeline is run repeatedly.

According to https://github.com/facebookincubator/AITemplate/pull/43 it looks like CUDA graph caused a CPU memory leak. While we debug whether it is caused by the AIT side or the CUDA Graph side, we can disable...

Memory usage increases when StableDiffusionAITPipeline is run repeatedly.

@mikeiovine fixed the bug. Will do a sync today to fix this issue.

can we get stable diffusion example work with xformers?

For 512x512, AIT is running at 42 it/s vs 27 it/s in this PR, so I don't think we need to support xformers. Looping in @terrychenism for 1024x1024 generation.

can we get stable diffusion example work with xformers?

If it runs out of memory, we may consider setting the UNet batch size to 1 and running it twice in each step. This will save a lot of memory.
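The batch-size-1 idea can be sketched roughly as follows. This is only an illustration, not AITemplate code: it assumes the batch of 2 comes from classifier-free guidance (one unconditional and one text-conditioned sample per step), and `fake_unet`, `guided_noise_batched`, and `guided_noise_split` are hypothetical names with a toy stand-in for the real UNet.

```python
# Hypothetical stand-in for a UNet forward pass (NOT the AITemplate API):
# takes a batch of latents and a batch of scalar "embeddings" and returns
# one noise prediction per batch element.
def fake_unet(latents, embeddings):
    return [[x * 0.5 + e for x in lat] for lat, e in zip(latents, embeddings)]


def guided_noise_batched(unet, latent, uncond_emb, text_emb, scale):
    # One UNet call with batch size 2: [unconditional, conditional].
    uncond, text = unet([latent, latent], [uncond_emb, text_emb])
    return [u + scale * (t - u) for u, t in zip(uncond, text)]


def guided_noise_split(unet, latent, uncond_emb, text_emb, scale):
    # Two UNet calls with batch size 1 each. Only one sample's activations
    # are live at a time, at the cost of a second forward pass per step.
    (uncond,) = unet([latent], [uncond_emb])
    (text,) = unet([latent], [text_emb])
    return [u + scale * (t - u) for u, t in zip(uncond, text)]


latent = [1.0, 2.0]
batched = guided_noise_batched(fake_unet, latent, 0.0, 1.0, 7.5)
split = guided_noise_split(fake_unet, latent, 0.0, 1.0, 7.5)
print(batched == split)
```

Both paths compute the same guided prediction. The split version trades throughput for lower peak memory, which is the trade-off the comment describes.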

automatically parsing Pytorch module to AIT module, not define AIT module layer by layer

It is not open sourced in this release. There is an FX2AIT project that will probably be open sourced in the future. On Tue, Oct 4, 2022 at 05:49 Lee kwang...