[TRACKER] V-Diffusion-PyTorch Model
Hi
I am working on adding support for the v-diffusion-pytorch model via torch-mlir. So far, a simplified version of the model compiles successfully via torch-mlir, but the complete version does not. It fails with the following error:
Traceback (most recent call last):
File "v_diffusion_with_sampling.py", line 127, in <module>
module = torch_mlir.compile(
File "/home/vivek/work/02_07/vivekkhandelwal1-SHARK/shark.venv/lib/python3.8/site-packages/torch_mlir/__init__.py", line 212, in compile
mb.import_module(scripted._c, class_annotator, import_options)
ValueError: Unhandled tensor that shares storage with another tensor.
Found at path '<root>._tensor_constant0' from root object '__torch__.torch.fx.graph_module.model_inference'
The model is available at https://github.com/nod-ai/SHARK/tree/main/tank/pytorch/v_diffusion. To test or work on the model, please follow the installation instructions available at: https://github.com/nod-ai/SHARK/tree/main/tank/pytorch/v_diffusion#installation
To run the PyTorch version of the model:
1.) Run the following command:
./v-diffusion-pytorch/cfg_sample.py "the rise of consciousness":5 -n 5 -bs 5 --seed 0
To compile the simplified version of the model via torch-mlir:
1.) Run the script v_diffusion.py, with the following command:
python v_diffusion.py 2> v_diffusion_ir.mlir
To reproduce the error:
1.) Run the script v_diffusion_with_sampling.py, with the following command:
python v_diffusion_with_sampling.py
The above error appears to be caused by this line of code: https://github.com/crowsonkb/v-diffusion-pytorch/blob/master/diffusion/models/cc12m_1.py#L91. If I comment out that part of the code and return a dummy tensor instead, the result is a segmentation fault.
@silvasean Your suggestions are welcome!
This is due to having two tensors in the program that share storage but aren't the exact same tensor (e.g. they are two different views of the same underlying tensor). You can make the logic here smarter, but it could get quite complicated: https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/python/torch_mlir/dialects/torch/importer/jit_ir/csrc/ivalue_importer.cpp#L224. We have a minimal test here: https://github.com/llvm/torch-mlir/blob/main/test/python/importer/jit_ir/ivalue_import/object-identity-error.py#L17. I would recommend first identifying the two tensors that share storage and seeing how they are related. Is one a view of the other? Etc. Based on that, you then need to make ivalue_importer.cpp smart enough to recognize their relationship and emit them correctly (e.g. emit a "view" op to represent one in terms of the other).
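As a rough illustration of "identifying the two tensors that share storage", here is a minimal sketch in plain PyTorch. It is not torch-mlir code; the helper names `storage_ptr` and `shares_storage` are made up for this example, and the toy tensors stand in for the model's real attributes.

```python
import torch


def storage_ptr(t: torch.Tensor) -> int:
    """Address of the start of the storage backing `t` (hypothetical helper)."""
    # untyped_storage() exists in PyTorch 2.x; fall back to storage() on older releases.
    if hasattr(t, "untyped_storage"):
        return t.untyped_storage().data_ptr()
    return t.storage().data_ptr()


def shares_storage(a: torch.Tensor, b: torch.Tensor) -> bool:
    """True when both tensors are backed by the same underlying storage."""
    return storage_ptr(a) == storage_ptr(b)


base = torch.arange(6.0)
view = base[2:]        # a view: same storage as `base`, offset by 2 elements
cloned = base.clone()  # a copy: gets its own storage

print(shares_storage(base, view))    # True
print(shares_storage(base, cloned))  # False

# How the view relates to the base: its offset and strides within the shared storage.
print(view.storage_offset(), tuple(view.stride()))

# The view can be expressed in terms of the base, which is the kind of
# relationship an importer would have to recognize and emit.
rebuilt = torch.as_strided(base, view.size(), view.stride(), view.storage_offset())
print(torch.equal(rebuilt, view))    # True
```

Running a check like this over the module's tensor attributes should show which pair triggers the error and whether one is simply a strided view of the other.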
The current blocker for this model is adding support for multiple indexing tensors in aten.index.Tensor with dynamic dimensions. I opened an issue discussing adding support for this case at #1226.
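For anyone unfamiliar with the "multiple indexing tensors" case, here is a small sketch of what it looks like in eager PyTorch. The tensors are toy values chosen for illustration, not taken from the model, and the flat-index emulation at the end is just one way to think about the semantics, not how torch-mlir lowers the op.

```python
import torch

x = torch.arange(12).reshape(3, 4) * 10

# Two index tensors that broadcast against each other: (2, 1) and (1, 2) -> (2, 2).
rows = torch.tensor([[0], [2]])
cols = torch.tensor([[1, 3]])

# Python-level advanced indexing with more than one index tensor.
picked = x[rows, cols]

# The same operation through the ATen op named in this thread.
via_op = torch.ops.aten.index.Tensor(x, [rows, cols])

# Emulation: broadcast the indices, turn them into offsets into the flattened tensor.
flat = rows * x.shape[1] + cols
emulated = x.reshape(-1)[flat]

print(picked)  # tensor([[ 10,  30], [ 90, 110]])
print(torch.equal(picked, via_op), torch.equal(picked, emulated))
```

The result shape comes from broadcasting the index tensors together, which is why the dynamic-dimension case is harder: the output shape is not known until the index shapes are.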
So is the blocker the "Unhandled tensor that shares storage with another tensor" error or aten.index.Tensor?
Ah sorry I'm not up to date on the former error. @vivekkhandelwal1 can comment on whether that has been unblocked.
So is the blocker the "Unhandled tensor that shares storage with another tensor" error or aten.index.Tensor?
aten.index.Tensor
Are there any remaining issues with V-Diffusion?
No
Are there any end-to-end examples for V-Diffusion?
It's available here: https://github.com/nod-ai/SHARK/tree/main/tank/pytorch/v_diffusion_pytorch