sigma

11 comments by sigma

> Thanks for your work. Using the script in this repository, I generated the ground truth for KITTI raw in bird's-eye view. However, the ground truth generated in...

@Helios77760 Thanks for your code, but when I run this script, I get this error: "Cannot copy param 0 weights from layer 'PReLU_1'; shape mismatch. Source param shape is (1); target...

Thanks. To convert the weights, do I need to load the caffemodel, replace the PReLU layers, and then save it again?

Hello, the link to this data no longer works. Could you share it again? Thanks.

Thank you for your explanation. I got it.

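This is a bit odd; there are two things I haven't figured out: 1. Why are the positional encodings for text all 0? 2. `_get_positional_embeddings` adds sincos positional encodings to the video, and `CogVideoXAttnProcessor2_0` then applies RoPE positional encoding to the video part of the query and key. Why is positional encoding added to the video twice?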
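To make the second point concrete, here is a minimal pure-Python sketch of applying a rotary embedding only to the video slice of a concatenated [text, video] sequence. It is not the diffusers implementation: the function names `rope_rotate` and `apply_rope_to_video_part` are hypothetical, and it uses a simplified 1D position index per video token rather than the 3D layout a video model would use. One observation that follows from the math alone: leaving a token unrotated is the same as rotating it at position 0, since a rotation by angle 0 is the identity.

```python
import math


def rope_rotate(vec, pos, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by the angle
    pos * base**(-i / d), the usual rotary-embedding frequency schedule."""
    d = len(vec)
    assert d % 2 == 0, "rotary embedding needs an even feature dimension"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def apply_rope_to_video_part(tokens, text_len):
    """tokens is laid out as [text tokens..., video tokens...] (assumed layout).
    Text tokens pass through unchanged; video token j is rotated at position j."""
    text = [list(t) for t in tokens[:text_len]]
    video = [rope_rotate(v, j) for j, v in enumerate(tokens[text_len:])]
    return text + video


# Hypothetical toy input: 2 text tokens followed by 2 video tokens, dim 4.
query = [[1.0, 0.0, 1.0, 0.0] for _ in range(4)]
rotated = apply_rope_to_video_part(query, text_len=2)
```

In this sketch the text rows of `rotated` are identical to the input, and only the video rows carry position-dependent rotation; the same function would be applied to the key before computing attention scores.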

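I looked again: both `self.use_positional_embeddings` and `self.use_learned_positional_embeddings` are False in the code, so the sincos positional encoding should not actually be added. I still don't quite understand why text doesn't need a positional encoding, though. https://github.com/huggingface/diffusers/blob/de6a88c2d7659c616b44c0856677335110b8ff2e/src/diffusers/models/embeddings.py#L736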

Yes, I was looking at CogVideoX, not 1.5.

From my reading of the code, it should be False; you can refer to this part: ![Image](https://github.com/user-attachments/assets/c9217c41-ac14-4151-8c76-22731e772d5e) ![Image](https://github.com/user-attachments/assets/3d07f28f-2da7-4157-b342-8e5aa966a6ff)