sigma
> Thanks for your work. Using the script in this repository, I generated the ground truth for KITTI raw in bird's-eye view. However, the ground truth generated in...
@Helios77760 Thanks for your code, but when I run this script I get this error: "Cannot copy param 0 weights from layer 'PReLU_1'; shape mismatch. Source param shape is (1); target...
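Thanks. To convert the weights, do I need to load the caffemodel, replace the PReLU layers, and then save it again?
Hello, the link to this data no longer works. Could you share it again? Thanks.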
Thank you for your explanation. I got it.
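I have the same question.
This is rather odd. There are two things I can't figure out: 1. Why are the positional encodings for the text all zeros? 2. `_get_positional_embeddings` adds sincos positional encodings to the video, and `CogVideoXAttnProcessor2_0` then applies RoPE to the video part of the query and key. Why is positional encoding applied to the video twice?
I took another look: in the code, both `self.use_positional_embeddings` and `self.use_learned_positional_embeddings` are False, so the sincos positional encoding should not be added. But I still don't quite understand why the text doesn't need positional encoding. https://github.com/huggingface/diffusers/blob/de6a88c2d7659c616b44c0856677335110b8ff2e/src/diffusers/models/embeddings.py#L736
Yes, I'm looking at CogVideoX, not 1.5.
From my reading of the code, it should be False. You can refer to this part  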
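To make the question concrete, here is a minimal NumPy sketch of the pattern being discussed: a joint sequence with text tokens first and video tokens after, where a rotary embedding is applied only to the video part of the query (the key would be handled the same way). This is not the diffusers implementation. The function names, the 1D positions, and the interleaved channel pairing are all assumptions made for illustration; the real model may use a different (for example 3D) layout.

```python
import numpy as np

def rope_angles(num_positions, head_dim, base=10000.0):
    # Hypothetical helper: one rotation frequency per pair of channels.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    angles = np.outer(np.arange(num_positions), inv_freq)  # (num_positions, head_dim // 2)
    return np.cos(angles), np.sin(angles)

def apply_rope(x, cos, sin):
    # x: (seq, head_dim). Rotate each (even, odd) channel pair by its angle.
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

def rope_video_only(q, text_len, cos, sin):
    # Assumption: text tokens come first and are copied unchanged;
    # only the video tokens after them are rotated.
    out = q.copy()
    out[text_len:] = apply_rope(q[text_len:], cos, sin)
    return out

rng = np.random.default_rng(0)
text_len, video_len, head_dim = 3, 5, 8
q = rng.standard_normal((text_len + video_len, head_dim))
cos, sin = rope_angles(video_len, head_dim)
q_rot = rope_video_only(q, text_len, cos, sin)
```

In this sketch the text rows of `q_rot` are identical to those of `q`, so the text tokens carry no positional signal from the rotation, while each video token is rotated according to its position.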