Gao, Ruiyuan
I tested L05C with MiIOService on my side, and TTS playback works. I'll push a PR later.
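In docker-compose, the mapping should be:

```yaml
ports:
  - aaaa:bbbb
environment:
  XIAOMUSIC_PORT: bbbb         # `port` in the config file; admin UI: listening port (restart required after changing)
  XIAOMUSIC_PUBLIC_PORT: aaaa  # `public_port` in the config file; admin UI: external access port (0 means same as the listening port)
```

Given the above, in a Docker environment there is basically never a need to change bbbb, which means you do not need to set XIAOMUSIC_PORT. If you need to change the port, only change the two occurrences of aaaa.

If you use a reverse proxy, forward to localhost:aaaa and set XIAOMUSIC_PUBLIC_PORT to the proxy's listening port cccc.

Also, if the setting file exists, it overrides the environment variables. After the service has been started once, you need to directly modify...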
Any update on this?
You may check xformers to acquire the attention map. xformers adopts a block-wise calculation where no explicit "map" is stored in the memory.
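For reference, if an explicit map is needed, one option is to recompute the attention weights outside the fused kernel. The following is only a minimal NumPy sketch of standard scaled dot-product attention weights for a single head. The function name `attention_map` and the toy shapes are hypothetical, and this is not xformers code.

```python
import numpy as np

def attention_map(q, k):
    """Return softmax(q @ k^T / sqrt(d)) with shape (n_q, n_k).

    Hypothetical helper: recomputes the full map explicitly, which costs
    O(n_q * n_k) memory, unlike a block-wise kernel.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Subtract the row-wise max for numerical stability before exponentiating.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))  # 4 query tokens, head_dim 8 (toy sizes)
k = rng.standard_normal((6, 8))  # 6 key tokens
attn = attention_map(q, k)       # each row sums to 1
```

Materializing the map this way uses memory quadratic in sequence length, so it is usually practical only for a few layers or short sequences.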
Regarding 1, another solution is to change

```python
q = self.q_linear(x).view(1, -1, self.num_heads, self.head_dim)
```

to

```python
q = self.q_linear(x).view(B, -1, self.num_heads, self.head_dim)
```

if `mask` is `None`. Current implementation...
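The shape difference between the two `view` calls can be checked with a small sketch. Here NumPy's `reshape` stands in for `torch.Tensor.view`, and the sizes `B`, `N`, `C`, `num_heads`, and `head_dim` are toy values chosen for illustration, not the model's actual configuration.

```python
import numpy as np

# Toy sizes (assumptions): batch 2, 5 tokens per sample, 4 heads of dim 4.
B, N, C = 2, 5, 16
num_heads, head_dim = 4, 4

x = np.zeros((B, N, C))  # stand-in for the output of q_linear(x)

# Batch dimension forced to 1: all B * N tokens are merged into one sequence.
q_flat = x.reshape(1, -1, num_heads, head_dim)

# Batch dimension kept: each sample keeps its own sequence of N tokens.
q_batched = x.reshape(B, -1, num_heads, head_dim)

print(q_flat.shape)     # (1, 10, 4, 4)
print(q_batched.shape)  # (2, 5, 4, 4)
```

With the flattened layout, an attention call sees a single sequence of `B * N` tokens, so keeping samples separate depends on a mask. With the batched layout, samples are separated by the batch dimension itself.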
It is possible. Actually, we are limited by GPU memory (80G A800), so we only train up to 60 frames.
> thanks for your reply, and have you ever think about reduce the requirement of GPU memory ?

We've done a lot to save GPU memory. You may check the...
> and can i ask what's the difference between video generate and image generate ? just increase the number of batch size ?

They are fundamentally different. Images are 2D,...
> thanks for your answer, and i found that you didn't use any image to generate latents (you just use bev) to generate video, and my questions is that how...
Sorry, I cannot provide that because I did not try any personally. I think a quick search could give you the answer.