AULAY WANG issues

Results: 5 issues of AULAY WANG

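DAPE fine-tuning details and convergence problem

Following the authors' experimental setup, I built paired images from DIV2K, Flickr2K, 10k FFHQ images, and OST (about 23k images in total) and fine-tuned DAPE on them, with the settings taken from dape.yaml. The model does not converge (l_logits stays around 0.5), and at inference it produces no usable tag information (every output is null). How have others solved this convergence problem?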

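Prompts for camera motion

I tried adding camera motion descriptions (e.g. zoom in) to the prompt, but the results were poor. I also noticed that the Gradio demo has a camera motion option ([link](https://github.com/hpcaitech/Open-Sora/blob/main/assets/readme/gradio_option.png)), but I could not find a corresponding setting in the authors' command examples ([link](https://github.com/hpcaitech/Open-Sora/blob/main/docs/commands.md)). Does anyone know how to set it?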

documentation

Problems for Motion LoRA training and inference

I trained the MotionLoRA on ~100k videos selected from WebVid-2M, then used the 40k-step checkpoint (`outputs/train_motion_training_xxx_2024-06-13T04-09-10/checkpoints/checkpoint-step-40000.ckpt`) to replace `mm_sd_v15_v2.ckpt` in `configs/prompts/v2/v2-2-RealisticVision-MotionLoRA.yaml`. However, the results are...

RuntimeError: The shape of the 2D attn_mask is torch.Size([77, 77]), but should be (1, 1).

When I run `sh configs/inference/run.sh`, this error occurs:

```
Traceback (most recent call last):
  File "/data/code/MotionCtrl/main/evaluation/motionctrl_inference.py", line 354, in
    run_inference(args, gpu_num, rank)
  File "/data/code/MotionCtrl/main/evaluation/motionctrl_inference.py", line 261, in run_inference
...
```
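A possible reading of this error, offered as a hypothesis and not a confirmed diagnosis: PyTorch's `nn.MultiheadAttention` requires a 2D `attn_mask` of shape `(L, S)`, where `L` and `S` are the target and source sequence lengths, and with the default `batch_first=False` it reads the sequence length from axis 0 of the input. If a text tensor laid out as `(batch, tokens, dim)` with batch size 1 reaches a sequence-first attention layer, the layer sees `L = 1` and rejects the usual 77×77 causal mask. The sketch below reproduces only that shape rule in plain Python and does not call PyTorch. The function names, the batch size of 1, and the embedding width 768 are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def expected_2d_mask_shape(
    query_shape: Sequence[int],
    key_shape: Sequence[int],
    batch_first: bool = False,
) -> Tuple[int, int]:
    """Return the (L, S) shape a 2D attention mask must have.

    Illustrative helper, not a PyTorch API. With batch_first=False the
    inputs are read as (seq_len, batch, dim); with batch_first=True they
    are read as (batch, seq_len, dim).
    """
    seq_axis = 1 if batch_first else 0
    return (query_shape[seq_axis], key_shape[seq_axis])


def check_mask(
    mask_shape: Sequence[int],
    query_shape: Sequence[int],
    key_shape: Sequence[int],
    batch_first: bool = False,
) -> None:
    """Raise RuntimeError when the mask shape does not match (L, S)."""
    expected = expected_2d_mask_shape(query_shape, key_shape, batch_first)
    if tuple(mask_shape) != expected:
        raise RuntimeError(
            f"2D attn_mask has shape {tuple(mask_shape)}, expected {expected}"
        )


# Assumed layout: one prompt, 77 tokens, width 768 (width is hypothetical).
tokens_batch_first = (1, 77, 768)
causal_mask = (77, 77)

# Read as (seq_len, batch, dim), the layer sees a sequence of length 1.
print(expected_2d_mask_shape(tokens_batch_first, tokens_batch_first))

# Moving the token axis to the front makes the 77x77 mask valid again.
tokens_seq_first = (77, 1, 768)
check_mask(causal_mask, tokens_seq_first, tokens_seq_first)
```

If this layout mismatch is indeed the cause, the usual remedies are to permute the tensor to the layout the attention layer expects, or to use library versions whose layout conventions agree with each other. Which one applies to this repository is not something the report above establishes.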

When will Deepseek-OCR be supported?

### Reminder

- [x] I have read the above rules and searched the existing issues.

### Description

https://huggingface.co/deepseek-ai/DeepSeek-OCR

### Pull Request

_No response_

enhancement

pending
