费政聪 issues

Results: 7 issues by 费政聪

improve LLaMA for multi-language performance

Thanks for the good work! I have tried to improve the LLaMA model to generate more fluent Chinese. We are inspired by the fact that LLaMA has learned good English expression and a little...

[Feature] Support visual information input and understanding, similar to GPT-4

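### Is your feature request related to a problem? Please describe. Fine-tune the model to support visual information input, similar to GPT-4. We have already tried this on the LLaMA language model, with good results. Reference: https://github.com/feizc/Visual-LLaMA We ran into problems when migrating the approach to ChatGLM. Are there plans for ChatGLM to support visual understanding? Everyone is also welcome to propose and discuss feasible approaches.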

Another idea: use CLIP to select among semantic segmentation results

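Thanks to the author for this work. Here is another, open-vocabulary angle on image editing: use CLIP to select among the segmentation results, then run inpainting. Reference implementation: https://github.com/feizc/IEA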
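
The selection step can be sketched roughly as follows. This is only a minimal illustration, not the code from the IEA repository: it assumes each segmented region and the text prompt have already been embedded (for example by a CLIP image encoder and text encoder), and it uses toy vectors in place of real embeddings. The names `cosine` and `select_mask` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def select_mask(text_embedding, region_embeddings):
    """Return (index, score) of the region most similar to the text prompt."""
    if not region_embeddings:
        raise ValueError("no candidate regions")
    scores = [cosine(text_embedding, r) for r in region_embeddings]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy vectors standing in for CLIP embeddings (assumption, not real data).
text = [1.0, 0.0, 0.0]
regions = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.5]]
best_index, best_score = select_mask(text, regions)
```

The mask at `best_index` would then be handed to whatever inpainting model is in use.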

Project

improve LLaMA for visual understanding like GPT-4

Thanks for the good work! We have tried to improve the LLaMA model to understand visual information and support multi-modal chat. We are inspired by the observation that a good ViT, e.g., CLIP vision...

About music generation with the Perceiver-AR model

Hi @lucidrains, thanks for the implementation of the Perceiver-AR model. We ran experiments on pop music generation at https://github.com/feizc/Perceiver-Music-Generation and the results are encouraging. Many thanks : )

CogVideoX for Keyframe Interpolation

Hi, thank you very much for the amazing open-source work! We have implemented a frame interpolation model based on the I2V model and open-sourced the training data, model, training code,...

Flux for music

Hi, thank you very much for your amazing open-source work. We attempted to apply the Flux structure to text-to-music generation and achieved some very interesting results. But there are still...