Tyrion Liu comments

Results: 12 comments of Tyrion Liu

Files that are not images are not listed in Files view

I see. I had only integrated a newly installed zipline instance into some shell scripts for uploading logs, so I didn't notice the support for video/audio.

How to deploy trained llava model?

It would be very helpful since we have trained some llava models. We hope we can test them in an interactive way.

fix ERROR: Invalid interpolation format for "healthcheck" option in s…

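In a docker-compose file, a `$` sign used inside a double-quoted string must be escaped; otherwise it is treated by default as an environment variable lookup.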
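A minimal sketch of the escaping rule, assuming the usual Compose convention that a doubled `$$` yields a literal dollar sign. The helper name and the sample command are hypothetical and not taken from the PR itself.

```python
def escape_compose_dollars(value: str) -> str:
    """Double every '$' so that Compose passes it through literally.

    Assumes the input has not been escaped yet; running it twice
    would double the dollar signs again.
    """
    return value.replace("$", "$$")


# Hypothetical healthcheck command that needs a literal shell variable.
command = 'test "$STATUS" = "ok"'
escaped = escape_compose_dollars(command)
print(escaped)
```

The escaped string is what would go into the compose file, so that the shell inside the container, not Compose, resolves the variable.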

Multi-turn multimodal dialogue

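Just a guess, but give it a try: an LLM does not really have "multi-turn" dialogue. After the first turn, the model does not keep some intermediate state around while waiting for your next sentence. Each time a turn is added, the model first computes over the preceding conversation history and then continues generating. So when handling the "second turn", you should make sure that the number of images in images equals the number of image placeholders in the conversation history plus the current question. In other words, you still need to process the images that were already discussed and put them in images.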
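The consistency check described above can be sketched roughly as follows. This is only an illustration: `IMAGE_TOKEN`, `build_prompt`, and `check_image_count` are hypothetical names, and the placeholder string is not the model's actual token.

```python
from typing import List, Tuple

IMAGE_TOKEN = "<IMG>"  # hypothetical placeholder, not the model's real token


def build_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Join all earlier (question, answer) turns with the new question."""
    parts = [f"{q} {a}" for q, a in history]
    parts.append(question)
    return " ".join(parts)


def check_image_count(prompt: str, images: List[str]) -> None:
    """Raise if the placeholder count and the image count differ."""
    expected = prompt.count(IMAGE_TOKEN)
    if expected != len(images):
        raise ValueError(
            f"prompt has {expected} image placeholders "
            f"but {len(images)} images were given"
        )


history = [(f"{IMAGE_TOKEN}What is in this picture?", "A cat.")]
question = f"{IMAGE_TOKEN}And in this one?"
prompt = build_prompt(history, question)

# The image from the first turn must still be passed in the second turn.
check_image_count(prompt, ["cat.png", "dog.png"])
```

Passing only the new image (`["dog.png"]`) would raise here, which mirrors the kind of mismatch guessed at in the comment.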

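> @zodiacg
>
> > After the first turn, the model does not keep some intermediate state around while waiting for your next sentence. Each time a turn is added, the model first computes over the preceding conversation history and then continues generating.
>
> When a new turn arrives, the kvcache of the conversation history can be kept entirely, so nothing needs to be recomputed. That is decided by the inference framework; it is not how all LLMs work.
>
> The key question is how the prompt should be assembled for multi-turn, multi-image input. If the image tokens are placed inside the current turn's user prompt, recomputation can be avoided. But if they are required to go at the very beginning, the history is broken and the history's kvcache becomes invalid.
>
> Many models cannot do "interleaved image-text" dialogue. Many of the multi-image, multi-turn examples that models provide only feed in multiple images at the very start. If you skip the interface given in the example code, concatenate the embeddings by hand, and then call the transformers interface, you will find the results really are not good. What I am talking about is how the prompt is assembled.

If I remember correctly, all LLM API interfaces currently require each multi-turn request to carry the conversation history, which counts toward token usage; there is no case where the server keeps a session and you submit only the next turn. Prefill being cheaper than output is of course due to the kvcache, but that does not conflict with still having to include the whole conversation history in every request.

Putting the images together does not mean they are concatenated at the beginning. internlm-xcomposer is precisely a model that does image-text interleaving: it places each encoded image at the position where its placeholder appears in the text. So when assembling a new turn, the history images must be kept, otherwise the number of entries in images and the number of image placeholders will not match. It is dropping the history images that would break the input structure and invalidate the kvcache.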

Multi-turn multimodal dialogue

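> @zodiacg
>
> lmdeploy does support interactive dialogue, i.e. after receiving the next turn's input it does not have to prefill the history input again.
> That would indeed be very practical.
>
> For example: given a history like `{image}{question1} {answer1}`, if the next turn contains an image, the ideal prompt would be `{image}{question1} {answer1}{image}{question2}`, but following some models' demos it gets assembled as `{image}{image}{question1} {answer1}{question2}`.
>
> With the former, each turn's input does not change the structure of the history prompt, so the kvcache can be reused, depending on whether the framework supports it. The latter breaks the structure of the history prompt (because the later turn's image is moved into the first turn), so the kvcache cannot be reused.

I fully get your point; it may be that you have not quite understood internlm-xcomposer2. The way internlm-xcomposer2 actually organizes its input is exactly the form you describe: image positions are indicated by the image placeholder token in the input (sorry, I forgot to escape it in my previous reply and it got swallowed, so you may not have noticed). It merely puts the concrete image paths or contents in one unified place, which has no necessary connection to how the final tokens are organized. You can pick any version of xcomposer and look at [the corresponding code in modeling_internlm_xcomposer2.py](https://huggingface.co/internlm/internlm-xcomposer2-4khd-7b/blob/a2c222ebd3a723c3dff00232e4f5cc6429f472d1/modeling_internlm_xcomposer2.py#L180). In fact, the error the original poster reported sits just a few lines further down, which is why I made the guess above.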
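To make the two prompt layouts concrete, here is a rough sketch that measures how much of the previous prompt survives as a prefix of the new one. It assumes, as a simplification, that cached state can only be reused for an unchanged prefix; the segment names and `shared_prefix_len` are hypothetical, and each segment stands in for what would really be many tokens.

```python
from typing import List


def shared_prefix_len(old: List[str], new: List[str]) -> int:
    """Count the leading segments that are identical in both prompts."""
    n = 0
    for a, b in zip(old, new):
        if a != b:
            break
        n += 1
    return n


# Prompt after the first turn, split into segments.
previous = ["{image1}", "{question1}", "{answer1}"]

# Second turn, image kept at the position where it appears in the text.
interleaved = ["{image1}", "{question1}", "{answer1}", "{image2}", "{question2}"]

# Second turn, all images moved to the front.
images_first = ["{image1}", "{image2}", "{question1}", "{answer1}", "{question2}"]

reusable_interleaved = shared_prefix_len(previous, interleaved)
reusable_images_first = shared_prefix_len(previous, images_first)
print(reusable_interleaved, reusable_images_first)
```

In the interleaved layout the whole previous prompt remains a prefix, while in the images-first layout only the first segment does; whether a framework actually reuses that prefix is up to the framework.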

Proxypass using Nginx proxy manager

Unless the proxied app explicitly states it supports running from a sub path, don't proxypass it with a sub path. Prefix stripping is not enough. As you already noticed, normally...

How to deploy trained llava model?

> @zodiacg @flotos Please follow this new docs https://github.com/LZHgrla/xtuner/tree/lzh/llama3_convert/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336. It introduces the commands for model conversion and chat.
>
> We also release the related LLaVA-Llama-3-8B models, which can be...

Training time issue

Can confirm this problem. Tried SNLI with a2t on a 2080Ti (batch size 12), the first clean epoch took 7 hours and the generation with a2t was estimated to take...

ghproxy.com can no longer be opened

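> Perhaps one could fork the scoop repository and modify the source of current/lib/install.ps1 so that it automatically decides which mirror to use.

There is a China-based one at https://gitee.com/glsnames/scoop-installer, but it does not work well together with spc, because the spc bucket already hard-codes ghproxy.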