Ziqing Yang comments

Results: 212 comments of Ziqing Yang

When using your project with just a few modifications in my license plate recognition work, the CTC loss always gave nan.

If you are using your own data, check whether the values of x are in the expected range [-1, 1] and whether y starts with an SOS symbol (for...
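The two checks suggested above can be sketched as a small sanity test run on a batch before training. This is a minimal sketch, not code from the project: the function name `check_batch`, the SOS index `0`, and the flat-list representation of x are all assumptions made for illustration.

```python
import math

SOS = 0  # hypothetical index of the start-of-sequence symbol


def check_batch(x, y, sos=SOS, low=-1.0, high=1.0):
    """Return a list of problems found in one batch.

    x: flat iterable of input values (for example, flattened pixels)
    y: list of label sequences, each a list of integer indices
    """
    problems = []
    values = list(x)
    # NaN/Inf in the input is a common cause of a NaN loss.
    if any(math.isnan(v) or math.isinf(v) for v in values):
        problems.append("x contains NaN or Inf")
    elif values and (min(values) < low or max(values) > high):
        problems.append(
            f"x outside [{low}, {high}]: min={min(values)}, max={max(values)}"
        )
    # Every label sequence is expected to begin with the SOS symbol.
    for i, seq in enumerate(y):
        if not seq or seq[0] != sos:
            problems.append(f"y[{i}] does not start with SOS")
    return problems
```

An empty list means both checks passed; unnormalized pixel values such as 255.0 or a label sequence without the leading SOS would each be reported.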

About training and testing

At each step, the decoder's input is the ground-truth label (target) of the position predicted at the previous step,

About training and testing

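During evaluation, the first input target[:,0] is a flag that marks the start of the sequence, not a ground-truth label. In other words, a special identifier was already prepended to every sample during data preprocessing, so testing works correctly too.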
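The scheme described in these two replies can be sketched in plain Python. This is only an illustration under stated assumptions, not the repository's code: the helper names, the SOS index `0`, and the `step_fn` callback standing in for one decoder step are hypothetical.

```python
SOS = 0  # hypothetical index of the start-of-sequence flag


def add_sos(labels, sos=SOS):
    """Preprocessing: prepend the start flag to every label sequence."""
    return [[sos] + list(seq) for seq in labels]


def teacher_forcing_pairs(target):
    """Training: the input at step t is target[t], the expected output is target[t + 1]."""
    return [(seq[:-1], seq[1:]) for seq in target]


def greedy_decode(step_fn, sos=SOS, max_len=10):
    """Evaluation: start from the SOS flag only, then feed back the model's own predictions."""
    token, output = sos, []
    for _ in range(max_len):
        token = step_fn(token)  # step_fn stands in for one decoder step
        output.append(token)
    return output
```

During training the decoder sees ground-truth labels shifted by one position; during evaluation the only value taken from the target is its first element, which is the start flag and carries no label information.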

Captcha.image

You have to install the captcha package first: pip install captcha

CMNews dataset

We will release CMNews once we finish the camera-ready version.

Size([49952, 5120]) from checkpoint vs. Size([49953, 5120]) from model

That's weird. The vocab size of Chinese-Llama-13B-LoRA is 49953, so where does the 49952 come from? Can you provide more information, such as your loading script?

Size([49952, 5120]) from checkpoint vs. Size([49953, 5120]) from model

Possible reason: https://github.com/ymcui/Chinese-LLaMA-Alpaca/issues/133
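When reporting a size mismatch like this one, it can help to list exactly which parameters differ between the checkpoint and the model. The helper below is a minimal sketch that works on plain name-to-shape dictionaries; the function name and the parameter name in the usage note are hypothetical. In PyTorch, such a dictionary can be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ.

    Both arguments map parameter names to shapes (tuples or lists of ints).
    Names present on only one side are ignored here.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches
```

With the sizes from the issue title and an illustrative parameter name, `find_shape_mismatches({"embed_tokens.weight": (49952, 5120)}, {"embed_tokens.weight": (49953, 5120)})` reports that single entry, which points at the vocabulary dimension rather than the hidden size.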

Question about how the merge script merge_llama_with_chinese_lora.py works, and about the source code and details of Chinese LLaMA training

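You just need to match the keys in the state_dict of the HF model one-to-one with those of the original model. For the training code, you can refer to the public run_clm.py in Transformers and the Stanford Alpaca code.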
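The key-matching idea can be sketched as a rename table. This is a hedged illustration, not the contents of merge_llama_with_chinese_lora.py: the name correspondences below are an assumption based on the usual naming of original LLaMA checkpoints and HF LLaMA models, the direction (original to HF) is chosen only for the example, and any reordering of the q/k projection weights that a real conversion may need is not handled.

```python
import re

# Assumed name correspondences (original checkpoint name -> HF name).
TOP_LEVEL = {
    "tok_embeddings.weight": "model.embed_tokens.weight",
    "norm.weight": "model.norm.weight",
    "output.weight": "lm_head.weight",
}
PER_LAYER = {
    "attention.wq.weight": "self_attn.q_proj.weight",
    "attention.wk.weight": "self_attn.k_proj.weight",
    "attention.wv.weight": "self_attn.v_proj.weight",
    "attention.wo.weight": "self_attn.o_proj.weight",
    "feed_forward.w1.weight": "mlp.gate_proj.weight",
    "feed_forward.w2.weight": "mlp.down_proj.weight",
    "feed_forward.w3.weight": "mlp.up_proj.weight",
    "attention_norm.weight": "input_layernorm.weight",
    "ffn_norm.weight": "post_attention_layernorm.weight",
}


def original_to_hf_key(key):
    """Map one original-style key to its HF-style name, or None if unknown."""
    if key in TOP_LEVEL:
        return TOP_LEVEL[key]
    match = re.fullmatch(r"layers\.(\d+)\.(.+)", key)
    if match and match.group(2) in PER_LAYER:
        return f"model.layers.{match.group(1)}.{PER_LAYER[match.group(2)]}"
    return None


def rename_state_dict(state_dict):
    """Rename every key that has a known counterpart; collect the rest for inspection."""
    renamed, unmatched = {}, []
    for key, value in state_dict.items():
        new_key = original_to_hf_key(key)
        if new_key is None:
            unmatched.append(key)
        else:
            renamed[new_key] = value
    return renamed, unmatched
```

Returning the unmatched keys separately makes it easy to see whether anything was silently dropped, which is the first thing to check when a converted model underperforms.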

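The 13B model after HF conversion cannot reach the published performance; requesting the hash of the published model's HF-format weights and the transformers version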

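> On my side, LLaMA was downloaded from Meta and the Chinese Alpaca weights were downloaded from HF; the hashes are all fine.

Because peft has changed a lot, in most cases this is a peft problem. I suggest updating peft and trying again with the new merge script.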

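> > Did you read the README in the example directory carefully 😂? As for whether the gap is large or not, that is hard for me to judge, since it is not something that can be measured objectively. We can only confirm that the method we provide gives comparable results.
> >
> > OK. I have tried several times and cannot get demo-quality answers, so I want to confirm: were the results produced with the parameters in [inference_hf.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py)? (I also wonder whether my own weights are the problem.)

Those are results from llama.cpp; the parameters are listed in the [wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/llama.cpp量化部署):

```
-c 2048 --temp 0.2 -n 256 --repeat_penalty 1.1
```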