The recommended gradio version is 4.21.0; please install this version and try again. bless
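For example, you can pin that version with pip:
pip install gradio==4.21.0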
Sorry, that is not supported yet.
Please first check whether the original model can run inference properly before training. Also check the log and GPU status during the inference process. My System is "win11,...
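For example (assuming an NVIDIA GPU so that nvidia-smi is available), you can watch GPU utilization and memory while the inference runs:
nvidia-smi -l 1   # refresh the GPU status every second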
With full fine-tuning, the lora_target argument is ignored.
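As a rough sketch of the difference (only the relevant flags shown; the other required training arguments are omitted):
--finetuning_type lora --lora_target q_proj,v_proj   # lora_target selects where the adapters are injected
--finetuning_type full                               # all weights are trained, so lora_target has no effect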
Yes, it will. The cutoff makes the sample length differ from the original, so information is lost from the template-filled result.
Does the problem also occur with non-Mixtral models? We need to check whether it is caused by the difference in MoE model type.
So far, https://github.com/hiyouga/LLaMA-Factory/issues/2933 looks like a similar problem to yours; we will investigate further when we have time.
Try the fewshot template. My results: Average: 61.67, STEM: 48.18, Social Sciences: 74.15, Humanities: 55.64, Other: 70.67
python ./src/evaluate.py \
    --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B \
    --template fewshot \
    --task mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4