huangjf11
I'd like to ask: on the GSM8K dataset, with otherwise identical settings, changing only the precision (bf16 vs. fp16), learning rate, number of epochs, etc., all training runs converge, yet three distinct outcomes appear. (This shows up consistently across many experiments; the same template is used for both training and inference.)
ValueError: Transformers now supports natively BetterTransformer optimizations (torch.nn.functional.scaled_dot_product_attention) for the model type llama. As such, there is no need to use `model.to_bettertransformers()` or `BetterTransformer.transform(model)` from the Optimum library. Please upgrade...
The official llama3-8B model on Hugging Face lacks the **tokenizer.model** file. Can you help me solve this issue?
Hello, I would like to ask about the metadata.jsonl in the clip-filtered-dataset. Could you please explain the meanings of the attributes contained in each data sample and some formulas for...
Has anyone encountered this problem before?  
### Reminder
- [x] I have read the above rules and searched the existing issues.

### System Info
**I'm using the same launch command,** but SFT training works fine while...
When evaluating a trained model, shouldn't the outputs be produced directly from the inputs? Why are the preds still affected by the trues?

```python
output = model(**batch)
labels = batch["labels"].detach().cpu().numpy()
logits = output.logits
preds = torch.argmax(logits, -1).detach().cpu().numpy()
preds = preds[:, :-1]
labels = labels[:, 1:]
preds = np.where(labels != -100, preds, tokenizer.pad_token_id)  # <-- the line in question
decoded_preds...
```
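A minimal NumPy sketch of what that evaluation code is doing, with toy token ids and a hypothetical pad id of 0 (both are made up for illustration): in a causal LM, the logits at position t predict the token at position t+1, hence the shift; the `np.where` then only *masks out* positions the collator marked as ignored (-100, e.g. prompt/padding), it never copies the labels into the predictions.

```python
import numpy as np

pad_token_id = 0  # hypothetical pad id for this toy example

# Suppose argmax over the logits produced these token ids (batch of 1, length 5).
preds = np.array([[11, 12, 13, 14, 15]])
# Labels as produced by the data collator; -100 marks ignored positions.
labels = np.array([[-100, 12, 13, -100, -100]])

# Align predictions with labels: logits at position t predict token t+1.
preds = preds[:, :-1]   # drop the last pred: it has no following label
labels = labels[:, 1:]  # drop the first label: no logit predicts position 0

# Replace preds at ignored positions with the pad id so padding does not
# contaminate the decoded text; valid positions keep the model's own tokens.
preds = np.where(labels != -100, preds, pad_token_id)
print(preds)  # model predictions survive wherever the label is valid
```

So the labels only determine *where* a prediction counts; the surviving values are still the model's own argmax outputs, and the metric compares those against the ground truth.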