(longbench) root@pod-1317538633040015360:/workspace/mnt/cm-nfx/LongBench# CUDA_VISIBLE_DEVICES=0 python pred.py --model chatglm3-6b-32k
The repository for THUDM/LongBench contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at...
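That message is Hugging Face datasets' custom-code check. A minimal sketch of the likely fix, assuming pred.py loads the benchmark through datasets.load_dataset (the subset name below is only an example; pred.py iterates over all LongBench subsets):

```python
# Sketch: load a LongBench subset with the repo's custom loading script enabled.
# Passing trust_remote_code=True answers the prompt shown in the log above.
from datasets import load_dataset

data = load_dataset(
    "THUDM/LongBench",
    "narrativeqa",           # example subset name, not the only one pred.py uses
    split="test",
    trust_remote_code=True,  # allow THUDM/LongBench's dataset script to run
)
print(len(data))
```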
When performing weight conversion, converting the weights from HF to GGUF with the model and predictor given in the README raises an error.
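For context, a hypothetical reproduction sketch, assuming the conversion goes through llama.cpp's convert_hf_to_gguf.py (the script name, flags, and paths below are assumptions on my part; the README and the exact error text are not quoted above, so this may not match the reporter's setup):

```python
# Hypothetical sketch: invoke llama.cpp's HF -> GGUF converter from Python.
# Assumes a llama.cpp checkout on PATH-relative terms; flag names can differ
# across llama.cpp versions, so check convert_hf_to_gguf.py --help first.
import subprocess

subprocess.run(
    [
        "python", "convert_hf_to_gguf.py",
        "/path/to/hf_model_dir",   # hypothetical local HF checkpoint directory
        "--outfile", "model.gguf",
        "--outtype", "f16",        # keep fp16 weights; quantize separately later
    ],
    check=True,  # raise if the converter exits non-zero, surfacing the error
)
```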
local_setup.yml:

  "data_path": "/workspace/mnt/cm-nfx/gpt-neox/data/mamba/mamba-2.8b/algorithmic_corpus/algorithmic_corpus_text_document,/workspace/mnt/cm-nfx/gpt-neox/data/mamba/mamba-2.8b/opencode_sft/opencode_sft_text_document,/workspace/mnt/cm-nfx/gpt-neox/data/mamba/mamba-2.8b/synthetic_code_snippet/synthetic_code_snippet_text_document,/workspace/mnt/cm-nfx/gpt-neox/data/mamba/mamba-2.8b/synthetic_qa/synthetic_qa_text_document",

Another question: with this data_path, how do I set the split ratios so that the validation and test sets get a ratio of 0, i.e. all data goes to the training set?
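A possible answer, assuming gpt-neox keeps Megatron's data arguments (verify the key name against the NeoXArgs version you are running): the "split" key takes comma-separated proportions for the train, validation, and test splits, so adding the following next to "data_path" should assign all data to training:

local_setup.yml:

  "split": "100,0,0",

Alternatively, gpt-neox also accepts separate "train_data_paths" / "valid_data_paths" / "test_data_paths" keys if you want explicit per-split files rather than a proportional split of one data_path.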
How do I set the learning rate schedule to WSD (Warmup-Stable-Decay), as described in the MiniCPM paper?
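gpt-neox's built-in lr_decay_style options may not include WSD, so here is a minimal standalone PyTorch sketch of the schedule shape (linear warmup, constant plateau, then a decay tail). The step counts, peak LR, and the linear decay tail are illustrative assumptions; MiniCPM describes an exponential decay variant, so swap the tail shape if you follow the paper exactly:

```python
# Minimal sketch of a WSD (Warmup-Stable-Decay) schedule as a PyTorch LambdaLR.
import torch

def wsd_lambda(warmup_steps, stable_steps, decay_steps, min_ratio=0.1):
    def fn(step):
        if step < warmup_steps:                 # linear warmup: 0 -> peak
            return step / max(1, warmup_steps)
        if step < warmup_steps + stable_steps:  # stable plateau at peak LR
            return 1.0
        # linear decay from peak down to min_ratio over decay_steps
        t = (step - warmup_steps - stable_steps) / max(1, decay_steps)
        return max(min_ratio, 1.0 - (1.0 - min_ratio) * t)
    return fn

model = torch.nn.Linear(8, 8)                         # stand-in model
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)  # peak LR
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lr_lambda=wsd_lambda(warmup_steps=100, stable_steps=800, decay_steps=100)
)
for step in range(1000):
    opt.step()    # training step would go here
    sched.step()  # advance the WSD schedule
```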