Moses Hu
### How to build labels for instruction/input/output-style data
When training alpaca-lora I used the template without an `input` field. When generating the labels, do I need to set everything before the `output` text to -100? For example:
```
def tokenize(prompt, add_eos_token=True):
    # there's probably a way to do this with the tokenizer settings
    # but again, gotta move fast
    result = tokenizer(...
```
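For context, a minimal sketch of the masking being asked about, not the exact alpaca-lora code (the function name and the 512 cutoff are illustrative): tokenize prompt+response together, then overwrite the prompt portion of `labels` with -100 so the cross-entropy loss is computed only on the response tokens.
```
def tokenize_with_masked_prompt(tokenizer, prompt, response, cutoff_len=512):
    # Tokenize the full prompt + response once.
    full = tokenizer(prompt + response, truncation=True, max_length=cutoff_len)
    # Tokenize the prompt alone only to find how many tokens to mask.
    prompt_len = len(tokenizer(prompt, truncation=True, max_length=cutoff_len)["input_ids"])
    labels = list(full["input_ids"])
    # -100 is ignored by the cross-entropy loss, so only the response is trained on.
    labels[:prompt_len] = [-100] * prompt_len
    full["labels"] = labels
    return full
```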
## 👉 python examples/mms/asr/infer/mms_infer.py --model "/path/to/asr/model" --lang lang_code \ --audio "/path/to/audio_1.wav" "/path/to/audio_2.wav" "/path/to/audio_3.wav"
It throws the error:
```
RuntimeError: Error(s) in loading state_dict for Wav2VecCtc: Unexpected key(s) in state_dict: "w2v_encoder.w2v_model.encoder.layers.0.adapter_layer.W_a", ...
```
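For context, a small diagnostic sketch for this kind of state_dict mismatch: load the checkpoint with plain torch and list the adapter-related keys, to confirm the checkpoint was trained with adapter layers that the installed fairseq build does not expect. The path is the same placeholder as above; "model" is the usual fairseq state key.
```
import torch

ckpt = torch.load("/path/to/asr/model", map_location="cpu")
state = ckpt.get("model", ckpt)
adapter_keys = [k for k in state if "adapter_layer" in k]
print(f"{len(adapter_keys)} adapter keys, e.g. {adapter_keys[:3]}")
```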
The script is below. When I run it on multiple GPUs, no checkpoint-* directory appears in output_dir, but running it on a single GPU works fine. What's wrong?
```
python qlora.py...
```
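For context, one pattern worth checking in a multi-GPU run is that checkpoints are written by the main process only. A hypothetical sketch of that guard, assuming a torch.distributed launch and a Trainer-like object (`save_checkpoint` and `is_main_process` are made-up helper names, not part of qlora.py):
```
import os
import torch.distributed as dist

def is_main_process() -> bool:
    # Rank 0 only; falls back to True when not launched with torchrun.
    return (not dist.is_available()) or (not dist.is_initialized()) or dist.get_rank() == 0

def save_checkpoint(trainer, output_dir: str) -> None:
    # Write the checkpoint from the main process, then sync the other ranks.
    if is_main_process():
        trainer.save_model(os.path.join(output_dir, "checkpoint-manual"))
    if dist.is_available() and dist.is_initialized():
        dist.barrier()
```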
Fairseq 0.12.2, torch 1.10.0, lightseq 3.0.1. Following ls_fs_transformer_export.py to export the model .pb file, it gave me the error: `assert emb_size % emb_dim == 0`
The parameter names in the newer Fairseq releases differ from those in 0.10.0; could you add support for them?
How to do in-context learning based on BLOOM, like InstructGPT? Can BLOOM be used as a reward model? If I use BLOOM for prompt learning, how do I do that?
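For context, a minimal in-context-learning sketch with a BLOOM checkpoint: the demonstrations live only in the prompt, no gradient update happens. The bloom-560m checkpoint and the sentiment examples are illustrative assumptions; any BLOOM size works the same way.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Few-shot prompt: solved examples followed by the new query.
prompt = (
    "Review: The food was amazing. Sentiment: positive\n"
    "Review: Terrible service, never again. Sentiment: negative\n"
    "Review: The movie was okay but too long. Sentiment:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```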
Hello, where can I download the dataset?
I only have one A100. How should I set the parameters to train Llama-alpaca?
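For context, a sketch of a single-A100 setup in the usual alpaca-lora style: 8-bit base model plus LoRA adapters, small micro-batch with gradient accumulation. The model id, LoRA ranks and batch sizes are assumptions for illustration, not values recommended by the repo.
```
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",  # assumed base checkpoint
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="./alpaca-lora-out",
    per_device_train_batch_size=4,    # small micro-batch to fit one GPU
    gradient_accumulation_steps=32,   # effective batch size 128
    learning_rate=3e-4,
    num_train_epochs=3,
    fp16=True,
)
```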
I used AutoGPTQ to convert a bfloat16 model to int4, and the average loss is a bit larger than with int8. The model is Mixtral-8x7B; the int8 loss is almost 0.0004, which means that the...
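For context, a sketch of the two quantization settings being compared with AutoGPTQ. The group size, desc_act choice and the single calibration sentence are assumptions (real calibration needs a few hundred samples), and Mixtral support depends on the auto-gptq version.
```
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
examples = [tokenizer("GPTQ needs representative text for calibration.", return_tensors="pt")]

int4_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=True)  # larger quantization error
int8_config = BaseQuantizeConfig(bits=8, group_size=128, desc_act=True)  # near-lossless in this report

model = AutoGPTQForCausalLM.from_pretrained(model_id, int4_config)
model.quantize(examples)
model.save_quantized("./mixtral-8x7b-4bit")
```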
How did you train [Fine-tuning Llama-2-7B-32K-beta](https://github.com/togethercomputer/OpenChatKit/blob/main/README.md#fine-tuning-llama-2-7b-32k-beta)?