MiniLLM: can you share the detailed steps to reproduce the results?
I was confused by the README, especially about which models and datasets should be downloaded to which directories. Could you share detailed steps?
I ran "bash scripts/gpt2/tools/process_data_dolly.sh" successfully, but when I run "bash scripts/gpt2/tools/process_data_pretrain.sh" it reports the error: "Token indices sequence length is longer than the specified maximum sequence length for this model (1186 > 1024). Running this sequence through the model will result in indexing errors"
But Section 2.1 says "The processed data can be downloaded from the following links: dolly." I used that link to download the processed data, so why does its length exceed the maximum sequence length?
Where is the "gpt2-base" model? Is it init-gpt2-120M? I downloaded several checkpoints (init-gpt2-120M, MiniLLM-gpt2-120M, SFT-gpt2-120M), but none of them is a "gpt2-base" model.
The README's Section 3.2 "Change Model Parallel Size" says you can increase/decrease the tensor parallel sizes with:
python3 tools/convert_mp.py \
    --input_path results/llama/train/minillm/7B-init-13B-sft \
    --source_mp_size 1 \
    --target_mp_size 4 \
    --model_type llama # choose from opt and llama
It then says, "To use the model with Model Parallel, we provide two example scripts for training and evaluation."
Why does the README use gpt2 as the example but switch to llama here? The downloaded code does not include a results folder. What is the correct order of operations? We cannot reproduce the results by following the steps in the README.
Hi! The processed data downloaded from this link should be put under /PATH/TO/LMOps/minillm/processed_data/dolly. After downloading this data, you do not need to run bash scripts/gpt2/tools/process_data_dolly.sh again.
The "maximum sequence length" notice is a warning from the Transformers tokenizer, triggered when we tokenize a long document from openwebtext. In the data processing scripts, we construct "chunks" with a max length of 1024 by carrying the tokens that exceed the limit in the current document over to the chunk built from the next document (lines 68-78).
The gpt2-base model is the official pre-trained gpt2-base model trained by OpenAI, which can be downloaded from this repo. It serves as the initialization for SFT (to train init-gpt2-120M and SFT-gpt2-120M).
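For reference, a minimal way to fetch and save that checkpoint with the Hugging Face transformers API; the local path below is only a placeholder, point it at whatever directory your training scripts expect:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the official OpenAI gpt2 (~120M) checkpoint and save it locally.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

model.save_pretrained("/PATH/TO/checkpoints/gpt2-base")      # placeholder path
tokenizer.save_pretrained("/PATH/TO/checkpoints/gpt2-base")
```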
Generally, gpt2 does not need model parallelism because it is small enough to fit on common GPUs. Moreover, gpt2's vocabulary size is 50257, which is not divisible by 2, so it cannot be directly parallelized across multiple GPUs. We use gpt2 as the example because it is small and convenient for quick reproduction. The instructions for LLaMA are almost the same as for gpt2.
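As a toy illustration of that divisibility constraint (not the repo's actual check): splitting the embedding matrix row-wise across model-parallel ranks only works when the vocabulary size is a multiple of the MP size.

```python
def rows_per_rank(vocab_size: int, mp_size: int) -> int:
    # Row-wise sharding of the embedding requires an even split.
    if vocab_size % mp_size != 0:
        raise ValueError(f"vocab size {vocab_size} is not divisible by mp_size {mp_size}")
    return vocab_size // mp_size

print(rows_per_rank(32000, 4))   # LLaMA vocab: 8000 rows per rank
print(rows_per_rank(50257, 2))   # gpt2 vocab: raises, 50257 is odd
```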
We have provided more details on the data and model download paths in the README.
I am checking to see if I am doing this correctly.
- I have downloaded the LLaMA2-13B from HuggingFace (https://huggingface.co/meta-llama/Llama-2-13b-hf).
- I have generated a weight configuration file named "llama2.json", which is the same as "mp_weight_configs/llama.json".
- Next, I have converted the model for model parallelism as follows:
python3 tools/convert_mp.py \
    --input_path checkpoint-for-llama2-13b \
    --source_mp_size 1 \
    --target_mp_size 2 \
    --model_type llama2
- I generated a script "sft_13B_mp2.sh" following the script at "scripts/llama2/sft/sft_7B_mp4.sh".
- I have run the script at "scripts/llama2/sft/sft_13B_mp2.sh" using MP_SIZE=2.
But I am getting the following error:
"ValueError: Trying to set a tensor of shape torch.Size([16000, 5120]) in "weight" (which has shape torch.Size([32000, 5120])), this looks incorrect."
I would really appreciate it if you could tell me if I am missing anything.
Hello, I am having the same problem. Did you solve it?
Hi, did you install our modified transformers library?
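For context, the reported shape mismatch is what you would expect if the mp_size=2 shards are loaded by an unmodified transformers model class: with --target_mp_size 2, a 32000 x 5120 embedding is split into 16000 x 5120 shards, while the stock class still allocates the full 32000-row parameter. A quick sanity check of the arithmetic (the split axis is an assumption here, inferred from the error message):

```python
# Error from the report: a [16000, 5120] shard vs. a [32000, 5120] parameter.
vocab_size, hidden_size, mp_size = 32000, 5120, 2

shard_shape = (vocab_size // mp_size, hidden_size)  # (16000, 5120) per MP rank
full_shape = (vocab_size, hidden_size)              # (32000, 5120) without model parallelism

print(shard_shape, full_shape)
```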