yongjer
But isn't it the case that accelerate can only be used with the no-trainer version of the script? Or did I misunderstand?
and here is the error: ``` WARNING:accelerate.commands.launch:The following values were not passed to `accelerate launch` and had defaults used instead: `--num_machines` was set to a value of `1` `--mixed_precision` was...
``` WARNING:accelerate.commands.launch:The following values were not passed to `accelerate launch` and had defaults used instead: `--num_machines` was set to a value of `1` `--mixed_precision` was set to a value of...
Thanks for your help, and happy holidays!
Sorry, I'm not sure what you mean. This is already the whole output log.
Unfortunately, there is no other trace. It leaves the whole line blank, as above.
It does look like it is cut off: ``` hf@8913c96d24e3:/workspaces/hf$ deepspeed --autotuning run ./script/run_classification.py --model_name_or_path ckip-joint/bloom-1b1-zh --do_train --do_eval --output_dir ./bloom --train_file ./data/train.csv --validation_file ./data/test.csv --text_column_names sentence --label_column_name label --overwrite_output_dir --fp16 --torch_compile --deepspeed...
btw, here is my full dockerfile: ``` FROM huggingface/transformers-pytorch-deepspeed-latest-gpu:latest RUN apt-get update && apt-get install -y pdsh RUN pip install --upgrade pip bitsandbytes deepspeed[autotuning] # non-root user ARG USERNAME=hf ARG...
# I'm not sure whether these help ``` hf@ffc9973e2c76:/workspaces/hf$ tree . ├── DockerFile.hf ├── autotuning_exps │ └── profile_model_info.json ├── autotuning_results │ └── profile_model_info │ ├── cmd.txt │ ├── ds_config.json │...