stanford_alpaca
DeepSpeed compilation (cpu_adam issue)
Thanks for the repo. When I run torchrun as suggested by the webpage (command below), training fails because cpu_adam cannot be compiled inside the DeepSpeed library. The failure comes from an nvcc compile command (also below). The error is:
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
Because the extension never builds, this leads to a follow-up error: AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
I searched and tried the suggestions from these issues:
https://github.com/NVlabs/instant-ngp/issues/119
https://github.com/microsoft/DeepSpeed/issues/1846
but they did not help. Does anyone have an idea?
My guess is that there is a version mismatch somewhere, either in gcc or in the CUDA environment. But since I installed alpaca into a new virtual environment (I tried both conda and venv), version conflicts should not really happen. So maybe it is something else...
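To narrow down the version guess above, the host compiler and nvcc versions can be printed side by side. The following is a minimal sketch, not a confirmed diagnosis: the helper names (`parse_gcc_major`, `parse_nvcc_release`, `tool_output`) are made up for illustration, and the idea that a GCC 11 header set paired with an older nvcc produces the "parameter packs not expanded" error is an assumption based on similar reports, not something verified for this setup.

```python
import re
import shutil
import subprocess


def parse_gcc_major(dumpversion_text):
    """Return the major version from `gcc -dumpversion` output, or None."""
    first = dumpversion_text.strip().split(".")[0]
    try:
        return int(first)
    except ValueError:
        return None


def parse_nvcc_release(version_text):
    """Return (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def tool_output(executable, *args):
    """Run a tool and return its stdout, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, *args], capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    gcc_text = tool_output("gcc", "-dumpversion")
    nvcc_text = tool_output("nvcc", "--version")
    print("gcc major:", parse_gcc_major(gcc_text) if gcc_text else "gcc not found")
    print("nvcc release:", parse_nvcc_release(nvcc_text) if nvcc_text else "nvcc not found")
```

If the two versions turn out to be far apart in age, that would at least be consistent with the error coming from the libstdc++ 11 headers rather than from DeepSpeed itself.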
Commands causing errors:
torchrun --nproc_per_node=4 --master_port=23222 train.py --model_name_or_path /home/johannes/modelhf/llama-7b/ --data_path /home/johannes/alpaca/alpaca_data.json --bf16 True --output_dir /home/johannes/outalpaca --num_train_epochs 3 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --gradient_accumulation_steps 8 --evaluation_strategy "no" --save_strategy "steps" --save_steps 2000 --save_total_limit 1 --learning_rate 2e-5 --weight_decay 0. --warmup_ratio 0.03 --deepspeed "./configs/default_offload_opt_param.json" --tf32 True
Later it runs the following nvcc command, which raises the above error in /usr/include/c++/11/bits/std_function.h:
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -I/home/johannes/.local/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/usr/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 -c /home/johannes/.local/lib/python3.10/site-packages/deepspeed/ops/csrc/common/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o
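The paths in the nvcc command above point at /home/johannes/.local/lib/python3.10/site-packages for deepspeed and /usr/local/lib/python3.10/dist-packages for torch, which may mean those packages are being picked up from outside the virtual environment. The sketch below is one way to check that; the helper names (`is_under`, `module_origin`) are hypothetical, and it only reports where Python would load each package from without importing it.

```python
import importlib.util
import os
import sys


def is_under(path, root):
    """Return True if `path` lies inside the directory `root`."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    return os.path.commonpath([path, root]) == root


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("active environment prefix:", sys.prefix)
    for name in ("torch", "deepspeed"):
        origin = module_origin(name)
        if origin is None:
            print(f"{name}: not found")
        else:
            where = "inside" if is_under(origin, sys.prefix) else "OUTSIDE"
            print(f"{name}: {origin} ({where} the active environment)")
```

Run it with the same interpreter that torchrun uses. A package reported as outside the active environment would suggest that the fresh conda or venv install is not the one actually being compiled against.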