
Could you give elaborated steps on how to run llm-foundry on AMD MI250 devices?

Open · Alice1069 opened this issue 1 year ago · 1 comment

I could not run llm-foundry on an AMD 4×MI250 machine.

Steps to reproduce the behavior:

  1. Follow the latest instructions from https://github.com/ROCm/flash-attention/tree/flash_attention_for_rocm, starting from the docker image `rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1`:

     ```shell
     export GPU_ARCHS="gfx90a"
     export PYTHON_SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
     patch "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py" hipify_patch.patch
     pip install .
     ```

     Verified the build with `PYTHONPATH=$PWD python benchmarks/benchmark_flash_attention.py`; `pip list` shows `flash-attn 2.0.4`.

  2. Get the llm-foundry v0.7 code and modify setup.py, changing the torch pin from

     `'torch>=2.2.1,<2.3',`

     to

     `'torch>=2.0,<2.0.2',`

     so it matches the torch 2.0.1 in the ROCm image. Then:

     ```shell
     pip3 install --upgrade pip
     pip install -e .
     ```
  3. Command to run:

     ```shell
     python data_prep/convert_dataset_hf.py \
       --dataset c4 --data_subset en \
       --out_root my-copy-c4 --splits train_small val_small \
       --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text '<|endoftext|>'
     ```
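The setup.py edit above can also be scripted instead of done by hand. A minimal sketch — the two requirement strings are the pins listed in step 2, but the helper name `repin_torch` is mine, not part of llm-foundry:

```python
# Hypothetical helper: swap llm-foundry's torch pin for one compatible with
# the ROCm PyTorch 2.0.1 shipped in the docker image.
from pathlib import Path

OLD_PIN = "'torch>=2.2.1,<2.3',"
NEW_PIN = "'torch>=2.0,<2.0.2',"

def repin_torch(text: str) -> str:
    """Return setup.py contents with the torch requirement replaced."""
    if OLD_PIN not in text:
        raise ValueError("expected torch pin not found in setup.py")
    return text.replace(OLD_PIN, NEW_PIN)

# Applied to the real file it would be:
#   path = Path("setup.py")
#   path.write_text(repin_torch(path.read_text()))
```

Failing loudly when the pin is missing guards against running the script against a different llm-foundry version whose requirements have moved.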

```shell
composer train/train.py train/yamls/pretrain/mpt-1b.yaml \
  data_local=my-copy-c4 \
  train_loader.dataset.split=train_small \
  eval_loader.dataset.split=val_small \
  max_duration=10ba \
  eval_interval=0 \
  loss_fn=torch_crossentropy \
  save_folder=mpt-1b
```

  1. It failed with a missing `rotary_emb` module.
  2. Ran `pip install rotary_emb`.
  3. Re-ran the command; it then failed with a missing `libcudart.11.0`.
  4. Exported `LD_LIBRARY_PATH` to include `libcudart`.
  5. Re-ran the command; it then failed with a missing `libtorch_cuda.so`.
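Errors about `libcudart` and `libtorch_cuda.so` usually mean the dynamic loader cannot find the libraries bundled with the installed torch wheel, or that a CUDA build of torch was installed over the ROCm one. A hedged sketch of the environment fix, assuming the python inside the ROCm docker image; the path is derived from the wheel layout, not copied from a working setup:

```shell
# Point the dynamic loader at the libs bundled with the installed torch wheel.
# On a ROCm build this directory holds libtorch_hip.so rather than
# libtorch_cuda.so; if the error still asks for libtorch_cuda.so, a CUDA
# wheel of torch is likely installed and should be replaced by the ROCm build.
PYTHON_SITE_PACKAGES=$(python3 -c 'import site; print(site.getsitepackages()[0])')
export LD_LIBRARY_PATH="${PYTHON_SITE_PACKAGES}/torch/lib:${LD_LIBRARY_PATH}"
echo "torch lib dir: ${PYTHON_SITE_PACKAGES}/torch/lib"
```

If the directory printed at the end does not exist, the torch install itself is the problem rather than the loader path.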

Could you give me a detailed version of how to run llm-foundry on AMD MI250? I read through the two blog posts about AMD but did not find the answer there. Any version of the code is OK. Thank you!

Alice1069 · May 27 '24 07:05