
Out of Memory Issue with OpenLLaMA-7B in Default FuseLLM Setting on A100 (80G)

Open runtsang opened this issue 6 months ago • 2 comments

Description

I am trying to reproduce the results of your excellent work, FuseLLM, following the README (https://github.com/18907305772/FuseAI/blob/main/FuseLLM/README.md). When running the documented commands, I encounter an Out of Memory (OOM) error.

The OOM is surprising because I strictly follow the command given in the document. I also tried ZeRO-3 to reduce memory consumption, following https://github.com/18907305772/FuseAI/issues/10, but it did not help. My guess is that different package versions lead to different memory behavior. Could you share your detailed environment information so that I can try again?
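For anyone comparing setups, a ZeRO-3 configuration with CPU offload can be sketched as below. This is an illustrative sketch, not the configuration from the linked issue or the one shipped with FuseLLM. It assumes the training script passes a DeepSpeed JSON file to the Hugging Face Trainer, which is what allows the `"auto"` values; the function name `build_zero3_config` and the output file name `ds_zero3_offload.json` are hypothetical.

```python
import json


def build_zero3_config(offload_to_cpu=True):
    """Return a DeepSpeed ZeRO stage 3 config as a plain dict.

    The "auto" values rely on the Hugging Face Trainer integration,
    which fills them in from the training arguments (an assumption
    about how the training script is launched).
    """
    zero = {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Gather the sharded 16-bit weights when saving a checkpoint.
        "stage3_gather_16bit_weights_on_model_save": True,
    }
    if offload_to_cpu:
        # Move optimizer states and parameters to host memory.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
        zero["offload_param"] = {"device": "cpu", "pin_memory": True}
    return {
        "bf16": {"enabled": "auto"},
        "zero_optimization": zero,
        "gradient_accumulation_steps": "auto",
        "gradient_clipping": "auto",
        "train_batch_size": "auto",
        "train_micro_batch_size_per_gpu": "auto",
    }


if __name__ == "__main__":
    # Hypothetical file name; point the training script at this file.
    with open("ds_zero3_offload.json", "w") as f:
        json.dump(build_zero3_config(), f, indent=2)
```

CPU offload trades speed for GPU memory, so it is mainly useful for checking whether the OOM comes from model states or from something else, such as activations.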

For your convenience, below is my relevant environment information.

Environment

Hardware: 8 x NVIDIA A100 (80G) GPUs
Python version: 3.9
CUDA version: 11.8
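As a rough sanity check on this hardware, the memory needed for model states alone can be estimated with the usual mixed-precision Adam accounting: about 16 bytes per parameter (2 for 16-bit weights, 2 for 16-bit gradients, 12 for fp32 master weights and the two Adam moments), with ZeRO stages 1, 2, and 3 sharding the optimizer states, gradients, and weights respectively. The sketch below assumes roughly 7e9 parameters for a 7B model and ignores activations, logits, communication buffers, and fragmentation, so it is a lower bound rather than a prediction. The function name `model_state_gib` is hypothetical.

```python
def model_state_gib(num_params, num_gpus, zero_stage):
    """Per-GPU memory (GiB) for weights, gradients, and Adam states.

    Assumes mixed-precision Adam: 2 + 2 + 12 bytes per parameter.
    Activations and temporary buffers are NOT included.
    """
    weights = 2.0 * num_params   # 16-bit parameters
    grads = 2.0 * num_params     # 16-bit gradients
    optim = 12.0 * num_params    # fp32 master weights + Adam moments
    if zero_stage >= 1:
        optim /= num_gpus        # ZeRO-1 shards optimizer states
    if zero_stage >= 2:
        grads /= num_gpus        # ZeRO-2 also shards gradients
    if zero_stage >= 3:
        weights /= num_gpus      # ZeRO-3 also shards parameters
    return (weights + grads + optim) / 2**30


if __name__ == "__main__":
    params = 7e9  # assumed parameter count for a 7B model
    for stage in (0, 1, 2, 3):
        gib = model_state_gib(params, num_gpus=8, zero_stage=stage)
        print(f"ZeRO stage {stage}: {gib:.1f} GiB per GPU")
```

Under these assumptions the unsharded model states exceed a single 80G card, while the ZeRO-2 and ZeRO-3 figures fit comfortably. An OOM that persists under ZeRO-3 would then point toward memory not covered by this estimate.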

Package Version
absl-py 2.1.0
accelerate 0.24.1
aiohttp 3.9.5
aiosignal 1.3.1
annotated-types 0.7.0
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
certifi 2024.7.4
cffi 1.16.0
charset-normalizer 3.3.2
datasets 2.14.7
decorator 5.1.1
deepspeed 0.14.4
dill 0.3.7
editdistance 0.6.2
einops 0.8.0
filelock 3.15.4
flash_attn 0.2.8
frozenlist 1.4.1
fsspec 2023.10.0
grpcio 1.65.1
hjson 3.1.0
huggingface-hub 0.17.3
idna 3.7
importlib_metadata 8.2.0
Jinja2 3.1.4
joblib 1.4.2
lazy_loader 0.4
librosa 0.10.2.post1
llvmlite 0.43.0
Markdown 3.6
MarkupSafe 2.1.5
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.15
networkx 3.2.1
ninja 1.11.1.1
numba 0.60.0
numpy 2.0.1
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-ml-py 12.555.43
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.82
nvidia-nvtx-cu12 12.1.105
packaging 24.1
pandas 2.2.2
peft 0.12.0
pip 24.0
platformdirs 4.2.2
pooch 1.8.2
protobuf 4.25.4
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pyarrow-hotfix 0.6
pycparser 2.22
pydantic 2.8.2
pydantic_core 2.20.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
regex 2024.7.24
requests 2.32.3
safetensors 0.4.3
scikit-learn 1.5.1
scipy 1.13.1
sentencepiece 0.2.0
setuptools 69.5.1
six 1.16.0
soundfile 0.12.1
soxr 0.4.0
sympy 1.13.1
tensorboard 2.17.0
tensorboard-data-server 0.7.2
threadpoolctl 3.5.0
tokenizers 0.14.1
torch 2.4.0
tqdm 4.66.4
transformers 4.35.1
triton 3.0.0
typing_extensions 4.12.2
tzdata 2024.1
urllib3 2.2.2
Werkzeug 3.0.3
wheel 0.43.0
xxhash 3.4.1
yarl 1.9.4
zipp 3.19.2
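To make environments easier to compare, a short report of the packages most likely to affect memory use can be generated with a script like the one below. It is a minimal sketch: the choice of packages in `PACKAGES` is my assumption about what is relevant, and the helper names `collect_versions` and `format_report` are hypothetical. It also prints the CUDA version PyTorch was built against, which can differ from the system CUDA toolkit.

```python
import importlib.metadata as md
import platform

# Assumed shortlist of packages relevant to training memory use.
PACKAGES = [
    "torch", "transformers", "deepspeed", "accelerate",
    "flash_attn", "peft", "datasets", "tokenizers",
]


def collect_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = md.version(name)
        except md.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Render the versions as one 'name version' line per package."""
    lines = [f"python {platform.python_version()}"]
    for name, ver in versions.items():
        lines.append(f"{name} {ver if ver is not None else 'not installed'}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(collect_versions(PACKAGES)))
    try:
        import torch
        # CUDA version PyTorch was compiled with (None for CPU-only builds).
        print("torch built with CUDA:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        print("torch is not importable in this environment")
```

Running the same script in both environments and diffing the output should show at a glance which versions differ.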

Thank you for any assistance or suggestions you might provide.

runtsang, Aug 08 '24 13:08