Train time decreased from 13 hours to 9
Hello. I built a conda environment with these settings:

```yaml
name: xtuner
channels:
  - nvidia/label/cuda-12.4.0
  - pytorch
  - conda-forge
dependencies:
  - python=3.11  # Specify Python version here
  - pytorch
  - torchvision
  - torchaudio
  - cuda
  - pytorch-cuda
  - compilers
  - sysroot_linux-64
  - gcc
  - ninja
  - py-cpuinfo
  - libaio
  - ca-certificates
  - certifi
  - openssl
  - pydantic
  - deepspeed
  - mpi4py
  - docutils
  - myst-parser
  - sphinx
  - sphinx-argparse
  - sphinx-book-theme
  - sphinx-copybutton
  - pip
  - pip:
      - transformers>=4.44.2
      - transformers_stream_generator
      - sphinx_markdown_tables
      - lagent
      - bitsandbytes
      - datasets
      - einops
      - mmengine
      - openpyxl
      - peft
      - scikit-image
      - scipy
      - sentencepiece
      - tiktoken
```
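Assuming the YAML above is saved as `environment.yml` (the filename is my choice; any name works), the environment is created and activated with:

```bash
# Build the environment from the spec file, then activate it
# (the env name "xtuner" comes from the "name:" field above).
conda env create -f environment.yml
conda activate xtuner
```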
And I downloaded the source code for xtuner and removed all the lower-bound version requirements from all the requirements text files (roughly as sketched below), and updated setup.py.
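I made those edits by hand; the one-liner below is only an illustration of the idea, assuming the stock `requirements/` layout:

```bash
# Illustrative only: drop everything from ">=" onward in each requirement,
# so the packages already provided by conda satisfy the files as-is.
sed -i 's/>=.*$//' requirements/*.txt
```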
Here is the updated setup.py:

```python
# (tail of setup.py -- get_version(), readme() and parse_requirements()
#  are defined earlier in the file)
if __name__ == '__main__':
    setup(
        name='xtuner',
        version=get_version(),
        description=('An efficient, flexible and full-featured toolkit for '
                     'fine-tuning large models'),
        long_description=readme(),
        long_description_content_type='text/markdown',
        author='XTuner Contributors',
        author_email='[email protected]',
        keywords='large language model, parameter-efficient fine-tuning',
        url='https://github.com/InternLM/xtuner',
        packages=find_packages(),
        include_package_data=True,
        classifiers=[
            'Development Status :: 4 - Beta',
            'License :: OSI Approved :: Apache Software License',
            'Operating System :: OS Independent',
            'Programming Language :: Python :: 3',
            'Programming Language :: Python :: 3.8',
            'Programming Language :: Python :: 3.9',
            'Programming Language :: Python :: 3.10',
            'Topic :: Utilities',
        ],
        # Python maximum version <3.11, to support mpi4py-mpich
        python_requires='>=3.8, <3.12',
        license='Apache License 2.0',
        install_requires=parse_requirements('requirements/runtime.txt'),
        extras_require={
            'all': parse_requirements('requirements.txt'),
            'deepspeed': parse_requirements('requirements/runtime.txt') +
            parse_requirements('requirements/deepspeed.txt'),
            'modelscope': parse_requirements('requirements/runtime.txt') +
            parse_requirements('requirements/modelscope.txt'),
        },
        zip_safe=False,
        entry_points={'console_scripts': ['xtuner = xtuner:cli']})
```
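With the modified setup.py (and the pyproject.toml below) in place, I reinstall xtuner from the checkout into the active environment; a minimal sketch, run from the repository root:

```bash
# Editable install of the patched source; dependency resolution now uses
# the relaxed requirements files.
pip install -e .
```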
I also created a pyproject.toml file:

```toml
[build-system]
requires = ["setuptools >= 64.0", "wheel"]
build-backend = "setuptools.build_meta"
```

Then I ran:

```bash
NPROC_PER_NODE=6 xtuner train llava_llama3.1_8b_instruct_siglip-so400m-patch14-384_e1_gpu5_pretrainworkginforllam3google --deepspeed deepspeed_zero2
```

and it trained in 9 hours; when I run your original Python setup it takes 13 hours on the same dataset. There is still an issue trying to run with DeepSpeed ZeRO-3 (see the command below); the xtuner code needs to be updated for this. Yet it trained well on ZeRO-2.
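For completeness, the failing ZeRO-3 attempt is the same command with only the `--deepspeed` config swapped:

```bash
# Same training command with the ZeRO-3 config instead of ZeRO-2;
# this variant currently errors out, while the run above completes fine.
NPROC_PER_NODE=6 xtuner train \
    llava_llama3.1_8b_instruct_siglip-so400m-patch14-384_e1_gpu5_pretrainworkginforllam3google \
    --deepspeed deepspeed_zero3
```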