
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Results: 177 LMFlow issues, sorted by recently updated

(chat_env) [root@192 LMFlow]# pip install -r requirements.txt
Collecting peft@ git+https://github.com/huggingface/peft.git@deff03f2c251534fffd2511fc2d440e84cc54b1b
  Cloning https://github.com/huggingface/peft.git (to revision deff03f2c251534fffd2511fc2d440e84cc54b1b) to /tmp/pip-install-qp3u6wif/peft_bb110dd6776941069294566614e126c0
  Running command git clone --quiet https://github.com/huggingface/peft.git /tmp/pip-install-qp3u6wif/peft_bb110dd6776941069294566614e126c0
fatal: unable to access 'https://github.com/huggingface/peft.git/': TCP...

(lmflow) PS C:\Users\Satan> pip install deepspeed==0.8.3
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting deepspeed==0.8.3
  Using cached https://mirrors.aliyun.com/pypi/packages/0f/c0/9b57e9ec56f6f405726a384b109f8da1267e41feea081850c2fce1735712/deepspeed-0.8.3.tar.gz (765 kB)
  Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not...

AFAIK, there are two methods to continue finetuning based on the previously finetuned model:
1. (base model + LoRA model + dataset1) -> (base model + finetuned model + dataset2)
2. (base...
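
A rough sketch of the first method with Hugging Face transformers and peft is shown below; the base model name, adapter path, and LoRA hyperparameters are placeholders rather than LMFlow's actual script interface:

```python
# Sketch of method 1: fold the first LoRA adapter into the base model,
# then attach a fresh adapter and finetune it on dataset2.
# "gpt2" and "output_models/lora_run1" are illustrative placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel, LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(base, "output_models/lora_run1")  # adapter from run 1
model = model.merge_and_unload()  # merge LoRA weights into the base weights

# Treat the merged model as the new base and start a second LoRA run on dataset2.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
model.print_trainable_parameters()
# ... continue with the usual Trainer / LMFlow finetuning loop on dataset2 ...
```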

[2023-04-21 22:17:06,284] [INFO] [launch.py:142:main] WORLD INFO DICT: {'localhost': [0]}
[2023-04-21 22:17:06,284] [INFO] [launch.py:148:main] nnodes=1, num_local_procs=1, node_rank=0
[2023-04-21 22:17:06,284] [INFO] [launch.py:161:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0]})
[2023-04-21 22:17:06,284] [INFO] [launch.py:162:main] dist_world_size=1
[2023-04-21 22:17:06,284]...

**Describe the bug**
(gh_lmflow) ub2004@ub2004-B85M-A0:~/llm_dev/LMFlow$ bash ./scripts/run_finetune.sh
[2023-04-20 15:03:57,800] [WARNING] [runner.py:186:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
[2023-04-20 15:03:57,810] [INFO] [runner.py:550:main] cmd = /usr/bin/python3...

(label: bug)

* Restarting with stat
[2023-04-20 08:22:35,933] [INFO] [comm.py:652:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[W socket.cpp:426] [c10d] The server socket has failed to bind to [::]:10086 (errno: 98 -...
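
For context, errno 98 (EADDRINUSE) means port 10086 is already bound by another process. A purely illustrative check like the one below confirms whether a port is free before handing it to the launcher; the port number is simply the one from the log:

```python
# Check that a TCP port is free before passing it to deepspeed / torch.distributed.
# 10086 is taken from the log above; substitute any port you plan to use.
import socket

def port_is_free(port: int, host: str = "") -> bool:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:  # EADDRINUSE (errno 98) and friends land here
            return False

print(port_is_free(10086))
```

If the port is taken, launching with a different one (for example via the deepspeed launcher's --master_port option) avoids the bind failure.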

[2023-04-09 13:43:32,793] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 11961
[2023-04-09 13:43:32,907] [ERROR] [launch.py:324:sigkill_handler] ['/home/seali/anaconda3/bin/python', '-u', 'examples/finetune.py', '--local_rank=0', '--model_name_or_path', 'gpt2', '--dataset_path', '/home/seali/LMFlow-main/data/alpaca/train', '--output_dir', '/home/seali/LMFlow-main/output_models/finetune', '--overwrite_output_dir', '--num_train_epochs', '0.01', '--learning_rate', '2e-5', '--block_size', '512', '--per_device_train_batch_size', '1',...

When I run ./scripts/run_chatbot.sh to test the model, the following messages appear:
./scripts/run_chatbot.sh: line 3: /home/ubuntu/llama-7b-hf: Is a directory
[2023-04-19 16:43:07,239] [WARNING] [runner.py:186:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
Detected CUDA_VISIBLE_DEVICES=1: setting --include=localhost:1
[2023-04-19 16:43:07,265]...

I can confirm that 29500 is not being used.....
Traceback (most recent call last):
  File "/data1/xxx/chat/LMFlow/service/app.py", line 35, in <module>
    model = AutoModel.get_model(model_args, tune_strategy='none', ds_config=ds_config, init_method="tcp://localhost:29501")
  File "/data1/xxx/chat/LMFlow/src/lmflow/models/auto_model.py", line 16, in...
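
If a hard-coded port keeps colliding, one workaround (a sketch only; the only LMFlow call assumed here is the get_model signature visible in the traceback) is to let the OS pick a free port and build the init_method string from it:

```python
# Hypothetical helper: ask the kernel for a free ephemeral port and format it
# as the init_method string passed to AutoModel.get_model() in the traceback above.
# Note: there is a small race window between closing the probe socket and using the port.
import socket

def free_tcp_init_method(host: str = "localhost") -> str:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))  # port 0 -> kernel assigns a free port
        port = s.getsockname()[1]
    return f"tcp://{host}:{port}"

print(free_tcp_init_method())  # e.g. tcp://localhost:41873
```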

I've been fine-tuning a model with LoRA and now I want to continue that process for another, related task using a different dataset. I noticed that there is a pre-finetuned...
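
As an alternative to merging the adapter first, recent peft releases can load a saved LoRA adapter with its weights left trainable, so training can resume directly on the new dataset; the model name and adapter path below are placeholders:

```python
# Sketch: reload a previously finetuned LoRA adapter as trainable and continue
# finetuning it on a different dataset. Paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("gpt2")
model = PeftModel.from_pretrained(
    base,
    "output_models/lora_run1",  # adapter produced by the earlier run
    is_trainable=True,          # keep the LoRA weights trainable instead of frozen
)
model.print_trainable_parameters()
# ... hand `model` to the usual Trainer / finetuning loop with the new dataset ...
```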