PaddleFleetX
Frequent problems while setting up the development environment
- Does this project have a ready-to-use Docker environment?
The problems encountered so far:
- ModuleNotFoundError: No module named 'paddle.fluid'
- ImportError: libcudart.so.10.2: cannot open shared object file: No such file or directory
- Incompatibilities or installation failures in some other packages, for example:
#7 74.65 ERROR: Command errored out with exit status 1:
#7 74.65 command: /usr/bin/python /usr/local/lib/python3.7/dist-packages/pip-20.0.1-py3.7.egg/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-41peemij/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://mirror.baidu.com/pypi/simple -- 'setuptools>=64' wheel scikit-build 'setuptools_scm>=8'
#7 74.65 cwd: None
#7 74.65 Complete output (9 lines):
#7 74.65 Looking in indexes: https://mirror.baidu.com/pypi/simple
#7 74.65 Collecting setuptools>=64
#7 74.65 Downloading https://mirror.baidu.com/pypi/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl (804 kB)
#7 74.65 Collecting wheel
#7 74.65 Downloading https://mirror.baidu.com/pypi/packages/c7/c3/55076fc728723ef927521abaa1955213d094933dc36d4a2008d5101e1af5/wheel-0.42.0-py3-none-any.whl (65 kB)
#7 74.65 Collecting scikit-build
#7 74.65 Downloading https://mirror.baidu.com/pypi/packages/fa/af/b3ef8fe0bb96bf7308e1f9d196fc069f0c75d9c74cfaad851e418cc704f4/scikit_build-0.17.6-py3-none-any.whl (84 kB)
#7 74.65 ERROR: Could not find a version that satisfies the requirement setuptools_scm>=8 (from versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.4.1, 1.5.0, 1.5.2, 1.5.3, 1.5.4, 1.5.5, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.11.1, 1.13.0, 1.13.1, 1.14.0rc1, 1.14.0, 1.15.0rc1, 1.15.0, 1.15.1rc1, 1.15.4, 1.15.5, 1.15.6, 1.15.7, 1.16.0, 1.16.1, 1.16.2, 1.17.0, 2.0.0, 2.1.0, 3.0.0, 3.0.1, 3.0.2, 3.0.4, 3.0.5, 3.0.6, 3.1.0, 3.2.0, 3.3.1, 3.3.2, 3.3.3, 3.4.0, 3.4.1, 3.4.2, 3.4.3, 3.5.0, 4.0.0, 4.1.0, 4.1.1, 4.1.2, 5.0.0, 5.0.1, 5.0.2, 6.0.0, 6.0.1, 6.1.0.dev0, 6.1.0, 6.1.1, 6.2.0, 6.3.0, 6.3.1, 6.3.2, 6.4.0, 6.4.1, 6.4.2, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.0.5, 7.1.0)
#7 74.65 ERROR: No matching distribution found for setuptools_scm>=8
- Does the paddlepaddle-gpu==0.0.0.post112 version change over time? Is there information on which specific paddle version PaddleFleetX corresponds to?
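The first two failures in the list can be checked outside of a training run. The following is a minimal diagnostic sketch, not part of PaddleFleetX; the helper names `module_available` and `shared_lib_loadable` are hypothetical and used here only for illustration. It reports whether a module can be located and whether the dynamic linker can load a given shared library.

```python
import ctypes
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing or broken parent package raises ImportError
        # (ModuleNotFoundError is a subclass of it).
        return False


def shared_lib_loadable(name: str) -> bool:
    """Return True if the dynamic linker can load the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Module and library names are taken from the error messages above.
    for mod in ("paddle", "paddle.fluid"):
        print(f"{mod}: {'found' if module_available(mod) else 'missing'}")
    lib = "libcudart.so.10.2"
    print(f"{lib}: {'loadable' if shared_lib_loadable(lib) else 'not loadable'}")
```

If `paddle` is found but `paddle.fluid` is missing, the installed paddle build probably does not match what the code expects. If the library is not loadable, the CUDA runtime on the machine does not match the one the paddle wheel was built against.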
The image built from the Dockerfile on the v2.4.0 branch also has a Python version problem:
/workspace python ./tools/train.py -c ./ppfleetx/configs/nlp/gpt/pretrain_gpt_345M_single_card.yaml
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
Traceback (most recent call last):
File "./tools/train.py", line 32, in <module>
from ppfleetx.data import build_dataloader
File "/workspace/ppfleetx/data/__init__.py", line 23, in <module>
from ppfleetx.data import dataset, sampler, utils
File "/workspace/ppfleetx/data/dataset/__init__.py", line 22, in <module>
from .gpt_dataset import GPTDataset, LM_Eval_Dataset, Lambada_Eval_Dataset
File "/workspace/ppfleetx/data/dataset/gpt_dataset.py", line 27, in <module>
from ppfleetx.data.tokenizers import GPTTokenizer
File "/workspace/ppfleetx/data/tokenizers/__init__.py", line 16, in <module>
from .ernie_tokenizer import get_ernie_tokenizer
File "/workspace/ppfleetx/data/tokenizers/ernie_tokenizer.py", line 15, in <module>
from paddlenlp.transformers import ErnieTokenizer
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/__init__.py", line 35, in <module>
from . import (
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/data/__init__.py", line 18, in <module>
from .data_collator import *
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/data/data_collator.py", line 26, in <module>
from ..transformers import BertTokenizer
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/__init__.py", line 16, in <module>
from .configuration_utils import PretrainedConfig
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/configuration_utils.py", line 37, in <module>
from ..utils.download import resolve_file_path
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/utils/download/__init__.py", line 18, in <module>
from typing import Dict, Literal, Optional, Union
ImportError: cannot import name 'Literal' from 'typing' (/usr/lib/python3.7/typing.py)
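`typing.Literal` was added to the standard library in Python 3.8, so the import on the last line of the traceback cannot succeed on the Python 3.7 interpreter in this image. Below is a minimal sketch of a guard that makes the mismatch explicit. The function name `literal_supported` is hypothetical and is not part of paddlenlp, and the fallback assumes the third-party `typing_extensions` package is installed.

```python
import sys


def literal_supported(version_info=sys.version_info) -> bool:
    """typing.Literal exists in the standard library from Python 3.8 onward."""
    return tuple(version_info[:2]) >= (3, 8)


if literal_supported():
    from typing import Literal  # available on Python 3.8+
else:
    # Assumption: the typing_extensions backport may be installed.
    try:
        from typing_extensions import Literal
    except ImportError:
        Literal = None
        print("Python %d.%d lacks typing.Literal; use Python >= 3.8"
              % sys.version_info[:2])
```

This guard does not change what paddlenlp itself imports. The practical options are to run the image on Python 3.8 or newer, or to pin an older paddlenlp release that still supports Python 3.7, if one fits.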
branch v2.4.0
- ModuleNotFoundError: No module named 'paddle.distributed.fleet.meta_parallel.sharding.sharding_utils'
We recommend using https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm
Does Paddle support loading and saving large models such as LLama 13B/GPT-3 30B? Is there a reference implementation?
We suggest opening an issue in PaddleNLP; the relevant developers will reply there. https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm