MiniCPM
MiniCPM inference environment
Is there an existing issue?
- [x] I have searched, and there is no existing issue.
Describe the bug
The model repo currently does not have a requirements.txt, so many users will likely run into problems getting it to run.
I tried running it in the latest NVIDIA Docker container image with a newer build of xformers. The environment that currently works for me is listed below for other users' reference (a condensed, unofficial pin list follows the full package dump):
Package Version Editable project location
------------------------- ------------------------ -------------------------
absl-py 2.0.0
accelerate 0.26.1
aiofiles 23.2.1
aiohttp 3.9.1
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
anyio 4.2.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
asttokens 2.4.1
astunparse 1.6.3
async-timeout 4.0.3
attrs 23.1.0
audioread 3.0.1
beautifulsoup4 4.12.2
bleach 6.1.0
blis 0.7.11
cachetools 5.3.2
catalogue 2.0.10
certifi 2023.11.17
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
cloudpickle 3.0.0
cmake 3.27.9
colorama 0.4.6
comm 0.2.0
confection 0.1.4
contourpy 1.2.0
cubinlinker 0.3.0+2.gbde7348
cuda-python 12.3.0rc4+8.gcb4e395
cudf 23.10.0
cugraph 23.10.0
cugraph-dgl 23.10.0
cugraph-service-client 23.10.0
cugraph-service-server 23.10.0
cuml 23.10.0
cupy-cuda12x 12.2.0
cycler 0.12.1
cymem 2.0.8
Cython 3.0.6
dask 2023.9.2
dask-cuda 23.10.0
dask-cudf 23.10.0
debugpy 1.8.0
decorator 5.1.1
defusedxml 0.7.1
distributed 2023.9.2
dm-tree 0.1.8
einops 0.7.0
exceptiongroup 1.2.0
execnet 2.0.2
executing 2.0.1
expecttest 0.1.3
fastapi 0.109.0
fastjsonschema 2.19.0
fastrlock 0.8.2
ffmpy 0.3.1
filelock 3.13.1
flash-attn 2.0.4
fonttools 4.46.0
frozenlist 1.4.0
fsspec 2023.12.0
gast 0.5.4
google-auth 2.25.0
google-auth-oauthlib 0.4.6
gradio 3.48.0
gradio_client 0.6.1
graphsurgeon 0.4.6
grpcio 1.59.3
h11 0.14.0
httpcore 1.0.2
httptools 0.6.1
httpx 0.26.0
huggingface-hub 0.20.3
hypothesis 5.35.1
idna 3.6
importlib-metadata 7.0.0
importlib-resources 6.1.1
iniconfig 2.0.0
intel-openmp 2021.4.0
ipykernel 6.27.1
ipython 8.18.1
ipython-genutils 0.2.0
jedi 0.19.1
Jinja2 3.1.2
joblib 1.3.2
json5 0.9.14
jsonschema 4.20.0
jsonschema-specifications 2023.11.2
jupyter_client 8.6.0
jupyter_core 5.5.0
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab_pygments 0.3.0
jupyterlab-server 1.2.0
jupytext 1.16.0
kiwisolver 1.4.5
langcodes 3.3.0
lazy_loader 0.3
librosa 0.10.1
llvmlite 0.40.1
locket 1.0.0
Markdown 3.5.1
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.8.2
matplotlib-inline 0.1.6
mdit-py-plugins 0.4.0
mdurl 0.1.2
mistune 3.0.2
mkl 2021.1.1
mkl-devel 2021.1.1
mkl-include 2021.1.1
mock 5.1.0
mpmath 1.3.0
msgpack 1.0.7
multidict 6.0.4
murmurhash 1.0.10
nbclient 0.9.0
nbconvert 7.12.0
nbformat 5.9.2
nest-asyncio 1.5.8
networkx 2.6.3
ninja 1.11.1.1
notebook 6.4.10
numba 0.57.1+1.g4157f3379
numpy 1.24.4
nvfuser 0.1.1+gitunknown
nvidia-dali-cuda120 1.32.0
nvidia-pyindex 1.0.9
nvtx 0.2.5
oauthlib 3.2.2
onnx 1.15.0rc2
opencv-python-headless 4.9.0.80
optree 0.10.0
orjson 3.9.12
packaging 23.2
pandas 1.5.3
pandocfilters 1.5.0
parso 0.8.3
partd 1.4.1
pexpect 4.9.0
Pillow 9.5.0
pip 23.3.1
platformdirs 4.1.0
pluggy 1.3.0
ply 3.11
polygraphy 0.49.1
pooch 1.8.0
preshed 3.0.9
prettytable 3.9.0
prometheus-client 0.19.0
prompt-toolkit 3.0.41
protobuf 4.24.4
psutil 5.9.4
ptxcompiler 0.8.1+2.g5ad1474
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 12.0.1
pyarrow-hotfix 0.6
pyasn1 0.5.1
pyasn1-modules 0.3.0
pybind11 2.11.1
pybind11-global 2.11.1
pycocotools 2.0+nv0.8.0
pycparser 2.21
pydantic 1.10.13
pydantic_core 2.16.1
pydub 0.25.1
Pygments 2.17.2
pylibcugraph 23.10.0
pylibcugraphops 23.10.0
pylibraft 23.10.0
pynvml 11.4.1
pyparsing 3.1.1
pytest 7.4.3
pytest-flakefinder 1.1.0
pytest-rerunfailures 13.0
pytest-shard 0.1.2
pytest-xdist 3.5.0
python-dateutil 2.8.2
python-dotenv 1.0.1
python-hostlist 1.23.0
python-multipart 0.0.6
pytorch-quantization 2.1.2
pytz 2023.3.post1
PyYAML 6.0.1
pyzmq 25.1.2
raft-dask 23.10.0
ray 2.9.1
referencing 0.31.1
regex 2023.10.3
requests 2.31.0
requests-oauthlib 1.3.1
rich 13.7.0
rmm 23.10.0
rpds-py 0.13.2
rsa 4.9
ruff 0.1.15
safetensors 0.4.2
scikit-learn 1.2.0
scipy 1.11.4
semantic-version 2.10.0
Send2Trash 1.8.2
sentencepiece 0.1.99
setuptools 68.2.2
shellingham 1.5.4
six 1.16.0
smart-open 6.4.0
sniffio 1.3.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.5
soxr 0.3.7
spacy 3.7.2
spacy-legacy 3.0.12
spacy-loggers 1.0.5
sphinx-glpi-theme 0.4.1
srsly 2.4.8
stack-data 0.6.3
starlette 0.35.1
sympy 1.12
tabulate 0.9.0
tbb 2021.11.0
tblib 3.0.0
tensorboard 2.9.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorrt 8.6.1
terminado 0.18.0
thinc 8.2.1
threadpoolctl 3.2.0
thriftpy2 0.4.17
tinycss2 1.2.1
tokenizers 0.15.1
toml 0.10.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.0
torch 2.2.0a0+81ea7a4
torch-tensorrt 2.2.0a0
torchdata 0.7.0a0
torchtext 0.17.0a0
torchvision 0.17.0a0
tornado 6.4
tqdm 4.66.1
traitlets 5.9.0
transformer-engine 1.1.0+cf6fc89
transformers 4.38.0.dev0
treelite 3.9.1
treelite-runtime 3.9.1
triton 2.1.0+6e4932c
typer 0.9.0
types-dataclasses 0.6.6
typing_extensions 4.8.0
ucx-py 0.34.0
uff 0.6.9
urllib3 1.26.18
uvicorn 0.27.0.post1
uvloop 0.19.0
vllm 0.2.2+cu123
wasabi 1.1.2
watchfiles 0.21.0
wcwidth 0.2.12
weasel 0.3.4
webencodings 0.5.1
websockets 11.0.3
Werkzeug 3.0.1
wheel 0.42.0
xdoctest 1.0.2
xformers 0.0.24+6600003.d20240116 /custom-build/xformers
xgboost 1.7.6
yarl 1.9.3
zict 3.0.0
zipp 3.17.0
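Since the repo does not ship a requirements.txt yet, here is a minimal, unofficial pin list distilled from the table above, limited to the packages most relevant to inference. It assumes the stable releases closest to the container's pre-release builds behave the same; treat it as a starting point, not an official file.

```
# Unofficial pins distilled from the working environment above; adjust to your CUDA/PyTorch build.
torch==2.2.0              # container ships 2.2.0a0+81ea7a4; closest stable release
transformers==4.38.0      # container uses 4.38.0.dev0
accelerate==0.26.1
sentencepiece==0.1.99
flash-attn==2.0.4         # note: transformers' FlashAttention-2 path needs >=2.1.0 (see the error below)
xformers==0.0.24          # custom build 0.0.24+6600003 in the list above
gradio==3.48.0
vllm==0.2.2
```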
To Reproduce
Refer to README.md; a minimal sketch of the inference call is included below for context.
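For context, a minimal sketch of the kind of inference being run in this environment. The model ID and generation settings are assumptions for illustration, not the repo's official example; see README.md for the supported entry points.

```python
# Minimal sketch, not the official MiniCPM example.
# Assumes the Hugging Face checkpoint "openbmb/MiniCPM-2B-sft-bf16"; substitute the one you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM-2B-sft-bf16"  # assumption for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the RTX 4090 above supports bf16
    trust_remote_code=True,
).cuda()

inputs = tokenizer("Hello, MiniCPM!", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```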
Expected behavior
No response
Screenshots
No response
Environment
The base container environment is as follows:
PyTorch version: 2.2.0a0+81ea7a4
Is debug build: False
CUDA used to build PyTorch: 12.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.27.9
Libc version: glibc-2.35
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.5.0-14-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.3.107
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4090
Nvidia driver version: 525.147.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] onnx==1.15.0rc2
[pip3] optree==0.10.0
[pip3] pytorch-quantization==2.1.2
[pip3] torch==2.2.0a0+81ea7a4
[pip3] torch-tensorrt==2.2.0a0
[pip3] torchdata==0.7.0a0
[pip3] torchtext==0.17.0a0
[pip3] torchvision==0.17.0a0
[pip3] triton==2.1.0+6e4932c
[conda] Could not collect
Additional context
No response
Hi, there are currently conflicts between different inference environments; we are working on a clearer set of usage instructions.
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: you need flash_attn package version to be greater or equal than 2.1.0. Detected version 2.0.4. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
Hi, you may need to install a newer version of flash_attn. On the command line, run pip install "flash_attn>=2.1.0" (the quotes keep the shell from treating >= as a redirection).
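If you want the code to degrade gracefully instead of failing, here is a small sketch that picks the attention backend based on the installed flash_attn version (attn_implementation is the standard transformers argument; "sdpa" is the PyTorch 2.x fallback):

```python
# Sketch: choose the attention backend from the installed flash_attn version.
import importlib.metadata

from packaging import version

try:
    flash_ok = version.parse(importlib.metadata.version("flash_attn")) >= version.parse("2.1.0")
except importlib.metadata.PackageNotFoundError:
    flash_ok = False

attn_impl = "flash_attention_2" if flash_ok else "sdpa"  # PyTorch scaled_dot_product_attention fallback
print("Using attention implementation:", attn_impl)
# Pass attn_implementation=attn_impl to AutoModelForCausalLM.from_pretrained(...).
```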
@LDLINGLINGLING @jsonwull The goal of this issue is to provide a reference environment for other users. It is not feedback about the flash_attn version; with the versions above fully pinned, the program runs normally and quickly :D