
n_ctx doesn't work for Yarn-Llama-2-13B-64K-GGUF?

Open · surflip opened this issue 2 years ago · 1 comment

I use Yarn-Llama-2-13B-64K-GGUF and set n_ctx = 8192 in the ooba webui, but as soon as the context exceeds 4096 tokens it replies with gibberish. How can I fix this?

The response looks like: "c [tO r {tk { tO {n----------------a { {c * {thoo - [nt2####n0'.ttt?? {tr1tl (we1 {', [s,' go', [th1 {o } ; [th [[tto \t */t[ ) [t {to <st57 f [tf /p g }, ;t , +o'. }te 'n', {tu [w ['. {2 + { { [1 }/o , \o'.o (tOooo5'. ', #st ]Ott46t13st71stisk01w0a {c71d0tsto (tttse2o1ttp####o {'15tboodthtcettgutstn0o2 + 507 '''ta7 {n1thee }tO1sto ]Owo##o69 (f {e',tttr8 [tnestno1tpr4tsttft \h *i7ta2c3ttthtstatO002te0t(fth1thtO',+tO the]O0d.'5lltbe1 {totc1tO6eyd0trobh'. "
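For reference, this is roughly how the model can be loaded directly with ctransformers while requesting the larger context. The repo and quantization file names below are assumptions for illustration. Note that YaRN-scaled models extend the context via modified RoPE scaling, so if the backend does not apply that scaling, raising `context_length` alone may still produce gibberish past the base model's 4096-token window:

```python
from ctransformers import AutoModelForCausalLM

# Sketch only: repo/file names are assumed, not verified against the Hub.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Yarn-Llama-2-13B-64K-GGUF",          # assumed Hub repo
    model_file="yarn-llama-2-13b-64k.Q4_K_M.gguf",  # assumed quant file
    model_type="llama",
    context_length=8192,  # requested context window
    gpu_layers=50,        # offload layers to the 4090
)

print(llm("Hello, how are you?", max_new_tokens=64))
```

If the output degrades only beyond 4096 tokens, that points at missing YaRN/RoPE-scaling support in the loader rather than at the `context_length` setting itself.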

surflip · Sep 04 '23 13:09

I used LM Studio 0.2.3 to load the same model; the response is exactly the same.

Windows 10, i7-13700K, RTX 4090, NVIDIA-SMI 536.99, Driver Version 536.99, CUDA Version 12.2

`pip list` output:

```
Package                     Version
--------------------------- ----------------
absl-py                     1.4.0
accelerate                  0.22.0
aiofiles                    23.1.0
aiohttp                     3.8.5
aiosignal                   1.3.1
altair                      5.1.1
anyio                       4.0.0
appdirs                     1.4.4
async-timeout               4.0.3
attrs                       23.1.0
auto-gptq                   0.4.2+cu117
bitsandbytes                0.41.1
cachetools                  5.3.1
certifi                     2022.12.7
charset-normalizer          2.1.1
click                       8.1.7
colorama                    0.4.6
coloredlogs                 15.0.1
contourpy                   1.1.0
ctransformers               0.2.26
cycler                      0.11.0
datasets                    2.14.4
dill                        0.3.7
diskcache                   5.6.3
docker-pycreds              0.4.0
einops                      0.6.1
exceptiongroup              1.1.3
exllama                     0.1.0
fastapi                     0.95.2
ffmpy                       0.3.1
filelock                    3.9.0
fonttools                   4.42.1
frozenlist                  1.4.0
fsspec                      2023.6.0
gitdb                       4.0.10
GitPython                   3.1.33
google-auth                 2.22.0
google-auth-oauthlib        1.0.0
gptq-for-llama              0.1.0+cu117
gradio                      3.33.1
gradio_client               0.2.5
grpcio                      1.57.0
h11                         0.14.0
httpcore                    0.17.3
httpx                       0.24.1
huggingface-hub             0.16.4
humanfriendly               10.0
idna                        3.4
Jinja2                      3.1.2
jsonschema                  4.19.0
jsonschema-specifications   2023.7.1
kiwisolver                  1.4.5
linkify-it-py               2.0.2
llama-cpp-python            0.1.83
llama-cpp-python-cuda       0.1.83+cu117
llama-cpp-python-ggml       0.1.78+cpuavx2
llama-cpp-python-ggml-cuda  0.1.78+cu117
Markdown                    3.4.4
markdown-it-py              2.2.0
MarkupSafe                  2.1.2
matplotlib                  3.7.2
mdit-py-plugins             0.3.3
mdurl                       0.1.2
mpmath                      1.2.1
multidict                   6.0.4
multiprocess                0.70.15
networkx                    3.0
ninja                       1.11.1
numpy                       1.24.0
nvidia-cublas-cu12          12.2.5.6
nvidia-cuda-runtime-cu12    12.2.140
oauthlib                    3.2.2
optimum                     1.12.0
orjson                      3.9.5
packaging                   23.1
pandas                      2.1.0
pathtools                   0.1.2
peft                        0.5.0
Pillow                      10.0.0
pip                         23.2.1
protobuf                    4.24.2
psutil                      5.9.5
py-cpuinfo                  9.0.0
pyarrow                     13.0.0
pyasn1                      0.5.0
pyasn1-modules              0.3.0
pydantic                    1.10.12
pydub                       0.25.1
Pygments                    2.16.1
pyparsing                   3.0.9
pyreadline3                 3.4.1
python-dateutil             2.8.2
python-multipart            0.0.6
pytz                        2023.3
PyYAML                      6.0.1
referencing                 0.30.2
regex                       2023.8.8
requests                    2.31.0
requests-oauthlib           1.3.1
rouge                       1.0.1
rpds-py                     0.10.0
rsa                         4.9
safetensors                 0.3.1
scipy                       1.11.2
semantic-version            2.10.0
sentencepiece               0.1.99
sentry-sdk                  1.30.0
setproctitle                1.3.2
setuptools                  68.0.0
six                         1.16.0
smmap                       5.0.0
sniffio                     1.3.0
starlette                   0.27.0
sympy                       1.11.1
tensorboard                 2.14.0
tensorboard-data-server     0.7.1
tokenizers                  0.13.3
toolz                       0.12.0
torch                       2.0.1+cu117
torchaudio                  2.0.2+cu117
torchvision                 0.15.2+cu117
tqdm                        4.66.1
transformers                4.32.1
typing_extensions           4.7.1
tzdata                      2023.3
uc-micro-py                 1.0.2
urllib3                     1.26.13
uvicorn                     0.23.2
wandb                       0.15.9
websockets                  11.0.3
Werkzeug                    2.3.7
wheel                       0.38.4
xxhash                      3.3.0
yarl                        1.9.2
```

ctransformers 0.2.25+cu117 gives the same result.

surflip · Sep 04 '23 14:09