
I do not know how to use some LLM models with GPU (some deprecated stuff)

Open · makrse opened this issue 9 months ago • 2 comments

Describe the bug

How do I fix this issue?

Is there an existing issue for this?

  • [x] I have searched the existing issues

Reproduction

Just trying to load this particular LLM model (see the screenshot below).

Screenshot

[Screenshot: imagen_2024-05-01_172140672]

Logs

17:12:06-551238 ERROR    Could not find the character "None" inside characters/. No character has been loaded.
Traceback (most recent call last):
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\gradio\queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\gradio\route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1786, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\gradio\blocks.py", line 1338, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\installer_files\env\Lib\site-packages\gradio\utils.py", line 759, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "D:\text-generation-webui\modules\chat.py", line 673, in load_character
    raise ValueError
ValueError
17:12:20-281373 INFO     Loading "TheBloke_WizardLM-Uncensored-Falcon-7B-GPTQ"
17:12:20-360289 INFO     The AutoGPTQ params are: {'model_basename': 'model', 'device': 'cuda:0', 'use_triton': False,
                         'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True,
                         'trust_remote_code': True, 'max_memory': {0: '7000MiB', 'cpu': '99GiB'}, 'quantize_config':
                         None, 'use_cuda_fp16': False, 'disable_exllama': False, 'disable_exllamav2': False}
D:\text-generation-webui\installer_files\env\Lib\site-packages\transformers\modeling_utils.py:4371: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead

System Info

windows 10
gpu 3090

makrse avatar May 01 '24 09:05 makrse

I have the same problem, but on Linux with an RTX 3050 GPU.

mailsonm avatar May 02 '24 14:05 mailsonm

There is no problem; it's just a warning, and the model still loads. If you want to get rid of the warning, disable the ExLlamaV2 kernel, as sketched below.
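
For reference, a minimal sketch of doing that from the command line, assuming the CLI flag mirrors the disable_exllamav2 entry in the AutoGPTQ params logged above (if you use the one-click installer, the same flag can go in CMD_FLAGS.txt):

    python server.py --loader autogptq --disable_exllamav2 --model TheBloke_WizardLM-Uncensored-Falcon-7B-GPTQ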

P.S.: This model is also available as a GGUF.
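
If you want to try the GGUF route instead, a minimal sketch with the llama.cpp loader (the file name below is hypothetical; check the actual quant names in the model repo, and tune --n-gpu-layers to your VRAM):

    python server.py --loader llama.cpp --model wizardlm-uncensored-falcon-7b.Q4_K_M.gguf --n-gpu-layers 35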

Alkohole avatar May 10 '24 13:05 Alkohole