Live Chat doesn't work; keeps prompting to download the model
Describe the bug
see video
Expected behavior
Click the button to start Live Chat, get prompted to download the model, click the microphone, and start chatting.
Screenshots
Debugging information: Flatpak 7.0.1
https://github.com/user-attachments/assets/5f4a4d69-7cfe-481d-84f5-cdbcfaf761bf
I can only see a black screen with a mouse cursor in the video you posted.
In any case, you can also trigger the STT model download by selecting the microphone in the main window; maybe that could help you.
Let's see if this video works any better. All I can see is Alpaca chewing up 100% of all available cores and 20 GB of memory.
https://github.com/user-attachments/assets/b817608c-8f36-41f5-9228-d9bca42e0968
Can you open it from your terminal so we can see if there's a specific error in the background?
flatpak run com.jeffser.Alpaca
INFO [main.py | main] Alpaca version: 7.0.1
INFO [ollama_instances.py | start] Starting Alpaca's Ollama instance...
INFO [ollama_instances.py | start] Started Alpaca's Ollama instance
Couldn't find '/home/justin/.ollama/id_ed25519'. Generating new private key.
time=2025-07-03T07:19:58.822+08:00 level=INFO source=routes.go:1235 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES:1 HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/justin/.var/app/com.jeffser.Alpaca/data/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:1 http_proxy: https_proxy: no_proxy:]"
Your new public key is:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICczu+us73vQOLkyLJJ/pOPNAB0tmMRTD0L0D0xlX9WR
time=2025-07-03T07:19:58.824+08:00 level=INFO source=images.go:476 msg="total blobs: 24"
time=2025-07-03T07:19:58.824+08:00 level=INFO source=images.go:483 msg="total unused blobs removed: 0"
time=2025-07-03T07:19:58.825+08:00 level=INFO source=routes.go:1288 msg="Listening on [::]:11434 (version 0.9.3)"
time=2025-07-03T07:19:58.825+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
[GIN] 2025/07/03 - 07:19:58 | 200 | 61.315µs | 127.0.0.1 | GET "/api/version"
time=2025-07-03T07:19:58.847+08:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-07-03T07:19:58.847+08:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="62.5 GiB" available="51.6 GiB"
INFO [ollama_instances.py | start]
[GIN] 2025/07/03 - 07:19:58 | 200 | 630.144µs | 127.0.0.1 | GET "/api/tags"
[GIN] 2025/07/03 - 07:19:58 | 200 | 14.75675ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/07/03 - 07:19:58 | 200 | 18.537645ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/07/03 - 07:19:58 | 200 | 54.177908ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/07/03 - 07:19:58 | 200 | 91.29583ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/07/03 - 07:19:59 | 200 | 163.222503ms | 127.0.0.1 | POST "/api/show"
INFO [ollama_instances.py | start] Starting Alpaca's Ollama instance...
INFO [ollama_instances.py | start] Started Alpaca's Ollama instance
Error: listen tcp 0.0.0.0:11434: bind: address already in use
[GIN] 2025/07/03 - 07:20:18 | 200 | 53.399µs | 127.0.0.1 | GET "/api/version"
INFO [ollama_instances.py | start]
[GIN] 2025/07/03 - 07:20:18 | 200 | 536.573µs | 127.0.0.1 | GET "/api/tags"
(python3:2): Gtk-WARNING **: 07:20:18.346: Trying to measure GtkLabel 0x7fe63c000a30 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:20:18.400: Trying to measure GtkLabel 0x7fe63c000a30 for width of 252, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:20:18.455: Trying to measure GtkLabel 0x7fe63c000a30 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:20:57.587: Trying to measure GtkLabel 0x5560c0f5dca0 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:20:57.596: Trying to measure GtkLabel 0x5560c0f5dca0 for width of 324, but it needs at least 367
/app/lib/python3.12/site-packages/whisper/transcribe.py:126: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
(python3:2): Gtk-WARNING **: 07:21:12.097: Trying to measure GtkLabel 0x5560c0f69490 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:21:12.106: Trying to measure GtkLabel 0x5560c0f69490 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:21:55.408: Invalid text buffer iterator: either the iterator is uninitialized, or the characters/paintables/widgets in the buffer have been modified since the iterator was created.
You must use marks, character numbers, or line numbers to preserve a position across buffer modifications.
You can apply tags and insert marks without invalidating your iterators,
but any mutation that affects 'indexable' buffer contents (contents that can be referred to by character offset)
will invalidate all outstanding iterators
(python3:2): Gtk-CRITICAL **: 07:21:55.408: gtk_text_buffer_insert: assertion 'gtk_text_iter_get_buffer (iter) == buffer' failed
Exception in thread Thread-21 (recognize_audio):
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1012, in run
self._target(*self._args, **self._kwargs)
File "/app/share/Alpaca/alpaca/widgets/voice.py", line 190, in recognize_audio
result = model.transcribe(audio_data, language=language)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/transcribe.py", line 279, in transcribe
result: DecodingResult = decode_with_fallback(mel_segment)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/transcribe.py", line 195, in decode_with_fallback
decode_result = model.decode(segment, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/decoding.py", line 824, in decode
result = DecodingTask(model, options).run(mel)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/decoding.py", line 737, in run
tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/decoding.py", line 687, in _main_loop
logits = self.inference.logits(tokens, audio_features)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/decoding.py", line 163, in logits
return self.model.decoder(tokens, audio_features, kv_cache=self.kv_cache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/whisper/model.py", line 236, in forward
self.token_embedding(x)
RuntimeError: The size of tensor a (3) must match the size of tensor b (0) at non-singleton dimension 1
(python3:2): Gtk-WARNING **: 07:22:30.736: Trying to measure GtkLabel 0x5560c1416e50 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:22:30.746: Trying to measure GtkLabel 0x5560c1416e50 for width of 324, but it needs at least 367
(python3:2): Gtk-WARNING **: 07:22:36.157: Invalid text buffer iterator: either the iterator is uninitialized, or the characters/paintables/widgets in the buffer have been modified since the iterator was created.
You must use marks, character numbers, or line numbers to preserve a position across buffer modifications.
You can apply tags and insert marks without invalidating your iterators,
but any mutation that affects 'indexable' buffer contents (contents that can be referred to by character offset)
will invalidate all outstanding iterators
(python3:2): Gtk-CRITICAL **: 07:22:36.158: gtk_text_buffer_insert: assertion 'gtk_text_iter_get_buffer (iter) == buffer' failed
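For what it's worth, the `RuntimeError: The size of tensor a (3) must match the size of tensor b (0)` inside `token_embedding` is the kind of failure Whisper can hit when `transcribe()` is fed an empty or near-silent capture buffer. This is a hypothetical sketch, not Alpaca's or Whisper's actual code: `safe_transcribe` and `MIN_SAMPLES` are names I made up, and the thresholds are assumptions. It only illustrates one way `recognize_audio` in `voice.py` could guard the call so the thread survives instead of dying with an unhandled exception.

```python
# Hypothetical guard around a Whisper-style transcribe() call (sketch only).
MIN_SAMPLES = 8000  # assumed: ~0.5 s at Whisper's expected 16 kHz sample rate

def safe_transcribe(model, audio, language="en"):
    """Skip transcription when the captured buffer is empty or near-silent.

    `model` is assumed to expose Whisper's transcribe(audio, language=...)
    signature; `audio` is a sequence of float samples in [-1, 1].
    Returns an empty result instead of letting a zero-length input reach
    the decoder, where it can crash with a tensor-size mismatch.
    """
    if len(audio) < MIN_SAMPLES or not any(abs(s) > 1e-4 for s in audio):
        return {"text": ""}  # nothing usable was recorded; don't call the model
    return model.transcribe(audio, language=language)
```

Wrapping the call at `voice.py` line 190 in a check like this (plus a `try/except RuntimeError` as a backstop) would at least keep the recognition thread alive while the underlying audio-capture problem is investigated.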