Error: Error generating speech: Failed to execute 'endOfStream' on 'MediaSource': The 'updating' attribute is true on one or more of this MediaSource's SourceBuffers.

Open ItsNoted opened this issue 10 months ago • 37 comments

Describe the bug
It was working great earlier, but now I'm getting this error:

Error generating speech: Failed to execute 'endOfStream' on 'MediaSource': The 'updating' attribute is true on one or more of this MediaSource's SourceBuffers.

Branch / Deployment used
Docker CPU quickstart version:
docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2

Operating System
Debian

ItsNoted avatar Feb 18 '25 13:02 ItsNoted

I have the same problem on Windows.

gitchat1 avatar Feb 18 '25 13:02 gitchat1

Weird. It worked great all day yesterday but now this error popped up today on some of the voices. I wonder if the API is having issues?

ItsNoted avatar Feb 18 '25 13:02 ItsNoted

@ItsNoted can you post the full logs?

fireblade2534 avatar Feb 18 '25 14:02 fireblade2534

I don't think that will help, since the installation doesn't fail, but here you go (Windows PowerShell):

docker run -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2
Unable to find image 'ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2' locally
v0.2.2: Pulling from remsky/kokoro-fastapi-cpu
fac405487e7b: Pulling fs layer
fac405487e7b: Download complete
15cd46d12611: Download complete
d94653bc0bd7: Download complete
614c4f55d6a2: Download complete
31312498c845: Download complete
041a0f34698b: Download complete
406e12cf3cd9: Download complete
4f4fb700ef54: Already exists
87ccd60e8dd6: Download complete
43d164395a1c: Download complete
Digest: sha256:76549cce3c5cc5ed4089619a9cffc3d39a041476ff99c5138cd18b6da832c4d7
Status: Downloaded newer image for ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.2
2025-02-18 14:42:39.528 | INFO | main:download_model:60 - Model files already exist and are valid
Building kokoro-fastapi @ file:///app
Built kokoro-fastapi @ file:///app
Uninstalled 1 package in 1ms
Installed 1 package in 1ms
INFO: Started server process [31]
INFO: Waiting for application startup.
02:42:54 PM | INFO | main:57 | Loading TTS model and voice packs...
02:42:54 PM | INFO | model_manager:38 | Initializing Kokoro V1 on cpu
02:42:54 PM | DEBUG | paths:101 | Searching for model in path: /app/api/src/models
02:42:54 PM | INFO | kokoro_v1:45 | Loading Kokoro model on cpu
02:42:54 PM | INFO | kokoro_v1:46 | Config path: /app/api/src/models/v1_0/config.json
02:42:54 PM | INFO | kokoro_v1:47 | Model path: /app/api/src/models/v1_0/kokoro-v1_0.pth
/app/.venv/lib/python3.10/site-packages/torch/nn/modules/rnn.py:123: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
warnings.warn(
/app/.venv/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:143: FutureWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
WeightNorm.apply(module, name, dim)
02:42:55 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
02:42:55 PM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
02:42:55 PM | DEBUG | model_manager:77 | Using default voice 'af_heart' for warmup
02:42:55 PM | INFO | kokoro_v1:73 | Creating new pipeline for language code: a
02:42:56 PM | DEBUG | kokoro_v1:244 | Generating audio for text with lang_code 'a': 'Warmup text for initialization.'
02:42:57 PM | DEBUG | kokoro_v1:251 | Got audio chunk with shape: torch.Size([57600])
02:42:57 PM | INFO | model_manager:84 | Warmup completed in 2782ms
02:42:57 PM | INFO | main:101 |

░░░░░░░░░░░░░░░░░░░░░░░░

╔═╗┌─┐┌─┐┌┬┐
╠╣ ├─┤└─┐ │
╚  ┴ ┴└─┘ ┴
╦╔═┌─┐┬┌─┌─┐
╠╩╗│ │├┴┐│ │
╩ ╩└─┘┴ ┴└─┘

░░░░░░░░░░░░░░░░░░░░░░░░

Model warmed up on cpu: kokoro_v1
CUDA: False
67 voice packs loaded

Beta Web Player: http://0.0.0.0:8880/web/
or http://localhost:8880/web/
░░░░░░░░░░░░░░░░░░░░░░░░

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8880 (Press CTRL+C to quit)
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43556 - "GET /web/ HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43556 - "GET /web/styles/base.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43594 - "GET /web/styles/forms.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43578 - "GET /web/styles/header.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43568 - "GET /web/styles/layout.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43596 - "GET /web/styles/player.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43610 - "GET /web/styles/responsive.css HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43556 - "GET /web/styles/badges.css HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43578 - "GET /web/styles/controls.css HTTP/1.1" 200 OK
INFO: 172.17.0.1:43594 - "GET /web/src/App.js HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43594 - "GET /web/src/services/VoiceService.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:43596 - "GET /web/src/state/PlayerState.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:43578 - "GET /web/src/services/AudioService.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:43556 - "GET /web/src/components/PlayerControls.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:43568 - "GET /web/src/components/WaveVisualizer.js HTTP/1.1" 200 OK
INFO: 172.17.0.1:43610 - "GET /web/src/components/VoiceSelector.js HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
INFO: 172.17.0.1:43594 - "GET /web/src/components/TextEditor.js HTTP/1.1" 200 OK
02:43:54 PM | INFO | openai_compatible:65 | Created global TTSService instance
02:43:54 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
INFO: 172.17.0.1:43578 - "GET /v1/audio/voices HTTP/1.1" 200 OK
02:43:54 PM | DEBUG | paths:307 | Searching for web file in path: /app/web
02:44:10 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
INFO: 172.17.0.1:45700 - "POST /v1/audio/speech HTTP/1.1" 200 OK
02:44:10 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
02:44:10 PM | INFO | openai_compatible:135 | Starting audio generation with lang_code: None
02:44:10 PM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
02:44:10 PM | DEBUG | tts_service:228 | Using single voice path: /app/api/src/voices/v1_0/af_alloy.pt
02:44:10 PM | DEBUG | tts_service:253 | Using voice path: /app/api/src/voices/v1_0/af_alloy.pt
02:44:10 PM | INFO | tts_service:257 | Using lang_code 'a' for voice 'af_alloy' in audio stream
02:44:10 PM | INFO | text_processor:114 | Starting smart split for 6 chars
02:44:10 PM | DEBUG | text_processor:51 | Total processing took 16.50ms for chunk: 'Testt!'
02:44:10 PM | INFO | text_processor:236 | Yielding final chunk 1: 'Testt!' (6 tokens)
02:44:10 PM | DEBUG | kokoro_v1:244 | Generating audio for text with lang_code 'a': 'Testt!'
02:44:11 PM | DEBUG | kokoro_v1:251 | Got audio chunk with shape: torch.Size([28200])
02:44:11 PM | INFO | text_processor:242 | Split completed in 686.53ms, produced 1 chunks
02:44:17 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
INFO: 172.17.0.1:51710 - "POST /v1/audio/speech HTTP/1.1" 200 OK
02:44:17 PM | DEBUG | paths:153 | Scanning for voices in path: /app/api/src/voices/v1_0
02:44:17 PM | INFO | openai_compatible:135 | Starting audio generation with lang_code: None
02:44:17 PM | DEBUG | paths:131 | Searching for voice in path: /app/api/src/voices/v1_0
02:44:17 PM | DEBUG | tts_service:228 | Using single voice path: /app/api/src/voices/v1_0/af_alloy.pt
02:44:17 PM | DEBUG | tts_service:253 | Using voice path: /app/api/src/voices/v1_0/af_alloy.pt
02:44:17 PM | INFO | tts_service:257 | Using lang_code 'a' for voice 'af_alloy' in audio stream
02:44:17 PM | INFO | text_processor:114 | Starting smart split for 6 chars
02:44:17 PM | DEBUG | text_processor:51 | Total processing took 0.46ms for chunk: 'Testt!'
02:44:17 PM | INFO | text_processor:236 | Yielding final chunk 1: 'Testt!' (6 tokens)
02:44:17 PM | DEBUG | kokoro_v1:244 | Generating audio for text with lang_code 'a': 'Testt!'
02:44:18 PM | DEBUG | kokoro_v1:251 | Got audio chunk with shape: torch.Size([28200])
02:44:18 PM | INFO | text_processor:242 | Split completed in 507.60ms, produced 1 chunks

gitchat1 avatar Feb 18 '25 14:02 gitchat1

Mine was very similar. I no longer have the logs as I have turned off the container for now. I cannot seem to reproduce it again either. Strange series of events.

ItsNoted avatar Feb 18 '25 14:02 ItsNoted

Interestingly, when I try to build the whole thing from scratch, it errors out.

cpu docker compose up --build
[+] Building 1.6s (18/18) FINISHED docker:desktop-linux
=> [kokoro-tts internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.83kB 0.0s
=> [kokoro-tts internal] load metadata for docker.io/library/python:3.10-slim 1.3s
=> [kokoro-tts internal] load .dockerignore 0.0s
=> => transferring context: 407B 0.0s
=> [kokoro-tts stage-0 1/12] FROM docker.io/library/python:3.10-slim@sha256:66aad90b231f011cb80e1 0.0s
=> => resolve docker.io/library/python:3.10-slim@sha256:66aad90b231f011cb80e1966e03526a7175f058672 0.0s
=> [kokoro-tts internal] load build context 0.0s
=> => transferring context: 7.69kB 0.0s
=> CACHED [kokoro-tts stage-0 2/12] RUN apt-get update && apt-get install -y espeak-ng es 0.0s
=> CACHED [kokoro-tts stage-0 3/12] RUN curl -LsSf https://astral.sh/uv/install.sh | sh && mv 0.0s
=> CACHED [kokoro-tts stage-0 4/12] RUN useradd -m -u 1000 appuser && mkdir -p /app/api/src/m 0.0s
=> CACHED [kokoro-tts stage-0 5/12] WORKDIR /app 0.0s
=> CACHED [kokoro-tts stage-0 6/12] COPY --chown=appuser:appuser pyproject.toml ./pyproject.toml 0.0s
=> CACHED [kokoro-tts stage-0 7/12] RUN --mount=type=cache,target=/root/.cache/uv uv venv --p 0.0s
=> CACHED [kokoro-tts stage-0 8/12] COPY --chown=appuser:appuser api ./api 0.0s
=> CACHED [kokoro-tts stage-0 9/12] COPY --chown=appuser:appuser web ./web 0.0s
=> CACHED [kokoro-tts stage-0 10/12] COPY --chown=appuser:appuser docker/scripts/ ./ 0.0s
=> CACHED [kokoro-tts stage-0 11/12] RUN chmod +x ./entrypoint.sh 0.0s
=> CACHED [kokoro-tts stage-0 12/12] RUN if [ "true" = "true" ]; then python download_model.py 0.0s
=> [kokoro-tts] exporting to image 0.1s
=> => exporting layers 0.0s
=> => exporting manifest sha256:9e328694d0e783ffb7b500f305800f96f6ca3923603be360ad915839203a8f82 0.0s
=> => exporting config sha256:9e474636700e2d5226f15aa3f4a85a221cd78df1000a78ad86cc178bb4ff42e2 0.0s
=> => exporting attestation manifest sha256:1d74c280e98a50efeb21f0b79175dea2bceb51bc69ffea2863b434 0.0s
=> => exporting manifest list sha256:b35250eb6432e3c2fc14f03bb7a4c2d08ec741db9d63e43fcda696b69aea2 0.0s
=> => naming to docker.io/library/kokoro-fastapi-cpu-kokoro-tts:latest 0.0s
=> => unpacking to docker.io/library/kokoro-fastapi-cpu-kokoro-tts:latest 0.0s
=> [kokoro-tts] resolving provenance for metadata file 0.0s
[+] Running 3/3
✔ kokoro-tts Built 0.0s
✔ Network kokoro-fastapi-cpu_default Created 0.3s
✔ Container kokoro-fastapi-cpu-kokoro-tts-1 Created 0.1s
Attaching to kokoro-tts-1
kokoro-tts-1 | exec ./entrypoint.sh: no such file or directory
kokoro-tts-1 exited with code 1

gitchat1 avatar Feb 18 '25 15:02 gitchat1

I presume the error happened right after your last log?

fireblade2534 avatar Feb 18 '25 16:02 fireblade2534

This is likely unrelated. If you are on Windows, it's probably because Git converts the line endings, and that messes with its ability to find the file for some reason.

Try running this (so Git stops converting line endings to the Windows style in all repos):
git config --global core.autocrlf false
then this (or redownload the repo):
git add --renormalize .

fireblade2534 avatar Feb 18 '25 16:02 fireblade2534

Okay, regarding your first question: I did it on a completely new installation of Docker. I cleared my console before running the command, and what you see is the log that resulted from that. I'm on Windows but run Docker through WSL 2. Next question: how do I do what you described?

gitchat1 avatar Feb 18 '25 16:02 gitchat1

So where does the "Error generating speech: Failed to execute 'endOfStream' on 'MediaSource': The 'updating' attribute is true on one or more of this MediaSource's SourceBuffers." come from then?

fireblade2534 avatar Feb 18 '25 16:02 fireblade2534

Go into the folder that you cloned onto your machine and execute the commands in your WSL console. If you also want the line-ending setting on your Windows machine, execute the first command in your Windows terminal as well.

fireblade2534 avatar Feb 18 '25 17:02 fireblade2534

The web interface loads, and as soon as I try to generate an audio file this error pops up.

That's exactly what I did. I ran Docker, opened a console, and executed the command directly in the cloned Git folder. WSL was installed as part of my Docker installation. I don't know what you mean by line ending.

gitchat1 avatar Feb 18 '25 17:02 gitchat1

So it pops up in the web UI then?

fireblade2534 avatar Feb 18 '25 17:02 fireblade2534

OK, and does building it work? A line ending is how an operating system marks the end of a line of text. On Windows it's "\r\n" and on Linux it's "\n".
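
To make the line-ending point concrete, here is a minimal Python sketch. It rests on an assumption that nobody in this thread has verified against the actual file: that entrypoint.sh ended up with Windows line endings after checkout. The script contents and helper names below are hypothetical. When a script starts with a `#!` line, a trailing "\r" becomes part of the interpreter path, which is a common cause of a "no such file or directory" message even though the script itself exists.

```python
"""Sketch: detect and normalize Windows (CRLF) line endings in a shell script.

The script contents below are made up for illustration; they are not
the real entrypoint.sh from the repository.
"""


def has_crlf(data: bytes) -> bool:
    """Return True if the data contains at least one Windows line ending."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert Windows line endings (CRLF) to Linux line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def shebang_interpreter(data: bytes) -> bytes:
    """Return everything after '#!' on the first line, split at the first LF.

    With CRLF endings the result keeps a trailing carriage return,
    so the path no longer names a real interpreter.
    """
    first_line = data.split(b"\n", 1)[0]
    if not first_line.startswith(b"#!"):
        return b""
    return first_line[2:]


# Hypothetical script as it might look after a CRLF checkout on Windows.
crlf_script = b"#!/bin/sh\r\necho hello\r\n"

print(has_crlf(crlf_script))                    # True
print(shebang_interpreter(crlf_script))         # b'/bin/sh\r'
print(shebang_interpreter(to_lf(crlf_script)))  # b'/bin/sh'
```

Reading the real file in binary mode (`open(path, "rb").read()`) and passing it to `has_crlf` would be one way to check a local checkout before rebuilding.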

fireblade2534 avatar Feb 18 '25 17:02 fireblade2534

Yes, it pops up in the web UI. It does not build if you execute the commands shown on the README page. I still don't quite understand what you are trying to tell me about lines. Is that just some formatting thing? I can only refer you to the log above. The build fails with kokoro-tts-1 exited with code 1.

gitchat1 avatar Feb 18 '25 17:02 gitchat1

OK. What I mean by the line endings is that I had a similar issue to yours, where it wasn't able to find entrypoint.sh. It turned out that Git was changing Linux line endings to Windows ones, and that was producing the same error. As for my question "ok and does building it work?", I meant: after executing the commands I gave you, did it work? Also, are you using the GPU container or the CPU container?

fireblade2534 avatar Feb 18 '25 17:02 fireblade2534

Okay, using your commands I was able to build everything from source, but the error in the web UI remains. I'm using the CPU version.

gitchat1 avatar Feb 18 '25 18:02 gitchat1

What browser and generation settings are you using?

fireblade2534 avatar Feb 18 '25 18:02 fireblade2534

Thank you for the question. It was a Firefox problem! In Vivaldi it works as expected.

gitchat1 avatar Feb 18 '25 19:02 gitchat1

Currently running into the same issue. Have tried 3 browsers (Chrome, Edge, FF).

Installed via docker run (GPU), using:
docker run --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:v0.2.2 #NVIDIA GPU

I can connect to the web interface. Upon clicking Generate Speech I get this error: "Error generating speech: MediaSource.addSourceBuffer: Type not supported in MediaSource"

Changing the uploaded text, the model, or the file type doesn't seem to do anything for me.

Dayto123beast avatar Feb 18 '25 23:02 Dayto123beast

Yeah, I don't think it's related to the browser, because I checked and I'm getting it too.

fireblade2534 avatar Feb 18 '25 23:02 fireblade2534

I wouldn't know about the GPU version since I use the CPU one. Before doing anything else, try installing Vivaldi; I'm 99% sure that this will work. If not, it is always a good idea to build from source. I suspect that if you are on Windows you will run into the same problem I did. Here are the commands that should work to build it:

git config --global core.autocrlf false
git clone https://github.com/remsky/Kokoro-FastAPI.git
cd Kokoro-FastAPI
cd docker/gpu
docker compose up --build

Make sure you delete any copies of the Kokoro repository that you downloaded previously.

gitchat1 avatar Feb 19 '25 11:02 gitchat1

The issue I helped you with had nothing to do with the main issue of this thread. Your issue was that the Docker build could not find ./entrypoint.sh.

fireblade2534 avatar Feb 19 '25 15:02 fireblade2534

Yes, I know, but I've also learned that building things from scratch often helps with such problems.

gitchat1 avatar Feb 19 '25 15:02 gitchat1

Still errors. How can I fix this error? Thanks.

jdola avatar Feb 25 '25 17:02 jdola

What error are you getting, and are you using the CPU or the GPU build?

gitchat1 avatar Feb 25 '25 19:02 gitchat1

I got this error as well while testing the UI. Running Docker with the CPU version on Windows. I noticed that the error occurs in Firefox (135.0.1 (64-bit)), but Chrome runs this without issues.

florian-kalisch avatar Feb 25 '25 20:02 florian-kalisch

I'm using the GPU build. I get the error message "Error generating speech: Failed to execute 'endOfStream' on 'MediaSource': The 'updating' attribute is true on one or more of this MediaSource's SourceBuffers." in all browsers when using http://localhost:8880/web
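
Since the error text comes from the browser's MediaSource playback code and the server logs above show POST /v1/audio/speech returning 200 OK, one way to narrow things down is to request audio without the web player at all. The sketch below is only that, a sketch: the endpoint path, the port, and the voice name af_alloy appear in the logs earlier in this thread, but the JSON field names (model, input, voice, response_format) and the model name "kokoro" are assumptions based on the endpoint being described as OpenAI-compatible, and have not been checked against this server version.

```python
"""Sketch: request speech from the API directly, bypassing the web player.

Assumptions (not confirmed in this thread): the JSON field names and
the model name "kokoro". The path, port, and voice come from the logs.
"""
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "af_alloy",
    base_url: str = "http://localhost:8880",
    model: str = "kokoro",          # assumed model name
    response_format: str = "mp3",   # assumed field and value
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text: str, out_path: str = "test.mp3") -> int:
    """Send the request and write the returned bytes to out_path.

    Returns the number of bytes written. Needs a running server.
    """
    request = build_speech_request(text)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

Calling `save_speech("Testt!")` against a running container and getting a playable file would suggest that generation itself works and that the failure sits in the browser-side player.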

jdola avatar Feb 27 '25 11:02 jdola

It's a Firefox issue. I can confirm that the programme works in Vivaldi as well as Brave.

gitchat1 avatar Feb 27 '25 11:02 gitchat1

The old version worked in Firefox, but with the latest version I'm getting the MediaSource error. Works in Chrome though. Using the CPU Docker container.

dicksondickson avatar Feb 28 '25 11:02 dicksondickson