Bug: llama.cpp installed in WSL with CUDA, "Warming up..." fails and the server fails to start
The launch command is as follows:

.\sparktts_env\python.exe server.py --model_path Spark-TTS-0.5B --backend llama-cpp --llm_device cuda --tokenizer_device cuda --detokenizer_device cuda --wav2vec_attn_implementation sdpa --llm_attn_implementation sdpa --torch_dtype "bfloat16" --max_length 32768 --llm_gpu_memory_utilization 0.6 --host 0.0.0.0 --port 8000
[Fast-Spark-TTS] 2025-04-03 00:21:31 [INFO] [server:131] >> Warming up...
[Fast-Spark-TTS] 2025-04-03 00:26:59 [ERROR] [spark_engine:313] >> Semantic tokens prediction is empty, prompt:<|task_controllable_tts|><|start_content|>测试音频。<|end_content|><|start_style_label|><|gender_0|><|pitch_label_2|><|speed_label_2|><|end_style_label|>,llm output:GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
ERROR:    Traceback (most recent call last):
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\sparktts_env\Lib\site-packages\starlette\routing.py", line 692, in lifespan
    async with self.lifespan_context(app) as maybe_state:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\sparktts_env\Lib\contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\sparktts_env\Lib\site-packages\fastapi\routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
               ^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\sparktts_env\Lib\contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\server.py", line 384, in lifespan
    await warmup_engine(engine)
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\server.py", line 132, in warmup_engine
    await async_engine.generate_voice_async(
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\fast_tts\engine\spark_engine.py", line 775, in generate_voice_async
    first_output = await generate_audio(segments[0], acoustic_token=None)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\fast_tts\engine\spark_engine.py", line 749, in generate_audio
    generated = await self._generate_audio_tokens(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\AI\Fast-Spark-TTS-AllInOne-CUDA\fast_tts\engine\spark_engine.py", line 314, in _generate_audio_tokens
    raise ValueError(err_msg)
ValueError: Semantic tokens prediction is empty, prompt:<|task_controllable_tts|><|start_content|>测试音频。<|end_content|><|start_style_label|><|gender_0|><|pitch_label_2|><|speed_label_2|><|end_style_label|>,llm output:GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG.......................
ERROR: Application startup failed. Exiting.
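The repeated single-character output is the core symptom: the llama.cpp CUDA path returns a degenerate token stream instead of semantic tokens, so the warmup's token parse comes back empty and the server aborts startup. As an illustration only, a warmup sanity check could flag such a stream before raising; `looks_degenerate` and its threshold are hypothetical names, not part of Fast-Spark-TTS:

```python
# Hypothetical helper (not part of the project): detect a degenerate LLM
# stream such as the "GGGG..." warmup output above, where one character
# dominates the entire generation.
def looks_degenerate(output: str, max_repeat_ratio: float = 0.9) -> bool:
    """Return True when a single character makes up most of the output,
    which is the symptom seen here when CUDA inference produces garbage."""
    if not output:
        return True
    most_common = max(set(output), key=output.count)
    return output.count(most_common) / len(output) >= max_repeat_ratio

print(looks_degenerate("G" * 500))  # degenerate stream -> True
print(looks_degenerate("<|start_content|>normal varied tokens<|end_content|>"))
```

A check like this would only turn the crash into a clearer error message; the underlying issue is still the CUDA inference path producing garbage output.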