Davud Kakaie
@huseinzol05 I created a new Conda environment and was able to load and infer using `python -m vllm.entrypoints.openai.api_server --model openai/whisper-large-v3 --dtype bfloat16 --whisper-input-type input_features --max-model-len 448 --max-size-mb-whisper 100 --gpu_memory_utilization=0.80`. Works...
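For anyone wanting to query the server once it's up, here is a minimal sketch of building a transcription request against the OpenAI-compatible endpoint. The `/v1/audio/transcriptions` route and the `file`/`model` field names follow the OpenAI audio API convention; whether this exact route is exposed depends on your vLLM build, so treat it as an assumption.

```python
def transcription_request(audio_path: str,
                          model: str = "openai/whisper-large-v3",
                          base_url: str = "http://localhost:8000"):
    """Build the URL and multipart payload for a transcription call.

    Assumes an OpenAI-style audio endpoint; adapt the route if your
    server exposes a different one.
    """
    url = f"{base_url}/v1/audio/transcriptions"
    data = {"model": model}
    # The audio file goes in a multipart upload under the "file" field.
    files = {"file": open(audio_path, "rb")}
    return url, data, files

# Usage (assumes the server from the command above is running):
# import requests
# url, data, files = transcription_request("clip.wav")
# resp = requests.post(url, data=data, files=files)
# print(resp.json()["text"])
```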
@huseinzol05 your latest commit brought it to life, working like a charm. I don't know if, given how vLLM works, it's technically possible to get token/word-level timestamps?
This really helped with the timing of my sentences. A segment would start long before it was actually spoken, especially when music played between segments. However, my word-level timestamps...
@Muertoe Had the same issue; it turned out that some of the files `create_data.py` produced were invalid. In my case there were 25 invalid files, 78 bytes each.
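If it helps anyone else debug this, here is a small sketch for flagging the kind of invalid output files described above. The 78-byte size comes from my case; the `max_bytes` threshold and directory layout are assumptions you should adapt to your own data.

```python
from pathlib import Path

def find_suspect_files(root: str, max_bytes: int = 100):
    """Return files under `root` that are suspiciously small (likely invalid).

    Assumption: invalid outputs are near-empty files, e.g. the 78-byte
    files mentioned above; tune `max_bytes` for your dataset.
    """
    return sorted(p for p in Path(root).rglob("*")
                  if p.is_file() and p.stat().st_size <= max_bytes)

# Usage:
# for p in find_suspect_files("dataset/"):
#     print(p, p.stat().st_size, "bytes")
```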