
Linux Mint: Killed, crash during transcription

Open mv1769 opened this issue 9 months ago • 27 comments

Good evening,

I've had noScribe crash several times now while transcribing, and the terminal output says 'Killed'. Do you have the same problem? Does anyone know what's going on and how to fix it?

Image

mv1769 avatar Mar 28 '25 23:03 mv1769

Hmm, no, I have no idea. For a start, can you tell me more about your system? Are you on Windows, Mac, or Linux? If on Windows, are you using CUDA or not?

kaixxx avatar Mar 29 '25 09:03 kaixxx

Ah, and which model are you using for transcription? The default "precise" setting?

kaixxx avatar Mar 29 '25 09:03 kaixxx

Yes, sorry! I'm on Linux Mint, and I'm using "precise" for transcription because with the others I got an error message about model.bin...

Image

but I resolved it

mv1769 avatar Mar 29 '25 09:03 mv1769

Thank you. A few more questions:

  • What Python version do you use? I would recommend 3.12.
  • Did you download the following model in the /model/precise folder: https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo? This is the one I recommend.
  • Did you make any other changes to resolve the issues you had?

kaixxx avatar Mar 29 '25 10:03 kaixxx

  • I think I'm using Python 3.10, because in my terminal I see:

/home/.../.../noScribe-0.6/.venv/lib/python3.10/site-packages/pyannote/audio/models/blocks/pooling.py:104: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /pytorch/aten/src/ATen/native/ReduceOps.cpp:1831.)

std = sequences.std(dim=-1, correction=1)

(I'm a beginner on Linux so maybe I did something wrong...)

  • Yes, I downloaded it into this folder and copied the files from faster-whisper-large-v3-turbo into the precise folder:

Image

  • Yes, one of the main changes we made at installation time was needed because the line numbers in the text editor did not correspond to these instructions:

"Edit the noScribe.py file as follows in order to be able to open noScribeEdit from noScribe:
Edit line 566 so that it reads program = os.path.join(app_dir, 'noScribeEdit', "noScribeEdit.py")
Edit line 578 so that it reads Popen(['python3', program, file], **kwargs) #might need to be python instead of python3 depending on your environment
Edit line 580 so that it reads Popen(['python3', program], **kwargs)"

so we made the changes at these lines instead:

line 753: (maybe I made an error, because it's not noScribe.py but noScribe.exe...)

Image

lines 775 and 777:

Image

mv1769 avatar Mar 29 '25 11:03 mv1769

And I discovered this morning that the HTML file exists in my folder even though the app disappeared... (but it's much shorter than I expected: I had meant to reduce the length to 30 minutes, but I must have made a mistake and set 30 seconds... attention problems don't help!)

mv1769 avatar Mar 29 '25 11:03 mv1769

Thank you, that looks fine.

I discovered this morning that the HTML file exists in my folder even though the app disappeared

Interesting. We had a similar error (still unresolved) where noScribe crashes after the transcription is finished (so far only reported on Windows and with cuda). Is it the same in your case?

kaixxx avatar Mar 29 '25 11:03 kaixxx

I don't think it's the same, because it crashes before the end, as you can see here:

Image

The HTML file contains the first two minutes before 'Killed', whereas I had asked for 30 minutes, as you can see from the output above (see the "INFO faster whisper" line).

mv1769 avatar Mar 29 '25 11:03 mv1769

Ok, it seems that there is an issue with the faster-whisper library on your machine. Let's test this with a minimal example.

Please create a new file in the main folder of noScribe and name it test_whisper.py. Copy and paste the following code into this file:

from faster_whisper import WhisperModel

# adjust the following paths for your system:
model_folder = "C:/path/to/noscribe/models/precise"
audio_input = "C:/path/to/an/audio/file.wav"   # <= mp3 works too

print("loading whisper model...")
model = WhisperModel(model_folder, local_files_only=True)

segments, info = model.transcribe(audio_input)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Make sure to adjust the paths at the beginning of the script.

Now navigate in your terminal to the noscribe folder and run the script with python3 test_whisper.py. Be patient, it takes some time for the model to load. After a while, you should see the transcribed text appear on your screen. Does it run through?

kaixxx avatar Mar 30 '25 10:03 kaixxx

Yes! I tried it both with and without the .venv. Without the .venv:

python3 test_whisper.py
Traceback (most recent call last):
  File "/home/juleny9687/Téléchargements/noScribe-0.6/test_whisper.py", line 1, in <module>
    from faster_whisper import WhisperModel
ModuleNotFoundError: No module named 'faster_whisper'

With source .venv/bin/activate:

python3 test_whisper.py
loading whisper model...
[2025-04-01 13:46:04.073] [ctranslate2] [thread 106801] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
Killed

mv1769 avatar Apr 01 '25 11:04 mv1769

Yes! I tried it both with and without the .venv. Without the .venv:

python3 test_whisper.py
Traceback (most recent call last):
  File "/home/juleny9687/Téléchargements/noScribe-0.6/test_whisper.py", line 1, in <module>
    from faster_whisper import WhisperModel
ModuleNotFoundError: No module named 'faster_whisper'

That is expected, because you haven't (and also shouldn't have) installed any packages globally.

With source .venv/bin/activate:

python3 test_whisper.py
loading whisper model...
[2025-04-01 13:46:04.073] [ctranslate2] [thread 106801] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
Killed

This seems to be a faster-whisper issue: https://github.com/SYSTRAN/faster-whisper/issues/373. Any other thoughts on this, @kaixxx?
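For context: on Linux, a bare "Killed" with no Python traceback usually means the kernel's OOM killer terminated the process because the machine ran out of memory (checking `dmesg` for "Out of memory" entries can confirm this). A minimal sketch to see how much memory is available before loading a large model, assuming a Linux system with /proc/meminfo:

```python
# Sketch: read MemAvailable from /proc/meminfo (Linux only) to see the
# memory headroom before loading a large Whisper model. A bare "Killed"
# in the terminal is typically the kernel OOM killer at work.

def available_memory_mb(meminfo_path="/proc/meminfo"):
    """Return the MemAvailable value from /proc/meminfo in megabytes."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                kb = int(line.split()[1])  # the value is reported in kB
                return kb / 1024
    raise RuntimeError("MemAvailable not found in " + meminfo_path)

if __name__ == "__main__":
    print(f"available memory: {available_memory_mb():.0f} MB")
```

If this reports only a couple of GB free, a large float32 model may simply not fit.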

gernophil avatar Apr 01 '25 12:04 gernophil

@gernophil The issue you've linked was about a particularly large file (> 10 h of audio). I don't think that's the case here, albeit the symptoms look similar.

Let's try a different model. Please change two lines in the above code:

  • model_folder = "small" (instead of the folder path)
  • model = WhisperModel(model_folder, local_files_only=False) (=> changing 'local_files_only' to False, nothing else)

This will download the small model automatically and use it for transcription. Run the script from within your venv.

kaixxx avatar Apr 01 '25 12:04 kaixxx

It still doesn't work :'( I get:

python3 test_whisper.py
loading whisper model...
config.json: 100%|█████████████████████████| 2.37k/2.37k [00:00<00:00, 9.12MB/s]
vocabulary.txt: 100%|█████████████████████████| 460k/460k [00:03<00:00, 122kB/s]
tokenizer.json: 100%|██████████████████████| 2.20M/2.20M [00:27<00:00, 80.4kB/s]
model.bin: 100%|██████████████████████████████| 484M/484M [29:48<00:00, 270kB/s]
[2025-04-01 15:20:10.900] [ctranslate2] [thread 116790] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
Killed

And the audio isn't that long: it lasts two hours (and I've already tried taking only the first 30 minutes, but I still got 'Killed'). On the other hand, maybe the file is too large and I need to change the format? I'll run a test.
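To check whether the WAV file itself is unusually heavy, its parameters can be inspected with Python's stdlib wave module (the path below is a placeholder for the actual file):

```python
# Sketch: report duration, size on disk, sample rate and channel count
# for a WAV file, using only the standard library.
import os
import wave

def wav_summary(path):
    """Return (duration_s, size_mb, framerate, channels) for a WAV file."""
    with wave.open(path, "rb") as w:
        duration = w.getnframes() / w.getframerate()
        return (duration,
                os.path.getsize(path) / 1e6,
                w.getframerate(),
                w.getnchannels())

if __name__ == "__main__":
    # placeholder path; adjust for your system
    print(wav_summary("/home/user/interview.wav"))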

mv1769 avatar Apr 01 '25 13:04 mv1769

Or, given this message in the terminal output: "The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead."

could it be that my version of Linux or my file (.WAV) is incompatible?

mv1769 avatar Apr 01 '25 13:04 mv1769

It looks like this type of error (but I'm on Linux, not Windows, so I don't know how to adapt the fix...):

https://github.com/SYSTRAN/faster-whisper/issues/42 https://github.com/SYSTRAN/faster-whisper/issues/727

mv1769 avatar Apr 01 '25 13:04 mv1769

How did you generate your venv? The Linux requirements.txt doesn't have cuda support implemented as far as I see.

gernophil avatar Apr 01 '25 14:04 gernophil

Forget what I just wrote. CUDA is the default for Linux (for Windows it's CPU; that's why I mixed it up). So, if the CUDA version is the issue, you need to install the older one: in requirements_linux.txt, replace the line torchaudio with torchaudio --index-url https://download.pytorch.org/whl/cu118 and recreate the venv (or run pip install torchaudio --index-url https://download.pytorch.org/whl/cu118 within the existing venv).

gernophil avatar Apr 01 '25 14:04 gernophil

Uh, in which requirements file? I have several. I've tried, but nothing works... (I'm a beginner in Linux and in coding, so maybe I made a mistake somewhere.)

Image

Image

mv1769 avatar Apr 01 '25 14:04 mv1769

The compute type inferred from the saved model is float16...

You can ignore this warning, I get it too on my machine. It refers to the format of the whisper model, not your audio.

Interesting idea, @gernophil, that it might be CUDA related. However, the current version of faster-whisper requires CUDA 12: https://github.com/SYSTRAN/faster-whisper#gpu So please do not downgrade to CUDA 11.8.
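The warning matters mostly for memory: falling back from float16 (2 bytes per weight) to float32 (4 bytes) roughly doubles the RAM the model weights need, which could push a machine with little free memory over the edge. A back-of-the-envelope sketch, where the parameter count is an assumption (roughly the size of large-v3-turbo):

```python
# Rough sketch of model-weight memory at different compute types.
# PARAMS is an assumption (approximately large-v3-turbo's size); the point
# is only that a float16 -> float32 fallback doubles the footprint.
PARAMS = 809_000_000  # assumed parameter count, not an exact figure
BYTES_PER_PARAM = {"float16": 2, "float32": 4, "int8": 1}

def weights_gb(compute_type, n_params=PARAMS):
    """Estimate the memory in GB occupied by the weights alone."""
    return n_params * BYTES_PER_PARAM[compute_type] / 1e9

if __name__ == "__main__":
    for ct in ("float16", "float32", "int8"):
        print(f"{ct}: {weights_gb(ct):.1f} GB")
```

If memory turns out to be the bottleneck, faster-whisper's WhisperModel also accepts a compute_type argument (e.g. compute_type="int8") that reduces this footprint.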

kaixxx avatar Apr 01 '25 14:04 kaixxx

Then maybe you haven't activated the proprietary NVIDIA driver. I remember this is necessary to fully use NVIDIA GPUs on Linux. I'll need to look that up for Mint when I'm at home.

gernophil avatar Apr 01 '25 14:04 gernophil

We can test whether it is CUDA related by switching temporarily to the CPU. Please change the code of your test_whisper.py by adding device="cpu" to the model creation:

model = WhisperModel(model_folder, local_files_only=False, device="cpu")

kaixxx avatar Apr 01 '25 14:04 kaixxx

I did it :

Image

and then :

Image

mv1769 avatar Apr 01 '25 15:04 mv1769

OK, this is a hard one. Let's rule out the audio file as a potential cause of the issue. Please try it with this: test_audio.zip

It's the beginning of this interview: https://youtu.be/ap_xtj1kPF0 (in German)

kaixxx avatar Apr 01 '25 15:04 kaixxx

IT WOOOORKS

Image

Now I'll try to convert the files without corrupting them - thanks!

mv1769 avatar Apr 01 '25 15:04 mv1769

It works with the command python3 test_whisper.py, but not with noScribe:

Image

And I tried with a new folder; still "Killed":

python3 test_whisper.py
loading whisper model...
[2025-04-01 23:56:03.869] [ctranslate2] [thread 3569420] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
Killed

mv1769 avatar Apr 01 '25 21:04 mv1769

I tried with a new folder; still "Killed"

What do you mean by "new folder"? Can you explain a bit more?

I'm uncertain if the audio file is the culprit. My general impression is that faster-whisper is unstable on your system for some reason. Even with the original audio file, noScribe was working for a few minutes in the first screenshot posted at the beginning of this thread, and then crashed.

Switching to Python 3.12 and creating a new venv with the latest version of faster-whisper and all its dependencies might help. If this doesn't work, it would be good to seek help over at the faster-whisper repository. I can support you with that; post a link here.

kaixxx avatar Apr 02 '25 07:04 kaixxx

Not sure if you're using CUDA or CPU now, but for CUDA it would be advisable to use the newest proprietary NVIDIA drivers: https://linuxmint-installation-guide.readthedocs.io/en/latest/drivers.html

I still think the audio file might be the culprit. In the faster-whisper thread they said splitting and remerging the file might help. Maybe that's worth a try.
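The split step can be sketched with Python's stdlib wave module (filenames are hypothetical; ffmpeg would work just as well):

```python
# Sketch: split a WAV file into fixed-length chunks using only the stdlib.
import wave

def split_wav(path, chunk_seconds, prefix="chunk"):
    """Write <prefix>_000.wav, <prefix>_001.wav, ... and return the names."""
    names = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            name = f"{prefix}_{index:03d}.wav"
            with wave.open(name, "wb") as dst:
                dst.setparams(params)  # nframes is patched on close
                dst.writeframes(frames)
            names.append(name)
            index += 1
    return names

if __name__ == "__main__":
    # hypothetical input file; adjust for your system
    print(split_wav("interview.wav", chunk_seconds=1800))
```

Transcribing one 30-minute chunk at a time would also tell us whether a specific part of the recording triggers the crash.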

gernophil avatar Apr 02 '25 07:04 gernophil