Eric Bizet issues


Results: 2 issues by Eric Bizet

Non-Latin transcripts cannot be written to files

This proposes adding UTF-8 encoding to the file writes used when exporting transcript results. Without an explicit encoding, the write fell back to ASCII, which did not allow Japanese characters, for instance, to be written correctly. Fixes https://github.com/shashikg/WhisperS2T/issues/53
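The kind of change proposed here can be sketched as follows. This is a minimal illustration, not the actual WhisperS2T code: `export_vtt_utf8` and this `format_timestamp` are hypothetical stand-ins, and the utterance keys (`start_time`, `end_time`, `text`) are taken from the traceback in the linked issue. The only essential part is passing `encoding="utf-8"` to `open`.

```python
import os
import tempfile


def format_timestamp(seconds: float) -> str:
    """Hypothetical helper: render seconds as an HH:MM:SS.mmm timestamp."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def export_vtt_utf8(transcript, path):
    """Hypothetical exporter: write utterances as WebVTT, always in UTF-8."""
    # An explicit encoding avoids depending on the platform or locale default,
    # which may be ASCII in some environments.
    with open(path, "w", encoding="utf-8") as f:
        f.write("WEBVTT\n\n")
        for utt in transcript:
            f.write(
                f"{format_timestamp(utt['start_time'])} --> "
                f"{format_timestamp(utt['end_time'])}\n{utt['text']}\n\n"
            )


# Usage: a single Japanese utterance written to a temporary file.
transcript = [{"start_time": 0.0, "end_time": 1.5, "text": "こんにちは"}]
out_path = os.path.join(tempfile.mkdtemp(), "example.vtt")
export_vtt_utf8(transcript, out_path)
```

When `encoding` is omitted, Python picks an encoding from the environment, so the same export can succeed on one machine and fail on another.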

Non-Latin characters cannot be exported to files

When exporting a transcript in Japanese I got:

```
File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/whisper_s2t/utils.py:95, in ExportVTT(transcript, file, single_sentence_in_one_utterance, end_punct_marks)
     93 f.write("WEBVTT\n\n")
     94 for _utt in transcript:
---> 95     f.write(f"{format_timestamp(_utt['start_time'])} --> {format_timestamp(_utt['end_time'])}\n{_utt['text']}\n\n")

UnicodeEncodeError: 'ascii'...
```