Error when processing long Japanese videos

Open DeepFal opened this issue 3 months ago • 3 comments

2024-11-07 21:11:45.306 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\streamlit\runtime\scriptrunner\exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 590, in code_to_exec
    exec(code, module.__dict__)
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\st.py", line 117, in <module>
    main()
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\st.py", line 113, in main
    text_processing_section()
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\st.py", line 30, in text_processing_section
    process_text()
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\st.py", line 47, in process_text
    step3_1_spacy_split.split_by_spacy()
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\core\step3_1_spacy_split.py", line 17, in split_by_spacy
    split_by_mark(nlp)
  File "C:\Users\deepf\Desktop\VideoLingo\VideoLingo\core\spacy_utils\split_by_mark.py", line 21, in split_by_mark
    doc = nlp(input_text)
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\spacy\language.py", line 1037, in __call__
    doc = self._ensure_doc(text)
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\spacy\language.py", line 1128, in _ensure_doc
    return self.make_doc(doc_like)
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\spacy\language.py", line 1120, in make_doc
    return self.tokenizer(text)
  File "C:\Users\deepf\anaconda3\envs\videolingo\lib\site-packages\spacy\lang\ja\__init__.py", line 56, in __call__
    sudachipy_tokens = self.tokenizer.tokenize(text)
Exception: Tokenization error: Input is too long, it can't be more than 49149 bytes, was 116123
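
The last frame shows the cause: the SudachiPy tokenizer behind spaCy's Japanese pipeline rejected a 116123-byte input because its limit is 49149 bytes. One possible workaround, sketched below, is to cut the transcript into pieces that each stay under that limit before calling `nlp`. This is only an illustration and not VideoLingo's actual fix; `split_by_byte_limit`, `SUDACHI_MAX_BYTES`, and the `breakers` parameter are hypothetical names introduced here.

```python
# Hypothetical helper, not part of VideoLingo, spaCy, or SudachiPy.
# The limit value is taken from the error message in the traceback.
SUDACHI_MAX_BYTES = 49149


def split_by_byte_limit(text, max_bytes=SUDACHI_MAX_BYTES, breakers="。！？!?\n"):
    """Split text into chunks of at most max_bytes UTF-8 bytes each.

    Cuts right after the last sentence-ending character that fits;
    if a chunk contains none, cuts at the last character that fits.
    """
    if max_bytes < 4:
        raise ValueError("max_bytes must hold at least one UTF-8 character")

    chunks = []
    start = 0        # index of the first character of the current chunk
    size = 0         # UTF-8 bytes accumulated in the current chunk
    last_break = -1  # index just past the last breaker seen in this chunk
    i = 0
    while i < len(text):
        ch_bytes = len(text[i].encode("utf-8"))
        if size + ch_bytes > max_bytes:
            # Prefer the sentence boundary; otherwise cut before this character.
            cut = last_break if last_break > start else i
            chunks.append(text[start:cut])
            start = i = cut
            size = 0
            last_break = -1
            continue
        size += ch_bytes
        if text[i] in breakers:
            last_break = i + 1
        i += 1
    if start < len(text):
        chunks.append(text[start:])
    return chunks


# Possible use inside split_by_mark, assuming `nlp` and `input_text`
# are the objects named in the traceback:
#     for doc in nlp.pipe(split_by_byte_limit(input_text)):
#         ...  # handle each chunk's sentences as before
```

Because the limit is measured in bytes and most Japanese characters take 3 bytes in UTF-8, the check uses the encoded size rather than `len(text)`. Concatenating the returned chunks reproduces the original text, so no characters are lost at the cut points.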


DeepFal, Nov 07 '24 13:11