Montreal-Forced-Aligner
[BUG] MFA 3.0.6 unable to open database file
Describe the issue
MFA Version: 3.0.6
MFA crashed during the training of the acoustic model:
Traceback (most recent call last):
File "/zhangpai21/envs/aligner3/bin/mfa", line 10, in <module>
sys.exit(mfa_cli())
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/rich_click/rich_command.py", line 126, in main
rv = self.invoke(ctx)
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 144, in train_acoustic_model_cli
trainer.train()
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 529, in train
self.align()
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 713, in align
session.query(CorpusWorkflow).filter(CorpusWorkflow.id == wf.id).update(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3251, in update
result: CursorResult[Any] = self.session.execute(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
return self._execute_internal(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2191, in _execute_internal
result: Result[Any] = compile_state_cls.orm_execute_statement(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/bulk_persistence.py", line 1617, in orm_execute_statement
return super().orm_execute_statement(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
result = conn.execute(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1422, in execute
return meth(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
return connection._execute_clauseelement(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1644, in _execute_clauseelement
ret = self._execute_context(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1850, in _execute_context
return self._exec_single_context(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1990, in _exec_single_context
self._handle_dbapi_exception(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2357, in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
self.dialect.do_execute(
File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: UPDATE corpus_workflow SET dirty=? WHERE corpus_workflow.id = ?]
[parameters: (1, 16)]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
For Reproducing your issue
Training Command:
path_corpus=/zhangpai21/webdataset/audio/fma_train_data/select_wav3kh
path_dict=/zhangpai21/workspace/cgy/1_projects/1_valle/lft_rep/test_mfa/my4.dict
path_model=/zhangpai21/workspace/cgy/1_projects/1_valle/lft_rep/test_mfa/model_trained_3kh
mfa train $path_corpus $path_dict $path_model --clean --num_jobs 100 --single_speaker
- Corpus structure
The path_corpus directory contains the .wav and .lab files. The .wav audio is in Mandarin, and each .lab file holds the phoneme sequence generated by a private g2p model. There are a total of 4,289,530 audio files in this folder, totaling approximately 3k hours. Lab file example:
m ei3 d ao4 q van2 vn4 h uei4 sp q ing1 vn4 h uei4 d eng3 q van2 g uo2 x ing4 sp d a4 x ing2 sp s an1 sh iii4 j v3 b an4 q i1 sp
Since g2p is not needed, I use a one-to-one mapping dictionary based on the solution in this issue. Dict example:
<pad> <pad>
<unk> <unk>
AA0 AA0
AA1 AA1
AA2 AA2
AE0 AE0
AE1 AE1
AE2 AE2
AH0 AH0
AH1 AH1
AH2 AH2
...
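For reference, a one-to-one dictionary like the one above could be built roughly as follows. This is only a sketch, not the script actually used for my4.dict: the helper names `collect_phones` and `write_identity_dict` are made up for illustration, and it assumes every .lab file is UTF-8 text with whitespace-separated phoneme symbols.

```python
from pathlib import Path


def collect_phones(corpus_dir):
    """Gather every distinct whitespace-separated symbol found in the .lab files."""
    phones = set()
    for lab_file in Path(corpus_dir).rglob("*.lab"):
        phones.update(lab_file.read_text(encoding="utf-8").split())
    return phones


def write_identity_dict(phones, dict_path):
    """Write one 'symbol symbol' line per phone, so each 'word' maps to itself."""
    lines = [f"{phone} {phone}" for phone in sorted(phones)]
    Path(dict_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Calling `write_identity_dict(collect_phones(path_corpus), path_dict)` would then produce the dictionary file. With millions of .lab files the scan is slow, so it may be worth running it once and keeping the result.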
Platform:
- OS: Linux
- Version: Ubuntu 20.04.6 LTS
MFA uses a SQLite database to store its data, so make sure the database file actually exists, and maybe try sudo mfa train ...
to check that you really have permission to read and write that file.
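Before reaching for sudo, the existence and permission checks can be done directly from Python. The snippet below is a minimal sketch: `check_sqlite_file` is a hypothetical helper, not part of MFA, and `db_path` is a placeholder, since the location of the database file depends on the installation and is not given in this thread.

```python
import os
import sqlite3
from pathlib import Path


def check_sqlite_file(db_path):
    """Report whether a SQLite file exists, is accessible, and can be opened."""
    path = Path(db_path).resolve()
    report = {
        "exists": path.is_file(),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        # SQLite also needs to create journal files next to the database.
        "dir_writable": os.access(path.parent, os.W_OK),
        "opens": False,
    }
    if report["exists"]:
        conn = None
        try:
            # mode=rw opens read-write and fails instead of creating a new file.
            conn = sqlite3.connect(path.as_uri() + "?mode=rw", uri=True)
            conn.execute("PRAGMA schema_version").fetchone()
            report["opens"] = True
        except sqlite3.Error:
            pass
        finally:
            if conn is not None:
                conn.close()
    return report
```

If `exists` is False, the file was removed or never created. If it exists but `opens` is False, the permission flags in the report should help narrow down why.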