pytorch-kaldi
shared_list does not have data_set in forward block with TIMIT tutorial
------------------------------ Epoch 23 / 23 ------------------------------
----- Summary epoch 23 / 23
Training on ['TIMIT_tr']
Loss = 0.932 | err = 0.298
-----
Validating on TIMIT_dev
Loss = 1.811 | err = 0.468
-----
Learning rate on architecture1 = 0.08
-----
Elapsed time (s) = 574
Testing TIMIT_test chunk = 1 / 1
shared list []
shared list [None, None, None, {'mfcc': ['mfcc', 'exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst', 'apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5_0827_test/data/test/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5_0827_test/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |', '5', '5']}, {}, {'MLP_layers1': ['architecture1', 'MLP_layers1', 0]}, {'input': None, 'ref': None}]
output folder exp/TIMIT_MLP_basic
data_set_dict <class 'dict'>
data_set_dict {'input': None, 'ref': None}
Traceback (most recent call last):
File "run_exp.py", line 340, in <module>
data_set_inp, data_set_ref = convert_numpy_to_torch(data_set_dict, save_gpumem, use_cuda)
File "/home/sysadmin/pytorch-kaldi/core.py", line 46, in convert_numpy_to_torch
data_set_inp=torch.from_numpy(data_set_dict['input']).float()
TypeError: expected np.ndarray (got NoneType)
# --------FORWARD--------#
for forward_data in forward_data_lst:

    # Compute the number of chunks
    N_ck_forward=compute_n_chunks(out_folder,forward_data,ep,N_ep_str_format,'forward')
    N_ck_str_format='0'+str(max(math.ceil(np.log10(N_ck_forward)),1))+'d'

    processes = list()
    info_files = list()

    for ck in range(N_ck_forward):

        if not is_production:
            print('Testing %s chunk = %i / %i' %(forward_data,ck+1, N_ck_forward))
        else:
            print('Forwarding %s chunk = %i / %i' %(forward_data,ck+1, N_ck_forward))

        # output file
        info_file=out_folder+'/exp_files/forward_'+forward_data+'_ep'+format(ep, N_ep_str_format)+'_ck'+format(ck, N_ck_str_format)+'.info'
        config_chunk_file=out_folder+'/exp_files/forward_'+forward_data+'_ep'+format(ep, N_ep_str_format)+'_ck'+format(ck, N_ck_str_format)+'.cfg'

        # Do forward if the chunk was not already processed
        if not(os.path.exists(info_file)):

            # Doing forward
            # getting the next chunk
            next_config_file=cfg_file_list[op_counter]

            # run chunk processing
            if _run_forwarding_in_subprocesses(config):
                shared_list = list()
                print("shared list",shared_list)
                output_folder = config['exp']['out_folder']
                save_gpumem = strtobool(config['exp']['save_gpumem'])
                use_cuda=strtobool(config['exp']['use_cuda'])
                p = read_next_chunk_into_shared_list_with_subprocess(read_lab_fea, shared_list, config_chunk_file, is_production, output_folder, wait_for_process=True)
                data_name, data_end_index_fea, data_end_index_lab, fea_dict, lab_dict, arch_dict, data_set_dict = extract_data_from_shared_list(shared_list)
                print("shared list", shared_list)
                print("output folder",output_folder)
                print("data_set_dict",type(data_set_dict))
                print("data_set_dict",data_set_dict)
                data_set_inp, data_set_ref = convert_numpy_to_torch(data_set_dict, save_gpumem, use_cuda)
When does shared_list get overwritten, and how can I get the correct data_set?
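As a side note, here is a minimal debugging sketch (not part of pytorch-kaldi; the field order is only an assumption inferred from the call to extract_data_from_shared_list above and from the printed list). It reports which shared_list positions came back as None, which is where the failure shows up.

FIELDS = ['data_name', 'data_end_index_fea', 'data_end_index_lab',
          'fea_dict', 'lab_dict', 'arch_dict', 'data_set_dict']

def report_shared_list(shared_list):
    # Print each assumed field name next to the type of the value the
    # subprocess left in shared_list, flagging entries that are still None.
    for name, value in zip(FIELDS, shared_list):
        status = 'MISSING (None)' if value is None else type(value).__name__
        print('%-22s %s' % (name, status))

In the output pasted above, data_name, both end-index entries, and the 'input'/'ref' arrays inside data_set_dict are None, i.e. the read_lab_fea subprocess never filled them in, which usually points to a feature/label reading problem for the test chunk.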
Hi! Isn't it simply a problem with the path of the test dataset in the config file?
Yes, it looks like that!
I will check again.
I'm still having trouble.
ERROR MSG
------------------------------ Epoch 23 / 23 ------------------------------
----- Summary epoch 23 / 23
Training on ['TIMIT_tr']
Loss = 0.932 | err = 0.298
-----
Validating on TIMIT_dev
Loss = 1.812 | err = 0.468
-----
Learning rate on architecture1 = 0.08
-----
Elapsed time (s) = 489
Testing TIMIT_test chunk = 1 / 1
config chunk file exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0.cfg
shared list [None, None, None, {'mfcc': ['mfcc', 'exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst', 'apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/test/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |', '5', '5']}, {}, {'MLP_layers1': ['architecture1', 'MLP_layers1', 0]}, {'input': None, 'ref': None}]
Traceback (most recent call last):
File "run_exp.py", line 338, in <module>
data_set_inp, data_set_ref = convert_numpy_to_torch(data_set_dict, save_gpumem, use_cuda)
File "/home/sysadmin/pytorch-kaldi/core.py", line 46, in convert_numpy_to_torch
data_set_inp=torch.from_numpy(data_set_dict['input']).float()
TypeError: expected np.ndarray (got NoneType)
cfg
[dataset1]
data_name = TIMIT_tr
fea = fea_name=mfcc
fea_lst=/home/sysadmin/kaldi/egs/timit/s5/data/train/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/train/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_train.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/sysadmin/kaldi/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_ali
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/sysadmin/kaldi/egs/timit/s5/data/train/
lab_graph=/home/sysadmin/kaldi/egs/timit/s5/exp/tri3/graph
n_chunks = 5
[dataset2]
data_name = TIMIT_dev
fea = fea_name=mfcc
fea_lst=/home/sysadmin/kaldi/egs/timit/s5/data/dev/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/dev/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_dev.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/sysadmin/kaldi/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_ali_dev
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/sysadmin/kaldi/egs/timit/s5/data/dev/
lab_graph=/home/sysadmin/kaldi/egs/timit/s5/exp/tri3/graph
n_chunks = 1
[dataset3]
data_name = TIMIT_test
fea = fea_name=mfcc
fea_lst=/home/sysadmin/kaldi/egs/timit/s5/data/test/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/test/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/sysadmin/kaldi/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_ali_test
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/sysadmin/kaldi/egs/timit/s5/data/test/
lab_graph=/home/sysadmin/kaldi/egs/timit/s5/exp/tri3/graph
n_chunks = 1
data_name, data_end_index_fea, data_end_index_lab, lab_dict, and data_set_dict are None. In particular, why can't lab_dict be read?
shared list [None, None, None, {'mfcc': ['mfcc', 'exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst', 'apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/test/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |', '5', '5']}, {}, {'MLP_layers1': ['architecture1', 'MLP_layers1', 0]}, {'input': None, 'ref': None}]
lab_folder
$ ls /home/sysadmin/kaldi/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_ali_test
ali.1.gz ali.2.gz ali.3.gz ali.4.gz final.mdl log num_jobs phones.txt tree
exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0.cfg
[cfg_proto]
cfg_proto = proto/global.proto
cfg_proto_chunk = proto/global_chunk.proto
[exp]
cmd =
run_nn_script = run_nn
out_folder = exp/TIMIT_MLP_basic
seed = 1257
use_cuda = False
multi_gpu = False
save_gpumem = False
n_epochs_tr = 24
production = False
to_do = forward
out_info = exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0.info
[batches]
batch_size_train = 128
max_seq_length_train = 1000
batch_size_valid = 128
max_seq_length_valid = 1000
[architecture1]
arch_name = MLP_layers1
arch_proto = proto/MLP.proto
arch_library = neural_networks
arch_class = MLP
arch_pretrain_file = exp/TIMIT_MLP_basic/exp_files/train_TIMIT_tr_ep23_ck4_architecture1.pkl
arch_freeze = False
arch_seq_model = False
dnn_lay = 1024,1024,1024,1024,1896
dnn_drop = 0.15,0.15,0.15,0.15,0.0
dnn_use_laynorm_inp = False
dnn_use_batchnorm_inp = False
dnn_use_batchnorm = True,True,True,True,False
dnn_use_laynorm = False,False,False,False,False
dnn_act = relu,relu,relu,relu,softmax
arch_lr = 0.08
arch_halving_factor = 0.5
arch_improvement_threshold = 0.001
arch_opt = sgd
opt_momentum = 0.0
opt_weight_decay = 0.0
opt_dampening = 0.0
opt_nesterov = False
[model]
model_proto = proto/model.proto
model = out_dnn1=compute(MLP_layers1,mfcc)
loss_final=cost_nll(out_dnn1,lab_cd)
err_final=cost_err(out_dnn1,lab_cd)
[forward]
forward_out = out_dnn1
normalize_posteriors = True
normalize_with_counts_from = exp/TIMIT_MLP_basic/exp_files/forward_out_dnn1_lab_cd.count
save_out_file = False
require_decoding = True
[data_chunk]
fea = fea_name=mfcc
fea_lst=exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst
fea_opts=apply-cmvn --utt2spk=ark:/home/sysadmin/kaldi/egs/timit/s5/data/test/utt2spk ark:/home/sysadmin/kaldi/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/sysadmin/kaldi/egs/timit/s5/exp/dnn4_pretrain-dbn_dnn_ali_test
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/sysadmin/kaldi/egs/timit/s5/data/test/
lab_graph=/home/sysadmin/kaldi/egs/timit/s5/exp/tri3/graph
Did you find a solution to this? I am having the exact same issue. I double-checked all paths in my cfg file and the same error is occurring.
Note: I am using PyTorch-Kaldi on WSL without CUDA (there is still no CUDA support on WSL); not sure if this makes a difference.
It looks like an error in reading features and labels with Kaldi. To debug, you can try to "manually" read the features in this way:
1. Select one ark file listed in /mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/feats.scp (e.g., quick_test/fbank/raw_fbank_dev.1.ark).
2. Run copy-feats ark:your_ark_file.ark ark,t:- . If everything works you should see a lot of numbers in standard output. If it doesn't work, take a look at the error.
3. If it works, you can add the options and write: copy-feats ark:your_ark.ark ark:- | apply-cmvn --utt2spk=ark:/mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/utt2spk ark:/mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/data/cmvn_test_dev93.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark,t:- . If it doesn't work, take a look at the error message.
You can also take a look at the log.log file in the output folder.
Please, let me know if you are able to solve the data loading issue...
On Wed, 2 Oct 2019 at 09:29, spencerkirn [email protected] wrote:
Did you find a solution to this? I am having the exact same issue. Double checked all paths in my cfg file and the same error is occurring.
Note: I am using PyTorch-Kaldi on WSL without CUDA (still no CUDA support on WSL) not sure if this would make a difference.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/mravanelli/pytorch-kaldi/issues/157?email_source=notifications&email_token=AEA2ZVTDF3BUVASYTBN5OV3QMSO3LA5CNFSM4IRZSXT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAEXV4A#issuecomment-537492208, or mute the thread https://github.com/notifications/unsubscribe-auth/AEA2ZVU5Z7HOD7RZI763UTDQMSO3LANCNFSM4IRZSXTQ .
Thank you for the quick reply. I apologize if these are basic questions, I am new to using Kaldi and this toolkit. So I ran copy-feats ark:/home/spencer/kaldi/egs/timit/s5/mfcc/raw_mfcc_dev.1.ark ark,t:-
and it ran just like you said it should, with a lot of numbers output to the terminal. After that I ran copy-feats ark:/home/spencer/kaldi/egs/timit/s5/mfcc/raw_mfcc_dev.1.ark ark:- | apply-cmvn --utt2spk=ark:/home/spencer/kaldi/egs/timit/data/dev/utt2spk ark:/home/spencer/kaldi/egs/timit/s5/data/cmvn_dev.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark, t:-
and got the attached error. One thing I noticed is that there is no cmvn_dev.ark in my data folder (no .ark files at all in that folder); is that meant to be an output, or should there already be a .ark file there? The error seems to be centered around that file.
[image: TIMITError] https://user-images.githubusercontent.com/49201733/66129779-8fc35180-e5be-11e9-8b3c-d0ea6a826948.PNG
Does /home/spencer/kaldi/egs/timit/s5/data/cmvn_dev.ark exist?
Mirco
No, like I said, there are no .ark files in that folder (or its subfolders). I thought this might be an output folder, but it looks like the issue is in the creation of those files.
This cmvn file is created by Kaldi during the feature extraction phase, and it performs mean and variance normalization. You should probably have the cmvn file somewhere else, like in data/dev/cmvn* or mfcc/cmvn*
Mirco
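As a rough, illustrative way to act on that suggestion (the root path below is just the one used in the commands earlier in this thread; adjust the patterns to your setup), you can list every cmvn archive in the recipe directory and pick the right one for fea_opts:

import glob
import os

# Hypothetical search over a TIMIT s5 recipe tree for the cmvn archives that
# apply-cmvn expects; their location depends on how the Kaldi recipe was run.
s5_root = '/home/spencer/kaldi/egs/timit/s5'
for pattern in ('data/*/cmvn*', 'mfcc/cmvn*'):
    for match in sorted(glob.glob(os.path.join(s5_root, pattern))):
        print(match)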
Yeah, I had the wrong path for the cmvn file, but when I run copy-feats ark:/home/spencer/kaldi/egs/timit/s5/mfcc/raw_mfcc_test.1.ark ark,t:- | apply-cmvn --utt2spk=ark:/home/spencer/kaldi/egs/timit/s5/data/test/utt2spk ark:/home/spencer/kaldi/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:-
I now get a Kaldi fatal error.
In case anyone else has this issue: I resolved it by bypassing the if statement on line 328 of run_exp.py. There was some issue in how the shared_list object was being created that I could not figure out, but the else branch runs the run_nn function in a similar fashion to the training and validation steps.
So I commented out line 328 and created another variable set to False to bypass that if statement:
test = False
# if _run_forwarding_in_subprocesses(config):
if test:
This is weird, are you sure you don't just have a path problem?
Yes, I checked all the paths in the config file and they were all correct. Bypassing that if statement, though, gave a result that looked very similar to the one in the tutorial.
[image: TIMITResult] https://user-images.githubusercontent.com/49201733/67582253-6ad28200-f717-11e9-9d6e-40d0d73a7744.PNG
Interesting, we haven't experienced this issue on our side.
There is still an error in the log.log file apparently (I had not checked that file when I got the correct result), something to do with decode_dnn.sh. It looks like the forward_TIMIT_test_ep*_ck*_out_dnn1_to_decode.ark files are not being created for some reason, though for whatever reason this does not seem to affect the outcome.
[image: TIMITError3] https://user-images.githubusercontent.com/49201733/67587312-965a6a00-f721-11e9-8b54-54dcbcebeef6.PNG
Maybe this file has not been created because there is a problem with the test data. Could you check it more carefully?
Mirco
I am also having an error at the testing phase.
------------------------------ Epoch 23 / 23 ------------------------------
----- Summary epoch 23 / 23
Training on ['TIMIT_tr']
Loss = 0.916 | err = 0.290
-----
Validating on TIMIT_dev
Loss = 1.674 | err = 0.450
-----
Learning rate on architecture1 = 0.0025
-----
Elapsed time (s) = 3338
Testing TIMIT_test chunk = 1 / 1
Traceback (most recent call last):
File "run_exp.py", line 475, in <module>
data_set_inp, data_set_ref = convert_numpy_to_torch(data_set_dict, save_gpumem, use_cuda)
File "/home/dev_ds/pytorch-kaldi/core.py", line 53, in convert_numpy_to_torch
data_set_inp = torch.from_numpy(data_set_dict["input"]).float()
TypeError: expected np.ndarray (got NoneType)
When I printed shared_list [print(shared_list)] in run_exp.py, it looks as below.
[None, None, None, {'mfcc': ['mfcc', '/home/dev_ds/kaldi_dnn/egs/timit/s5/exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst', 'apply-cmvn --utt2spk=ark:/home/dev_ds/kaldi_dnn/egs/timit/s5/data/test/utt2spk ark:/home/dev_ds/kaldi_dnn/egs/timit/s5/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |', '5', '5']}, {}, {'MLP_layers1': ['architecture1', 'MLP_layers1', 0]}, {'input': None, 'ref': None}]
I used the same validation data [dev] as the test data; training and validation have no errors, but testing with the same data throws this error.
@kumarh22 I got the same problem as you, have you solved it?
@mravanelli I also got the error in the test phase.
Testing TIMIT_test chunk = 1 / 1
info [None, None, None, {'mfcc': ['mfcc', 'exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0_mfcc.lst', 'apply-cmvn --utt2spk=ark:/home/zhang/code/kaldi_maked/egs/timit/s5/data/dev/utt2spk ark:/home/zhang/code/kaldi_maked/egs/timit/s5/mfcc/cmvn_dev.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |', '5', '5']}, {}, {'MLP_layers1': ['architecture1', 'MLP_layers1', 0]}, {'input': None, 'ref': None}]
Traceback (most recent call last):
File "run_exp.py", line 476, in
I had "manually" read the features to debug as you said above. It works in step2, and not came into error in step3(for step3, it runs for such a long time but without error, this is the same with eval file) and the log.log is just prov dopo prima
ps. I am using python3.7, torch 1.0 cpu only version
could you help me
Is the problem happening if you use the validation or training set as the test set?
Yes. I use the validation set as the test set, but it still happens.
I find that when I use the GPU version, the problem does not appear again.
Had the same issue today. Here are some findings:
Why does it only happen when running on CPU?
Because when the CPU is used, the forward pass runs in a subprocess, and the method that runs the forward pass in a subprocess uses another version of the read_lab_fea method, read_lab_fea_refac01, while the same-process forward pass uses the original read_lab_fea method.
So why does it crash when using the read_lab_fea_refac01 method?
First of all, because it switches to production mode when reading fea_dict, lab_dict and arch_dict. By removing this line I fixed the initial issue. But there is another problem: it will also return -1 as data_end_index, so run_nn will crash anyway.
How to fix:
You can update this method to return False. I tried to use read_lab_fea instead of read_lab_fea_refac01, but it crashes anyway when trying to unpack the shared_list: the shared_list has 6 items, not 7, since there is only one item for the data_end_index data.
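In case it helps anyone following this thread, here is a minimal sketch of the workaround described above, assuming "this method" refers to _run_forwarding_in_subprocesses (the same predicate that was bypassed earlier in the thread); editing it in a local copy of pytorch-kaldi forces the same-process forward path, so the original read_lab_fea is used instead of read_lab_fea_refac01:

# Hedged workaround for a local checkout only: always take the same-process
# forward path. This sidesteps the subprocess reader at the cost of not
# running the forward pass in a separate process.
def _run_forwarding_in_subprocesses(config):
    return False

This is a workaround rather than a fix; the underlying issue is that read_lab_fea_refac01 populates shared_list differently (6 items, with a single data_end_index) than the unpacking code expects.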