
Error when running ttrain/mf-lmd6remi-1.sh 

taktak1 opened this issue 2 years ago · 5 comments

Currently I'm trying to run Museformer.

I don't know how to deal with an error that occurs at the model training stage. (I fixed the errors leading up to this point.)

ttrain/mf-lmd6remi-1.sh

...................


2022-11-11 07:08:28 | WARNING | fairseq.tasks.fairseq_task | 290 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 4, 5, 6, 7, 8, 9, 10]
2022-11-11 07:08:28 | INFO | fairseq.trainer | begin training epoch 1
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 352, in cli_main
    distributed_utils.call_main(args, main)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/distributed_utils.py", line 301, in call_main
    main(args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 125, in main
    valid_losses, should_stop = train(args, trainer, task, epoch_itr)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 208, in train
    log_output = trainer.train_step(samples)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/trainer.py", line 480, in train_step
    loss, sample_size_i, logging_output = self.task.train_step(
  File "/usr/local/lib/python3.8/dist-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
    net_output = model(**sample["net_input"])
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/models/fairseq_model.py", line 481, in forward
    return self.decoder(src_tokens, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "muzic/museformer/museformer/museformer_decoder.py", line 413, in forward
    x, extra = self.extract_features(
  File "muzic/museformer/museformer/museformer_decoder.py", line 645, in extract_features
    (sum_x, reg_x), inner_states = self.run_layers(
  File "/content/muzic/museformer/museformer/museformer_decoder.py", line 731, in run_layers
    x, _ = layer(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "muzic/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
    x, attn = self.run_self_attn(
  File "muzic/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
    r, weight = self.self_attn(
TypeError: 'NotImplementedError' object is not callable
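
For what it's worth, my understanding is that this kind of error means self_attn is not a callable module at all but an instance of NotImplementedError, as if a placeholder was never replaced with a real attention implementation. A minimal sketch (hypothetical names, not the actual Museformer code) that reproduces the same message:

class DecoderLayer:
    def __init__(self):
        # Placeholder attribute that was never swapped for a real attention
        # module; it holds an exception instance, which is not callable.
        self.self_attn = NotImplementedError('attention variant not implemented')

layer = DecoderLayer()
r, weight = layer.self_attn(None)
# TypeError: 'NotImplementedError' object is not callable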

I have confirmed that my environment matches the requirements:

tensorboardX 2.2, Python 3.8.15, fairseq 0.10.2, CUDA 11.3

taktak1 · Nov 11 '22

Hi! We have fixed the problem in the latest commit. Thank you!

btyu · Nov 12 '22

However, with the latest commit I now get a new error.

2022-11-12 13:03:53 | WARNING | fairseq.tasks.fairseq_task | 308 samples have invalid sizes and will be skipped, max_positions=1024, first few sample ids=[0, 1, 2, 3, 4, 5, 7, 8, 9, 10]
2022-11-12 13:03:53 | INFO | fairseq.trainer | begin training epoch 1

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 8, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 352, in cli_main
    distributed_utils.call_main(args, main)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/distributed_utils.py", line 301, in call_main
    main(args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 125, in main
    valid_losses, should_stop = train(args, trainer, task, epoch_itr)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/fairseq_cli/train.py", line 208, in train
    log_output = trainer.train_step(samples)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/trainer.py", line 480, in train_step
    loss, sample_size_i, logging_output = self.task.train_step(
  File "/usr/local/lib/python3.8/dist-packages/fairseq/tasks/fairseq_task.py", line 416, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/criterions/cross_entropy.py", line 35, in forward
    net_output = model(**sample["net_input"])
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fairseq/models/fairseq_model.py", line 481, in forward
    return self.decoder(src_tokens, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "muzic/museformer/museformer/museformer_decoder.py", line 413, in forward
    x, extra = self.extract_features(
  File "muzic/museformer/museformer/museformer_decoder.py", line 645, in extract_features
    (sum_x, reg_x), inner_states = self.run_layers(
  File "muzic/museformer/museformer/museformer_decoder.py", line 731, in run_layers
    x, _ = layer(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "muzic/museformer/museformer/museformer_decoder_layer.py", line 413, in forward
    x, attn = self.run_self_attn(
  File "muzic/museformer/museformer/museformer_decoder_layer.py", line 486, in run_self_attn
    r, weight = self.self_attn(
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
TypeError: forward() got multiple values for argument 'key_padding_mask'
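
If it helps narrow things down, "got multiple values for argument" is Python's way of saying the same parameter was filled twice: once by a positional argument and once by keyword. A minimal sketch (hypothetical signature, not the actual Museformer attention) that reproduces the same message:

class SelfAttention:
    def forward(self, query, key_padding_mask=None):
        return query, None

attn = SelfAttention()
q, mask = 'q', 'mask'
# The extra positional argument lands in the key_padding_mask slot,
# colliding with the explicit keyword argument.
attn.forward(q, mask, key_padding_mask=mask)
# TypeError: forward() got multiple values for argument 'key_padding_mask'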


taktak1 · Nov 12 '22

I'm running into the same issue. Which torch version are you using?

hhhyc333 · Dec 29 '22

Have you fixed it?

hhhyc333 · Dec 29 '22

Hi! We have fixed this problem in the latest commit. Thank you for pointing it out.

btyu · Dec 29 '22