
[Bug]: AttributeError: Can't pickle local object 'dataio_prep.<locals>.audio_pipeline'

Open hieuminh65 opened this issue 2 years ago • 4 comments

Describe the bug

Hi, I am working with the IEMOCAP recipe for emotion recognition. When I started training, I got this error.

Expected behaviour

The training process completes and I get the new model weights without error.

To Reproduce

if __name__ == "__main__":

    # Reading command line arguments.
    hparams_file, run_opts, overrides = sb.parse_arguments(sys.argv[1:])
    # run_opts['device'] = 'cpu'
    # Initialize ddp (useful only for multi-GPU DDP training).
    if run_opts["device"] != "cpu":
        sb.utils.distributed.ddp_init_group(run_opts)

    # Load hyperparameters file with command-line overrides.
    with open(hparams_file) as fin:
        hparams = load_hyperpyyaml(fin, overrides)

    # Create experiment directory
    sb.create_experiment_directory(
        experiment_directory=hparams["output_folder"],
        hyperparams_to_save=hparams_file,
        overrides=overrides,
    )

    from ravdess_prepare import prepare_data  # noqa E402

    # Data preparation, to be run on only one process.
    if not hparams["skip_prep"]:
        sb.utils.distributed.run_on_main(
            prepare_data,
            kwargs={
                "data_original": hparams["data_folder"],
                "save_json_train": hparams["train_annotation"],
                "save_json_valid": hparams["valid_annotation"],
                "save_json_test": hparams["test_annotation"],
                "split_ratio": hparams["split_ratio"],
                "seed": hparams["seed"],
            },
        )

    # Create dataset objects "train", "valid", and "test".
    datasets = dataio_prep(hparams)

    device = torch.device("cpu")
    # hparams["wav2vec2"] = hparams["wav2vec2"].to(device=run_opts["device"])
    hparams["wav2vec2"] = hparams["wav2vec2"].to(device=device)
    # freeze the feature extractor part when unfreezing
    if not hparams["freeze_wav2vec2"] and hparams["freeze_wav2vec2_conv"]:
        hparams["wav2vec2"].model.feature_extractor._freeze_parameters()

    # Initialize the Brain object to prepare for mask training.
    emo_id_brain = EmoIdBrain(
        modules=hparams["modules"],
        opt_class=hparams["opt_class"],
        hparams=hparams,
        run_opts=run_opts,
        checkpointer=hparams["checkpointer"],
    )

    # The `fit()` method iterates the training loop, calling the methods
    # necessary to update the parameters of the model. Since all objects
    # with changing state are managed by the Checkpointer, training can be
    # stopped at any point, and will be resumed on next call.
    emo_id_brain.fit(
        epoch_counter=emo_id_brain.hparams.epoch_counter,
        train_set=datasets["train"],
        valid_set=datasets["valid"],
        train_loader_kwargs=hparams["dataloader_options"],
        valid_loader_kwargs=hparams["dataloader_options"],
    )

    # Load the best checkpoint for evaluation
    test_stats = emo_id_brain.evaluate(
        test_set=datasets["test"],
        min_key="error_rate",
        test_loader_kwargs=hparams["dataloader_options"],
    )

Versions

No response

Relevant log output

/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
Some weights of the model checkpoint at facebook/wav2vec2-base were not used when initializing Wav2Vec2Model: ['quantizer.weight_proj.bias', 'project_q.bias', 'project_hid.weight', 'project_q.weight', 'quantizer.weight_proj.weight', 'project_hid.bias', 'quantizer.codevectors']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
speechbrain.lobes.models.huggingface_wav2vec - wav2vec 2.0 feature extractor is frozen.
speechbrain.core - Beginning experiment!
speechbrain.core - Experiment folder: results/train_with_wav2vec2/1993
ravdess_prepare - Preparation completed in previous run, skipping.
speechbrain.dataio.encoder - Load called, but CategoricalEncoder is not empty. Loaded data will overwrite everything. This is normal if there is e.g. an unk label defined at init.
speechbrain.core - Info: ckpt_interval_minutes arg from hparam file is used
speechbrain.core - 90.2M trainable parameters in EmoIdBrain
speechbrain.utils.checkpoints - Would load a checkpoint here, but none found yet.
speechbrain.utils.epoch_loop - Going into epoch 1
  0%|                                                                                                    | 0/135 [00:00<?, ?it/s]
speechbrain.core - Exception:
Traceback (most recent call last):
  File "/Users/hieunguyenminh/CODE ALL/HuggingFace/SpeechBrain/speechbrain/recipes/IEMOCAP/emotion_recognition/train_with_wav2vec2.py", line 288, in <module>
    emo_id_brain.fit(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/speechbrain/core.py", line 1264, in fit
    self._fit_train(train_set=train_set, epoch=epoch, enable=enable)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/speechbrain/core.py", line 1111, in _fit_train
    for batch in t:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/speechbrain/dataio/dataloader.py", line 286, in __iter__
    iterator = super().__iter__()
               ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 441, in __iter__
    return self._get_iterator()
           ^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1042, in __init__
    w.start()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/context.py", line 288, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'dataio_prep.<locals>.audio_pipeline'

Additional context

I changed the device to cpu because I don't have a CUDA system, but I wonder if that is the reason.

hieuminh65 • Aug 10 '23

This seems like a multiprocessing problem. If you are using a Windows PC, you can try setting num_workers to 0, since the YAML file says that 0 works for Windows.
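
For reference, a minimal sketch of one way to do that from the script itself (assuming the recipe keeps its DataLoader settings under the dataloader_options key, as in the snippet above; I have not tested this on this exact recipe):

# Hypothetical override, placed after the hparams file is loaded and before fit():
# num_workers = 0 keeps data loading in the main process, so nothing has to be
# pickled and sent to worker processes.
hparams["dataloader_options"]["num_workers"] = 0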

backspacetg • Aug 10 '23

This seems like a multiprocessing problem. If you are using a Windows PC, you can try setting num_workers to 0, since the YAML file says that 0 works for Windows.

Hey, I use macOS. I tried that but it does not work.

hieuminh65 • Aug 24 '23

Can you try to add these lines at the beginning of your training script:

import multiprocessing
multiprocessing.set_start_method("fork")
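
Some context on why this can help (my understanding, not something specific to this recipe): on macOS, Python's default multiprocessing start method is "spawn", which has to pickle the dataset, including the audio_pipeline function defined locally inside dataio_prep, before sending it to the DataLoader workers, and local functions cannot be pickled. "fork" copies the parent process instead, so nothing needs to be pickled. A minimal placement sketch, assuming the call goes at the very top of the __main__ block, before any DataLoader is created (force=True is optional and only avoids a RuntimeError if a start method was already set):

import multiprocessing

if __name__ == "__main__":
    # Must run before the first DataLoader with num_workers > 0 is built.
    multiprocessing.set_start_method("fork", force=True)
    # ... rest of the training script (parse_arguments, fit, evaluate)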

lucadellalib • Sep 26 '23

Hello @hieuminh65, any news on this issue, please?

Adel-Moumen • Feb 01 '24

Hello @hieuminh65, any news on this issue, please?

Hey, I had the same problem (macOS) and the following worked:

import multiprocessing
multiprocessing.set_start_method("fork")

Thanks @lucadellalib

matheus-rzende • Mar 09 '24

I use macOS M1. I added:

import multiprocessing
multiprocessing.set_start_method("fork")

and it shows something like this:

[W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)

How can I fix it?

zixiaosunbro • May 30 '24

I use macOS M1. I added:

import multiprocessing
multiprocessing.set_start_method("fork")

and it shows something like this:

[W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)

How can I fix it?

Not sure, but this doesn't look like an SB-specific issue.

I found this, which seems related, and it doesn't even take a set_start_method call to trigger it: https://stackoverflow.com/questions/64772335/pytorch-w-parallelnative-cpp206

You could try disabling workers altogether in the way mentioned there, though I'm not sure how well this behaves with SB (on top of slowing down training somewhat).
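
If you want to experiment with the other direction, a heavily hedged sketch (these are workarounds commonly reported for the ParallelNative warning in general, not something I have verified with SB): cap the intraop thread count before any parallel work starts.

import os
os.environ["OMP_NUM_THREADS"] = "1"  # must be set before torch is imported

import torch
torch.set_num_threads(1)  # call once, early, before any parallel work starts

Setting num_workers to 0 in dataloader_options, as suggested earlier in this thread, remains the simpler option if the slowdown is acceptable.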

asumagic • May 30 '24