crashes during runs of learncurve with .mat files may be caused by multiple workers opening the same file

Open NickleDave opened this issue 4 years ago • 2 comments

haven't confirmed this yet, but we just thought of it, so writing it down here

@yardencsGitHub observed more crashes with a smaller window size --> this increases the likelihood of two workers opening the same file, since we're opening and closing so many

fix?

  • check if file is open, and if so, wait
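One way the "check if file is open, and if so, wait" idea could look is a sidecar lock file. This is only a sketch, not anything vak does: the name `exclusive_access` and the `.lock` suffix are made up for illustration, and it assumes all workers run on the same machine with a local filesystem, where `os.open` with `O_CREAT | O_EXCL` is atomic.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def exclusive_access(path, timeout=10.0, poll_interval=0.05):
    """Hold a sidecar lock file while the body runs.

    Hypothetical helper: the lock is a file named ``path + '.lock'``.
    If another worker already holds it, poll until it is released or
    ``timeout`` seconds have passed.
    """
    lock_path = f"{path}.lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL raises FileExistsError if the lock file
            # already exists, so only one worker can create it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"timed out waiting for lock on {path}")
            time.sleep(poll_interval)
    try:
        yield
    finally:
        # Release the lock so that waiting workers can proceed.
        os.close(fd)
        os.remove(lock_path)


# Usage sketch (spect_path is a placeholder for a spectrogram file path):
# with exclusive_access(spect_path):
#     spect_dict = scipy.io.loadmat(spect_path)
```

Caveat: if a worker is killed while holding the lock, the `.lock` file stays behind and every later caller times out, so a real version would need some stale-lock cleanup.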

NickleDave, Mar 14 '21 20:03

maybe something like this? https://discuss.pytorch.org/t/dataloader-when-num-worker-0-there-is-bug/25643

weird that it only happens for .mat files and not .npz files though

NickleDave, Mar 14 '21 22:03

@yardencsGitHub got this error when trying to run the llb3 canary config with the .mat spectrogram files

Logging results to results/Canaries/llb3/results_210414_224514
Using dataset from .csv: data/Canaries/llb3/annotated_prep_210413_230053.csv
Saving results to: results/Canaries/llb3/results_210414_224514
Size of each timebin in spectrogram, in seconds: 0.0027
Traceback (most recent call last):
  File "/home/pimienta/anaconda3/envs/vak040b3/bin/vak", line 8, in <module>
    sys.exit(main())
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/main.py", line 35, in main
    cli.cli(command=args.command,
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/cli/cli.py", line 30, in cli
    COMMAND_FUNCTION_MAPcommand
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/cli/learncurve.py", line 51, in learning_curve
    core.learning_curve(model_config_map,
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/core/learncurve/learncurve.py", line 176, in learning_curve
    has_unlabeled = csv.has_unlabeled(csv_path, labelset, timebins_key)
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/csv.py", line 35, in has_unlabeled
    time_bins = files.spect.load(spect_path)[timebins_key]
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/vak/files/spect.py", line 88, in load
    spect_dict = constants.SPECT_FORMAT_LOAD_FUNCTION_MAPspect_format
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/scipy/io/matlab/mio.py", line 226, in loadmat
    matfile_dict = MR.get_variables(variable_names)
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/scipy/io/matlab/mio5.py", line 333, in get_variables
    res = self.read_var_array(hdr, process)
  File "/home/pimienta/anaconda3/envs/vak040b3/lib/python3.8/site-packages/scipy/io/matlab/mio5.py", line 293, in read_var_array
    return self._matrix_reader.array_from_header(header, process)
  File "mio5_utils.pyx", line 671, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
  File "mio5_utils.pyx", line 701, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
  File "mio5_utils.pyx", line 775, in scipy.io.matlab.mio5_utils.VarReader5.read_real_complex
  File "mio5_utils.pyx", line 448, in scipy.io.matlab.mio5_utils.VarReader5.read_numeric
  File "mio5_utils.pyx", line 353, in scipy.io.matlab.mio5_utils.VarReader5.read_element
  File "streams.pyx", line 174, in scipy.io.matlab.streams.ZlibInputStream.read_string
  File "streams.pyx", line 150, in scipy.io.matlab.streams.ZlibInputStream.read_into
  File "streams.pyx", line 137, in scipy.io.matlab.streams.ZlibInputStream._fill_buffer
zlib.error: Error -3 while decompressing data: incorrect data check

dumping here because I'm not sure what other issue is relevant -- the BrokenPipeError was Windows specific, I think

but to me this is one vote against using .mat files, ever

will try converting to .npz and see if it still fails
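For the record, a rough sketch of what the conversion could look like. `mat_to_npz` is a made-up helper name, not part of vak, and it assumes the .mat files hold only plain numeric arrays (no structs or cell arrays). It uses `scipy.io.loadmat` and `numpy.savez` with their default behavior, so array shapes come out as `loadmat` returns them (vectors as 2-D unless `squeeze_me=True` is passed).

```python
from pathlib import Path

import numpy as np
from scipy.io import loadmat


def mat_to_npz(mat_path, npz_path=None, **loadmat_kwargs):
    """Convert one .mat file to .npz, keeping the variable names.

    Hypothetical helper. Extra keyword arguments (e.g. squeeze_me=True)
    are passed through to scipy.io.loadmat.
    """
    mat_path = Path(mat_path)
    if npz_path is None:
        # Default: same location and stem, with a .npz extension.
        npz_path = mat_path.with_suffix(".npz")
    npz_path = Path(npz_path)

    mat_dict = loadmat(str(mat_path), **loadmat_kwargs)
    # loadmat adds metadata entries such as __header__, __version__ and
    # __globals__; keep only the actual variables.
    arrays = {k: v for k, v in mat_dict.items() if not k.startswith("__")}
    np.savez(npz_path, **arrays)
    return npz_path


# Usage sketch ("data/some_dir" is a placeholder):
# for mat_path in sorted(Path("data/some_dir").glob("*.mat")):
#     mat_to_npz(mat_path)
```

Since the conversion reads each file in a single process, a file that still raises `zlib.error` here is probably damaged on disk and not a victim of concurrent access.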

NickleDave, Apr 15 '21 02:04