
allow data_dir option in PREP section to be a list of directories

Open NickleDave opened this issue 6 years ago • 2 comments

instead of forcing user to e.g. write a bash script with a bunch of commands + separate .toml files

NickleDave avatar Nov 09 '19 03:11 NickleDave

instead of changing / adding option(s), just make it so the data_dir option can be either a string, i.e. a single directory path, or a list of strings, i.e. multiple paths to directories

this would mean that functions in io.dataframe would need to "know" that data_dir can be a list

I think it should happen within the dataframe.from_files function itself, not in any "lower" functions that get called by from_files

e.g. here from_files should just loop over data_dir (if it is a list) and concatenate the results into a single (flattened) list, instead of audio.files_from_dir needing to handle both strings and lists for data_dir https://github.com/NickleDave/vak/blob/78cc5cb764302e6fc927e206ccaf2eac3e6eff5b/src/vak/io/dataframe.py#L132
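A minimal sketch of that loop, assuming a hypothetical helper named `collect_files` (not part of vak) and a per-directory lister passed in as `files_from_dir`, which stands in for something like audio.files_from_dir and is assumed to take one directory and return a list of paths:

```python
from pathlib import Path
from typing import Callable, List, Sequence, Union

PathLike = Union[str, Path]


def collect_files(
    data_dir: Union[PathLike, Sequence[PathLike]],
    files_from_dir: Callable[[PathLike], List[str]],
) -> List[str]:
    """Hypothetical helper: call ``files_from_dir`` once per directory
    and concatenate the results into a single flat list."""
    # Treat a single path as a one-element list, so the lower-level
    # function only ever sees one directory at a time.
    if isinstance(data_dir, (str, Path)):
        data_dirs = [data_dir]
    else:
        data_dirs = list(data_dir)

    files: List[str] = []
    for a_dir in data_dirs:
        # extend, not append, so the result stays flat
        files.extend(files_from_dir(a_dir))
    return files
```

With this shape, the lower-level function keeps its single-directory signature, and only the caller needs to know that data_dir can be a list.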

but later on, when data_dir gets passed to spect.to_dataframe, I think that spect might actually need to handle either a str or a list, since it only gets called once https://github.com/NickleDave/vak/blob/78cc5cb764302e6fc927e206ccaf2eac3e6eff5b/src/vak/io/dataframe.py#L183

@yardencsGitHub please let me know if you have opinions on this. I'm planning to add it this weekend to make it easier to finish up the behavioral experiment part

NickleDave avatar Jun 10 '20 01:06 NickleDave

Need to also:

  • [ ] make it so PrepConfig.data_dir can be a single directory or a list of directories
    • create a validator just for data_dir that validates this
  • [ ] also make it so that the core.prep parameter data_dir can be a single directory or a list of directories
    • need to do the same validation we did for the PrepConfig attribute -- trying to maintain ability to call core functions outside of the cli
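One possible shape for that shared validation, as a sketch only: the function name `validate_data_dir` is hypothetical, and it is written as a plain function so that both the PrepConfig validator and core.prep could call it, whatever validator mechanism the config class ends up using.

```python
from pathlib import Path


def validate_data_dir(data_dir):
    """Hypothetical validator: accept a single directory or a list of
    directories and return a list of ``Path`` objects."""
    # Normalize to a list first, so the checks below run the same way
    # for the single-directory and multi-directory cases.
    if isinstance(data_dir, (str, Path)):
        data_dirs = [data_dir]
    elif isinstance(data_dir, (list, tuple)):
        data_dirs = list(data_dir)
    else:
        raise TypeError(
            "data_dir must be a str, a Path, or a list of those, "
            f"but got {type(data_dir).__name__}"
        )

    if not data_dirs:
        raise ValueError("data_dir must contain at least one directory")

    validated = []
    for a_dir in data_dirs:
        if not isinstance(a_dir, (str, Path)):
            raise TypeError(
                f"each entry in data_dir must be a str or Path, got {a_dir!r}"
            )
        a_dir = Path(a_dir)
        if not a_dir.is_dir():
            raise NotADirectoryError(f"data_dir not found: {a_dir}")
        validated.append(a_dir)
    return validated
```

Returning a list in every case means downstream code can always loop over the result, without checking again whether it received a string or a list.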

NickleDave avatar Jun 14 '20 17:06 NickleDave