echopype
echopype copied to clipboard
Enable correct provenance `source_filenames` construction for "appending" a converted file to a combined file
In combine_echodata
we currently gather the source file names using echodata.source_file
and then these variables are put into the new combined echodata.provenance.source_filenames
(this variable will exist once PR #674 has been merged). The mechanism that allows for this are the following lines
https://github.com/OSOceanAcoustics/echopype/blob/e0b3cbff9459b57cc8320388cc1ac86da6063343/echopype/echodata/combine.py#L144-L151
If we perform the following actions
ed_1_2 = ep.combine_echodata([ed1, ed2])
ed_1_2_3 = ep.combine_echodata([ed_1_2, ed3])
then it might be possible that the above lines produce a list of lists. If this is the case, then it may cause issues with how we are currently creating ed_1_2_3.provenance.source_filenames
. I think this scenario should be investigated further.
Additionally, in the lines above, we should consider changing echodata.source_file
to echodata.provenance.source_filenames
.
Thanks, @b-reyes . I'd like to add more certainty/clarity to a couple of your statements:
then ~~it might be possible that~~ the above lines produce a list of lists, ~~. If this is the case, then it may~~ which will cause issues with how we are currently creating
ed_1_2_3.provenance.source_filenames
. I think this scenario should be investigated further.
I think we can agree that the scenario you describe will cause an error or an undesired outcome at this time.
@emiliom : this has been addressed in the recent PRs, is this correct?
@emiliom : this has been addressed in the recent PRs, is this correct?
It wasn't addressed, not quite. We decided this was an edge case (combining two or more datasets that are already the result of combine operations) that we were not going to address.
I suggest we close the issue, though.
@emiliom : Thanks. Could you create an issue to make sure that we don't forget about that, and close this issue?
We've clarified the primary use case where this issue would be important: when "appending" a new, converted file to a pre-existing combined file. For example, in a continuous monitoring deployment, where new files are generated continuously and a combined file is maintained and updated by appending each new file as it's generated.
We will address this issue in the next release, 0.6.4.
The original scope of this issue has been addressed in PR #908.
#895 describes a larger problem uncovered in the process of working on this issue. That problem will be tracked there.