echopype icon indicating copy to clipboard operation
echopype copied to clipboard

Enable correct provenance `source_filenames` construction for "appending" a converted file to a combined file

Open b-reyes opened this issue 2 years ago • 5 comments

In combine_echodata we currently gather the source file names using echodata.source_file and then these variables are put into the new combined echodata.provenance.source_filenames (this variable will exist once PR #674 has been merged). The mechanism that allows for this are the following lines

https://github.com/OSOceanAcoustics/echopype/blob/e0b3cbff9459b57cc8320388cc1ac86da6063343/echopype/echodata/combine.py#L144-L151

If we perform the following actions

ed_1_2 = ep.combine_echodata([ed1, ed2])

ed_1_2_3 = ep.combine_echodata([ed_1_2, ed3])

then it might be possible that the above lines produce a list of lists. If this is the case, then it may cause issues with how we are currently creating ed_1_2_3.provenance.source_filenames. I think this scenario should be investigated further.

Additionally, in the lines above, we should consider changing echodata.source_file to echodata.provenance.source_filenames.

b-reyes avatar May 12 '22 17:05 b-reyes

Thanks, @b-reyes . I'd like to add more certainty/clarity to a couple of your statements:

then ~~it might be possible that~~ the above lines produce a list of lists, ~~. If this is the case, then it may~~ which will cause issues with how we are currently creating ed_1_2_3.provenance.source_filenames. I think this scenario should be investigated further.

I think we can agree that the scenario you describe will cause an error or an undesired outcome at this time.

emiliom avatar May 12 '22 17:05 emiliom

@emiliom : this has been addressed in the recent PRs, is this correct?

leewujung avatar Oct 13 '22 01:10 leewujung

@emiliom : this has been addressed in the recent PRs, is this correct?

It wasn't addressed, not quite. We decided this was an edge case (combining two or more datasets that are already the result of combine operations) that we were not going to address.

I suggest we close the issue, though.

emiliom avatar Oct 13 '22 07:10 emiliom

@emiliom : Thanks. Could you create an issue to make sure that we don't forget about that, and close this issue?

leewujung avatar Oct 13 '22 15:10 leewujung

We've clarified the primary use case where this issue would be important: when "appending" a new, converted file to a pre-existing combined file. For example, in a continuous monitoring deployment, where new files are generated continuously and a combined file is maintained and updated by appending each new file as it's generated.

We will address this issue in the next release, 0.6.4.

emiliom avatar Oct 13 '22 16:10 emiliom

The original scope of this issue has been addressed in PR #908.

#895 describes a larger problem uncovered in the process of working on this issue. That problem will be tracked there.

emiliom avatar Dec 21 '22 16:12 emiliom