echopype icon indicating copy to clipboard operation
echopype copied to clipboard

Error storing filter coefficients for sequenced ek80 file

Open julianblanco opened this issue 1 year ago • 9 comments

When trying to open ek80 data (2 channel system 70khz and 200khz) i get the following value error ValueError: conflicting sizes for dimension 'channel': length 1 on 'WBT_filter_i' and length 2 on {'channel': 'frequency_nominal', 'pulse_length_bin': 'sa_correction'}

calling code echodata = ep.convert.api.open_raw('/three_ensemble-Phase0-D20240506-T053349-0.raw', sonar_model='EK80')

version 0.8.4 mac, installed via pip in pycharm venv python ver 3.11

link to file: https://drive.google.com/file/d/1doLT8V3CFXNxkulIKtV1-BXMwvm0ln5N/view?usp=sharing

julianblanco avatar May 07 '24 07:05 julianblanco

@leewujung This is interesting and I was able to get the same error message. What are your thoughts on this? One of the channels is missing the filter values, or they're assumed to be shared for both? These are the transducer names from the datagram: ['ES70-18CD', 'ES200-7CDK-Split']

ctuguinay avatar May 09 '24 17:05 ctuguinay

Hmm, the filters are specific to each channel, so could not be shared. I think I've seen files before that have a subset of filter values missing, and those parsed fine before. It is possible that some later changes we introduced made those not allowable, and we did not have tests covering that.

leewujung avatar May 10 '24 02:05 leewujung

Thanks for the quick attention. This is from a WBT mini system saving to a usb drive rather than collected via the ek80 software.

Let me know if i can help in any way. I wrote something to filter out some of the stuff and stitch a echogram but id be interested in helping to fix the underlying issue and learn how to make contributions back upstream.

julianblanco avatar May 10 '24 03:05 julianblanco

@julianblanco That would be great if you could help us out with this!

Here's the doc for how to contribute: https://echopype.readthedocs.io/en/stable/contributing.html We're currently no longer using the dev branch so point any PR to main instead.

Here's the doc for how data is organized in the raw Echodata object: https://echopype.readthedocs.io/en/stable/data-format-raw.html. Scroll down to the EK80 part and take a look at Vendor_specific. A valid Vendor_specfic dataset should be one such that the number of individual WBT_filter_n arrays matches the number of individual channels/tranducers.

In the code itself, I would first look at the call to create the Vendor_specific dataset and work backward from there:

echopype/echopype/convert/api.py:487:
...
tree_dict["Vendor_specific"] = setgrouper.set_vendor()

It could be the case that the filtering of the original datagrams removed a channel's WBT_filter_n array, but that is just a guess. However, this case of a 'missing' array should be allowed and there should be code somewhere in the Vendor_specific dataset creation that allows for NaN padding of this missing array, and some logger warning when this NaN padding happens.

ctuguinay avatar May 10 '24 16:05 ctuguinay

Also, once you set up the test data suite, you can see how the Vendor_specific creation should 'run' on a few other EK80 examples. That may clue you into what part(s) are missing and/or need NaN padding.

ctuguinay avatar May 10 '24 17:05 ctuguinay

Ha, I've got some new files that also have this problem. @julianblanco: were you able to make some head way into this?

leewujung avatar Jun 04 '24 04:06 leewujung

This source of the problem is that the WBT and PC filter coefficients in the file only exist for one of the channels -- that is the channel that is actually ON. But, all other parameters stored in the Vendor_specific group has 2 channels. We can resolve this by padding NaNs for the channel that was not transmitting, somewhere between:

Code section: https://github.com/OSOceanAcoustics/echopype/blob/66fda23cb9cdc7ab06d6d3f8297e2e69a56a046c/echopype/convert/set_groups_ek80.py#L1276-L1308 And the function _add_filter_params: https://github.com/OSOceanAcoustics/echopype/blob/66fda23cb9cdc7ab06d6d3f8297e2e69a56a046c/echopype/convert/set_groups_ek80.py#L1313

leewujung avatar Jun 04 '24 04:06 leewujung

@leewujung Push this to 0.9.1? We didn't encounter anything like this on Shimada or Lasker

ctuguinay avatar Jul 23 '24 16:07 ctuguinay

Agreed!

leewujung avatar Jul 23 '24 16:07 leewujung

@leewujung Hi! Do you think NaN padding is the best long-term solution? I’ll need to replicate the issue first before making any recommendations, but I’d love to hear if there’s a 'dream scenario' solution that could address this.

spacetimeengineer avatar Nov 13 '24 16:11 spacetimeengineer

Hey @spacetimeengineer! I think NaN padding would work fine, since it’s easy to drop them when needed. These coefficients are also small so space is not an issue (though zarr compressed very well as we’ve found for the backscatter data).

leewujung avatar Nov 14 '24 07:11 leewujung

Sounds good to me. Thanks.

spacetimeengineer avatar Nov 15 '24 16:11 spacetimeengineer

It's been a while and we need to get this into the next release, so I've made this change in #1458.

leewujung avatar Feb 17 '25 07:02 leewujung

This is resolved in #1458 and will be in the upcoming v0.10.0.

leewujung avatar Feb 21 '25 07:02 leewujung