"Processed" power data in EK60/80 `backscatter_r`

Open leewujung opened this issue 4 years ago • 1 comments

Currently, minimal processing is applied to the power data in the EK parser to convert the counts to power. https://github.com/OSOceanAcoustics/echopype/blob/bc8afa190fa2ca4fab5944bac83cd4b20f7abcf6/echopype/convert/parse_base.py#L122-L140

This means that the power data we store as backscatter_r are technically "processed" data, which conflicts a bit with our intention to store the most "raw" form of the data at level 0.

@emiliom: If we change this and actually save just the counts in backscatter_r and move the scaling to part of compute_Sv, does it still work with the convention specification "Real part or amplitude or power of backscatter measurements." ?

This would be a breaking change, since it changes the content of the data in open_raw and has a downstream impact on compute_Sv. We can mitigate the latter by including this scaling in the set of changes for the v0.5.x --> v0.6.0 conversion (#606). The breaking aspect, though, does mean that we need to include this in v0.6.0.

leewujung, Apr 22 '22 02:04

When we consider this, we should revisit putting details of what this is in a variable attribute.

leewujung, May 05 '22 16:05

Circling back to this. It's a good question / observation. I've pasted below, for reference, the complete SONAR-netCDF4 v1 information about backscatter_r.

In general, I agree with your point about counts being more appropriate as rawer, less processed data. But looking at the code, the conversion to power is based on a simple, fixed constant (INDEX2POWER) that's not dependent on any variable. So, in this case the distinction between "power" and "counts" seems very small, from a processing perspective.
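To make the "simple, fixed constant" point concrete, here is a minimal sketch of the scaling and its inverse. The numeric value of INDEX2POWER is an assumption based on the Simrad raw-format convention of 10*log10(2)/256 dB per count; check the linked parse_base.py for the value echopype actually uses. The helper names `counts_to_power` and `power_to_counts` are hypothetical and are not echopype functions.

```python
import math

# Assumption: 10*log10(2)/256 dB per count, following the Simrad raw-format
# convention. Verify against the linked parse_base.py before relying on it.
INDEX2POWER = 10.0 * math.log10(2.0) / 256.0


def counts_to_power(counts):
    """Scale raw integer counts to power by the fixed constant (hypothetical helper)."""
    return [c * INDEX2POWER for c in counts]


def power_to_counts(power):
    """Invert the scaling to recover the original integer counts (hypothetical helper)."""
    return [round(p / INDEX2POWER) for p in power]


# Made-up sample values, used only to illustrate the round trip.
raw_counts = [-20000, -12345, 0, 256, 18000]
power = counts_to_power(raw_counts)
recovered = power_to_counts(power)
```

Because the scaling does not depend on any other variable, the counts can be recovered from the stored power values, which is why the distinction matters little from a processing perspective.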

I think there are other factors to consider, as we make a decision about this:

  • It looks like echopype retains counts for AZFP, for backscatter_r. I don't see a conversion factor analogous to INDEX2POWER. But the variable long_name is the same as with EK, "Backscatter power".
  • The convention itself is open ended about what exactly backscatter_r can hold. I interpret the descriptions as encompassing either counts or power. Do you agree?
  • Is there an argument to be made for echopype returning consistent backscatter_r (e.g., always counts) regardless of the instrument? Or are instruments too heterogeneous in practice for this to be a realistic aim?

When we consider this, we should revisit putting details of what this is in a variable attribute.

Agreed. It could go into a comment attribute. BTW, the convention specifies the long_name "Raw backscatter measurements (real part)". I realize "Backscatter power" is more specific and user friendly, but we should probably revisit this.

| Description | Obligation | Comment |
|---|---|---|
| sample_t backscatter_r(ping_time, beam) | M | Real part or amplitude or power of backscatter measurements. Each element in the 2D matrix is a variable length vector (of type sample_t) that contains the samples for that beam and ping time. |
| :long_name = "Raw backscatter measurements (real part)" | | |
| :units = "as appropriate" | | Use units appropriate for the data |

emiliom, Apr 18 '23 16:04

We decided to:

  • Leave EK60/EK80 backscatter_r and backscatter_i as is, because the definition for these variables in SONAR-netCDF4 v1 is broad enough to encompass the application of the INDEX2POWER conversion. Plus, that's a very simple conversion by a constant.
  • Correct the units for AZFP backscatter_r to "count" (from "dB").
  • Set long_name for backscatter_r and backscatter_i to the strings specified in SONAR-netCDF4 v1.
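As a rough illustration of what these decisions mean for the backscatter_r attributes, here is a small sketch that uses plain dicts as a stand-in for variable attributes. The function `update_backscatter_r_attrs` is hypothetical and is not the change made in echopype; only the long_name string and the AZFP units come from this thread.

```python
# long_name string quoted from SONAR-netCDF4 v1 earlier in this thread.
SPEC_LONG_NAME_R = "Raw backscatter measurements (real part)"


def update_backscatter_r_attrs(attrs, sonar_model):
    """Return a copy of attrs with the decisions above applied (hypothetical helper)."""
    updated = dict(attrs)
    # All instruments: use the long_name given by the convention.
    updated["long_name"] = SPEC_LONG_NAME_R
    # AZFP only: the stored values are counts, so the units should say so.
    if sonar_model == "AZFP":
        updated["units"] = "count"
    return updated


old_azfp = {"long_name": "Backscatter power", "units": "dB"}
new_azfp = update_backscatter_r_attrs(old_azfp, "AZFP")

# For EK60/EK80 only long_name changes; other attributes are left untouched.
old_ek = {"long_name": "Backscatter power"}
new_ek = update_backscatter_r_attrs(old_ek, "EK60")
```

The data values themselves are not modified in either case; only the metadata that describes them changes.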

emiliom, May 22 '23 21:05

I believe we can close this issue, now that #1047 is merged.

emiliom, May 24 '23 16:05