Malformed SeisChannel ID (station code) in `read_sac_stream`

Open adigitoleo opened this issue 2 years ago • 4 comments

I'm having some issues with the SAC reader. One of them is that the SeisChannel ID seems to contain a malformed station code:

julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4* (2021-07-14 15:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8705G CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

(@v1.6) pkg> st SeisIO
      Status `~/.julia/environments/v1.6/Project.toml`
  [b372bb87] SeisIO v1.2.1

shell> ls data
2019.262.07.06.33.3000.CN.FRB..BHE.M.SAC  2019.262.07.06.33.3000.CN.FRB..BHN.M.SAC  2019.262.07.06.33.3000.CN.FRB..BHZ.M.SAC

julia> using SeisIO

julia> s = read_data("sac", "data/*", full = true)
SeisData with 1 channels (1 shown)
    ID: C.FR..BH
  NAME:
   LOC: 63.7469 N, -68.5451 E, 25.0 m
    FS: 40.0
  GAIN: 1.0
  RESP: a0 1.0, f0 1.0, 0z, 0p
 UNITS:
   SRC: /home/admin/hazcode/julia/julia-s…
  MISC: 46 entries
 NOTES: 1 entries
     T: 2019-09-19T07:06:33 (2 gaps)
     X: -6.722e+04
        -6.701e+04
            ...
        -2.850e+04
        (nx = 432000)
     C: 0 open, 0 total

julia> s.id
1-element Vector{String}:
 "C.FR..BH"

julia> s.misc[1]["kstnm"]
"     FRB"

This affects the filename written by writesac, which ends up with an incomplete station code. I looked at read_sac_stream and I think the issue is in that function (where the code points are being extracted), but I'm not familiar enough with SAC internals to fix it. I've attached the SAC files (sacfiles.zip); could someone try the above commands to make sure it's not just me? :)

(Note that the SeisData instance also has only one channel; I might open a separate issue for that.)

adigitoleo • Sep 25 '21 00:09

I checked this out. It looks like the station name in the header of these SAC files is offset by 1 byte from what SeisIO expects. This truncates both the station name (FR instead of FRB) and the channel name (BH instead of BHZ), and it is what causes #88: all three channels end up with the same ID, C.FR..BH.

Maybe @jpjones76 can comment on how SAC headers are read with fill_id?
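To make the ID collision concrete, here is a minimal sketch of building a NET.STA.LOC.CHA ID from padded SAC string fields. The helper `padded_id` is hypothetical and is not how SeisIO's fill_id works. Only the KSTNM value ("     FRB") comes from this issue; the other field values are assumed from the file names. Undefined SAC fields (such as "-12345") are not handled.

```julia
# Hypothetical helper, not part of SeisIO: build a NET.STA.LOC.CHA ID from
# SAC string header fields, ignoring whitespace padding on either side.
function padded_id(knetwk::AbstractString, kstnm::AbstractString,
                   khole::AbstractString, kcmpnm::AbstractString)
    parts = map(strip, (knetwk, kstnm, khole, kcmpnm))
    return join(parts, '.')
end

channels = ("BHE", "BHN", "BHZ")

# Leading-whitespace fields, like the KSTNM value reported in this issue
# (values other than KSTNM are assumed from the file names).
ids_leading = [padded_id("      CN", "     FRB", "        ", "     " * c)
               for c in channels]

# The more common trailing-whitespace layout.
ids_trailing = [padded_id("CN      ", "FRB     ", "        ", c * "     ")
                for c in channels]

println(ids_leading)
println(ids_trailing)
```

With padding stripped before the fields are joined, both layouts give the same three distinct IDs, so the channels would no longer collapse into one.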

tclements • Oct 15 '21 00:10

This is an unusual situation. I've been working with SAC data since 1997 and yours are the first files I've seen whose string header values use leading whitespace. Every other file I've seen uses trailing whitespace for them, e.g. for KSTNM, your files are written like

julia> String(deepcopy(cv[1:8]))
"     FRB"

(leading whitespace: five spaces before the station code), rather than like

"FRB     "

(trailing whitespace: five spaces after it).
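For anyone who wants to check the padding in their own files without going through a reader, a rough sketch follows. The helper `raw_kstnm` is hypothetical and not part of SeisIO. It assumes the standard binary SAC header layout, in which 70 floats and 40 integers (4 bytes each) precede the string block, so KSTNM occupies the 8 bytes starting at offset 440. The demonstration uses a synthetic header rather than a real file.

```julia
# Hypothetical helper, not part of SeisIO: return the raw 8-character KSTNM
# field from a binary SAC stream, padding included.
# Assumes the standard header layout: 70 floats + 40 integers (4 bytes each)
# come before the string block, so KSTNM starts at byte offset 440.
function raw_kstnm(io::IO)
    seek(io, 440)
    return String(read(io, 8))
end

# Convenience method for a file path.
raw_kstnm(path::AbstractString) = open(raw_kstnm, path)

# Synthetic 632-byte header: 440 numeric bytes, KSTNM, then the remaining
# 184 bytes of the string block (left as zeros here).
hdr = vcat(zeros(UInt8, 440), Vector{UInt8}("     FRB"), zeros(UInt8, 184))
demo = raw_kstnm(IOBuffer(hdr))

println(repr(demo))
```

Calling `raw_kstnm` on one of the attached files (for example "data/2019.262.07.06.33.3000.CN.FRB..BHZ.M.SAC") should show directly whether the padding comes before or after the station code.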

The issue with your station name being one character short might be a bug in fill_id. However, if I'm right about what (I think) the bug is, then it can only potentially affect leading-whitespace strings. As I said, I've never seen that before. Neither has @tclements, I suspect, and I believe he's processed millions of them.

I've never actually seen written documentation on whether SAC strings expect leading or trailing whitespace...I mean, this is only a possible issue for format conversion to/from data forms that use shorter station identifiers (SeisIO, like SEED, allows 5 characters; SAC allows 8). But knowing that this situation does exist (whether or not it should) is enough to make me think about messing with fill_id.
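A small illustration of why the padding side only matters when an 8-character field is shortened to 5 characters. This is not SeisIO's fill_id, and it does not reproduce the exact "FR" result reported above; the function names `naive` and `robust` are made up for the example. It only shows that the order of truncation and whitespace removal is harmless for trailing padding and lossy for leading padding.

```julia
# Illustration only; this is not SeisIO's fill_id.
leading  = "     FRB"   # padding style of the files in this issue
trailing = "FRB     "   # padding style usually seen in SAC files

# Truncating to 5 characters before removing padding only works when the
# whitespace is trailing.
naive(s) = strip(first(s, 5))

# Removing padding first handles both styles.
robust(s) = first(strip(s), 5)

for s in (leading, trailing)
    println(repr(s), " => naive: ", repr(naive(s)), ", robust: ", repr(robust(s)))
end
```

Stripping first costs nothing for trailing-whitespace files, so it seems like the safer order if both styles have to be supported.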

Gonna tag this as "unintended behavior" while I investigate.

jpjones76 • Oct 15 '21 01:10

Thanks for looking at this. The downloader I used to get these files was a MATLAB shear-wave splitting plugin. However, I was using an older version (of both MATLAB and the plugin), since it wouldn't work for me on newer releases. It seems strange that the downloader would affect the SAC header formatting, but I'll try re-downloading the traces a different way just in case.

adigitoleo • Oct 17 '21 01:10

Found it. Testing a fix now.

jpjones76 • Dec 09 '21 00:12