EDF.jl

initial work on supporting discontiguous data

Open palday opened this issue 2 years ago • 3 comments

The idea is to provide a separate read_discontiguous function, both to stay as backward compatible as possible and to make it very clear that something different is happening: the gaps between data records are filled with zeros.
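To make the zero-filling idea concrete, here is a minimal sketch. It is not the code in src/discontiguous.jl: the helper name `fill_gaps_with_zeros`, its arguments, and the assumption that each record comes with an onset in seconds are all hypothetical and chosen only for illustration.

```julia
# Hypothetical helper, not part of EDF.jl: place each data record at its
# onset (in seconds) and leave zeros wherever no record covers the timeline.
function fill_gaps_with_zeros(records::Vector{Vector{T}}, onsets::Vector{<:Real},
                              samples_per_second::Integer) where {T<:Number}
    length(records) == length(onsets) ||
        throw(ArgumentError("need exactly one onset per record"))
    isempty(records) && return T[]
    # Convert each onset to the 1-based index of the record's first sample.
    starts = [round(Int, onset * samples_per_second) + 1 for onset in onsets]
    # The output must reach the last sample of the latest-ending record.
    total = maximum(s + length(r) - 1 for (s, r) in zip(starts, records))
    out = zeros(T, total)
    for (s, r) in zip(starts, records)
        out[s:(s + length(r) - 1)] = r
    end
    return out
end

records = [Int16[1, 2], Int16[3, 4]]
onsets = [0.0, 2.0]  # the second record starts after a one-second gap
filled = fill_gaps_with_zeros(records, onsets, 2)
# filled == Int16[1, 2, 0, 0, 3, 4]
```

The zeros at positions 3 and 4 cannot be told apart from real samples that happen to be zero. That is the reason for giving this behavior its own clearly named entry point.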

palday • Oct 20 '23 23:10

Codecov Report

Merging #78 (7868210) into main (2eef49f) will decrease coverage by 10.91%. The diff coverage is 0.00%.

:exclamation: Current head 7868210 differs from pull request most recent head 75022dc. Consider uploading reports for the commit 75022dc to get more accurate results

@@             Coverage Diff             @@
##             main      #78       +/-   ##
===========================================
- Coverage   95.59%   84.68%   -10.91%     
===========================================
  Files           4        5        +1     
  Lines         295      333       +38     
===========================================
  Hits          282      282               
- Misses         13       51       +38     
Files                    Coverage Δ
src/EDF.jl               100.00% <ø> (ø)
src/discontiguous.jl     0.00% <0.00%> (ø)


codecov[bot] • Oct 20 '23 23:10

Alternative, backwards-compatible-ish possibility:

--- a/src/types.jl
+++ b/src/types.jl
@@ -48,20 +48,26 @@ struct SignalHeader
 end

 """
-    EDF.Signal{T}
+    EDF.Signal{T,V}

-Type representing a single EDF signal with sample type `T`.
+Type representing a single EDF signal with sample type `T`, stored in a collection
+of type `V`. For contiguous files (EDF+C), `V === Vector{T}`. For discontiguous files
+(EDF+D), this is `Vector{Vector{T}}`, where the inner vectors are the contiguous samples
+within each data record.

 # Fields

 * `header::SignalHeader`
-* `samples::Vector{T}`
+* `samples::V`
 """
-struct Signal{T}
+struct Signal{T,V<:Union{Vector{T},Vector{Vector{T}}}}
     header::SignalHeader
-    samples::Vector{T}
+    samples::V
 end

+const ContiguousSignal{T} = Signal{T,Vector{T}}
+const DiscontiguousSignal{T} = Signal{T,Vector{Vector{T}}}
+
 Signal{T}(header::SignalHeader) where {T} = Signal(header, T[])
 Signal(header::SignalHeader) = Signal{EDF_SAMPLE_TYPE}(header)

Advantages of this approach:

  • In some sense it's a more direct representation of the "intent" of EDF+D (and is actually a more faithful representation of the underlying data in the file since sample data isn't stored contiguously in the file even for EDF+C)
  • It doesn't require prior knowledge of whether the file is contiguous (which requires reading the header) in order for the user to call the appropriate reading function
  • It doesn't use placeholder data values which may actually be significant for a user's use case
  • Round-tripping a discontiguous file should be easier since you can define separate methods for writing different kinds of signals
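As a rough illustration of the last point, the two aliases from the diff can be used directly as dispatch targets. This is only a sketch: `SignalHeader` is a one-field stand-in for the real `EDF.SignalHeader`, and `nsegments` and `total_samples` are hypothetical helpers, not existing EDF.jl functions. The type parameter is passed explicitly when constructing, so the sketch does not rely on inferring `T` from `samples`.

```julia
# Stand-in for EDF.SignalHeader so the sketch is self-contained (hypothetical).
struct SignalHeader
    label::String
end

# Same shape as the struct proposed in the diff above.
struct Signal{T,V<:Union{Vector{T},Vector{Vector{T}}}}
    header::SignalHeader
    samples::V
end

const ContiguousSignal{T} = Signal{T,Vector{T}}
const DiscontiguousSignal{T} = Signal{T,Vector{Vector{T}}}

# Hypothetical helpers with one method per kind of signal; a writer could
# be split along the same lines.
nsegments(::ContiguousSignal) = 1
nsegments(signal::DiscontiguousSignal) = length(signal.samples)

total_samples(signal::ContiguousSignal) = length(signal.samples)
total_samples(signal::DiscontiguousSignal) =
    mapreduce(length, +, signal.samples; init=0)

header = SignalHeader("EEG Fpz-Cz")
contiguous = ContiguousSignal{Int16}(header, Int16[1, 2, 3, 4])
discontiguous = DiscontiguousSignal{Int16}(header, [Int16[1, 2], Int16[3, 4]])

nsegments(contiguous)         # 1
nsegments(discontiguous)      # 2
total_samples(discontiguous)  # 4
```

Code that assumes contiguous samples can restrict its signature to `ContiguousSignal`. A discontiguous signal then fails with a `MethodError` instead of being silently treated as gap-free.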

ararslan • Oct 31 '23 20:10

I would probably want both: there are a fair number of downstream tools that make a contiguous assumption in various ways.

palday • Oct 31 '23 21:10