pyedflib
pyedflib copied to clipboard
fix writing of decimal sample frequency
In this PR I'm fixing some mistakes that have been made before, and include some clarifications w.r.t to sampling_frequency
and sample_rate
Summary:
-
sample_frequency
is now the main term being used and denoted the samples in Hz (samples per second) - we hide/discourage the use of
sample_rate
. To prevent confusion, for now we pretendsample_frequency
andsample_rate
are the same term towards the user. - the use of
record_duration
was entirely wrong. Fixed that now.
Terms:
smp_per_record
: tells us how many samples there are in each record.
record_duration
: denotes the time window of one record. If all sampling frequencies are ints, this can simply be 1.
sample_frequency
: is calculated by dividing smp_per_record/record_duration. If we write a file, we have to do the reverse, and find an optimal record_duration such that can actually represent the wished sample_frequency.
Problem:
Sampling frequency (samples per seconds in Hz) is not explicitly denoted in EDF. It is calculated by smp_per_record
(samples, int) / record_duration
(seconds, float). Previously it was just assumed that sample_rate
is equivalent to sample_frequency
, which is not the case, as sample_rate
is taken directly from smp_per_record
. However, when the record_duration
is unequal to 1, this is not the same as the sampling frequency.
Similarly, if we want to save a file with decimal sampling_frequency
, e.g. 0.5 Hz, we need to make the equation 0.5=smp_per_record/record_duration
work such that smp_per_record
is an int. That means we need to look for a record_duration
that matches the smp_per_record
for all channels in the EDF. I posted a simplified solution that should work in most cases (just try out record_duration
for 1s to 60s in 1s steps), however, we can think of some more mathematical solution as well later.
Proposition
I think nobody cares how many samples are stored within a record_duration. These are technical details. People want to set a sampling frequency. They do not care how this is realized and stored within the file. Therefore I opt to remove any setting of record_duration
and ``
closes #148 #111
This is a rather large commit, and I would appreciate if someone would take a close look at it @holgern
Codecov Report
Patch coverage: 76.92
% and project coverage change: +5.00
:tada:
Comparison is base (
b92d2f9
) 76.08% compared to head (ff82c93
) 81.08%.
Additional details and impacted files
@@ Coverage Diff @@
## master #159 +/- ##
==========================================
+ Coverage 76.08% 81.08% +5.00%
==========================================
Files 4 4
Lines 828 809 -19
Branches 176 170 -6
==========================================
+ Hits 630 656 +26
+ Misses 156 112 -44
+ Partials 42 41 -1
Flag | Coverage Δ | |
---|---|---|
unittests | 81.08% <76.92%> (+5.00%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
pyedflib/highlevel.py | 72.72% <0.00%> (-0.90%) |
:arrow_down: |
pyedflib/edfwriter.py | 86.50% <77.08%> (+4.49%) |
:arrow_up: |
pyedflib/edfreader.py | 80.95% <100.00%> (+13.13%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
I changed using @master
to @v3
and @v4
in GitHub actions as suggested generally in https://github.com/actions/upload-artifact/issues/41 (not specifically for Python setup in GitHub actions, but is probably generally better to use a released version instead of the master/main branch)
Also Python 3.6 is removed from testing as its no longer available.
Went once through all the changes, added some more comments. I think all changes are okay and it's long overdue to merge this PR
Thanks for @gcathelain for reviewing!
I'll merge it soon.