pygac icon indicating copy to clipboard operation
pygac copied to clipboard

replacing median with 50th percentile

Open oliversus opened this issue 4 years ago • 3 comments

Method correct_times_median should provide median values for year, doy, and milliseconds to correct invalid timestamps. However, np.median will return floats, which could even be fractions due to np.median's internal interpolation method of the two closest values. I'd suggest to use np.percentile instead, where one can explicitly set the interpolation method. Setting this to "nearest" will always return the value closest to the statistical median, thus not changing its type.

np.median([1, 2, 3, 4, 5, 6]) returns 3.5 np.percentile([1, 2, 3, 4, 5, 6], 50, interpolation="nearest") returns 3

This will also avoid a ValueError in method to_datetime64, line 449, where conversion to datetime64 is only possible with year as an integer.

oliversus avatar Sep 29 '21 10:09 oliversus

Codecov Report

Merging #97 (07d4de7) into main (a8b68a3) will increase coverage by 0.05%. The diff coverage is 90.90%.

@@            Coverage Diff             @@
##             main      #97      +/-   ##
==========================================
+ Coverage   74.39%   74.44%   +0.05%     
==========================================
  Files          32       32              
  Lines        2882     2888       +6     
==========================================
+ Hits         2144     2150       +6     
  Misses        738      738              
Impacted Files Coverage Δ
pygac/reader.py 81.85% <85.71%> (+0.07%) :arrow_up:
pygac/tests/test_reader.py 98.36% <100.00%> (+0.01%) :arrow_up:

codecov[bot] avatar Sep 29 '21 10:09 codecov[bot]

@oliversus do you have time to fix this soon?

mraspaud avatar Oct 28 '22 09:10 mraspaud

@oliversus do you have time to fix this soon?

Hi Martin and Stephan, sorry for ignoring this for so long. I do have a bit of time right now and will try to implement this as Stephan suggested. Cheers Olli

oliversus avatar Jul 18 '23 07:07 oliversus