
Handling of timezone-aware pandas dataframes/series

Status: Open • pyRobShrk opened this issue 9 months ago • 0 comments

  • pyhecdss version: 1.1.4 and 1.4
  • Python version: 3.7.7 and 3.11.8
  • Operating System: Windows

Description

I am trying to download USGS data using dataretrieval and then write it to a DSS file.

What I Tried

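from dataretrieval import nwis
import pyhecdss

# Instantaneous values for USGS site 11455495, parameter 00095, from 2021-06-01
df, meta = nwis.get_iv('11455495', '2021-06-01', parameterCd='00095')
# Convert the index to local time, then drop the timezone information
ts = df['00095_lower-4 ft from bed'].tz_convert('America/Los_Angeles').tz_localize(None)
# Write the result as an irregular time series
with pyhecdss.DSSFile('usgs.dss', create_new=True) as f:
    f.write_its('/A/B/EC//IR-MONTH/USGS/', ts, 'mmhos', 'INST-VAL')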

Notes

The code above fails silently: no error is raised, and the resulting DSS file is blank. The cause is an attempt to write more than one value with the same timestamp. Converting to America/Los_Angeles with tz_convert and then dropping the timezone with tz_localize(None) repeats an hour of local timestamps when daylight saving time ends. If you instead try to write the data in UTC (still timezone-aware), pandas raises an error from line 710 of pyhecdss.py: TypeError: Timestamp subtraction must have the same timezones or no timezones.
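The duplicate timestamps can be reproduced without downloading anything. The following is a minimal sketch using a synthetic 15-minute UTC series (made-up values, not USGS data) that spans the 2021 fall-back transition in Los Angeles:

```python
import pandas as pd

# Synthetic 15-minute UTC index around the end of daylight saving time
# (2021-11-07 09:00 UTC is when Los Angeles clocks fall back from 02:00 to 01:00).
idx = pd.date_range("2021-11-07 07:00", "2021-11-07 11:00", freq="15min", tz="UTC")
ts = pd.Series(range(len(idx)), index=idx, dtype=float)

# Same conversion as in the issue: local wall-clock time, then drop the timezone.
local_naive = ts.tz_convert("America/Los_Angeles").tz_localize(None)

# The 01:00 to 01:45 wall-clock times now appear twice.
n_dupes = int(local_naive.index.duplicated().sum())
print(local_naive.index.is_unique)  # False
print(n_dupes)                      # 4
```

Any series that crosses a fall-back transition will show the same repeated hour after this conversion.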

End-user fix

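Drop the timezone from the UTC index and shift it by a fixed 8 hours (Pacific Standard Time year-round), so that no timestamp repeats. This needs import pandas as pd:

df.set_index(df.index.tz_localize(None) - pd.Timedelta('8h'), inplace=True)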
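As a rough check of this workaround, the sketch below applies it to a synthetic UTC-indexed frame (the column name ec and the values are made up) and compares it with an alternative that converts to a fixed UTC-8 offset before dropping the timezone. Both are assumptions about how one might do this in pandas, not something pyhecdss provides:

```python
import datetime as dt
import pandas as pd

# Synthetic UTC-indexed data spanning the 2021 fall-back transition.
idx = pd.date_range("2021-11-07 07:00", "2021-11-07 11:00", freq="15min", tz="UTC")
df = pd.DataFrame({"ec": range(len(idx))}, index=idx)

# The workaround from the issue: naive UTC shifted by a fixed 8 hours.
shifted = df.set_index(df.index.tz_localize(None) - pd.Timedelta("8h"))

# Alternative: convert to a fixed UTC-8 offset, then drop the timezone.
pst = dt.timezone(dt.timedelta(hours=-8))
converted = df.tz_convert(pst).tz_localize(None)

# Both give the same naive index, with no repeated timestamps.
assert shifted.index.equals(converted.index)
assert shifted.index.is_unique
```

A fixed offset has no daylight saving transitions, so every UTC instant maps to a distinct local timestamp.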

Possible Improvements

  • write_its should raise an error when the index contains duplicate timestamps. For example, check that len(df.index) == len(df.index.unique()) (or, equivalently, df.index.is_unique) before trying to write to DSS.
  • Add support for writing timezone-aware pandas data to DSS, including timezone attributes.
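Until something like this exists in the library, both suggestions can be approximated on the caller's side. The helper below is a hypothetical sketch: prepare_for_dss and its utc_offset_hours parameter are invented for illustration and are not part of pyhecdss. It converts a timezone-aware index to a naive fixed-offset index and raises on duplicate timestamps instead of letting the write fail silently:

```python
import datetime as dt
import pandas as pd

def prepare_for_dss(ts: pd.Series, utc_offset_hours: int = -8) -> pd.Series:
    """Hypothetical pre-write check; not part of pyhecdss.

    Assumes ts has a DatetimeIndex. A timezone-aware index is converted to a
    fixed UTC offset and made naive; duplicate timestamps raise ValueError.
    """
    if ts.index.tz is not None:
        fixed = dt.timezone(dt.timedelta(hours=utc_offset_hours))
        ts = ts.tz_convert(fixed).tz_localize(None)
    if not ts.index.is_unique:
        dupes = ts.index[ts.index.duplicated()].unique()
        raise ValueError(
            f"{len(dupes)} duplicate timestamp(s) in index, first: {dupes[0]}"
        )
    return ts

# Usage with synthetic data: a tz-aware UTC series becomes naive and unique.
idx = pd.date_range("2021-11-07 07:00", "2021-11-07 11:00", freq="15min", tz="UTC")
clean = prepare_for_dss(pd.Series(range(len(idx)), index=idx, dtype=float))
```

The cleaned series could then be passed to write_its as in the original snippet. Which offset to store is a choice for the caller; -8 here simply mirrors the end-user fix above.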

pyRobShrk • Apr 30 '24 21:04