Add preliminary support for ISO-8601 timestamps via date: archive match pattern (#8715)

Open c-herz opened this issue 8 months ago • 8 comments

This PR adds preliminary support for matching ISO 8601 timestamps with the date: archive filter, and intends to begin addressing the requirements of #8715.

Timestamps (except for Unix epoch forms) are currently assumed to be in the user's local timezone and converted to UTC internally. The following formats are currently supported:

  1. YYYY
  2. YYYY-MM
  3. YYYY-MM-DD
  4. YYYY-MM-DDTHH -> matches a 1-hour interval
  5. YYYY-MM-DDTHH:MM -> matches a 1-minute interval
  6. YYYY-MM-DDTHH:MM:SS -> matches a 1-second interval
  7. YYYY-MM-DDTHH:MM:SS.ffff -> matches an exact timestamp, including fractional seconds
  8. @123456789 -> Unix epoch (interpreted as UTC)
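To make the interval semantics of formats 1 to 6 concrete, here is a minimal sketch, not the code in this PR. The names `FORMATS`, `next_boundary`, `to_utc` and `pattern_to_interval` are hypothetical, and it assumes each pattern maps to a half-open UTC interval whose length is the finest unit given. The `tz` parameter is only there so the result does not depend on the machine's local timezone; with `tz=None` the system local zone is used, as described above.

```python
from datetime import datetime, timedelta, timezone

# (strptime format, finest unit) pairs, coarsest first. Illustrative only.
FORMATS = [
    ("%Y", "year"),
    ("%Y-%m", "month"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m-%dT%H", "hour"),
    ("%Y-%m-%dT%H:%M", "minute"),
    ("%Y-%m-%dT%H:%M:%S", "second"),
]
STEPS = {
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
    "second": timedelta(seconds=1),
}


def next_boundary(start, unit):
    """Return the first instant after the period that begins at `start`."""
    if unit == "year":
        return start.replace(year=start.year + 1)
    if unit == "month":
        if start.month == 12:
            return start.replace(year=start.year + 1, month=1)
        return start.replace(month=start.month + 1)
    return start + STEPS[unit]


def to_utc(naive, tz):
    """Attach `tz` (or the system local zone if tz is None), then convert to UTC."""
    aware = naive.replace(tzinfo=tz) if tz is not None else naive.astimezone()
    return aware.astimezone(timezone.utc)


def pattern_to_interval(expr, tz=None):
    """Map a date pattern to a half-open UTC interval [start, end)."""
    for fmt, unit in FORMATS:
        try:
            start = datetime.strptime(expr, fmt)
        except ValueError:
            continue  # this format does not fit, try the next finer one
        return to_utc(start, tz), to_utc(next_boundary(start, unit), tz)
    raise ValueError(f"unsupported date pattern: {expr!r}")
```

For example, `pattern_to_interval("2025-04", tz=timezone.utc)` yields the interval from 2025-04-01 00:00 up to, but not including, 2025-05-01 00:00. Note that `strptime` is fairly lenient (it accepts a single-digit month, for instance), so this sketch illustrates the interval logic rather than strict validation.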

(Is the fractional-second exact match useful in practice? Feedback welcome on this.)

Still in progress:

  • [x] Support for wildcard-style matching (date:2025-*-01)
  • [x] User-specified timezones and offsets
  • [x] Keywords like now, oldest, newest, etc
  • [x] Explicit duration intervals
  • [x] RFC 3339 / RFC 9557 format support
  • [ ] Test/documentation coverage

c-herz avatar Apr 19 '25 22:04 c-herz

Thanks for picking this up! :heart:

  1. YYYY-MM-DDTHH:MM:SS.ffff -> matches an exact timestamp, including fractional seconds
  2. @123456789 -> Unix epoch (interpreted as UTC)

(Is the fractional-second exact match useful in practice? Feedback welcome on this.)

What's the precision of an archive's creation time? From the code I assume it includes fractional seconds, right? Then I absolutely agree with you: there should be a variant that is guaranteed to match a single archive. I feel that Unix timestamps should then also optionally support fractional seconds.
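A rough sketch of what optional fractional seconds for the Unix epoch form could look like. `EPOCH_RE` and `parse_epoch` are hypothetical names, and the limit of six fractional digits (microseconds) is an assumption, not something taken from the PR. The fraction is handled as an integer to avoid float rounding.

```python
import re
from datetime import datetime, timedelta, timezone

# "@" + whole seconds, optionally followed by up to 6 fractional digits.
EPOCH_RE = re.compile(r"@(?P<secs>\d+)(?:\.(?P<frac>\d{1,6}))?")


def parse_epoch(expr):
    """Parse '@<seconds>[.<fraction>]' into an aware UTC datetime."""
    m = EPOCH_RE.fullmatch(expr)
    if m is None:
        raise ValueError(f"not a Unix epoch pattern: {expr!r}")
    frac = m["frac"] or ""
    # Right-pad the fraction to microseconds: "5" -> 500000.
    micros = int(frac.ljust(6, "0")) if frac else 0
    whole = datetime.fromtimestamp(int(m["secs"]), tz=timezone.utc)
    return whole + timedelta(microseconds=micros)
```

With this, `parse_epoch("@123456789.5")` gives the same second as `parse_epoch("@123456789")` with the microsecond field set to 500000.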

  1. YYYY-MM-DDTHH -> matches a 1-hour interval

From reading the code, I gather that the 1-hour interval is matched exclusively, i.e. it actually matches any archive within 00:59:59.9999… hours, correct? Perfect 👍
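Just to spell out what I mean by "exclusively": a tiny sketch, assuming the interval is half-open (start included, end excluded). `matches` is a hypothetical name, not a function from the PR.

```python
from datetime import datetime, timedelta, timezone


def matches(ts, start, length):
    """True when start <= ts < start + length (the end is excluded)."""
    return start <= ts < start + length


start = datetime(2025, 4, 19, 22, tzinfo=timezone.utc)
hour = timedelta(hours=1)
last_instant = datetime(2025, 4, 19, 22, 59, 59, 999999, tzinfo=timezone.utc)
next_hour = datetime(2025, 4, 19, 23, tzinfo=timezone.utc)

print(matches(last_instant, start, hour))  # True
print(matches(next_hour, start, hour))     # False
```

This way adjacent patterns such as 22 and 23 never both match the same archive.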

# Year/Year-month/Year-month-day
parts = expr.split("-")
try:
    if len(parts) == 1:                    # YYYY
        year = int(parts[0])

Even though I like the simplicity, I feel that Borg should be pretty strict about the format, because being less strict easily leads to ambiguity. For example, is date:0 supposed to match any archive created in year 0? Probably, but it gets way less clear with (now deprecated) truncated ISO 8601 dates: What does the pattern date:25-1 describe? January of the year 25, January 1925, or January 2025?

If there's no library that can be used, I always imagined that the code would basically revolve around a single, rather strict regex with bottom-up optional groups for year, month, day, hour, minutes, seconds, and fractional seconds, or * as wildcard, supplemented by another regex to match periods, and simple matchers for Unix timestamps and keywords. I'm not saying that this is the best approach; that's just what I imagined while writing #8715.
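To illustrate the kind of regex I have in mind, here is a minimal sketch. `DATE_RE` and `parse_strict` are made-up names, the six-digit limit on the fraction is an assumption, and wildcards, periods and keywords are left out. Each finer component is nested inside the optional group of the coarser one, so a month can only appear after a year, a day only after a month, and so on.

```python
import re

# Strict, fixed-width components; finer parts are only allowed after coarser ones.
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})
    (?:-(?P<month>0[1-9]|1[0-2])
      (?:-(?P<day>0[1-9]|[12]\d|3[01])
        (?:T(?P<hour>[01]\d|2[0-3])
          (?::(?P<minute>[0-5]\d)
            (?::(?P<second>[0-5]\d)
              (?:\.(?P<fraction>\d{1,6}))?
            )?
          )?
        )?
      )?
    )?
    """,
    re.VERBOSE,
)


def parse_strict(expr):
    """Return the components that are present, or raise ValueError."""
    m = DATE_RE.fullmatch(expr)
    if m is None:
        raise ValueError(f"invalid date pattern: {expr!r}")
    return {name: value for name, value in m.groupdict().items() if value is not None}
```

With this, `parse_strict("2025-01")` returns `{"year": "2025", "month": "01"}`, while ambiguous inputs such as `0` or `25-1` are rejected. The regex alone does not check calendar validity (it accepts `2025-02-31`), so a second validation step would still be needed.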

In general I like to encourage creating extensive unit tests as early as possible. It's elegant and simple code now (🚀 👍), but complexity will increase greatly when adding more and more features.

Note: I can read the code, but can't do an actual code review; I just don't know enough of Borg's code for that.

PhrozenByte avatar Apr 20 '25 13:04 PhrozenByte

BTW, if you install the pre-commit hook, you can have your commits automatically formatted.

https://borgbackup.readthedocs.io/en/stable/development.html#building-a-development-environment

ThomasWaldmann avatar Apr 21 '25 12:04 ThomasWaldmann

Codecov Report

Attention: Patch coverage is 90.71038% with 17 lines in your changes missing coverage. Please review.

Project coverage is 81.62%. Comparing base (d1899c1) to head (9cb5e5f). Report is 29 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| src/borg/helpers/time.py | 88.66% | 11 Missing and 6 partials :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8776      +/-   ##
==========================================
- Coverage   81.89%   81.62%   -0.28%     
==========================================
  Files          74       74              
  Lines       13324    13517     +193     
  Branches     1968     2008      +40     
==========================================
+ Hits        10912    11033     +121     
- Misses       1750     1802      +52     
- Partials      662      682      +20     


codecov[bot] avatar Apr 25 '25 19:04 codecov[bot]

Before making further changes, please rebase on the current master branch to get the Cython workaround. Otherwise, builds will fail.

ThomasWaldmann avatar May 09 '25 20:05 ThomasWaldmann

I guess I'd like to merge this for the next beta. Can we finish this by then?

ThomasWaldmann avatar May 22 '25 17:05 ThomasWaldmann

ping?

ThomasWaldmann avatar Jun 03 '25 17:06 ThomasWaldmann

My apologies, I have been caught up with finals and starting a new internship. I should be able to look into @PhrozenByte's suggestions by this weekend. Apologies for the delays and the rather poor communication!

c-herz avatar Jun 05 '25 18:06 c-herz

ping? (no hurry, but if you find some time, it would be nice to finish this)

ThomasWaldmann avatar Aug 05 '25 14:08 ThomasWaldmann