julia icon indicating copy to clipboard operation
julia copied to clipboard

add ISO year / ISO week utilities

Open isentropic opened this issue 2 years ago • 21 comments

Based on some discussion in https://discourse.julialang.org/t/determine-date-range-given-year-and-week-number/93878 and the relevant issue https://github.com/JuliaLang/julia/issues/48490 I created a couple of utilities to help handle 2023-W05-4 (YYYY-WWW-D) dateformats (https://en.wikipedia.org/wiki/ISO_week_date) Currently there is no functionality of ISO Week with weekday | 2023-W05-4 in Julia, but this could be good start and could help a lot of people who require such dateformat.

Notably, with firstmondayofyear it is trivial to convert from week format 2023-W05-4 to usual date format 2023-02-02 as:

firstmondayofyear(Date(2023)) + Week(5 - 1) + Day(3)

This was otherwise impossible within std lib

isentropic avatar Feb 03 '23 03:02 isentropic

Since the first week is defined in terms of Thursday according to ISO, perhaps this should be named firstthursdayofyear and then it would be both right by intuition and ISO convention. But then, most users will need to subtract three days to come back to monday.

isentropic avatar Feb 03 '23 04:02 isentropic

I find both firstmondayofyear and firstthursdayofyear a bit ambiguous, because with ISO week dates the weird thing is the year in which a date can fall. This "ISO week-numbering year" is often called "ISO year" for short (see e.g. here).

For example, 2019-12-30 and 2021-01-01 are two dates that have ISO year 2020: https://www.epochconverter.com/weeks/2020.

So with

julia> date = Date(2021)   # A day of 2021 that has ISO year 2020
2021-01-01

what should firstmondayofyear(date) or firstthursdayofyear(date) be, when it's not clear which year we're talking about?

The current implementation in this PR gives

julia> firstmondayofyear(date)   # date = 2021-01-01
2021-01-04

Is this expected? If the function is supposed to help computations with ISO weeks I would expect here to get 2019-12-30, because that is the first Monday in the ISO year that contains the date 2021-01-01.

knuesel avatar Feb 03 '23 10:02 knuesel

Here are some ideas of other functions/names to help working with ISO week dates:

  • firstmondayofisoyear (and matching lastmondayofisoyear) to get the first Monday of the ISO year containing the given date,
  • isoyear to get the the ISO year containing the given date,
  • Date and DateTime constructors that accept Year and Week values and interpret the inputs as ISO values.

Maybe most useful would be weekdates to address the original problem directly (find date range given ISO year and week).

Examples:

julia> date = Date(2021)  # 2021-01-01 falls in ISO year 2020 which has first Monday 2019-12-30
2021-01-01

julia> firstmondayofisoyear(date)
2019-12-30

julia> isoyear(date)
2020

julia> Date(Year(2021), Week(1))
2019-12-30

julia> weekdates(Year(2020), Week(1))
Date("2019-12-30"):Date("2020-01-05")

knuesel avatar Feb 03 '23 10:02 knuesel

The current behavior is intentional, but you are right about the ambiguity. Initially I was going to only let iso year as an input, but then I thought it would make it easier to let users input arbitrary Date which is technically wrong. I could keep the functions implementations mostly the same, but the input will be iso year (integer). The names will change to reflect iso year. How about this for a start?

Next, for the extra functionality I will follow the examples you have given. Do you know where could I find test data?

isentropic avatar Feb 03 '23 17:02 isentropic

I think it makes sense to let users pass an arbitrary Date: this is how the existing first... functions work (for example firstdayofyear). If a function is called firstmondayofyear, it should probably work the same: accept any TimeType that specifies a day (so either a Date or a DateTime) and give the correct answer for that day.

In the case of firstmondayofyear, this means the user can give any day, and should get the first Monday of the year that contains this day. So that's unrelated to the notion of ISO week... For example the proposed implementation gives

julia> firstmondayofyear(Date(1901))
1900-12-31

while I think it should give 1901-01-06...

But anyway for working with ISO weeks, aren't the functions I proposed easier and clearer? For example to find 2023-W05-4, you could call Date(Year(2023), Week(5), Day(4)), or weekdates(Year(2023), Week(1))[4]. What do you think? Do you see a case that would be easier to solve with firstmondayofyear?

For the test data maybe the easiest is to make a CSV with Excel (or with Julia and the week function but that would be less of a cross-check). Or we can just take a few interesting cases from a page like https://www.epochconverter.com/weeks/ ?

knuesel avatar Feb 08 '23 16:02 knuesel

I actually thought and decided against firstmondayofyear functions completely, as I feel it is not needed now that I learned more about ISO year system. I implemented some utils that would translate from normal date to ISO-week-day format.

Though, the current implementation might not be too fast, but it feels to be more robust. What do you think? If this looks to be satisfactory I will proceed to implement inverse conversion and weekdates functionality and Date wrappers. I will add tests too, once we agree on the implementation.

isentropic avatar Feb 09 '23 06:02 isentropic

For weeksperyear I used https://en.wikipedia.org/wiki/ISO_week_date#Weeks_per_year equivalent definition which seems safe

isentropic avatar Feb 09 '23 06:02 isentropic

Yeah, I'll get on to that, sorry for the delay. My editor keeps formatting files on save according to juliaformatter (thereby chaning like 50% of the lines), i have not committed these changes but do you guys try to format the julia stdlib codebase? Should I commit the stylistic changes?

isentropic avatar Mar 06 '23 02:03 isentropic

do you guys try to format the julia stdlib codebase?

We do want to format the whole codebase. And there is a related issue: #47052

Should I commit the stylistic changes?

Better not, Only format your newly written code.

inkydragon avatar Mar 06 '23 05:03 inkydragon

Yeah i have no idea why tests are failing on apple ...

isentropic avatar Mar 06 '23 06:03 isentropic

#!/usr/bin/env python3

import pandas as pd
data = pd.read_csv("testdate.csv", parse_dates=['date'])
data["iso-date"] = data.date.apply(lambda x: x.isocalendar())

for row in range(data.shape[0]):
    normaldate = data.iloc[row, 0].strftime("%Y,%m,%d")
    isoyear = data.iloc[row,1].year
    isoweek = data.iloc[row,1].week
    isoday = data.iloc[row,1].weekday
    print(f"@test isoweekdate(Date({normaldate})) == ({isoyear}, {isoweek}, {isoday})")

was used to generate the test cases

isentropic avatar Mar 06 '23 06:03 isentropic

i have no idea why tests are failing on apple ...

The error is not related to Dates, just ignore it.

inkydragon avatar Mar 06 '23 08:03 inkydragon

doctest failed. The output needs to be updated.

inkydragon avatar Mar 06 '23 10:03 inkydragon

I dont understand, where/what should I fix?

isentropic avatar Mar 06 '23 11:03 isentropic

Thanks a lot, so i guess the doctest actually runs the examples am I right?

isentropic avatar Mar 07 '23 11:03 isentropic

the doctest actually runs the examples am I right?

yes!

A doctest is a fenced code block (see Code blocks) starting with ```jldoctest and contains any number of julia> prompts together with inputs and expected outputs that mimic the Julia REPL. —— https://docs.julialang.org/en/v1/manual/documentation/#Writing-Documentation

And Doctests · Documenter.jl

inkydragon avatar Mar 07 '23 11:03 inkydragon

Anything left to do? @knuesel

isentropic avatar Mar 08 '23 04:03 isentropic

I feel a little stupid for pinging this, but I just saw julia 1.10 early alpha dropping out, I just wish this was somehow available in the standard lib. @JeffBezanson @timholy

isentropic avatar Jul 07 '23 01:07 isentropic

My goodness, you've been waiting a while for this! Sorry you've been left hanging for so long, and thanks for holding out hope :sweat_smile:

It looks like recent contributions to Dates have come from a mix of people, so I'm not sure who's worth pinging in particular, but looking at the code I think this seems rather reasonable.

If nobody has any particular objections at this point, I think this would be worth merging.

tecosaur avatar May 13 '24 08:05 tecosaur

Don't we need the compat admonitions?

kimikage avatar May 13 '24 12:05 kimikage

@tecosaur so anything I can do from this point on?

isentropic avatar May 21 '24 04:05 isentropic

Adding Julia 1.12 compat admonitions would be the one thing that comes to mind. I don't think I'm the best person to do so, but after that I'm inclined to just merge this as it seems rather sensible on the whole and remaining objections/concerns are rather minor.

The one unaddressed comment I wonder about is from Jeremie

I's a bit odd to return Year, Week and Int64 instead of Year, Week and Day. On the other hand, returning an Int64 for the day is consistent with dayofweek, and I understand Day to be rather for the day in the year. It would be nice to have someone else's thoughts on this.

Looking at Day, it seems like a sensible semantic wrapper type to apply here.

tecosaur avatar Jul 25 '24 07:07 tecosaur

I think it makes a lot of sense to match the return type with dayofweek, this feels very natural:

isoweekdate(dt::DateTime) = (isoyear(dt).value, week(dt), dayofweek(dt))

I just thought whoever wrote dayofweek function must've had a good reason to make it return Int64. I can do it anyway you like but this better not take another year

isentropic avatar Jul 25 '24 08:07 isentropic

I think the thing with dayofweek is that it returning an Int makes it already a bit inconsistent internally, as is mentioned in #19212

tecosaur avatar Jul 25 '24 08:07 tecosaur

Well, it seems like deprecation of dayofweek wasn't successful as per #19212, so can we really do something on our end in this PR?

isentropic avatar Jul 25 '24 08:07 isentropic

Probably not, I just wonder if there's a way we can make something like the change in #19212 not break the API here?

tecosaur avatar Jul 25 '24 09:07 tecosaur

I can just wrap it as Int64(dayofweek(dt)) but I can't think of anything we can do here. Any suggestions?

isentropic avatar Jul 25 '24 11:07 isentropic

If/when Dates becomes an upgradable stdlib, one can make all sorts of breaking changes and release Dates 2.0.0. But this may not be the PR for that.

timholy avatar Jul 25 '24 18:07 timholy

What's the downside of current PR anyways? I just added a bunch of tested functionality that was absent before. Finding ISO year/week is present in most other stdlibs of other prog. languages.

I just find it frustrating to mentally refocus on this PR every few months, without hearing actionable feedback. I don't wanna abandon this work, because I believe this will help someone someday

isentropic avatar Jul 26 '24 05:07 isentropic

This current PR is good, it just seems to be the that for consistency with the rest of the Dates API dayofweek shouldn't return an Int, but that isn't this PR's fault.

Perhaps the way to go would be to merge this PR be introduce a weekday type and switch this introduced code to use it in another PR?

tecosaur avatar Jul 26 '24 06:07 tecosaur