stringi icon indicating copy to clipboard operation
stringi copied to clipboard

`stri_datetime_parse()` add base date/time argument

Open kwhkim opened this issue 2 years ago • 5 comments

Hello, great package.

stri_datetime_parse('2022-01-01', 'yyyy MM dd') works well,

except that time part is set to current time, something like

stri_datetime_parse('2022-01-01', 'yyyy MM dd')
[1] "2022-01-01 10:46:40 KST"

10:46:40 is current time. is this intended? if so why?

kwhkim avatar Dec 05 '21 01:12 kwhkim

This is an intended behaviour. I now mention it explicitly in the manual. The assumption is that if you wish to get any other defaults (say, midnight in the current time zone), you can always do something like:

x <- c('2015-02-28', '2015-02-29')
stri_datetime_parse(x, 'yyyy-MM-dd')
## [1] "2015-02-28 10:17:21 AEDT" NA                        
stri_datetime_parse(x %s+% " 00:00:00", "yyyy-MM-dd HH:mm:ss")
## [1] "2015-02-28 10:17:21 AEDT" "2015-03-01 10:17:21 AEDT"
[1] "2015-02-28 00:00:00 AEDT" NA  

gagolews avatar Dec 05 '21 23:12 gagolews

I knew how to set the time at midnight. I just wanted to know the reason why you had made that decision. Is it better in some ways or is it useful? Because I do not know the occasion where it is more useful to set the time at current time.

I am experimenting with your package because I like your goal: code should work the same in every other platforms or generalized to other locale. But in this case, I do not know how it is useful. Can you enlighten me?

kwhkim avatar Dec 06 '21 15:12 kwhkim

Actually, now I see I have a bug there, as the 'base' time is not re-set on every parsing activity.

Thanks

> stri_datetime_parse(c('1970-01-01', '12'), c('yyyy-MM-dd', 'HH'))
[1] "1970-01-01 09:30:13 AEST" "1970-01-01 12:30:13 AEST"

gagolews avatar Dec 07 '21 22:12 gagolews

This is actually the default behaviour that we get from ICU.

Manual on parse() (https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1DateFormat.html#a82a10d2a3b70277bd983a638477bf154) says

cal ---- A Calendar set on input to the date and time to be used for missing values in the date/time string being parsed, and set on output to the parsed date/time. When the calendar type is different from the internal calendar held by this DateFormat instance, the internal calendar will be cloned to a work calendar set to the same milliseconds and time zone as the cal parameter, field values will be parsed based on the work calendar, then the result (milliseconds and time zone) will be set in this calendar.

And cal is by default initialised with the current time.

On a side note, strptime defaults to midnight today, in the current time zone:

> strptime('2019', '%Y')
[1] "2019-12-08 AEDT"
> strptime('12', '%M')
[1] "2021-12-08 00:12:00 AEDT"

gagolews avatar Dec 07 '21 22:12 gagolews

Actually, the calendar set on input to the date and time to be used for missing values in the date/time string being parsed could be made changeable by the user

I'll introduce a new argument to stri_datetime_parse so that it can be changed manually and to indicate explicitly that stri_datetime_now() is the default

Thanks

gagolews avatar Dec 07 '21 22:12 gagolews

It seems that R's strptime default to midnight local timezone today:

> strptime("2023 32", "%Y %M")
[1] "2023-11-08 00:32:00 AEDT"

I will soon make stringi's behaviour compatible with the above. Fix on the way.

gagolews avatar Nov 07 '23 23:11 gagolews