chrono
chrono copied to clipboard
Consider refactoring chrono to avoid systematic misuse of panics
Many constructors in chrono has the behavior to panic on invalid inputs. This issue proposes to systematically change this behavior to instead make use of fallible constructors which in effect means the removal of the *_opt family of functions.
I'll be trying to make an argument for why below, at the end are the concrete changes I'm proposing.
Why should we do this?
The use of chrono has been the culprit of many unwanted panics in code that I've personally written in my career as a Rust programmer. The reason why this happens is sloppiness from my part. I use a method without realizing that it panics even though it says so in the documentation. But after consideration I've come to believe that the design of these methods could be improved.
The standard I abide by when it comes to panicking is broadly that correct Rust programs should not panic. So what I'll try and determine next is if uses of these constructors contributes to writing correct Rust programs that do or do not panic.
There are a few tests I'll be applying here each time one of these methods are used:
- Does the caller check the necessary preconditions before calling the function?
- Does the caller control the input to the function so that it's runtime invariant?
- Is the function call in a context where it's OK to panic. Such as a test case or a short lived CLI program.
Preconditions
If we look at the doc for most of these functions they contain wording such as TimeZone::ymd:
Panics on the out-of-range date, invalid month and/or day.
This is a difficult precondition to check. An out-of-range date isn't a mechanical test such as "is this value smaller than 1000". It requires a deep understanding of the gregorian calendar system which chrono itself is responsible for providing. It's implausible that uses ensure that its input passes the required condition due to complexity. So the first test probably fails systematically across many projects.
Runtime invariant inputs or tests
It's easy to find many uses of these functions where the input is runtime variant and not tests. I've found that many of these relate to ad-hoc use in combination with serde, which is a strong indication that its arguments truly are runtime variant. A bit of taxonomy lead me to believe that there might be a case of systematic misuse.
What to do?
I believe many of the examples one can find of chrono production use fails the tests I proposed. The way to correct these uses would be through the *_opt family of methods. But since the non-opt methods seems to be considered the default and probably leads to systematic problems when used I think it's worth considering changing them.
What I'd suggest and would like to discuss is:
- Rewrite all panicking functions that do not have an easy precondition to test to be fallible instead.
- Consider rewriting other functions which panic if there is a high risk of misuse where the caller should actually be using the
*_optsibling function but for one reason or another doesn't. - For all the rewritten functions: deprecate or remove their
*_optsibling function.
Thank you!
Thanks for raising the issue @udoprog. I think it should definitely be considered whether we promote the Option returning methods as the default ones, rather than the panicking ones. We could also consider whether returning a Result with an error type that implements Error would be even better, however this would have to be gated by the std feature. Now is a good time to do this as we are now working on a 0.5 release which can include breaking changes.
My summary of things to consider:
- Consider having the fallible methods be the default, with the panicking methods having some sort of common suffix (or just require users to
?orunwrapthem) - Consider more use of the type system (eg https://github.com/chronotope/chrono/issues/727) - this works well for
Month(similar to the "parse, dont' validate" strategy) - Helper methods where returns have multiple failure cases (eg https://github.com/chronotope/chrono/issues/716 and also on
TimeDelta) AddandSub- some implementations can panic (egAdd<Days> for DateTime<Tz>) - potentially we should only implement these for cases where the only way to panic is due to numeric overflow (we also have to consider here numeric overflow of the underlying storage, or simply reaching the maximum allowable year value)
Yes, one of things on my list for 0.5 is to audit all causes of panic in Chrono and remove it where possible. The cases that would be hard to prevent is things like Add or Sub impls that might overflow, where we have started making sure we offer checked variants as well.
Any help you could offer on this topic would be much appreciated!
Thanks for the positive reception! I'd be glad to put together a PR as a proposal now that I've gotten a feel for the waters.
So I've got a work-in-progress branch where I tested some of the above things out, a preliminary review to see if it's on the right track would be nice before I do the work of making all doctests pass 😅. All the regular unit tests do pass.
Note: this has been opened as #817
The broad strokes are:
I've opted for a single ChronoError type which has a private kind that determines the message presented to the user (but is not public). I initially started with one error type per kind of error, but eventually noticed that it would make propagating errors overly hard like how constructing a DateTime either raises an error because the date or the time was invalid.
I've opted to make loading the local timezone fallible instead of panicking (note that I have not yet changed the underlying implementations to be fallible). I think this is motivated because technically the underlying platform might perform checks which might fail all though what these look like exactly is a bit unclear.
This has the side effect of making a number of TimeZone methods fallible - which is unfortunate because a number of timezones can easily be interacted with without erroring (Utc and FixedOffset). To mitigate this I've introduced a separate FixedTimeZone trait that extends the TimeZone trait which has a number of non-fallible methods.
Making things fallible had the side effect of clearly marking the standard trait implementations which might panic, these are:
- The
From<T>implementations which interacts with theLocaltimezone, because this doesn't implementFixedTimeZone. Where appropriate these have been moved to beTryFrom<Error = ChronoError>implementations. - As mentioned above
AddandSub.
If you think something is going in the wrong direction I'd appreciate the guidance.
We could also consider whether returning a Result with an error type that implements Error would be even better, however this would have to be gated by the std feature.
core::error::Error exists in nightly now.
Another alternative is to use the core2 crate to polyfill the error type for no_std targets.
I think returning Result for these cases would be mostly more idiomatic than Option, we can guard implementations of the std Error trait as necessary.
I think returning Result for these cases would be mostly more idiomatic than Option, we can guard implementations of the std Error trait as necessary.
That's what I've opted for in #817. I also thought that's what @esheppa meant!
Like how much more reliable that makes code especially with regards to timezone changes overall.
However it looks like for a few functions this is overkill and the ergonomics loss don't outweigh the benefits:
- NaiveDate::and_hms:
IIUC the only case where it may fail is if
h >= 24 || m >= 60 || s >= 60. This is obvious to check for, and this function is basically only ever used in scenarios where this is a constant or close so, so the ergonomics loss doesn't seem worth. - NaiveDate::succ and NaiveDate::pred:
We're talking dates ranging from Jan 1, 262145 BCE to Dec 31, 262143 CE. The 0.0000001% people who may hit that range while calling succ or pred can pay (that they should pay attention to it themselves because the deprecation warning isn't gonna help them) for the ~100% who don't, and we should probably just propagate the behavior of
Date +- Duration::days(1). (At least this doesn't make much sense untilAdd<Duration> for NaiveDatecan be deprecated as well.)
NaiveDate::and_hms(): I disagree that it "is obvious to check for", and I don't see why this function would only ever be used in scenarios where this is a constant.NaiveDate::succ()andNaiveDate::pred(): I agree that the case is not as clear-cut here. Personally I find thesucc()andpred()API pretty unidiomatic and I would prefer just straight up getting rid of these, maybe in favor ofiter_days(). In the meantime we could potentially undeprecatesucc()andpred().
@djc We have 2 use cases in which we use and_hms with constants:
- During tests & doc tests: seems alright to unwrap here
- To convert a naive_date to midnight (not sure all midnights exists in all offsets though)
DateTime::from_utc(
// Date computation
.date_naive()
.and_hms_opt(0, 0, 0)
.unwrap(),
Utc,
)
NaiveDate::and_hms(): I disagree that it "is obvious to check for", and I don't see why this function would only ever be used in scenarios where this is a constant.
Hmm. I don't have any example where we're not using hms with either a constant or a data which we know by construction is always valid (such as values with modular arithmetic). I can imagine one such use-case would be parsing libraries (from api/database, etc, e.g. strftime...), but then in that context I would consider falling in the "obvious to check for" category.
That's not a big deal for us (only 4-5 use-cases throughout our codebase), but if you have a use-case that doesn't fall in either of these two categories I'm curious of what it looks like :)
Thanks again! :)
For testing fixed dates which are known valid (which I do as well, and some commenters on the PR such as https://github.com/chronotope/chrono/pull/827#issuecomment-1314720557) maybe a macro that statically checks the date and cannot panic at runtime would be more suitable.
let dt = datetime!(2006-01-02 15:04:05);
Such a macro sounds good to me. I'm not sure how one would do this with declarative macros?
This would be great, but I think we first need to be able to construct date(time)s in a const context (such that the decl or proc macro generated code can itself create the date), and then we need to consider if the MSRV bump for const-panic is worth it such that we can get compile time validation.
If only the former is possible we should still be able to have some non-panicking helper methods that can create dates at the start of given years or year+month combinations.
Alternatively we could gate the const-panic functionality behind a (potentially default) feature to get a best of both worlds.
It would be great to be able to store dates and times etc, in consts!
TLDR: macro looks fine for our use case. A typical use case is "midnight on this naive date", or "18:00 on this naive date", or "application-specific constant passed as parameter of the function :00 on this naive date". I guess if we have a macro to build Time that would indeed cover for these use-cases, although I think I slightly prefer the simplicity of the standard function here because of the low odds that somebody would set any of these to a non-existent hour.
Having re-read my comment, I'm not sure construction in a const context is actually necessary for this, but it is nice and solves some other issues as well.
I've pushed a branch with a sample ymd! macro here: https://github.com/chronotope/chrono/blob/9a0585af81ed1270d1170f6f9d0520716d28e8e9/src/naive/date.rs#L33
This uses an unnamed const to force const evaluation and will give a compile error if the inputs are invalid, eg:
let a = ymd!(2022, 11, 19); // compiles
let b = ymd!(2022, 11, 32); // compile error
and in constants:
const A: NaiveDate = NaiveDate::from_ymd_validated(2022, 11, 19); // compiles
const B: NaiveDate = NaiveDate::from_ymd_validated(2022, 11, 19); // compile error
one disadvantage of NaiveDate::from_ymd_validated is that if it is called in a non-const context, it will error at runtime instead of compile time. This could either be noted in documentation or potentially via a #[doc(hidden)] or other strategy to discourage unintentional use.
This would raise the MSRV to 1.47, which just happens to fit nicely along with https://github.com/chronotope/chrono/pull/712#issuecomment-1320328262 and so perhaps 1.48 could be a good MSRV for 0.5. (The const-validation feature which gates the above code requires much more recent rust, but is an optional feature)
edit:
We could also use const generic parameters, for example:
NaiveDate::from_ymd_validated_2::<2022, 1, 1>()
A typical use case is "midnight on this naive date",
I've created https://github.com/chronotope/chrono/pull/893 to assist with this, let us know your thoughts @Ten0 and @Elrendio
A typical use case is "midnight on this naive date",
I've created #893 to assist with this, let us know your thoughts @Ten0 and @Elrendio
Thank you! However it seems that only covers for midnight, which is far from being as common as hardcoded time anywhere in the day (e.g. end of working day at 18:30 or start of working day at 9:00 or middle at 13:00...). So that seems too restrictive.
Wanted to share my experience on this:
Of my uses of these (deprecated) functions that could be replaced by their *_opt variant, 71 lay in tests, and 1 lay in runtime code. For the single function that is in runtime code, I believe it is always correct - though I cannot be sure, but I am inclined to agree that I should handle the case that it fails.
However for the test cases, unwrapping (potentially once for ymd and once for time) is overkill.
As it is, I find specifying many dates / times in test cases very lengthy, and quite hard to see what the dates are vs function calls to construct them.
It would be massively preferrable to be able to specify them in a standard string format (preferably with timezone support), such as 2022-02-19 07:00:00 (And I may replace my test cases with these). If macros were able to do this, I think that would be amazing as that would minimize surrounding code. However, for my particular use case, verifying at compile time, while it sounds useful, would be irrelevant in tests, and any live-code is variable (such as through configuration) and hence wouldn't benefit. However, having a macro would definitely make me believe that the dates were checked at compile time.
If there was an alternative designed for test cases (either promoting parsing from strings, or macros), I think that would then make complete sense to drop the non-opt variants.
Additionally, it would be nice if the deprecation message included a link to this issue.
@tyhdefu - thanks for the feedback, and it's a good idea to link to this issue in the deprecation feedback.
Regarding your specific use case, perhaps the impl FromStr for NaiveDateTime would be an option, you could alias let dt = |s: &str| s.parse::<chrono::NaiveDateTime>().unwrap(); and then call eg dt("2022-02-19T07:00:00") in relevant places?
So that seems too restrictive.
Yes this is fair enough. The NaiveTime::MIN is just an initial fix that was very easy to do because we already had it in the code, and will cover hopefully many cases. From here we have to design an API that is flexible enough to handle most of the other cases, one option being something like:
let my_time = NaiveTime::<22, 05, 05>::new_hms();
@Ten0, re:
Hmm. I don't have any example where we're not using hms with either a constant or a data which we know by construction is always valid (such as values with modular arithmetic).
fwiw, I have hundreds of usages of .and_hms with constant values in rust code I've written or worked on in the past few years:
~/src$ rg '\.and_hms\(\d,\s*\d,\s*\d\)' -t rust --stats
392 matches
385 matched lines
109 files contained matches
In most cases, the usage is .and_hms(0, 0, 0) to get a datetime at the beginning of the day in a specific timezone. A significant amount of those are in tests, but not more than 50%.
I don't have any example where we're not
I only have such cases as well
If someone wants to add const_-prefix helpers that panic at compile-time with an opt-in Cargo feature guard, I'd be happy to review that on 0.4.x. (Don't pull in new dependencies, though.) Should come with tests that only run on modern Rust versions.
If someone wants to add
const_-prefix helpers that panic at compile-time with an opt-in Cargo feature guard, I'd be happy to review that on 0.4.x. (Don't pull in new dependencies, though.) Should come with tests that only run on modern Rust versions.
I've had a quick play with it. I don't see the point in "const" prefix helpers. Either the functions are const or they aren't. I think its best to just upgrade the existing functions to const functions wherever possible and we can guarantee that they will be const in future.
In addition to this, const implies that it will be checked at compile time, but infact it is the assignation to a const variable that does this check (or a macro that creates a const variable which we can create). To this end we could just have the *_opt functions be const, and unwrap them where we promise they will be available. I think this avoids confusion about exactly what const is. Having a const_ prefix like you suggest does not convey that it is not only but its const_panicky_at_runtime_ which is exactly what we are trying to avoid to by this refactor. People will think, oooh look const shiny and assume it cannot panic, using it instead of the _opt methods when actually it is equally fallible (and at runtime too).
However i'm not sure how older rust versions handle the const keyword - I believe they will just recognise it as a keyword but ignore it.
I've spend a little bit working on converting things to const and so far the problems I've encountered are:
- mod_floor is not const - though it can be replaced by rem_euclid from the standard library (where RHS is positive), although this could cause issues down the line, it seems your own implementation of it is already existent within the codebase to comply with MSRV - so this should be an easy change.
- Try /
?operator is not const. Converting things is going to cause lots of extra code without the?operator being fallible- https://github.com/rust-lang/rust/issues/74935
- static arrays for YEAR / MDF stuff needs to be made into const (i don't think this is actually a problem but I don't understand why they are static instead of const, so something that should be checked, perhaps a MSRV thing again?
- Contains on ranges and fallible get on slices are not
const- This is work aroundable but will cause possible introduction of bugs (since changing the implementation slightly) and harder to read code.
I don't see the point in "const" prefix helpers. Either the functions are const or they aren't. I think its best to just upgrade the existing functions to const functions wherever possible and we can guarantee that they will be const in future.
The point is to offer a library with a conservative MSRV (that doesn't have the const features we need for const constructors) with opt-in support (for users who don't care about the conservative MSRV) leveraging const support. I agree the const_ prefix is not ideal; in 0.5 we would invert this to remove the const_ prefix from these, adding a try_ prefix to the non-const alternatives instead.
See also https://github.com/chronotope/chrono/pull/882.
As I see it, the main benefit is the macro, and also the majority of rules for (utc) date and time are fairly simple.
I managed to create a macro that will confirm whether a datetime is definitely correct. This macro could then expand into an unwrap, without needed any kind of const support, making it available for all users.
Furthermore you could upgrade this macro in future to use const, or feature gate the const on the macro to improve it for those with const support.
Here is the macro in a gist: https://gist.github.com/tyhdefu/8bc6344160b50913a5d0f88702dfae24
I think a macro in this case is just harder to use and maintain, especially when we have const compilation which already works for most users without requiring an extra abstraction (and the vast majority of users should be able to use it).
Closing in favor of #1049. We have non-panicking alternatives to almost all methods now, and making them default for 0.5 is definitely on the roadmap.
We now have a good number of const initializers after #1043, #1080, #1205, #1286, #1337 and #1400. #1270 is for adding convenient initialization macros (with #13 as the corresponding issue).