chrono
chrono copied to clipboard
Consider refactoring chrono to avoid systematic misuse of panics
Many constructors in chrono
has the behavior to panic on invalid inputs. This issue proposes to systematically change this behavior to instead make use of fallible constructors which in effect means the removal of the *_opt
family of functions.
I'll be trying to make an argument for why below, at the end are the concrete changes I'm proposing.
Why should we do this?
The use of chrono
has been the culprit of many unwanted panics in code that I've personally written in my career as a Rust programmer. The reason why this happens is sloppiness from my part. I use a method without realizing that it panics even though it says so in the documentation. But after consideration I've come to believe that the design of these methods could be improved.
The standard I abide by when it comes to panicking is broadly that correct Rust programs should not panic. So what I'll try and determine next is if uses of these constructors contributes to writing correct Rust programs that do or do not panic.
There are a few tests I'll be applying here each time one of these methods are used:
- Does the caller check the necessary preconditions before calling the function?
- Does the caller control the input to the function so that it's runtime invariant?
- Is the function call in a context where it's OK to panic. Such as a test case or a short lived CLI program.
Preconditions
If we look at the doc for most of these functions they contain wording such as TimeZone::ymd
:
Panics on the out-of-range date, invalid month and/or day.
This is a difficult precondition to check. An out-of-range date isn't a mechanical test such as "is this value smaller than 1000". It requires a deep understanding of the gregorian calendar system which chrono
itself is responsible for providing. It's implausible that uses ensure that its input passes the required condition due to complexity. So the first test probably fails systematically across many projects.
Runtime invariant inputs or tests
It's easy to find many uses of these functions where the input is runtime variant and not tests. I've found that many of these relate to ad-hoc use in combination with serde
, which is a strong indication that its arguments truly are runtime variant. A bit of taxonomy lead me to believe that there might be a case of systematic misuse.
What to do?
I believe many of the examples one can find of chrono
production use fails the tests I proposed. The way to correct these uses would be through the *_opt
family of methods. But since the non-opt methods seems to be considered the default and probably leads to systematic problems when used I think it's worth considering changing them.
What I'd suggest and would like to discuss is:
- Rewrite all panicking functions that do not have an easy precondition to test to be fallible instead.
-
Consider rewriting other functions which panic if there is a high risk of misuse where the caller should actually be using the
*_opt
sibling function but for one reason or another doesn't. - For all the rewritten functions: deprecate or remove their
*_opt
sibling function.
Thank you!
Thanks for raising the issue @udoprog. I think it should definitely be considered whether we promote the Option
returning methods as the default ones, rather than the panicking ones. We could also consider whether returning a Result
with an error type that implements Error
would be even better, however this would have to be gated by the std
feature. Now is a good time to do this as we are now working on a 0.5
release which can include breaking changes.
My summary of things to consider:
- Consider having the fallible methods be the default, with the panicking methods having some sort of common suffix (or just require users to
?
orunwrap
them) - Consider more use of the type system (eg https://github.com/chronotope/chrono/issues/727) - this works well for
Month
(similar to the "parse, dont' validate" strategy) - Helper methods where returns have multiple failure cases (eg https://github.com/chronotope/chrono/issues/716 and also on
TimeDelta
) -
Add
andSub
- some implementations can panic (egAdd<Days> for DateTime<Tz>
) - potentially we should only implement these for cases where the only way to panic is due to numeric overflow (we also have to consider here numeric overflow of the underlying storage, or simply reaching the maximum allowable year value)
Yes, one of things on my list for 0.5 is to audit all causes of panic in Chrono and remove it where possible. The cases that would be hard to prevent is things like Add or Sub impls that might overflow, where we have started making sure we offer checked variants as well.
Any help you could offer on this topic would be much appreciated!
Thanks for the positive reception! I'd be glad to put together a PR as a proposal now that I've gotten a feel for the waters.
So I've got a work-in-progress branch where I tested some of the above things out, a preliminary review to see if it's on the right track would be nice before I do the work of making all doctests pass 😅. All the regular unit tests do pass.
Note: this has been opened as #817
The broad strokes are:
I've opted for a single ChronoError
type which has a private kind that determines the message presented to the user (but is not public). I initially started with one error type per kind of error, but eventually noticed that it would make propagating errors overly hard like how constructing a DateTime
either raises an error because the date or the time was invalid.
I've opted to make loading the local timezone fallible instead of panicking (note that I have not yet changed the underlying implementations to be fallible). I think this is motivated because technically the underlying platform might perform checks which might fail all though what these look like exactly is a bit unclear.
This has the side effect of making a number of TimeZone
methods fallible - which is unfortunate because a number of timezones can easily be interacted with without erroring (Utc
and FixedOffset
). To mitigate this I've introduced a separate FixedTimeZone
trait that extends the TimeZone
trait which has a number of non-fallible methods.
Making things fallible had the side effect of clearly marking the standard trait implementations which might panic, these are:
- The
From<T>
implementations which interacts with theLocal
timezone, because this doesn't implementFixedTimeZone
. Where appropriate these have been moved to beTryFrom<Error = ChronoError>
implementations. - As mentioned above
Add
andSub
.
If you think something is going in the wrong direction I'd appreciate the guidance.
We could also consider whether returning a Result with an error type that implements Error would be even better, however this would have to be gated by the std feature.
core::error::Error
exists in nightly now.
Another alternative is to use the core2
crate to polyfill the error type for no_std
targets.
I think returning Result for these cases would be mostly more idiomatic than Option, we can guard implementations of the std Error trait as necessary.
I think returning Result for these cases would be mostly more idiomatic than Option, we can guard implementations of the std Error trait as necessary.
That's what I've opted for in #817. I also thought that's what @esheppa meant!