Support date times in date fields
This PR supports RFC3339 formatted datetimes in date fields. The datetimes are read and stored with maximum available precision.
The following points could still be addressed:
- [ ] With the current implementation the string "19 94" is parsed as "1994". Should this be fixed?
- [x] Roundtrips are not fully idempotent: the offset
Zturns into+00:00. - [ ] The tests are relatively limited, perhaps property-based tests could be implemented to improve this.
- [x] Dates could also support conversion from/to YAML hashes.
Technically, this is a breaking change, since the Date struct got new fields. For the future, this could be fixed by adding #[non_exhaustive] to the struct definition.
Fixes #30.
I have a change request with regards to error handling in the parse function. The parsing error should be reported instead of panicking or silently ignoring the malformed date.
Yes, the panic was an oversight. Ignoring the ParseErrorKind::TooShort has been deliberate however, since chrono's RFC3339 parser expects a full datetime, including offset. I've documented this in an additional commit.
If at all possible, whitespace should not be accepted within the dates' numbers. However, if this is not possible with
chronowithout preprocessing, we could keep the current behavior. Alternatively,unscannycould be used for a small custom RFC3339 parser instead ofchrono. I wanted to get rid of the regexes in favor of unscanny parsers anyways.
There's no need to fix chrono's parser. The problem is in hayagriva and can be fixed rather simply.
However, I've noticed that by ignoring ParseErrorKind::TooShort, we also accept dates like 1994-. If we want to fix this, we would probably have to implement a custom RFC3339 parser anyways. This could also be an opportunity to improve the error handling, since chrono's errors are admittedly not good.
Should we go down that road?
I am okay with the roundtrip conversion from "Z" to "+00:00". Is there a semantic difference between the Zulu timezone and "+00:00"? At any rate, they specify the same time zone right now and I think that's unlikely to change unless the IETF wants to break the world.
No, there's no semantic difference. It's purely syntactical.
Aren't YAML date hashes a bit niche? At any rate, I think they are not within the scope of this PR.
Breaking changes are fine at this stage. I do not think that
Dateneeds a#[non_exhaustive]. Changes to the struct will likely change the semantics of all of its fields and also change what users can express and thus choose to, so the notification to library consumers that something is up is warranted.
I think we could improve type safety by using an enum instead of options. That way, encoding "impossible" configurations like year + day, but no month would become impossible. The existing members could then be accessors on the enum. Does that sound good?
Hi @reknih, just a quick follow-up. Have you had some time to review the fixes and think about the questions?
Hey @m-haug, @reknih is currently on vacation, so you'll probably have to wait some more.
On a side note, I'm curious about the enum thing. I see that it would prevent impossible configurations, but wouldn't there be lots of boilerplate and overlapping variants?
Thanks for the heads-up and the feedback!
Yes, there would probably be some duplication in the enum itself, since "larger" variants, like a full datetime, have to contain the same information as "smaller" variants, like a date or a year only. This also depends on some other decisions surrounding the design of the parser. For example, should a time always contain a minute, or is an hour specification sufficient?
However, I don't think there would necessarily be lots of boilerplate when using the enum design, especially on the consumer side. By providing accessors (month / get_month), the existing interface could mostly be preserved (apart from requiring a method call). Internally, some boilerplate would probably be necessary to provide the accessors, but thanks to pattern matching with or-patterns that should also be limited.
It should also be possible to provide mutations capabilities in the enum design, albeit with more effort. Of the top of my head, I can see
- precision-preserving operations, like
set_month, - narrowing operations, like
round_to_monthortruncate_to_month, - and widening operations, like
append_month.
The naming is obviously open to bike-shedding. The first and the third category may have some overlap, depending on the chosen semantics. For example, set_day on a date containing only year and month could automatically widen the date once.
If you have further feedback, or see anything that I may not have considered yet, I'd be thrilled to hear about it.
The enum + methods design sounds good, in principle. One other consideration is whether it's a public enum or a private enum within a struct. That way we could also enforce that the individual fields are within bounds. It might also make sense to have a Date and a Time struct/enum which could be composed in a Datetime struct with Date and Option<Time> fields.
Is there still interesting in completing this?
Closed because of inactivity. #30 will remain open to track this.