
support the java.time classes

Open xzel23 opened this issue 7 months ago • 11 comments

Java 8 introduced the java.time package and since then it is recommended to use java.time classes instead of Date/Calendar etc. - your IDE hopefully warns you about this. ODF Toolkit should directly support java.time and mark the "old" methods as deprecated.

PR incoming.

xzel23 avatar May 10 '25 12:05 xzel23

OK, I have prepared something and there are no unit test failures, but only because all TableCell tests are disabled. That has to be fixed first, so this one will have to wait.

xzel23 avatar May 10 '25 14:05 xzel23

Hi @svanteschubert, I need some input on this because the Calendar class, which is currently used all over the place, can be everything and nothing at the same time.

Can you confirm or clarify the following?

cell type 'date' in ODF

  • contains a date
  • no timezone information
  • can it contain only a date or a date and time combination? I think it should be a date/time combination.

cell type 'time' in ODF

  • contains a time
  • no timezone information
  • can have fractional parts of a second (although I see that ODF Toolkit currently does not support this)

Meta information in ODF like modification date, print date, etc.

  • contains a date time combination
  • do times in meta information contain timezone information? That would be the logical thing to do, since these values describe the time when an event took place and thus can only be represented accurately when either timezone information is stored or times are converted to, for example, UTC before storing.

xzel23 avatar May 13 '25 16:05 xzel23

@xzel23 The Date is specified in ODF 1.4 here: https://docs.oasis-open.org/office/OpenDocument/v1.4/OpenDocument-v1.4-part4-formula.html#Date

Another view on the problem is to look at the XML, here the XML Grammar as HTML in ODF 1.3: https://docs.oasis-open.org/office/OpenDocument/v1.3/os/schemas/OpenDocument-v1.3-schema-rng.html#2708

Another way to solve it is to take a look at LibreOffice and test what kind of dates are able to be added and how they are being saved.

In the end, although I am not certain that there is always a timezone required, I believe a timezone is a good idea.

svanteschubert avatar May 13 '25 16:05 svanteschubert

@svanteschubert When I look at both documents, there are three different types: date, time, and datetime. This does not seem to be correctly supported. When setDate is used, only the date is stored, and only the time in setTime.

In the XML, we have this:

  <rng:define name="dateOrDateTime">
    <rng:choice>
      <rng:data type="date"/>
      <rng:data type="dateTime"/>
    </rng:choice>
  </rng:define>

It looks like "dateTime" is not implemented for OdfTableCell. I will look into this.
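A minimal sketch of how the dateOrDateTime choice could map to java.time, assuming the value carries no timezone suffix (as discussed above for cells). The class and method names are hypothetical and not part of ODF Toolkit; only the java.time calls are real.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.Temporal;

// Hypothetical helper, not ODF Toolkit API: illustrates the rng:choice
// between xsd:date and xsd:dateTime for values without a timezone.
final class DateOrDateTimeSketch {

    /** Returns a LocalDateTime if the lexical value has a time part, else a LocalDate. */
    static Temporal parse(String lexical) {
        // xsd:dateTime separates date and time with 'T'; xsd:date has no 'T'.
        if (lexical.indexOf('T') >= 0) {
            return LocalDateTime.parse(lexical);
        }
        return LocalDate.parse(lexical);
    }

    public static void main(String[] args) {
        System.out.println(parse("2025-05-13"));          // date only
        System.out.println(parse("2025-05-13T17:05:00")); // date and time
    }
}
```

A caller that needs a single type could widen a date-only value with LocalDate.atStartOfDay(), but whether that is the right semantics for a cell is an open question in this thread.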

xzel23 avatar May 13 '25 17:05 xzel23

@xzel23 I was in a hurry earlier, allow me to answer now in more detail:

The XML grammar states for a cell, which is <table:table-cell> XML element, the following structure:

  1. https://docs.oasis-open.org/office/OpenDocument/v1.3/os/schemas/OpenDocument-v1.3-schema-rng.html#table-table-cell
  2. https://docs.oasis-open.org/office/OpenDocument/v1.3/os/schemas/OpenDocument-v1.3-schema-rng.html#table-table-cell-attlist
  3. https://docs.oasis-open.org/office/OpenDocument/v1.3/os/schemas/OpenDocument-v1.3-schema-rng.html#common-value-and-type-attlist pointing to the date and time:

Following the links of dateOrDateTime or duration from there leads to the following three W3C XSD datatypes, which need to be mapped to Java:

  1. https://www.w3.org/TR/xmlschema-2/#date
  2. https://www.w3.org/TR/xmlschema-2/#time
  3. https://www.w3.org/TR/xmlschema-2/#duration

The XML DOM Java classes of ODFDOM are generated in the dom package, so <table:table-cell> is TableTableCellElement.java. There is also a base class, added manually because covered cells share much of the same data, such as these data types and values. The typed XML DOM class TableTableCellElementBase.java contains the XML attributes (the data in the syntax layer) and should be "wired" from the doc layer (the convenient/semantic layer), for example when setting the time at https://github.com/tdf/odftoolkit/blob/master/odfdom/src/main/java/org/odftoolkit/odfdom/doc/table/OdfTableCell.java#L960

I guess this should now answer your question! Otherwise, don't hesitate to ask for more details!

Making an update to state-of-the-art JDK classes might involve the following mapping:

| XSD Type | Preferred Java Type | Notes |
|---|---|---|
| xsd:date | java.time.LocalDate / OffsetDate | Use OffsetDate if timezone is included |
| xsd:time | java.time.LocalTime / OffsetTime | Use OffsetTime if timezone is included |
| xsd:duration | javax.xml.datatype.Duration | Full fidelity for all components |
| | java.time.Duration / Period | Split into time-based or date-based |

PS: Note the above table is from a now public ChatGPT inquiry
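A rough sketch of the mapping in the table, using only classes that exist in the JDK. Note that java.time has LocalDate, OffsetTime and OffsetDateTime but no OffsetDate class, so an xsd:date with an offset has no direct java.time counterpart. The class name XsdMappingSketch and its method names are made up for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.OffsetTime;
import java.time.Period;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;

// Hypothetical illustration of the table above; not ODF Toolkit API.
final class XsdMappingSketch {

    /** xsd:time without offset, fractional seconds allowed. */
    static LocalTime localTime(String lexical) {
        return LocalTime.parse(lexical);
    }

    /** xsd:time with an offset such as "+02:00" or "Z". */
    static OffsetTime offsetTime(String lexical) {
        return OffsetTime.parse(lexical);
    }

    /** xsd:duration with all components kept in one object. */
    static javax.xml.datatype.Duration fullDuration(String lexical)
            throws DatatypeConfigurationException {
        return DatatypeFactory.newInstance().newDuration(lexical);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(localTime("10:15:30.25"));
        System.out.println(offsetTime("10:15:30+02:00"));
        System.out.println(fullDuration("P1Y2M3DT4H5M6S"));
        // java.time splits durations: time-based vs. date-based.
        System.out.println(Duration.parse("PT1H30M"));
        System.out.println(Period.parse("P1Y2M3D"));
    }
}
```

java.time.Duration cannot hold years or months and java.time.Period cannot hold hours, which is why the table lists javax.xml.datatype.Duration for full fidelity.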

svanteschubert avatar May 13 '25 18:05 svanteschubert

  • the type "dateOrDateTime" may also refer to https://www.w3.org/TR/xmlschema-2/#dateTime
  • "time"/"date"/"dateTime" may have a time zone but in LibreOffice this is only implemented for meta.xml elements, not (yet?) for spreadsheet cells

mistmist avatar May 14 '25 09:05 mistmist

I'll look over the PR once more. For spreadsheets that would mean to also add "OffsetDate" - the description mentions time zones, but the notation rather matches the offset format - time zones are represented as "CEST", and offsets have an offset like described there.

I think we should hold off on implementing this until LibreOffice implements it and see what they do, since offset times are not really compatible with daylight saving time; you need a zone ID for that to work.

On the other hand, for meta information it is mentioned that UTC has to be used. This should simply work, as that is exactly what java.time.Instant is: a point in time stored in UTC.
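As a small illustration of the Instant idea, assuming a meta value arrives as an ISO date-time with an offset (the class and method names here are hypothetical, not ODF Toolkit API):

```java
import java.time.Instant;
import java.time.OffsetDateTime;

// Hypothetical sketch: normalising a date-time with offset to a UTC instant.
final class MetaInstantSketch {

    /** Parses an ISO date-time with offset and drops the offset by converting to an Instant. */
    static Instant toInstant(String dateTimeWithOffset) {
        return OffsetDateTime.parse(dateTimeWithOffset).toInstant();
    }

    public static void main(String[] args) {
        Instant modified = toInstant("2025-05-18T11:00:08+02:00");
        // Instant.toString() always renders in UTC with a trailing 'Z'.
        System.out.println(modified);
    }
}
```

The original offset is lost in the conversion, which is fine when only the point in time matters.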

xzel23 avatar May 14 '25 10:05 xzel23

The changes have been merged and this can be closed. If LibreOffice decides to support zoned/offset dates and times, this might need some changes but I think this is OK for now.

xzel23 avatar May 18 '25 03:05 xzel23

@xzel23 Could you tell me a bit more about the zoned/offset dates and times? Also, there is still a problem in #372 that is related to this patch; please take a look at https://github.com/tdf/odftoolkit/pull/372#issuecomment-2886401235

svanteschubert avatar May 18 '25 06:05 svanteschubert

ZonedDateTime vs OffsetDateTime

ZonedDateTime and OffsetDateTime are closely related. An OffsetDateTime contains a datetime and an offset from UTC, while a ZonedDateTime contains a datetime and a time zone.

Example using jshell:

jshell> java.time.ZonedDateTime.now().toString()
$1 ==> "2025-05-18T10:59:53.570955+02:00[Europe/Berlin]"

jshell> java.time.OffsetDateTime.now().toString()
$2 ==> "2025-05-18T11:00:08.209576+02:00"

These are nearly the same, both are based on UTC time, and both are two hours ahead of UTC.

But with daylight saving time, the clock shifts twice a year (forward in spring, back in autumn), so when you go back 60 days by subtracting 60 * 24 hours, you get this:

jshell> java.time.ZonedDateTime.now().minusHours(60*24)
$5 ==> 2025-03-19T10:08:02.520534+01:00[Europe/Berlin]

jshell> java.time.OffsetDateTime.now().minusHours(60*24)
$6 ==> 2025-03-19T11:08:12.852106+02:00

Note the time and offset changed for the ZonedDateTime but not for the OffsetDateTime.

But when you go back 60 days, you get this:

jshell> java.time.ZonedDateTime.now().minusDays(60)
$10 ==> 2025-03-19T11:11:41.045632+01:00[Europe/Berlin]

jshell> java.time.OffsetDateTime.now().minusDays(60)
$11 ==> 2025-03-19T11:11:46.848481+02:00

Notice that here, the offset of the ZonedDateTime has changed but its local time stayed the same; the OffsetDateTime again keeps both its time and its offset.

Now the spec talks about dates and times with timezones, but only talks about specifying times in UTC + offset. To me, it is unclear how this should work. What result would you expect in your workbook for the examples above?
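The jshell sessions above depend on the current clock. A reproducible variant with a fixed start time (2025-05-18T11:00 in Europe/Berlin, chosen here only for illustration) could look like this; the class name is made up:

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Reproducible version of the jshell comparison, using a fixed start time.
final class DstArithmeticSketch {

    static final ZonedDateTime ZONED =
            ZonedDateTime.of(2025, 5, 18, 11, 0, 0, 0, ZoneId.of("Europe/Berlin"));
    // Same instant, but carrying only the fixed offset +02:00.
    static final OffsetDateTime OFFSET = ZONED.toOffsetDateTime();

    public static void main(String[] args) {
        // Subtracting hours works on the timeline: the zoned value crosses the
        // DST change, so its local time and offset both shift.
        System.out.println(ZONED.minusHours(60 * 24));  // 2025-03-19T10:00+01:00[Europe/Berlin]
        System.out.println(OFFSET.minusHours(60 * 24)); // 2025-03-19T11:00+02:00

        // Subtracting days works on the local date: the zoned value keeps its
        // local time and only the offset is re-resolved from the zone rules.
        System.out.println(ZONED.minusDays(60));        // 2025-03-19T11:00+01:00[Europe/Berlin]
        System.out.println(OFFSET.minusDays(60));       // 2025-03-19T11:00+02:00
    }
}
```

For the OffsetDateTime both operations give the same result, because a fixed offset carries no daylight saving rules.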

xzel23 avatar May 18 '25 09:05 xzel23

I'll take a look at #372.

xzel23 avatar May 18 '25 09:05 xzel23