hyperformula icon indicating copy to clipboard operation
hyperformula copied to clipboard

Improve date/time parsing and serialization (was: Refactor `parseDateTime` and `stringifyDate`)

Open wojciechczerniak opened this issue 4 years ago • 2 comments

Description

Adding DateTime #229 exposed a problem with =TEXT() function. From what I've seen [1] the TEXT function can do a lot more than date and time. So the first problem is that it's located in DatePlugin.ts:

https://github.com/handsontable/hyperformula/blob/1f11e0766ee6f46ebd26fd8fe927571b6c130551/src/interpreter/plugin/DatePlugin.ts#L27-L29

stringifyDate

That makes an assumption we should have stringifyDate callback as an option. This probably should be a formatter(value: string, format: string) that could handle any format TEXT (and any other) would like to stringify. One callback, because we don't know all the formats and it would be hard to categorize them.

This way we can avoid writing complicated number formatter, percentage formatter, fraction formatter and others [2]. This will save us time and work (at this stage) to implement all possible formats. To achieve compatibility with other spreadsheet apps we will need them or an easy way (described in the docs) how this can be done with custom formatter callback.

parseDateTime

The function that works in reverse to TEXT is =VALUE()[3] which takes a string and parses it to a numerical value. This function should be the main consumer of parseDateTime function in the new form. Or a series of callbacks as we spoke with @izulin

There may be multiple formats that VALUE or other APIs (setCellValue, buildFromArray) may have to parse. To not include everything into the engine we can expose a parser function and let the developers fill the gap with a module (moment.js and similar) that perfectly fits their needs.

VALUE is locale dependant as well so this will be the best solution and allow end-developers to implement their own culture logic.

XL: obraz

ODFF: obraz

And more: https://docs.oasis-open.org/office/OpenDocument/v1.3/csprd02/part4-formula/OpenDocument-v1.3-csprd02-part4-formula.html#__RefHeading__1018410_715980110

Basically, the currency and dates and fractions are the same to TEXT but the function works other way, parses instead formatting. It does not accept format, just tryies all the formats that are known to the spreadsheet app and returns value.

TODO

  • [ ] Implement TEXT function that can handle all formats (with external callback)
  • [ ] Implement VALUE function that can handle the same formats (with external callback)

Links

[1] https://support.office.com/en-us/article/TEXT-function-20D5AC4D-7B94-49FD-BB38-93D29371225C [2] https://help.libreoffice.org/6.4/en-US/text/shared/01/05020301.html [3] https://docs.oasis-open.org/office/OpenDocument/v1.3/csprd02/part4-formula/OpenDocument-v1.3-csprd02-part4-formula.html#__RefHeading__1018410_715980110

wojciechczerniak avatar Apr 21 '20 19:04 wojciechczerniak

Use case idea, as pseudo code. Use moment.js, and any other helper lib to parse/stringify the values.

parseString: function (string) {
    const date = moment(string, formats, true)
    if (date.isValud()) {
        const value = { year: date.getYear(), month: date.getMonth(), day: date.getDay() }
        return { value: value, type: DATETIME_TYPE }
    }

    const currency = accounting.parse(string, format)
    if (currency.valid) {
        const value = { amount: currency.amount. currency: currency.currency }
        return { value: value, type: CURRENCY_TYPE }
    }

    const number = parseInt(string)
    if (isNumber(number)) {
        return { value: number, type: NUMBER_TYPE }
    }

    // it's just a string, or unknown format
    return { value: string, type: STRING_TYPE }
}

It's much easier the other way round, we already know the cell type:

stringifyValue: function(value, type, format) {
    switch (type) {
        case NUMBER_TYPE: 
            return value.toString()
        case CURRENCY_TYPE: 
            return accounting.format(value, format)
        case DATE_TYPE:
        case TIME_TYPE:
        case DATETIME_TYPE:
            return moment(value).format(format)
        case STRING_TYPE: 
            return value
    }
}

wojciechczerniak avatar Oct 26 '21 09:10 wojciechczerniak

Also linking use case #844 where current implementation can't handle ISO format. ATM we're arbitrarily assuming that date and time are separated with a white-space.

wojciechczerniak avatar Oct 26 '21 09:10 wojciechczerniak