Performance problems with "format".

Open oderwat opened this issue 6 months ago • 2 comments

We have a use case where we need to convert 13 million datetimes to strings of the form YYYY-MM-DD HH:MM:SS (DATETIME in MySQL). The format can also carry a fractional part, but we do not use it. So we used format(), which took longer than expected. Moving to naive_datetime was not much better. We profiled the code and found that format was the single function that used a lot of time. Looking further, we found that it creates a regexp per call. Because of the opaque types, we wrote an (unsafe) Erlang function that builds the string and got a speedup of a factor of 4+.

I think the module should probably check for some common patterns and implement them natively. Something like "YYYY-MM-DD HH:MM:SS", "YYYY-MM-DDTHH:MM:SS" (without the fractional part), and "YYYY-MM-DDTHH:MM:SSZ" comes to mind. Having a converter to #(#(Y, M, D), #(H, M, S)) would probably also help a lot when actually using this in production code.

oderwat, Jun 11 '25 14:06

Hello! Yes, this certainly could be made better. The library provides some format templates that should be implemented natively, as you said. More templates should also be added. When I wrote the formatting logic, I had only a light need for it, so I did not spend much time making it as performant as it could be. Making it better is on the todo list. If you find enough reason to fork the library and make some improvements for your use case (git dependencies make this really easy), feel free to upstream them!

If I added the following function, would it allow you to do everything you need to do in pure Gleam?

let #(
  calendar.Date(year:, month:, day:),
  calendar.TimeOfDay(hours:, minutes:, seconds:, nanoseconds: _),
  offset,
) = datetime.to_calendar_parts(my_datetime)

jrstrunk, Jun 11 '25 23:06

Adding such a function should be enough, yes. We implemented our own converter for the epoch-day-to-date calculation. I hope we got it right; it passes our tests and has already been used on a broad range of dates without problems.
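A hand-written epoch-day converter can be cross-checked against Erlang's standard calendar module. The sketch below is not the converter mentioned above (that code is not shown in this thread); the module name epoch_days and the function to_date/1 are made up for illustration. It assumes days are counted from 1970-01-01 and relies on calendar:gregorian_days_to_date/1, which counts days from year 0.

```erlang
-module(epoch_days).
-export([to_date/1]).

%% Number of days between Erlang's Gregorian day 0 (0000-01-01)
%% and the Unix epoch (1970-01-01).
-define(EPOCH_GREGORIAN_DAYS, 719528).

%% Converts a day count relative to 1970-01-01 into {Year, Month, Day}.
%% Hypothetical helper, intended as a reference to test a custom
%% converter against, not as a replacement for it.
to_date(DaysSinceEpoch) when is_integer(DaysSinceEpoch),
                             DaysSinceEpoch >= -?EPOCH_GREGORIAN_DAYS ->
    calendar:gregorian_days_to_date(DaysSinceEpoch + ?EPOCH_GREGORIAN_DAYS).
```

Comparing the custom converter with epoch_days:to_date/1 over a wide range of day counts (including leap years and century boundaries) would give some extra confidence that the arithmetic is right.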

And then we wrote something like this:

build_datetime_db_string(Year, Month, Day, Hours, Minutes, Seconds) ->
    Y1 = Year div 1000,
    Y2 = (Year rem 1000) div 100,
    Y3 = (Year rem 100) div 10,
    Y4 = Year rem 10,

    M1 = Month div 10,
    M2 = Month rem 10,

    D1 = Day div 10,
    D2 = Day rem 10,

    H1 = Hours div 10,
    H2 = Hours rem 10,

    Min1 = Minutes div 10,
    Min2 = Minutes rem 10,

    S1 = Seconds div 10,
    S2 = Seconds rem 10,

    % Build binary with single quotes (39) around the whole datetime
    % Format: 'YYYY-MM-DD HH:MM:SS'
    <<39, (Y1 + 48), (Y2 + 48), (Y3 + 48), (Y4 + 48), 45,
      (M1 + 48), (M2 + 48), 45,
      (D1 + 48), (D2 + 48), 32,
      (H1 + 48), (H2 + 48), 58,
      (Min1 + 48), (Min2 + 48), 58,
      (S1 + 48), (S2 + 48), 39>>.

We used io:format before, but that was still too slow for my taste. The function above outputs exactly what we need, but it is pretty specialized. With your extra function, we could get the calendar types and then either keep using the same formatter or see whether Gleam is "good enough".
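For comparison, a format-string version might look roughly like the sketch below. This is an assumption about what such code could look like, not the code that was actually used; the module name db_datetime_fmt and the function format/6 are hypothetical. It uses io_lib:format/2, which returns the formatted text instead of printing it.

```erlang
-module(db_datetime_fmt).
-export([format/6]).

%% Builds 'YYYY-MM-DD HH:MM:SS' (including the single quotes) with a
%% format string. "~4..0B" means: integer in base 10, field width 4,
%% padded with the character 0.
format(Year, Month, Day, Hours, Minutes, Seconds) ->
    Chars = io_lib:format("'~4..0B-~2..0B-~2..0B ~2..0B:~2..0B:~2..0B'",
                          [Year, Month, Day, Hours, Minutes, Seconds]),
    %% io_lib:format/2 returns a (possibly nested) character list,
    %% so flatten it into a binary for a like-for-like comparison.
    iolist_to_binary(Chars).
```

A plausible reason this is slower than the hand-built binary is that the format string is interpreted on every call and the result goes through an intermediate list, whereas the binary version does only integer arithmetic and a single binary construction.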

Later we looked more closely at the Erlang calendar module and then created our own modules. We require some specific operations that Erlang provides, and we also use the types in our database code generator, which means we can implement decoders for them in the same module. The operations include adding a day offset (for example, subtracting 3 weeks with -21), finding the day of the week, calculating the last Monday for a date, getting the ISO week (Germany loves that), and getting the start and end of the month for a given date. Stuff like that.
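The operations listed above map fairly directly onto functions in Erlang's calendar module. The following is a minimal sketch of how they could be wrapped, not the modules described in this comment; the module name date_ops and all function names are hypothetical, and "last Monday" is assumed to mean the most recent Monday on or before the given date.

```erlang
-module(date_ops).
-export([add_days/2, day_of_week/1, last_monday/1, iso_week/1, month_bounds/1]).

%% Adds a (possibly negative) number of days to a {Year, Month, Day} date.
add_days(Date, Offset) ->
    Days = calendar:date_to_gregorian_days(Date),
    calendar:gregorian_days_to_date(Days + Offset).

%% Day of the week as an integer: 1 = Monday ... 7 = Sunday.
day_of_week(Date) ->
    calendar:day_of_the_week(Date).

%% Most recent Monday on or before Date (assumed meaning of "last Monday").
last_monday(Date) ->
    add_days(Date, 1 - day_of_week(Date)).

%% ISO 8601 week as {IsoYear, WeekNumber}.
iso_week(Date) ->
    calendar:iso_week_number(Date).

%% First and last day of the month that Date falls in.
month_bounds({Year, Month, _Day}) ->
    LastDay = calendar:last_day_of_the_month(Year, Month),
    {{Year, Month, 1}, {Year, Month, LastDay}}.
```

For example, for {2025, 6, 11} (a Wednesday), date_ops:add_days/2 with -21 gives {2025, 5, 21}, date_ops:last_monday/1 gives {2025, 6, 9}, and date_ops:iso_week/1 gives {2025, 24}.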

I still think that improving the performance of tempo goes a long way for code that is less specialized than our database-focused use case. And after all, we only target Erlang!

oderwat, Jun 12 '25 00:06