HowardHinnant.github.io
HowardHinnant.github.io copied to clipboard
Days algorithms
Found your algorithms days_from_civil and civil_from_days (http://howardhinnant.github.io/date_algorithms.html). l saw their use in the hffix (https://github.com/jamesdbrock/hffix) and Microsoft STL (https://github.com/microsoft/STL) libraries. Based on the description on the site, it seems that this is the fastest of the existing implementations, but this is not the case. The gmtime implementation in musl libc (https://github.com/ifduyue/musl) is 40% more efficient. I think it is worth writing that there are more efficient implementations, and that it is worth using them.
Can you point more specifically to the algorithm? I found https://github.com/ifduyue/musl/blob/cfdfd5ea3ce14c6abf7fb22a531f3d99518b5a1b/src/time/__secs_to_tm.c, but I have doubts that is what you're referring to.
it's him.
You sure? I haven't timed it, but I see iteration in __secs_to_tm
.
The next thing I'm trying to find is a claim that the algorithms on http://howardhinnant.github.io/date_algorithms.html are the fastest of the existing implementations. At http://howardhinnant.github.io/date_algorithms.html I see "fast", but not "fastest".
Also check out my stars page: https://github.com/HowardHinnant?tab=stars Take a look at anything by cassioneri. I'm pretty sure I have __secs_to_tm
beat. And I know the work at cassioneri has mine beat. He does sacrifice some range, but nothing that the C++20 chrono spec doesn't allow. And he's gotten his implementation into the gcc C++20 implementation: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/chrono#L2448-L2513 It's really nice work.
I got around to doing some comparative timing between https://github.com/ifduyue/musl/blob/cfdfd5ea3ce14c6abf7fb22a531f3d99518b5a1b/src/time/__secs_to_tm.c and civil_from_days.
Problem: These two algorithms do different things.
-
__secs_to_tm
computes year, month, day, weekday, day-of-year [0, 365], hour, minute and second. -
civil_from_days
computes year, month and day.
So to do a fair comparison, I did it two ways:
- I wrote the equivalent of
__secs_to_tm
using date.h which usescivil_from_days
under the hood. - I stripped out the compuation of weekday, day-of-year, hour, minute and second from both function in the first comparison and re-compared.
Here is the first comparison:
#include <time.h>
#include <limits.h>
/* 2000-03-01 (mod 400 year, immediately after feb29 */
#define LEAPOCH (946684800LL + 86400*(31+29))
#define DAYS_PER_400Y (365*400 + 97)
#define DAYS_PER_100Y (365*100 + 24)
#define DAYS_PER_4Y (365*4 + 1)
int __secs_to_tm(long long t, volatile tm *tm)
{
long long days, secs, years;
int remdays, remsecs, remyears;
int qc_cycles, c_cycles, q_cycles;
int months;
int wday, yday, leap;
static const char days_in_month[] = {31,30,31,30,31,31,30,31,30,31,31,29};
// Reject time_t values whose year would overflow int
if (t < INT_MIN * 31622400LL || t > INT_MAX * 31622400LL)
return -1;
secs = t - LEAPOCH;
days = secs / 86400;
remsecs = secs % 86400;
if (remsecs < 0) {
remsecs += 86400;
days--;
}
wday = (3+days)%7;
if (wday < 0) wday += 7;
qc_cycles = days / DAYS_PER_400Y;
remdays = days % DAYS_PER_400Y;
if (remdays < 0) {
remdays += DAYS_PER_400Y;
qc_cycles--;
}
c_cycles = remdays / DAYS_PER_100Y;
if (c_cycles == 4) c_cycles--;
remdays -= c_cycles * DAYS_PER_100Y;
q_cycles = remdays / DAYS_PER_4Y;
if (q_cycles == 25) q_cycles--;
remdays -= q_cycles * DAYS_PER_4Y;
remyears = remdays / 365;
if (remyears == 4) remyears--;
remdays -= remyears * 365;
leap = !remyears && (q_cycles || !c_cycles);
yday = remdays + 31 + 28 + leap;
if (yday >= 365+leap) yday -= 365+leap;
years = remyears + 4*q_cycles + 100*c_cycles + 400LL*qc_cycles;
for (months=0; days_in_month[months] <= remdays; months++)
remdays -= days_in_month[months];
if (months >= 10) {
months -= 12;
years++;
}
if (years+100 > INT_MAX || years+100 < INT_MIN)
return -1;
tm->tm_year = years + 100;
tm->tm_mon = months + 2;
tm->tm_mday = remdays + 1;
tm->tm_wday = wday;
tm->tm_yday = yday;
tm->tm_hour = remsecs / 3600;
tm->tm_min = remsecs / 60 % 60;
tm->tm_sec = remsecs % 60;
return 0;
}
#include "date/date.h"
int
secs_to_tm(long long t, volatile tm *tm)
{
using namespace date;
using namespace std::chrono;
constexpr sys_seconds min = sys_days{year::min()/1/1};
constexpr sys_seconds max = sys_days{year::max()/12/31};
sys_seconds tp{seconds{t}};
if (tp < min || tp > max)
return -1;
auto tp_days = floor<days>(tp);
year_month_day ymd = tp_days;
tm->tm_year = int{ymd.year()} - 1900;
tm->tm_mon = unsigned{ymd.month()} - 1;
tm->tm_mday = unsigned{ymd.day()};
tm->tm_wday = weekday{tp_days}.c_encoding();
tm->tm_yday = (tp_days - sys_days{ymd.year()/1/1})/days{1};
hh_mm_ss tod{tp - tp_days};
tm->tm_hour = tod.hours().count();
tm->tm_min = tod.minutes().count();
tm->tm_sec = tod.seconds().count();;
return 0;
}
#include <cassert>
#include <iostream>
bool
operator==(struct tm& x, struct tm& y)
{
return x.tm_year == y.tm_year &&
x.tm_mon == y.tm_mon &&
x.tm_mday == y.tm_mday &&
x.tm_wday == y.tm_wday &&
x.tm_yday == y.tm_yday &&
x.tm_hour == y.tm_hour &&
x.tm_min == y.tm_min &&
x.tm_sec == y.tm_sec;
}
int
main()
{
using namespace date;
using namespace std::chrono;
auto constexpr m = (sys_days{year::min()/1/1} - sys_seconds{})/1s;
auto constexpr M = (sys_days{year::max()/12/31} - sys_seconds{})/1s;
int i = 0;
int count = 0;
auto tp0 = steady_clock::now();
for (auto t = m; t <= M; t += 10'000)
{
struct tm tm1{};
struct tm tm2{};
assert(__secs_to_tm(t, &tm1) == secs_to_tm(t, &tm2));
assert(tm1 == tm2);
++count;
}
auto tp1 = steady_clock::now();
volatile tm tm1{};
for (auto t = m; t <= M; t += 10'000)
{
i += __secs_to_tm(t, &tm1);
}
auto tp2 = steady_clock::now();
volatile tm tm2{};
for (auto t = m; t <= M; t += 10'000)
{
i += secs_to_tm(t, &tm2);
}
auto tp3 = steady_clock::now();
using ps = duration<int64_t, std::pico>;
std::cout << (tp1-tp0)/count << '\n';
std::cout << (tp2-tp1)/count << '\n';
std::cout << (tp3-tp2)/count << '\n';
return i;
}
compiled with clang++ -03 on a 2013, 2.4 GHz Intel Core i5, MacBook Pro.
The test first warms everything up and ensures that both functions are getting identical results. Results are spot checked every 10,000s between -32767-01-01 and 32767-12-31. Then the test repeats for timing purposes, looping over each function. Care is used to ensure that the optimizer doesn't eliminate the subject being tested. Results are printed out in terms of nanoseconds per iteration.
My results for this test are:
59ns
28ns
29ns
Meaning the correctness test took 59ns, __secs_to_tm
took 28ns and the date.h version took 29ns. __secs_to_tm
performed much better than I thought it would!
Here's what got commented out for part 2: Just compute year, month and day:
#include <time.h>
#include <limits.h>
/* 2000-03-01 (mod 400 year, immediately after feb29 */
#define LEAPOCH (946684800LL + 86400*(31+29))
#define DAYS_PER_400Y (365*400 + 97)
#define DAYS_PER_100Y (365*100 + 24)
#define DAYS_PER_4Y (365*4 + 1)
int __secs_to_tm(long long t, volatile tm *tm)
{
long long days, secs, years;
int remdays, remsecs, remyears;
int qc_cycles, c_cycles, q_cycles;
int months;
int wday, yday, leap;
static const char days_in_month[] = {31,30,31,30,31,31,30,31,30,31,31,29};
// Reject time_t values whose year would overflow int
if (t < INT_MIN * 31622400LL || t > INT_MAX * 31622400LL)
return -1;
secs = t - LEAPOCH;
days = secs / 86400;
remsecs = secs % 86400;
if (remsecs < 0) {
// remsecs += 86400;
days--;
}
// wday = (3+days)%7;
// if (wday < 0) wday += 7;
qc_cycles = days / DAYS_PER_400Y;
remdays = days % DAYS_PER_400Y;
if (remdays < 0) {
remdays += DAYS_PER_400Y;
qc_cycles--;
}
c_cycles = remdays / DAYS_PER_100Y;
if (c_cycles == 4) c_cycles--;
remdays -= c_cycles * DAYS_PER_100Y;
q_cycles = remdays / DAYS_PER_4Y;
if (q_cycles == 25) q_cycles--;
remdays -= q_cycles * DAYS_PER_4Y;
remyears = remdays / 365;
if (remyears == 4) remyears--;
remdays -= remyears * 365;
// leap = !remyears && (q_cycles || !c_cycles);
// yday = remdays + 31 + 28 + leap;
// if (yday >= 365+leap) yday -= 365+leap;
years = remyears + 4*q_cycles + 100*c_cycles + 400LL*qc_cycles;
for (months=0; days_in_month[months] <= remdays; months++)
remdays -= days_in_month[months];
if (months >= 10) {
months -= 12;
years++;
}
if (years+100 > INT_MAX || years+100 < INT_MIN)
return -1;
tm->tm_year = years + 100;
tm->tm_mon = months + 2;
tm->tm_mday = remdays + 1;
// tm->tm_wday = wday;
// tm->tm_yday = yday;
//
// tm->tm_hour = remsecs / 3600;
// tm->tm_min = remsecs / 60 % 60;
// tm->tm_sec = remsecs % 60;
return 0;
}
#include "date/date.h"
int
secs_to_tm(long long t, volatile tm *tm)
{
using namespace date;
using namespace std::chrono;
constexpr sys_seconds min = sys_days{year::min()/1/1};
constexpr sys_seconds max = sys_days{year::max()/12/31};
sys_seconds tp{seconds{t}};
if (tp < min || tp > max)
return -1;
auto tp_days = floor<days>(tp);
year_month_day ymd = tp_days;
tm->tm_year = int{ymd.year()} - 1900;
tm->tm_mon = unsigned{ymd.month()} - 1;
tm->tm_mday = unsigned{ymd.day()};
// tm->tm_wday = weekday{tp_days}.c_encoding();
// tm->tm_yday = (tp_days - sys_days{ymd.year()/1/1})/days{1};
//
// hh_mm_ss tod{tp - tp_days};
// tm->tm_hour = tod.hours().count();
// tm->tm_min = tod.minutes().count();
// tm->tm_sec = tod.seconds().count();;
return 0;
}
And now the results are:
41ns
21ns
16ns
I suppose that the most major bottleneck could be division, since division is significantly slower than addition, subtraction, multiplication.
The musl __secs_to_tm
uses a lookup table instead of doing integer arithmetic to find out the month.
Here's some of my thoughts:
- Tuning the constants so that the divisor is power-of-2 may be beneficial, since the division can be optimized to a bitshift. I used an Excel spreadsheet to search for candidates and found the following:
// days from month
(1958 * (m > 2 ? m - 3 : m + 9) + 35) / 64;
// month from days
(535 * doy + 333) / 16384;
-
__secs_to_tm
uses linear search in the lookup table, perhaps a binary search can be used instead?
{
static const int days_month_end[] = {31,61,92,122,153,184,214,245,275,306,337,366};
int month_lbound = 0, month_rbound = 11;
while (month_lbound < month_rbound) {
month_middle = (month_lbound + month_rbound) / 2;
int days_middle = days_month_accumulate[month_middle];
if (remdays >= days_middle) {
month_lbound = month_middle + 1;
} else {
month_rbound = month_middle;
}
}
month = month_lbound;
remdays -= days_in_month[month];
}
yeah I confirmed @yangbowen's formulae, and improved the second one slightly. here are what i believe are the most optimized versions:
(535 * doy + 332) >> 14
(979 * mp + 16) >> 5
See this repository for some impressive work in this area: https://github.com/cassioneri/eaf