anytime
anytime copied to clipboard
R session crashes with nonsensical values on Windows
This crashes the R session on Windows:
anytime::anytime(c("2.343423423", "3.435435345"))
It works fine with useR = TRUE:
anytime::anytime(c("2.343423423", "3.435435345"), useR = TRUE)
# [1] NA NA
and also on other platforms.
Using R 3.5 on Windows 7, with latest anytime.
Arrggh. If it changes with useR=TRUE then it is clearly the Boost parser, and there is likely little to nothing we can do :-/.
FWIW, on Linux,
R> anytime::anytime(c("2.343423423", "3.435435345"))
[1] NA NA
R>
and I presume your macOS box is fine too.
Try adding anytime::setDebug(1) to see how 'far' we get. Here:
R> anytime:::setDebug(1)
[1] TRUE
R> anytime::anytime(c("2.343423423", "3.435435345"))
before tests: 2.343423423
In: 2.343423423 out: 2.343423423 and
s: 2.343423423 len: 11 res: 0
s: len: 0 res: 0
One: 2.343423423 two:
before parse: 2.343423423
before tests: 3.435435345
In: 3.435435345 out: 3.435435345 and
s: 3.435435345 len: 11 res: 0
s: len: 0 res: 0
One: 3.435435345 two:
before parse: 3.435435345
[1] NA NA
R>
Crashing Example, next try
Sorry, I messed something up with the 'reproducible' example, because it is not. Here is a reproducible one.
This crashes on Win 7, I checked on two machines:
x <- c("1.26453318526932")
anytime::anytime(x)
The same machines produce an NA with useR=TRUE
x <- c("1.26453318526932")
anytime::anytime(x, useR=TRUE)
# [1] NA
However on the Mac (and probably on Linux, I didn't try) I get a result for with the Boost parser.
> x <- c("1.26453318526932")
> anytime::anytime(x)
[1] "5331-01-28 05:33:00 CET"
Debug
Now, for the debugging, I get on Windows:
> anytime:::setDebug(1)
[1] TRUE
> x <- "1.26453318526932"
> anytime::anytime(x)
before tests: 1.26453318526932
In: 1.26453318526932 out: 1.26453318526932 and
s: 1.26453318526932 len: 16 res: 0
s: len: 0 res: 0
One: 1.26453318526932 two:
before parse: 1.26453318526932
which looks similar to the Mac, where it doesn't crash:
> anytime:::setDebug(1)
[1] TRUE
> x <- "1.26453318526932"
> anytime::anytime(x)
before tests: 1.26453318526932
In: 1.26453318526932 out: 1.26453318526932 and
s: 1.26453318526932 len: 16 res: 0
s: len: 0 res: 0
One: 1.26453318526932 two:
before parse: 1.26453318526932
[1] "5331-01-28 05:33:00 CET"
Thanks. I later realized the debug is not that useful -- that I just the string parsing.
Linux result is the same as the macOS result.
R Session is also crashing in this case
anytime::anytime("9999-01-01")
With anytime:::setDebug(1) I get sinking output in a file
before tests: 9999-01-01
In: 9999-01-01 out: 9999-01-01 and
s: 9999-01-01 len: 10 res: 0
s: len: 0 res: 0
One: 9999-01-01 two:
before parse: 9999-01-01
However, this is working
anytime::anytime("1990-02-03")
#> [1] "1990-02-03 CET"
is this related or should I open a new issue on this one ?
Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#> setting value
#> version R version 3.5.1 (2018-07-02)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate French_France.1252
#> tz Europe/Paris
#> date 2018-09-29
#> Packages -----------------------------------------------------------------
#> package * version date source
#> anytime 0.3.1.1 2018-09-26 Github (eddelbuettel/anytime@841d7fd)
#> backports 1.1.2 2017-12-13 CRAN (R 3.5.0)
#> base * 3.5.1 2018-07-02 local
#> compiler 3.5.1 2018-07-02 local
#> datasets * 3.5.1 2018-07-02 local
#> devtools 1.13.6 2018-06-27 CRAN (R 3.5.1)
#> digest 0.6.17 2018-09-12 CRAN (R 3.5.1)
#> evaluate 0.11 2018-07-17 CRAN (R 3.5.1)
#> graphics * 3.5.1 2018-07-02 local
#> grDevices * 3.5.1 2018-07-02 local
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.5.1)
#> knitr 1.20 2018-02-20 CRAN (R 3.5.1)
#> magrittr 1.5 2014-11-22 CRAN (R 3.5.1)
#> memoise 1.1.0 2017-04-21 CRAN (R 3.5.1)
#> methods * 3.5.1 2018-07-02 local
#> RApiDatetime 0.0.3 2017-04-02 CRAN (R 3.5.0)
#> Rcpp 0.12.18 2018-07-23 CRAN (R 3.5.1)
#> rmarkdown 1.10 2018-06-11 CRAN (R 3.5.1)
#> rprojroot 1.3-2 2018-01-03 CRAN (R 3.5.1)
#> stats * 3.5.1 2018-07-02 local
#> stringi 1.2.4 2018-07-20 CRAN (R 3.5.1)
#> stringr 1.3.1 2018-05-10 CRAN (R 3.5.1)
#> tools 3.5.1 2018-07-02 local
#> utils * 3.5.1 2018-07-02 local
#> withr 2.1.2 2018-03-15 CRAN (R 3.5.1)
#> yaml 2.2.0 2018-07-25 CRAN (R 3.5.1)
That was already documented in https://github.com/eddelbuettel/anytime/issues/78, and is likely also a Boost issue. Try with useR=TRUE.
yes, It is working using RApiDatetime.
It always surprising to have crash when using a tool. If the bug is in Boost there is nothing that can really be done I guess. Also, this is kind of an edge case to have data with year 9999. Thanks for the quick answer and all the great 📦
I wish this was fixed, but given that it only happens on Windows ... We could report it to the Boost date_library maintainer. If you have a few minutes, I could even send you some old standalone code...
Sorry for the delay. Yes you can send me some infos on this and I can see where to report it. However, I have never look at Boost code and I hope I can help
I presume you have C++ compiler handly on Windows, at a mininum Rtools to work with R, and to install packages like this one from source, correct?
What I have done in the past may have required linking (as Boost is easily available on Linux), but let me see if I can cook something up just using Boost headers and eg the BH package just like anytime does.
Yes it is correct. I have Rtools to build package from source.
See program below can be build with just g++. You need to find where you have BH and set -I on your box. Here I can also rely on system headers and just dp:
edd@rob:/tmp$ g++ -Wall -o boostex boost_date_parser_cmdline.cpp
edd@rob:/tmp$ ./boostex "2018-09-10 10:11:12" "20180910"
Parsed ptime is 2018-Sep-10 10:11:12 -- seconds from epoch are 1536574272 from input 2018-09-10 10:11:12
Parsed ptime is 2018-Sep-10 00:00:00 -- seconds from epoch are 1536537600 from input 20180910
edd@rob:/tmp$
It should crash for you and show a Boost bug. If it doesn't then the issue is R interacting with Boost. Possibly, but now something I expect.
// cf http://stackoverflow.com/a/3787188/143305 and extended
#include <iostream>
#include <boost/date_time.hpp>
namespace bt = boost::posix_time;
const std::locale formats[] = {
std::locale(std::locale::classic(),new bt::time_input_facet("%Y-%m-%d %H:%M:%S")),
std::locale(std::locale::classic(),new bt::time_input_facet("%Y%m%d %H%M%S")),
std::locale(std::locale::classic(),new bt::time_input_facet("%m-%d-%Y %H:%M:%S")),
std::locale(std::locale::classic(),new bt::time_input_facet("%d.%m.%Y %H:%M:%S")),
std::locale(std::locale::classic(),new bt::time_input_facet("%Y-%m-%d")),
std::locale(std::locale::classic(),new bt::time_input_facet("%Y%m%d"))
};
const size_t formats_n = sizeof(formats)/sizeof(formats[0]);
std::time_t pt_to_time_t(const bt::ptime& pt) {
bt::ptime timet_start(boost::gregorian::date(1970,1,1));
bt::time_duration diff = pt - timet_start;
return diff.ticks()/bt::time_duration::rep_type::ticks_per_second;
}
void seconds_from_epoch(const std::string& s) {
bt::ptime pt, ptbase;
for (size_t i=0; pt == ptbase && i < formats_n; ++i) {
std::istringstream is(s);
is.imbue(formats[i]);
is >> pt;
}
if (pt == bt::ptime()) {
std::cerr << "Parse error for " << s << '\n';
exit(-1);
}
std::cout << "Parsed ptime is " << pt << " -- seconds from epoch are "
<< pt_to_time_t(pt) << " from input " << s << '\n';
}
int main(int argc, char *argv[]) {
if (argc < 2) {
std::cerr << "Usage: " << argv[0] << " arg1 [arg2...]\n";
exit(-1);
}
for (int i=1; i<argc; i++) {
seconds_from_epoch(argv[i]);
}
exit(0);
}
It works without any Boost bug
$g++ -Wall -IC:/Users/chris/Documents/R/win-library/3.5/BH/include -o boostex boost_date_parser_cmdline.cpp
$ ./boostex.exe "2018-09-10 10:11:12" "20180910"
Parsed ptime is 2018-Sep-10 10:11:12 -- seconds from epoch are 1536574272 from input 2018-09-10 10:11:12
Parsed ptime is 2018-Sep-10 00:00:00 -- seconds from epoch are 1536537600 from input 20180910
This means it is on the R side ?
Something off during compilation when installing anytime with Rtools ?
Interesting. And yes, something else creeps in from somewhere.
If you could, now please modify you modify your R package to log ie add some printf statements. The key of the conversion is the same. Maybe the issue is somewhere else, and we have a chance to get to it?
Start with something like this from R side, setting the debug boolean in the C++ code:
R> library(anytime)
R> anytime:::setDebug(TRUE)
[1] TRUE
R> anydate("2018-09-30")
before tests: 2018-09-30
In: 2018-09-30 out: 2018-09-30 and
s: 2018-09-30 len: 10 res: 0
s: len: 0 res: 0
One: 2018-09-30 two:
before parse: 2018-09-30
days 17804
[1] "2018-09-30"
R>
Most of this, though, is from the string splitting before we call the Boost routine I suspected, maybe wrongly. But you see the pattern: try to add more conditional output like I had done to trace where it blows up. We may get to it...
OK, I did add some lines like if (debug) Rcpp::Rcout << "some text" << std::endl; everywhere in the code of anytime.cpp, and I was able to find the line that makes R crash
https://github.com/eddelbuettel/anytime/blob/841d7fd046aea953214d8e66fd90fb0b7e5a07dc/src/anytime.cpp#L172
If I put a debug print log just before and just after, the second one do not print. I think it means the error is in localAsTm->tm_isdst*60*60.
If I try to activate one line you have commented https://github.com/eddelbuettel/anytime/blob/841d7fd046aea953214d8e66fd90fb0b7e5a07dc/src/anytime.cpp#L162 In this case, the crash happens at this line.
Is this helping you identify what could be the cause ?
That is awesome work, and quite possible! The tm_isdst is probably not set, or not accessible. If we ifdef that away things will not crash. They may be wrong, though.
So some thinking time required. When/where/how does this become invalid? Years after 2038 maybe?
(Fixed initial typo. 2038 is the end of the 'since epoch counter' under current variable sizes.)
Ie could you test with some reasonable guesses? We year 9999 is bad. Can you triage down? Would 5000 work? 3000? 2038, as suspect?
Sorry for the delay. I made a small for loops to test all the year. I can say for sure that the first year with error is 3002. 2038 is working fine;
The testing code:
years <- seq(3000, 9999, by = 1)
try_conversion <- function(years) {
error <- FALSE
for(year in years) {
message("Year: ", year)
res <- tryCatch(
callr::r(function(year) anytime::anytime(paste0(year, "-01-01")), args = list(year = year)),
error = function(cnd) {
error <<- TRUE
message("Error for year ", year)
})
if (error) {
stop("Error found for year : ", year, call. = FALSE)
}
}
"no error"
}
try_conversion(years)
We are actually in the wrong issue ticket in a way. What about the values @christophsax reported above, ie
anytime::anytime(c("2.343423423", "3.435435345"))
Would be good to have the "smallest" possible set of exclusion. "On Windows" && year >= 3002 works foe me, but may need a little more on the other with years below "some" (to be found) value. Can you loop down too?
About this issue ticket, i get this
anytime::anytime(c("2.343423423", "3.435435345"))
#> [1] NA NA
So it seems to work. About the loop, I tried from 2008 to 9999 currently. Error start at 3002. I will run the loop down too.
We are actually in the wrong issue ticket in a way.
Oh I forgot. Please open a new one if you think it is better for follow up.
Please open a new one if you think it is better for follow up.
I guess the better way would have been to go there and reopen, but it's that we have thousands of people following. And for now we are getting (good!) work done here so let's keep at it.
As for the error @christophsax reported and you no longer see: good. One less error then? I have not made a BH release so it it a little strange that it blew up for him but not for you. Oh well. Maybe a different test. We are getting this "max year on Windows" sorted out. Maybe add a min year too.
No error when I go down but NA value from year 1399.
So currently we have something that already tries to avoid a seg.fault on Windows:
#if defined(_WIN32)
if (totsec > 0) { // on Windows, for dates before 1970-01-01: segfault
dstadj = localAsTm->tm_isdst*60*60;
}
#else
dstadj = localAsTm->tm_isdst*60*60;
#endif
I guess I should try something like
#if defined(_WIN32)
if (totsec > 0 && // on Windows before 1970-01-01: segfault
localAsTm->tm_year >= 1102) { // tm_year is year-1900, so year 3002 is it
dstadj = localAsTm->tm_isdst*60*60;
}
#else
dstadj = localAsTm->tm_isdst*60*60;
#endif
Can you test that, please?
Comparison is wrong, of course. < 1102:
#if defined(_WIN32)
if (totsec > 0 && // on Windows before 1970-01-01: segfault
localAsTm->tm_year < 1102) { // tm_year is year-1900, so year 3002 is it
dstadj = localAsTm->tm_isdst*60*60;
}
#else
dstadj = localAsTm->tm_isdst*60*60;
#endif
I am currently trying but there still an error when I try. I'll keep looking where it comes from and if I am doing everything right.
I built it here. There may have been a copy and paste error. Do you want me to push as branch?
OK. I think localAsTm->tm_year is getting R crashed now. Is there a way to debug by looking inside localAsTm ?
I tried if (debug) Rcpp::Rcout << "Before fix - tm_year: " << localAsTm->tm_year << std::endl; and this does not log before crash.
(And I tested it on Windows per rhub. Something changed in R with tz info now progressing better so I need to check that -- but build is fine, and the new test I added for this is fine. I think :) One can hope.)
As for tm_year etc, man localtime for me says:
Broken-down time is stored in the structure tm, which is defined in <time.h> as follows:
struct tm {
int tm_sec; /* Seconds (0-60) */
int tm_min; /* Minutes (0-59) */
int tm_hour; /* Hours (0-23) */
int tm_mday; /* Day of the month (1-31) */
int tm_mon; /* Month (0-11) */
int tm_year; /* Year - 1900 */
int tm_wday; /* Day of the week (0-6, Sunday = 0) */
int tm_yday; /* Day in the year (0-365, 1 Jan = 0) */
int tm_isdst; /* Daylight saving time */
};
You need to find the header file R uses on Windows and look. Appreciate all your diligence and help!!
if there is a way for me to test the version you've build, I can try it and report. Also, I'll try to find the header file and tell you. Not sure what could be going on here but happy to help !
I am currently tied up with something else but all I did so far (besides incrementing dev version and NEWS.Rd) was this:
diff --git a/src/anytime.cpp b/src/anytime.cpp
index d1a075b..157f13f 100644
--- a/src/anytime.cpp
+++ b/src/anytime.cpp
@@ -168,7 +168,8 @@ double ptToDouble(const bt::ptime & pt, const bool asDate=false) {
// contain the additional DST adjustment
double totsec = tdiff.total_microseconds()/1.0e6, dstadj = 0;
#if defined(_WIN32)
- if (totsec > 0) { // on Windows, for dates before 1970-01-01: segfault
+ if (totsec > 0 && // on Windows before 1970-01-01: segfault
+ localAsTm->tm_year < 1102) { // tm_year is year-1900, so year 3002 is it
dstadj = localAsTm->tm_isdst*60*60;
}
#else
I also added this test which git diff does not yet know about as the file tests/gh_issue_78.R isn't staged yet:
## https://github.com/eddelbuettel/anytime/issues/78 as well as
## https://github.com/eddelbuettel/anytime/issues/76
library(anytime)
anydate("3003-01-01") # crashed on Windows before tm_isdst disable for post 3002 years
stopifnot(anydate("3003-01-01") == as.Date("3003-01-01"))
This is what I also modified in anytime.cpp. I build then installed but I still have an error. 🤔
I found time.h header that I think R use in C:\Rtools\mingw_64\x86_64-w64-mingw32\include\time.h
Not sure what to look for in it - I put the file in this gist
I'll leave it for now as I am not sure I can do more by myself. If you still need my help further, please tell me.
Thank you for your assistance helping me help you.
We can let it rest. I will put this into a proper branch, try win-builder etc pp. Right now I know too little about your error (you gave no details) but sometimes ... sleep and a little distance helps.
We made huge leaps today. I think we got this. So a big Thank you! to you for your patient trials.
But I still get Windows errors when testing (via winbuilder). This is not done.
Maybe I misunderstood you in terms of having run-time vs build-time errors. I think we are at run-time now.
I added another commit to the new branch. It now passes for me on Windows (apart from a vignette issue which is unrelated).
Try this when you have a chance.
Thanks for the branch - I tested it but there is still an issue.
In fact, the issue is from the beginning with anytime::anytime and not with anytime::anydate. At run time, when we try a big year like anytime::anytime("2999-01-01") it works but, then with something like anytime::anytime("3002-01-01"), it crashes.
When anytime::anydate is used, the faulty part of the code I linked in my previous comment is not pass through because asDate argument is TRUE and code returns before in ptToDouble
https://github.com/eddelbuettel/anytime/blob/841d7fd046aea953214d8e66fd90fb0b7e5a07dc/src/anytime.cpp#L154-L157
I think this is why your tests are passing because tests/gh_issue_78.R is using anytime::anydate.
Also, I am a french computer, so default timezone is CET:
anytime:::getTZ()
#> [1] "Europe/Paris"
Sorry if all that was not clear from the start.
more tests
As time is count in seconds, I dig a little further to exact date for which there is a crash between 3001-01-01 and 3002-01-01.
With default timezone (Europe/Paris)
anytime::anytime("3001-01-01 08:59:59")this worksanytime::anytime("3001-01-01 09:00:00")this crashes
If I use asUTC=TRUE, no crash so ptToDoubleUTC in anytime.cpp is working fine.
If I force UTC for R with Sys.setenv(TZ="UTC"), as expected
anytime::anytime("3001-01-01 07:59:59")is okanytime::anytime("3001-01-01 08:00:00")makes R crash
It seems the datetime in seconds that causes crash is 32535244800. I tried to use anytime with numeric to see if it crashes too. anytime::anytime(32535244800) does not crash. So, it crashes when given as character but not as numeric.
I hope this new detailed tests can help find the part that makes R crash.
That is a lot to chew over. Point taken re anytime() not anydate(). Will update the test.
anytime::anytime(32535244800) does not parse as text but it caught earlier and converted directly as a type, so it does not apply.
I will set the year comparison to 3000 then, ie 3000-1900 = 1100 as the cutoff.