
Auto-retry when dataverse is slow

Open mayeulk opened this issue 6 years ago • 3 comments

Often, the Dataverse server is slow and update_icews() stops with an error. It would be great to have an option to relaunch it automatically in these cases (maybe after a delay, specified in seconds). There are at least two types of errors for which relaunching works:

  • Gateway Timeout (HTTP 504).

  • parse error: premature EOF

> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181129-icews-events.zip'
Ingesting records from '20181129-icews-events.tab'
Downloading '20181130-icews-events.zip'
Ingesting records from '20181130-icews-events.tab'
Downloading '20181203-icews-events.zip'
Ingesting records from '20181203-icews-events.tab'
Downloading '20181204-icews-events.zip'
Ingesting records from '20181204-icews-events.tab'
Downloading '20181205-icews-events.zip'
Ingesting records from '20181205-icews-events.tab'
Downloading '20181206-icews-events.zip'
Ingesting records from '20181206-icews-events.tab'
Downloading '20181207-icews-events.zip'
Ingesting records from '20181207-icews-events.tab'
Downloading '20181208-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
parse error: premature EOF
                                       
                     (right here) ------^
> update_icews(dryrun = FALSE); date()
Downloading '20181208-icews-events.zip'
Ingesting records from '20181208-icews-events.tab'
Downloading '20181209-icews-events.zip'
Ingesting records from '20181209-icews-events.tab'
Downloading '20181210-icews-events.zip'
Ingesting records from '20181210-icews-events.tab'
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Ingesting records from '20181211-icews-events.tab'
Downloading '20181212-icews-events.zip'
Ingesting records from '20181212-icews-events.tab'
Downloading '20181213-icews-events.zip'
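
A minimal sketch of how the two retryable errors listed above might be recognised from their messages. The helper name `is_transient_error` is hypothetical (it is not part of icews or dataverse), and the patterns are simply copied from the console output above, so the list is an assumption that may need extending:

```r
# Hypothetical helper: TRUE if an error message looks like one of the
# transient failures seen in the log above (HTTP 504 or a truncated
# JSON response). The pattern list is an assumption, not exhaustive.
is_transient_error <- function(msg) {
  grepl("Gateway Timeout \\(HTTP 504\\)|premature EOF", msg)
}
```

A check like this could be used to decide whether relaunching is worthwhile, so that unrelated errors still stop the update immediately.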

mayeulk avatar Jun 19 '19 16:06 mayeulk

I just launched this in one go:

update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()

This worked... but then Harvard's Dataverse "crashed":


> > update_icews(dryrun = FALSE)
> Downloading '20181215-icews-events.zip'
> Ingesting records from '20181215-icews-events.tab'
> Downloading '20181216-icews-events.zip'
> Ingesting records from '20181216-icews-events.tab'
> Downloading '20181217-icews-events.zip'
> Ingesting records from '20181217-icews-events.tab'
> Downloading '20181218-icews-events.zip'
> Ingesting records from '20181218-icews-events.tab'
> Downloading '20181219-icews-events.zip'
> Ingesting records from '20181219-icews-events.tab'
> Downloading '20181220-icews-events.zip'
> Ingesting records from '20181220-icews-events.tab'
> Downloading '20181221-icews-events.zip'
> Error in get_file(file_ref, get_doi()[[repo]]) : 
>   Internal Server Error (HTTP 500).
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:18 2019"
> > update_icews(dryrun = FALSE); date()
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > update_icews(dryrun = FALSE); date()
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^

Going to https://dataverse.harvard.edu/ gave a "503 - Service Unavailable" page: "Service Unavailable. The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later."

Two or three minutes later, it was working fine again, and update_icews(dryrun = FALSE) worked.

mayeulk avatar Jun 19 '19 16:06 mayeulk

Note: after the (possible) server restart of dataverse.harvard.edu, things went more smoothly, with about 107 downloads, until it stopped again with the following issue: https://github.com/andybega/icews/issues/45#issuecomment-503646348

mayeulk avatar Jun 19 '19 17:06 mayeulk

Hmm. I'm wondering if this is something that should be done in the actual client (https://github.com/IQSS/dataverse-client-r), but at the moment it's not being actively maintained, for lack of a new owner.

Any suggestions for how this should properly be done? Upon encountering one of these errors, retry after waiting for some small amount of time, until either success or some limit is reached?
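
One way the wait-and-retry idea could look, as a rough sketch only. The wrapper `with_retry` and its arguments `max_tries` and `wait` are hypothetical names, not an existing icews or dataverse API; it retries on any error rather than only on the specific ones reported in this issue:

```r
# Hypothetical wrapper: call fn() until it succeeds or max_tries is reached,
# sleeping `wait` seconds between attempts. Retries on *any* error.
with_retry <- function(fn, max_tries = 5, wait = 30) {
  stopifnot(max_tries >= 1, wait >= 0)
  for (i in seq_len(max_tries)) {
    result <- tryCatch(
      list(ok = TRUE, value = fn()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(invisible(result$value))
    }
    message(sprintf("Attempt %d of %d failed: %s",
                    i, max_tries, conditionMessage(result$error)))
    # No point sleeping after the final failed attempt.
    if (i < max_tries) Sys.sleep(wait)
  }
  # All attempts failed: re-signal the last error unchanged.
  stop(result$error)
}
```

It could then be called as `with_retry(function() update_icews(dryrun = FALSE), max_tries = 10, wait = 60)`. This relies on the behaviour visible in the logs above, where each rerun appears to resume at the first file not yet ingested. A fixed delay is the simplest choice; a delay that grows with each attempt might be kinder to the server during an outage like the 503 above.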

andybega avatar Jun 21 '19 08:06 andybega