dataverse-client-r
Creating a dataverse fails with HTTP 400: Bad Request
I sat down to try the Dataverse API and dataverse today and ran into HTTP 400 (Bad Request) codes when calling create_dataverse(). As often happens, I figured out the solution while writing up the issue.
problem
Without the dataverse argument, the expected behavior from create_dataverse() is "a top-level Dataverse is created." I probably don't have the permissions for that on dataverse.harvard.edu, but this isn't relevant for demonstration:
library("dataverse")
> dv <- create_dataverse()
Error in create_dataverse() : Bad Request (HTTP 400).
> traceback()
3: stop(http_condition(x, "error", task = task, call = call))
2: httr::stop_for_status(r) at create_dataverse.R#27
1: create_dataverse()
Code 400. Stepping through with debug(), the request was:
debug at /home/.../dataverse-client-r/R/create_dataverse.R#26: r <- httr::POST(u, httr::add_headers(`X-Dataverse-key` = key), ...)
Browse[2]> r$request
<request>
POST https://dataverse.harvard.edu/api/dataverses
Output: write_memory
Options:
* useragent: libcurl/7.55.1 r-curl/3.1 httr/1.3.1
* post: TRUE
* postfieldsize: 0
Headers:
* Accept: application/json, text/xml, application/xml, */*
* Content-Type:
* X-Dataverse-key: ####-####
Note the empty body and content-type. For completeness, I do have the admin role for the medsl dataverse at harvard.edu, and get the same result with
> r = create_dataverse("medsl")
Error in create_dataverse("medsl") : Bad Request (HTTP 400).
solution
From the Dataverse API docs I see this behavior is actually expected. I'm sending an empty body, and minimally the API requires fields name, alias, and dataverseContacts.
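For reference, a body carrying only those three fields might be built as below. This is a minimal sketch: the name, alias, and email address are hypothetical placeholders, and only the field names come from the API docs. It uses jsonlite to show the JSON that would be sent; no request is made.

```r
library(jsonlite)

# Hypothetical placeholder values; only the three field names come from the API docs.
minimal_body <- list(
  name = "My Test Dataverse",
  alias = "my_test",
  # A list of one-element lists, so that it encodes as a JSON array of objects.
  dataverseContacts = list(
    list(contactEmail = "owner@example.edu")
  )
)

# auto_unbox = TRUE writes length-one vectors as JSON scalars rather than
# one-element arrays.
body_json <- toJSON(minimal_body, auto_unbox = TRUE)
cat(body_json, "\n")
```

Printing the JSON before sending it is a cheap way to check that the body is not empty and that the contacts come out as an array.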
I can confirm this is the issue by sending a request with httr, using the example content from the docs as body content:
api_url = "https://dataverse.harvard.edu/api/dataverses/medsl"
meta = jsonlite::read_json("http://guides.dataverse.org/en/latest/_downloads/dataverse-complete.json")
r <- httr::POST(api_url, httr::add_headers("X-Dataverse-key" = key), body =
meta, encode = "json")
r$status_code
[1] 201
That's a success and the new dataverse appears in the GUI unpublished. This result can be replicated with create_dataverse() using its dots argument, which is passed to httr::POST().
# same body; nb encode='json' is required
r = create_dataverse("medsl", body = meta, encode = "json")
> str(r)
chr "{\"status\":\"OK\",\"data\":{\"id\":3131902,\"alias\":\"science\",\"name\":\"Scientific Research\",\"affiliatio"| __truncated__
And that's also successful!
If create_dataverse() will always require body content, it might be worthwhile to move body into its signature as a new named argument and handle the encoding. Alternatively, the minimal metadata fields (name, alias, dataverseContacts) could appear in the signature, since passing a named list is a little clunky. It looks something like this (.Names elements added by dput):
structure(list(name = "Scientific Research", alias = "science", dataverseContacts = list(structure(list(contactEmail = "[email protected]"), .Names = "contactEmail"), structure(list(contactEmail = "[email protected]"), .Names = "contactEmail")), affiliation = "Scientific Research University", description = "We do all the science.", dataverseType = "LABORATORY"), .Names = c("name", "alias", "dataverseContacts", "affiliation", "description", "dataverseType"))
Thoughts? If you're open to an update I can submit a PR.
@jamesdunham thanks for such a thorough bug report and for being willing to submit a pull request! I'll defer to @leeper on what approach to take.
I think this may have changed at some point, or I never fully built out the code. We definitely want to bring all of these arguments up into the function signature. I'd say all of the fields should be separate arguments, with the non-required fields set by default to NULL. Something like:
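create_dataverse <- function(
  dataverse,
  alias,
  contacts,
  affiliation = NULL,
  description = NULL,
  type = NULL,
  ...
) {
  # `u` (request URL) and `key` (API token) are assumed to be set up as in the existing function
  bod <- list()
  bod$name <- dataverse
  bod$alias <- alias
  # one list(contactEmail = ...) per contact, so the body encodes as a JSON array of objects
  bod$dataverseContacts <- lapply(contacts, function(x) list(contactEmail = x))
  if (!is.null(affiliation)) {
    bod$affiliation <- affiliation
  }
  if (!is.null(description)) {
    bod$description <- description
  }
  if (!is.null(type)) {
    # the API field is dataverseType (as in the docs example); toupper() is the base R function
    bod$dataverseType <- match.arg(toupper(type), c("DEPARTMENT", "JOURNAL", "LABORATORY", "ORGANIZATIONS_INSTITUTIONS", "RESEARCHERS", "RESEARCH_GROUP", "RESEARCH_PROJECTS", "TEACHING_COURSES", "UNCATEGORIZED"))
  }
  # encode = "json" is required for the list body to be sent as JSON
  r <- httr::POST(u, httr::add_headers("X-Dataverse-key" = key), body = bod, encode = "json", ...)
}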
If you want to submit that (or a modification thereof) as a PR along with updated documentation, please do. Otherwise I'll get to it as soon as I can.
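One detail that may be worth checking in any version of this is how dataverseContacts is built from a character vector, since the docs example has it as an array of objects. The sketch below compares two constructions using jsonlite only, with no request sent. The email addresses are hypothetical placeholders.

```r
library(jsonlite)

contacts <- c("first@example.edu", "second@example.edu")  # placeholder addresses

# A flat list with a repeated name: encodes as a single JSON object.
flat <- setNames(as.list(contacts), rep("contactEmail", length(contacts)))

# One single-element list per contact: encodes as a JSON array of objects,
# matching the shape of dataverseContacts in the docs example.
nested <- lapply(contacts, function(x) list(contactEmail = x))

flat_json <- toJSON(flat, auto_unbox = TRUE)
nested_json <- toJSON(nested, auto_unbox = TRUE)

cat("flat:  ", flat_json, "\n")
cat("nested:", nested_json, "\n")
```

The nested form is the one that round-trips to the structure shown in the dput output above (a list of lists, each with a contactEmail element).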