Creating a dataverse fails with HTTP 400: Bad Request

Open jamesdunham opened this issue 6 years ago • 2 comments

I sat down to try the Dataverse API and the dataverse package today and ran into HTTP 400 (Bad Request) errors when calling create_dataverse(). As often happens, I figured out the solution while writing up the issue.

problem

Without the dataverse argument, the expected behavior from create_dataverse() is "a top-level Dataverse is created." I probably don't have the permissions for that on dataverse.harvard.edu, but this isn't relevant for demonstration:

> library("dataverse")
> dv <- create_dataverse()
Error in create_dataverse() : Bad Request (HTTP 400).
> traceback()
3: stop(http_condition(x, "error", task = task, call = call))
2: httr::stop_for_status(r) at create_dataverse.R#27
1: create_dataverse()

Stepping through with debug(), the request that produced the 400 was:

debug at /home/.../dataverse-client-r/R/create_dataverse.R#26: r <- httr::POST(u, httr::add_headers(`X-Dataverse-key` = key), ...)
Browse[2]> r$request
<request>
POST https://dataverse.harvard.edu/api/dataverses
Output: write_memory
Options:
* useragent: libcurl/7.55.1 r-curl/3.1 httr/1.3.1
* post: TRUE
* postfieldsize: 0
Headers:
* Accept: application/json, text/xml, application/xml, */*
* Content-Type:
* X-Dataverse-key: ####-####

Note the empty body (postfieldsize: 0) and the empty Content-Type header. For completeness, I do have the admin role for the medsl dataverse at harvard.edu, and get the same result with

> r = create_dataverse("medsl")
Error in create_dataverse("medsl") : Bad Request (HTTP 400).

solution

From the Dataverse API docs I see this behavior is actually expected. I'm sending an empty body, and minimally the API requires fields name, alias, and dataverseContacts.
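
As a rough sketch of what that minimal body could look like from R, a nested list serialized with jsonlite gives the expected shape. The values below, including the contact address, are placeholders of mine rather than anything from the docs.

```r
library(jsonlite)

# Minimal request body: only the three required fields.
# All values here are illustrative placeholders.
minimal_body <- list(
  name = "Scientific Research",
  alias = "science",
  # unnamed outer list -> JSON array; named inner list -> JSON object
  dataverseContacts = list(list(contactEmail = "pi@example.edu"))
)

# auto_unbox = TRUE writes length-one vectors as scalars, not one-element arrays
minimal_json <- toJSON(minimal_body, auto_unbox = TRUE)
print(minimal_json)
```

Nothing is sent anywhere here; the point is only to see the JSON that would form the request body.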

I can confirm this is the issue by sending a request with httr, using the example content from the docs as body content:

api_url = "https://dataverse.harvard.edu/api/dataverses/medsl"
meta = jsonlite::read_json("http://guides.dataverse.org/en/latest/_downloads/dataverse-complete.json")

r <- httr::POST(api_url, httr::add_headers("X-Dataverse-key" = key), body = meta, encode = "json")
r$status_code
[1] 201

That's a success and the new dataverse appears in the GUI unpublished. This result can be replicated with create_dataverse() using its dots argument, which is passed to httr::POST().

# same body; nb encode='json' is required
r = create_dataverse("medsl", body = meta, encode = "json")  
> str(r)
 chr "{\"status\":\"OK\",\"data\":{\"id\":3131902,\"alias\":\"science\",\"name\":\"Scientific Research\",\"affiliatio"| __truncated__

And that's also successful!
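
For anyone unfamiliar with the mechanism, here is a toy sketch of how arguments travel through the dots. Both functions are hypothetical stand-ins so the example runs offline; the only detail taken from httr is that POST()'s encode argument defaults to "multipart".

```r
# Stand-in for httr::POST(); it only reports what it received.
# Like httr::POST(), it defaults to multipart encoding.
fake_post <- function(url, body = NULL, encode = "multipart") {
  list(url = url, has_body = !is.null(body), encode = encode)
}

# Stand-in for a wrapper such as create_dataverse(): anything it does not
# name itself is forwarded untouched through `...`.
fake_create <- function(dataverse, ...) {
  u <- paste0("https://example.org/api/dataverses/", dataverse)
  fake_post(u, ...)
}

without_dots <- fake_create("medsl")
with_dots <- fake_create("medsl", body = list(name = "x"), encode = "json")
str(with_dots)
```

Without encode = "json" the list would be sent as a multipart form rather than JSON, which is presumably why the call above needs it.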

If create_dataverse() will always require body content, it might be worthwhile to move body into its signature as a new named argument and handle the encoding. Alternatively, the minimal metadata fields (name, alias, dataverseContacts) could appear in the signature, since passing a named list is a little clunky. It looks something like this (.Names elements added by dput):

structure(list(name = "Scientific Research", alias = "science", dataverseContacts = list(structure(list(contactEmail = "[email protected]"), .Names = "contactEmail"), structure(list(contactEmail = "[email protected]"), .Names = "contactEmail")), affiliation = "Scientific Research University", description = "We do all the science.", dataverseType = "LABORATORY"), .Names = c("name", "alias", "dataverseContacts", "affiliation", "description", "dataverseType"))
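
The dput() output is noisier than what one would type by hand. A sketch of the same structure as a plain list() call follows, with placeholder contact addresses since the originals are not shown above:

```r
# Hand-written equivalent of the dput() output; the addresses are placeholders.
meta_plain <- list(
  name = "Scientific Research",
  alias = "science",
  dataverseContacts = list(
    list(contactEmail = "first@example.edu"),
    list(contactEmail = "second@example.edu")
  ),
  affiliation = "Scientific Research University",
  description = "We do all the science.",
  dataverseType = "LABORATORY"
)

# structure(list(...), .Names = ...) builds the same object as naming inline
contact_dput <- structure(list(contactEmail = "first@example.edu"), .Names = "contactEmail")
same_contact <- identical(contact_dput, meta_plain$dataverseContacts[[1]])
print(same_contact)
```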

Thoughts? If you're open to an update I can submit a PR.

jamesdunham • Mar 09 '18 19:03

@jamesdunham thanks for such a thorough bug report and for being willing to submit a pull request! I'll defer to @leeper on what approach to take.

pdurbin • Mar 09 '18 23:03

I think this may have changed at some point, or I never fully built out the code. We definitely want to bring all of these arguments up into the function signature. I'd say all of the fields should be separate arguments, with the non-required fields defaulting to NULL. Something like:

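create_dataverse <- function(
  dataverse,
  alias,
  contacts,
  affiliation = NULL,
  description = NULL,
  type = NULL,
  ...
) {

  # build the request body from the individual arguments
  bod <- list()
  bod$name <- dataverse
  bod$alias <- alias
  # one list per contact, so the field serializes as an array of {"contactEmail": ...} objects
  bod$dataverseContacts <- lapply(unname(contacts), function(x) list(contactEmail = x))
  if (!is.null(affiliation)) {
    bod$affiliation <- affiliation
  }
  if (!is.null(description)) {
    bod$description <- description
  }
  if (!is.null(type)) {
    # the example body from the API docs names this field dataverseType
    bod$dataverseType <- match.arg(toupper(type), c("DEPARTMENT", "JOURNAL", "LABORATORY", "ORGANIZATIONS_INSTITUTIONS", "RESEARCHERS", "RESEARCH_GROUP", "RESEARCH_PROJECTS", "TEACHING_COURSES", "UNCATEGORIZED"))
  }
  # u (request URL) and key (API token) are assumed to be set up as in the existing function
  r <- httr::POST(u, httr::add_headers("X-Dataverse-key" = key), body = bod, encode = "json", ...)
}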

If you want to submit that (or a modification thereof) as a PR along with updated documentation, please do. Otherwise I'll get to it as soon as I can.
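
One detail worth checking in any version of this function is the shape of dataverseContacts. The dput() output earlier in the thread has one list per contact, which serializes to a JSON array of objects, whereas a single flat named list serializes to one JSON object. A small offline comparison, using placeholder addresses:

```r
library(jsonlite)

contacts <- c("first@example.edu", "second@example.edu")  # placeholders

# Flat named list: serializes to a single JSON object
flat <- setNames(as.list(contacts), rep("contactEmail", length(contacts)))

# One small list per contact: serializes to a JSON array of objects
nested <- lapply(contacts, function(x) list(contactEmail = x))

flat_json <- as.character(toJSON(flat, auto_unbox = TRUE))
nested_json <- as.character(toJSON(nested, auto_unbox = TRUE))
cat(flat_json, nested_json, sep = "\n")
```

Only the nested form matches the array-of-objects layout in the example metadata, so it is probably the safer choice for the request body.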

leeper • Mar 10 '18 11:03