
"All inputs to rbind.fill must be data.frames" error on getting project's observations

Open jbohenek opened this issue 10 months ago • 2 comments

I am trying to grab a project's observations, but I am getting an `rbind.fill` error.

``` r
library(rinat)
project_slug <- "unt-herpetology-spr-2025"
inat_data <- get_inat_obs_project(project_slug, type="observations") 
#> 3611 records
#> Getting records 0-200
#> Getting records up to 400
#> Getting records up to 600
#> Getting records up to 800
#> Getting records up to 1000
#> Getting records up to 1200
#> Getting records up to 1400
#> Getting records up to 1600
#> Getting records up to 1800
#> Getting records up to 2000
#> Getting records up to 2200
#> Getting records up to 2400
#> Getting records up to 2600
#> Getting records up to 2800
#> Getting records up to 3000
#> Getting records up to 3200
#> Getting records up to 3400
#> Getting records up to 3600
#> Getting records up to 3800
#> Done.
#> Error: All inputs to rbind.fill must be data.frames
```

Created on 2025-03-08 with reprex v2.1.1

I also noticed that I can grab observations from some projects but not others. How does rinat handle different project settings? The project above has a members-only setting; the one below (presumably) does not. Is it related to this issue? https://forum.inaturalist.org/t/select-observations-to-batch-download-from-list-of-observation-ids/61657

``` r
library(rinat)
project_slug <- "test"
inat_data <- get_inat_obs_project(project_slug, type="observations") 
#> 3 records
#> Getting records 0-200
#> Done.
```

Created on 2025-03-08 with reprex v2.1.1

jbohenek, Mar 08 '25 19:03

Indeed, I can reproduce the issue. It looks like there is a large mismatch between the API's reported `project_observations_count` (currently 4368) and what the website shows (currently 132). I don't know why there is such a large discrepancy.
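To get a feel for what that mismatch means in terms of requests, here is a rough back-of-the-envelope sketch using the two counts above. It assumes pages of 200 records, a page count computed as `recs %/% per_page + 1`, and that the API only returns the records the website shows; the variable names are illustrative, not part of rinat.

```r
# Counts quoted above: what the API reports vs. what the website shows.
reported_count <- 4368
visible_count  <- 132   # assumption: only these records are actually returned
per_page       <- 200

# Pages that would be requested based on the reported count.
pages_requested <- reported_count %/% per_page + 1

# Pages that can actually contain records under the assumption above.
pages_with_data <- ceiling(visible_count / per_page)

# Every remaining page would come back empty.
pages_empty <- pages_requested - pages_with_data

print(c(requested = pages_requested,
        with_data = pages_with_data,
        empty     = pages_empty))
```

Under these assumptions most of the requested pages carry no records, so any cleanup step has to cope with many empty pages, not just one.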

I had previously added a step to cater for a mismatch, but it only works if the last element of the list is the only empty one. In this case, more than one element is empty. I will try to find time to release an update soon, but in the meantime you can use this modified version of the function, `get_inat_obs_project2()`:

``` r
library(jsonlite)
library(httr)
library(plyr)

get_inat_obs_project2 <- function (grpid,
                                   type = c("observations", "info"),
                                   raw = FALSE) {
  # Bail out quietly if there is no connection or the site is unreachable.
  if (!curl::has_internet()) {
    message("No Internet connection.")
    return(invisible(NULL))
  }
  base_url <- "http://www.inaturalist.org/"
  if (httr::http_error(base_url)) {
    message("iNaturalist API is unavailable.")
    return(invisible(NULL))
  }
  argstring <- switch(match.arg(type), observations = "obs", 
                      info = "info")
  # Fetch the project metadata, which includes the reported observation count.
  url <- paste0(base_url, "projects/", grpid, ".json")
  xx <- fromJSON(content(GET(url), as = "text"))
  recs <- xx$project_observations_count
  dat <- NULL
  if (is.null(recs)) {
    return(dat)
  }
  message(paste(recs, "records\n"))
  if (argstring == "info") {
    output <- list()
    output[["title"]] <- xx$title
    output[["description"]] <- xx$description
    output[["slug"]] <- xx$slug
    output[["created_at"]] <- xx$created_at
    output[["id"]] <- xx$id
    output[["location"]] <- c(as.numeric(xx$lat), as.numeric(xx$long))
    output[["place_id"]] <- xx$place_id
    output[["taxa_number"]] <- xx$observed_taxa_count
    output[["taxa_count"]] <- xx$project_observations_count
    if (raw) {
      output[["raw"]] <- xx
    }
    return(output)
  }
  else if (argstring == "obs") {
    per_page <- 200
    # Number of pages to request, based on the reported count.
    # If that count is an exact multiple of per_page, one extra page is
    # requested; empty pages are dropped below.
    if (recs >= 10000) {
      warning("Number of observations in project greater than current API limit.\nReturning the first 10000.\n")
      loopval <- 10000/per_page
    }
    else {
      loopval <- (recs%/%per_page) + 1
    }
    obs_list <- vector("list", loopval)
    for (i in 1:loopval) {
      url1 <- paste0(base_url, "observations/project/", 
                     grpid, ".json?page=", i, "&per_page=", per_page)
      if (i == 1) {
        message(paste0("Getting records 0-", per_page))
      }
      if (i > 1) {
        message(paste0("Getting records up to ", i * 
                         per_page))
      }
      obs_list[[i]] <- fromJSON(content(GET(url1), as = "text"), 
                                flatten = TRUE)
    }
    message("Done.\n")
    # Remove every empty element, not just the last one,
    # so that rbind.fill only receives data frames.
    obs_list <- obs_list[lengths(obs_list) >= 1]
    project_obs <- do.call("rbind.fill", obs_list)
    if (recs != nrow(project_obs)) {
      message("Note: mismatch between number of observations reported and returned by the API.")
    }
    return(project_obs)
  }
}
```

This works for me; hopefully it gets you what you need.
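For anyone who wants to see the failure mode without hitting the API, here is a small offline sketch. The `pages` list is made-up stand-in data, and it assumes that an exhausted page parses to an empty list (which is what `jsonlite::fromJSON("[]")` returns); it is only meant to illustrate why filtering every empty element matters, not to mirror the real response format.

```r
library(plyr)

# Made-up stand-ins for parsed API pages: two pages with records, followed by
# two exhausted pages (assumed to parse to empty lists).
pages <- list(
  data.frame(id = 1:2, species = c("a", "b")),
  data.frame(id = 3L, place = "x"),
  list(),
  list()
)

# Binding everything at once fails, because the empty lists are not data frames.
err_msg <- tryCatch(
  {
    do.call("rbind.fill", pages)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err_msg)

# Dropping every empty element first lets the bind succeed;
# columns missing from a page are filled with NA.
kept <- pages[lengths(pages) >= 1]
combined <- do.call("rbind.fill", kept)
print(combined)
```

Removing only the last element would still leave one empty list in `pages` here, which is the situation described above.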

stragu, Mar 13 '25 07:03

Appreciate it!

jbohenek, Mar 22 '25 18:03