taxize
Extend functions for user specified data sources
via https://github.com/ropensci/taxa/issues/208
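For example, one could define an S3 method for classification():
library(taxize)
# Toy method for a made-up class "foobar"; `...` absorbs any extra arguments passed through the generic
classification.foobar <- function(x, ...) paste0("what about ", x)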
x <- "toads"
class(x) <- "foobar"
classification(x)
#> [1] "what about toads"
So that's easy. But because data can come in so many different shapes, sizes, and locations, the user would have to write all the logic to get the classification data, put it in a data.frame, etc. So there's not really much benefit to extending classification() over just writing your own function.
I think to take advantage of the high-level functions (e.g., classification, synonyms, children, downstream), we'd need to be able to define various things about the custom data source:
- where the data source is (probably only supporting an R object, a file on disk, or a database on disk)
- a mapping of all fields (and what tables they're in if there's more than one table)
- logic to fetch data by taxon ID (which field holds the taxon ID)
- logic to fetch data by taxon name (do we do an exact match, fuzzy search, etc.)
- probably more things ...
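To make the list above concrete, here is a minimal sketch of what such a description could look like for the simplest case, an R data.frame held in memory. Everything in it is hypothetical: `custom_source()`, `fetch_by_id()`, `fetch_by_name()`, `classify_by_id()`, the column names, and the assumption that each row stores its parent's ID are all made up for illustration, and none of this exists in taxize.

```r
# Hypothetical descriptor for a custom data source (not part of taxize).
# `data` is a data.frame; the *_field arguments map its columns to roles.
custom_source <- function(data, id_field, name_field, rank_field, parent_field) {
  stopifnot(is.data.frame(data))
  fields <- c(id = id_field, name = name_field,
              rank = rank_field, parent = parent_field)
  stopifnot(all(fields %in% names(data)))
  structure(list(data = data, fields = fields), class = "custom_source")
}

# Fetch rows by taxon ID (exact match on the mapped ID column).
fetch_by_id <- function(src, id) {
  src$data[src$data[[src$fields[["id"]]]] == id, , drop = FALSE]
}

# Fetch rows by taxon name: exact match by default, approximate via agrepl().
fetch_by_name <- function(src, name, fuzzy = FALSE) {
  nms <- src$data[[src$fields[["name"]]]]
  hit <- if (fuzzy) agrepl(name, nms, ignore.case = TRUE) else nms == name
  src$data[hit, , drop = FALSE]
}

# Build a classification by walking parent IDs up to the root.
# Assumes the root's parent is NA and that the parent links contain no cycles.
classify_by_id <- function(src, id) {
  f <- src$fields
  rows <- list()
  while (!is.na(id)) {
    row <- fetch_by_id(src, id)
    if (nrow(row) != 1) stop("taxon ID not found or not unique: ", id)
    rows <- c(list(row), rows)  # prepend so the root ends up first
    id <- row[[f[["parent"]]]]
  }
  out <- do.call(rbind, rows)
  data.frame(name = out[[f[["name"]]]], rank = out[[f[["rank"]]]],
             id = out[[f[["id"]]]], stringsAsFactors = FALSE)
}

# Toy data with made-up IDs and column names.
toads <- data.frame(
  taxon_id   = c(1L, 2L, 3L),
  sci_name   = c("Anura", "Bufonidae", "Bufo"),
  taxon_rank = c("order", "family", "genus"),
  parent_id  = c(NA, 1L, 2L),
  stringsAsFactors = FALSE
)
src <- custom_source(toads, "taxon_id", "sci_name", "taxon_rank", "parent_id")
classify_by_id(src, 3L)
```

A file or database source would need the same field mapping, with different fetch logic behind it.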
And these high-level functions all revolve around taxonomic IDs, so that there's no ambiguity about what data you want. If the data source doesn't have IDs, it's probably not a good fit.
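A rough way to check that requirement up front, assuming the data source is a data.frame: the helper name `has_usable_ids()` and the column names below are hypothetical, not part of taxize.

```r
# Hypothetical check: does the ID column exist, with no missing or duplicated values?
has_usable_ids <- function(data, id_field) {
  ids <- data[[id_field]]
  !is.null(ids) && length(ids) > 0 && !anyNA(ids) && anyDuplicated(ids) == 0
}

# Toy data with made-up IDs.
ok  <- data.frame(taxon_id = c(1L, 2L), sci_name = c("Anura", "Bufo"))
bad <- data.frame(taxon_id = c(1L, 1L), sci_name = c("Anura", "Bufo"))

has_usable_ids(ok, "taxon_id")   # unique, non-missing IDs
has_usable_ids(bad, "taxon_id")  # duplicated IDs
has_usable_ids(ok, "no_such_id") # column does not exist
```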
Sorry for the delay coming over to this thread! Would love to see if this is possible with the PR2 database. Let me know if there is anything I can do to give this a shot.
thanks @ctekellogg - will let you know when I have something to test or questions