I maintain, along with other awesome people, the taxize R package - a taxonomic toolbelt for R, for interacting with taxonomic data sources on the web.
Taxonomic data is not standardized, but it shares a lot of common elements: there is a finite list of taxonomic ranks and a finite number of major taxonomic data sets. Thus, I’ve been interested in attempting to define a pseudo-standard for expressing taxonomic data in R. The conversation started a while back in a GitHub issue and hasn’t moved very far.
I decided to start playing with this more, which is easier to do in a separate package. Thus: binomen. It’s an attempt to define a set of taxonomic classes/objects in R, along with a suite of functions to help construct and parse these objects.
Would love any/all feedback.
Here are some examples:
Install
Install binomen
install.packages("devtools")
devtools::install_github("ropensci/binomen")
library("binomen")
Make a taxon
Make a taxon object
(obj <- make_taxon(genus="Poa", epithet="annua", authority="L.",
family='Poaceae', clazz='Poales',
kingdom='Plantae', variety='annua'))
#> <taxon>
#> binomial: Poa annua
#> classification:
#> kingdom: Plantae
#> clazz: Poales
#> family: Poaceae
#> genus: Poa
#> species: Poa annua
#> variety: annua
Index to various parts of the object
The binomial
obj$binomial
#> <binomial>
#> genus: Poa
#> epithet: annua
#> canonical: Poa annua
#> species: Poa annua L.
#> authority: L.
The authority
obj$binomial$authority
#> [1] "L."
The classification
obj$classification
#> <classification>
#> kingdom: Plantae
#> clazz: Poales
#> family: Poaceae
#> genus: Poa
#> species: Poa annua
#> variety: annua
The family
obj$classification$family
#> <taxonref>
#> rank: family
#> name: Poaceae
#> id: none
#> uri: none
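To make the `$` chaining above concrete, here is a minimal base R sketch of how a nested object like this could be assembled from plain S3 lists. It is not binomen's implementation: `new_taxonref()` and `new_taxon_sketch()` are hypothetical names invented for this illustration, and the field layout simply mirrors the printed output above.

```r
# Hypothetical constructors for illustration; these are NOT binomen functions.
new_taxonref <- function(rank, name, id = "none", uri = "none") {
  structure(list(rank = rank, name = name, id = id, uri = uri),
            class = "taxonref")
}

new_taxon_sketch <- function(genus, epithet, authority = NULL, ...) {
  ranks <- list(...)                      # e.g. kingdom = "Plantae"
  canonical <- paste(genus, epithet)
  binomial <- structure(
    list(genus = genus, epithet = epithet, canonical = canonical,
         # append the authority only when one was supplied
         species = paste(c(canonical, authority), collapse = " "),
         authority = authority),
    class = "binomial")
  ranks$genus <- genus
  ranks$species <- canonical
  # one taxonref per rank, named by rank so that `$family` etc. work
  classification <- structure(Map(new_taxonref, names(ranks), ranks),
                              class = "classification")
  structure(list(binomial = binomial, classification = classification),
            class = "taxon")
}

x <- new_taxon_sketch("Poa", "annua", authority = "L.",
                      kingdom = "Plantae", family = "Poaceae")
x$binomial$authority
x$classification$family$name
```

Because every level is an ordinary named list with a class attribute, indexing with `$` needs no special methods; only printing would need `print()` methods per class.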
Subset taxon objects
Get a single rank
obj %>% select(family)
#> <taxonref>
#> rank: family
#> name: Poaceae
#> id: none
#> uri: none
Get a range of ranks
obj %>% range(kingdom, family)
#> $kingdom
#> <taxonref>
#> rank: kingdom
#> name: Plantae
#> id: none
#> uri: none
#>
#> $clazz
#> <taxonref>
#> rank: clazz
#> name: Poales
#> id: none
#> uri: none
#>
#> $family
#> <taxonref>
#> rank: family
#> name: Poaceae
#> id: none
#> uri: none
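Selecting a range of ranks only works if ranks have a known order. The base R sketch below shows one way that idea could be expressed, assuming a fixed rank vector. Both `rank_order` and `rank_range()` are hypothetical and made up for this example; the rank table binomen actually uses may well differ.

```r
# Assumed rank ordering, for illustration only.
rank_order <- c("kingdom", "phylum", "clazz", "order", "family",
                "genus", "species", "variety")

# Hypothetical helper: keep the entries of a named list whose ranks fall
# between `from` and `to`, inclusive, in either direction.
rank_range <- function(classif, from, to) {
  idx <- match(c(from, to), rank_order)
  if (anyNA(idx)) stop("unknown rank", call. = FALSE)
  wanted <- rank_order[seq(min(idx), max(idx))]
  classif[names(classif) %in% wanted]
}

classif <- list(kingdom = "Plantae", clazz = "Poales", family = "Poaceae",
                genus = "Poa", species = "Poa annua", variety = "annua")
res <- rank_range(classif, "kingdom", "family")
names(res)
```

Ranks in the requested range that the object does not carry (phylum and order here) are simply skipped, so the result holds kingdom, clazz, and family.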
Extract classification as a data.frame
gethier(obj)
#> rank name
#> 1 kingdom Plantae
#> 2 clazz Poales
#> 3 family Poaceae
#> 4 genus Poa
#> 5 species Poa annua
#> 6 variety annua
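For comparison, flattening a classification into a two-column data.frame takes very little base R when the classification is a plain named list. This is only a sketch of the idea under that assumption, not the code behind `gethier()`; the `classif` list is a stand-in for the object above.

```r
# Stand-in for a classification: a named list of rank -> name.
classif <- list(kingdom = "Plantae", clazz = "Poales", family = "Poaceae",
                genus = "Poa", species = "Poa annua", variety = "annua")

# One row per rank, in the order the ranks are stored.
hier <- data.frame(rank = names(classif),
                   name = unlist(classif, use.names = FALSE),
                   stringsAsFactors = FALSE)
hier
```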
Taxonomic data.frames
Make one
df <- data.frame(
order=c('Asterales','Asterales','Fagales','Poales','Poales','Poales'),
family=c('Asteraceae','Asteraceae','Fagaceae','Poaceae','Poaceae','Poaceae'),
genus=c('Helianthus','Helianthus','Quercus','Poa','Festuca','Holodiscus'),
stringsAsFactors = FALSE)
(df2 <- taxon_df(df))
#> order family genus
#> 1 Asterales Asteraceae Helianthus
#> 2 Asterales Asteraceae Helianthus
#> 3 Fagales Fagaceae Quercus
#> 4 Poales Poaceae Poa
#> 5 Poales Poaceae Festuca
#> 6 Poales Poaceae Holodiscus
Parse - get rank order matching Fagales
df2 %>% select(order, Fagales)
#> order family genus
#> 3 Fagales Fagaceae Quercus
Get rank family matching Asteraceae
df2 %>% select(family, Asteraceae)
#> order family genus
#> 1 Asterales Asteraceae Helianthus
#> 2 Asterales Asteraceae Helianthus
Get rank genus matching Poa
df2 %>% select(genus, Poa)
#> order family genus
#> 4 Poales Poaceae Poa
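The rank/name filters above amount to row subsetting on one column. Here is a rough base R equivalent for readers who want to see the mechanics, using quoted strings rather than the unquoted names shown above. `pick_taxa()` is a hypothetical helper written for this example and is not part of binomen.

```r
df <- data.frame(
  order  = c("Asterales", "Asterales", "Fagales", "Poales", "Poales", "Poales"),
  family = c("Asteraceae", "Asteraceae", "Fagaceae", "Poaceae", "Poaceae", "Poaceae"),
  genus  = c("Helianthus", "Helianthus", "Quercus", "Poa", "Festuca", "Holodiscus"),
  stringsAsFactors = FALSE)

# Hypothetical helper: keep rows where column `rank` equals `name`.
pick_taxa <- function(x, rank, name) {
  stopifnot(rank %in% names(x))
  x[x[[rank]] == name, , drop = FALSE]
}

asters <- pick_taxa(df, "family", "Asteraceae")
poa <- pick_taxa(df, "genus", "Poa")
asters
poa
```

The original row names are kept, which is why the filtered output above still shows rows 1 and 2, or row 4, rather than renumbering from 1.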