Recology

R/etc.

Getting taxonomic names downstream


It can be a pain in the ass to get taxonomic names. For example, I sometimes need to get all the Class names for a set of species. This is a relatively easy problem using the ITIS API (example below).

The much harder problem is getting all the taxonomic names downstream. ITIS doesn’t provide an API method for this. Well, they do (getHierarchyDownFromTSN), but it only returns direct children (e.g., the genera within a tribe, but not all the species within each genus).

So in the taxize package, we wrote a function called downstream, which lets you get taxonomic names down to any rank below a starting taxon (examples further down).

Install packages. You can get other packages from CRAN, but taxize is only on GitHub for now.

# install_github() comes from the devtools package: library(devtools)
# install_github('ritis', 'ropensci') # uncomment if not already installed
# install_github('taxize_', 'ropensci') # uncomment if not already installed
library(ritis)
library(taxize)

Get upstream taxonomic names.

# Search for a TSN by scientific name
df <- searchbyscientificname("Tardigrada")
tsn <- df[df$combinedname %in% "Tardigrada", "tsn"]

# Get just one immediate higher taxonomic name
gethierarchyupfromtsn(tsn = tsn)
  parentName parentTsn rankName  taxonName    tsn
1   Animalia    202423   Phylum Tardigrada 155166
# Get full hierarchy upstream from TSN
getfullhierarchyfromtsn(tsn = tsn)
  parentName parentTsn rankName        taxonName    tsn
1                       Kingdom         Animalia 202423
2   Animalia    202423   Phylum       Tardigrada 155166
3 Tardigrada    155166    Class     Eutardigrada 155362
4 Tardigrada    155166    Class Heterotardigrada 155167
5 Tardigrada    155166    Class   Mesotardigrada 155358
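
To pull out just the Class names mentioned at the top of the post, you can subset the hierarchy data frame by rankName. The snippet below is a minimal sketch that rebuilds the output shown above by hand (so it runs without calling ITIS); the object names hier and class_names are only for illustration.

```r
# Hand-built copy of the hierarchy output shown above, so no API call is needed
hier <- data.frame(
  parentName = c("", "Animalia", "Tardigrada", "Tardigrada", "Tardigrada"),
  parentTsn  = c(NA, 202423, 155166, 155166, 155166),
  rankName   = c("Kingdom", "Phylum", "Class", "Class", "Class"),
  taxonName  = c("Animalia", "Tardigrada", "Eutardigrada",
                 "Heterotardigrada", "Mesotardigrada"),
  tsn        = c(202423, 155166, 155362, 155167, 155358),
  stringsAsFactors = FALSE
)

# Keep only the rows whose rank is Class and return their names
class_names <- hier[hier$rankName == "Class", "taxonName"]
class_names
```

With a real call, the same subsetting step should work on whatever data frame getfullhierarchyfromtsn returns, assuming it keeps the rankName and taxonName columns shown above.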

Get taxonomic names downstream.

# Get genera downstream from the Class Bangiophyceae
downstream(846509, "Genus")
    tsn parentName parentTsn   taxonName rankId rankName
1 11531 Bangiaceae     11530      Bangia    180    Genus
2 11540 Bangiaceae     11530    Porphyra    180    Genus
3 11577 Bangiaceae     11530 Porphyrella    180    Genus
4 11580 Bangiaceae     11530 Conchocelis    180    Genus
# Get families downstream from Acridoidea
downstream(650497, "Family")
      tsn parentName parentTsn      taxonName rankId rankName
1  102195 Acridoidea    650497      Acrididae    140   Family
2  650502 Acridoidea    650497     Romaleidae    140   Family
3  657472 Acridoidea    650497    Charilaidae    140   Family
4  657473 Acridoidea    650497   Lathiceridae    140   Family
5  657474 Acridoidea    650497     Lentulidae    140   Family
6  657475 Acridoidea    650497    Lithidiidae    140   Family
7  657476 Acridoidea    650497   Ommexechidae    140   Family
8  657477 Acridoidea    650497    Pamphagidae    140   Family
9  657478 Acridoidea    650497  Pyrgacrididae    140   Family
10 657479 Acridoidea    650497    Tristiridae    140   Family
11 657492 Acridoidea    650497 Dericorythidae    140   Family
# Get species downstream from Ursus
downstream(180541, "Species")
     tsn parentName parentTsn        taxonName rankId rankName
1 180542      Ursus    180541  Ursus maritimus    220  Species
2 180543      Ursus    180541     Ursus arctos    220  Species
3 180544      Ursus    180541 Ursus americanus    220  Species
4 621850      Ursus    180541 Ursus thibetanus    220  Species
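
For the curious, here is a rough sketch of the idea behind walking downstream when the API only hands back direct children: keep asking for children, level by level, until you hit the rank you want. This is not the actual taxize code. The toy table, its TSNs and names, and the get_children and downstream_sketch functions are all made up for illustration, and the lookup runs against an in-memory data frame rather than ITIS.

```r
# Hypothetical toy taxonomy (made-up TSNs and names), standing in for ITIS
toy <- data.frame(
  tsn       = c(1, 2, 3, 4, 5, 6, 7),
  parentTsn = c(NA, 1, 1, 2, 2, 3, 4),
  taxonName = c("SuperfamilyX", "FamilyA", "FamilyB",
                "GenusA1", "GenusA2", "GenusB1", "SpeciesA1a"),
  rankName  = c("Superfamily", "Family", "Family",
                "Genus", "Genus", "Genus", "Species"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a "direct children only" lookup.
# Accepts a vector of TSNs for convenience; a real API call would
# probably have to be made once per TSN.
get_children <- function(tsn) {
  toy[!is.na(toy$parentTsn) & toy$parentTsn %in% tsn, ]
}

# Walk down one level at a time until taxa of rank `downto` are found
downstream_sketch <- function(tsn, downto) {
  found <- toy[0, ]          # empty data frame with the right columns
  frontier <- tsn            # taxa whose children still need fetching
  while (length(frontier) > 0) {
    kids <- get_children(frontier)
    hit <- kids$rankName == downto
    found <- rbind(found, kids[hit, ])   # keep taxa at the target rank
    frontier <- kids$tsn[!hit]           # keep descending through the rest
  }
  found
}

genera <- downstream_sketch(1, "Genus")
genera$taxonName
```

The loop stops descending a branch as soon as it reaches the target rank, so asking for genera never fetches species.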

The .Rmd file used to create this post, and the .md file generated from it, are available at my GitHub account.

Written in Markdown, with help from knitr.
