R | Recology

Visualizing rOpenSci collaboration

We (rOpenSci) have been writing code for R packages for a couple years, so it is time to take a look back at the data. What data you ask? The commits data from GitHub ~ data that records who did what and when. Using the Github commits API we can gather data on who commited code to a Github repository, and when they did it. Then we can visualize this hitorical record. ...

Getting a simple tree via NCBI

I was just at the Phylotastic hackathon in Tucson, AZ at the iPlant facilities at the UofA. A problem that needs to be solved is getting the incrasingly vast phylogenetic information to folks not comfortable building their own phylogenies. Phylomatic has made this super easy for people that want plant phylogenies (at least 250 or so papers have used and cited Phylomatic in their papers) - however, there are few options for those that want phylogenies for other taxonomic groups. ...

Waiting for an API request to complete

Dealing with API tokens in R In my previous post I showed an example of calling the Phylotastic taxonomic name resolution API Taxosaurus here. When you query their API they give you a token which you use later to retrieve the result (see examples on their page above). However, you don’t know when the query will be done, so how do we know when to send the query to rerieve the data? ...

Resolving species names when you have a lot of them

taxize use case: Resolving species names when you have a lot of them Species names can be a pain in the ass, especially if you are an ecologist. We ecologists aren’t trained in taxonomy, yet we often end up with huge species lists. Of course we want to correct any spelling errors in the names, and get the newest names for our species, resolve any synonyms, etc. We are building tools into our R package taxize, that will let you check your species names to make sure they are correct. ...

Open Science Challenge

Open Science Science is becoming more open in many areas: publishing, data sharing, lab notebooks, and software. There are many benefits to open science. For example, sharing research data alongside your publications leads to increased citation rate (Piwowar et. al. 2007). In addition, data is becoming easier to share and reuse thanks to efforts like FigShare and Dryad. If you don’t understand the problem we are currently facing due to lack of open science, watch this video: ...

Is invasive?

The Global Invasive Species Database (GISD) (see their website for more info here) has data on the invasiveness status of many species. From taxize you can now query the GISD database. Introducing the function gisd_isinvasive. This function was contributed to taxize by Ignasi Bartomeus, a postdoc at the Swedish University Agricultural Sciences. There are two possible outputs from using gisd_isinvasive: “Invasive” or “Not in GISD”. If you use simplify=TRUE in the function you get “Invasive” or “Not in GISD”, but if you use simplify=FALSE you get verbose description of the invasive species instead of just “Invasive” (and you still just get “Not in GISD”). ...

Shiny apps are awesome

RStudio has a new product called Shiny that, quoting from their website, “makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use”. See here for more information. A Shiny basically consists of two files: a ui.r file and a server.r file. The ui.r file, as it says, provides the user interface, and the server.r file provides the the server logic. Below is what it looks like in the wild (on a browser). ...

One R package for all your taxonomic needs

UPDATE: there were some errors in the tests for taxize, so the binaries aren’t avaiable yet. You can install from source though, see below. Getting taxonomic information for the set of species you are studying can be a pain in the ass. You have to manually type, or paste in, your species one-by-one. Or, if you are lucky, there is a web service in which you can upload a list of species. Encyclopedia of Life (EOL) has a service where you can do this here. But is this reproducible? No. ...

Displaying Your Data in Google Earth Using R2G2

Have you ever wanted to easily visualize your ecology data in Google Earth? R2G2 is a new package for R, available via R CRAN and formally described in this Molecular Ecology Resources article, which provides a user-friendly bridge between R and the Google Earth interface. Here, we will provide a brief introduction to the package, including a short tutorial, and then encourage you to try it out with your own data! Nils Arrigo, with some help from Loren Albert, Mike Barker, and Pascal Mickelson (one of the contributors to Recology), has created a set of R tools to generate KML files to view data with geographic components. Instead of just telling you what the tools can do, though, we will show you a couple of examples using publically available data. Note: a number of individual files are linked to throughout the tutorial below, but just in case you would rather download all the tutorial files in one go, have at it (tutorial zip file). ...

Getting taxonomic names downstream

It can be a pain in the ass to get taxonomic names. For example, I sometimes need to get all the Class names for a set of species. This is a relatively easy problem using the ITIS API (example below). The much harder problem is getting all the taxonomic names downstream. ITIS doesn’t provide an API method for this - well, they do (getHirerachyDownFromTSN), but it only provides direct children (e.g., the genera within a tribe - but it won’t give all the species within each genus). ...