Visualizing rOpenSci collaboration

We (rOpenSci) have been writing code for R packages for a couple of years, so it is time to take a look back at the data. What data, you ask? The commits data from GitHub: data that records who did what and when. Using the GitHub commits API we can gather data on who committed code to a GitHub repository, and when they did it. Then we can visualize this historical record. ...

March 8, 2013 · 3 min · Scott Chamberlain
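
As a rough illustration of the tallying step described in the entry above, here is a minimal Python sketch (not the post's own R code) that counts commits per author per month. It assumes the records have already been fetched from the GitHub commits API and parsed from JSON, with each record carrying `commit.author.name` and `commit.author.date`. The sample records and the helper name `commits_per_author_month` are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def commits_per_author_month(commits):
    """Count commits per (author name, 'YYYY-MM') pair."""
    counts = Counter()
    for record in commits:
        author = record["commit"]["author"]
        # Timestamps are assumed to be ISO 8601 in UTC, e.g. "2013-03-01T12:00:00Z".
        when = datetime.strptime(author["date"], "%Y-%m-%dT%H:%M:%SZ")
        counts[(author["name"], when.strftime("%Y-%m"))] += 1
    return counts


# Hypothetical records, shaped like items in a commits API response.
sample_commits = [
    {"commit": {"author": {"name": "alice", "date": "2013-02-10T09:30:00Z"}}},
    {"commit": {"author": {"name": "alice", "date": "2013-02-21T17:05:00Z"}}},
    {"commit": {"author": {"name": "bob", "date": "2013-03-01T12:00:00Z"}}},
]

counts = commits_per_author_month(sample_commits)
```

A table of counts like this (author by month) is the kind of input a time-series or heatmap plot of contributions would need.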

Academia reboot

Reboot: We need to reboot academia, at least for graduate training. I am speaking from the point of view of ecology/evolution (EEB). Why, you ask? Because of the following line of reasoning: First, the most important factor for me comes down to supply and demand. We have too much supply (=graduate students) and not enough demand (=faculty positions, etc.) - see this comic at PhDComics for proof. This seems especially apparent when you hear from your fellow postdoc friends that there were hundreds of other people with Ph.D.s applying for the same position. ...

February 22, 2013 · 3 min · Scott Chamberlain

Getting a simple tree via NCBI

I was just at the Phylotastic hackathon in Tucson, AZ at the iPlant facilities at the UofA. A problem that needs to be solved is getting the increasingly vast phylogenetic information to folks not comfortable building their own phylogenies. Phylomatic has made this super easy for people who want plant phylogenies (at least 250 or so papers have used and cited Phylomatic) - however, there are few options for those who want phylogenies for other taxonomic groups. ...

February 14, 2013 · 2 min · Scott Chamberlain

testing ifttt recipe, ignore

testing ifttt recipe

January 26, 2013 · 1 min · Scott Chamberlain

Waiting for an API request to complete

Dealing with API tokens in R: In my previous post I showed an example of calling Taxosaurus, the Phylotastic taxonomic name resolution API. When you query their API they give you a token which you use later to retrieve the result (see examples on their page). However, you don’t know when the query will be done, so how do we know when to send the request to retrieve the data? ...

January 26, 2013 · 2 min · Scott Chamberlain
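
The question raised in the entry above (when to ask for the result of a token-based query) is commonly handled by polling. Below is a minimal Python sketch of that idea, not the post's own R code and not the Taxosaurus API itself. The names `poll_until_ready` and `make_fake_service` are hypothetical, and the fake service stands in for a real "retrieve by token" request that returns nothing until the job is finished.

```python
import time


def poll_until_ready(fetch, interval=1.0, max_tries=10):
    """Call fetch() until it returns something other than None.

    Waits `interval` seconds between attempts and raises TimeoutError
    if no result arrives within `max_tries` attempts.
    """
    for attempt in range(1, max_tries + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_tries:
            time.sleep(interval)
    raise TimeoutError(f"no result after {max_tries} tries")


def make_fake_service(ready_after):
    """Hypothetical stand-in for a retrieve-by-token call.

    The returned function yields None until it has been called
    `ready_after` times, then returns a finished result.
    """
    calls = {"n": 0}

    def fetch():
        calls["n"] += 1
        return {"status": "done"} if calls["n"] >= ready_after else None

    return fetch


# The fake service becomes ready on the third request.
result = poll_until_ready(make_fake_service(3), interval=0.0)
```

Capping the number of tries keeps a stalled query from blocking forever; a real client might also lengthen the wait between attempts.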

Resolving species names when you have a lot of them

taxize use case: Resolving species names when you have a lot of them. Species names can be a pain in the ass, especially if you are an ecologist. We ecologists aren’t trained in taxonomy, yet we often end up with huge species lists. Of course we want to correct any spelling errors in the names, get the newest names for our species, resolve any synonyms, etc. We are building tools into our R package taxize that will let you check your species names to make sure they are correct. ...

January 25, 2013 · 5 min · Scott Chamberlain

Open Science Challenge

Open Science: Science is becoming more open in many areas: publishing, data sharing, lab notebooks, and software. There are many benefits to open science. For example, sharing research data alongside your publications leads to increased citation rate (Piwowar et al. 2007). In addition, data is becoming easier to share and reuse thanks to efforts like FigShare and Dryad. If you don’t understand the problem we are currently facing due to lack of open science, watch this video: ...

January 8, 2013 · 3 min · Scott Chamberlain

Is invasive?

The Global Invasive Species Database (GISD) (see their website for more info) has data on the invasiveness status of many species. From taxize you can now query the GISD database. Introducing the function gisd_isinvasive. This function was contributed to taxize by Ignasi Bartomeus, a postdoc at the Swedish University of Agricultural Sciences. There are two possible outputs from using gisd_isinvasive: “Invasive” or “Not in GISD”. If you use simplify=TRUE in the function you get “Invasive” or “Not in GISD”, but if you use simplify=FALSE you get a verbose description of the invasive species instead of just “Invasive” (and you still just get “Not in GISD”). ...

December 13, 2012 · 3 min · Scott Chamberlain

Shiny apps are awesome

RStudio has a new product called Shiny that, quoting from their website, “makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use”. See here for more information. A Shiny app basically consists of two files: a ui.r file and a server.r file. The ui.r file, as it says, provides the user interface, and the server.r file provides the server logic. Below is what it looks like in the wild (in a browser). ...

December 10, 2012 · 3 min · Scott Chamberlain

One R package for all your taxonomic needs

UPDATE: there were some errors in the tests for taxize, so the binaries aren’t available yet. You can install from source though, see below. Getting taxonomic information for the set of species you are studying can be a pain in the ass. You have to manually type, or paste in, your species one-by-one. Or, if you are lucky, there is a web service to which you can upload a list of species. Encyclopedia of Life (EOL) has a service where you can do this here. But is this reproducible? No. ...

December 6, 2012 · 10 min · Scott Chamberlain