Tenure track position in systematics at the University of Vermont

There is an awesome position opening up for an assistant professor in systematics at the University of Vermont. Below is the announcement, and see the original post at the Distributed Ecology blog. Why is this related to R? One can do a lot of systematics work in R, including retrieving scientific collections data through an upcoming package handshaking with VertNet (part of the rOpenSci project), managing large data sets, retrieving GenBank data through the ape package (see the function read.GenBank), phylogenetic reconstruction and analysis, and more. So I am sure a systematist with R ninja skills will have a leg up on the rest of the field. ...

August 22, 2011 · 3 min · Scott Chamberlain

(#ESA11) rOpenSci: a collaborative effort to develop R-based tools for facilitating Open Science

Our development team would like to announce the launch of rOpenSci (http://ropensci.org/). As the title states, this project aims to create R packages that make open science more accessible to researchers. What this means is that we seek to connect researchers using R with as much open data as possible, mainly through APIs. There are a number of R packages that already do this (e.g., infochimps, twitteR), but we are making more packages, e.g., for Mendeley, PLoS Journals, and taxonomic sources (ITIS, EOL, TNRS, Phylomatic, uBio). ...

August 8, 2011 · 1 min · Scott Chamberlain

CRdata vs. Cloudnumbers

Cloudnumbers and CRdata are two new cloud computing services. I tested the two services with a very simple script. The script simply creates a data frame of 10,000 numbers via rnorm and assigns each number to one of two factor levels (a or b). I then take the mean of each factor level with the aggregate function. In CRdata you need to add some extra code to format the output for a browser window. For example, the last line below needs to have ‘<crdata_object>’ on both sides of the output object so it can be rendered in a browser, and likewise for anything else that one would print to a console. You don’t need this extra code when using Cloudnumbers. ...
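For readers who want to try something similar locally, here is a minimal sketch of the kind of test script described above. It is not the original script: the seed, the column names (value, group), and the random assignment of levels are assumptions, and the CRdata-specific output wrapper is left out.

```r
# Assumed seed, used only so the sketch is reproducible
set.seed(1)

# 10,000 draws from a standard normal, each labelled "a" or "b" at random
dat <- data.frame(
  value = rnorm(10000),
  group = factor(sample(c("a", "b"), 10000, replace = TRUE))
)

# Mean of the values within each factor level
group_means <- aggregate(value ~ group, data = dat, FUN = mean)
print(group_means)
```

Both group means should land close to zero, since every value comes from the same distribution.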

July 14, 2011 · 3 min · Scott Chamberlain

rbold: An R Interface for Bold Systems barcode repository

Have you ever wanted to search and fetch barcode data from Bold Systems? I am developing functions to interface with Bold from R. I just started, but hopefully folks will find it useful. The code is at Github here. The two functions are still very buggy, so please bring up issues below, or in the Issues area on Github. For example, some searches work and other similar searches don’t. Apologies in advance for the bugs. ...

June 28, 2011 · 1 min · Scott Chamberlain

iEvoBio 2011 Synopsis

We just wrapped up the 2011 iEvoBio meeting. It was awesome! If you didn’t go this year or last year, definitely think about going next year. Here is a list of the cool projects that were discussed at the meeting (apologies if I left some out):

1. Vistrails: workflow tool, awesome project by Claudio Silva
2. Commplish: purpose is to use via APIs, not with the web UI
3. Phylopic: a database of life-form silhouettes, including an API for remote access, sweet!
4. Gloome
5. MappingLife: awesome geographic/etc data visualization interface on the web
6. SuiteSMA: visualizing multiple alignments
7. treeBASE: R interface to treebase, by Carl Boettiger
8. VertNet: database for vertebrate natural history collections
9. RevBayes: revamp of MrBayes, with GUI, etc.
10. Phenoscape Knowledge Base
11. Peter Midford lightning talk: talked about matching taxonomic and genetic data
12. BiSciCol: biological science collections tracker
13. Ontogrator
14. TNRS: taxonomic name resolution service
15. Barcode of Life data systems, and remote access
16. Moorea Biocode Project
17. Microbial LTER’s data
18. BirdVis: interactive bird data visualization (Claudio Silva in collaboration with Cornell Lab of Ornithology)
19. Crowdlabs: I think the site is down right now, another project by Claudio Silva
20. Phycas: Bayesian phylogenetics, can you just call this from R?
21. RIP MrBayes!!!! replaced by RevBayes (see 9 above)

Slides of presentations will be at Slideshare (not all presentations up yet).

A birds of a feather group I was involved in proposed an idea (TOL-o-matic) like Phylomatic, but of broader scope, for easy access and submission of trees, and perhaps even social (think just pushing a ‘SHARE’ button within PAUP, RevBayes, or other phylogenetics software)!

Synopses of Birds of a Feather discussion groups: http://piratepad.net/iEvoBio11-BoF-reportouts

June 22, 2011 · 2 min · Scott Chamberlain

PLoS journals API from R: "rplos"

The Public Library of Science (PLoS) has an API so that developers can create cool tools to access their data (including full text papers!!). Carl Boettiger at UC Davis and I are working on R functions that use the PLoS API. See our code on Github here. See the wiki at the Github page for examples of use. We hope to deploy rplos as a package someday soon. Please feel free to suggest changes/additions to rplos in the comments below or on the Github/rplos site. ...

June 21, 2011 · 1 min · Scott Chamberlain

OpenStates from R via API: watch your elected representatives

I am writing some functions to acquire data from the OpenStates project, via their API. They have a great support community at Google Groups as well. On its face this post is not obviously about ecology or evolution, but well, our elected representatives do, so to speak, hold our environment in a noose, ready to let the Earth hang any day. The code I am developing is over at Github. Here is an example of its use in R, using the Bill Search option (billsearch.R on my Github site). In this case you do not provide your API key in the function call; instead you put it in your .Rprofile file, which is run when you open R. ...
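One common way to keep an API key in .Rprofile is to store it as a session option and have functions read it as a default argument. The sketch below illustrates that pattern only; the option name OpenStatesKey and the helper get_api_key are hypothetical and may not match what the Github code actually uses.

```r
# This line would live in ~/.Rprofile, which R runs at startup.
# "OpenStatesKey" is a hypothetical option name; the value is a placeholder.
options(OpenStatesKey = "your-api-key-here")

# Hypothetical helper: use the stored option unless a key is passed explicitly
get_api_key <- function(key = getOption("OpenStatesKey")) {
  if (is.null(key)) {
    stop("No API key found; set options(OpenStatesKey = ...) in .Rprofile")
  }
  key
}

get_api_key()
```

With this setup the key never has to appear in scripts that get shared or posted.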

June 10, 2011 · 2 min · Scott Chamberlain

How to fit power laws

A new paper out in Ecology by Xiao and colleagues (in press, here) compares the use of log-transformation to non-linear regression for analyzing power laws. They suggest that the error distribution should determine which method performs better. When your errors are additive, homoscedastic, and normally distributed, they propose using non-linear regression. When errors are multiplicative, heteroscedastic, and lognormally distributed, they suggest using linear regression on log-transformed data. The two methods make different assumptions about the error structure, so both cannot be correct for a single dataset. ...
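To make the two approaches concrete, here is a small sketch on simulated data using base R's lm and nls. This is not the code from the paper: the true parameters (a = 2, b = 0.75), the noise levels, the seed, and the starting values are all assumptions chosen for illustration.

```r
set.seed(42)                          # assumed seed, for reproducibility
x <- seq(1, 100, length.out = 200)
a <- 2                                # assumed true coefficient
b <- 0.75                             # assumed true exponent

# Multiplicative, lognormal error: fit a line on the log-log scale
y_mult <- a * x^b * exp(rnorm(length(x), sd = 0.2))
fit_lr <- lm(log(y_mult) ~ log(x))
a_lr <- exp(coef(fit_lr)[[1]])        # back-transform the intercept
b_lr <- coef(fit_lr)[[2]]             # slope estimates the exponent

# Additive, normal error: fit the power law directly on the original scale
y_add <- a * x^b + rnorm(length(x), sd = 1)
fit_nlr <- nls(y_add ~ a0 * x^b0, start = list(a0 = 1, b0 = 1))

c(a = a_lr, b = b_lr)                 # estimates from log-log regression
coef(fit_nlr)                         # estimates from non-linear regression
```

Each fit here is matched to the error structure used to simulate its data; swapping them (for example, running nls on y_mult) is a quick way to see how the estimates change when the error assumption is wrong.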

June 7, 2011 · 1 min · Scott Chamberlain

searching ITIS and fetching Phylomatic trees

I am writing a set of functions to search ITIS for taxonomic information (more databases to come) and functions to fetch plant phylogenetic trees from Phylomatic. The code is on Github. Also, see the examples in the demos folder on the Github site.

June 3, 2011 · 1 min · Scott Chamberlain

A simple function for plotting phylogenies in ggplot2

UPDATE: Greg Jordan has a much more elegant way of plotting trees with ggplot2. See his links in the comments below. I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a three-species tree right now, as I haven’t figured out how to generalize the approach to N species. It’s at https://gist.github.com/977207. Any ideas on how to improve this?

May 17, 2011 · 1 min · Scott Chamberlain