rbold: An R Interface for Bold Systems barcode repository

Have you ever wanted to search and fetch barcode data from Bold Systems? I am developing functions to interface with Bold from R. I have only just started, but hopefully folks will find them useful. The code is on GitHub here. The two functions are still very buggy, so please raise issues in the comments below or in the Issues area on GitHub. For example, some searches work and other, similar searches don’t. Apologies in advance for the bugs. ...

June 28, 2011 · 1 min · Scott Chamberlain

iEvoBio 2011 Synopsis

We just wrapped up the 2011 iEvoBio meeting. It was awesome! If you didn’t go this year or last year, definitely think about going next year. Here is a list of the cool projects that were discussed at the meeting (apologies if I left some out):

1. Vistrails: workflow tool, awesome project by Claudio Silva
2. Commplish: meant to be used via its APIs rather than through the web UI
3. Phylopic: a database of life-form silhouettes, including an API for remote access, sweet!
4. Gloome
5. MappingLife: awesome geographic/etc. data visualization interface on the web
6. SuiteSMA: visualizing multiple alignments
7. treeBASE: R interface to TreeBASE, by Carl Boettiger
8. VertNet: database for vertebrate natural history collections
9. RevBayes: revamp of MrBayes, with GUI, etc.
10. Phenoscape Knowledge Base
11. Peter Midford lightning talk: talked about matching taxonomic and genetic data
12. BiSciCol: biological science collections tracker
13. Ontogrator
14. TNRS: taxonomic name resolution service
15. Barcode of Life data systems, and remote access
16. Moorea Biocode Project
17. Microbial LTER’s data
18. BirdVis: interactive bird data visualization (Claudio Silva in collaboration with Cornell Lab of Ornithology)
19. Crowdlabs: I think the site is down right now, another project by Claudio Silva
20. Phycas: Bayesian phylogenetics, can you just call this from R?
21. RIP MrBayes!!!! replaced by RevBayes (see 9 above)

Slides of presentations will be at Slideshare (not all presentations are up yet).

A Birds of a Feather group I was involved in proposed an idea (TOL-o-matic) similar to Phylomatic but broader in scope, for easy access to and submission of trees, and perhaps even with a social component (think just pushing a ‘SHARE’ button within PAUP, RevBayes, or other phylogenetics software)!

Synopses of Birds of a Feather discussion groups: http://piratepad.net/iEvoBio11-BoF-reportouts

June 22, 2011 · 2 min · Scott Chamberlain

PLoS journals API from R: "rplos"

The Public Library of Science (PLoS) has an API so that developers can create cool tools to access their data (including full-text papers!!). Carl Boettiger at UC Davis and I are working on R functions that use the PLoS API. See our code on GitHub here. See the wiki at the GitHub page for examples of use. We hope to release rplos as a package someday soon. Please feel free to suggest changes or additions to rplos in the comments below or on the GitHub/rplos site. ...

June 21, 2011 · 1 min · Scott Chamberlain

OpenStates from R via API: watch your elected representatives

I am writing some functions to acquire data from the OpenStates project via their API. They have a great support community at Google Groups as well. On its face this post is not obviously about ecology or evolution, but, well, our elected representatives do, so to speak, hold our environment in a noose, ready to let the Earth hang any day. The code I am developing is over at GitHub. Here is an example of its use in R, using the Bill Search option (billsearch.R on my GitHub site). In this example you do not provide your API key in the function call; instead you put it in your .Rprofile file, which is run when you start R. ...

June 10, 2011 · 2 min · Scott Chamberlain

How to fit power laws

A new paper out in Ecology by Xiao and colleagues (in press, here) compares log-transformation with non-linear regression for analyzing power laws. They suggest that the error distribution should determine which method performs better. When your errors are additive, homoscedastic, and normally distributed, they propose using non-linear regression. When errors are multiplicative, heteroscedastic, and lognormally distributed, they suggest using linear regression on log-transformed data. The two methods make different assumptions about the errors, so both cannot be correct for a single dataset. ...
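To make the two approaches concrete, here is a minimal sketch in Python (standard library only) that fits y = a * x^b both ways. It is not the code from Xiao and colleagues. The function names `fit_loglog` and `fit_nls` are made up for illustration, and the non-linear fit uses a simple golden-section search over the exponent (with the best `a` solved in closed form for each candidate `b`) rather than a standard non-linear least squares routine. That search assumes the error curve has a single minimum inside the chosen bounds for `b`.

```python
import math
import random


def fit_loglog(xs, ys):
    """Ordinary least squares on (log x, log y); returns (a, b) for y = a * x**b.

    Requires all x and y to be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


def fit_nls(xs, ys, b_lo=-3.0, b_hi=3.0, tol=1e-10):
    """Least squares on the original scale; returns (a, b) for y = a * x**b.

    Illustrative only: assumes positive x and a single minimum of the
    squared-error curve for b in [b_lo, b_hi] (bounds are arbitrary defaults).
    """

    def best_a(b):
        # For a fixed exponent b, the least-squares a has a closed form.
        num = sum(y * x ** b for x, y in zip(xs, ys))
        den = sum(x ** (2 * b) for x in xs)
        return num / den

    def sse(b):
        a = best_a(b)
        return sum((y - a * x ** b) ** 2 for x, y in zip(xs, ys))

    # Golden-section search for the exponent that minimizes the error.
    phi = (math.sqrt(5) - 1) / 2
    lo, hi = b_lo, b_hi
    while hi - lo > tol:
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    b = (lo + hi) / 2
    return best_a(b), b


if __name__ == "__main__":
    # Simulated data with multiplicative lognormal error (made-up parameters).
    random.seed(1)
    xs = [random.uniform(1, 100) for _ in range(200)]
    ys = [2.0 * x ** 0.75 * random.lognormvariate(0, 0.2) for x in xs]
    print("log-log fit:   ", fit_loglog(xs, ys))
    print("non-linear fit:", fit_nls(xs, ys))
```

On noise-free data the two fits should agree; with noise they generally differ, because each method weights the errors according to its own assumption about them.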

June 7, 2011 · 1 min · Scott Chamberlain

searching ITIS and fetching Phylomatic trees

I am writing a set of functions to search ITIS for taxonomic information (more databases to come) and functions to fetch plant phylogenetic trees from Phylomatic. The code is on GitHub. Also, see the examples in the demos folder on the GitHub site.

June 3, 2011 · 1 min · Scott Chamberlain

A simple function for plotting phylogenies in ggplot2

UPDATE: Greg Jordan has a much more elegant way of plotting trees with ggplot2. See his links in the comments below. I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a three-species tree right now, as I haven’t figured out how to generalize the approach to N species. It’s at https://gist.github.com/977207. Any ideas on how to improve this?

May 17, 2011 · 1 min · Scott Chamberlain

plyr's idata.frame VS. data.frame

I had seen the function idata.frame in plyr before, but had not really tested it. From the plyr documentation: “An immutable data frame works like an ordinary data frame, except that when you subset it, it returns a reference to the original data frame, not a copy. This makes subsetting substantially faster and has a big impact when you are working with large datasets with many groups.” For example, although baseball is a data.frame, its immutable counterpart is a reference to it: ...

May 13, 2011 · 4 min · Scott Chamberlain

Comparison of functions for comparative phylogenetics

With all the packages (and beta stage groups of functions) for comparative phylogenetics in R (tested here: picante, geiger, ape, motmot, Liam Revell’s functions), I was simply interested in which functions to use in cases where multiple functions exist to do the same thing. I only show default settings, so perhaps these functions would differ under different parameter settings. [I am using a Mac 2.4 GHz i5, 4GB RAM] Get motmot here: https://r-forge.r-project.org/R/?group_id=782 ...

May 11, 2011 · 2 min · Scott Chamberlain

Treebase trees from R

UPDATE: See Carl Boettiger’s functions/package at GitHub for searching Treebase here. Treebase is a great resource for phylogenetic trees, and has a nice interface for searching for certain types of trees. However, if you simply want to download a lot of trees for analyses (like that in Davies et al.), then you want to be able to access trees in bulk (I believe the Treebase folks are working on an API, though). I wrote some simple code for extracting trees from Treebase.org. It reads an XML file of URLs for each tree (in this case consensus trees), parses the XML, makes a vector of URLs, reads the nexus files with error checking, removes trees that gave errors, and then makes a simple plot of metrics of the trees. ...
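The steps described above (parse the XML, collect the URLs, read each file with error checking, drop the failures) can be sketched roughly as follows. This is a Python illustration and not the code from the post. The XML layout with `<url>` elements, the function names `extract_urls` and `read_trees`, and the `reader` callable that stands in for a nexus parser are all assumptions made for the example. No real Treebase address is used.

```python
import xml.etree.ElementTree as ET


def extract_urls(xml_text, tag="url"):
    """Return the text of every <url> element (the tag name is an assumption)."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


def read_trees(urls, reader):
    """Try to read each URL with `reader`; keep successes, record failures.

    `reader` is any callable that takes a URL and returns a parsed tree,
    raising an exception when the file cannot be read.
    """
    trees, failed = {}, []
    for url in urls:
        try:
            trees[url] = reader(url)
        except Exception as err:  # keep going; one bad file should not stop the batch
            failed.append((url, str(err)))
    return trees, failed


if __name__ == "__main__":
    # Hypothetical input: placeholder URLs and a fake reader that fails on one of them.
    sample = """<trees>
      <url>http://example.org/tree1.nex</url>
      <url>http://example.org/broken.nex</url>
      <url>http://example.org/tree2.nex</url>
    </trees>"""

    def fake_reader(url):
        if "broken" in url:
            raise ValueError("could not parse nexus file")
        return "tree from " + url

    trees, failed = read_trees(extract_urls(sample), fake_reader)
    print(len(trees), "trees read;", len(failed), "failed")
```

Passing the reader in as an argument keeps the error handling separate from whichever parser actually reads the nexus files.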

May 3, 2011 · 1 min · Scott Chamberlain