Monday at ESA11

Monday was a good day at ESA in Austin. There were a few topics I promised to report on in my blogging/tweeting. …focused on open source data. Carly Strasser’s presentation on guidelines for data management was awesome, as were the other talks in the symposium on Creating Effective Data Management Plans for Ecological Research. Although this was a good session, I can’t help but wish that they had hammered home the need for open science more. Oh well. They also talked a lot about how, and not much about why, we should properly curate data. Still, a good session. One issue Carly and I talked about was tracking code in version control services such as GitHub. There doesn’t seem to be a culture of versioning code for analyses/simulations in ecology. But when we get there…it will be easier to share/track/collaborate on code. ...

August 8, 2011 · 2 min · Scott Chamberlain

(#ESA11) rOpenSci: a collaborative effort to develop R-based tools for facilitating Open Science

Our development team would like to announce the launch of rOpenSci. As the title states, this project aims to create R packages to make open science more available to researchers. http://ropensci.org/ What this means is that we seek to connect researchers using R with as much open data as possible, mainly through APIs. There are a number of R packages that already do this (e.g., infochimps, twitteR), but we are making more packages, e.g., for Mendeley, PLoS Journals, and taxonomic sources (ITIS, EOL, TNRS, Phylomatic, UBio). ...

August 8, 2011 · 1 min · Scott Chamberlain

Models in Evolutionary Ecology seminar, organized by Timothee Poisot

Here is one of the talks, by Thomas Broquet; see the rest here. Thomas Broquet by mez_en_video

July 18, 2011 · 1 min · Scott Chamberlain

Comparison of functions for comparative phylogenetics

With all the packages (and beta stage groups of functions) for comparative phylogenetics in R (tested here: picante, geiger, ape, motmot, Liam Revell’s functions), I was simply interested in which functions to use in cases where multiple functions exist to do the same thing. I only show default settings, so perhaps these functions would differ under different parameter settings. [I am using a Mac 2.4 GHz i5, 4GB RAM] Get motmot here: https://r-forge.r-project.org/R/?group_id=782 ...
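A minimal sketch of how a comparison like this could be timed, assuming nothing beyond base R. The helper time_calls and the two stand-in candidates are hypothetical and are not the functions tested in the post; swap in whichever picante, geiger, ape, or motmot calls you want to compare.

```r
# Hypothetical helper: run each function in a named list on the same input
# several times and report the median elapsed time in seconds.
time_calls <- function(funs, input, reps = 5) {
  sapply(funs, function(f) {
    elapsed <- replicate(reps, system.time(f(input))[["elapsed"]])
    median(elapsed)
  })
}

# Stand-in candidates (base R only) that compute the same thing two ways.
candidates <- list(
  colMeans = function(m) colMeans(m),
  apply    = function(m) apply(m, 2, mean)
)

m <- matrix(runif(1e5), ncol = 100)

# Check the candidates really agree before comparing their speed.
stopifnot(isTRUE(all.equal(colMeans(m), apply(m, 2, mean))))

timings <- time_calls(candidates, m)
print(timings)
```

Checking that the candidates return the same answer first matters here, since a speed comparison only makes sense for functions that do the same thing.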

May 11, 2011 · 2 min · Scott Chamberlain

Phylometa from R: Randomization via Tip Shuffle

—UPDATE: I am now using code formatting from gist.github, so I replaced the old prettyR code (sorry guys). The github way is much easier and prettier. I hope readers like the change. I wrote earlier about some code I wrote for running Phylometa (software to do phylogenetic meta-analysis) from R. I have been concerned about what exactly is the right penalty for including phylogeny in a meta-analysis. E.g.: AIC is calculated from Q in Phylometa, and Q increases with tree size. ...
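The tip shuffle itself can be sketched in a few lines, assuming the tree stores its tip labels in a tip.label element, as ape's phylo objects do. The names shuffle_tips and toy_tree are hypothetical and only for illustration; this is not the Phylometa code from the post, and the step of running Phylometa on each shuffled tree is left out.

```r
# Hypothetical helper: permute the tip labels of a tree while leaving the
# topology and branch lengths untouched.
shuffle_tips <- function(tree) {
  tree$tip.label <- sample(tree$tip.label)
  tree
}

# Stand-in for a real tree: a plain list with a tip.label element.
# With ape you would pass an object from read.tree() instead.
toy_tree <- list(tip.label = c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e"))

set.seed(1)  # make the randomization repeatable
null_trees <- replicate(100, shuffle_tips(toy_tree), simplify = FALSE)

# Each element of null_trees has the same taxa in a random order.
print(null_trees[[1]]$tip.label)
```

Running the meta-analysis on each shuffled tree would then give a null distribution to compare against the result from the observed tree.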

April 16, 2011 · 2 min · Scott Chamberlain

Adjust branch lengths with node ages: comparison of two methods

Here is an approach for comparing two methods of adjusting branch lengths on trees: bladj in the program Phylocom and a function written by Gene Hunt at the Smithsonian. Get the code and example files (tree and node ages) at https://gist.github.com/938313 Get Phylocom at http://www.phylodiversity.net/phylocom/ Gene Hunt’s method has many options you can adjust, including setting tip ages (not available in bladj), setting node ages, and imposing a minimum branch length. You will notice that Gene’s method may not be appropriate if you only have extant taxa. ...
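To show the basic idea of turning node ages into branch lengths, here is a minimal base R sketch. It is neither bladj nor Gene Hunt's function; branch_lengths_from_ages, the toy edge matrix, and the ages are all made up for illustration. It assumes the edge matrix has one row per branch with the parent node in column 1 and the child node in column 2, which is how ape's phylo objects store edges.

```r
# Hypothetical helper: a branch length is the age of the parent node minus
# the age of the child node, with an optional minimum branch length imposed.
branch_lengths_from_ages <- function(edge, ages, min_bl = 0) {
  bl <- ages[edge[, 1]] - ages[edge[, 2]]
  pmax(bl, min_bl)
}

# Toy tree with tips 1-3, root node 4, and internal node 5: ((1,2),3)
edge <- matrix(c(4, 5,
                 5, 1,
                 5, 2,
                 4, 3), ncol = 2, byrow = TRUE)

# Ages indexed by node number: extant tips at 0, root at 10, node 5 at 4.
ages <- c(0, 0, 0, 10, 4)

bl <- branch_lengths_from_ages(edge, ages, min_bl = 0.1)
print(bl)  # 6 4 4 10
```

Setting a nonzero tip age in ages is the kind of option that matters for fossil taxa, which is one place the two methods in the post differ.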

April 10, 2011 · 2 min · Scott Chamberlain

Check out Phyloseminar.org

They have online seminars that you can join live or watch later as recorded videos. Check it out at phyloseminar.org

March 4, 2011 · 1 min · Scott Chamberlain

Phenotypic selection analysis in R

Until recently I always did my phenotypic selection analyses in SAS. I finally have some R code that I think does everything SAS would do. Feedback much appreciated!

    ######################## Selection analyses ########################
    install.packages(c("car", "reshape", "ggplot2"))
    library(car)      # vif()
    library(reshape)  # rescaler()
    library(ggplot2)  # plotting

    # Create data set
    dat <- data.frame(
      plant = seq(1, 100, 1),
      trait1 = rep(c(0.1, 0.15, 0.2, 0.21, 0.25, 0.3, 0.5, 0.6, 0.8, 0.9,
                     1, 3, 4, 10, 11, 12, 13, 14, 15, 16), each = 5),
      trait2 = runif(100),
      fitness = rep(c(1, 5, 10, 20, 50), each = 20)
    )

    # Make relative fitness column
    dat_ <- cbind(dat, dat$fitness / mean(dat$fitness))
    names(dat_)[5] <- "relfitness"

    # Standardize traits (columns become: plant, fitness, relfitness, trait1, trait2)
    dat_ <- cbind(dat_[, -c(2:3)], rescaler(dat_[, c(2:3)], "sd"))

    #### Selection differentials and correlations among traits
    #### (cor.prob uses the function from the functions.R file, reproduced here)

    # Function for calculating correlation matrix:
    # correlations below the diagonal, P-values above the diagonal
    cor.prob <- function(X, dfr = nrow(X) - 2) {
      R <- cor(X)
      above <- row(R) < col(R)
      r2 <- R[above]^2
      Fstat <- r2 * dfr / (1 - r2)
      R[above] <- 1 - pf(Fstat, 1, dfr)
      R
    }

    # Get selection differentials and correlations among traits in one data frame
    dat_seldiffs <- cov(dat_[, c(3:5)])       # calculates sel'n differentials using cov
    dat_selcorrs <- cor.prob(dat_[, c(3:5)])  # use P-values above diagonal for significance
                                              # of sel'n differentials in dat_seldiffs
    dat_seldiffs_selcorrs <- data.frame(dat_seldiffs, dat_selcorrs)  # combine the two

    #### Selection gradients
    dat_selngrad <- lm(relfitness ~ trait1 * trait2, data = dat_)
    summary(dat_selngrad)  # where "Estimate" is our sel'n gradient

    #### Check assumptions
    shapiro.test(dat_selngrad$residuals)  # normality, bummer, non-normal
    hist(dat_selngrad$residuals)          # plot residuals
    vif(dat_selngrad)   # check variance inflation factors (need package car), everything looks fine
    plot(dat_selngrad)  # cycle through diagnostic plots

    # Plot data
    ggplot(dat_, aes(trait1, relfitness)) +
      geom_point() +
      geom_smooth(method = "lm") +
      labs(x = "Trait 1", y = "Relative fitness")
    ggsave("myplot.jpeg")

Plot of relative fitness vs. standardized trait 1 ...

February 24, 2011 · 2 min · Scott Chamberlain

Phylogenetic analysis with the phangorn package: an example

The phangorn package is a relatively new package in R for the analysis and comparison of phylogenies. See here for the Bioinformatics paper and here for the package. Here is an example of using phangorn, from getting sequences to making phylogenies and visualizing them:
Getting sequences from GenBank
Multiple alignment
Maximum likelihood tree reconstruction
Visualizing trees
Visualizing trees and traits
Make fake traits:
Visualize them on trees: ...
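A minimal sketch of the tree reconstruction and plotting steps, assuming phangorn is installed. It is not the code from the post: it skips the GenBank download and the alignment, and uses the Laurasiatherian alignment that ships with phangorn as a stand-in for your own sequences. The object names (dm, nj_tree, fit, fit_opt) are just for illustration.

```r
library(phangorn)  # loads ape as a dependency

# Stand-in data: an alignment bundled with phangorn (already a phyDat object).
data(Laurasiatherian)

# Distance-based starting tree: ML distances, then neighbor joining.
dm <- dist.ml(Laurasiatherian)
nj_tree <- NJ(dm)

# Likelihood of the starting tree under the default model, then optimize
# the branch lengths (topology is left as is with the default settings).
fit <- pml(nj_tree, data = Laurasiatherian)
fit_opt <- optim.pml(fit, control = pml.control(trace = 0))

# Compare log-likelihoods before and after optimization, then plot the tree.
print(logLik(fit))
print(logLik(fit_opt))
plot(fit_opt$tree, cex = 0.6)
```

With your own data, the alignment would first need to be converted to a phyDat object before being passed to dist.ml and pml.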

February 21, 2011 · 1 min · Scott Chamberlain

Troubling news for the teaching of evolution

[UPDATE: I remade the maps in green, hope that helps…] A recent survey reported in Science (“Defeating Creationism in the Courtroom, but not in the Classroom”) found that high school biology teachers often do not accept the basis of their discipline, unlike teachers in other disciplines, and thus may not teach evolution appropriately. Read more here: New York Times. I took a little time to play with the data, which is available on the Science website along with the article; the dataset I read into R is unchanged from the original. The state abbreviations file is here (as a .xls). Here goes: ...

February 9, 2011 · 3 min · Scott Chamberlain