Function for phylogeny resolution

UPDATE: The treeresstats function had a problem in one of its calculations. I fixed that and added some more calculations to the function.

I couldn’t find any functions to calculate the number of polytomies and related metrics. Here’s a simple function that gives several metrics on a phylo tree object:

    library(ape) # read.tree, Ntip, Nnode (needed before treeresstats is first called)

    # calculate tree resolution stats
    treeresstats <- function(x) {
      require(phangorn) # load the phangorn package (provides Children)
      # internal nodes are numbered from Ntip + 1 to Ntip + Nnode
      todo <- (1 + Ntip(x)):(Ntip(x) + Nnode(x))
      trsize_tips <- Ntip(x)
      trsize_nodes <- Nnode(x)
      # number of direct descendants of each internal node
      polytomyvec <- sapply(todo, function(y) length(Children(x, y)))
      numpolys <- length(polytomyvec[polytomyvec > 2])
      numpolysbytrsize_tips <- numpolys/trsize_tips
      numpolysbytrsize_nodes <- numpolys/trsize_nodes
      proptipsdescpoly <- sum(polytomyvec[polytomyvec > 2])/trsize_tips
      propnodesdich <- length(polytomyvec[polytomyvec == 2])/trsize_nodes
      list(trsize_tips = trsize_tips, trsize_nodes = trsize_nodes,
           numpolys = numpolys,
           numpolysbytrsize_tips = numpolysbytrsize_tips,
           numpolysbytrsize_nodes = numpolysbytrsize_nodes,
           proptipsdescpoly = proptipsdescpoly,
           propnodesdich = propnodesdich)
    }

    # Single tree example
    tree <- read.tree(text="((((((artemisia_species:44,lactuca_species:44,senecio_species:44)6:46,campanula_species:90)5:17.75,((asclepias_species:71,galium_species:71)8:18.375,plantago_species:89.375)7:18.375)4:17.75,((cerastium_species:41.833332,silene_species:41.833332)10:41.833332,chenopodium_species:83.666664)9:41.833336)3:17.75,((geum_species:47,potentilla_species:47)12:48.125,lepidium_species:95.125)11:48.125)2:17.75,(bromus_species:12,elymus_species:12)13:149)1;")
    dat <- treeresstats(tree)
    dat

    # Many trees example
    maketrees <- function(numtrees) {
      require(ape); require(plyr)
      trees <- rmtree(numtrees, 20)
      # collapse branches shorter than tol into polytomies
      llply(trees, di2multi, tol = 0.5)
    }
    trees <- maketrees(30)
    dat <- ldply(trees, function(x) data.frame(treeresstats(x)))
    dat

Here’s output from the gist above: ...
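As a quick cross-check on the polytomy counts, the number of children per internal node can also be read straight off the edge matrix of a phylo object, using only ape. This is a minimal sketch on a made-up six-tip tree, not the code from the gist; the tree and the variable names are assumptions for illustration.

```r
library(ape)

# Toy tree (hypothetical): one trichotomy (a, b, c), otherwise bifurcating
tree <- read.tree(text = "(((a,b,c),d),(e,f));")

# Each row of tree$edge is (parent, child), so tabulating the parent
# column gives the number of children of every internal node.
nchildren <- table(tree$edge[, 1])

numpolys <- sum(nchildren > 2)                      # nodes with more than 2 children
propnodesdich <- sum(nchildren == 2) / Nnode(tree)  # share of dichotomous nodes

numpolys
propnodesdich
```

On this toy tree there are four internal nodes, one of which is a polytomy, so the two values should agree with numpolys and propnodesdich from a function like treeresstats.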

January 13, 2012 · 2 min · Scott Chamberlain

My talk on doing phylogenetics in R

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R. I know phyloch has a function mrbayes, but I can’t get it to work…

November 18, 2011 · 1 min · Scott Chamberlain

Jonathan Eisen on the Fourth Domain and Open Science

Stalking the Fourth Domain with Jonathan Eisen, PhD, from mendelspod on Vimeo.

September 6, 2011 · 1 min · Scott Chamberlain

Tenure track position in systematics at the University of Vermont

There is an awesome position opening up for an assistant professor in systematics at the University of Vermont. Below is the announcement, and see the original post at the Distributed Ecology blog. Why is this related to R? One can do a lot of systematics work in R, including retrieving scientific collections data through an upcoming package handshaking with VertNet (part of the rOpenSci project), managing large data sets, retrieving GenBank data through the ape package (see the function read.genbank), phylogenetic reconstruction and analysis, and more. So I am sure a systematist with R ninja skills will have a leg up on the rest of the field. ...

August 22, 2011 · 3 min · Scott Chamberlain

iEvoBio 2011 Synopsis

We just wrapped up the 2011 iEvoBio meeting. It was awesome! If you didn’t go this year or last year, definitely think about going next year.

Here is a list of the cool projects that were discussed at the meeting (apologies if I left some out):

1. Vistrails: workflow tool, awesome project by Claudio Silva
2. Commplish: purpose is to use via APIs, not with the web UI
3. Phylopic: a database of life-form silhouettes, including an API for remote access, sweet!
4. Gloome
5. MappingLife: awesome geographic/etc data visualization interface on the web
6. SuiteSMA: visualizing multiple alignments
7. treeBASE: R interface to treebase, by Carl Boettiger
8. VertNet: database for vertebrate natural history collections
9. RevBayes: revamp of MrBayes, with GUI, etc.
10. Phenoscape Knowledge Base
11. Peter Midford lightning talk: talked about matching taxonomic and genetic data
12. BiSciCol: biological science collections tracker
13. Ontogrator
14. TNRS: taxonomic name resolution service
15. Barcode of Life data systems, and remote access
16. Moorea Biocode Project
17. Microbial LTER’s data
18. BirdVis: interactive bird data visualization (Claudio Silva in collaboration with Cornell Lab of Ornithology)
19. Crowdlabs: I think the site is down right now, another project by Claudio Silva
20. Phycas: Bayesian phylogenetics, can you just call this from R?

RIP MrBayes!!!! replaced by RevBayes (see 9 above)

Slides of presentations will be at Slideshare (not all presentations up yet)

A birds of a feather group I was involved in proposed an idea (TOL-o-matic) like Phylomatic, but of broader scope, for easy access and submission of trees, and perhaps even social (think just pushing a ‘SHARE’ button within PAUP, RevBayes, or other phylogenetics software)!

Synopses of Birds of a Feather discussion groups: http://piratepad.net/iEvoBio11-BoF-reportouts

June 22, 2011 · 2 min · Scott Chamberlain

searching ITIS and fetching Phylomatic trees

I am writing a set of functions to search ITIS for taxonomic information (more databases to come) and functions to fetch plant phylogenetic trees from Phylomatic. The code is on GitHub. Also, see the examples in the demos folder of the GitHub repository.

June 3, 2011 · 1 min · Scott Chamberlain

phylogenetic signal simulations

I did a little simulation to examine how K and lambda vary in response to tree size (and how they compare to each other on the same simulated trees). I use Liam Revell’s functions fastBM to generate traits, and phylosig to measure phylogenetic signal. Two observations: First, it seems that lambda is more sensitive than K to tree size, but then lambda levels out at about 40 species, whereas K continues to vary around a mean of 1. Second, K is more variable than lambda at all levels of tree size (compare standard error bars). Does this make sense to those smart folks out there? ...
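A simulation along these lines could be sketched as follows. This is not the original script: the tree model (pbtree), the tree sizes, the number of replicates, and the helper name signal_once are all assumptions chosen for illustration. Only fastBM and phylosig come from the post.

```r
library(ape)
library(phytools)

set.seed(123)

# Hypothetical helper: simulate one tree with n tips and one Brownian-motion
# trait, then measure phylogenetic signal with both statistics.
signal_once <- function(n) {
  tree  <- pbtree(n = n)   # pure-birth tree; the tree model is an assumption
  trait <- fastBM(tree)    # trait values named by tip label
  # as.numeric()[1] drops any class or list wrapper phylosig may return for K
  K      <- as.numeric(phylosig(tree, trait, method = "K"))[1]
  lambda <- phylosig(tree, trait, method = "lambda")$lambda
  data.frame(ntips = n, K = K, lambda = lambda)
}

sizes <- c(10, 20, 40)   # tree sizes to compare (assumed values)
nreps <- 5               # replicates per tree size (assumed value)
res <- do.call(rbind, lapply(rep(sizes, each = nreps), signal_once))

# Mean and standard error of each statistic by tree size
se <- function(v) sd(v) / sqrt(length(v))
means <- aggregate(cbind(K, lambda) ~ ntips, data = res, FUN = mean)
ses   <- aggregate(cbind(K, lambda) ~ ntips, data = res, FUN = se)
means
ses
```

With so few replicates the means will be noisy; increasing nreps and widening sizes would be needed before reading anything into the pattern.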

May 18, 2011 · 2 min · Scott Chamberlain

A simple function for plotting phylogenies in ggplot2

UPDATE: Greg Jordan has a much more elegant way of plotting trees with ggplot2. See his links in the comments below. I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a 3-species tree right now, as I haven’t figured out how to generalize the approach to N species. It’s at https://gist.github.com/977207. Any ideas on how to improve this?
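One possible way to generalize to N species, offered as a rough sketch rather than as the gist's code or Greg Jordan's approach: take node coordinates from ape (node.depth.edgelength for the horizontal position, node.height for the vertical position) and draw each edge as one horizontal and one vertical segment. The random tree and the variable names below are assumptions for illustration.

```r
library(ape)
library(ggplot2)

set.seed(42)
tree <- rcoal(8)   # stand-in tree; any rooted phylo object with branch lengths

x <- node.depth.edgelength(tree)  # distance of each node from the root
y <- node.height(tree)            # vertical position of each node

parent <- tree$edge[, 1]
child  <- tree$edge[, 2]

# For every edge: a horizontal segment at the child's height, and a vertical
# segment at the parent's depth joining the parent's height to the child's.
segs <- rbind(
  data.frame(x = x[parent], y = y[child],  xend = x[child],  yend = y[child]),
  data.frame(x = x[parent], y = y[parent], xend = x[parent], yend = y[child])
)

# Tips are nodes 1..Ntip, in the same order as tree$tip.label
tip_ids <- seq_len(Ntip(tree))
tips <- data.frame(x = x[tip_ids], y = y[tip_ids], label = tree$tip.label)

p <- ggplot() +
  geom_segment(data = segs, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = tips, aes(x = x, y = y, label = label), hjust = -0.1) +
  theme_minimal()
p
```

Because the coordinates come from the tree itself, the same code should work for any number of tips, including trees with polytomies.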

May 17, 2011 · 1 min · Scott Chamberlain

Comparison of functions for comparative phylogenetics

With all the packages (and beta stage groups of functions) for comparative phylogenetics in R (tested here: picante, geiger, ape, motmot, Liam Revell’s functions), I was simply interested in which functions to use in cases where multiple functions exist to do the same thing. I only show default settings, so perhaps these functions would differ under different parameter settings. [I am using a Mac 2.4 GHz i5, 4GB RAM] Get motmot here: https://r-forge.r-project.org/R/?group_id=782 ...

May 11, 2011 · 2 min · Scott Chamberlain

Treebase trees from R

UPDATE: See Carl Boettiger’s functions/package at GitHub for searching Treebase here. Treebase is a great resource for phylogenetic trees, and has a nice interface for searching for certain types of trees. However, if you want to simply download a lot of trees for analyses (like that in Davies et al.), then you want to be able to access trees in bulk (I believe Treebase folks are working on an API though). I wrote some simple code for extracting trees from Treebase.org. It reads an XML file of (in this case consensus) URLs for each tree, parses the XML, makes a vector of URLs, reads the nexus files with error checking, removes trees that gave errors, then makes a simple plot looking at metrics of the trees. ...
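The "read with error checking, then drop the failures" step can be sketched as below. This is not the original code: local temporary files stand in for the Treebase URLs so the example runs offline, and the helper name safe_read_nexus and the file names are hypothetical.

```r
library(ape)

# Hypothetical helper: return the tree, or NULL if the source cannot be read
safe_read_nexus <- function(src) {
  tryCatch(suppressWarnings(read.nexus(src)), error = function(e) NULL)
}

# Stand-ins for the vector of tree URLs: one valid nexus file, one missing file
good <- tempfile(fileext = ".nex")
write.nexus(rtree(5), file = good)
missing <- file.path(tempdir(), "does_not_exist.nex")
sources <- c(good, missing)

# Read everything, then remove the entries that gave errors
trees <- lapply(sources, safe_read_nexus)
trees <- Filter(Negate(is.null), trees)

# A simple metric per tree, ready for plotting
tipcounts <- sapply(trees, Ntip)
tipcounts
```

Returning NULL on failure keeps the loop going when a single file is malformed or unavailable, which matters when reading trees in bulk.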

May 3, 2011 · 1 min · Scott Chamberlain