Function for phylogeny resolution

UPDATE: Yeah, so the treeresstats function had a problem in one of the calculations. I fixed that and added some more calculations to the function.

I couldn’t find any functions to calculate the number of polytomies and related metrics. Here’s a simple function that gives several resolution metrics on a phylo tree object:

    # calculate tree resolution stats
    treeresstats <- function(x) {
      require(phangorn) # load the phangorn package (also loads ape)
      # internal node numbers: (Ntip + 1) to (Ntip + Nnode)
      todo <- (1 + Ntip(x)):(Ntip(x) + Nnode(x))
      trsize_tips <- Ntip(x)
      trsize_nodes <- Nnode(x)
      # number of children of each internal node
      polytomyvec <- sapply(todo, function(y) length(Children(x, y)))
      numpolys <- length(polytomyvec[polytomyvec > 2])
      numpolysbytrsize_tips <- numpolys/trsize_tips
      numpolysbytrsize_nodes <- numpolys/trsize_nodes
      proptipsdescpoly <- sum(polytomyvec[polytomyvec > 2])/trsize_tips
      propnodesdich <- length(polytomyvec[polytomyvec == 2])/trsize_nodes
      list(trsize_tips = trsize_tips,
           trsize_nodes = trsize_nodes,
           numpolys = numpolys,
           numpolysbytrsize_tips = numpolysbytrsize_tips,
           numpolysbytrsize_nodes = numpolysbytrsize_nodes,
           proptipsdescpoly = proptipsdescpoly,
           propnodesdich = propnodesdich)
    }

    # Single tree example
    tree <- read.tree(text="((((((artemisia_species:44,lactuca_species:44,senecio_species:44)6:46,campanula_species:90)5:17.75,((asclepias_species:71,galium_species:71)8:18.375,plantago_species:89.375)7:18.375)4:17.75,((cerastium_species:41.833332,silene_species:41.833332)10:41.833332,chenopodium_species:83.666664)9:41.833336)3:17.75,((geum_species:47,potentilla_species:47)12:48.125,lepidium_species:95.125)11:48.125)2:17.75,(bromus_species:12,elymus_species:12)13:149)1;")
    dat <- treeresstats(tree)
    dat

    # Many trees example
    maketrees <- function(numtrees) {
      require(ape); require(plyr)
      trees <- rmtree(numtrees, 20)
      # collapse branches shorter than tol into polytomies
      llply(trees, di2multi, tol = 0.5)
    }
    trees <- maketrees(30)
    dat <- ldply(trees, function(x) data.frame(treeresstats(x)))
    dat

Here’s output from the gist above: ...
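
For anyone who wants to see the core counting step without loading phangorn, here is a minimal base R sketch. It assumes the ape `phylo` convention that `edge` is a two-column parent/child matrix with tips numbered first; the little five-tip tree is made up for illustration, and this is not the gist's code, just the same idea on a hand-built matrix.

```r
# Hypothetical edge matrix in ape's `phylo` layout: column 1 = parent node,
# column 2 = child. Tips are 1..5, internal nodes are 6..8.
# Topology: ((t1,t2,t3),(t4,t5));
edge <- matrix(c(6, 7,
                 7, 1,
                 7, 2,
                 7, 3,
                 6, 8,
                 8, 4,
                 8, 5), ncol = 2, byrow = TRUE)
n_tips <- 5

# Number of children of each internal node (each row is one parent -> child link)
n_children <- table(edge[, 1])

# A polytomy is an internal node with more than two children
numpolys <- sum(n_children > 2)
numpolysbytrsize_tips <- numpolys / n_tips

# Proportion of internal nodes that are dichotomous (exactly two children)
propnodesdich <- mean(n_children == 2)

numpolys       # 1 (node 7 has three children)
propnodesdich  # 2/3 (nodes 6 and 8 are dichotomous)
```

With a real tree you would pass `tree$edge` instead of the hand-built matrix.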

January 13, 2012 · 2 min · Scott Chamberlain

Presenting results of logistic regression

So my advisor pointed out this ’new’ (well, 2004) way of plotting the results of a logistic regression. The idea was presented in a 2004 Bulletin of the Ecological Society of America issue (here). I tried to come up with a solution using, what else, ggplot2. I don’t have it quite all the way down: I am missing the second y-axis values for the histograms, but someone smarter than me can figure that part out (note that Hadley doesn’t want to support second y-axes in ggplot2, but they can probably be hacked on). ...
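
To make the idea concrete, here is a rough sketch of that kind of plot: the fitted logistic curve, with histograms of the predictor for the zeros along the bottom and for the ones hanging from the top. It uses base R graphics rather than ggplot2 (mainly because `axis(4, ...)` makes the second y-axis easy), and the data, the 30% bar height, and all variable names are made up for illustration, so treat it as one possible approach and not the method from the Bulletin article.

```r
set.seed(1)
# Simulated data (made up for illustration): one predictor, binary response
x <- runif(200, 0, 10)
y <- rbinom(200, 1, plogis(-3 + 0.6 * x))

# Fit the logistic regression and predict along a grid of x values
fit <- glm(y ~ x, family = binomial)
grid <- data.frame(x = seq(0, 10, length.out = 100))
grid$p <- predict(fit, newdata = grid, type = "response")

# Histogram counts of x for each outcome, on shared breaks
breaks <- seq(0, 10, by = 1)
h0 <- hist(x[y == 0], breaks = breaks, plot = FALSE)$counts
h1 <- hist(x[y == 1], breaks = breaks, plot = FALSE)$counts
max_count <- max(h0, h1)
bar_scale <- 0.3 / max_count  # tallest bar spans 30% of the probability axis

left <- head(breaks, -1)
right <- tail(breaks, -1)

# Fitted curve, y = 0 bars rising from the bottom, y = 1 bars hanging from the top
par(mar = c(5, 4, 2, 4))
plot(grid$x, grid$p, type = "l", ylim = c(0, 1),
     xlab = "x", ylab = "P(y = 1)")
rect(left, 0, right, h0 * bar_scale, border = "grey40")
rect(left, 1 - h1 * bar_scale, right, 1, border = "grey40")

# Second y-axis giving the count scale for the bottom histogram
axis(4, at = c(0, 0.3), labels = c(0, max_count))
mtext("Count (y = 0)", side = 4, line = 2.5)
```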

January 10, 2012 · 3 min · Scott Chamberlain

Weecology can has new mammal dataset

So the Weecology folks have published a large dataset on mammal communities in a data paper in Ecology. I know nothing about mammal communities, but that doesn’t mean one can’t play with the data… Their dataset consists of five csv files: communities, references, sites, species, and trapping data. Where are these sites, and by the way, do they vary much in altitude? Let’s zoom in on just the states ...

December 29, 2011 · 1 min · Scott Chamberlain

Recology is 1 yr old

This blog has lasted a whole year already. Thanks for reading and commenting. There are a couple of announcements:

Less blogging: I hope to put in many more years blogging here, but in full disclosure I am blogging for Journal of Ecology now, so I am going to be (and already have been) blogging less here.
More blogging: If anyone wants to write guest posts at Recology on the topics of using R for ecology and evolution, or open science, please contact me.
Different blogging: I was going to roll out the new dynamic views for this blog, but Google doesn’t allow javascript, which is how I include code using GitHub gists. Oh well…

Anywho, here is the breakdown of visits to this blog, visualized using #ggplot2, of course. There were a total of about 23,000 pageviews in the first year of this blog. ...

December 23, 2011 · 1 min · Scott Chamberlain

I Work For The Internet !

UPDATE: code and figure updated at 6:47 AM CST on 19 Dec ’11. Also, see Jarrett Byrnes’ (improved) fork of my gist here. The site I WORK FOR THE INTERNET is collecting pictures and first names (last name initials only) to show collective support against SOPA (the Stop Online Piracy Act). Please stop by their site and add your name/picture. I used the #rstats package twitteR, created by Jeff Gentry, to search for tweets from people signing this site with their picture, then plotted using ggplot2, and also used Hadley’s lubridate to round timestamps on tweets so I could bin tweets into time slots for plotting. ...
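
The binning step is simple enough to sketch on its own. The version below uses base R's `trunc()` instead of lubridate so it runs without extra packages (lubridate's `floor_date()` does the same kind of flooring); the timestamps are made up for illustration and are not real tweet data.

```r
# Hypothetical tweet timestamps (made up for illustration), in UTC
stamps <- as.POSIXct(c("2011-12-13 06:05:00", "2011-12-13 06:47:10",
                       "2011-12-13 07:15:30", "2011-12-13 07:59:59",
                       "2011-12-13 09:01:00"), tz = "UTC")

# Floor each timestamp to the start of its hour
hour_slot <- as.POSIXct(trunc(stamps, units = "hours"))

# Count tweets per hourly slot; the result is a data frame ready for plotting
counts <- as.data.frame(table(slot = format(hour_slot, "%Y-%m-%d %H:%M")),
                        responseName = "n_tweets")
counts
```

Hours with no tweets (08:00 here) simply do not appear in `counts`, so fill them in with zeros first if you want gaps to show up in a plot.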

December 13, 2011 · 1 min · Scott Chamberlain

LondonR meetings presentations

Three presentations have been uploaded to the LondonR meetings website. I especially enjoyed the JD Long presentation on the segue package for simulations using Amazon’s EC2.

December 11, 2011 · 1 min · Scott Chamberlain

rOpenSci won 3rd place in the PLoS-Mendeley Binary Battle!

I am part of the rOpenSci development team (along with Carl Boettiger, Karthik Ram, and Nick Fabina). Our website: http://ropensci.org/. Code at GitHub: https://github.com/ropensci. We entered two of our R packages for integrating with PLoS Journals (rplos) and Mendeley (RMendeley) in the Mendeley-PLoS Binary Battle. Get them at GitHub (rplos; RMendeley). These two packages allow users (from R! of course) to search and retrieve data from PLoS journals (including their altmetrics data), and from Mendeley. You could surely mash up data from both PLoS and Mendeley. That’s what’s cool about rOpenSci: we provide the tools, and leave it up to users’ vast creativity to do awesome things. 3rd place gives us a $1,000 prize, plus a Parrot AR Drone helicopter. ...

November 30, 2011 · 1 min · Scott Chamberlain

My talk on doing phylogenetics in R

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R. I know phyloch has a function, mrbayes, but I can’t get it to work… Slides: Phylogenetics in R

November 18, 2011 · 1 min · Scott Chamberlain

My little presentation on getting web data through R

With examples from rOpenSci R packages. P.S. I am no expert at this... Slides: Web data from R

October 28, 2011 · 1 min · Scott Chamberlain

Two new rOpenSci R packages are on CRAN

Carl Boettiger, a graduate student at UC Davis, just got two packages on CRAN. One is treebase, which handshakes with the Treebase API. The other is rfishbase, which connects with Fishbase, although I believe it just scrapes XML content as there is no API. See development on GitHub for treebase here, and for rfishbase here. Carl has some tutorials on treebase and rfishbase at his website here, and we have an official rOpenSci tutorial for treebase here. Basically, these two R packages let you search and pull down data from Treebase and Fishbase, which is pretty awesome. This improves workflow, and puts your data search and acquisition component into your code, instead of being a bunch of mouse clicks in a browser. ...

October 27, 2011 · 1 min · Scott Chamberlain