Take the INNGE survey on math and ecology

Many ecologists are R users, but we vary in our understanding of the math and statistical theory behind models we use. There is no clear consensus on what should be the basic mathematical training of ecologists. To learn what the community thinks, we invite you to fill out a short and anonymous questionnaire on this topic here. The questionnaire was designed by Frédéric Barraquand, a graduate student at Université Pierre et Marie Curie, in collaboration with the International Network of Next-Generation Ecologists (INNGE). ...

February 17, 2012 · 1 min · Scott Chamberlain

Scraping Flora of North America

So Flora of North America is an awesome collection of taxonomic information for plants across the continent. However, the information within is not easily machine readable. So, a little web scraping is called for. rfna is an R package to collect information from the Flora of North America. So far, you can: Get taxonomic names from web pages that index the names. Then get daughter URLs for those taxa, which then have their own 2nd order daughter URLs you can scrape, or scrape the 1st order daughter page. Query Asteraceae taxa for whether they have paleate or epaleate receptacles. This function is something I needed, but more functions will be made like this to get specific traits. Further functions will do search, etc. ...

January 27, 2012 · 2 min · Scott Chamberlain

RNetLogo - A package for running NetLogo from R

Described in a new Methods in Ecology and Evolution paper here, a new R package RNetLogo allows you to use NetLogo from R. NetLogo is a “multi-agent programmable modeling environment”. NetLogo can be used in individual- and agent-based modeling, and is used in the book Agent-based and Individual-based Modeling: A Practical Introduction by Railsback & Grimm. I have not tried the package yet, but it looks interesting. I am always a fan of running stand-alone programs from R if possible. ...

January 23, 2012 · 1 min · Scott Chamberlain

Function for phylogeny resolution

UPDATE: Yeah, so the treeresstats function had a problem in one of the calculations. I fixed that and added some more calculations to the function.

I couldn’t find any functions to calculate number of polytomies, and related metrics. Here’s a simple function that gives four metrics on a phylo tree object:

    # calculate tree resolution stats
    treeresstats <- function(x) {
      require(phangorn)  # load the phangorn package (also loads ape)
      # internal node numbers run from Ntip + 1 to Ntip + Nnode
      todo <- (1 + Ntip(x)):(Ntip(x) + Nnode(x))
      trsize_tips <- Ntip(x)
      trsize_nodes <- Nnode(x)
      # number of children of each internal node
      polytomyvec <- sapply(todo, function(y) length(Children(x, y)))
      # a node with more than two children is a polytomy
      numpolys <- length(polytomyvec[polytomyvec > 2])
      numpolysbytrsize_tips <- numpolys/trsize_tips
      numpolysbytrsize_nodes <- numpolys/trsize_nodes
      proptipsdescpoly <- sum(polytomyvec[polytomyvec > 2])/trsize_tips
      # proportion of internal nodes that are dichotomous
      propnodesdich <- length(polytomyvec[polytomyvec == 2])/trsize_nodes
      list(trsize_tips = trsize_tips, trsize_nodes = trsize_nodes,
           numpolys = numpolys,
           numpolysbytrsize_tips = numpolysbytrsize_tips,
           numpolysbytrsize_nodes = numpolysbytrsize_nodes,
           proptipsdescpoly = proptipsdescpoly,
           propnodesdich = propnodesdich)
    }

    # Single tree example
    tree <- read.tree(text="((((((artemisia_species:44,lactuca_species:44,senecio_species:44)6:46,campanula_species:90)5:17.75,((asclepias_species:71,galium_species:71)8:18.375,plantago_species:89.375)7:18.375)4:17.75,((cerastium_species:41.833332,silene_species:41.833332)10:41.833332,chenopodium_species:83.666664)9:41.833336)3:17.75,((geum_species:47,potentilla_species:47)12:48.125,lepidium_species:95.125)11:48.125)2:17.75,(bromus_species:12,elymus_species:12)13:149)1;")
    dat <- treeresstats(tree)
    dat

    # Many trees example
    maketrees <- function(numtrees) {
      require(ape); require(plyr)
      # simulate random trees with 20 tips each
      trees <- rmtree(numtrees, 20)
      # collapse branches shorter than tol into polytomies
      llply(trees, di2multi, tol = 0.5)
    }
    trees <- maketrees(30)
    dat <- ldply(trees, function(x) data.frame(treeresstats(x)))
    dat

Here’s output from the gist above: ...

January 13, 2012 · 2 min · Scott Chamberlain

Presenting results of logistic regression

So my advisor pointed out this ’new’ (well, 2004) way of plotting the results of logistic regression. The idea was presented in a 2004 Bulletin of the Ecological Society of America issue (here). I tried to come up with a solution using, what else, ggplot2. I don’t have it quite all the way down - I am missing the second y-axis values for the histograms, but someone smarter than me can figure that part out (note that Hadley doesn’t want to support second y-axes in ggplot2, but they can probably be hacked on). ...
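
For readers who want a feel for this kind of plot, here is a minimal sketch. It is not the code from the post: it uses base R graphics rather than ggplot2, the data are simulated, and all variable names (`x`, `y`, `brks`, `bar_scale`) are made up for illustration. It draws the fitted probability curve, with histogram bars for the 0 outcomes rising from the bottom and bars for the 1 outcomes hanging from the top.

```r
set.seed(1)

# Simulated data (hypothetical): a binary response that becomes more likely as x grows
x <- runif(200, 0, 10)
y <- rbinom(200, size = 1, prob = plogis(-3 + 0.6 * x))

# Fit the logistic regression
fit <- glm(y ~ x, family = binomial)

# Fitted probabilities over the range of x
newx <- seq(0, 10, length.out = 100)
pred <- predict(fit, newdata = data.frame(x = newx), type = "response")

# Histogram counts of x for each outcome, using shared breaks
brks <- seq(0, 10, by = 1)
h0 <- hist(x[y == 0], breaks = brks, plot = FALSE)
h1 <- hist(x[y == 1], breaks = brks, plot = FALSE)

# Scale the bars so the tallest one takes up 30% of the plot height
bar_scale <- 0.3 / max(h0$counts, h1$counts)
left <- brks[-length(brks)]
right <- brks[-1]

plot(newx, pred, type = "n", ylim = c(0, 1), xlab = "x", ylab = "Probability")
rect(left, 0, right, h0$counts * bar_scale, col = "grey80")      # y = 0 bars, from the bottom
rect(left, 1 - h1$counts * bar_scale, right, 1, col = "grey80")  # y = 1 bars, from the top
lines(newx, pred, lwd = 2)                                       # fitted curve on top
```

As in the post, this sketch leaves out a second y-axis showing the histogram counts.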

January 10, 2012 · 3 min · Scott Chamberlain

Weecology can has new mammal dataset

So the Weecology folks have published a large dataset on mammal communities in a data paper in Ecology. I know nothing about mammal communities, but that doesn’t mean one can’t play with the data… Their dataset consists of five csv files: communities, references, sites, species, and trapping data. Where are these sites, and by the way, do they vary much in altitude? Let’s zoom in on just the states ...

December 29, 2011 · 1 min · Scott Chamberlain

Recology is 1 yr old

This blog has lasted a whole year already. Thanks for reading and commenting. There are a couple of announcements: Less blogging: I hope to put in many more years blogging here, but in full disclosure I am blogging for Journal of Ecology now, so I am going to be (and already have been) blogging less here. More blogging: If anyone wants to write guest posts at Recology on the topics of using R for ecology and evolution, or open science, please contact me. Different blogging: I was going to roll out the new dynamic views for this blog, but Google doesn’t allow javascript, which is how I include code using GitHub gists. Oh well… Anywho, here is the breakdown of visits to this blog, visualized using #ggplot2, of course. There were a total of about 23,000 pageviews in the first year of this blog. ...

December 23, 2011 · 1 min · Scott Chamberlain

I Work For The Internet !

UPDATE: code and figure updated at 6:47 AM CST on 19 Dec ’11. Also, see Jarrett Byrnes’ (improved) fork of my gist here. The site I WORK FOR THE INTERNET is collecting pictures and first names (last name initials only) to show collective support against SOPA (the Stop Online Piracy Act). Please stop by their site and add your name/picture. I used the #rstats package twitteR, created by Jeff Gentry, to search for tweets from people signing this site with their picture, then plotted the results using ggplot2. I also used Hadley’s lubridate to round the timestamps on tweets so that I could bin tweets into time slots for plotting. ...
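
The binning step can be sketched roughly as follows. This is not the gist from the post: it uses base R (`trunc`) instead of lubridate so that it runs without extra packages, and the timestamps are made up for illustration.

```r
# Hypothetical tweet timestamps (not real data)
stamps <- as.POSIXct(c("2011-12-13 08:05:00", "2011-12-13 08:40:00",
                       "2011-12-13 09:15:00", "2011-12-13 11:59:00"),
                     tz = "UTC")

# Round each timestamp down to the start of its hour
hour_bin <- as.POSIXct(trunc(stamps, units = "hours"))

# Count tweets per hourly bin; the result is ready for plotting
counts <- as.data.frame(table(hour = format(hour_bin, "%Y-%m-%d %H:%M")),
                        stringsAsFactors = FALSE)
counts
```

With lubridate installed, `floor_date(stamps, "hour")` does the same rounding in one call.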

December 13, 2011 · 1 min · Scott Chamberlain

LondonR meetings presentations

Three presentations have been uploaded to the LondonR meetings website. I especially enjoyed the JD Long presentation on the segue package for simulations using Amazon’s EC2.

December 11, 2011 · 1 min · Scott Chamberlain

rOpenSci won 3rd place in the PLoS-Mendeley Binary Battle!

I am part of the rOpenSci development team (along with Carl Boettiger, Karthik Ram, and Nick Fabina). Our website: http://ropensci.org/. Code at GitHub: https://github.com/ropensci. We entered two of our R packages for integrating with PLoS Journals (rplos) and Mendeley (RMendeley) in the Mendeley-PLoS Binary Battle. Get them at GitHub (rplos; RMendeley). These two packages allow users (from R! of course) to search and retrieve data from PLoS journals (including their altmetrics data), and from Mendeley. You could surely mash up data from both PLoS and Mendeley. That’s what’s cool about rOpenSci - we provide the tools, and leave it up to users’ vast creativity to do awesome things. 3rd place gives us a $1,000 prize, plus a Parrot AR Drone helicopter. ...

November 30, 2011 · 1 min · Scott Chamberlain