New food web dataset

So, there is a new food web dataset out that was put in Ecological Archives here, and I thought I would play with it. The food web is from Otago Harbour, an intertidal mudflat ecosystem in New Zealand. The web contains 180 nodes, with 1,924 links. Fun stuff… igraph, default layout plot igraph, circle layout plot, nice My funky little gggraph function plotget the gggraph function, and make it better, here at Github

October 14, 2011 · 1 min · Scott Chamberlain

ggplot2 talk by Hadley Whickam at Google

June 17, 2011 · 0 min · Scott Chamberlain

phylogenetic signal simulations

I did a little simulation to examine how K and lambda vary in response to tree size (and how they compare to each other on the same simulated trees). I use Liam Revell’s functions fastBM to generate traits, and phylosig to measure phylogenetic signal. Two observations: First, it seems that lambda is more sensitive than K to tree size, but then lambda levels out at about 40 species, whereas K continues to vary around a mean of 1. Second, K is more variable than lambda at all levels of tree size (compare standard error bars). Does this make sense to those smart folks out there? ...

May 18, 2011 · 2 min · Scott Chamberlain

A simple function for plotting phylogenies in ggplot2

UPDATE: Greg jordan has a much more elegant way of plotting trees with ggplot2. See his links in the comments below. I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a 3 species tree right now, as I haven’t figured out how to generalize the approach to N species. It’s at https://gist.github.com/977207 Any ideas on how to improve this?

May 17, 2011 · 1 min · Scott Chamberlain

Phylometa from R - UDPATE

A while back I posted some messy code to run Phylometa from R, especially useful for processing the output data from Phylometa which is not easily done. The code is still quite messy, but it should work now. I have run the code with tens of different data sets and phylogenies so it should work. I fixed errors when parentheses came up against numbers in the output, and other things. You can use the code for up to 4 levels of your grouping variable. In addition, there are some lines of code to plot the effect sizes with confidence intervals, comparing random and fixed effects models and phylogenetic and traditional models. ...

April 1, 2011 · 2 min · Scott Chamberlain

basic ggplot2 network graphs ver2

I posted last week a simple function to plot networks using ggplot2 package. Here is version 2. I still need to work on figuring out efficient vertex placement. Changes in version 2: You have one of three options: use an igraph object, a matrix, or a dataframe (matrices will be converted to data frames within the function) If you have data on food webs similar to that provided in the Takapoto dataset provided in the NetIndices package, you can set trophic = “TRUE”, and gggraph will use the function TrophInd to assign trophic levels (the y axis value) to each vertex/node. You have to provide additional information along with this option such as what the imports and exports are, see NetIndices documentation. I added some simple error checking. if using method=“df” and trophic=“FALSE”, x axis placement of vertices is now done using the function degreex (see inside the fxn), which sorts vertices according to their degree (so the least connected species are on the left of the graph; note that species with the same degree are not stacked on the y-axis because e.g., two vertices of degree=5 would get x=3 then x=4). # ggraph Version 2 require(bipartite) require(igraph) require(ggplot2) # gggraph, version 3 g = an igraph graph object, a matrix, or data frame # vplace = type of vertex placement assignment, one of rnorm, runif, etc. # method = one of 'df' for data frame, 'mat' for matrix or 'igraph' for an # igraph graph object trophic = TRUE or FALSE for using Netindices # function TrophInd to determine trophic level (y value in graph) # trophinames = columns in matrix or dataframe to use for calculating # trophic level import = named or refereced by col# columns of matrix or # dataframe to use for import argument of TrophInd export = named or # refereced by col# columns of matrix or dataframe to use for export # argument of TrophInd dead = named or refereced by col# columns of matrix # or dataframe to use for dead argument of TrophInd gggraph <- function(g, vplace = rnorm, method, trophic = "FALSE", trophinames, import, export) { degreex <- function(x) { degreecol <- apply(x, 2, function(y) length(y[y > 0])) degreerow <- apply(x, 1, function(y) length(y[y > 0])) degrees <- sort(c(degreecol, degreerow)) df <- data.frame(degrees, x = seq(1, length(degrees), 1)) df$value <- rownames(df) df } # require igraph if (!require(igraph)) stop("must first install 'igraph' package.") # require ggplot2 if (!require(ggplot2)) stop("must first install 'ggplot2' package.") if (method == "df") { if (class(g) == "matrix") { g <- as.data.frame(g) } if (class(g) != "data.frame") stop("object must be of class 'data.frame.'") if (trophic == "FALSE") { # data preparation from adjacency matrix temp <- data.frame(expand.grid(dimnames(g))[1:2], as.vector(as.matrix(g))) temp <- temp[(temp[, 3] > 0) & !is.na(temp[, 3]), ] temp <- temp[sort.list(temp[, 1]), ] g_df <- data.frame(rows = temp[, 1], cols = temp[, 2], freqint = temp[, 3]) g_df$id <- 1:length(g_df[, 1]) g_df <- data.frame(id = g_df[, 4], rows = g_df[, 1], cols = g_df[, 2], freqint = g_df[, 3]) g_df_ <- melt(g_df, id = c(1, 4)) xy_s <- data.frame(degreex(g), y = rnorm(length(unique(g_df_$value)))) g_df_2 <- merge(g_df_, xy_s, by = "value") } else if (trophic == "TRUE") { # require NetIndices if (!require(NetIndices)) stop("must first install 'NetIndices' package.") # data preparation from adjacency matrix temp <- data.frame(expand.grid(dimnames(g[-trophinames, -trophinames]))[1:2], as.vector(as.matrix(g[-trophinames, -trophinames]))) temp <- temp[(temp[, 3] > 0) & !is.na(temp[, 3]), ] temp <- temp[sort.list(temp[, 1]), ] g_df <- data.frame(rows = temp[, 1], cols = temp[, 2], freqint = temp[, 3]) g_df$id <- 1:length(g_df[, 1]) g_df <- data.frame(id = g_df[, 4], rows = g_df[, 1], cols = g_df[, 2], freqint = g_df[, 3]) g_df_ <- melt(g_df, id = c(1, 4)) xy_s <- data.frame(value = unique(g_df_$value), x = rnorm(length(unique(g_df_$value))), y = TrophInd(g, Import = import, Export = export)[, 1]) g_df_2 <- merge(g_df_, xy_s, by = "value") } # plotting p <- ggplot(g_df_2, aes(x, y)) + geom_point(size = 5) + geom_line(aes(size = freqint, group = id)) + geom_text(size = 3, hjust = 1.5, aes(label = value)) + theme_bw() + opts(panel.grid.major = theme_blank(), panel.grid.minor = theme_blank(), axis.text.x = theme_blank(), axis.text.y = theme_blank(), axis.title.x = theme_blank(), axis.title.y = theme_blank(), axis.ticks = theme_blank(), panel.border = theme_blank(), legend.position = "none") p # return graph } else if (method == "igraph") { if (class(g) != "igraph") stop("object must be of class 'igraph.'") # data preparation from igraph object g_ <- get.edgelist(g) g_df <- as.data.frame(g_) g_df$id <- 1:length(g_df[, 1]) g_df <- melt(g_df, id = 3) xy_s <- data.frame(value = unique(g_df$value), x = vplace(length(unique(g_df$value))), y = vplace(length(unique(g_df$value)))) g_df2 <- merge(g_df, xy_s, by = "value") # plotting p <- ggplot(g_df2, aes(x, y)) + geom_point(size = 2) + geom_line(size = 0.3, aes(group = id, linetype = id)) + geom_text(size = 3, hjust = 1.5, aes(label = value)) + theme_bw() + opts(panel.grid.major = theme_blank(), panel.grid.minor = theme_blank(), axis.text.x = theme_blank(), axis.text.y = theme_blank(), axis.title.x = theme_blank(), axis.title.y = theme_blank(), axis.ticks = theme_blank(), panel.border = theme_blank(), legend.position = "none") p # return graph } else stop(paste("do not recognize method = \"", method, "\";\nmethods are \"df\" and \"igraph\"", sep = "")) } # Eg library(NetIndices) data(Takapoto) gggraph(Takapoto, vplace = rnorm, method = "df", trophic = "TRUE", trophinames = c(8:10), import = "CO2", export = c("CO2", "Sedimentation", "Grazing")) ...

March 23, 2011 · 5 min · Scott Chamberlain

basic ggplot2 network graphs

I have been looking around on the web and have not found anything yet related to using ggplot2 for making graphs/networks. I put together a few functions to make very simple graphs. The bipartite function especially is not ideal, as of course we only want to allow connections between unlike nodes, not all nodes. These functions do not, obviously, take full advantage of the power of ggplot2, but it’s a start. ...

March 17, 2011 · 2 min · Scott Chamberlain

Five ways to visualize your pairwise comparisons

UPDATE: At the bottom are two additional methods, and some additions (underlined) are added to the original 5 methods. Thanks for all the feedback… -Also, another post here about ordered-categorical data-Also #2, a method combining splom and hexbin packages here, for larger datasets ...

March 5, 2011 · 3 min · Scott Chamberlain

Farmer's markets data

I combined USDA data on farmer’s markets in the US with population census data to get an idea of the disparity in farmers markets by state, and then also expressed per capita. Download USDA data here. The formatted file I used below is here (in excel format, although I read into R as csv file). The census data is read from url as below. California has a ton of absolute number of farmer’s markets, but Vermont takes the cake by far with number of markets per capita. Iowa comes in a distant second behind Vermont in markets per capita. ...

February 16, 2011 · 2 min · Scott Chamberlain

Troubling news for the teaching of evolution

[UPDATE: i remade the maps in green, hope that helps…] A recent survey reported in Science (“Defeating Creationism in the Courtroom, but not in the Classroom”) found that biology teachers in high school do not often accept the basis of their discipline, as do teachers in other disciplines, and thus may not teach evolution appropriately. Read more here: New York Times. I took a little time to play with the data provided online along with the Science article. The data is available on the Science website along with the article, and the dataset I read into R is unchanged from the original. The states abbreviations file is here (as a .xls). Here goes: ...

February 9, 2011 · 3 min · Scott Chamberlain