Logistic plot reboot

Someone asked today about plotting something like this. I wrote a few functions previously to do it. However, ggplot2 has changed since then, and one of the functions no longer works. So I changed opts() to theme(), theme_blank() to element_blank(), and panel.background = element_blank() to plot.background = element_blank() so that the histograms show up with the line plot and do not cover it. The new functions:

    loghistplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # get bin numbers
      bin_no <- max(hist(data$x, plot = FALSE)$counts) + 5

      # create plots
      a <- ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")

      theme_loghist <- list(
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              axis.text.y = element_blank(),
              axis.text.x = element_blank(),
              axis.ticks = element_blank(),
              panel.border = element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank())
      )

      b <- ggplot(data[data$y == unique(data$y)[1], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(limits=c(0,bin_no)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      c <- ggplot(data[data$y == unique(data$y)[2], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(trans='reverse', limits=c(bin_no,0)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      # draw the three plots on top of each other, line plot last
      grid.newpage()
      pushViewport(viewport(layout = grid.layout(1,1)))
      vpa_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpb_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpc_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      print(b, vp = vpb_)
      print(c, vp = vpc_)
      print(a, vp = vpa_)
    }

    logpointplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # create plots
      ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_point(size = 3, alpha = 0.5, position = position_jitter(w=0, h=0.02)) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")
    }

Install ggplot2 and gridExtra if you don’t have them: ...
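To make the pieces of loghistplot() a bit more concrete, here is a minimal base-R sketch of the two things the plot combines: the fitted logistic curve and the per-group histogram counts. The data are simulated and the object names (dat, fit, x_seq, prob, counts_0, counts_1) are made up for illustration; this is not the code from the post. As a side note, newer ggplot2 releases expect the family to be passed to geom_smooth() as method.args = list(family = "binomial") rather than as family = "binomial".

```r
set.seed(1)  # simulated data, for illustration only

# Hypothetical predictor and a 0/1 response generated from a logistic model
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(0.5 + 1.5 * x))
dat <- data.frame(x = x, y = y)

# The curve: a binomial GLM, the same kind of model the smoother fits
fit   <- glm(y ~ x, family = binomial, data = dat)
x_seq <- seq(min(dat$x), max(dat$x), length.out = 100)
prob  <- predict(fit, newdata = data.frame(x = x_seq), type = "response")

# The histograms: counts of x within each outcome group
counts_0 <- hist(dat$x[dat$y == 0], plot = FALSE)$counts
counts_1 <- hist(dat$x[dat$y == 1], plot = FALSE)$counts

# Upper limit for the histogram axes, computed the same way as in loghistplot()
bin_no <- max(hist(dat$x, plot = FALSE)$counts) + 5
```

The predicted probabilities in prob trace the black line, while counts_0 and counts_1 correspond to the two grey histograms drawn behind it.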

May 22, 2014 · 2 min · Scott Chamberlain

R ecology workshop

After my presentation yesterday to a group of grad students on R resources, I gave a presentation today introducing R data manipulation, visualizations, and analyses/visualizations of bipartite networks and community-level analyses (diversity, rarefaction, ordination, etc.). As I said yesterday, I’ve been playing with two ways to make reproducible presentations in R: RStudio’s presentations, built into the RStudio IDE, and Slidify. Yesterday I went with RStudio’s product; today I used Slidify. See the Markdown file for the presentation here. ...

July 31, 2013 · 1 min · Scott Chamberlain

ggplot2 maps with insets

UPDATE: changed data source so that the entire example can be run by anyone on their own machine. Also, per Joachim’s suggestion, I put a box around the blown-up area of the map. In addition, rgeos and maptools were removed because they are not needed.

Here’s a quick demo of creating a map with an inset within it using ggplot. The inset is achieved using the gridExtra package.

Install libraries

    # grid ships with R, so it does not need to be installed
    install.packages(c("ggplot2", "maps", "gridExtra"))
    library("ggplot2")
    library("maps")
    library("grid")
    library("gridExtra")

Create a data frame

    dat <- data.frame(ecosystem = rep(c("oak", "steppe", "prairie"), each = 8),
                      lat = rnorm(24, mean = 51, sd = 1),
                      lon = rnorm(24, mean = -113, sd = 5))
    head(dat)
    #>   ecosystem      lat       lon
    #> 1       oak 49.58285 -107.6930
    #> 2       oak 52.58942 -116.6920
    #> 3       oak 50.49277 -114.5542
    #> 4       oak 50.05943 -116.5660
    #> 5       oak 51.76492 -112.1457
    #> 6       oak 52.82153 -112.8858

Get maps using the maps library

Get a map of Canada ...
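The excerpt is cut off before the inset is drawn. One common way to place an inset, assumed here rather than taken from the post, is to define a small grid viewport and draw into it. The sketch below uses plain rectangles in place of the maps; the viewport names and positions are hypothetical.

```r
library(grid)

# Hypothetical viewports: a full-page main panel and a small inset
# in the upper-right corner (positions are illustrative only)
vp_main  <- viewport(x = 0.5, y = 0.5, width = 1, height = 1)
vp_inset <- viewport(x = 0.8, y = 0.8, width = 0.3, height = 0.3)

grid.newpage()

# Main panel: a filled rectangle standing in for the full map
pushViewport(vp_main)
grid.rect(gp = gpar(fill = "grey95"))
popViewport()

# Inset: a boxed rectangle standing in for the zoomed-in map
pushViewport(vp_inset)
grid.rect(gp = gpar(fill = "white", col = "black"))
grid.text("inset")
popViewport()
```

With ggplot objects, the same viewports can be passed to print(), for example print(p, vp = vp_inset), where p is a plot you have already built.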

August 22, 2012 · 3 min · Scott Chamberlain

Visualize your Github stats (forks and watchers) in a browser with R!

So OpenCPU is pretty awesome. You can run R in a browser using URL calls with an alphanumeric code (e.g., x3e50ee0780) defining a stored function, and any arguments you pass to it. Go here to store a function. And you can output lots of different types of things: png, pdf, json, etc. (see here). Here’s a function I created (originally from https://gist.github.com/2602432):

    # Store a function with many lines
    # Go Here: http://beta.opencpu.org/apps/opencpu.demo/storefunction/
    # number: x3e50ee0780
    # link: http://beta.opencpu.org/R/call/store:tmp/x3e50ee0780/png?id='ropensci'&type='org'
    the <- function (id = 'hadley', type = 'user') {
      require(RCurl); require(RJSONIO); require(ggplot2); require(reshape2); require(plyr)

      # pick the API endpoint for a user or an organization
      if(type == 'user'){
        url = "https://api.github.com/users/"
      } else if(type == 'org'){
        url = "https://api.github.com/orgs/"
      } else stop("parameter 'type' has to be either 'user' or 'org' ")

      url2 <- paste(url, id, "/repos?per_page=100", sep = "")
      xx <- getURL(url2)
      tt <- fromJSON(xx)
      if(!length(tt) == 1) { tt <- tt } else {
        stop("user or organization not found - search GitHub? - https://github.com/")
      }

      # one row per repo, then reshape to long format for plotting
      out <- ldply(tt, function(x) t(c(x$name, x$forks, x$watchers)))
      names(out) <- c("Repo", "Forks", "Watchers")
      out$Forks <- as.integer(out$Forks)
      out$Watchers <- as.integer(out$Watchers)
      out2 <- melt(out, id = 1)
      out2$value <- as.numeric(out2$value)
      out2$Repo <- as.factor(out2$Repo)
      repoorder <- unique(out2[order(out2$value, decreasing=FALSE),][,1])
      out2$Repo <- factor(out2$Repo, levels = repoorder)

      # stat = "identity" plots the values as given instead of counting rows
      ggplot(out2, aes(Repo, value)) +
        geom_bar(stat = "identity") +
        coord_flip() +
        facet_wrap(~variable) +
        theme_bw(base_size = 18)
    }

    the() # default for hadley
    the(id='defunkt', type='user') # works - a user with even more repos than Hadley
    the(id='ropensci', type='org') # works - organization example
    the(id='jeroenooms', type='user') # works - user example
    the(id='SChamberlain', type='org') # error message - mismatch of username with org type
    the(id='adsff', type='user') # error message - name does not exist

It makes a ggplot2 graphic of your watchers and forks on each repo (up to 100 repos), sorted by descending number of forks/watchers. ...
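The reshaping and ordering step in the function can be tried without calling the GitHub API. The sketch below uses made-up counts for three hypothetical repositories and base R in place of reshape2's melt(); it only illustrates how the factor levels end up controlling the bar order.

```r
# Made-up counts for three hypothetical repositories
out <- data.frame(
  Repo     = c("alpha", "beta", "gamma"),
  Forks    = c(5L, 40L, 12L),
  Watchers = c(30L, 90L, 8L),
  stringsAsFactors = FALSE
)

# Wide -> long: one row per repo/variable pair
# (a base-R stand-in for melt(out, id = 1))
out2 <- data.frame(
  Repo     = rep(out$Repo, times = 2),
  variable = rep(c("Forks", "Watchers"), each = nrow(out)),
  value    = c(out$Forks, out$Watchers),
  stringsAsFactors = FALSE
)

# Sort rows by value (ascending) and keep the first appearance of each repo
repoorder <- unique(out2$Repo[order(out2$value, decreasing = FALSE)])

# Factor levels set the order in which the bars are drawn
out2$Repo <- factor(out2$Repo, levels = repoorder)
levels(out2$Repo)
```

Because the levels are ascending and the plot uses coord_flip(), the repos with the largest counts end up toward the top of the chart.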

May 5, 2012 · 2 min · Scott Chamberlain

Presenting results of logistic regression

So my advisor pointed out this ’new’ (well, 2004) way of plotting the results of logistic regression. The idea was presented in a 2004 issue of the Bulletin of the Ecological Society of America (here). I tried to come up with a solution using, what else, ggplot2. I don’t have it quite all the way down: I am missing the second y-axis values for the histograms, but someone smarter than me can figure that part out (note that Hadley doesn’t want to support second y-axes in ggplot2, but they can probably be hacked on). ...

January 10, 2012 · 3 min · Scott Chamberlain

Weecology can has new mammal dataset

So the Weecology folks have published a large dataset on mammal communities in a data paper in Ecology. I know nothing about mammal communities, but that doesn’t mean one can’t play with the data… Their dataset consists of five csv files: communities, references, sites, species, and trapping data. Where are these sites, and by the way, do they vary much in altitude? Let’s zoom in on just the states ...

December 29, 2011 · 1 min · Scott Chamberlain

Recology is 1 yr old

This blog has lasted a whole year already. Thanks for reading and commenting. There are a couple of announcements: Less blogging: I hope to put in many more years blogging here, but in full disclosure I am blogging for Journal of Ecology now, so I am going to be (and already have been) blogging less here. More blogging: If anyone wants to write guest posts at Recology on the topics of using R for ecology and evolution, or open science, please contact me. Different blogging: I was going to roll out the new dynamic views for this blog, but Google doesn’t allow javascript, which is how I include code using GitHub gists. Oh well… Anywho, here is the breakdown of visits to this blog, visualized using #ggplot2, of course. There were a total of about 23,000 pageviews in the first year of this blog. ...

December 23, 2011 · 1 min · Scott Chamberlain

I Work For The Internet !

UPDATE: code and figure updated at 6:47 AM CST on 19 Dec ‘11. Also, see Jarrett Byrnes’ (improved) fork of my gist here. The site I WORK FOR THE INTERNET is collecting pictures and first names (last name initials only) to show collective support against SOPA (the Stop Online Piracy Act). Please stop by their site and add your name/picture. I used the #rstats package twitteR, created by Jeff Gentry, to search for tweets from people signing this site with their picture, then plotted them using ggplot2. I also used Hadley’s lubridate to round the timestamps on tweets so that tweets could be binned into time slots for plotting. ...
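The idea of rounding timestamps to bin tweets into time slots can be sketched in a few lines. This uses base R rather than lubridate, and the timestamps are made up, so treat it as an illustration of the idea and not the code from the post.

```r
# Made-up timestamps standing in for tweet creation times (UTC)
stamps <- as.POSIXct(
  c("2011-12-13 09:05:00", "2011-12-13 09:40:00",
    "2011-12-13 10:10:00", "2011-12-13 11:55:00"),
  tz = "UTC"
)

# Round each timestamp to the nearest hour
slots <- as.POSIXct(round(stamps, "hours"))

# Count tweets per one-hour slot
counts <- table(format(slots, "%Y-%m-%d %H:%M"))
counts
```

The resulting counts per slot are what would be handed to ggplot2 for the bar or line plot.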

December 13, 2011 · 1 min · Scott Chamberlain

My talk on doing phylogenetics in R

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R. I know phyloch has the function mrbayes, but I can’t get it to work…

November 18, 2011 · 1 min · Scott Chamberlain

My little presentation on getting web data through R

With examples from rOpenSci R packages. p.s. I am no expert at this...

October 28, 2011 · 1 min · Scott Chamberlain