1000 commits to taxize

Just today we’ve hit 1000 commits on taxize! taxize is an R client to search across lots of taxonomic databases on the web. In honor of the 1000 commit milestone, here are some stats on the project. Before that: lots of people have contributed to taxize, and it’s a big group effort: Eduard Szöcs, Zachary Foster, Carl Boettiger, Karthik Ram, Jari Oksanen, Francis Michonneau, Oliver Keyes, David LeBauer, Ben Marwick, and Anirvan Chatterjee. In addition, we’ve had lots of feedback from users, including feature requests and bug reports, making taxize a lot better. ...

November 28, 2014 · 3 min · Scott Chamberlain

Intro to alpha ckanr - R client for CKAN RESTful API

Recently I needed to create a client for scraping museum metadata to help out some folks that use that kind of data. It’s called musemeta. One of the data sources in that package uses the open source data portal software CKAN, and so we can interact with the CKAN API to get data. Since many groups can use the CKAN API and related infrastructure because it’s open source, I thought why not have a general purpose R client for this, since there are other clients for Python, PHP, Ruby, etc. ...

November 26, 2014 · 8 min · Scott Chamberlain

Fun with the GitHub API

Recently I’ve had fun playing with the GitHub API, and here are some notes to self about this fun having.

Setup

Get/load packages

    install.packages(c('devtools','jsonlite','httr','yaml'))
    library("devtools")
    library("httr")
    library("yaml")

Define a vector of package names

    pkgs <- c("alm", "bmc", "bold", "clifro", "ecoengine", "elastic", "fulltext",
              "geonames", "gistr", "RNeXML", "rnoaa", "rnpn", "traits", "rplos",
              "rsnps", "rWBclimate", "solr", "spocc", "taxize", "togeojson", "treeBASE")
    pkgs <- sort(pkgs)

Define functions

    github_auth <- function(appname = getOption("gh_appname"), key = getOption("gh_id"),
                            secret = getOption("gh_secret")) {
      if (is.null(getOption("gh_token"))) {
        myapp <- oauth_app(appname, key, secret)
        token <- oauth2.0_token(oauth_endpoints("github"), myapp)
        options(gh_token = token)
      } else {
        token <- getOption("gh_token")
      }
      return(token)
    }

    make_url <- function(x, y, z) {
      sprintf("https://api.github.com/repos/%s/%s/%s", x, y, z)
    }

    process_result <- function(x) {
      stop_for_status(x)
      if (!x$headers$`content-type` == "application/json; charset=utf-8")
        stop("content type mismatch")
      tmp <- content(x, as = "text")
      jsonlite::fromJSON(tmp, flatten = TRUE)
    }

    parse_file <- function(x) {
      tmp <- gsub("\n\\s+", "\n",
                  paste(vapply(strsplit(x, "\n")[[1]], RCurl::base64Decode,
                               character(1), USE.NAMES = FALSE), collapse = " "))
      lines <- readLines(textConnection(tmp))
      vapply(lines, gsub, character(1), pattern = "\\s", replacement = "", USE.NAMES = FALSE)
    }

    request <- function(owner = "ropensci", repo, file="DESCRIPTION", ...) {
      req <- GET(make_url(owner, repo, paste0("contents/", file)),
                 config = c(token = github_auth(), ...))
      if(req$status_code != 200) {
        NA
      } else {
        cts <- process_result(req)$content
        parse_file(cts)
      }
    }

    has_term <- function(what, ...) any(grepl(what, request(...)))
    has_file <- function(what, ...) if(all(is.na(request(file = what, ...)))) FALSE else TRUE

Do stuff

Does a package depend on a particular package? e.g., look for httr in the DESCRIPTION file (which is the default file name in request() above) ...
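As an offline aside, the same kind of dependency check can be sketched against DESCRIPTION text that is already in hand, with no API call or token. This is only a rough sketch and not the approach used in the post: it swaps the grepl() search for base R's read.dcf(), and the names desc_text, depends_on and examplepkg are made up for illustration.

```r
# Hypothetical DESCRIPTION contents; "examplepkg" is invented for this sketch.
desc_text <- c(
  "Package: examplepkg",
  "Version: 0.1.0",
  "Imports: httr (>= 0.5), jsonlite",
  "Suggests: testthat"
)

# read.dcf() parses the "Field: value" format used by DESCRIPTION files.
# Fields that are absent (here, Depends) come back as NA.
con <- textConnection(desc_text)
desc <- read.dcf(con, fields = c("Depends", "Imports", "Suggests"))
close(con)

# Hypothetical helper: is `pkg` listed in any of the dependency fields?
depends_on <- function(desc, pkg) {
  fields <- desc[1, ]
  fields <- fields[!is.na(fields)]                 # drop absent fields
  deps <- trimws(unlist(strsplit(fields, ",")))    # one entry per package
  deps <- sub("\\s*\\(.*\\)$", "", deps)           # strip version constraints
  pkg %in% deps
}

depends_on(desc, "httr")
depends_on(desc, "RCurl")
```

Matching whole package names rather than searching for a substring avoids false hits, for example a search for "httr" matching a differently named package that merely contains those letters.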

November 26, 2014 · 3 min · Scott Chamberlain

sofa - reboot

I’ve reworked sofa recently after someone reported a bug in the package. Since the last post on this package on 2013-06-21, there have been a bunch of changes:

- Removed the sofa_ prefix from all functions as it wasn’t really necessary.
- Replaced rjson/RJSONIO with jsonlite for JSON I/O.
- New functions:
  - revisions() - to get the revision numbers for a document.
  - uuids() - get any number of UUIDs - e.g., if you want to set document IDs with UUIDs
- Most functions that deal with documents are prefixed with doc_
- Functions that deal with databases are prefixed with db_
- Simplified all code, reducing duplication
- All functions take cushion as the first parameter, for consistency’s sake.
- Changed cushion() function so that you can only register one cushion with each function call, and the function takes parameters for each element now: name (name of the cushion, whatever you want), user (user name, if applicable), pwd (password, if applicable), type (one of localhost, cloudant, or iriscouch), and port (if applicable).
- Changed package license from CC0 to MIT

There’s still more to do, but I’m pretty happy with the recent changes, and I hope at least some find the package useful. Also, would love people to try it out as all bugs are shallow and all that… ...

November 18, 2014 · 5 min · Scott Chamberlain

Conditionality meta-analysis data

The paper

One paper from my graduate work asked, most generally, “How much does the variation in magnitudes and signs of species interaction outcomes vary?”. More specifically, we wanted to know if variation differed among species interaction classes (mutualism, competition, predation), and among various “gradients” (space, time, etc.). To answer this question, we used a meta-analysis approach (rather than e.g., a field experiment). We published the paper recently. p.s. I really really wish we had put it in an open access journal… ...

October 6, 2014 · 4 min · Scott Chamberlain

rsunlight - R client for Sunlight Labs APIs

My last blog post on this package was so long ago that the package wrapped both New York Times APIs and Sunlight Labs APIs, and was called govdat. I split that package up into rsunlight for Sunlight Labs APIs and rtimes for some New York Times APIs. rtimes is in development on GitHub. We’ve updated the package to include four sets of functions, one set for each of four Sunlight Labs APIs (with a separate prefix for each API): ...

August 11, 2014 · 6 min · Scott Chamberlain

analogsea - v0.1 notes

My last blog post introduced the R package I’m working on, analogsea, an R client for the Digital Ocean API. Things have changed a bit, including filling out more functions for all API endpoints, and incorporating feedback from Hadley and Karthik. The package is at v0.1 now, so I thought I’d say a few things about how it works. Note that Digital Ocean’s v2 API is in beta stage now, so the current version of analogsea at v0.1 works with their v1 API. The v2 branch of analogsea is being developed for their v2 API. ...

June 18, 2014 · 5 min · Scott Chamberlain

analogsea - an R client for the Digital Ocean API

I think this package name is my best yet. Maybe it doesn’t make sense though? At least it did at the time… Anyway, the main motivation for this package was to be able to automate spinning up Linux boxes to do cloud R/RStudio work. Of course if you are a command line native this is all easy for you, but if you are afraid of the command line and/or just don’t want to deal with it, this tool will hopefully help. ...

May 28, 2014 · 5 min · Scott Chamberlain

Logistic plot reboot

Someone asked about plotting something like this today. I wrote a few functions previously to do something like this. However, since then ggplot2 has changed, and one of the functions no longer works. Hence, I changed opts() to theme(), theme_blank() to element_blank(), and panel.background = element_blank() to plot.background = element_blank() to get the histograms to show up with the line plot and not cover it.

The new functions:

    loghistplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # get bin numbers
      bin_no <- max(hist(data$x, plot = FALSE)$counts) + 5

      # create plots
      a <- ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")

      theme_loghist <- list(
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              axis.text.y = element_blank(),
              axis.text.x = element_blank(),
              axis.ticks = element_blank(),
              panel.border = element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank())
      )

      b <- ggplot(data[data$y == unique(data$y)[1], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(limits=c(0,bin_no)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      c <- ggplot(data[data$y == unique(data$y)[2], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(trans='reverse', limits=c(bin_no,0)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      grid.newpage()
      pushViewport(viewport(layout = grid.layout(1,1)))
      vpa_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpb_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpc_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      print(b, vp = vpb_)
      print(c, vp = vpc_)
      print(a, vp = vpa_)
    }

    logpointplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # create plots
      ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_point(size = 3, alpha = 0.5, position = position_jitter(w=0, h=0.02)) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")
    }

Install ggplot2 and gridExtra if you don’t have them: ...

May 22, 2014 · 2 min · Scott Chamberlain

cowsay - ascii messages and warnings for R

The history

Cowsay is a terminal program that generates ascii pictures of a cow saying what you tell the cow to say in a bubble. See the Wikipedia page for more information: https://en.wikipedia.org/wiki/Cowsay

Install cowsay to use in your terminal (on OSX):

    brew update
    brew install cowsay

Type cowsay hello world!, and you get:

     ______________
    < hello world! >
     --------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

Optionally, you can install fortune to get pseudorandom messages from a database of quotations. On OSX do brew install fortune, then you can pipe a fortune quote to cowsay: ...

February 20, 2014 · 4 min · Scott Chamberlain