gistr - R client for GitHub gists

GitHub hosts gists at https://gist.github.com/, where we can share code, text, images, maps, plots, etc. super easily, without having to open up a repo. GitHub gists are a great way to throw up an example use case to show someone, or to show code that’s throwing errors to a support person. In addition, there’s API access, which means we can interact with gists not just from the web interface, but from the command line or any programming language. There are clients for Node.js, Ruby, Python, and on and on. But AFAIK there wasn’t one for R. Along with Ramnath and others, we’ve been working on an R client for gists. v0.1 is now on CRAN. Below is an overview. ...

January 5, 2015 · 7 min · Scott Chamberlain

Intro to alpha ckanr - R client for CKAN RESTful API

Recently I needed to create a client for scraping museum metadata to help out some folks that use that kind of data. It’s called musemeta. One of the data sources in that package uses the open source data portal software CKAN, and so we can interact with the CKAN API to get data. Since many groups can use the CKAN API and infrastructure because it’s open source, I thought why not have a general purpose R client for this, since there are other clients for Python, PHP, Ruby, etc. ...

November 26, 2014 · 8 min · Scott Chamberlain

Fun with the GitHub API

Recently I’ve had fun playing with the GitHub API, and here are some notes to self about this fun having.

Setup

Get/load packages

    install.packages(c('devtools','jsonlite','httr','yaml'))
    library("devtools")
    library("httr")
    library("yaml")

Define a vector of package names

    pkgs <- c("alm", "bmc", "bold", "clifro", "ecoengine", "elastic", "fulltext",
              "geonames", "gistr", "RNeXML", "rnoaa", "rnpn", "traits", "rplos",
              "rsnps", "rWBclimate", "solr", "spocc", "taxize", "togeojson", "treeBASE")
    pkgs <- sort(pkgs)

Define functions

    github_auth <- function(appname = getOption("gh_appname"), key = getOption("gh_id"),
                            secret = getOption("gh_secret")) {
      if (is.null(getOption("gh_token"))) {
        myapp <- oauth_app(appname, key, secret)
        token <- oauth2.0_token(oauth_endpoints("github"), myapp)
        options(gh_token = token)
      } else {
        token <- getOption("gh_token")
      }
      return(token)
    }

    make_url <- function(x, y, z) {
      sprintf("https://api.github.com/repos/%s/%s/%s", x, y, z)
    }

    process_result <- function(x) {
      stop_for_status(x)
      if (!x$headers$`content-type` == "application/json; charset=utf-8")
        stop("content type mismatch")
      tmp <- content(x, as = "text")
      jsonlite::fromJSON(tmp, flatten = TRUE)
    }

    parse_file <- function(x) {
      tmp <- gsub("\n\\s+", "\n",
                  paste(vapply(strsplit(x, "\n")[[1]], RCurl::base64Decode, character(1), USE.NAMES = FALSE), collapse = " "))
      lines <- readLines(textConnection(tmp))
      vapply(lines, gsub, character(1), pattern = "\\s", replacement = "", USE.NAMES = FALSE)
    }

    request <- function(owner = "ropensci", repo, file="DESCRIPTION", ...) {
      req <- GET(make_url(owner, repo, paste0("contents/", file)), config = c(token = github_auth(), ...))
      if(req$status_code != 200) { NA } else {
        cts <- process_result(req)$content
        parse_file(cts)
      }
    }

    has_term <- function(what, ...) any(grepl(what, request(...)))
    has_file <- function(what, ...) if(all(is.na(request(file = what, ...)))) FALSE else TRUE

Do stuff

Does a package depend on a particular package? E.g., look for httr in the DESCRIPTION file (which is the default file name in request() above) ...
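The dependency check described above can be sketched offline. This is a minimal sketch, not the post's own method: it assumes the contents endpoint returns the file body base64-encoded, and it swaps the grepl() search for parsing the DESCRIPTION as DCF. The helper names decode_description() and depends_on(), and the made-up DESCRIPTION text, are hypothetical.

```r
library(jsonlite)

# Made-up DESCRIPTION, encoded locally so the example needs no network access.
# (Assumption: this stands in for the base64 `content` field of an API response.)
desc_text <- "Package: examplepkg\nVersion: 0.1.0\nImports: httr (>= 0.6), jsonlite\n"
encoded <- base64_enc(charToRaw(desc_text))

# Hypothetical helper: strip whitespace, decode base64, parse the text as DCF.
decode_description <- function(x) {
  raw <- base64_dec(gsub("\\s", "", x))
  con <- textConnection(rawToChar(raw))
  on.exit(close(con))
  read.dcf(con)
}

# Hypothetical helper: is `pkg` listed in the Depends or Imports field?
depends_on <- function(desc, pkg) {
  fields <- intersect(c("Depends", "Imports"), colnames(desc))
  deps <- unlist(strsplit(desc[1, fields], ","))
  deps <- trimws(sub("\\(.*\\)", "", deps))  # drop version constraints
  pkg %in% deps
}

desc <- decode_description(encoded)
depends_on(desc, "httr")
depends_on(desc, "RCurl")
```

Parsing the fields avoids false positives from a plain text search, for example a package name that only appears in the Description text.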

November 26, 2014 · 3 min · Scott Chamberlain

Jekyll - an intro

I started using Jekyll when I didn’t really know HTML, CSS, or Ruby - so I’ve had to learn a lot - but using Jekyll has been a great learning experience for all those languages. I’ve tried to boil down building a Jekyll site or blog to the minimal steps:

Install Jekyll

Mac/Linux/Unix:
  Install dependencies: Ruby, RubyGems
  Install Jekyll using RubyGems: gem install jekyll (you may need to do sudo...)
  If you’re having trouble installing, see the troubleshooting page.

Windows:
  Jekyll doesn’t officially support installation on Windows - follow these steps for a Windows install. ...

November 20, 2013 · 3 min · Scott Chamberlain

On writing, sharing, collaborating, and hosting code for science

I recently engaged with a number of tweeps in response to my tweet: “Rule number 1 wrt science code: DO NOT post your code on your personal website.” That tweet wasn’t super clear, and it’s difficult to convey my thoughts in a tweet. What I should have said was: do post your code - ideally on GitHub/Bitbucket/etc. Here goes with a much longer version to explain what I meant. The tweet was just about where to host code, whereas the following is about more than that, but related. ...

July 20, 2013 · 5 min · Scott Chamberlain

Visualizing rOpenSci collaboration

We (rOpenSci) have been writing code for R packages for a couple of years, so it is time to take a look back at the data. What data, you ask? The commits data from GitHub ~ data that records who did what and when. Using the GitHub commits API we can gather data on who committed code to a GitHub repository, and when they did it. Then we can visualize this historical record. ...
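As a rough illustration of the "who committed, and when" idea, here is a minimal sketch that tallies commits per author and per month. The JSON is made up and trimmed to the commit.author.name and commit.author.date fields, which is an assumption about the response shape. This is not the post's actual code.

```r
library(jsonlite)

# Made-up sample standing in for a commits API response (assumption: each item
# carries commit.author.name and commit.author.date, as in the GitHub API).
json <- '[
  {"sha": "a1", "commit": {"author": {"name": "alice", "date": "2012-11-03T10:00:00Z"}}},
  {"sha": "b2", "commit": {"author": {"name": "bob",   "date": "2012-11-20T12:30:00Z"}}},
  {"sha": "c3", "commit": {"author": {"name": "alice", "date": "2013-01-15T09:15:00Z"}}}
]'

# flatten = TRUE turns nested objects into columns like commit.author.name
commits <- fromJSON(json, flatten = TRUE)

# Keep the calendar date and derive a year-month bucket for each commit
commits$date <- as.Date(substr(commits$commit.author.date, 1, 10))
commits$month <- format(commits$date, "%Y-%m")

# Who committed, and how often
per_author <- table(commits$commit.author.name)

# Who committed when: authors in rows, months in columns
per_month <- table(commits$commit.author.name, commits$month)
per_month
```

The author-by-month table is the kind of summary that can then be handed to a plotting function.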

March 8, 2013 · 3 min · Scott Chamberlain

Visualize your Github stats (forks and watchers) in a browser with R!

So OpenCPU is pretty awesome. You can run R in a browser using URL calls with an alphanumeric code (e.g., x3e50ee0780) defining a stored function, and any arguments you pass to it. Go here to store a function. And you can output lots of different types of things: png, pdf, json, etc - see here. Here’s a function I created (originally from https://gist.github.com/2602432):

    # Store a function with man lines
    # Go Here: http://beta.opencpu.org/apps/opencpu.demo/storefunction/
    # number: x3e50ee0780
    # link: http://beta.opencpu.org/R/call/store:tmp/x3e50ee0780/png?id='ropensci'&type='org'
    the <- function (id = 'hadley', type = 'user') {
      require(RCurl); require(RJSONIO); require(ggplot2); require(reshape2); require(plyr)
      if(type == 'user'){
        url = "https://api.github.com/users/"
      } else if(type == 'org'){
        url = "https://api.github.com/orgs/"
      } else stop("parameter 'type' has to be either 'user' or 'org' ")
      url2 <- paste(url, id, "/repos?per_page=100", sep = "")
      xx <- getURL(url2)
      tt <- fromJSON(xx)
      if(!length(tt) == 1) { tt <- tt } else {
        stop("user or organization not found - search GitHub? - https://github.com/")
      }
      out <- ldply(tt, function(x) t(c(x$name, x$forks, x$watchers)))
      names(out) <- c("Repo", "Forks", "Watchers")
      out$Forks <- as.integer(out$Forks)
      out$Watchers <- as.integer(out$Watchers)
      out2 <- melt(out, id = 1)
      out2$value <- as.numeric(out2$value)
      out2$Repo <- as.factor(out2$Repo)
      repoorder <- unique(out2[order(out2$value, decreasing=FALSE),][,1])
      out2$Repo <- factor(out2$Repo, levels = repoorder)
      ggplot(out2, aes(Repo, value)) + geom_bar() + coord_flip() +
        facet_wrap(~variable) + theme_bw(base_size = 18)
    }

    the() # default for hadley
    the(id='defunkt', type='user') # works - a user with even more repos than Hadley
    the(id='ropensci', type='org') # works - organization example
    the(id='jeroenooms', type='user') # works - organization example
    the(id='SChamberlain', type='org') # error message - mismatch of username with org type
    the(id='adsff', type='user') # error message - name does not exist

It makes a ggplot2 graphic of your watchers and forks on each repo (up to 100 repos), sorted by descending number of forks/watchers. ...
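The sorting step in that function works by setting the factor levels of the repo names. Here is a minimal base-R sketch of the same idea, with made-up repo names and counts standing in for the API response, and a hand-built long table in place of reshape2::melt():

```r
# Made-up repo stats (assumption: stand-ins for what the API would return)
repos <- data.frame(
  Repo = c("pkgA", "pkgB", "pkgC"),
  Forks = c(5L, 12L, 1L),
  Watchers = c(40L, 30L, 3L),
  stringsAsFactors = FALSE
)

# Long format: one row per repo/variable pair
long <- data.frame(
  Repo = rep(repos$Repo, times = 2),
  variable = rep(c("Forks", "Watchers"), each = nrow(repos)),
  value = c(repos$Forks, repos$Watchers),
  stringsAsFactors = FALSE
)

# Order repos by ascending value; unique() keeps each repo's first appearance
repo_order <- unique(long$Repo[order(long$value)])

# The factor levels fix the order of bars on the axis
long$Repo <- factor(long$Repo, levels = repo_order)
levels(long$Repo)
```

With coord_flip(), the first level is drawn at the bottom, so the ascending order here puts the largest bars at the top of the plot.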

May 5, 2012 · 2 min · Scott Chamberlain