rsunlight - R client for Sunlight Labs APIs

My last blog post on this package was so long ago that the package still wrapped both New York Times APIs and Sunlight Labs APIs and was called govdat. I split that package up into rsunlight for Sunlight Labs APIs and rtimes for some New York Times APIs. rtimes is in development on GitHub. We’ve updated the package to include four sets of functions, one set for each of four Sunlight Labs APIs (with a separate prefix for each API): ...

August 11, 2014 · 6 min · Scott Chamberlain

analogsea - v0.1 notes

My last blog post introduced the R package I’m working on, analogsea, an R client for the Digital Ocean API. Things have changed a bit, including filling out more functions for all API endpoints, and incorporating feedback from Hadley and Karthik. The package is at v0.1 now, so I thought I’d say a few things about how it works. Note that Digital Ocean’s v2 API is in beta stage now, so the current version of analogsea at v0.1 works with their v1 API. The v2 branch of analogsea is being developed for their v2 API. ...

June 18, 2014 · 5 min · Scott Chamberlain

analogsea - an R client for the Digital Ocean API

I think this package name is my best yet. Maybe it doesn’t make sense though? At least it did at the time… Anyway, the main motivation for this package was to be able to automate spinning up Linux boxes to do cloud R/RStudio work. Of course if you are a command line native this is all easy for you, but if you are afraid of the command line and/or just don’t want to deal with it, this tool will hopefully help. ...

May 28, 2014 · 5 min · Scott Chamberlain

Logistic plot reboot

Someone asked about plotting something like this today. I wrote a few functions previously to do something like this. However, since then ggplot2 has changed, and one of the functions no longer works. Hence, I changed opts() to theme(), theme_blank() to element_blank(), and panel.background = element_blank() to plot.background = element_blank() to get the histograms to show up with the line plot and not cover it. The new functions:

    loghistplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # get bin numbers
      bin_no <- max(hist(data$x, plot = FALSE)$counts) + 5

      # create plots
      a <- ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")

      theme_loghist <- list(
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              axis.text.y = element_blank(),
              axis.text.x = element_blank(),
              axis.ticks = element_blank(),
              panel.border = element_blank(),
              panel.background = element_blank(),
              plot.background = element_blank())
      )

      b <- ggplot(data[data$y == unique(data$y)[1], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(limits=c(0,bin_no)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      c <- ggplot(data[data$y == unique(data$y)[2], ], aes(x = x)) +
        theme_bw(base_size=16) +
        geom_histogram(fill = "grey") +
        scale_y_continuous(trans='reverse', limits=c(bin_no,0)) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme_loghist +
        labs(y='\n', x='\n')

      grid.newpage()
      pushViewport(viewport(layout = grid.layout(1,1)))

      vpa_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpb_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)
      vpc_ <- viewport(width = 1, height = 1, x = 0.5, y = 0.5)

      print(b, vp = vpb_)
      print(c, vp = vpc_)
      print(a, vp = vpa_)
    }

    logpointplot <- function(data) {
      names(data) <- c('x','y') # rename columns

      # get min and max axis values
      min_x <- min(data$x)
      max_x <- max(data$x)
      min_y <- min(data$y)
      max_y <- max(data$y)

      # create plots
      ggplot(data, aes(x = x, y = y)) +
        theme_bw(base_size=16) +
        geom_point(size = 3, alpha = 0.5, position = position_jitter(w=0, h=0.02)) +
        geom_smooth(method = "glm", family = "binomial", se = TRUE,
                    colour='black', size=1.5, alpha = 0.3) +
        scale_x_continuous(limits=c(min_x,max_x)) +
        theme(panel.grid.major = element_blank(),
              panel.grid.minor=element_blank(),
              panel.background = element_blank()) +
        labs(y = "Probability\n", x = "\nYour X Variable")
    }

Install ggplot2 and gridExtra if you don’t have them: ...

May 22, 2014 · 2 min · Scott Chamberlain

cowsay - ascii messages and warnings for R

The history

Cowsay is a terminal program that generates ascii pictures of a cow saying what you tell the cow to say in a bubble. See the Wikipedia page for more information: https://en.wikipedia.org/wiki/Cowsay

Install cowsay to use in your terminal (on OSX):

    brew update
    brew install cowsay

Type cowsay hello world!, and you get:

     ______________
    < hello world! >
     --------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

Optionally, you can install fortune to get pseudorandom messages from a database of quotations. On OSX do brew install fortune, then you can pipe a fortune quote to cowsay: ...

February 20, 2014 · 4 min · Scott Chamberlain

cites - citation stuff from the command line

I’ve been learning Ruby, and decided to scratch an itch: getting citations for papers to put in a BibTeX file or my Zotero library. This usually requires two parts: 1) searching for an article with keywords, and then 2) getting the citation once the paper is found. Since I am lazy, I would prefer to do this from the command line instead of opening up a browser. Thus => cites. (Note, I’m sure someone has created something better - the point is I’m learnin’ me some Ruby.) cites does two things: ...

January 18, 2014 · 5 min · Scott Chamberlain

rgauges - fun with hourly web site analytics

Gaug.es is a really nice looking analytics platform and an alternative to Google Analytics. It is a paid service, but not that expensive really. We’ve made an R package to interact with the Gaug.es API called rgauges. Find it on GitHub and on CRAN. Although working with the Gaug.es API is nice and easy, they don’t keep hourly visit stats or provide them via the API, so you have to continually collect them yourself if you want them. That’s what I have done for my own website. ...

January 17, 2014 · 5 min · Scott Chamberlain

Jekyll - an intro

I started using Jekyll when I didn’t really know HTML, CSS, or Ruby - so I’ve had to learn a lot - but using Jekyll has been a great learning experience for all those languages. I’ve tried to boil building a Jekyll site or blog down to the minimal steps:

Install Jekyll

- Mac/Linux/Unix:
  - Install dependencies: Ruby, RubyGems
  - Install Jekyll using RubyGems: gem install jekyll (you may need to do sudo...)
  - If you’re having trouble installing, see the troubleshooting page.
- Windows: Jekyll doesn’t officially support installation on Windows - follow these steps for a Windows install.

...

November 20, 2013 · 3 min · Scott Chamberlain

Code display in scholarly journals

Code in journals, that is, code you would type to do some programmatic operation in say R or Python, is kind of a mess to say the least. Okay, so you can SEE code in papers, but code is not formatted in a way that facilitates reuse. If an author in a paper writes out some code for software they create, or an analysis they do in the paper, wouldn’t it be nice for a reader to be able to copy and paste that code directly into whatever environment that code should execute in, and have it actually work? Of course there are dependencies, etc. for that software to worry about, but here I am just concerned with the code formatting in articles. Code is displayed as an image in some cases (gasp!). Additionally, there’s this thing called the internet, and we can use color, so let’s highlight code already. At least in one of our recent rOpenSci papers in F1000 Research, they do use syntax highlighting - w00t! ...

October 25, 2013 · 2 min · Scott Chamberlain

Guide to using rOpenSci packages during the US Gov't shutdown

Note: This is cross-posted from the rOpenSci blog, which will update with this post when our technical snafu is fixed. With the US government shut down, many of the data APIs provided by the federal government are down. We write R packages to interact with many of these APIs. We have been tweeting about which of the APIs used by our R packages are down, but we thought we would write up a proper blog post on the issue. ...

October 8, 2013 · 3 min · Scott Chamberlain