Exploring git commits with git2r

In rOpenSci - as in presumably most open source projects - we want the entire project to be sustainable, but also each individual software project to be sustainable. A big part of each software project (aka R package in this case) being sustainable is the people making it, particularly how many contributors a project has and how contributions are spread across contributors. There are discussions going on about how to increase contributors to any given project. But the first thing to do is an assessment of where you’re at. One way to do that is visualization. ...

February 5, 2018 · 4 min · Scott Chamberlain

My Sublime Text workflow/setup

Sublime Text is pretty great. Let’s start at the beginning. Why would my primary editing tool not be vim? My background is as a biologist, spending way too many years in grad school. My first programming language was R back in 2006; my first text editor, about the same year, was Notepad++; my first interaction with the cli was probably a year later or so (but that was on Windows). After using Notepad++ for a few years, I stumbled upon Sublime Text via advice from a friend. I used it for a few years without paying (which you can still do), and after that realized it was worth paying for. They now have an easy-to-use Discourse forum too. ...

January 31, 2018 · 3 min · Scott Chamberlain

Playing with Ruby Patterns in R

I was returning to a long-term project I’ve been working on - a package for caching HTTP requests in R called vcr, a port of the Ruby gem vcr - when I did that thing you do when you are porting a library from one language to another: I stumbled upon some methods/functions I wasn’t familiar with. For example, take_while I had never seen before. It iterates over an array, returning the leading elements of the array that evaluate to true (for those new to Ruby, they use true instead of TRUE as we do in R) when passed through the function given, and stopping at the first element that does not. R has lists and vectors - R’s lists are the most similar to Ruby arrays because both can have mixed objects in them (e.g., a string and an integer) while still retaining those objects as is. ...

January 25, 2018 · 4 min · Scott Chamberlain

Web APIs with Sinatra, Mongo, Docker, and Caddy

The problem: The R community has a package distribution thing called CRAN, just like Ruby has RubyGems, Python has PyPI, etc. The CRAN maintainers run checks on every package on CRAN, on multiple versions of R and on many operating systems. They report those results on a page associated with the package, like this one. You might be thinking: okay, but we have Travis-CI and friends, so who cares about that? Well, it’s these checks that CRAN runs that determine whether you get emails asking for changes to your package, and possibly whether the package gets taken down if, e.g., they email and you don’t respond for a period of time. ...

November 14, 2017 · 8 min · Scott Chamberlain

habanero update: Crossref data from Python

I wrote about Crossref clients nearly two years ago on this blog: Crossref programmatic clients. Since it’s been a while, it seems worth talking again about the many ways to work programmatically with Crossref data - and focusing in on the Python client habanero, since it has some recent updates. The 3 clients work with the main Crossref API, which lets you do things like search for works (e.g., books, articles) by title, author, etc., and search for publishing members, funders, journals, DOI prefixes, and licenses. It’s a powerful API with basically no rate limits, so you can work through lots of data quickly. ...

October 23, 2017 · 3 min · Scott Chamberlain

cranchecks: an API for CRAN check results

If you maintain an R package, or even use R packages, you may have looked at CRAN check results. These are essentially the results of running R CMD check on a package. CRAN runs these for each package on each of a few different operating systems (debian, fedora, solaris, windows, osx) and different R versions (devel, release and patched).
src: https://github.com/ropensci/cchecksapi
base api url: https://cranchecks.info
CRAN maintainers look at these, and eventually will email maintainers if checks are bad enough. ...

September 27, 2017 · 3 min · Scott Chamberlain

gbifrb: Ruby client for the GBIF API

gbifrb is a new Ruby client for the GBIF API.
docs: https://www.rubydoc.info/gems/gbifrb/
rubygems: https://rubygems.org/gems/gbifrb
code: https://github.com/sckott/gbifrb
I maintain (w/ help) two other GBIF API clients:
Python: pygbif
R: rgbif
API
Here are the gbifrb methods in relation to GBIF API routes:
registry
  /node - Gbif::Registry.nodes
  /network - Gbif::Registry.networks
  /installations - Gbif::Registry.installations
  /organizations - Gbif::Registry.organizations
  /dataset_metrics - Gbif::Registry.dataset_metrics
  /datasets - Gbif::Registry.datasets
  /dataset_suggest - Gbif::Registry.dataset_suggest
  /dataset_search - Gbif::Registry.dataset_search
species
  /species/match - Gbif::Species.name_backbone
  /species/suggest - Gbif::Species.name_suggest
  /species/search - Gbif::Species.name_lookup
  /species - Gbif::Species.name_usage
occurrences ...

September 7, 2017 · 1 min · Scott Chamberlain

hoardr: simple file caching

hoardr is a client for caching files and managing those files. You can definitely achieve the same tasks without a separate package, and there are a number of packages for caching various objects in R already. However, I didn’t think there was a tool that did everything I needed. The use cases I typically need hoardr for involve large files, either text (e.g., csv) or binary (e.g., shp), that it would be nice not to make users of the packages I maintain download again if they already have the file. This makes life easier for the server that’s serving the files and makes work faster for the user of my package. ...

August 15, 2017 · 4 min · Scott Chamberlain

Tooling for R package development

There are a lot of ways to make R packages. Many blog posts have covered making R packages, but for the most part they’ve covered only how they make packages, going from the required files for a package, what to put in DESCRIPTION, etc. But what about the tooling? I’m not going to talk about the code, etc. - but rather the different ways to approach it. The blog posts/etc. on making R packages: ...

June 18, 2017 · 7 min

Reading in May

Reading right now or just finished
The Nine, Jeffrey Toobin https://www.jeffreytoobin.com/books/the-nine-tr
Just finished reading this.
synopsis:
  fucking hell, Scalia and Thomas are awful
  the Warren court was awesome
  RBG 4 life
Evolutionary Biology of Parasites, Peter W. Price https://press.princeton.edu/titles/645.html
In progress. I got this book from my undergrad advisor around 2001 or so - figured I’d give it a read.
synopsis: parasites are awesome
Bike Snob: Systematically & Mercilessly Realigning the World of Cycling, Christopher Koelle https://www.goodreads.com/book/show/7549138-bike-snob
In progress. Got from my dad, thx dad
synopsis: funny
The Genius of Birds, Jennifer Ackerman https://www.jenniferackermanauthor.com/genius-ofbirds
In progress.
synopsis: birds are smart

May 16, 2017 · 1 min