CRAN Checks API and Badges

TL;DR: In 6 months (end of November 2022) the CRAN Checks API https://cranchecks.info/ will be gone. You can still get badges at https://badges.cranchecks.info. You can use the new badges like: [![cran checks](https://badges.cranchecks.info/worst/dplyr.svg)](https://cran.r-project.org/web/checks/check_results_dplyr.html). Find more details at https://github.com/sckott/cchecksbadges. Sunsetting the CRAN Checks API: If you contribute an R package to CRAN, you may use badges from the CRAN checks API at https://cranchecks.info/. The CRAN Checks API has been operating since about September 2017 (I think)....

June 2, 2022 · 2 min · Scott Chamberlain
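
The badge markdown in the excerpt above follows a visible pattern, so it can be generated per package. This is a minimal sketch: `badge_markdown` is a hypothetical helper, and it assumes the `worst/<package>.svg` and `check_results_<package>.html` URL shapes shown for dplyr hold for other package names.

```python
def badge_markdown(pkg: str) -> str:
    """Build the badge markdown for a CRAN package (hypothetical helper).

    Assumes the URL patterns from the dplyr example apply to any package.
    """
    badge = f"https://badges.cranchecks.info/worst/{pkg}.svg"
    results = f"https://cran.r-project.org/web/checks/check_results_{pkg}.html"
    # The badge image links through to the package's CRAN check results page.
    return f"[![cran checks]({badge})]({results})"


print(badge_markdown("dplyr"))
```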

List comprehension vs. filter vs. key lookup

I was working on a work task last week, and needed to filter out one instance of a class from a list of class instances. No matter how you do this, speed doesn’t matter too much if you’re doing this operation once or a few times. However, this operation needs to be done about 100K times each time the script runs - so speed definitely does matter in this case....

April 18, 2022 · 3 min · Scott Chamberlain
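
To make the three approaches named in the title above concrete, here is a rough sketch of each. The `Item` class, its `name` field, and the data are invented for illustration and are not taken from the post, and the post's own timing results are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical stand-in for the "class instances" mentioned in the post.
    name: str


items = [Item(f"item{i}") for i in range(1000)]
target = "item500"

# 1. List comprehension: scans the whole list and builds a new list.
by_comprehension = [x for x in items if x.name == target][0]

# 2. filter(): lazy, so next() stops at the first match.
by_filter = next(filter(lambda x: x.name == target, items))

# 3. Key lookup: build a dict once, then each lookup is O(1) on average.
index = {x.name: x for x in items}
by_key = index[target]
```

The dict costs one pass over the list up front, which tends to pay off only when the lookup is repeated many times against the same list.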

Notes on Python

It’s been interesting switching jobs with respect to programming languages. I used to write 95% R - now I write 95% Python. I have been using Python for many years, but not seriously, and not for pay. I’ve learned a lot in the first 6 months. Some Python things learned: Functions and methods: I used to think functions and methods were the same thing. But during the last 6 months I learned that functions and methods are not the same....

February 7, 2022 · 2 min · Scott Chamberlain
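
One way to see the function/method distinction mentioned in the excerpt above is to inspect the types Python reports. This is a small sketch with made-up names (`Greeter`, `greet`), not code from the post.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # Defined inside a class; becomes a bound method when accessed on an instance.
        return f"hello from {self.name}"


def greet(name):
    # A plain module-level function.
    return f"hello from {name}"


g = Greeter("scott")

print(type(greet).__name__)          # function
print(type(g.greet).__name__)        # method (bound to the instance g)
print(type(Greeter.greet).__name__)  # function (accessed on the class in Python 3)

# Calling the bound method passes the instance as the first argument for you.
print(g.greet() == Greeter.greet(g))
```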

Mocking HTTP redirects

You’ve experienced an HTTP redirect (or URL redirect, or URL forwarding) even if you haven’t noticed. We all use browsers (I assume, since you are reading this), either on a phone or laptop/desktop computer. Browsers don’t show all the HTTP requests going on in the background, some of which are redirects. Redirection is used for various reasons, including to prevent broken links when web pages are moved, for privacy protection, to allow multiple domains to refer to a single web page, and more....

November 27, 2021 · 4 min · Scott Chamberlain

API client design: how to deal with lots of parameters?

In February this year I wrote about how many parameters functions should have, looking at some other languages, with a detailed look at R. On a related topic … As I work on many R packages that are API clients for various web services, I began wondering: What is the best way to deal with API routes that have a lot of parameters? The general programming wisdom I’ve seen is that a function should have no more than 3-4 parameters (e....

December 21, 2020 · 8 min · Scott Chamberlain

stories behind archived packages

Update on 2021-02-09: I’ve archived 8 more packages. Post below updated. Code is often arranged in packages for any given language. Packages are often cataloged in a package registry of some kind: NPM for node, crates.io for Rust, etc. For R, that registry is either CRAN or Bioconductor (for the most part). CRAN has the concept of an archived package. That is, the namespace for a package (foo) is still in the registry (and cannot be used again), but the package is archived - it no longer gets updated, and checks, I think, are no longer performed....

September 10, 2020 · 8 min · Scott Chamberlain

taxizedb: an update

taxizedb arose from pain in using taxize when dealing with large amounts of data in a single request or doing a lot of requests of any data size. taxize works with remote data sources on the web, so there are a number of issues that can slow the response down: internet speed, server response speed (was a response already cached or not; or do they even use caching), etc. The idea with taxizedb was to allow users to do the same things as taxize allows, but much faster by accessing the entire database for a data source on their own computer....

August 17, 2020 · 4 min · Scott Chamberlain

how many parameters?

Functions can have no parameters, or have a lot of parameters, or somewhere in between. How many parameters is too many? Does it even matter how many parameters there are in a function? There’s AFAIK no “correct” answer to this question. And surely the “best practice” varies among programming languages. What do folks say about this and what should we be doing in R? From other languages: Many of the blog posts and SO posts on this topic cite the book Clean Code by “Uncle Bob”....

February 10, 2020 · 5 min · Scott Chamberlain

finding truffles

The bad thing about making software is that you can sometimes make it easier for someone to shoot themselves in the foot. The good thing about software is that you can make more software to help them not shoot a foot off. The R package vcr, an R port of the Ruby library of the same name, records and plays back HTTP requests. Some HTTP requests can have secrets (e.g., passwords, API keys, etc....

January 30, 2020 · 3 min · Scott Chamberlain

text mining, apis, and parsing api logs

Acquiring full text articles: fulltext is an R package I maintain to obtain full text versions of research articles for text mining. It’s a hard problem, with a spaghetti web of code. One of the hard problems is figuring out what the URL is for the full text version of an article. Publishers do not have consistent URL patterns through time, and so you cannot set rules once and never revisit them....

March 21, 2019 · 7 min · Scott Chamberlain