I wrote about Crossref clients on this blog nearly two years ago: Crossref programmatic clients.

Since it’s been a while, it seems worth talking again about the many ways to work programmatically with Crossref data, and focusing in on the Python client habanero since it has some recent updates.

The three clients (rcrossref for R, serrano for Ruby, habanero for Python) work with the main Crossref API, which lets you search for works (e.g., books, articles) by title, author, etc., and search for publishing members, funders, journals, DOI prefixes, and licenses. It’s a powerful API with basically no rate limits, so you can work through lots of data quickly.
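To make that a bit more concrete, here is a rough sketch of what those searches look like at the HTTP level. This is not how any of the three clients is implemented; `crossref_url` is a hypothetical helper made up for this example, and it only builds URLs without sending any requests. The route names (works, funders, journals, members) and the `query` and `rows` parameters are the ones I understand the Crossref REST API to use.

```python
from urllib.parse import urlencode

# Base URL of the public Crossref REST API
BASE_URL = "https://api.crossref.org"


def crossref_url(route, **params):
    """Build (but do not send) a URL for a Crossref list route.

    `route` is e.g. "works" or "funders"; `params` become the query string.
    Hypothetical helper for illustration only.
    """
    query = urlencode(params)
    return f"{BASE_URL}/{route}" + (f"?{query}" if query else "")


# Search works, funders, journals and members by a free-text query
print(crossref_url("works", query="ecology", rows=5))
print(crossref_url("funders", query="NSF"))
print(crossref_url("journals", query="ecology"))
print(crossref_url("members", query="ecology"))
```

The clients wrap this kind of request building (plus paging, parsing, and error handling), so you rarely need to write URLs by hand.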

Some deets:

At rOpenSci we’ve maintained the R client for quite a few years now, but the Python and Ruby clients were a result of consulting work I did for Crossref.

The R, Ruby, and Python clients are all quite feature complete, although software is never perfect :), and the thing about talking to an API backed by someone else’s software is that they can change stuff on their end - then we have to change stuff on our end, on and on …

Back when the earlier blog post was written about these Crossref clients, we were at the first versions of both serrano and habanero. As you can see in the changelogs of the three clients (serrano, habanero, rcrossref), a lot has changed in the last two years as we’ve made improvements and kept up with Crossref API changes.

Ruby and R

Nothing new to report for the Ruby (serrano) and R (rcrossref) clients, though both will soon be getting the two features described for habanero below (mailto and select).

Python: habanero

I’ve just released a new version of habanero - v0.6. Notable changes: you can now add a mailto to each request to get into the so-called “polite pool”; a select parameter was added to choose which fields you get back; and the docs got a major overhaul (check ’em out at https://habanero.readthedocs.io/en/latest/; hope you like them; get in touch if you think the docs can be improved).

To install:

pip3 install habanero
# or
pip install habanero

To get into the polite pool, add your mailto email address when you instantiate a Crossref object:

from habanero import Crossref
cr = Crossref(mailto = "foo@bar.com")

Then when you call any methods on cr your email address is sent in the request headers and you’ll get into the polite pool.


To use the select parameter, pass a comma-separated string or a list of strings (both work):

cr.works(select = "DOI,title")
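Since both forms work, it may help to see why they are equivalent. The sketch below is my own illustration, not habanero’s actual code: `normalize_select` and `works_url` are hypothetical helpers, and the assumption is that the field list ends up as a single comma-separated `select` query parameter on the Crossref works route. Nothing is sent over the network here.

```python
from urllib.parse import urlencode


def normalize_select(select):
    """Turn a list of field names into one comma-separated string.

    A string is passed through unchanged. Hypothetical helper, for
    illustration only.
    """
    if isinstance(select, str):
        return select
    return ",".join(select)


def works_url(query, select=None):
    """Build (but do not send) a works search URL with an optional select."""
    params = {"query": query}
    if select is not None:
        params["select"] = normalize_select(select)
    # Keep commas readable instead of percent-encoding them
    return "https://api.crossref.org/works?" + urlencode(params, safe=",")


# The string form and the list form produce the same request URL
print(works_url("ecology", select="DOI,title"))
print(works_url("ecology", select=["DOI", "title"]))
```

In habanero itself you would just write cr.works(select = ["DOI", "title"]) and let the client handle the rest.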

habanero use cases

I’ve seen some cool uses of habanero lately.