I wrote about Crossref clients nearly two years ago on this blog: Crossref programmatic clients.
Since it’s been a while, it seems worth talking again about the many ways to work programmatically with Crossref data, with a focus on the Python client habanero, since it has some recent updates.
The three clients work with the main Crossref API, which lets you search for works (e.g., books, articles) by title, author, etc., and also search for publishing members, funders, journals, DOI prefixes, and licenses. It’s a powerful API with basically no rate limits, so you can work through lots of data quickly.
- Crossref API documentation: http://api.crossref.org
- Python client
- Ruby client
- R client
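To make the kinds of searches mentioned above a bit more concrete, here is a minimal sketch of how request URLs for the API could be put together with only the Python standard library. The helper name `crossref_url`, the `https` base URL, and the example `query` and `rows` values are assumptions for illustration; the clients take care of this for you.

```python
from urllib.parse import urlencode

# Assumption: the API is reachable over https at this host.
BASE_URL = "https://api.crossref.org"

# Routes that return lists of records. DOI prefix lookups take a specific
# prefix in the path, so they are left out of this sketch.
LIST_ROUTES = ("works", "members", "funders", "journals", "licenses")


def crossref_url(route, **params):
    """Build a request URL for one of the list routes (hypothetical helper)."""
    if route not in LIST_ROUTES:
        raise ValueError(f"unknown route: {route!r}")
    query = urlencode(params)
    return f"{BASE_URL}/{route}" + (f"?{query}" if query else "")


# Example: a free-text search for works, limited to five results.
works_url = crossref_url("works", query="ecology", rows=5)
print(works_url)
```

This only builds the URL; sending the request and parsing the JSON response is what the clients add on top.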
At rOpenSci we’ve maintained the R client for quite a few years now, but the Python and Ruby clients were a result of consulting work I did for Crossref.
The R, Ruby, and Python clients are all quite feature complete, although software is never perfect :). The thing about talking to an API for someone else's software is that they can change stuff on their end, and then we have to change stuff on our end, on and on …
Back when the earlier blog post about these Crossref clients was written, we were at the first versions of both serrano and habanero. As you can see in the changelogs of the three clients (serrano, habanero, rcrossref), a lot has changed in the last two years as we’ve made improvements and kept up with Crossref API changes.
Ruby and R
I’ve just released a new version of habanero, v0.6. Notable changes include: the ability to add a mailto to each request to get into the so-called “polite pool”; a new select parameter for choosing which fields to get back; and a major overhaul of the docs (check em out at http://habanero.readthedocs.io/en/latest/; hope you like them; get in touch if you think the docs can be improved).
pip3 install habanero
pip install habanero
To get into the polite pool, add your mailto email address when you instantiate a Crossref object:
from habanero import Crossref
cr = Crossref(mailto = "email@example.com")
Then when you call any methods on cr, your email address is sent in the request headers and you’ll get into the polite pool.
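If you're curious what "sent in the request headers" might look like, below is a rough sketch using only the standard library. It is not habanero's actual implementation: the helper name `polite_request`, the `my-script/0.1` agent string, and the exact `(mailto:...)` layout of the User-Agent value are assumptions for illustration.

```python
from urllib.request import Request


def polite_request(url, mailto):
    """Build (but do not send) a request that advertises a contact email.

    Hypothetical helper: the User-Agent layout below is an assumption,
    not necessarily the format habanero uses.
    """
    headers = {"User-Agent": f"my-script/0.1 (mailto:{mailto})"}
    return Request(url, headers=headers)


req = polite_request("https://api.crossref.org/works?rows=1", "email@example.com")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Nothing is sent over the network here; the point is only that the email travels with every request, so you set it once and forget about it.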
To use the select parameter, pass a comma-separated string or a list of strings (both work):
cr.works(select = "DOI,title")
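Since both a string and a list are accepted, the two forms presumably end up as the same value in the request. Here is a small sketch of that kind of normalization; the function `normalize_select` is hypothetical and is not taken from habanero's source.

```python
def normalize_select(select):
    """Turn a comma-separated string or a list of field names into one string.

    Hypothetical helper for illustration only.
    """
    if isinstance(select, str):
        fields = select.split(",")
    else:
        fields = list(select)
    # Drop surrounding whitespace and any empty entries.
    return ",".join(f.strip() for f in fields if f.strip())


from_string = normalize_select("DOI, title")
from_list = normalize_select(["DOI", "title"])
print(from_string, from_list)
```

Either way you write it, the result is a single comma-separated value, so pick whichever form is more convenient in your code.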
habanero use cases
I’ve seen some cool use cases using habanero:
- A bibliographic application at https://taccimo.info/tbl_sector_list.php from Sean Gordon.
- An application called PyKED from Kyle Niemeyer - “a Python-based software package for validating and interacting with ChemKED (Chemical Kinetics Experimental Data format) files that describe fundamental experimental measurements of combustion phenomena”.
- A Django app called TailorDev Biblio from Julien Maupetit that manages references.