VertNet - getting vertebrate museum record data and a quick map

We (rOpenSci) started a repo to wrap the API for VertNet, an open access online database of vertebrate specimen records across many collection holders. Find the open source code here - please contribute if you are so inclined. We had a great Google Summer of Code student, Vijay Barve, contributing to the repo this summer, so it is getting close to being CRAN-able. Most of the functions in the repo get you the raw data, but there were no functions to visualize it. Since many of the records include latitude and longitude data, maps are a natural visualization to use. ...

September 19, 2012 · 2 min · Scott Chamberlain

Getting data from figures in published papers

The problem: There are a lot of figures in published papers in the scholarly literature, like the one below, from Attwood et al. (2012). At some point, a scientist wants to answer a question by synthesizing data collected from the published literature. This often requires something like the following workflow: (1) search for relevant papers (e.g., via Google Scholar); (2) collect the papers; (3) decide which are appropriate for inclusion; (4) collect data from the figures using software on a native application, such as GraphClick or ImageJ; (5) proof the data; (6) analyze the data and publish the paper. This workflow needs revamping, particularly in step 4 - collecting the data. This data remains private, moving from one closed source (the original publication) to another (a personal computer). We can surely do better. ...

September 18, 2012 · 5 min · Scott Chamberlain

Scholarly metadata from R

Metadata! Metadata is very cool. It’s super hot right now - everybody is talking about it. Okay, maybe not everyone, but it’s an important part of archiving scholarly work. We are working on a GitHub repo, rmetadata, to be a one-stop shop for querying metadata from around the web. Various repos we have started on GitHub - rpmc, rdatacite, rdryad, rpensoft, rhindawi - will at least in part be folded into rmetadata. As a start, we are writing functions to hit any metadata service that uses the OAI-PMH (“Open Archives Initiative Protocol for Metadata Harvesting”) framework. OAI-PMH has six methods (or verbs, as they are called) for data harvesting that are the same across different metadata providers: ...

September 17, 2012 · 6 min · Scott Chamberlain