Museum metadata - the Asian Art Museum of San Francisco

I was in San Francisco last week for an altmetrics conference at PLOS. While there, I visited the Asian Art Museum, though only the Roads of Arabia exhibition. It was a great exhibit. While looking at the pieces, I read many of the labels and thought, “hey, what if someone wants this metadata?” Since we have an R package in development for scraping museum metadata (called musemeta), I started writing some scraping code for this museum. Unfortunately, I don’t think the pieces from the Roads of Arabia exhibit are on their site, so there is no metadata to get. But they do have their main collection searchable online at https://www.asianart.org/collections/collection. Examples follow. ...

December 10, 2014 · 5 min · Scott Chamberlain

Scraping Flora of North America

Flora of North America is an awesome collection of taxonomic information for plants across the continent. However, the information within is not easily machine readable, so a little web scraping is called for. rfna is an R package to collect information from the Flora of North America. So far, you can get taxonomic names from the web pages that index the names; then get daughter URLs for those taxa, which in turn have their own second-order daughter URLs you can scrape, or scrape the first-order daughter page; and query Asteraceae taxa for whether they have paleate or epaleate receptacles. That last function is something I needed, but more functions like it will be made to get specific traits. Further functions will do search, etc. ...

January 27, 2012 · 2 min · Scott Chamberlain