iDigBio - a new data source in spocc

iDigBio, or Integrated Digitized Biocollections, collects and provides access to species occurrence data and associated metadata (e.g., images of specimens, when provided). They aggregate data from many different providers, and they have a nice web interface for searching; check out idigbio.org/portal/search. spocc is a package we’ve been working on at rOpenSci for a while now - it is a one-stop shop for retrieving species occurrence data. As new sources of species occurrence data come to our attention, and are available via a RESTful API, we incorporate them into spocc. ...

June 8, 2015 · 3 min · Scott Chamberlain

openadds - open addresses client

openadds is a client for Openaddresses.io. A rundown of what it does:

Install

    devtools::install_github("sckott/openadds")
    library("openadds")

List datasets

Scrapes links to datasets from the openaddresses site.

    dat <- oa_list()
    dat[2:6]
    #> [1] "https://data.openaddresses.io.s3.amazonaws.com/20150511/au-tas-launceston.csv"
    #> [2] "https://s3.amazonaws.com/data.openaddresses.io/20141127/au-victoria.zip"
    #> [3] "https://data.openaddresses.io.s3.amazonaws.com/20150511/be-flanders.zip"
    #> [4] "https://data.openaddresses.io.s3.amazonaws.com/20150417/ca-ab-calgary.zip"
    #> [5] "https://data.openaddresses.io.s3.amazonaws.com/20150511/ca-ab-grande_prairie.zip"

Search for datasets

Uses oa_list() internally, then searches through the columns requested.

    oa_search(country = "us", state = "ca")
    #> Source: local data frame [68 x 5]
    #>
    #>    country state             city  ext
    #>  1      us    ca san_mateo_county .zip
    #>  2      us    ca   alameda_county .zip
    #>  3      us    ca   alameda_county .zip
    #>  4      us    ca           amador .zip
    #>  5      us    ca           amador .zip
    #>  6      us    ca      bakersfield .zip
    #>  7      us    ca      bakersfield .zip
    #>  8      us    ca         berkeley .zip
    #>  9      us    ca         berkeley .zip
    #> 10      us    ca     butte_county .zip
    #> ..     ...   ...              ...  ...
    #> Variables not shown: url (chr)

Get data

Passing in a URL ...

May 18, 2015 · 5 min · Scott Chamberlain