Scraping Flora of North America
27 Jan 2012 R scraping
Flora of North America is an awesome collection of taxonomic information for plants across the continent. However, the information within is not easily machine readable.
So a little web scraping is called for.
rfna is an R package to collect information from the Flora of North America.
So far, you can:
- Get taxonomic names from web pages that index the names.
- Get daughter URLs for those taxa. Each daughter page has its own 2nd order daughter URLs, so you can scrape either the 1st order daughter page or the pages below it.
- Query Asteraceae taxa for whether they have paleate or epaleate receptacles. This function is something I needed, but more functions like this will be added to get other specific traits.
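The first two steps can be sketched in base R as follows. This is only an illustration of the idea, not how rfna does it: the HTML snippet, the example.org base URL, and the helper extract_links are all made up for the example, and the real FNA pages will have different markup.

```r
# Made-up sample of an index page; the real FNA markup will differ.
html <- paste0(
  '<ul>',
  '<li><a href="taxon/achillea.html">Achillea</a></li>',
  '<li><a href="taxon/helianthus.html">Helianthus</a></li>',
  '</ul>'
)

# extract_links is a hypothetical helper, not part of rfna.
# It returns one row per link: the taxon name and its absolute daughter URL.
extract_links <- function(html, base_url) {
  pattern <- '<a href="([^"]+)">([^<]+)</a>'
  matches <- regmatches(html, gregexpr(pattern, html, perl = TRUE))[[1]]
  if (length(matches) == 0) {
    return(data.frame(name = character(0), url = character(0),
                      stringsAsFactors = FALSE))
  }
  data.frame(
    name = sub(pattern, "\\2", matches, perl = TRUE),
    url = paste0(base_url, sub(pattern, "\\1", matches, perl = TRUE)),
    stringsAsFactors = FALSE
  )
}

links <- extract_links(html, "http://example.org/")
print(links)
```

Running the same helper on each page in links$url would give the 2nd order daughter URLs. For real pages a proper HTML parser is a safer choice than a regular expression.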
Further functions will do search, etc.
You can install by:
Here is an example where a set of URLs is acquired using the function getdaughterURLs, then the function receptacle is used to ask whether each of the taxa at those URLs has paleate or epaleate receptacles.
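As a rough illustration of the kind of text check involved (this is not rfna's receptacle code), the sketch below classifies description text by keyword. The helper classify_receptacle and the three descriptions are invented for the example. One detail worth noting is that "epaleate" contains "paleate", so the match has to respect word boundaries.

```r
# classify_receptacle is a hypothetical helper, not rfna's receptacle().
# It returns "epaleate", "paleate", or NA when neither word appears.
classify_receptacle <- function(description) {
  text <- tolower(description)
  # Check "epaleate" first and use word boundaries, since a plain
  # search for "paleate" would also match inside "epaleate".
  if (grepl("\\bepaleate\\b", text, perl = TRUE)) {
    "epaleate"
  } else if (grepl("\\bpaleate\\b", text, perl = TRUE)) {
    "paleate"
  } else {
    NA_character_
  }
}

# Invented description text standing in for scraped taxon pages.
descriptions <- c(
  taxon_a = "Receptacles flat, epaleate.",
  taxon_b = "Receptacles convex, paleate.",
  taxon_c = "Leaves basal and cauline."
)

result <- vapply(descriptions, classify_receptacle, character(1))
print(result)
```

A description that mentions both states would be reported as epaleate here, so a real version would need to handle that case explicitly.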