R I/O for geojson and topojson

At rOpenSci we’ve been working on an R package (geojsonio) to make it easier to convert R data in various formats to GeoJSON and TopoJSON, and vice versa. We hope to do this one job very well and handle all reasonable use cases. Functions in this package are organized first around what you’re working with or want to get, geojson or topojson, and then around converting to or reading from various formats:
geojson_list()/topojson_list() - convert to geojson/topojson as R list format
geojson_json()/topojson_json() - convert to geojson/topojson as json
geojson_read()/topojson_read() - read a geojson/topojson file from file path or URL
geojson_write()/topojson_write() - write a geojson/topojson file locally
Each of the above functions has methods for various objects/classes, including numeric, data.frame, list, SpatialPolygons, SpatialPolygonsDataFrame, SpatialLines, SpatialLinesDataFrame, SpatialPoints, SpatialPointsDataFrame. ...
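To make the kind of conversion described above concrete, here is a minimal Python sketch of turning a table of points into GeoJSON. It is not geojsonio (which is an R package); the function name `points_to_geojson`, the `lat`/`lon` column names, and the sample rows are assumptions made up for illustration.

```python
import json


def points_to_geojson(rows, lat="lat", lon="lon"):
    """Convert a list of dicts holding coordinates into a GeoJSON FeatureCollection.

    `rows`, `lat`, and `lon` are illustrative names, not part of any package API.
    """
    features = []
    for row in rows:
        # GeoJSON positions are ordered [longitude, latitude].
        geometry = {"type": "Point", "coordinates": [row[lon], row[lat]]}
        # Every non-coordinate column becomes a feature property.
        properties = {k: v for k, v in row.items() if k not in (lat, lon)}
        features.append(
            {"type": "Feature", "geometry": geometry, "properties": properties}
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up sample data.
rows = [
    {"name": "Portland", "lat": 45.52, "lon": -122.68},
    {"name": "Oakland", "lat": 37.80, "lon": -122.27},
]
collection = points_to_geojson(rows)
print(json.dumps(collection, indent=2))
```

The same structure works as a plain dict (the "list" form) or as serialized text (the "json" form), which is roughly the distinction the function families above draw.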

January 6, 2015 · 5 min · Scott Chamberlain

gistr - R client for GitHub gists

GitHub has a site, https://gist.github.com/, where we can share code, text, images, maps, plots, etc. very easily, without having to open up a repo. GitHub gists are a great way to throw up an example use case to show someone, or to show code that’s throwing errors to a support person. In addition, there’s API access, which means we can interact with gists not just from the web interface, but from the command line or any programming language. There are clients for Node.js, Ruby, Python, and on and on. But AFAIK there wasn’t one for R. Along with Ramnath and others, we’ve been working on an R client for gists. v0.1 is now on CRAN. Below is an overview. ...
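For a sense of what any gist client does under the hood, here is a rough Python sketch that builds (but does not send) a request to create a gist through GitHub's REST API. This is not gistr; the helper name `build_gist_request` and the sample file are assumptions for illustration, and a real call needs a personal access token.

```python
import json
import urllib.request

GISTS_URL = "https://api.github.com/gists"


def build_gist_request(files, description="", public=True, token=None):
    """Build a POST request that would create a gist.

    `files` maps file names to file contents. `build_gist_request` is a
    hypothetical helper, not a function from any existing client.
    """
    payload = {
        "description": description,
        "public": public,
        # The API expects each file as {"content": "..."} keyed by file name.
        "files": {name: {"content": content} for name, content in files.items()},
    }
    headers = {
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    if token:
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(
        GISTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_gist_request({"hello.R": 'print("hello")'}, description="demo gist")
print(req.get_method(), req.full_url)
# Passing `req` to urllib.request.urlopen() would actually send it.
```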

January 5, 2015 · 7 min · Scott Chamberlain

pytaxize - low level ITIS functions

I’ve been working on a Python port of the R package taxize that I maintain. It’s still early days with this Python library, and I’d love to know what people think. For example, I’m giving back Pandas DataFrames from most functions. Does this make sense?

Installation

    sudo pip install git+git://github.com/sckott/pytaxize.git#egg=pytaxize

Or git clone the repo down, and

    python setup.py build && python setup.py install

Load library

    import pytaxize

ITIS ping

    pytaxize.itis_ping()

    'This is the ITIS Web Service, providing access to the data behind www.itis.gov. The database contains 665,266 scientific names (501,207 of them valid/accepted) and 122,735 common names.'

Get hierarchy down from tsn

    pytaxize.gethierarchydownfromtsn(tsn = 161030)

       tsn     rankName  taxonName       parentName    parentTsn
    0  161048  Class     Sarcopterygii   Osteichthyes  161030
    1  161061  Class     Actinopterygii  Osteichthyes  161030

Get hierarchy up from tsn

    pytaxize.gethierarchyupfromtsn(tsn = 37906)

       author              parentName  parentTsn  rankName  taxonName  tsn
    0  Gaertn. ex Schreb.  Asteraceae  35420      Genus     Liatris    37906

Get rank names

    pytaxize.getranknames()

         kingdomname  rankid  rankname
    0    Bacteria     10      Kingdom
    1    Bacteria     20      Subkingdom
    2    Bacteria     30      Phylum
    3    Bacteria     40      Subphylum
    4    Bacteria     50      Superclass
    5    Bacteria     60      Class
    6    Bacteria     70      Subclass
    7    Bacteria     80      Infraclass
    8    Bacteria     90      Superorder
    9    Bacteria     100     Order
    10   Bacteria     110     Suborder
    11   Bacteria     120     Infraorder
    12   Bacteria     130     Superfamily
    13   Bacteria     140     Family
    14   Bacteria     150     Subfamily
    15   Bacteria     160     Tribe
    16   Bacteria     170     Subtribe
    17   Bacteria     180     Genus
    18   Bacteria     190     Subgenus
    19   Bacteria     220     Species
    20   Bacteria     230     Subspecies
    21   Protozoa     10      Kingdom
    22   Protozoa     20      Subkingdom
    23   Protozoa     25      Infrakingdom
    24   Protozoa     30      Phylum
    25   Protozoa     40      Subphylum
    26   Protozoa     45      Infraphylum
    27   Protozoa     47      Parvphylum
    28   Protozoa     50      Superclass
    29   Protozoa     60      Class
    ..   ...          ...     ...
    150  Chromista    190     Subgenus
    151  Chromista    200     Section
    152  Chromista    210     Subsection
    153  Chromista    220     Species
    154  Chromista    230     Subspecies
    155  Chromista    240     Variety
    156  Chromista    250     Subvariety
    157  Chromista    260     Form
    158  Chromista    270     Subform
    159  Archaea      10      Kingdom
    160  Archaea      20      Subkingdom
    161  Archaea      30      Phylum
    162  Archaea      40      Subphylum
    163  Archaea      50      Superclass
    164  Archaea      60      Class
    165  Archaea      70      Subclass
    166  Archaea      80      Infraclass
    167  Archaea      90      Superorder
    168  Archaea      100     Order
    169  Archaea      110     Suborder
    170  Archaea      120     Infraorder
    171  Archaea      130     Superfamily
    172  Archaea      140     Family
    173  Archaea      150     Subfamily
    174  Archaea      160     Tribe
    175  Archaea      170     Subtribe
    176  Archaea      180     Genus
    177  Archaea      190     Subgenus
    178  Archaea      220     Species
    179  Archaea      230     Subspecies

Search by scientific name

    pytaxize.searchbyscientificname(x="Tardigrada")

       combinedname          tsn
    0  Rotaria tardigrada    58274
    1  Notommata tardigrada  58898
    2  Pilargis tardigrada   65562
    3  Tardigrada            155166
    4  Heterotardigrada      155167
    5  Arthrotardigrada      155168
    6  Mesotardigrada        155358
    7  Eutardigrada          155362
    8  Scytodes tardigrada   866744

Get accepted names from tsn

    pytaxize.getacceptednamesfromtsn('208527')

If accepted, returns the same id ...

December 26, 2014 · 3 min · Scott Chamberlain

Museum metadata - the Asian Art Museum of San Francisco

I was in San Francisco last week for an altmetrics conference at PLOS. While there, I visited the Asian Art Museum, though only the Roads of Arabia exhibition. It was a great exhibit. While I was looking at the pieces, I read many labels and thought, “Hey, what if someone wants this metadata?” Since we have an R package in development for scraping museum metadata (called musemeta), I started some scraping code for this museum. Unfortunately, I don’t think the pieces from the Roads of Arabia exhibit are on their site, so there is no metadata to get. But they do have their main collection searchable online at https://www.asianart.org/collections/collection. Examples follow. ...

December 10, 2014 · 5 min · Scott Chamberlain

icanhaz altmetrics

The Lagotto application is a Rails app that collects article-level metrics data for research objects and serves it up via a RESTful API. So far, this application has only been applied to scholarly articles, but it will see action on datasets soon. Martin Fenner has led the development of Lagotto. He recently set up a discussion site if you want to chat about it. The application has a nice GUI and a quite nice RESTful API. ...

December 8, 2014 · 3 min · Scott Chamberlain

Altmetrics from anywhere

The Lagotto application is a Rails app that collects article-level metrics data for research objects and serves it up via a RESTful API. So far, this application has only been applied to scholarly articles, but it will see action on datasets soon. Martin Fenner has led the development of Lagotto. He recently set up a discussion site if you want to chat about it. The application has a nice GUI and a quite nice RESTful API. ...

December 8, 2014 · 3 min · Scott Chamberlain

Dealing with multi handle errors

At rOpenSci we occasionally hear from our users that they run into an error like:
Error in function (type, msg, asError = TRUE) : easy handled already used in multi handle
This error occurs in the httr package that we use to do HTTP requests to sources of data on the web. It happens when, e.g., you make a lot of requests to a resource and one gets interrupted somehow; then you make another call and get the error above. Let’s try it with an older version of httr (v0.5): ...

December 8, 2014 · 2 min · Scott Chamberlain

Publications by author country

I just missed another chat on the rOpenSci website: “I want to know the number of publications by people from a certain country, but I dont know how to achieve this…” Fun! Let’s do that. It’s a bit complicated because there is no field for the geography of the authors. But there are affiliation fields, from which we can collect the data we need.
Installation
You’ll need the GitHub version for the country names data, or just use the CRAN version and get country names elsewhere. ...
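The general idea of pulling countries out of affiliation strings can be sketched in a few lines of Python. This is only an illustration of the approach, not the code from the post: the country list, the affiliation strings, and the function names are all made up, and real affiliation data is much messier (abbreviations, missing countries, alternate spellings).

```python
import re
from collections import Counter

# Illustrative subset; a real analysis would need a full list of country names.
COUNTRIES = ["United States", "United Kingdom", "Germany", "Brazil", "China"]


def countries_in(affiliation, countries=COUNTRIES):
    """Return the country names that appear as whole words in one affiliation."""
    return [
        country
        for country in countries
        if re.search(r"\b" + re.escape(country) + r"\b", affiliation, re.IGNORECASE)
    ]


def count_by_country(papers):
    """Count papers per country.

    `papers` is a list of papers, each a list of affiliation strings.
    A paper counts once per country, however many authors share it.
    """
    counts = Counter()
    for affiliations in papers:
        seen = set()
        for affiliation in affiliations:
            seen.update(countries_in(affiliation))
        counts.update(seen)
    return counts


# Made-up affiliations for three papers.
papers = [
    ["Dept. of Biology, University X, Berlin, Germany", "Institute Y, Brazil"],
    ["School Z, Munich, Germany"],
    ["College W, London, United Kingdom", "Lab V, Hamburg, Germany"],
]
counts = count_by_country(papers)
print(counts.most_common())
```

Counting each paper once per country is a design choice; counting once per author instead would only require updating `counts` inside the inner loop.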

December 3, 2014 · 4 min · Scott Chamberlain

http codes

I recently noticed a little Python library called httpcode that does a simple thing: gives information on http codes in the CLI. I thought this could be useful for R, so I made an R version.

Installation

    devtools::install_github("sckott/httpcode")
    library("httpcode")

Search by http code

    http_code(100)
    #> <Status code: 100>
    #>   Message: Continue
    #>   Explanation: Request received, please continue

    http_code(400)
    #> <Status code: 400>
    #>   Message: Bad Request
    #>   Explanation: Bad request syntax or unsupported method

    http_code(503)
    #> <Status code: 503>
    #>   Message: Service Unavailable
    #>   Explanation: The server cannot process the request due to a high load

    http_code(999)
    #> Error: No description found for code: 999

Fuzzy code search

    http_code('1xx')
    #> [[1]]
    #> <Status code: 100>
    #>   Message: Continue
    #>   Explanation: Request received, please continue
    #>
    #> [[2]]
    #> <Status code: 101>
    #>   Message: Switching Protocols
    #>   Explanation: Switching to new protocol; obey Upgrade header
    #>
    #> [[3]]
    #> <Status code: 102>
    #>   Message: Processing
    #>   Explanation: WebDAV; RFC 2518

    http_code('3xx')
    #> [[1]]
    #> <Status code: 300>
    #>   Message: Multiple Choices
    #>   Explanation: Object has several resources -- see URI list
    #>
    #> [[2]]
    #> <Status code: 301>
    #>   Message: Moved Permanently
    #>   Explanation: Object moved permanently -- see URI list
    #>
    #> [[3]]
    #> <Status code: 302>
    #>   Message: Found
    #>   Explanation: Object moved temporarily -- see URI list
    #>
    #> [[4]]
    #> <Status code: 303>
    #>   Message: See Other
    #>   Explanation: Object moved -- see Method and URL list
    #>
    #> [[5]]
    #> <Status code: 304>
    #>   Message: Not Modified
    #>   Explanation: Document has not changed since given time
    #>
    #> [[6]]
    #> <Status code: 305>
    #>   Message: Use Proxy
    #>   Explanation: You must use proxy specified in Location to access this resource.
    #>
    #> [[7]]
    #> <Status code: 306>
    #>   Message: Switch Proxy
    #>   Explanation: Subsequent requests should use the specified proxy
    #>
    #> [[8]]
    #> <Status code: 307>
    #>   Message: Temporary Redirect
    #>   Explanation: Object moved temporarily -- see URI list
    #>
    #> [[9]]
    #> <Status code: 308>
    #>   Message: Permanent Redirect
    #>   Explanation: Object moved permanently

    http_code('30[12]')
    #> [[1]]
    #> <Status code: 301>
    #>   Message: Moved Permanently
    #>   Explanation: Object moved permanently -- see URI list
    #>
    #> [[2]]
    #> <Status code: 302>
    #>   Message: Found
    #>   Explanation: Object moved temporarily -- see URI list

    http_code('30[34]')
    #> [[1]]
    #> <Status code: 303>
    #>   Message: See Other
    #>   Explanation: Object moved -- see Method and URL list
    #>
    #> [[2]]
    #> <Status code: 304>
    #>   Message: Not Modified
    #>   Explanation: Document has not changed since given time

Search by text message

    http_search("request")
    #> [[1]]
    #> <Status code: 100>
    #>   Message: Continue
    #>   Explanation: Request received, please continue
    #>
    #> [[2]]
    #> <Status code: 200>
    #>   Message: OK
    #>   Explanation: Request fulfilled, document follows
    #>
    #> [[3]]
    #> <Status code: 202>
    #>   Message: Accepted
    #>   Explanation: Request accepted, processing continues off-line
    #>
    #> [[4]]
    #> <Status code: 203>
    #>   Message: Non-Authoritative Information
    #>   Explanation: Request fulfilled from cache
    #>
    #> [[5]]
    #> <Status code: 204>
    #>   Message: No Content
    #>   Explanation: Request fulfilled, nothing follows
    #>
    #> [[6]]
    #> <Status code: 306>
    #>   Message: Switch Proxy
    #>   Explanation: Subsequent requests should use the specified proxy
    #>
    #> [[7]]
    #> <Status code: 400>
    #>   Message: Bad Request
    #>   Explanation: Bad request syntax or unsupported method
    #>
    #> [[8]]
    #> <Status code: 403>
    #>   Message: Forbidden
    #>   Explanation: Request forbidden -- authorization will not help
    #>
    #> [[9]]
    #> <Status code: 408>
    #>   Message: Request Timeout
    #>   Explanation: Request timed out; try again later.
    #>
    #> [[10]]
    #> <Status code: 409>
    #>   Message: Conflict
    #>   Explanation: Request conflict.
    #>
    #> [[11]]
    #> <Status code: 413>
    #>   Message: Request Entity Too Large
    #>   Explanation: Entity is too large.
    #>
    #> [[12]]
    #> <Status code: 414>
    #>   Message: Request-URI Too Long
    #>   Explanation: URI is too long.
    #>
    #> [[13]]
    #> <Status code: 416>
    #>   Message: Requested Range Not Satisfiable
    #>   Explanation: Cannot satisfy request range.
    #>
    #> [[14]]
    #> <Status code: 503>
    #>   Message: Service Unavailable
    #>   Explanation: The server cannot process the request due to a high load
    #>
    #> [[15]]
    #> <Status code: 505>
    #>   Message: HTTP Version Not Supported
    #>   Explanation: Cannot fulfill request.

    http_search("forbidden")
    #> [[1]]
    #> <Status code: 403>
    #>   Message: Forbidden
    #>   Explanation: Request forbidden -- authorization will not help

    http_search("too")
    #> [[1]]
    #> <Status code: 413>
    #>   Message: Request Entity Too Large
    #>   Explanation: Entity is too large.
    #>
    #> [[2]]
    #> <Status code: 414>
    #>   Message: Request-URI Too Long
    #>   Explanation: URI is too long.

    http_search("birds")
    #> Error: No status code found for search: : birds
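For comparison, the same three lookups (exact code, fuzzy pattern, text search) can be sketched on top of Python's standard-library `http.HTTPStatus` table. This is not the httpcode library or the R package; the function names simply mirror the ones above, and the phrases and descriptions come from Python's own table, so they will not match the R output word for word.

```python
import re
from http import HTTPStatus


def http_code(code):
    """Look up an exact code (e.g. 404) or a pattern such as '1xx' or '30[12]'."""
    if isinstance(code, int):
        try:
            return [HTTPStatus(code)]
        except ValueError:
            raise LookupError(f"No description found for code: {code}") from None
    # Treat 'x' as "any digit"; anything else is passed through as a regex.
    pattern = re.compile(code.lower().replace("x", r"\d"))
    matches = [s for s in HTTPStatus if pattern.fullmatch(str(s.value))]
    if not matches:
        raise LookupError(f"No description found for code: {code}")
    return matches


def http_search(text):
    """Find statuses whose phrase or description contains `text` (case-insensitive)."""
    needle = text.lower()
    matches = [
        s
        for s in HTTPStatus
        if needle in s.phrase.lower() or needle in s.description.lower()
    ]
    if not matches:
        raise LookupError(f"No status code found for search: {text}")
    return matches


for status in http_code("30[12]"):
    print(status.value, status.phrase, "-", status.description)

for status in http_search("forbidden"):
    print(status.value, status.phrase)
```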

December 2, 2014 · 4 min · Scott Chamberlain

taxize workflows

A missed chat on the rOpenSci website the other day asked: Hi there, i am trying to use the taxize package and have a .csv file of species names to run through taxize updating them. What would be the code i would need to run to achieve this? One way to answer this is to talk about the basic approach to importing data, doing stuff to the data, then recombining data. There are many ways to do this, but I’ll go over a few of them. ...
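The import, process, recombine pattern mentioned above looks roughly like this in Python. It is a sketch of the workflow shape only, not taxize: the `KNOWN_NAMES` table is a hypothetical stand-in for a real name-resolution service, and the column names and sample rows are made up.

```python
import csv
import io

# Hypothetical lookup table standing in for a real name-resolution service.
KNOWN_NAMES = {
    "helianthus annus": "Helianthus annuus",
    "homo sapien": "Homo sapiens",
}


def resolve_name(name):
    """Return an updated name, or the cleaned input when no match is known."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return KNOWN_NAMES.get(cleaned.lower(), cleaned)


def update_species(infile, outfile, column="species"):
    """Read a CSV, add an `updated_name` column, and write all columns back out."""
    reader = csv.DictReader(infile)
    fieldnames = list(reader.fieldnames) + ["updated_name"]
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        # Keep the original value so the change can be reviewed later.
        row["updated_name"] = resolve_name(row[column])
        writer.writerow(row)


# In-memory stand-ins for the input and output files.
source = io.StringIO("species,site\nHelianthus annus,A\nQuercus robur,B\n")
result = io.StringIO()
update_species(source, result)
print(result.getvalue())
```

Writing the updated name to a new column, rather than overwriting the original, keeps the recombined table easy to audit.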

December 2, 2014 · 5 min · Scott Chamberlain