I’ve been working on a little thing called httping, a small R package that started as a package to ping URLs and time requests. It’s a port of the Ruby gem httping. The httr package is in Depends in this package, so its functions can be called directly without loading httr explicitly yourself.

In addition to timing requests, I’ve been tinkering with how to make http requests: functions that set curl options accept and return the same object, so they can be chained together, and that object is then passed to an http verb like GET. Maybe this is a bad idea, but maybe not.

Installation

Install:

One non-CRAN dep (httpcode) needs to be installed first.

install.packages("devtools")
devtools::install_github("sckott/httpcode")
devtools::install_github("sckott/httping")

Then load the package

library("httping")

Time requests

The idea with time() is to provide easy to use and understand information on how long http requests take to run. You should be able to pass any httr verb (GET(), POST(), etc.) to time(). By default, time() repeats whatever http request you pass to it 10 times, but you can change that with the count parameter. In addition, the flood parameter controls whether there is a delay between requests, and delay controls the length of that delay.
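To make the roles of count, flood, and delay concrete, here is a minimal base R sketch of the repeat-and-time idea. It is not httping's implementation: time_calls() is a hypothetical helper, the "request" is a stand-in function that just sleeps, and skipping the pause after the last call is my assumption.

```r
# Hypothetical helper, not part of httping: time repeated calls to `f`.
time_calls <- function(f, count = 10, delay = 0.5, flood = FALSE) {
  times <- numeric(count)
  for (i in seq_len(count)) {
    start <- proc.time()[["elapsed"]]
    f()
    times[i] <- (proc.time()[["elapsed"]] - start) * 1000  # seconds -> ms
    # Pause between calls unless flooding; no pause after the last call
    # (an assumption of this sketch)
    if (!flood && i < count) Sys.sleep(delay)
  }
  list(
    times = times,
    averages = c(min = min(times), max = max(times), mean = mean(times))
  )
}

# Stand-in for an http request: a call that sleeps for roughly 20 ms
res <- time_calls(function() Sys.sleep(0.02), count = 3, delay = 0.1)
res$averages
```

Setting flood = TRUE in this sketch simply skips the Sys.sleep() between calls, which is the behaviour the parameter description above suggests.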

A GET request

GET("http://google.com") %>% time(count=3)
#> 29.392 kb - http://www.google.com/ code:200 time(ms):92.444
#> 29.392 kb - http://www.google.com/ code:200 time(ms):82.127
#> 29.392 kb - http://www.google.com/ code:200 time(ms):85.587
#> <http time>
#>   Avg. min (ms):  82.127
#>   Avg. max (ms):  92.444
#>   Avg. mean (ms): 86.71933

A POST request

POST("http://httpbin.org/post", body = "A simple text string") %>% time(count=3)
#> 10.144 kb - http://httpbin.org/post code:200 time(ms):267.574
#> 10.144 kb - http://httpbin.org/post code:200 time(ms):113.309
#> 10.144 kb - http://httpbin.org/post code:200 time(ms):99.938
#> <http time>
#>   Avg. min (ms):  99.938
#>   Avg. max (ms):  267.574
#>   Avg. mean (ms): 160.2737

The return object is a list with slots for the httr response objects, the times for each request, and the average times. The number of requests and the delay between requests are included as attributes.

res <- GET("http://google.com") %>% time(count=3)
#> 29.392 kb - http://www.google.com/ code:200 time(ms):82.086
#> 29.392 kb - http://www.google.com/ code:200 time(ms):78.15
#> 29.392 kb - http://www.google.com/ code:200 time(ms):79.763
attributes(res)
#> $names
#> [1] "times"    "averages" "request" 
#> 
#> $count
#> [1] 3
#> 
#> $delay
#> [1] 0.5
#> 
#> $class
#> [1] "http_time"
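If the list-plus-attributes layout is unfamiliar, the following sketch builds an object with the same names, attributes, and class as the output above and shows how to read them back. The contents of each slot are assumptions for illustration (a numeric vector of times, a NULL placeholder for the request); the real object holds httr responses.

```r
# Illustrative only: mimic the shape of the object shown above.
times_ms <- c(82.086, 78.15, 79.763)  # the three timings from the example run

res_sketch <- structure(
  list(
    times = times_ms,
    averages = list(min = min(times_ms), max = max(times_ms),
                    mean = mean(times_ms)),
    request = NULL  # placeholder; the real slot holds httr response data
  ),
  count = 3,
  delay = 0.5,
  class = "http_time"
)

names(res_sketch)          # list slots
attr(res_sketch, "count")  # number of requests
attr(res_sketch, "delay")  # delay between requests
res_sketch$averages$mean   # averages are ordinary list elements
```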

Or print a summary of the result, which gives more detail

res %>% summary()
#> <http time, averages (min max mean)>
#>   Total (s):           78.15 82.086 79.99967
#>   Redirect (s):        26.695 34.319 29.80633
#>   Namelookup time (s): 0.025 0.03 0.028
#>   Connect (s):         0.028 0.034 0.032
#>   Pretransfer (s):     0.069 0.081 0.07633333
#>   Starttransfer (s):   45.44 49.326 47.95867

Messages are printed using cat (so suppressMessages() won't silence them); use verbose=FALSE to suppress them, like

GET("http://google.com") %>% time(count=3, verbose = FALSE)
#> <http time>
#>   Avg. min (ms):  86.12
#>   Avg. max (ms):  94.035
#>   Avg. mean (ms): 89.12467

Ping an endpoint

The idea with ping() is to simply return the http status code along with a message describing what that code means. That’s it.

This function is a bit different: it accepts a url as its first parameter, and any other arguments are passed on to httr verb functions like GET, POST, etc.

"http://google.com" %>% ping()
#> <http ping> 200
#>   Message: OK
#>   Description: Request fulfilled, document follows
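The code-to-message step can be pictured as a simple lookup. The sketch below is an assumption about the general idea, not how ping() is written: status_info and describe_status() are hypothetical names, the table covers only two codes, and the 404 wording is illustrative.

```r
# Hypothetical lookup table for illustration; it is not httping's data.
status_info <- list(
  "200" = c(message = "OK",
            description = "Request fulfilled, document follows"),
  "404" = c(message = "Not Found",
            description = "Nothing matches the given URI")  # illustrative wording
)

# Print a status code with its message and description, ping()-style
describe_status <- function(code) {
  info <- status_info[[as.character(code)]]
  if (is.null(info)) stop("status code not in this sketch's table: ", code)
  cat("<http ping> ", code, "\n",
      "  Message: ", info[["message"]], "\n",
      "  Description: ", info[["description"]], "\n", sep = "")
  invisible(info)
}

describe_status(200)
```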

Or pass in additional arguments to modify the request

"http://google.com" %>% ping(config=verbose())
#> -> GET / HTTP/1.1
#> -> User-Agent: curl/7.37.1 Rcurl/1.95.4.5 httr/0.6.1
#> -> Host: google.com
#> -> Accept-Encoding: gzip
...cutoff

Even simpler verbs

httr is already easy, but Get():

  • Allows use of an intuitive chaining workflow
  • Parses data for you using httr's built-in format guesser, which should work in most cases

A simple GET request

"http://httpbin.org/get" %>%
  Get()
#> $args
#> named list()
#> 
#> $headers
#> $headers$Accept
#> [1] "application/json, text/xml, application/xml, */*"
#> 
#> $headers$`Accept-Encoding`
#> [1] "gzip"
#> 
#> $headers$Host
#> [1] "httpbin.org"
#> 
#> $headers$`User-Agent`
#> [1] "curl/7.37.1 Rcurl/1.95.4.5 httr/0.6.1"
#> 
#> 
#> $origin
#> [1] "24.21.209.71"
#> 
#> $url
#> [1] "http://httpbin.org/get"

You can build up options by calling functions

"http://httpbin.org/get" %>%
  Progress() %>%
  Verbose()
#> <http request> 
#>   url: http://httpbin.org/get
#>   config: 
#> Config: 
#> List of 4
#>  $ noprogress      :FALSE
#>  $ progressfunction:function (...)  
#>  $ debugfunction   :function (...)  
#>  $ verbose         :TRUE
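The pattern behind this is that each option function accepts either a url or a request object and returns a request object, so calls compose in any order. Here is a minimal base R sketch of that pattern; every name in it (new_request(), with_verbose(), with_progress(), describe_get()) is hypothetical, and it only records options rather than making a request.

```r
# Hypothetical names throughout; this is not httping's implementation.
new_request <- function(url) {
  structure(list(url = url, config = list()), class = "request_sketch")
}

# Accept a url string or an existing request, so either can start a chain
as_request <- function(x) {
  if (inherits(x, "request_sketch")) x else new_request(x)
}

# Each option function takes a request (or url) and returns a request
with_verbose <- function(x) {
  req <- as_request(x)
  req$config$verbose <- TRUE
  req
}

with_progress <- function(x) {
  req <- as_request(x)
  req$config$noprogress <- FALSE
  req
}

# Terminal step: a real version would hand url and config to an http verb
describe_get <- function(x) {
  req <- as_request(x)
  paste0("GET ", req$url, " [", paste(names(req$config), collapse = ", "), "]")
}

req <- with_progress(with_verbose("http://httpbin.org/get"))
describe_get(req)
```

Because every step returns the same kind of object, the nested call above could equally be written as a pipe chain, which is what makes the workflow read left to right.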

Then eventually execute the GET request

"http://httpbin.org/get" %>%
  Verbose() %>%
  Progress() %>%
  Get()
#> -> GET /get HTTP/1.1
#> -> User-Agent: curl/7.37.1 Rcurl/1.95.4.5 httr/0.6.1
#> -> Host: httpbin.org
#> -> Accept-Encoding: gzip
#> -> Accept: application/json, text/xml, application/xml, */*
#> -> 
#> <- HTTP/1.1 200 OK
#> <- Server: nginx
#> <- Date: Fri, 30 Jan 2015 17:38:58 GMT
#> <- Content-Type: application/json
#> <- Content-Length: 288
#> <- Connection: keep-alive
#> <- Access-Control-Allow-Origin: *
#> <- Access-Control-Allow-Credentials: true
#> <- 
#>   |=======================================| 100%
#> 
#> $args
#> named list()
#> 
#> $headers
#> $headers$Accept
#> [1] "application/json, text/xml, application/xml, */*"
#> 
#> $headers$`Accept-Encoding`
#> [1] "gzip"
#> 
#> $headers$Host
#> [1] "httpbin.org"
#> 
#> $headers$`User-Agent`
#> [1] "curl/7.37.1 Rcurl/1.95.4.5 httr/0.6.1"
#> 
#> 
#> $origin
#> [1] "24.21.209.71"
#> 
#> $url
#> [1] "http://httpbin.org/get"
#>