I was looking for a fast way to make a matrix with a randomly allocated 0 or 1 in each cell. I reached out on Twitter and got many responses (thanks, tweeps!).
Here is the solution I came up with.
See if you can tell why it is slow.
mm <- matrix(0, 10, 5)
apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 1 0 1
[2,] 0 0 1 1 1
[3,] 0 0 0 0 1
[4,] 0 1 1 0 1
[5,] 0 1 1 1 1
[6,] 1 0 1 1 1
[7,] 0 1 0 1 0
[8,] 0 0 1 0 1
[9,] 1 0 1 1 1
[10,] 1 0 0 1 1
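As a hint at the answer, here is a minimal sketch that counts how often the anonymous function gets called. The counter `calls` is a name I am adding for illustration; it is not part of the original solution.

```r
mm <- matrix(0, 10, 5)

# Hypothetical counter, added only to show how often FUN runs
calls <- 0

res <- apply(mm, c(1, 2), function(x) {
  calls <<- calls + 1        # bump the counter on every invocation
  sample(c(0, 1), 1)         # draw a single 0 or 1 for this cell
})

calls  # one call per cell, i.e. nrow(mm) * ncol(mm)
```

With `MARGIN = c(1, 2)`, `apply` visits every cell on its own, so `sample` is called once per cell rather than once for the whole matrix.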
Ted Hart (@distribecology) replied first with:
matrix(rbinom(10 * 5, 1, 0.5), ncol = 5, nrow = 10)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 0 1 1
[2,] 1 0 0 1 0
[3,] 0 1 0 0 0
[4,] 0 0 1 0 0
[5,] 1 0 1 0 0
[6,] 0 0 0 0 1
[7,] 1 0 0 0 0
[8,] 0 1 0 1 0
[9,] 1 1 1 1 0
[10,] 0 1 1 0 0
Next, David Smith (@revodavid) and Rafael Maia (@hylospar) came up with about the same solution.
m <- 10
n <- 5
matrix(sample(0:1, m * n, replace = TRUE), m, n)
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 1
[2,] 0 0 0 0 0
[3,] 0 1 1 0 1
[4,] 1 0 0 1 0
[5,] 0 0 0 0 1
[6,] 1 0 1 1 1
[7,] 1 1 1 1 0
[8,] 0 0 0 1 1
[9,] 1 0 0 0 1
[10,] 0 1 0 1 1
Then there was the solution by Luis Apiolaza (@zentree).
m <- 10
n <- 5
round(matrix(runif(m * n), m, n))
[,1] [,2] [,3] [,4] [,5]
[1,] 0 1 1 0 0
[2,] 1 0 1 1 0
[3,] 1 0 1 0 0
[4,] 1 0 0 0 1
[5,] 1 0 1 1 0
[6,] 1 0 0 0 0
[7,] 1 0 0 0 0
[8,] 1 1 1 0 0
[9,] 0 0 0 0 1
[10,] 1 0 0 1 1
Last, a solution was proposed using RcppArmadillo. I couldn't get it to work on my machine, but here is the function anyway in case someone else can.
library(inline)
library(RcppArmadillo)
f <- cxxfunction(body = "return wrap(arma::randu(5,10));", plugin = "RcppArmadillo")
And here is the comparison of system.time for each solution.
mm <- matrix(0, 10, 5)
m <- 10
n <- 5
system.time(replicate(1000, apply(mm, c(1, 2), function(x) sample(c(0, 1), 1)))) # @recology_
user system elapsed
0.470 0.002 0.471
system.time(replicate(1000, matrix(rbinom(10 * 5, 1, 0.5), ncol = 5, nrow = 10))) # @distribecology
user system elapsed
0.014 0.000 0.015
system.time(replicate(1000, matrix(sample(0:1, m * n, replace = TRUE), m, n))) # @revodavid & @hylospar
user system elapsed
0.015 0.000 0.014
system.time(replicate(1000, round(matrix(runif(m * n), m, n)))) # @zentree
user system elapsed
0.014 0.000 0.014
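If you want to rerun the comparison in one go, here is a rough sketch that wraps each solution in a function and times them in a loop. The names `approaches` and `timings` are mine, I use 200 replicates instead of 1000 to keep it quick, and the numbers you get will depend on your machine.

```r
m <- 10
n <- 5

# Each solution from above, wrapped in a zero-argument function
approaches <- list(
  apply_sample = function() {
    mm <- matrix(0, m, n)
    apply(mm, c(1, 2), function(x) sample(c(0, 1), 1))
  },
  rbinom = function() matrix(rbinom(m * n, 1, 0.5), nrow = m, ncol = n),
  sample = function() matrix(sample(0:1, m * n, replace = TRUE), m, n),
  runif  = function() round(matrix(runif(m * n), m, n))
)

# Elapsed seconds for 200 replicates of each approach
timings <- sapply(approaches, function(f) {
  system.time(replicate(200, f()))[["elapsed"]]
})

timings
```

Keeping the solutions in a named list makes it easy to add another one later and compare it under the same conditions.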
The three vectorized solutions each finished 1000 replicates in about 0.015 seconds, roughly 30 times faster than my apply version. If you already know C++ or want to take the time to learn it, the RcppArmadillo option would likely be the fastest. But I think many scientists, especially ecologists, don't already know C++, so we will probably stick with the next-fastest options.
Get the .Rmd file used to create this post at my GitHub account.
Written in Markdown, with help from knitr, and nice knitr highlighting/etc. in RStudio.