Open Science

Science is becoming more open in many areas: publishing, data sharing, lab notebooks, and software. Open science has many benefits. For example, sharing research data alongside your publications is associated with an increased citation rate (Piwowar et al. 2007). In addition, data is becoming easier to share and reuse thanks to efforts like FigShare and Dryad.

If you don’t understand the problem we are currently facing due to lack of open science, watch this video:

I just want Data

Another way to look at this challenge is to think about how you could get data more easily. Right now you probably go to a website that provides an interface to a database. You run a search and then download the results, perhaps as a .csv file. Then you open the file in Excel and build some pivot tables to get the data into the right format. Only then do you bring the data into R.

The advantage of using our packages is that the data collection part takes only a few lines of code, so every step in the paragraph above can live in a single R file. Why does this matter? You can more easily reproduce your own work months later, after that summer vacation. In addition, others can reproduce your research more easily.
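As a rough illustration of the workflow described above, here is a minimal sketch in base R. It does not use any rOpenSci package; the data, and the names `csv_text`, `obs`, and `pivot`, are made up for this example. A small inline .csv stands in for the file you would otherwise download by hand.

```r
# Hypothetical data standing in for a .csv exported from a database website.
# In real use you would give read.csv() a file path or URL instead of `text =`.
csv_text <- "species,site,count
Quercus alba,A,12
Quercus alba,B,7
Acer rubrum,A,5
Acer rubrum,B,9
Acer rubrum,A,3"

# Step 1: read the data straight into R, with no manual download or Excel step.
obs <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Step 2: the "pivot table" step, summing counts by species (rows) and site (columns).
pivot <- xtabs(count ~ species + site, data = obs)

# Step 3: the reshaped data are already in R and ready for analysis.
print(pivot)
```

Because every step lives in one script, rerunning the file months later repeats the whole process exactly.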

The challenge

We (rOpenSci) have just kicked off the rOpenSci Open Science Challenge. If you aren’t familiar with rOpenSci, it is a software collective connecting scientists to open science data on the web. Since R is the most popular programming language for life scientists, it made sense to do this in R (rather than, say, Python).

What is the challenge about?

At rOpenSci, we create R software that makes it easy to get open access text from publications and open data. An important result of this is that we are facilitating open science. Why? Because R is an open source programming language, and all of our software is open source. This challenge asks you to propose a project using one or more of our packages, or perhaps to propose a new dataset to connect to R. The rOpenSci core developer team will help you with any problems using our packages, and will try to modify packages according to feedback from participants. Do you use one or more of our R packages? If you do, great. If not, check out our packages here.

How to apply

Just send an email to

The deadline

January 31, 2013

Get the .Rmd file used to create this post, or the .md file, from my GitHub account.

Written in Markdown, with help from knitr, and knitcitations from Carl Boettiger.


Piwowar HA, Day RS and Fridsma DB (2007). “Sharing Detailed Research Data is Associated With Increased Citation Rate.” PLoS ONE, 2(3): e308.