The paper
One paper from my graduate work asked most generally ~ “How much does the variation in magnitudes and signs of species interaction outcomes vary?”. More specifically, we wanted to know if variation differed among species interaction classes (mutualism, competition, predation), and among various “gradients” (space, time, etc.). To answer this question, we used a meta-analysis approach (rather than e.g., a field experiment). We published the paper recently.
p.s. I really really wish we would have put it in an open access journal…
The data
Anyway, I’m here to talk about the data. We didn’t get the data up with the paper, but it is up on Figshare now. The files there are the following:
coniditionality.R
- script used to process the data fromvariables_prelim.csv
variables_prelim.csv
- description of variables in the preliminary data set, matchesconditionality_data_prelim.csv
variables_used.csv
- description of variables in the used data set, matchesconditionality_data_used.csv
conditionality_data_prelim.csv
- preliminary data, the raw dataconditionality_data_used.csv
- the data used for our paperREADME.md
- the readmepaper_selection.csv
- the list of papers we went through, with remarks about paper selection
Please do play with the data, publish some papers, etc, etc. It took 6 of us about 4 years to collect this data; we skimmed through ~11,000 papers on the first pass (aka. skimming through abstracts in Google Scholar and Web of Science), then decided on nearly 500 papers to get data from, and narrowed down to 247 papers for the publication mentioned above. Now, there was no funding for this, so it was sort of done in between other projects, but still, it was simply A LOT of tables to digitize, and graphs to extract data points from. Anyway, hopefully you will find this data useful :p
EML
I think this dataset would be a great introduction to the potential power of EML (Ecological Metadata Langauge). At rOpenSci, one of our team Carl Boettiger, along with Claas-Thido Pfaff, Duncan Temple Lang, Karthik Ram, and Matt Jones, have created an R client for EML, to parse EML files and to create and publish them.
What is EML?/Why EML?
A demonstration is in order…
Example using EML with this dataset
Install EML
library("devtools")
install.packages("RHTMLForms", repos = "http://www.omegahat.org/R/", type="source")
install_github("ropensci/EML", build=FALSE, dependencies=c("DEPENDS", "IMPORTS"))
Load EML
library('EML')
Prepare metadata
# dataset
prelim_dat <- read.csv("conditionality_data_prelim.csv")
# variable descriptions for each column
prelim_vars <- read.csv("variables_prelim.csv", stringsAsFactors = FALSE)
Get column definitions in a vector
col_defs <- prelim_vars$description
Create unit definitions for each column
unit_defs <- list(
c(unit = "number",
bounds = c(0, Inf)),
c(unit = "number",
bounds = c(0, Inf)),
"independent replicates",
c(unit = "number",
bounds = c(0, Inf)),
... <CUTOFF>
)
Write an EML file
eml_write(prelim_dat,
unit.defs = unit_defs,
col.defs = col_defs,
creator = "Scott Chamberlain",
contact = "myrmecocystus@gmail.com",
file = "conditionality_data_prelim_eml.xml")
## [1] "conditionality_data_prelim_eml.xml"
Validate the EML file
eml_validate("conditionality_data_prelim_eml.xml")
## EML specific tests XML specific tests
## TRUE TRUE
Read data and metadata
gg <- eml_read("conditionality_data_prelim_eml.xml")
eml_get(gg, "contact")
## [1] "myrmecocystus@gmail.com"
eml_get(gg, "citation_info")
## Chamberlain S (2014-10-06). _metadata_.
dat <- eml_get(gg, "data.frame")
head(dat[,c(1:10)])
## order i indrep avg author_last finit_1 finit_2 finit_abv co_author
## 1 1 1 a 1 Devall margaret s ms Thein
## 2 2 1 a 2 Devall margaret s ms Thein
## 3 3 1 a 3 Devall margaret s ms Thein
## 4 4 1 a 4 Devall margaret s ms Thein
## 5 5 1 a 5 Devall margaret s ms Thein
## 6 6 1 a 6 Devall margaret s ms Thein
## sinit_1
## 1 leonard
## 2 leonard
## 3 leonard
## 4 leonard
## 5 leonard
## 6 leonard
Publish
We can also use the EML
package to publish the data, here to Figshare.
First, install rfigshare
install.packages("rfigshare")
library('rfigshare')
Then publish using eml_publish()
figid <- eml_publish(
file = "conditionality_data_prelim_eml.xml",
description = "EML file for Chamberlain, S.A., J.A. Rudgers, and J.L. Bronstein. 2014. How context-dependent are species interactions. Ecology Letters",
categories = "Ecology",
tags = "EML",
destination = "figshare",
visibility = "public",
title = "condionality data, EML")
fs_make_public(figid)