Recently, she has started searching for potential thesis topics and mentioned that meta-analyses were particularly interesting to her. To help her, and hopefully other non-statisticians who are interested in performing a meta-analysis, I have put together a short tutorial on running a simple meta-analysis in R.
Searching for the words 'meta-analysis' or 'meta analysis' on CRAN and Bioconductor currently yields 34 and 6 available R packages, respectively. Some are meant for a specific type of data (e.g. genomic data such as microarrays), but most of these R packages are meant for combining summary statistics of discrete or continuous data extracted from a set of carefully selected published studies.
Before starting a meta-analysis, there are many important questions to be answered, such as:
- How do you pick which studies to include in the meta-analysis? What are possible biases in selecting the studies?
- What effect are you interested in measuring? What data needs to be extracted from the papers?
- What type of meta-analysis should be performed?
- What software tools/packages are available to perform a meta-analysis?
- How do you interpret the output of the software tool used? How do you know if the meta-analysis yielded anything statistically valid and significant?
Generally speaking, there are four types of meta-analyses:
- univariate meta-analysis
  - n studies comparing two treatments (e.g. case/control) with a dichotomous outcome
- multivariate meta-analysis
  - n studies comparing two treatments with multiple outcomes
- meta-regression
  - n studies comparing two treatments with a dichotomous outcome, where you can also investigate the impact of additional "moderator" or explanatory variables (e.g. year of study) on the outcome
- network meta-analysis (also known as multiple treatment meta-analysis)
  - n studies comparing multiple treatments with a dichotomous outcome
For example, here is a brief summary of the meta R package:
- Description: Simple package to estimate fixed-effects and random-effects models for binary and continuous data in a univariate meta-analysis. Meta-regression is also available.
- Documentation: http://cran.r-project.org/web/packages/meta/index.html
- Useful Notes: Use metabin() for binary data and metacont() for continuous data. With continuous data you can estimate the mean difference; with binary data you can estimate the risk ratio, odds ratio, risk difference or arcsine difference by setting the "sm = " argument. Try print(), summary(), forest(), funnel(), labbe() and metabias() to analyze the results of the meta-analysis. Use metareg() for meta-regression.
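To make the "sm = " choices a little more concrete, here is a minimal base-R sketch that computes the four binary-outcome summary measures by hand for a single study. The counts and the helper name effect_measures() are made up for illustration and are not part of the meta package. The labels RR, OR, RD and ASD follow, as far as I know, the codes that metabin() accepts for "sm = ".

```r
# Hypothetical helper (not part of the meta package): the four summary
# measures for one two-arm study with a binary outcome.
effect_measures <- function(event.e, n.e, event.c, n.c) {
  p.e <- event.e / n.e   # event risk in the experimental arm
  p.c <- event.c / n.c   # event risk in the control arm
  c(RR  = p.e / p.c,                               # risk ratio
    OR  = (p.e / (1 - p.e)) / (p.c / (1 - p.c)),   # odds ratio
    RD  = p.e - p.c,                               # risk difference
    ASD = asin(sqrt(p.e)) - asin(sqrt(p.c)))       # arcsine difference
}

# Made-up counts: 10/20 events in the experimental arm, 5/20 in the control arm
one.study <- effect_measures(event.e = 10, n.e = 20, event.c = 5, n.c = 20)
print(round(one.study, 3))
```

A meta-analysis then pools one of these measures across all studies, which is the part metabin() takes care of.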
Simulated data example:
Consider a univariate meta-analysis with n = 10 studies comparing two treatments (drug A and drug B) and a dichotomous outcome (e.g. death, no death). The goal is to estimate an overall odds ratio of death in the drug A group relative to the drug B group.
# Load library
library(meta)
# number of studies
n.studies = 10
# number of treatments: case, control
n.trt = 2
# number of outcomes
n.event = 2
# simulate the number of cases and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.5)
case.group = rbinom(n = n.studies, size = 200, prob = 0.5)
# the number of events and no events (i.e. outcomes) in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group,
                         prob = rep(.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group
# the number of events and no events in the case group
event.case.group = rbinom(n = n.studies, size = case.group,
                          prob = rep(.5, length(case.group)))
noevent.case.group = case.group - event.case.group
# Run the univariate meta-analysis using metabin()
# (case group = experimental arm, control group = control arm)
mymeta <- metabin(event.e = event.case.group, n.e = case.group,
                  event.c = event.ctl.group, n.c = ctl.group,
                  method = "MH", sm = "OR")
# Draw the forest plot of the study-level and pooled odds ratios
forest(mymeta)
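Based on the simulated data, the odds ratio of death in the drug A group relative to the drug B group is 8.15, which is statistically significant because the 95% confidence interval of (6.44, 10.30) from the fixed-effects model does not contain the value 1. Because the simulation does not set a random seed, rerunning the code will give slightly different numbers.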
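For readers curious about what the Mantel-Haenszel method is doing, the pooled odds ratio can be sketched by hand in base R. This is a simplified illustration, not the meta package's implementation: the helper name mh_odds_ratio() and the counts are made up, no continuity correction is applied for studies with zero cells, and no confidence interval is computed.

```r
# Hypothetical helper: Mantel-Haenszel pooled odds ratio from 2x2 counts.
mh_odds_ratio <- function(event.e, n.e, event.c, n.c) {
  a  <- event.e           # events, experimental arm
  b  <- n.e - event.e     # non-events, experimental arm
  c0 <- event.c           # events, control arm
  d  <- n.c - event.c     # non-events, control arm
  n  <- n.e + n.c         # total sample size of each study
  sum(a * d / n) / sum(b * c0 / n)
}

# Made-up counts for three studies; each study's own odds ratio is 3,
# so the pooled estimate is 3 as well
or.mh <- mh_odds_ratio(event.e = c(10, 20, 15), n.e = c(20, 40, 30),
                       event.c = c(5, 10, 5),   n.c = c(20, 40, 20))
print(or.mh)
```

Larger studies contribute more to both sums, which is how the method weights them more heavily than small studies.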
For a further discussion of the differences between fixed-effects and random-effects models, Wikipedia has a fairly easy to understand description. The main thing to understand is that if your studies are considered to be "heterogeneous", then you will need to use a random-effects model. Otherwise you should use a fixed-effects model. Heterogeneity is commonly assessed with Cochran's Q test and quantified with the I^2 statistic. The forest plot reports the test of heterogeneity, in which the null hypothesis is that there is no between-study heterogeneity and the fixed-effects model should be used. Because the p-value (p = 0.6586) was greater than the significance level of \alpha = 0.05, we fail to reject the null hypothesis and use the fixed-effects model for the meta-analysis.
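As a rough illustration of the heterogeneity check, Cochran's Q and I^2 can be computed by hand from study-level effect estimates and their variances. The sketch below uses base R only. The helper name heterogeneity() and the input values are made up, and the meta package may compute these quantities slightly differently depending on the pooling method.

```r
# Hypothetical helper: Cochran's Q, its p-value and I^2 from study-level
# effect estimates (yi, e.g. log odds ratios) and their variances (vi).
heterogeneity <- function(yi, vi) {
  w    <- 1 / vi                        # inverse-variance weights
  ybar <- sum(w * yi) / sum(w)          # fixed-effect pooled estimate
  Q    <- sum(w * (yi - ybar)^2)        # Cochran's Q
  df   <- length(yi) - 1                # degrees of freedom
  p    <- pchisq(Q, df = df, lower.tail = FALSE)
  I2   <- if (Q > 0) max(0, (Q - df) / Q) else 0
  list(Q = Q, df = df, p.value = p, I2 = I2)
}

# Made-up log odds ratios with equal variances
het <- heterogeneity(yi = c(0, 2, 4), vi = c(1, 1, 1))
str(het)
```

A small p-value for Q, or a large I^2, suggests the studies differ by more than chance alone would explain.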
For an overview of the R packages CRAN has to offer, a Task View dedicated specifically to meta-analyses is available. Another good resource is from the 2013 UseR conference.
Note: there are also many other software tools available outside of R which may be of interest: MetaEasy in Excel and similar functions in Stata, SPSS and SAS.
Part of her curriculum has included several statistics courses (which she aced!) so you can imagine the statistician in me is beaming with pride!