Using Monte Carlo Simulations in R to Test Methodological Advances in Social Policy Research
This is a guest post written by Kristin Porter, Senior Research Associate at MDRC. MDRC is a nonprofit, nonpartisan education and social policy research organization dedicated to learning what works to improve programs and policies that affect the poor.
Methodology Innovation at MDRC
For more than 40 years, MDRC has been a leader in advancing the most rigorous research methods in program evaluation and in sharing what we have learned with the field. MDRC is at the forefront of both the theoretical refinement and practical use of cutting-edge designs to solve real-world research problems.
The following examples describe just a few recent and ongoing areas of methodological research, often done in collaboration with researchers at other institutions:
- Beyond measuring average program impacts, it is important to understand how impacts vary. Therefore, MDRC researchers are investigating conceptual and statistical issues involved in using multisite randomized trials to learn about and from variation in program effects across individuals and across program sites. Learning about variation in program effects involves detecting and quantifying it.
- In recent years, the regression discontinuity design (RDD) has gained widespread recognition as a quasi-experimental method that, when used correctly, can produce internally valid estimates of causal effects of a treatment, a program, or an intervention. In a traditional RDD, subjects (e.g., students) are rated according to a single, numeric index (e.g., a test score), and treatment assignment is determined by whether one’s rating falls above or below an exogenously defined cut-point value of the rating. The parameter of interest in a traditional RDD is the average treatment effect at the cut-point, or “treatment frontier,” which represents the average effect for the subpopulation of individuals whose ratings equal the cut-point value. MDRC researchers have recently developed recommendations for estimating causal parameters in multiple-rating RDDs and are currently investigating the external validity, or generalizability, of RDD parameters.
- Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical procedures that counteract this problem by adjusting p-values for effect estimates upward. While MTPs are increasingly used in impact evaluations in education and other areas, an important consequence of their use is a change in statistical power that can be substantial. Unfortunately, researchers frequently ignore the power implications of MTPs when designing studies. Consequently, in some cases, sample sizes may be too small, and studies may be underpowered to detect effects as small as desired. In other cases, sample sizes may be larger than needed, or studies may be powered to detect smaller effects than anticipated. MDRC researchers developed methods for estimating multiple definitions of power when using MTPs and produced empirical findings on how power is affected by the use of MTPs.
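To make the upward p-value adjustment in the last item concrete, here is a minimal sketch using base R's `p.adjust`. The raw p-values are invented for illustration, and the three procedures shown (Bonferroni, Holm, Benjamini-Hochberg) are simply common options in base R, not necessarily the ones MDRC studied.

```r
# Hypothetical raw p-values from five outcome tests (illustrative, not MDRC data)
raw_p <- c(0.004, 0.012, 0.028, 0.041, 0.200)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(raw_p, method = "bonferroni")

# Holm: step-down procedure, never more conservative than Bonferroni
p_holm <- p.adjust(raw_p, method = "holm")

# Benjamini-Hochberg: controls the false discovery rate rather than
# the familywise error rate
p_bh <- p.adjust(raw_p, method = "BH")

adjusted <- data.frame(raw = raw_p, bonferroni = p_bonf, holm = p_holm, BH = p_bh)
print(adjusted)

# Number of tests still significant at alpha = 0.05 under each approach
rejections <- colSums(adjusted < 0.05)
print(rejections)
```

Every adjusted p-value is at least as large as its raw counterpart, so fewer tests clear the 0.05 threshold after adjustment. That loss of rejections is the change in power the item describes.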
MDRC researchers use a variety of approaches to thoroughly test and validate new research methods before using them in our studies and sharing them with the field. These include theoretical derivations, empirical comparisons using real data, and empirical checks using simulated data. Moreover, methods researchers at MDRC publish widely in peer-reviewed journals.
How MDRC Tests New Methods with Monte Carlo Simulations
It is the empirical checks using simulated data that led us to Domino. MDRC often employs Monte Carlo simulation techniques, which are commonplace in statistical research but can be very computationally intensive. With Domino’s computing platform for R, we have been able to dramatically speed up our simulation analyses.
Monte Carlo simulation relies on a computer to generate a large number of data samples from a population, which is characterized by a data generating distribution. Because the data generating distribution is specified by the analyst, values of population parameters are known, and estimators of those parameters can be evaluated in terms of their statistical properties (e.g., bias and variance).
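As a rough illustration of this idea, the sketch below evaluates a simple difference-in-means impact estimator. The data generating distribution, sample size, number of replications, and true effect are all assumptions chosen for the example, not values from MDRC's work.

```r
set.seed(2024)  # arbitrary seed so the sketch is reproducible

n_reps      <- 2000   # number of simulated samples
n           <- 200    # sample size of each simulated study
true_effect <- 0.25   # known population parameter (hypothetical value)

# Draw one sample from the data generating distribution and return the estimate
simulate_once <- function() {
  # Half control, half treatment; ordering is irrelevant because noise is i.i.d.
  treat   <- rep(c(0, 1), each = n / 2)
  outcome <- true_effect * treat + rnorm(n)   # standard normal noise
  mean(outcome[treat == 1]) - mean(outcome[treat == 0])
}

estimates <- replicate(n_reps, simulate_once())

# Because true_effect is known, the estimator's properties can be measured
bias     <- mean(estimates) - true_effect
variance <- var(estimates)

cat("Estimated bias:    ", round(bias, 4), "\n")
cat("Estimated variance:", round(variance, 4), "\n")
```

Under these assumptions the estimator is unbiased and its theoretical variance is 1/100 + 1/100 = 0.02, so the simulated bias should sit near zero and the simulated variance near 0.02.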
Monte Carlo simulation can also be used for estimating statistical power (or sample size requirements or minimum detectable effects) for studies for which closed-form equations for statistical power cannot be derived (or cannot be easily derived). For example, closed-form equations for statistical power often do not exist when analyses use the MTPs described above. MDRC researchers have developed a method for estimating statistical power when making adjustments for multiple tests that does not require Monte Carlo simulation. (The method does, however, rely on a more limited simulation: of test statistics rather than of data.) The methodology is much easier and much faster to implement than Monte Carlo simulation, but we relied on numerous full Monte Carlo simulations, which we ran on Domino’s platform in R, to validate our methodology.
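The following is a minimal sketch of a full Monte Carlo power simulation with an MTP. It is not MDRC's method. The design (three independent outcomes, 100 subjects per arm, a true effect of 0.3 standard deviations on every outcome), the Holm adjustment, and the three power definitions are assumptions made for illustration. Real outcomes are usually correlated, which this sketch ignores.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

n_reps      <- 1000   # number of simulated studies
n_per_arm   <- 100    # subjects in each of the treatment and control groups
n_outcomes  <- 3      # number of outcomes tested per study
effect_size <- 0.3    # hypothetical true effect (SD units) on every outcome
alpha       <- 0.05

# One simulated study: run a t-test on each outcome and return raw p-values
simulate_pvalues <- function() {
  vapply(seq_len(n_outcomes), function(j) {
    control   <- rnorm(n_per_arm)
    treatment <- rnorm(n_per_arm, mean = effect_size)
    t.test(treatment, control)$p.value
  }, numeric(1))
}

# Rows are simulated studies, columns are outcomes
raw_p  <- t(replicate(n_reps, simulate_pvalues()))
holm_p <- t(apply(raw_p, 1, p.adjust, method = "holm"))

# Three illustrative definitions of power
power_summary <- function(p) {
  reject <- p < alpha
  c(individual   = mean(reject),                          # average per-outcome power
    at_least_one = mean(rowSums(reject) >= 1),            # detect any effect
    complete     = mean(rowSums(reject) == n_outcomes))   # detect all effects
}

power_table <- rbind(unadjusted = power_summary(raw_p),
                     holm       = power_summary(holm_p))
print(round(power_table, 3))
```

Comparing the two rows shows how much power the adjustment costs under each definition. Each scenario needs thousands of simulated datasets and tests, which is why simulations like this become computationally intensive once they are repeated across many designs.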
Learn More about MDRC
MDRC is best known for mounting large-scale demonstrations and studies of real-world policies and programs targeted to low-income people. We helped pioneer the use of random assignment — the same highly reliable method used to test new medicines — in real world settings for social policy studies. From welfare policy to high school reform, MDRC’s work has helped to shape legislation, program design, and operational practices across the country. Working in fields where emotion and ideology often dominate public debates, MDRC is a source of objective, rigorous evidence about solutions that can be replicated and expanded to scale.
To learn more about MDRC, about the wide range of studies we conduct, about the impact of our work, and about our methodology research, visit us at www.mdrc.org, or contact firstname.lastname@example.org. You can also follow MDRC on Twitter at @MDRC_News and on Facebook at MDRCNews.