Three-level synthesis of single-subject experimental data: Further extensions, empirical validation and applications

A dissertation summary from the 2016 Anne Anastasi Dissertation Award winner.

By Mariola Moeyaert

Single-subject experimental design (SSED) studies have provided scientifically sound evaluations of treatment effects (National Research Council, 2002; Shadish & Rindskopf, 2007) for more than 50 years in a variety of research fields, such as biomedical research, school effectiveness, behavior modification, school psychology, and special education (Gast, 2010; Kennedy, 2005; Kratochwill, 1978; Tawney & Gast, 1984; Busse, Kratochwill, & Elliott, 1995; Chorpita, Albano, Heimberg, & Barlow, 1996; Barlow & Hersen, 1984; Kratochwill & Levin, 1992). Because SSEDs are popular within and across these fields, a large number of SSED studies are available for quantitative synthesis (Shadish & Rindskopf, 2007), which was the main focus of my doctoral dissertation.

In my dissertation, I focused on one flexible methodological framework that can be used to summarize SSED data across subjects and across studies, namely three-level modeling. My dissertation comprised two large parts:

  1. Empirical validation of the methodology of multilevel modeling.
  2. Enhancing the understanding and promoting the use of multilevel models to summarize effect sizes. 

Single-subject experimental designs  

SSEDs are used to investigate the effectiveness of one or multiple treatments (Barlow, Nock & Hersen, 2009; Morgan & Morgan, 2001). In an SSED study, one or a few subjects (or other entities) are the focus of interest and are measured repeatedly during successive conditions, usually a baseline condition (in which no treatment is present) and a treatment condition (Barlow et al., 2009; Kazdin, 2011; Onghena, 2005). By comparing scores from both kinds of conditions, a single-case researcher can assess the functional relationship between the condition and the outcome scores on the dependent variable (e.g., the score on a statistical test). There are a variety of SSED types, of which multiple-baseline across-participants designs, phase-change reversal designs and alternating treatment designs are the most popular (Shadish & Sullivan, 2011) because they have the potential to give three demonstrations of the effectiveness of a treatment at three different points in time (Kratochwill et al., 2010).

Quantitative synthesis of single-case experimental data

Although SSEDs are growing in popularity and are valued, their external validity is often questioned because of the small number of subjects investigated in a single SSED study. To establish an evidence base for treatment effects, several SSED studies can be combined, at which point a three-level data structure becomes visible: measurement occasions are nested within subjects, and subjects in turn are nested within studies (Van den Noortgate & Onghena, 2003a, 2003b, 2008; see Figure 2).

Figure 2. Graphical Display of the Three-Level Structure

At the first level, variability in outcome scores between measurement occasions ($i = 0, 1, \ldots, I$) within cases ($j = 0, 1, \ldots, J$) within studies ($k = 0, 1, \ldots, K$) can be modelled as a function of a time variable, $T_{ijk}$ (going from zero to the end of the experiment), a dummy-coded variable, $D_{ijk}$ (equaling 0 if $Y_{ijk}$ belongs to the baseline and 1 if $Y_{ijk}$ belongs to the treatment), and an interaction between $T_{ijk}$ and $D_{ijk}$:

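$$Y_{ijk} = \beta_{0jk} + \beta_{1jk} T_{ijk} + \beta_{2jk} D_{ijk} + \beta_{3jk} T_{ijk} D_{ijk} + e_{ijk}, \qquad e_{ijk} \sim N(0, \sigma^2_e) \tag{1}$$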

As a consequence, $\beta_{2jk}$ and $\beta_{3jk}$ refer to the immediate treatment effect and the treatment effect on the time trend, respectively, for case $j$ nested within study $k$. The residual $e_{ijk}$ captures the within-case variability. Note that the design matrix can be set up in a different way, resulting in different quantifications of treatment effects (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2014). At the second level, variability in the case-specific coefficients ($\beta_{0jk}$, $\beta_{1jk}$, $\beta_{2jk}$, $\beta_{3jk}$) between cases within studies can be described using equation 2:

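$$
\begin{pmatrix} \beta_{0jk} \\ \beta_{1jk} \\ \beta_{2jk} \\ \beta_{3jk} \end{pmatrix}
=
\begin{pmatrix} \theta_{00k} \\ \theta_{10k} \\ \theta_{20k} \\ \theta_{30k} \end{pmatrix}
+
\begin{pmatrix} u_{0jk} \\ u_{1jk} \\ u_{2jk} \\ u_{3jk} \end{pmatrix},
\qquad
\begin{pmatrix} u_{0jk} \\ u_{1jk} \\ u_{2jk} \\ u_{3jk} \end{pmatrix}
\sim N\left(\mathbf{0},
\begin{pmatrix}
\sigma^2_{u_0} & \sigma_{u_0 u_1} & \sigma_{u_0 u_2} & \sigma_{u_0 u_3} \\
\sigma_{u_0 u_1} & \sigma^2_{u_1} & \sigma_{u_1 u_2} & \sigma_{u_1 u_3} \\
\sigma_{u_0 u_2} & \sigma_{u_1 u_2} & \sigma^2_{u_2} & \sigma_{u_2 u_3} \\
\sigma_{u_0 u_3} & \sigma_{u_1 u_3} & \sigma_{u_2 u_3} & \sigma^2_{u_3}
\end{pmatrix}\right) \tag{2}
$$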

$\theta_{20k}$ and $\theta_{30k}$ refer to the study-specific immediate treatment effect and treatment effect on the time trend, respectively. The diagonal elements of the variance-covariance matrix are the between-case variances (for instance, $\sigma^2_{u_2}$ is the between-case variance in the immediate treatment effect), whereas the off-diagonal elements are the covariances (for instance, $\sigma_{u_2 u_3}$ is the covariance between the immediate treatment effect and the treatment effect on the time trend). At the third level, variability in study-specific treatment effects can be described:

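$$
\begin{pmatrix} \theta_{00k} \\ \theta_{10k} \\ \theta_{20k} \\ \theta_{30k} \end{pmatrix}
=
\begin{pmatrix} \gamma_{000} \\ \gamma_{100} \\ \gamma_{200} \\ \gamma_{300} \end{pmatrix}
+
\begin{pmatrix} v_{00k} \\ v_{10k} \\ v_{20k} \\ v_{30k} \end{pmatrix},
\qquad
\begin{pmatrix} v_{00k} \\ v_{10k} \\ v_{20k} \\ v_{30k} \end{pmatrix}
\sim N(\mathbf{0}, \Sigma_v) \tag{3}
$$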

This multilevel approach is promising and enables estimating treatment effects across cases and across studies in addition to study-specific and subject-specific treatment effects. Furthermore, variation in these treatment effects between studies and between subjects can be estimated, multiple predictors can be added, autocorrelation and heterogeneous variance can be modeled, etc.  
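The data-generating process implied by the three levels can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used in the dissertation. It assumes independent random effects (diagonal covariance matrices at levels 2 and 3), and the function name, parameter names and all numeric values are hypothetical. The overall average coefficients are called `gamma` here, the study-specific ones `theta` and the case-specific ones `beta`.

```python
import random


def simulate_three_level(n_studies=3, n_cases=4, n_occasions=20, n_baseline=10,
                         gamma=(0.0, 0.0, 2.0, 0.2),
                         sd_study=(0.5, 0.05, 0.5, 0.05),
                         sd_case=(0.5, 0.05, 0.5, 0.05),
                         sd_error=1.0, seed=1):
    """Return long-format rows (study, case, T, D, Y) from a three-level model.

    Simplifying assumption: random effects are drawn independently, so the
    level-2 and level-3 covariance matrices are diagonal. All default values
    are made up for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for k in range(n_studies):
        # Level 3: study-specific coefficients = overall average + study deviation.
        theta = [g + rng.gauss(0.0, s) for g, s in zip(gamma, sd_study)]
        for j in range(n_cases):
            # Level 2: case-specific coefficients = study average + case deviation.
            beta = [t + rng.gauss(0.0, s) for t, s in zip(theta, sd_case)]
            for i in range(n_occasions):
                T = i                            # time since the start of the experiment
                D = 0 if i < n_baseline else 1   # 0 = baseline, 1 = treatment
                # Level 1: baseline level and trend, plus change in level and in trend.
                y = (beta[0] + beta[1] * T + beta[2] * D + beta[3] * T * D
                     + rng.gauss(0.0, sd_error))
                rows.append({"study": k, "case": j, "T": T, "D": D, "Y": y})
    return rows


rows = simulate_three_level()
print(len(rows))   # 240 rows: 3 studies x 4 cases x 20 occasions
print(rows[0])
```

Each row is one measurement occasion, so the nesting of occasions within cases within studies is carried by the `study` and `case` columns. A long-format layout of this kind is what mixed-model software typically expects as input.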

In my dissertation, I started with the empirical validation of the basic multilevel model as presented through equations 1 to 3 (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2013a). A commonly encountered issue when synthesizing SSED studies is standardization, which was the first extension I focused on (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2013b). SSEDs are vulnerable to several threats to internal validity; therefore, I suggested a method to take external event effects into account (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2013c).

In the last study of the methodological part of my dissertation, I focused on the estimation of the variance-covariance matrix and evaluated the consequences of misspecifying the covariance matrix at the second and third levels of the multilevel model (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2016). This allows the robustness of the three-level model to be examined.

In the applied part of my dissertation, I elaborated on the design matrix specification (Moeyaert, Ugille, Ferron, Beretvas & Van den Noortgate, 2014) and explained in detail the process from single-level analysis to multilevel analysis (Moeyaert, Ferron, Beretvas & Van den Noortgate, 2014). 
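One way to see why the design matrix specification matters is the sketch below. It is an illustration under stated assumptions, not the specification used in the cited articles. In a level-1 model with time, a phase dummy and their interaction, the gap between the treatment-phase line and the extrapolated baseline line at time T equals the dummy coefficient plus the interaction coefficient times T. The dummy coefficient therefore measures that gap at whichever occasion is coded as time zero. The function names and numeric values are hypothetical.

```python
def predicted(beta, t, d, origin=0):
    """Level-1 prediction with time measured relative to `origin`."""
    tc = t - origin
    return beta[0] + beta[1] * tc + beta[2] * d + beta[3] * tc * d


def recenter(beta, origin):
    """Re-express the coefficients after moving time zero to `origin`.

    Fitted values are unchanged; only the meaning of the intercept and of
    the dummy coefficient changes.
    """
    b0, b1, b2, b3 = beta
    return (b0 + b1 * origin, b1, b2 + b3 * origin, b3)


beta = (10.0, 0.5, 3.0, 0.25)   # hypothetical coefficients, time zero = first session
start = 8                       # hypothetical first treatment session
beta_c = recenter(beta, start)

# Both codings give identical predictions for every session and phase.
assert all(abs(predicted(beta, t, d) - predicted(beta_c, t, d, origin=start)) < 1e-12
           for t in range(16) for d in (0, 1))

print(beta[2])     # 3.0: gap between the phase lines at session 0
print(beta_c[2])   # 5.0: gap at the first treatment session
```

The two codings describe the same fitted lines, yet the dummy coefficient changes from 3.0 to 5.0. The time origin therefore has to be chosen so that the coefficient answers the question of interest.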

In the final study, I illustrated how to combine several types of SSEDs such as simple AB designs, multiple-baseline designs, ABAB reversal designs and alternating treatment designs using one multilevel modeling framework on a real dataset (Moeyaert, Ugille, Ferron, Onghena, Heyvaert, Beretvas & Van den Noortgate, 2014). 

Some major results, recommendations and future study  

The results of my dissertation are promising and encouraging for researchers interested in estimating fixed effects (i.e., treatment effects) across subjects and across studies. Valid and reliable average treatment effect estimates are obtained if the underlying assumptions are met and if the model is correctly specified. The results of these syntheses establish a means of evaluating treatment effect estimates and contribute to evidence-based practice.

Literature syntheses yield valuable information for improving research and everyday practice, and important policy decisions can be based on their results. However, we advise meta-analysts to increase the number of primary studies included in the multilevel analysis whenever possible, because this gives greater precision and accuracy in effect size estimates. Single-case meta-analysts are constrained by the availability of primary studies, but within that constraint they can adjust their search methods (e.g., by expanding their search terms). The results show that the average treatment effects are generally well estimated if the between-study variability is small and if at least 30 studies are included. The number of measurements and cases is of less importance. Therefore, besides the importance of systematically varying characteristics of studies in order to investigate moderator effects, it might be advantageous to replicate previous studies, resulting in homogeneous study results.

Additionally, single-subject researchers should pay attention to baseline variability or stability in an effort to decrease variability at level one. This might partly solve standardization problems. Also, because the baseline trajectory is used to estimate the treatment effect, a stable baseline yields a more justified treatment effect estimate. Another cause of variability is measurement error, and reducing it might decrease overall variability. Therefore, we encourage single-case researchers to measure a dependent variable consistently at the same time of day, in the same setting and for the same amount of time across subjects and even across studies investigating the same underlying treatments. We also advise single-case researchers to pay attention to treatment fidelity (Kazdin, 2011), because this can decrease between-subject variability and, as a result, the variability in the average treatment effect estimate. For example, a treatment administered exactly as intended is likely to yield a different effect than the same treatment administered with deviations from the protocol.

When standardized SSED data are used, obtaining more accurate treatment effect estimates across subjects and across studies requires at least 20 measurement occasions per subject. Because standardization is desirable when SSED studies are combined, we encourage single-subject researchers to observe and measure their subjects at least 20 times.
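The two rules of thumb mentioned above (at least 30 studies, and at least 20 measurement occasions per subject when data are standardized) can be checked before running a synthesis. The following is a minimal sketch assuming a long-format data set with one (study, case) pair per measurement occasion. The function name, the data layout and the toy data are hypothetical, and the thresholds are parameters, not fixed guarantees.

```python
from collections import defaultdict


def screen_synthesis(rows, min_studies=30, min_occasions=20):
    """Report where a long-format data set falls short of the stated thresholds.

    `rows` is an iterable of (study_id, case_id) pairs, one per measurement
    occasion. The defaults mirror the rules of thumb in the text.
    """
    counts = defaultdict(int)
    for study, case in rows:
        counts[(study, case)] += 1
    studies = {study for study, _ in counts}
    # Cases observed fewer times than the minimum number of occasions.
    short_cases = sorted(key for key, n in counts.items() if n < min_occasions)
    return {
        "n_studies": len(studies),
        "enough_studies": len(studies) >= min_studies,
        "short_cases": short_cases,
    }


# Toy data: study "A" has one case with 20 occasions, study "B" one case with 12.
toy = [("A", 1)] * 20 + [("B", 1)] * 12
print(screen_synthesis(toy))
```

A report like this only describes the available data. Whether a shortfall matters depends on the conditions discussed above, such as the size of the between-study variability.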

A final recommendation to single-case researchers is to consider previous single-subject studies. Specifically, if single-case researchers from similar areas of interest (e.g., reading, math) measure their dependent variables in the same way across studies, then single-case meta-analysts would have a larger number of primary studies to include in their research synthesis and could feel more confident in their interpretation of average treatment effect estimates. We encourage methodologists studying the use of multilevel modeling to summarize single-case data to conduct further research on topics such as modeling count outcomes and non-linear trajectories.

Furthermore, violations of assumptions (e.g., non-normality of the level-1, level-2, or level-3 errors, heteroscedasticity of errors at all levels) and various level-1 error models (e.g., higher-order autoregressive or moving average models) need to be investigated in the future. Investigation of these more complex models would allow for a better understanding of the applicability of the models under a variety of conditions.

Future research on other approaches to estimating variance components would also be of interest. The results of this dissertation indicate that the variance component estimates at all levels are biased. Therefore, it would be interesting to investigate alternative methods for estimating variances, such as the Bayesian approach.

References


Barlow, D.H., Nock, M.K., & Hersen, M. (2009). Single case experimental designs: Strategies for studying behavior change (3rd ed.). Boston: Allyn & Bacon. 

Busse, R.T., Kratochwill, T.R., & Elliott, S.N. (1995). Meta-analysis for single-case consultation outcomes: Applications to research and practice. Journal of School Psychology, 33, 269-285. 

Chorpita, B.F., Albano, A., Heimberg, R.G., & Barlow, D.H. (1996). A systematic replication of the prescriptive treatment of school refusal behavior in a single subject. Journal of Behavior Therapy and Experimental Psychiatry, 27, 281-290.

Gast, D.L. (2010). Applied research in education and behavioral sciences. In D.L. Gast (Ed.), Single-subject research methodology in behavioral sciences (pp. 1–19). New York: Routledge. 

Kazdin, A.E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York: Oxford University Press. 

Kennedy, C.H. (2005). Single-case designs for educational research. New York: Allyn and Bacon.

Kratochwill, T.R., & Brody, G.H. (1978). Single subject designs. A perspective on the controversy over employing statistical inference and implications for research and training in behavior modification. Behavior Modification, 2, 291-307. 

Kratochwill, T.R., & Levin, J.R. (1992). Single-case research design and analysis: New directions for psychology and education. Hillsdale, New Jersey, England: Lawrence Erlbaum Associates, Inc. 

Kratochwill, T.R., Hitchcock, J., Horner, R.H., Levin, J.R., Odom, S.L., Rindskopf, D.M., & Shadish, W.R. (2010). Single-case designs technical documentation. Retrieved from What Works Clearinghouse website:

Moeyaert, M., Ugille, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2013a). Three-level analysis of single-case experimental data: Empirical validation. Journal of Experimental Education, 82, 1-21. doi: 10.1080/00220973.2012.745470

Moeyaert, M., Ugille, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2013b). The three-level synthesis of standardized single-subject experimental data: A Monte Carlo simulation study. Multivariate Behavioral Research, 48, 719-748. doi: 10.1080/00273171.2013.816621

Moeyaert, M., Ugille, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2013c). Modeling external events in the three-level analysis of multiple-baseline across-participants designs: A simulation study. Behavior Research Methods, 45, 547-559. doi: 10.3758/s13428-012-0274-1

Moeyaert, M., Ugille, M., Ferron, J., Onghena, P., Heyvaert, M., & Van den Noortgate, W. (2014). Estimating intervention effects across different types of single-subject experimental designs: Empirical illustration. School Psychology Quarterly, 25, 191-211. doi: 10.1037/spq0000068

Moeyaert, M., Ugille, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2014). The influence of the design matrix on treatment effect estimates in the quantitative analyses of single-case experimental design research. Behavior Modification, 38, 665-704. doi: 10.1177/0145445514535243

Moeyaert, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2014). From a single-level analysis to a multilevel analysis of single-subject experimental data. Journal of School Psychology, 52, 191-211. doi: 10.1016/j.jsp.2013.11.003

Moeyaert, M., Ugille, M., Ferron, J., Beretvas, S., & Van den Noortgate, W. (2016). The misspecification of the covariance matrix in the three-level modeling of single-cases. Journal of Experimental Education, 84(3), 473-509. doi: 10.1080/00220973.2015.1065216

Morgan, D.L., & Morgan, R.K. (2001). Single-participant research design: Bringing science to managed care. American Psychologist, 56, 119-127. 

National Research Council (2002). Committee on Scientific Principles for Education Research. Center for Education. Division of Behavioral and Social Sciences and Education. In R.J. Shavelson & L. Towne (Eds.), Scientific research in education. Washington, DC: National Academy Press.

Onghena, P. (2005). Single-case designs. In B. Everitt & D. Howell (Eds.), Encyclopedia of statistics in behavioral science (Vol. 4, pp. 1850-1854). Chichester: Wiley.

Shadish, W.R., & Rindskopf, D.M. (2007). Methods for evidence-based practice: Quantitative synthesis of single-subject designs. New Directions for Evaluation, 113, 95-109.

Tawney, J. W., & Gast, D. L. (1984). Single subject research in special education. Columbus, Ohio: Merrill.
