Magnitude of Response in Treatment and Control Groups within Psychedelic Trials for Psychiatric Disorders: A Meta-Analysis
Authors
- Richard Zeifman
- Shokouh Meshkat
- Eric Vermetten
Research Summary of 'Magnitude of Response in Treatment and Control Groups within Psychedelic Trials for Psychiatric Disorders: A Meta-Analysis'
Introduction
Mental disorders impose a large global health and economic burden, and many people do not achieve adequate symptom relief with current first-line treatments such as psychotherapy and pharmacotherapy. Previous meta-analyses of conventional treatments often report small-to-moderate effects and substantial publication bias. Early clinical research suggests psychedelic-assisted psychotherapy may reduce symptoms of depression, PTSD and anxiety with generally mild short-term adverse events, but trials to date are small, relatively homogeneous and face particular methodological challenges. Meshkat and colleagues set out to quantify how much symptom change occurs within control arms of randomized psychedelic trials and to compare that with change in active treatment arms. Specifically, the authors aimed to (1) estimate between-group treatment effects using change scores, (2) describe within-group pre–post symptom changes separately for treatment and control arms, and (3) explore whether within-control change and between-group effects differ by control type (inactive placebo versus active/low-dose placebo). The motivation was to understand the contribution of non-drug contextual factors (expectancy, psychotherapy, assessment effects) to observed outcomes in psychedelic trials and to inform trial design and interpretation.
Methods
The study is a systematic review and meta-analysis prepared in line with PRISMA and preregistered in PROSPERO (CRD420251111853). The authors conducted a comprehensive search on 1 July 2025 across OVID (MEDLINE, Embase, APA PsycInfo) and PubMed for randomized controlled trials of classic psychedelics (psilocybin, LSD, DMT, MDMA, ayahuasca) for psychiatric disorders. Ketamine and other NMDA‑targeting agents were excluded. The extracted text contains a discrepancy: the search statement reports no language restrictions, but the inclusion/exclusion criteria later list non‑English publications as excluded; this inconsistency is present in the source text and is reported as such. Eligible studies were peer‑reviewed RCTs involving participants with a diagnosed mental disorder comparing a psychedelic or psychedelic‑assisted therapy with a placebo control (inactive or active/low‑dose). Studies were excluded if there was concurrent pharmacotherapy, non‑human work, or (per the extraction) non‑English publication. A two‑stage screening process was used, and screening and data extraction were performed by two independent reviewers; conflicts were resolved by discussion and a third reviewer when necessary. Extracted items included trial and participant characteristics, psychedelic type/dose/route/duration, control type, concurrent psychotherapy, and validated outcome scores at reported time points. Risk of bias was assessed with the Cochrane Risk of Bias tool. All quantitative analyses were conducted in R (version 4.4.0) using the metafor package. Primary between‑group efficacy analyses used standardized mean differences (SMDs) of change scores (treatment vs control). When change statistics were not reported, change means were computed as endpoint minus baseline and SDs of change were derived assuming a pre–post correlation r = 0.5; sensitivity analyses varied r (0.3, 0.7, 0.9). 
To characterise within‑arm changes the authors computed standardized mean change with change‑score standardisation (SMCC), which accounts for the paired pre–post structure. Random‑effects models were used throughout given expected heterogeneity. Control conditions were classified as inactive placebo versus active/low‑dose placebo and subgroup analyses by placebo type were performed for depressive and PTSD outcomes where data allowed. Heterogeneity was quantified with I² and τ², and small‑study effects/publication bias were assessed using funnel plots and Egger’s test when at least 10 arms were available.
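As a rough illustration of the pooling step described above, the following Python sketch combines per-study effect sizes under a random-effects model and reports τ² and I². It is a hand-rolled approximation, not the authors' metafor code: the DerSimonian-Laird estimator and the Q-based I² are assumptions made for simplicity (the source does not state which τ² estimator was used), and the function name `random_effects_dl` is hypothetical.

```python
import math

def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Assumption: DL is used here for simplicity; the source does not say
    which tau^2 estimator the authors chose.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Q-based I^2 (percentage of variability attributed to heterogeneity)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}
```

Calling `random_effects_dl([0.0, 1.0], [0.01, 0.01])` on two precise but discordant hypothetical studies gives a large I², whereas identical effects give τ² = 0 and I² = 0, which matches the intuition behind the heterogeneity statistics reported in this summary.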
Results
The search yielded 3,295 records (320 duplicates removed); after screening 32 full texts were assessed and 14 RCTs (total n = 643) met inclusion criteria. Risk of bias across the 14 trials was predominantly low: ten studies were rated low risk overall, three were judged to have some concerns (primarily in outcome measurement), and one study was assessed as high risk overall owing to deviations from intended interventions and additional concerns about outcome measurement and selective reporting. Between‑group meta‑analyses of change scores favoured the psychedelic treatment arms across outcomes. For depressive symptoms (k = 13 comparisons) the pooled SMD was −0.82 (95% CI −1.17 to −0.47), with I² = 60.1% indicating moderate–substantial heterogeneity. For PTSD symptoms (k = 10) the SMD was −0.89 (95% CI −1.14 to −0.65), with I² = 0%. For anxiety symptoms (k = 5) the SMD was −0.66 (95% CI −0.94 to −0.38), with I² = 0%. Sensitivity analyses excluding studies with high risk or some concerns produced similar between‑group estimates for depression (k = 10; SMD = −0.88; 95% CI −1.31 to −0.44; I² = 68.3%) and PTSD (k = 9; SMD = −0.89; 95% CI −1.14 to −0.64; I² = 0%). Funnel plots and Egger’s tests provided limited evidence of small‑study effects for the between‑group analyses (Egger p = 0.24 for depression; p = 0.15 for PTSD). Within‑group analyses demonstrated clinically meaningful pre–post symptom reductions in treatment arms and also measurable reductions in control arms, though pooled control‑arm SMCC estimates are not clearly reported in the extracted text. Reported pooled within‑treatment arm effects were large: depression (k = 13; SMCC = −1.30; 95% CI −1.69 to −0.92; I² = 84.1%), PTSD (k = 10; SMCC = −1.47; 95% CI −1.86 to −1.08; I² = 61.7%), and anxiety (k = 5; SMCC = −1.16; 95% CI −1.40 to −0.92; I² = 0%). Funnel inspection suggested asymmetry for treatment‑group within‑group analyses and Egger’s test indicated evidence of small‑study effects (p = 0.005). 
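For readers who want to relate the reported confidence intervals to standard errors and test statistics, the sketch below back-calculates them from a point estimate and its 95% CI. It assumes the intervals are symmetric normal-based (Wald) intervals, which the source does not confirm; the function names are hypothetical.

```python
from statistics import NormalDist

def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-based confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    return (upper - lower) / (2 * z_crit)

def wald_z(estimate, lower, upper, level=0.95):
    """Return (se, z, two-sided p) under the normal-interval assumption."""
    se = se_from_ci(lower, upper, level)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

# Reported pooled depression estimate: SMD = -0.82 (95% CI -1.17 to -0.47)
se, z, p = wald_z(-0.82, -1.17, -0.47)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.2g}")
```

Under this assumption the depression estimate corresponds to a standard error of roughly 0.18. If the authors used a different interval method (for example a t-based adjustment), the implied values would differ somewhat.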
Sensitivity analyses varying the assumed within‑study correlation (r = 0.3–0.9) altered the magnitude of pooled effects as expected (larger absolute values at higher r) but did not change the direction of effects. One control arm was omitted from the SMCC meta‑analysis because its sample size (n = 2) produced a non‑estimable effect size. Exploratory subgroup analyses by control type found no clear differences in between‑group effects by placebo type for depressive or PTSD symptoms, but descriptive within‑control analyses suggested larger PTSD symptom reductions in trials using inactive placebo compared with active/low‑dose controls; the authors caution these subgroup findings are exploratory given small numbers of studies per subgroup.
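The dependence of within-arm effects on the assumed correlation follows directly from how the SD of change is derived: a higher r shrinks the SD of change, so the same mean change yields a larger standardized value. The sketch below illustrates this with made-up arm-level numbers (baseline 30, endpoint 20, SD 8 at both time points); these values are hypothetical and do not come from any included trial, and the small-sample bias correction is omitted for clarity.

```python
import math

def smcc(mean_pre, sd_pre, mean_post, sd_post, r):
    """Uncorrected standardized mean change using change-score standardization.

    The SD of change is derived from the pre and post SDs and an assumed
    pre-post correlation r.
    """
    sd_change = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    return (mean_post - mean_pre) / sd_change

# Hypothetical arm: same raw change, different assumed correlations
for r in (0.3, 0.5, 0.7, 0.9):
    print(r, round(smcc(30.0, 8.0, 20.0, 8.0, r), 2))
```

The sign of the effect never changes across r because only the denominator depends on r, which is consistent with the authors' observation that magnitude, but not direction, was sensitive to this assumption.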
Discussion
Meshkat and colleagues interpret their findings as showing that psychedelic-assisted interventions produce greater symptom reductions than control conditions for depression, PTSD and anxiety in randomized trials, with larger treatment effects for depression and PTSD and more moderate effects for anxiety. At the same time, control arms in these trials exhibited clinically meaningful within-group improvements, indicating that non-pharmacological trial factors (such as expectancy, therapeutic contact and structured psychotherapy delivered to both arms) likely contribute to observed change under trial conditions. The authors note that control-arm improvements in their synthesis appear less extreme than some reports of large placebo responses in the broader mental health literature, but direct comparisons are limited by differences in populations, concomitant psychotherapy, follow-up intervals and effect-size computations. They highlight methodological challenges specific to psychedelic trials: intense experiential effects can lead to functional unblinding and strong expectancies; control conditions often include substantial preparatory and integration psychotherapy, creating a rich therapeutic context even in placebo arms; and low-dose or active placebos may not be pharmacologically inert and may still affect expectancy or produce subtle subjective effects. The within-control finding of larger PTSD improvements under inactive placebo (versus active/low-dose controls) is discussed as a potentially important signal that comparator choice can influence outcomes, though the authors emphasise limited power and inconsistent measurement of blinding and expectancy across studies.
Key limitations acknowledged by the authors include inconsistent reporting of expectancy and blinding integrity (preventing quantitative evaluation of these factors), the inclusion of psychotherapy in placebo arms (which makes it impossible to disentangle drug‑specific from therapy effects within the available trials), modest sample sizes and limited power to detect subgroup differences, and sensitivity of within‑group estimates to assumptions about pre–post correlations. The authors recommend that future trials report expectancy and masking integrity more systematically, consider designs that can separate psychotherapy from drug effects (for example therapy‑alone arms or 2×2 designs), increase sample sizes, and run mechanistic and head‑to‑head comparator studies to better isolate drug‑specific effects. They conclude that careful design, justification and transparent reporting of control conditions—as well as routine assessment of blinding and expectancy—are necessary so that any incremental benefit attributable to classic psychedelics can be interpreted in the context of the broader therapeutic milieu.
METHODS
This meta-analysis was prepared in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and was registered in PROSPERO (CRD420251111853).
SEARCH STRATEGY
A comprehensive search of the literature was conducted on 1 July 2025 through OVID (MEDLINE, Embase, APA PsycInfo) and PubMed to identify studies evaluating the use of psychedelics for the treatment of psychiatric disorders. The search strategy included the following psychedelics: psilocybin, lysergic acid diethylamide (LSD), dimethyltryptamine (DMT), 3,4-methylenedioxymethamphetamine (MDMA), and ayahuasca. Psychedelics that target NMDA receptors, such as ketamine, were excluded due to their distinct, nonserotonergic mechanisms of action. Search terms included anxiety and related disorders, depression, bipolar disorder, schizophrenia, PTSD, and eating disorders. Results were filtered to RCTs. No language or date restrictions were applied.
INCLUSION/EXCLUSION CRITERIA AND SCREENING
Studies underwent a two-stage screening process: first-level screening, based on title and abstract, followed by second-level screening, based on a full-text assessment. Eligible studies were peer-reviewed RCTs that (1) involved participants with a diagnosed mental disorder, and (2) investigated the efficacy of psychedelics or psychedelic-assisted therapy versus a placebo control condition (active or inactive). Exclusion criteria were: (1) concurrent use of any other form of pharmacotherapy or drug treatment (e.g., antidepressants); (2) non-human studies; and (3) non-English publications. Screening was performed by two independent reviewers (RS, SM). Conflicts between the two reviewers were resolved through discussion and, when necessary, consultation with VB.
DATA EXTRACTION
The following data on study design and patient demographics were extracted by two independent reviewers (RS, SM): country of origin, mental disorder of patient population, diagnostic assessment method (e.g., Diagnostic and Statistical Manual of Mental Disorders [DSM] IV or V), mean age, sex (male or female), and sample size. Regarding treatment, the type of psychedelic, dose and duration (in weeks), route of administration, type of placebo, and concurrent psychotherapy treatment (if applicable) were extracted. Outcome data comprised scores from validated, disorder-specific instruments at all reported assessment time points.
QUALITY ASSESSMENT
Two independent reviewers (RS, SM) assessed the quality of all included studies using the Cochrane Risk of Bias tool (Supplementary Table).
META-ANALYSIS
All analyses were conducted using R software (version 4.4.0). For each outcome (depressive symptom severity, PTSD symptoms, and anxiety symptom severity), our primary efficacy analysis was a between-group meta-analysis of standardized differences in change scores (treatment vs control). When mean change was not reported, it was calculated as endpoint mean minus baseline mean. The standard deviation (SD) of the change score was derived from the baseline and endpoint SDs under an assumed pre-post correlation of r = 0.5.
To contextualize symptom change within psychedelic trials, we conducted secondary within-group meta-analyses separately for treatment and control arms using standardized mean change with change-score standardization (SMCC), which quantifies pre-post changes while accounting for the paired structure of the data. These analyses were intended to describe the magnitude of symptom change occurring within each arm, capturing placebo-associated and other non-pharmacological effects in control groups, and were not designed to estimate treatment efficacy or to isolate a causal placebo effect. Effect sizes were computed using the escalc() and rma() functions from the metafor package (version 4.8-0). Studies that reported change-score dispersion (SD/SE/CI) were analyzed directly from the change statistics; studies with baseline/endpoint data only used the derived SD of change with the same assumed r = 0.5. One study in the control group was excluded from the SMCC meta-analysis due to a very small sample size (n = 2), which led to a non-estimable effect size. Given anticipated heterogeneity across trials, all meta-analyses used random-effects models. We also conducted sensitivity analyses varying the assumed pre-post correlation used to derive the SD of change (r = 0.3, 0.7 and 0.9) to assess robustness.
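The between-group computation described above can be sketched in Python as follows. This is an illustrative approximation, not the authors' metafor code: the SD-of-change expression is the standard formula for the SD of a difference given a correlation, the small-sample correction uses the common approximation J = 1 − 3/(4·df − 1), and both function names are hypothetical.

```python
import math

def sd_change(sd_pre, sd_post, r=0.5):
    """SD of the (endpoint - baseline) change, given an assumed pre-post correlation r."""
    return math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)

def smd_change(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Bias-corrected SMD of change scores (treatment minus control).

    m_*  : mean change in each arm
    sd_* : SD of change in each arm (reported, or derived with sd_change)
    Returns (g, sampling_variance).
    """
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (m_t - m_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)          # approximate small-sample correction
    g = j * d
    var = (n_t + n_c) / (n_t * n_c) + g ** 2 / (2 * (n_t + n_c))
    return g, var
```

For example, with hypothetical mean changes of −10 (treatment) and −4 (control), a change SD of 10 in both arms and 20 participants per arm, `smd_change(-10, 10, 20, -4, 10, 20)` returns an SMD of about −0.59. A negative value indicates greater symptom reduction in the treatment arm, matching the sign convention of the pooled estimates reported in this summary.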
In an additional sensitivity analysis restricted to the primary correlation assumption (r = 0.5), we repeated the meta-analyses for depressive and PTSD symptoms after excluding studies rated as having high risk of bias or some concerns of bias; this analysis was not conducted for anxiety due to an insufficient number of studies. Control conditions were classified as inactive placebo versus active/low-dose placebo, and subgroup analyses by placebo type were performed as exploratory analyses for depressive and PTSD symptoms. Because few studies contributed to each subgroup, subgroup findings should be interpreted cautiously and viewed as exploratory rather than definitive. Effect sizes were interpreted following standard thresholds, where Cohen's d or SMCC values of 0.2, 0.5, and 0.8 represent small, medium, and large effects, respectively. Statistical heterogeneity was summarized using I² and τ², with I² values classified as follows: 0-40% (not important), 30-60% (moderate), 50-90% (substantial), and 75-100% (considerable). Potential small-study effects/publication bias were assessed using funnel plots and Egger's regression test when ≥10 study arms were available for a meta-analysis. For between-group analyses, k reflects the number of comparisons; for within-group analyses, k reflects the number of unique arms contributing data.
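To make the small-study-effects check more concrete, the sketch below implements the classical form of Egger's regression: each effect is divided by its standard error and regressed on precision (1/SE), and an intercept far from zero suggests funnel-plot asymmetry. This is an assumption-laden illustration; the source does not state which variant of the test the authors ran, the function name `egger_intercept` is hypothetical, and the `min_k=10` guard simply mirrors the ≥10-arm rule stated above. A p-value would come from a t distribution with k − 2 degrees of freedom and is omitted here to keep the example dependency-free.

```python
import math

def egger_intercept(effects, ses, min_k=10):
    """Classical Egger regression: (effect / SE) regressed on (1 / SE).

    Returns (intercept, slope, t_statistic_for_intercept).
    """
    k = len(effects)
    if k < min_k:
        raise ValueError(f"need at least {min_k} arms, got {k}")
    x = [1.0 / s for s in ses]                       # precision
    y = [e / s for e, s in zip(effects, ses)]        # standardized effect
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance and the standard error of the intercept (ordinary least squares)
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(rss / (k - 2) * (1.0 / k + mx ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, slope, t_stat
```

With hypothetical data in which every study shares the same true effect, the intercept is approximately zero; if less precise studies report systematically more negative effects, the intercept moves away from zero, which is the pattern the test is designed to flag.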
SEARCH RESULTS
The initial search identified 3295 articles. Duplicates were removed prior to screening (n = 320). A total of 2975 studies underwent first-level screening, and 32 full-text articles were retrieved for second-level assessment. Finally, 14 RCTs (n = 643) met inclusion criteria and were included in the meta-analysis (Figure). Characteristics of included studies are indicated in Table.
QUALITY ASSESSMENT
Across the 14 included trials, risk of bias was predominantly low. Ten studies were low risk overall, with low risk across all five domains. Three studies, including Holze et al. (2023), were rated as having some concerns overall due to some concerns in outcome measurement, while all other domains in these studies were low risk. One study was assessed as high overall risk of bias, driven by high risk due to deviations from intended interventions, with additional concerns for outcome measurement and selective reporting, despite low risk for the randomization process and missing outcome data.
[...] subgroup differences p = 0.81). Funnel plots and Egger's tests provided limited evidence of small-study effects for depressive symptoms (p = 0.24; Figure) and PTSD symptoms (p = 0.15; Figure). Sensitivity analyses are presented in Table; varying the assumed within-study correlation (r) from 0.3 to 0.9 changed the magnitude of pooled effects (generally becoming larger in absolute value at higher r), while the overall direction remained consistent with greater symptom reduction in the treatment condition. The second sensitivity analysis, excluding studies with high risk or some concerns of bias, showed similar results for depressive symptoms (k = 10; SMD = -0.88; 95% CI = -1.31, -0.44; I² = 68.3%; Figure) and PTSD symptoms (k = 9; SMD = -0.89; 95% CI = -1.14, -0.64; I² = 0%; Figure).
In treatment groups, within-group analyses showed larger symptom reductions across outcomes, including depressive symptoms (k = 13; SMCC = -1.30; 95% CI = -1.69, -0.92; I² = 84.1%), PTSD symptoms (k = 10; SMCC = -1.47; 95% CI = -1.86, -1.08; I² = 61.7%), and anxiety symptoms (k = 5; SMCC = -1.16; 95% CI = -1.40, -0.92; I² = 0%). Funnel plot inspection (Figure) suggested some asymmetry, and Egger's test indicated evidence of small-study effects (p = 0.005).
Sensitivity analyses varying the assumed within-study correlation (r = 0.3-0.9; Tables 3aS-3bS) showed that effect magnitudes changed as expected, while the direction of within-group change remained consistent across outcomes. In a separate sensitivity analysis excluding studies with high risk of bias or some concerns of bias (Figures 6S-7S; conducted under the primary r = 0.5 assumption), pooled effects for depressive symptoms and PTSD symptoms remained consistent.
DISCUSSION
Our results demonstrated that between-group meta-analyses of change scores consistently favored treatment for depressive symptoms, PTSD symptoms, and anxiety symptoms, with larger benefits for depression and PTSD and more moderate benefits for anxiety, and no meaningful differences by placebo type. Within-group analyses showed symptom reductions in control conditions but larger improvements in treatment groups; placebo type did not materially affect within-group depression changes, whereas control-group PTSD improvements appeared greater with inactive placebo. Small-study effects were limited for between-group findings but suggested in treatment-group within-group analyses, and sensitivity analyses varying the within-study correlation changed effect sizes but not the direction of effects. The magnitude of symptom reduction observed in control groups is clinically relevant but warrants cautious interpretation. The within-control changes in the present synthesis appear more modest than the "large" placebo responses reported in some broader overviews of mental health trials, although direct comparison is limited by differences in sampled populations, concomitant interventions (including psychotherapy), follow-up intervals, and effect-size calculations. In psychedelic-assisted psychotherapy trials, both treatment and control arms typically include substantial clinical contact and structured support, which may contribute to symptomatic improvement independent of psychedelic exposure; accordingly, observed control-arm changes are best interpreted as reflecting overall response under the trial context rather than evidence for any single operative mechanism. 
A further methodological consideration is that psychedelic trials may be especially susceptible to lessebo effects, that is, attenuated improvement when participants infer assignment to placebo or a sub-therapeutic condition, particularly in settings characterized by strong prior expectations and imperfect masking. Additional non-specific influences may include reactivity to repeated, structured outcome assessments (e.g., trauma-focused interviews in PTSD trials), regression to the mean, and natural symptom fluctuation. Because expectancy and blinding integrity were inconsistently measured and reported across included studies, these potential contributors could not be evaluated quantitatively and should be treated as hypotheses rather than causal explanations. While between-group subgroup analyses did not indicate clear differences by control type, the within-control analyses suggested that PTSD symptom reductions may be larger in trials using inactive controls than in those using active controls, whereas depressive symptoms did not show a clear pattern by control condition. These observations raise important methodological considerations. Low-dose psychedelics and other active control strategies are often used to preserve blinding, yet they may still compromise trial integrity if participants infer allocation based on the presence or absence (or intensity) of expected acute effects, potentially shaping expectancy, engagement, and therapeutic response. This issue is directly relevant to the design of the MAPS phase 3 MDMA trials, which used an inactive placebo comparator; the investigators noted that low-dose MDMA had improved blinding in earlier studies but was not selected for phase 3, in part to better isolate drug efficacy and permit cleaner safety comparisons.
More broadly, low-dose psychedelics may not be pharmacologically inert: even sub-perceptual or low doses can produce subtle subjective, cognitive, or physiological effects that could influence participant experience and expectations, complicating interpretation of treatment effects. Finally, the absence of consistent control-type differences in many analyses highlights ongoing challenges in this literature, including limited power to detect subgroup effects and the frequent lack of systematic measurement of expectancy and blinding integrity, factors that are critical for improving the rigor and interpretability of psychedelic trials.
Our study has several limitations. First, due to the absence of detailed information on expectancy and blinding across studies, we were not able to evaluate the effects of these factors in our analysis. Differences in participant expectations and inadequate blinding may have biased the results, potentially leading to an overestimation of the interventions' true effects. Second, since the included psychedelic trials involved psychotherapy sessions even in the placebo groups, the magnitude of the group response cannot be accurately estimated, as the effects of therapy are inherently included. These limitations suggest that future research should aim for more consistent reporting, larger sample sizes, better control for expectancy and blinding, and the inclusion of a therapy-alone condition or ideally a 2×2 design to enhance the validity and applicability of the findings.
In conclusion, this meta-analysis showed that control conditions in psychedelic-assisted psychotherapy trials were associated with meaningful within-group symptom reductions, with moderate-to-large improvements in depressive and PTSD symptoms and moderate improvements in anxiety.
Treatment groups consistently demonstrated greater symptom improvement than controls across outcomes, but the magnitude of change observed in control groups is consistent with an important role for non-pharmacological and contextual factors, although these mechanisms could not be evaluated directly in the included trials. In addition, while between-group effects did not appear to differ by control type, the within-control findings suggested larger PTSD improvements under inactive placebo than active control conditions, underscoring how comparator selection may shape outcomes and interpretation. Collectively, these findings emphasize the need for careful design, justification, and transparent reporting of control conditions, alongside routine assessment of blinding integrity and expectancy, so that the incremental benefit attributable to psychedelics can be interpreted within the therapeutic milieu. Future research should prioritize dismantling and head-to-head comparator designs, as well as mechanistic studies, to better disentangle drug-specific effects from psychological and contextual contributors to response.
This peer-reviewed article has been accepted for publication but not yet copyedited or typeset, and so may be subject to change during the production process. The article is considered published and may be cited using its DOI. This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Study Details
- Study Type: individual
- APA Citation
Meshkat, S., Lin, Q., Sousa-Ho, R., Demchenko, I., Zeifman, R. J., Fang, H., ... & Bhat, V. Magnitude of Response in Treatment and Control Groups within Psychedelic Trials for Psychiatric Disorders: A Meta-Analysis. European psychiatry: the journal of the Association of European Psychiatrists, 1-35.