This meta-analysis (s=24) found that psychedelic-assisted therapy (s=8) was no more effective than open-label traditional antidepressants for treating major depression, and that, unlike traditional antidepressants (where blinding meaningfully influenced outcomes), PAT trials showed no difference between blinded and open-label conditions, confirming that PAT trials are effectively always unblinded.
Importance
Psychedelic-assisted therapy (PAT) trials have high levels of functional unblinding, which biases results when PAT is compared with blinded interventions. Because PAT is effectively always open label, its results should be compared with those of open-label traditional antidepressants (TADs), so that any benefit associated with patients knowing their treatment is equal between the interventions.
Objective
To investigate the comparative effectiveness of PAT vs open-label traditional antidepressants (TADs; such as selective serotonin and norepinephrine reuptake inhibitors) for the treatment of major depression.
Data Sources
PubMed was systematically searched in March 2024 for trials of PAT and open-label TADs for the treatment of major depression without comorbidity in adults without psychosis in the outpatient setting. Extraction was supplemented with data from a review and meta-analysis of antidepressant drugs to assess the open-label vs blinded TAD difference.
Data Extraction and Synthesis
Depression scores were extracted by 2 independent reviewers; estimates were pooled with both bayesian and frequentist mixed-effects models. Reporting follows the PRISMA guideline.
Main Outcomes and Measures
Following predefined hypotheses, the mean within-arm effect from baseline to primary end point (ie, patient improvement on the 17-item Hamilton Depression Rating Scale) was compared between PAT and open-label TAD trials. To assess the potential role of blinding, the within-arm effect of blinded vs open-label trials was also compared for both PAT and TADs.
Results
Of the 619 records initially retrieved from PubMed, 24 met inclusion criteria. Contrary to the first of 3 hypotheses, PAT (8 trials; 249 patients) was no more effective than open-label TAD treatment (16 trials; 7921 patients), with an estimated difference of 0.3 favoring open-label TADs (95% CI, −1.39 to 1.98; P = .73). Open-label TADs were associated with better outcomes than blinded TADs (144 trials; 31 792 patients), with an estimated difference of 1.3 (95% CI, 0.07-2.51; P = .04), but the same difference was not observed for PAT (−0.67; 95% CI, −3.08 to 1.73; P = .58).
Conclusions and Relevance
In trials of depression, PAT was not more effective than open-label TADs. Blinding made a difference for TADs, but not for PAT, confirming that PAT trials are effectively always open label. These results argue against highly optimistic narratives surrounding PAT and highlight the importance of blinding integrity.
Papers cited by this study that are also in Blossom
Carhart-Harris, R. L., Giribaldi, B., Watts, R. et al. · New England Journal of Medicine (2021)
Raison, C. L., Sanacora, G., Woolley, J. D. et al. · JAMA (2023)
Ross, S., Bossis, A. P., Guss, J. et al. · Journal of Psychopharmacology (2016)
Schindowski, E. M., Jungwirth, J., Schuldt, A. et al. · EClinicalMedicine (2023)
Current treatments for depression include selective serotonin and norepinephrine reuptake inhibitors and other medications, collectively referred to here as traditional antidepressants (TADs). Earlier meta-analyses estimated a modest between-arm advantage of TADs over placebo of about 2.4 points on the 17-item Hamilton Depression Rating Scale (HAM-D), raising questions about clinical meaningfulness. Psychedelic-assisted therapy (PAT)—a combination of psychotherapy and a psychedelic dosing session—has attracted interest because PAT trials have reported larger between-arm differences (about 7.3 HAM-D units). However, PAT trials are characterised by pronounced functional unblinding: patients commonly deduce treatment allocation from intense subjective drug effects, with correct-guess rates reported around 90% to 95%, a much higher rate than in blinded TAD trials (about 63%). Functional unblinding can inflate apparent treatment effects, complicating direct comparisons between PAT and blinded pharmacological treatments. Against this background, the researchers preregistered a systematic review and meta-analysis to compare PAT with open-label TADs, thereby attempting to equalise the degree of unblinding between interventions. They set a minimal clinically important difference (MCID) of 3 HAM-D units and framed three hypotheses: (H1) PAT would exceed open-label TADs by at least the MCID at the primary end point; (H2) open-label TADs would outperform blinded TADs by at least the MCID; and (H3) open-label and blinded PAT would not differ by the MCID. The approach emphasises within-arm change from baseline to primary end point to capture the total (treatment-specific plus nonspecific) effects experienced by participants under similar unblinding conditions.
The researchers searched PubMed in March 2024 for trials of major depressive disorder in adults (mean age >18 and <65 years) that evaluated either open-label TADs (as listed by Cipriani et al) or PAT with specified psychedelics (LSD, psilocybin, mescaline/San Pedro/peyote, 5-MeO-DMT, or ayahuasca). Exclusion criteria included inpatient studies, psychotic depression, substantial comorbidity (an exception was made for comorbid anxiety), augmentation/combination trials, and trials with run-in periods (with the caveat that run-in phases could be treated as open-label TADs if they met inclusion criteria). Two reviewers independently screened studies and extracted depression scores; the authors also scanned references and contacted study authors when data were missing. All depression outcomes were converted to 17-item HAM-D equivalents; when only end point or change SDs were available, a conservative correlation coefficient of 0.5 was used for conversions. The primary outcome was the within-arm change from baseline to the trial's primary end point in HAM-D units. The analysis used preregistered Bayesian and frequentist multilevel random-effects meta-analytic models that accounted for multiple outcome measures within studies, included a fixed effect (and random slope) for baseline depression severity, and treated the focal variable as binary depending on the hypothesis (treatment type for H1; blinding status for H2 and H3). Heterogeneity was quantified via the intraclass correlation coefficient (ICC) estimated from the Bayesian models with 10 000 bootstrapped iterations. Bayesian results are summarised as posterior medians with 95% credible intervals (CrI) and probabilities for prespecified thresholds (for example, probability that the PAT–TAD difference exceeded the MCID), while frequentist results are reported as means with 95% confidence intervals and P values. Secondary analyses used frequentist models for computational efficiency. 
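The role of the assumed correlation of 0.5 can be illustrated with the standard textbook relation between the SD of a change score and the SDs at baseline and end point. This is a minimal sketch of that general formula, not the authors' extraction code; the function name and the example SDs are hypothetical.

```python
import math


def sd_of_change(sd_baseline: float, sd_endpoint: float, r: float = 0.5) -> float:
    """SD of (end point - baseline) given both SDs and their correlation r.

    Uses Var(E - B) = Var(B) + Var(E) - 2 * r * SD(B) * SD(E).
    r = 0.5 mirrors the correlation assumed in the summary above.
    """
    variance = sd_baseline**2 + sd_endpoint**2 - 2.0 * r * sd_baseline * sd_endpoint
    return math.sqrt(variance)


# Hypothetical arm (illustrative values only, not taken from any included trial).
example_sd = sd_of_change(sd_baseline=4.0, sd_endpoint=6.0)
print(f"Imputed SD of change: {example_sd:.2f} HAM-D units")
```

A lower assumed correlation gives a larger imputed SD of change, and therefore a less precise within-arm estimate, which is why 0.5 is described as a conservative choice.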
To estimate the within-arm effect of blinded TADs for H2, the researchers used data extracted from a large prior meta-analysis (Cipriani et al) encompassing 144 blinded TAD trials involving 31 792 patients. Study selection yielded 24 trials with extractable variables: 16 open-label TAD trials (7921 patients) and 8 PAT trials (249 patients), of which 6 PAT trials (213 patients) were formally blinded and 2 (36 patients) were open label. The mean time from baseline to primary end point differed between groups (mean 8.1 weeks for TAD trials vs 3.4 weeks for PAT trials), and baseline HAM-D means were 22.7 for TADs and 21.3 for PAT. The authors also compiled an extended dataset that included four additional PAT studies that nearly met inclusion criteria; this extended set was used for robustness checks.
The open-label TAD trials had a much larger aggregate sample (7921 patients) than the PAT trials (249 patients), and PAT trials generally had earlier primary end points. Bayesian and frequentist analyses produced concordant findings.

For H1 (open-label TADs vs PAT), the Bayesian model estimated the within-arm HAM-D change for open-label TADs at -12.5 units (95% CrI, -12.9 to -12.2; SMD -2.7) and for PAT at -11.8 units (95% CrI, -13.3 to -10.3; SMD -2.6). The posterior mean difference (PAT minus TAD) was β = 0.25 HAM-D units (95% CrI, -1.90 to 2.45), favouring TADs. The posterior probability that PAT reduced depression by 3 or more HAM-D units over open-label TADs (i.e. supporting H1) was 0.2%. The posterior probability that the difference lay within ±3 HAM-D units (the region of practical equivalence) was 99.1%. These results were similar when using the extended dataset. The frequentist model likewise found no significant difference between PAT and open-label TADs (0.3 HAM-D units favouring open-label TADs; 95% CI, -1.39 to 1.98; P = .73).

For H2 (blinded vs open-label TADs), the frequentist model estimated a mean difference of β = 1.29 HAM-D units (95% CI, 0.07-2.51; P = .04), favouring open-label administration. The Bayesian model estimated an open-label vs blinded difference of approximately 0.85 HAM-D units. Although statistically significant in the frequentist model, these differences correspond to roughly half the chosen MCID (3 HAM-D units) and have 95% CIs contained within ±3 HAM-D units, indicating a practically negligible effect size despite statistical significance.

For H3 (blinded vs open-label PAT), the Bayesian estimates were a within-arm change of -10.8 HAM-D units (95% CrI, -12.9 to -8.8; SMD -2.2) for blinded PAT and -13.3 HAM-D units (95% CrI, -15.4 to -11.2; SMD -3.2) for open-label PAT, with a posterior mean difference β = 2.14 HAM-D units (95% CrI, -1.86 to 5.79) favouring open label. The posterior probability that the blinded vs open-label PAT difference lay within ±3 HAM-D units was 68.0%. The frequentist model found no significant difference (β = -0.67; 95% CI, -3.08 to 1.73; P = .58).

Bayesian heterogeneity estimates reported ICCs of 0.361 (95% CrI, 0.345-0.376) for H1 and 0.422 (95% CrI, 0.411-0.432) for H3. Overall, the data provide strong evidence against H1 (that PAT exceeds open-label TADs by ≥3 HAM-D units) and support H3 (no meaningful blinding effect in PAT), while showing a small but clinically negligible blinding effect in TAD trials (H2).
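The reported probabilities and P values can be roughly cross-checked from the intervals alone. The sketch below is not the authors' model: it assumes each posterior or sampling distribution is approximately normal, recovers a standard deviation from the width of the 95% interval, and uses hypothetical helper names throughout.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96
MCID = 3.0  # minimal clinically important difference, HAM-D units


def normal_from_interval(lower, upper, center=None):
    """Normal approximation implied by a 95% interval (assumption, see lead-in)."""
    sd = (upper - lower) / (2 * Z95)
    mu = (lower + upper) / 2 if center is None else center
    return NormalDist(mu, sd)


def prob_within(dist, bound):
    """Probability mass inside the region of practical equivalence (-bound, +bound)."""
    return dist.cdf(bound) - dist.cdf(-bound)


def two_sided_p(lower, upper):
    """Two-sided P value for 'difference = 0', using the interval midpoint as estimate."""
    dist = normal_from_interval(lower, upper)
    z = abs(dist.mean) / dist.stdev
    return 2 * (1 - NormalDist().cdf(z))


# Bayesian differences as quoted above (point estimate and 95% CrI).
h1_posterior = normal_from_interval(-1.90, 2.45, center=0.25)
h3_posterior = normal_from_interval(-1.86, 5.79, center=2.14)

print(f"H1: P(|difference| < MCID)      ~ {prob_within(h1_posterior, MCID):.3f}")
print(f"H1: P(PAT better by >= MCID)    ~ {h1_posterior.cdf(-MCID):.4f}")
print(f"H3: P(|difference| < MCID)      ~ {prob_within(h3_posterior, MCID):.3f}")

# Frequentist 95% CIs as quoted above.
frequentist_cis = {"H1": (-1.39, 1.98), "H2": (0.07, 2.51), "H3": (-3.08, 1.73)}
for label, (lower, upper) in frequentist_cis.items():
    print(f"{label}: two-sided P ~ {two_sided_p(lower, upper):.2f}")
```

Under these assumptions the H1 values land close to the reported 99.1% and 0.2%, the H3 value comes out a little below the reported 68.0% (the actual posterior need not be symmetric), and the three P values are close to the reported .73, .04, and .58.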
The authors interpret their preregistered meta-analysis as showing that PAT is not more effective than open-label TADs for major depression when both interventions are compared under equal unblinding conditions. Both Bayesian and frequentist results indicate a negligible mean difference of about 0.3 HAM-D units between PAT and open-label TADs, far below the prespecified MCID of 3 HAM-D units. The finding was robust to sensitivity analyses, including exclusion of trials restricted to treatment-resistant depression (TRD). The researchers also examined the role of blinding. For TADs, open-label administration was associated with slightly better outcomes than blinded administration (about 0.85 to 1.29 HAM-D units depending on the model), which is consistent with the notion that blinding reduces expectancy-related gains in TAD trials; however, this difference is about half the MCID and therefore not clinically meaningful. For PAT, formal blinding did not materially change within-arm outcomes, supporting the premise that PAT trials are effectively always open label due to high rates of functional unblinding. To explain why PAT shows larger between-arm differences against placebo yet is not superior to open-label TADs, the authors propose two factors. First, open-label TADs appear modestly more effective than blinded TADs (about 1.29 HAM-D units), representing part of the discrepancy. Second, placebo or control arms in PAT trials tend to show markedly less improvement (an estimated suppression of about 4.0 HAM-D units compared with placebo arms in TAD trials), inflating the observed between-arm effect for PAT. The authors term this a potential "know‑cebo" effect—control participants' disappointment when recognising they did not receive the psychedelic experience—which may be amplified by PAT trial procedures (extensive preparation followed by a non‑active dosing day). 
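The two proposed factors can be set against the gap in between-arm effects using only the rounded figures quoted in this summary. This is back-of-the-envelope arithmetic with hypothetical variable names, not a re-analysis.

```python
# All values are rounded HAM-D figures quoted in this summary.
pat_between_arm = 7.3              # PAT vs control, between-arm difference
tad_between_arm = 2.4              # TAD vs placebo, between-arm difference
open_label_gain_tad = 1.29         # open-label vs blinded TADs (frequentist estimate)
control_suppression_pat = 4.0      # reduced improvement in PAT control arms

observed_gap = pat_between_arm - tad_between_arm
explained = open_label_gain_tad + control_suppression_pat

print(f"Gap in between-arm effects: {observed_gap:.1f} HAM-D units")
print(f"Sum of the two factors:     {explained:.2f} HAM-D units")
```

With these rounded inputs the gap is about 4.9 and the two factors add to about 5.3, so they agree to within roughly half a HAM-D unit; totals computed from unrounded estimates may differ slightly.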
The sum of these effects (~5.2 HAM-D units) approximates the observed difference in reported between-arm effects (about 5 HAM-D units), providing a plausible explanation for the discrepancy. The authors acknowledge several limitations. Some PAT trials recruited TRD patients exclusively, while no TAD trials did so; baseline HAM-D scores differed modestly between groups, although baseline severity was controlled for in all models. Primary end points were earlier in PAT trials (mean 3.4 weeks) than in TAD trials (mean 8.1 weeks), which could favour PAT. PAT trials may have selected participants with higher education levels and underrepresented minority groups, potentially biasing results. Although PAT trials have very high correct-guess rates, they are not absolutely unblinded; conversely, open-label TADs are known with certainty, and expectancy effects may differ between interventions depending on public perception. Converting different depression measures to HAM-D equivalents could introduce artefacts, though qualitative conclusions were unchanged when analysing HAM-D data alone. The comparison of blinded vs open-label TADs relied on data from an external meta-analysis with slightly different inclusion criteria, which could introduce confounding. The analysis focused on symptom reduction and did not examine side effects or functional outcomes, and within-person (within-arm) effect estimates incorporate both specific treatment effects and nonspecific factors (natural history, placebo), which this methodology cannot disentangle. Finally, the literature search was restricted to PubMed and therefore might have missed some trials. The authors conclude that these findings argue against overly optimistic narratives about PAT's superiority for depression and underscore the importance of blinding integrity when evaluating novel therapeutics.
The authors conclude that, in trials of depression, psychedelic-assisted therapy was not more effective than open-label traditional antidepressants when compared under equal unblinding conditions. Blinding status affected outcomes for TADs but not for PAT, supporting the view that PAT trials are effectively always open label. These results caution against excessive optimism about PAT's superiority and highlight the critical role of blinding integrity in assessing novel treatments.
Szigeti, B., Nutt, D. J., Carhart-Harris, R. L. et al. · Scientific Reports (2023)
Holze, F., Gasser, P., Müller, F. et al. · Biological Psychiatry (2023)
Goodwin, G. M., Aaronson, S. T., Alvarez, O. et al. · New England Journal of Medicine (2022)
Palhano-Fontes, F., Barreto, D., Onias, H. et al. · Psychological Medicine (2018)
Sloshower, J. A., Skosnik, P. D., Safi-Aghdam, H. et al. · Journal of Psychopharmacology (2023)
Griffiths, R. R., Johnson, M. W. · Journal of Psychopharmacology (2016)
Lewis, B. R., Garland, E. L., Byrne, K. et al. · Journal of Pain and Symptom Management (2023)
Rosenblat, J. D., Meshkat, S., Doyle, Z. et al. · Med (2024)
Hieronymus, F., López, E., Sjögren, H. W. et al. · JAMA Network Open (2025)
Szigeti, B., Heifets, B. D. · Biological Psychiatry (2024)
Erritzoe, D., Barba, T., Greenway, K. T. et al. · EClinicalMedicine (2024)