This naturalistic follow-up of a Phase IIb randomised active placebo-controlled trial (n=144) found that one or two 25 mg doses of psilocybin with psychotherapy were linked to sustained reductions in depression symptoms in treatment-resistant major depression (TRD) at six and twelve months. Benefits were similar across groups, while restarting antidepressant medication during follow-up was associated with worse scores.
Introduction
Psilocybin shows promise for treatment-resistant depression (TRD), but long-term data are limited. This study examined the antidepressant effect of one or two psilocybin doses with adjunct psychotherapy in TRD until twelve months.
Methods
This is a naturalistic follow-up of a phase 2b, randomized, active placebo-controlled trial, where participants were randomized to receive two drug administrations six weeks apart, embedded within seven psychotherapeutic sessions: (1) active placebo (100 mg nicotinamide) then 25 mg psilocybin, (2) 5 mg psilocybin then 25 mg psilocybin, (3a) 25 mg psilocybin then 5 mg psilocybin, or (3b) 25 mg psilocybin twice. The controlled phase ended at twelve weeks, after which participants could pursue other treatments, with follow-ups at six- and twelve-months. The primary follow-up endpoint was change from baseline on the Hamilton Rating Scale for Depression (HAMD17).
Results
126/144 randomized participants (51 females, 40%) completed at least one follow-up visit. A generalized additive mixed regression model for change in HAMD17 scores showed a significant time effect across groups for both follow-up time points, with estimated average changes from baseline of -7.93 (95% CI: -9.17, -6.70, adj. p<0.0001) at six months and -7.74 (95% CI: -9.04, -6.43, adj. p<0.0001) at twelve months, without significant group differences. Results were consistent when controlling for antidepressant pharmacotherapy and psychedelic use. Re-initiation of antidepressant pharmacotherapy during follow-up was strongly associated with higher HAMD17 scores (β=3.79, 95% CI: 1.98, 5.60).
Conclusion
This is the largest and most complete follow-up of any clinical psychedelic trial. The findings demonstrate a stable and clinically meaningful long-lasting antidepressant effect of one or two 25 mg doses of psilocybin with adjunct psychotherapy for up to twelve months in TRD.
Treatment-resistant depression (TRD) is common, difficult to manage, and associated with high relapse and recurrence rates despite existing antidepressant treatments. Previous studies have suggested that psilocybin, given with psychotherapy, can produce rapid antidepressant effects, but the durability of these benefits over longer periods has remained uncertain. Earlier follow-up studies were encouraging but often small, open-label, or limited by attrition, so the evidence base for sustained outcomes in TRD has remained thin. Mertens and colleagues therefore set out to examine whether one or two doses of psilocybin with adjunct psychotherapy could produce lasting antidepressant effects over twelve months in TRD. The paper reports a naturalistic follow-up of a phase IIb randomised active-placebo controlled trial, focusing on depressive symptom change, response and remission, and whether any long-term benefit differed by dosing regimen.
The study was a six- to twelve-month naturalistic follow-up of a double-blind, randomised, active-placebo controlled, four-arm phase IIb trial conducted at two German university hospitals. Adults aged 25-65 years with moderate to severe TRD were eligible if they had not improved adequately after at least two antidepressant trials in the current episode. Key exclusions included psychotic or manic symptoms, certain personality disorders, post-traumatic stress disorder, current substance use disorder, recent significant suicidality, and recent or extensive prior classical psychedelic use. Participants were recruited through self-referral, clinical referral, and public outreach. Randomisation used permuted blocks, stratified by centre, with four dosing sequences: 100 mg nicotinamide followed by 25 mg psilocybin, 5 mg followed by 25 mg psilocybin, 25 mg followed by 5 mg psilocybin, or 25 mg psilocybin in both sessions. The controlled phase lasted 12 weeks and involved two dosing sessions six weeks apart, embedded within seven psychotherapy sessions plus weekly safety calls. The psychotherapy included preparatory work, support during dosing, and integration sessions afterwards. After the controlled phase, participants could resume other treatments; the only study-related post-trial support was an optional monthly integration group. The primary outcome was the clinician-rated HAMD17 depression scale at six and twelve months, with the self-rated BDI-II as the key secondary outcome. The researchers also examined response, remission, sustained response, sustained remission, global functioning, suicidality, and serious adverse events. Analyses included generalized additive mixed regression models with time treated as a continuous variable, logistic regression for response outcomes, and additional models adjusting for antidepressant pharmacotherapy and psychedelic use during follow-up. 
Missing values were not imputed; all randomised participants with at least one follow-up visit were included.
This is a six- to twelve-month naturalistic follow-up of a double-blind, randomized, active-placebo controlled, four-arm, phase 2b trial investigating psilocybin with adjunct psychotherapy in TRD. Sponsored by the Central Institute of Mental Health (CIMH), Mannheim, Germany, the study was conducted at two German university hospitals from June 2021 to December 2024 (follow-up completion). The trial was approved by the Ethics Committees of the Medical Faculty Mannheim, University of Heidelberg, and the state of Berlin, and the Federal Institute for Drugs and Medical Devices (BfArM). The risk-benefit ratio of the trial was continuously reviewed by an independent Data Safety Monitoring Board (DSMB). The trial is registered at ClinicalTrials.gov, NCT04670081, registered December 9, 2020. The advisory board of affected persons of the CIMH was consulted before application and initiation of the trial. A patient and public involvement (PPI) board consisting of study participants was established after study completion, meeting regularly for discussion of interpretation and dissemination of the results. A list of investigators is provided in the Supplementary Material. The study design and its rationale have been published previously, as well as the main trial results. The protocol is provided in the Supplementary Material. This study followed the Consolidated Standards of Reporting Trials (CONSORT) reporting guidelines (see the Supplementary Material for the CONSORT 2025 checklist).
Adults aged between 25 and 65 years with moderate to severe TRD were eligible, defined by a score ≥17 on the German version of the Hamilton Rating Scale for Depression (HAMD17; range: 0-52, with higher scores indicating higher severity). TRD was operationalized as insufficient improvement after at least two adequate antidepressant trials of different pharmacological classes in the current episode. Patients were required to be medically stable and to pause any psychotherapies and/or monoaminergic psychiatric medications prior to study treatment. Key exclusion criteria included current or history of psychotic or manic symptoms, a family history of psychosis and/or bipolar disorder, a cluster A personality disorder, borderline personality disorder, post-traumatic stress disorder, a current substance use disorder, a current or recent history of clinically significant suicidality, and prior use of classical psychedelics in the past year and/or more than five lifetime uses. Full eligibility criteria are detailed in prior publications and the study protocol. Recruitment occurred via self- and clinical referrals and public outreach (website, media, flyers, word of mouth). Written informed consent was obtained from all participants after they had received a full explanation of the study design and treatment arms.
Of 144 randomised participants, 142 received at least one intervention dose and post-treatment assessment, and 126 completed at least one six- or twelve-month follow-up and were included in the follow-up analyses. Baseline characteristics were broadly balanced across groups. Mean age was 42.4 years, 41% were female, and the sample was reported as 98% white. By twelve months, 40 participants (32%) had started antidepressant pharmacotherapy, 32 (25%) had used a classical psychedelic, and 86 (68%) had attended the integration group at least once. On the primary outcome, HAMD17 scores improved significantly over time across groups. The estimated mean change from baseline was -7.93 points at six months and -7.74 points at twelve months, both with adjusted p<0.0001. There were no significant differences between treatment groups. When follow-up antidepressant and psychedelic use were added to the model, the pattern remained essentially unchanged. Antidepressant pharmacotherapy during follow-up was associated with worse HAMD17 outcomes (β=3.79, 95% CI 1.98 to 5.60), whereas psychedelic use was not clearly associated with outcome. Secondary BDI-II results were consistent with the clinician-rated findings. Mean change from baseline was -9.51 points at six months and -11.11 points at twelve months, both significant, with no significant group differences. In models adjusting for follow-up drug use, the improvements remained significant and the between-group pattern was again null. Logistic regression showed no significant between-group differences in response or sustained response at either follow-up. Remission data were described but not emphasised as showing clear group differences. Post-hoc analyses suggested larger and more consistent HAMD17 reductions in males. Suicidal behaviours during follow-up were uncommon: two events at six months and four at twelve months, mostly preparatory actions, with one suicide attempt. 
Suicidal ideation on the CSSRS did not change clearly from baseline, while HAMD item 3 suggested a strong anti-suicidal effect at all post-baseline assessments.
Mertens and colleagues interpret the findings as showing stable and clinically meaningful antidepressant benefit lasting up to twelve months after one or two 25 mg psilocybin doses with adjunct psychotherapy in TRD. They describe the average HAMD17 improvement of roughly 8 points and the response rates at follow-up as evidence of durable benefit, while noting that the observational design means the results should be interpreted cautiously. The authors state that there was no clear evidence that a second 25 mg dose added benefit over regimens that included only one 25 mg dose, although they note that the 25 mg-25 mg group showed numerically higher response rates and that retreatment remains a plausible hypothesis for future study. They say the BDI-II findings aligned with the clinician-rated results, supporting a consistent antidepressant effect across measurement types. They also argue that spontaneous recovery is unlikely to fully explain the findings, given the chronicity of TRD, although they acknowledge that it cannot be ruled out in a naturalistic follow-up without a control condition. In comparing their results with earlier studies, the authors suggest that long-term outcomes here were more modest than those reported in some major depression studies, which they attribute partly to the TRD population and to design differences such as enriched follow-up in other work. They suggest that the durability of benefit may relate to extended psychotherapeutic support, including preparatory, dosing, integration, and post-trial group sessions. They also note that antidepressant pharmacotherapy after the trial was associated with poorer outcomes, which they interpret as likely reflecting greater residual illness among those needing medication again, and that post-trial psychedelic use was not associated with better depression outcomes. 
The paper highlights several limitations: the psychotherapeutic component was not experimentally separated from psilocybin, so the relative contribution of drug versus psychotherapy is unknown; follow-up was naturalistic and lacked a placebo arm; some participants were unblinded during follow-up; other therapies were not systematically captured; the sample was relatively homogeneous and potentially self-selected; and the study was not powered to test all follow-up comparisons. The authors also note that new adverse events were not systematically collected during follow-up, limiting conclusions about long-term safety. They argue that future research should use larger head-to-head trials, examine retreatment, compare dosing and psychotherapy approaches, and include cost-effectiveness work if psilocybin-assisted therapy is to be evaluated for implementation in healthcare systems.
The authors conclude that this naturalistic follow-up provides the first demonstration of sustained antidepressant efficacy of 25 mg psilocybin with adjunct psychotherapy over twelve months in a large TRD cohort. They say the findings support the clinical potential of one or two doses while also showing the need for better methods to evaluate long-term outcomes of psychedelic therapies. They further suggest that, if confirmed, psilocybin with psychotherapy could represent a shift towards an integrated pharmacological-psychotherapeutic model of care rather than chronic symptom management alone.
Randomization was performed via an online tool using permuted blocks by an independent data manager, stratified by center, with allocation ratios of 2:2:1:1 across four treatment groups: (i) active placebo (100 mg nicotinamide) followed by 25 mg psilocybin; (ii) 5 mg followed by 25 mg; (iiia) 25 mg followed by 5 mg psilocybin; and (iiib) 25 mg psilocybin in both sessions. The second dosing occurred after assessment of the primary endpoint at week 6; the allocation ratios ensured equal group sizes for the group comparisons (nicotinamide vs. 5 mg vs. 25 mg) at the primary endpoint. Accordingly, all participants in this observational follow-up had received at least one 25 mg psilocybin dose with adjunct psychotherapy over a period of 13 weeks. Given the high risk of functional unblinding associated with psychedelic treatments, the design served both methodological (e.g. reduction of dropouts, disappointment and subsequent depressiogenic effects in comparator arms until the primary endpoint which could inflate between-condition differences), and ethical purposes (e.g. offering active treatment to a vulnerable patient population of TRD). For a detailed discussion, see the prior publications. All study personnel (including therapists and raters) and participants were blinded to treatment allocation until completion of the main trial. Unblinding occurred in April 2024 after data lock, when the last patient had completed the Week-12 endpoint. Hence, a proportion of participants were still in follow-up at the time of unblinding.
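The allocation scheme described above can be sketched in a few lines. This is a minimal illustration, not the trial's actual randomization tool: the block size of six, the seeds, the center names and the assumption of 72 participants per center are all hypothetical, since the text states only that permuted blocks, stratification by center and a 2:2:1:1 ratio were used. The sketch also shows why the ratio gives equal first-dose groups (nicotinamide vs. 5 mg vs. 25 mg) at the week-6 primary endpoint.

```python
import random
from collections import Counter

# Arm labels follow the four dosing sequences in the text; weights are the 2:2:1:1 ratio.
ARMS = {
    "nicotinamide->25mg": 2,
    "5mg->25mg": 2,
    "25mg->5mg": 1,
    "25mg->25mg": 1,
}

def make_block(rng):
    """Return one shuffled block of six slots (block size is an assumption)."""
    block = [arm for arm, weight in ARMS.items() for _ in range(weight)]
    rng.shuffle(block)
    return block

def allocation_sequence(n_participants, seed):
    """Concatenate permuted blocks until n_participants slots exist."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        sequence.extend(make_block(rng))
    return sequence[:n_participants]

# Stratification by center: one independent sequence per center.
# Center names, sizes and seeds are hypothetical.
sequences = {
    "center_A": allocation_sequence(72, seed=1),
    "center_B": allocation_sequence(72, seed=2),
}

# Arm counts over both centers, and counts by first dose (the week-6 comparison).
arm_counts = Counter(arm for seq in sequences.values() for arm in seq)
first_dose_counts = Counter(
    arm.split("->")[0] for seq in sequences.values() for arm in seq
)
```

Because every complete block contains each arm in the stated ratio, the two 25 mg-first arms together match the size of each of the other two first-dose groups.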
Following screening, participants were assigned to a therapist pair, who conducted all therapeutic sessions. Both therapists were either licensed psychological psychotherapists/psychiatrists or in an advanced phase of their training. One week after the baseline visit, participants received either oral 25 mg psilocybin, 5 mg psilocybin or 100 mg nicotinamide (treatment phase 1). After assessment of the primary endpoint at six weeks, a second dosing session took place, administering 25 mg or 5 mg psilocybin (treatment phase 2) according to the randomization scheme. The two six-to-eight-hour dosing sessions were scheduled six weeks apart and embedded in a manualized psychotherapy framework. The adjunct psychotherapy comprised seven preparatory and integration sessions (two hours each; 14 hours total) and eight weekly safety monitoring calls. In accordance with the trial's therapy manual, preparatory therapeutic sessions focused primarily on building therapeutic rapport, exploration of the patient's depressive history, life context, resources, and treatment expectations and goals, including development of a therapeutic intention for the dosing session. A second objective was the preparation of participants for the upcoming dosing session and potential psychedelic experiences, e.g. through psychoeducation, as well as encouragement and practicing of an open, mindful attitude and functional coping strategies. During the dosing sessions, participants were lying down, listening to specific music; they were encouraged to wear eyeshades, look inward and engage with the unfolding inner experiences. Therapists checked in regularly when blood draws or vital signs were taken or when the participant appeared distressed or in an unclear state. 
Challenging experiences were supported in a process-directive manner: interventions promoting engagement with unpleasant experiences were favored, while disengagement from unpleasant experiences was encouraged only in cases of severe or prolonged distress. With prior consent obtained during preparation, therapeutic touch (hand to shoulders only) was applied to relieve distress and provide emotional support. The subsequent debriefings at the end of dosing days and the following psychotherapeutic integration sessions (one day and one week after each dosing session) were targeted at supporting participants in processing their experiences, deriving relevant insights and fostering adaptive therapeutic change in a patient-directed manner. Full procedural details, including the psychotherapeutic framework, are described in prior publications and the therapy manual. The fact that some previous publications have referred to the psychotherapy component of psilocybin treatment as "psychological support", whereas we and others deliberately use the term "psychotherapy", should not be interpreted as indicating any qualitative or quantitative differences in the psychotherapeutic interventions provided in the present study compared with earlier studies. Rather, this terminological choice reflects a conceptual emphasis on the central role of structured psychotherapeutic processes in the delivery and interpretation of psilocybin treatment effects. Across the trial, participants attended twelve in-person visits, including two for screening, eight during the intervention phase, and two follow-up visits at six and twelve months after the first dose. The controlled trial was completed at week 12 or earlier in case of discontinuation, after which participants entered follow-up and could resume other treatments. 
The only additional study-related intervention was a voluntary monthly integration group (90 minutes per session, led by two trial therapists) providing space for continued integration and interpersonal support (see Supplementary Material for details). Follow-up visits were primarily conducted in-person; remote assessments were arranged when necessary. They included an open follow-up evaluation (catamnesis), self-report and clinician-rated outcome scales.
The HAMD17 was the primary outcome measure, assessed at multiple time points during the main trial, of which baseline and end-of-study assessments are relevant here, and at both follow-up visits. The German GRID-HAMD17 was administered by trained, blinded raters not involved in the therapy of the respective patient (blinding maintained until last-patient-out of the controlled trial and data lock in April 2024). The key secondary outcome was the self-reported Beck Depression Inventory (BDI-II; range: 0-66, higher scores indicating greater severity), also assessed repeatedly, including at both follow-ups. Both outcomes were analyzed as change from baseline, binary response (≥50% reduction on HAMD17 or BDI-II), remission (HAMD17 <8, BDI-II <10), as well as sustained response and remission since end-of-study. Clinically meaningful changes are generally defined as reductions of approximately 3-5 points on the HAMD17 and of 3-6 points on the BDI-II. Primary follow-up endpoints were defined as HAMD17 change from baseline at six and twelve months (time effect) and comparisons between the 25 mg-25 mg psilocybin group and other treatment groups at both time points. Key secondary follow-up endpoints mirrored this structure using BDI-II scores. Further secondary endpoints included group differences in response, sustained response, remission and sustained remission at both follow-ups on the HAMD17 and BDI-II. Exploratory efficacy endpoints included the Clinical Global Impression (CGI) and Global Assessment of Functioning (GAF) scales. At each visit, participants were assessed for psychiatric medication use, classical psychedelic use, and participation in the post-trial integration group. Systematic assessment of new adverse events (AEs) ended after week 12 of the main trial. Long-term safety monitoring included reports of serious adverse events (SAEs) and suicidality assessments using the Columbia-Suicide Severity Rating Scale (CSSRS) at both follow-ups.
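The binary outcome definitions above translate directly into a small classifier. The following is an illustrative sketch only: the function names and the example scores are hypothetical, and treating "sustained" as meeting the criterion at end-of-study and at every later visit is an assumption about how the trial operationalized it.

```python
# Remission cut-offs as stated in the text.
HAMD17_REMISSION_CUTOFF = 8   # remission: HAMD17 < 8
BDI2_REMISSION_CUTOFF = 10    # remission: BDI-II < 10

def classify(baseline, follow_up, remission_cutoff):
    """Classify one participant at one visit on one scale."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    relative_reduction = (baseline - follow_up) / baseline
    return {
        "change": follow_up - baseline,          # negative values mean improvement
        "response": relative_reduction >= 0.5,   # >=50% reduction from baseline
        "remission": follow_up < remission_cutoff,
    }

def sustained(flags_by_visit):
    """Assumed definition: criterion met at end-of-study and every later visit."""
    return all(flags_by_visit)

# Hypothetical participant: HAMD17 of 22 at baseline, 10 at six months.
example = classify(22, 10, HAMD17_REMISSION_CUTOFF)

# Hypothetical response flags at end-of-study, six months, twelve months.
example_sustained = sustained([True, True, False])
```

In this hypothetical case the participant is a responder (a 12-point drop is more than half of 22) but not a remitter, because the follow-up score of 10 is not below 8.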
Details of the power calculation are provided in prior publications, and the study protocol (Supplementary Material). The Statistical Analysis Plan is also available in the Supplementary Material. All randomized participants who completed at least one follow-up visit were included in the analyses, regardless of premature termination of the main trial or time deviations of follow-up visits. Missing values were not imputed, as the overall drop-out rate during follow-up was 12.5% (by visit) and 13.9% for HAMD17, only slightly above the 10% rate assumed. Further information on missing data is provided in the Supplementary Material (Table). Descriptive statistics summarized demographic and baseline clinical characteristics. Post-trial antidepressant and psychedelic use and integration group participation were analyzed descriptively and compared between groups using chi-squared tests and Analysis of Variance (ANOVA). HAMD17 and BDI-II change from baseline were analyzed using a generalized additive mixed regression model (GAMM), including treatment, time, treatment x time, center and center x time as fixed effects and participant-specific intercepts as random effects. Time was treated continuously (days since first dose) with effects modeled by smooth, non-linear functions, see Supplementary Material for details. Additional analogous GAMMs for both outcomes controlled for antidepressant and psychedelic use during follow-up as potentially confounding variables. Sustained response and response rates on HAMD17 and BDI-II for both follow-up visits were analyzed by logistic regression with treatment, baseline score, and antidepressant or psychedelic use since the previous visit as predictors. Remission and sustained remission rates were analyzed descriptively. 
Correction for multiple testing was applied to the primary (HAMD17) and key secondary (BDI-II) efficacy endpoints assessing change from baseline at six and twelve months, and group comparisons between the 25 mg-25 mg psilocybin group and all other groups (eight tests in total). Bonferroni-Holm correction was applied to these eight p-values. GAF, CGI, CSSRS suicidal ideation and HAMD item 3 (suicidality) were analyzed using GAMMs analogous to the main efficacy analyses. Results are reported in the Supplementary Material (Table, Table). Post-hoc HAMD17 subgroup analyses by sex and integration group participation status were conducted via descriptives and pairwise t-tests for mean comparisons (Supplementary Material, Table-S5). Suicidal behaviors and non-suicidal self-injurious behaviors (Table) and SAE follow-up reports were analyzed descriptively and are provided in the Supplementary Material. Statistical analyses were performed in R Language for Statistical Computing version 4.4.
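For readers unfamiliar with the Bonferroni-Holm procedure, a minimal sketch of the step-down adjustment for a family of eight tests follows. The p-values are invented for illustration and are not the trial's results; the authors ran their analyses in R, so this Python version only demonstrates the logic.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Eight hypothetical raw p-values (NOT taken from the study).
raw = [0.001, 0.04, 0.03, 0.2, 0.5, 0.01, 0.6, 0.002]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
```

Compared with a plain Bonferroni correction, which multiplies every p-value by eight, the step-down multipliers shrink as the ordered p-values grow, so the procedure is never less powerful while still controlling the family-wise error rate.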
Participants were recruited between June 2021 and December 2023; primary trial completion (week 12) occurred in February 2024, with the final twelve-month data collected in December 2024. Figure shows study enrollment and participant numbers at each stage. Of 144 randomized participants, 142 received at least one IMP administration and one post-treatment assessment. A total of 126 participants completed at least one follow-up assessment at six and/or twelve months and were included in the follow-up analysis. Of the 16 who did not, ten discontinued the main trial early; individual reasons are reported elsewhere. Six participants completed the trial through week 12 but were lost to follow-up thereafter. Baseline demographic and clinical characteristics of the 126 follow-up participants are provided in Table. Groups were balanced, with only negligible differences to the full sample (N=144). The mean (SD) age was 42.4 (10.6) years, 41% were female (sex assigned at birth) and 98% were white (ascertained by investigators at screening). The mean (SD) duration since the first MDD episode was 13.9 (9.8) years, with a mean (SD) of 6.4 (9.3) lifetime depressive episodes at screening. Baseline depression was moderate on the HAMD17 and severe on the BDI-II. Baseline and end-of-study depression scores of follow-up dropouts are provided in Table. Antidepressant pharmacotherapy, classical psychedelic use and participation in the post-trial integration group are depicted in Table. Group-specific data are provided in Table and indicated no differences. By twelve months, 40 participants (32%) had initiated antidepressant pharmacotherapy, 32 (25%) had used a classical psychedelic, and 86 (68%) had attended the integration group at least once. Mean HAMD17 scores (SD) were 14.8 (7.69) at end-of-study, 14.1 (8.04) at six months and 13.9 (7.84) at twelve months, indicating an average severity within the mild depression range. 
Individual HAMD17 trajectories by treatment group are depicted in Figure in the Supplementary Material. The GAMM for HAMD17 scores revealed a significant time effect at both follow-ups, with a mean change from baseline of -7.93 (95% Confidence Interval [CI], -9.17 to -6.70, adj. p<0.0001) at six months and -7.74 (95% CI, -9.04 to -6.43, adj. p<0.0001) at twelve months, averaged across treatment groups and centers (Table, Figure). No group differences were observed across all timepoints (p=0.098). The complementary GAMM controlling for psychedelic and antidepressant use yielded similar results (Table, Figure). Antidepressant pharmacotherapy was associated with a poorer HAMD17 outcome (estimated mean difference: 3.79, 95% CI: 1.98 to 5.60), whereas psychedelic use showed a negligible association with outcome (-0.27, 95% CI: -2.23 to 1.69; Table). Logistic regression models revealed no significant group differences at any follow-up in HAMD17 response or sustained response rates when controlling for center, antidepressant and psychedelic use (Table). Post-hoc subgroup analyses indicated larger and more consistent HAMD17 reductions in males (Table). BDI-II results paralleled the HAMD17: across groups, the mean BDI-II score (SD) was 22.6 (13.4) at end-of-study, 23.9 (14.0) at six months and 22.5 (14.4) at twelve months, indicating an average severity within the moderate depression range. The GAMM on BDI-II scores revealed a significant time effect, averaged across group and center for both follow-up time points, with an average change from baseline of -9.51 (95% CI, -11.62 to -7.41, adj. p<0.0001) at six months and -11.11 (95% CI, -13.39 to -8.84, adj. p<0.0001) at twelve months (Table, Figure). No significant differences between treatment groups were found (p=0.074; Table). Results were consistent when additionally controlling for antidepressant and psychedelic use, with an estimated average change from baseline of -10.47 (95% CI: -12.68 to -8.26, adj. p<0.0001) at six months, and -13.18 (95% CI: -15.81 to -10.54, adj. p<0.0001) at twelve months, and no difference between treatment groups (p=0.163). This BDI-II model also provided strong evidence for a negative association of antidepressant pharmacotherapy with outcome (estimated difference: 5.88, 95% CI: 2.80 to 8.96; Table). Results of logistic regression analyses on response and sustained response revealed no differences between treatment groups at any time point (Table). Suicidal behaviors during follow-up were infrequent, with two incidences at six months and four at twelve months; all were preparatory except for one suicide attempt (Table). CSSRS suicidal ideation scores showed no changes from baseline or between-group differences, while HAMD item 3 (suicidality) indicated a strong anti-suicidal effect at all post-baseline assessments (Table).
This naturalistic follow-up of a phase 2b trial showed that patients with TRD who received one or two doses of 25 mg psilocybin with adjunct psychotherapy experienced stable and clinically meaningful improvements in depressive symptoms lasting up to twelve months; given the observational design of the study, these results need to be interpreted cautiously. All participants had received 25 mg psilocybin at least once in the main trial. The estimated mean HAMD17 improvement was approximately 8 points in the follow-up period, with response rates of 34% at six months and 37% at twelve months. Consistent with results until week 12, no significant group differences were found during follow-up, providing no clear evidence for an additional benefit of a second 25 mg dose compared with regimens involving a single 25 mg psilocybin dose or a 5 mg dose followed or preceded by a 25 mg dose. However, the study was powered only for the primary endpoint. Considering the observed (albeit not significant) higher response rate in the 25 mg-25 mg group (50% at both follow-ups versus 30-35% in other groups), and reported benefits of retreatments in another TRD trial, a benefit of retreatment remains plausible and warrants further investigation. Results of the BDI-II were in line with the HAMD17, indicating a robust and consistent antidepressant effect across self-report and clinician-rated measures. Although spontaneous recovery or the natural disease course cannot be ruled out as contributors to improvement in this naturalistic study without a control condition, this appears unlikely to be the sole contributor to the observed stable antidepressant effects in a chronic TRD cohort (Table). Consistent with findings from the main trial, females appeared to benefit less (Table), underscoring the need to investigate underlying mechanisms and to optimize psilocybin treatment particularly for women. 
The study results remained consistent after controlling for psychedelic use and antidepressant pharmacotherapy. Across models and outcomes, antidepressant pharmacotherapy was associated with poorer depression outcomes, likely reflecting that participants requiring antidepressant medication after the trial had benefitted less. Only 32% resumed antidepressant pharmacotherapy post-trial, compared with 45% who had discontinued at trial entry, potentially reflecting treatment response and reluctance to restart a previously ineffective therapy in TRD patients. Subsequent psychedelic use was not associated with depression outcomes. This could indicate no additional benefit of repeated psychedelic administrations, or, more plausibly, that the clinical setting and the adjunct psychotherapy are crucial for their therapeutic effects. Psychedelic use was not analyzed granularly, as type, dose, frequency and motives were not assessed. Overall, 32 participants (25%) reported psychedelic use between the end of the trial and twelve months. Although unrelated to depression outcomes, such use likely reflects the treatment's acceptance and perceived benefits but also underscores potential risks and the need for appropriate safeguards in clinical psychedelic trials. Overall, the long-term antidepressant effect observed in the present study was moderate and comparable to effects found with other antidepressant treatments in TRD, including (es-)ketamine or lithium and antipsychotic augmentation. Notably, direct comparisons to long-term outcomes of other antidepressant agents like (es-)ketamine are limited by methodological differences in study designs and treatment regimens, as (1) most studies use enriched designs, in which only patients who have demonstrated an acute response or remission are included, and (2) maintenance treatment is offered during follow-up. 
This approach tends to inflate apparent long-term response or remission rates and differs fundamentally from the present design, in which the full study cohort, including initial non-responders, was included and no maintenance psilocybin therapy occurred. Response rates and depression score reductions were, however, lower than those reported at six- and twelve-month follow-ups by Gukasyan et al., presumably reflecting the generally larger treatment response to psilocybin therapy in MDD than in TRD. While a previous TRD study found diminishing antidepressant effects of psilocybin with time, depression scores in the present study remained stable and even slightly decreased further during follow-up (Table, Table). This sustained efficacy may relate to the extended psychotherapeutic support (13 weeks of therapy with a total of 30 h of therapy with two therapists, two dosing sessions, and a post-trial integration group) compared to shorter or single-dose trials. In line with prior studies, our main trial report provided preliminary evidence linking the acute psychedelic experience and antidepressant outcomes. More comprehensive mechanistic analyses on the role of specific aspects of the acute subjective experience, together with psychotherapeutic processes (e.g., experiential avoidance) and pharmacokinetic processes, in the antidepressant effects of the present trial are currently underway and lie beyond the scope of this manuscript. A subset of patients were sustained responders at six (22.2%) and twelve months (15.9%), indicating a rapid, robust and durable antidepressant effect of psilocybin in some individuals. Future studies should explore predictors of sustained response and examine whether retreatment benefits those who relapse. For others, therapeutic benefits may emerge gradually, as reflected by continued decreases in depression scores and higher response and remission rates at six and twelve months versus end-of-study.
From a psychotherapeutic perspective this appears plausible, as PAT is viewed as an uncovering therapeutic approach, in which symptom improvement may only follow successful integration. Strikingly, the long-term benefits emerged independently of antidepressant pharmacotherapy in this trial. This pattern may reflect an ongoing change process after a single psilocybin treatment with psychotherapeutic embedding, resembling psychotherapeutic mechanisms and supporting its potential as a salutogenic or disease-modifying therapy that is distinct in treatment modality, and likely in its therapeutic mechanisms, from conventional pharmacotherapies requiring continuous use or maintenance treatment. At the same time, it is important to note that the existing evidence base does not allow conclusions about the comparative effectiveness of PAT versus other antidepressant treatments, as head-to-head trials are largely lacking. One exception is a small (likely underpowered) MDD trial of psilocybin versus escitalopram, which did not detect a difference. Accordingly, larger randomized controlled long-term studies should assess the efficacy of PAT versus standard of care in head-to-head trials (in TRD: esketamine or lithium or quetiapine augmentation), explore and compare different dosing regimens and ranges and (psycho-)therapy protocols, and evaluate retreatment as well as ongoing psychotherapeutic support. Given the resource- and labor-intensive nature of PAT, particularly when conducted with individual patients and therapist dyads, future research needs to carefully evaluate its advantages and disadvantages relative to established therapies, including rigorous cost-effectiveness analyses. Only such head-to-head trials will suffice for a favorable Health Technology Assessment (HTA), required for implementation in most European healthcare systems.
While a fixed 25 mg psilocybin dose has been established as a reliable antidepressant dose and shown superiority over 10 mg, clinical practice also employs individualized dosing regimens, including both lower and higher doses (up to 40 mg psilocybin), as seen in the Swiss limited access program. Future studies should explore the potential benefits of individualized dose selection and identify patient-specific factors influencing pharmacokinetics, acute effects and clinical outcomes in PAT. In line with other studies, participants discontinued monoaminergic medication prior to study treatment to avoid attenuation of psilocybin effects and to ensure safety. More recent studies in healthy volunteers and small patient cohorts suggest that discontinuation of antidepressants might not be necessary for safety and could even be beneficial by reducing negative acute effects of psilocybin, such as anxiety and adverse events. Given that challenging acute experiences (or more precisely the resolution of such experiences) have also been conceptualized as potentially beneficial for the therapeutic process (analogous to anxiety in exposure-based psychotherapies), this may be particularly relevant for patients who do not tolerate discontinuation of their antidepressants. Future studies should directly compare psilocybin monotherapy with psilocybin augmentation approaches to evaluate their effects on the therapeutic process and depression outcomes in clinical populations. An important limitation of the present study is that the psychotherapeutic component was not experimentally manipulated. Consequently, it remains unclear to what extent the observed long-term antidepressant effects can be attributed to psilocybin per se, to the accompanying psychotherapy, or to their synergistic interaction.
This design limitation has been repeatedly highlighted by psychotherapy researchers, who emphasize that PAT should be conceptualized and evaluated as a complex psychotherapeutic intervention rather than a purely pharmacological one. While the present study provides evidence for the durability of a combined treatment approach, dismantling designs or factorial trials varying key psychotherapeutic components could advance a more precise understanding of how psychedelic therapies exert their effects. While conclusions regarding long-term safety are constrained by the absence of systematic collection of new AEs during follow-up, this study substantially extends the limited evidence on psilocybin's long-term efficacy. It represents the largest and most complete follow-up cohort of patients treated with a psychedelic, with nearly 90% retention, in contrast to the largest prior study, in which only 66 of 252 randomized patients were followed up, a subset likely not representative of the full sample. Additional limitations include the observational naturalistic design without a placebo arm (all participants received 25 mg psilocybin at least once) and unblinding while some participants were still in follow-up (20% of follow-up visits occurred after unblinding), which may have inflated expectancy and placebo effects. While randomized withdrawal trials are the regulatory standard for establishing relapse prevention with pharmacological treatments, their application to psilocybin is inherently problematic. Psychedelic treatments are administered as single-dose or time-limited interventions with prolonged therapeutic effects, similar to psychotherapy, making treatment discontinuation an unreliable proxy for withdrawal of therapeutic action. As a result, relapse-preventive effects cannot be expected to cease immediately after stopping treatment, even though the precise duration of such effects cannot be definitively established.
Importantly, this limitation reflects current evidence rather than a principled impossibility, with a naturalistic follow-up currently providing the most valid approach to assessing long-term outcomes of psychedelic therapies. Although post-trial antidepressant and psychedelic use was systematically recorded and controlled for, use of psychotherapies or other therapies was not. Further limitations are a consistent center effect, a socioeconomically and ethnically homogeneous sample, potential self-selection bias and follow-up dropouts. Despite possible positivity bias (Table), the overall dropout rate was low, supporting the robustness and validity of the results.
Albeit constrained by its naturalistic design, this study is the first to demonstrate sustained antidepressant efficacy of 25 mg psilocybin with adjunct psychotherapy over twelve months in a large TRD cohort. Evidence of long-term benefits after one or two doses supports the clinical potential of this intervention, while also highlighting the need for methodological approaches to long-term outcomes that go beyond conventional randomized follow-up designs. Contingent on confirmation of the long-term benefits in subsequent trials, psilocybin with adjunct psychotherapy may represent a paradigm shift in psychiatric care, moving away from chronic pharmacological symptom management toward an integrated pharmacological-psychotherapeutic approach that facilitates lasting psychological change and recovery.
contributions to data collection as study investigators and clinicians (therapists and/or raters). All authors provided critical feedback to the manuscript (review and editing). LJM and GG confirm that they had full access to all the data of the manuscript; all authors accept responsibility to submit for publication.
The data that support the findings of this study are not publicly available due to privacy and data protection reasons. However, deidentified individual participant data will be made available one year after publication upon reasonable request to the corresponding author. Data access will be granted for academic purposes to researchers whose proposal has been approved by the responsible review board after signing of a data access agreement. Additional documents to be made available with publication are the study protocol, the statistical analysis plan (published in the supplementary material) and informed consent forms (available upon request to the coordinating authors).
Footnotes:
a The number of initial contacts is the total number of people expressing interest at both trial centers. Because recruitment was conducted autonomously at both trial centers without exchange of personal information between centers, people who contacted both trial centers are potentially counted twice.
b Pre-screening consisted of a multistep process, starting with (1) an initial e-mail contact, through which first relevant data on the inclusion and exclusion criteria were collected, and (2) a pre-screening call via video or phone.
c Patients were randomized to four trial arms determining their dosing scheme for the first and second IMP dose from the beginning.
d Of all 144 randomized participants, 142 completed at least one post-treatment endpoint assessment and were therefore included in the main efficacy analysis (primary endpoint analysis at week 6) according to a while-on-treatment estimand, see Mertens et al.
e All patients (independent of their completion of the main trial) were intended to be included in the follow-up phase of the trial. Of all 142 participants with at least one post-treatment endpoint in the main trial phase, 126 participants completed at least one follow-up assessment and are therefore included in the follow-up analysis set. Of those 142 participants, 10 had discontinued the trial early during the main trial phase; 6 participants had completed the main trial phase compliantly until week 12, but were lost to follow-up afterwards.
f 126 participants have at least one follow-up data entry and are therefore included in the follow-up analysis set. Additional missing values on visit or scale level for the primary efficacy endpoint (HAMD17) analyses are reported for the two follow-up visits, respectively. Follow-up observations were included in the analyses irrespective of time deviations, as the actual "time since dose 1" was included as time variable in the statistical models (see Methods).
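The participant counts in footnotes d and e, and the retention comparison made in the Discussion, can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the text; the function and constant names are illustrative and not part of the trial's analysis code.

```python
def retention(followed_up: int, randomized: int) -> float:
    """Share of randomized participants with follow-up data."""
    return followed_up / randomized


# Counts as stated in the footnotes and the Discussion.
RANDOMIZED = 144
MAIN_ANALYSIS_SET = 142      # at least one post-treatment endpoint
FOLLOW_UP_SET = 126          # at least one follow-up assessment
DISCONTINUED_EARLY = 10      # left during the main trial phase
LOST_AFTER_WEEK_12 = 6       # completed week 12, then lost to follow-up

# The two attrition categories should account for everyone
# in the main analysis set who is missing from follow-up.
not_followed_up = MAIN_ANALYSIS_SET - FOLLOW_UP_SET
consistent = not_followed_up == DISCONTINUED_EARLY + LOST_AFTER_WEEK_12

present_study = retention(FOLLOW_UP_SET, RANDOMIZED)
prior_study = retention(66, 252)  # largest prior study, per the Discussion

print(f"flow consistent: {consistent}")
print(f"present study retention: {present_study:.1%}")  # 87.5%
print(f"prior study retention:   {prior_study:.1%}")    # 26.2%
```

The 87.5% figure corresponds to the "nearly 90% retention" quoted in the Discussion.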
Total scores on the Hamilton Depression Rating Scale (HAMD17) range from 0 to 52 with higher scores indicating more severe depressive symptoms. The two dashed vertical lines represent the six-month (183 days after dose 1) and twelve-month (365 days after dose 1) follow-up endpoints. The dots represent individual observations, while the lines depict smoothly fitted functions for each treatment group over time derived from a generalized additive mixed regression model (GAMM). The shaded areas represent pointwise 95% confidence intervals for each treatment group. The fitted lines are based on a GAMM with treatment, time, treatment × time interactions, center and center × time interactions as covariates (fixed effects) and participant-specific intercepts (random effects). The non-linear functions are based on penalized B-splines with eight cubic basis functions. The optimal smoothing parameter was determined by the generalized cross-validation criterion.
Impression; GAF = Global Assessment of Functioning; CSSRS = Columbia-Suicide Severity Rating Scale
a Withdrawal from antidepressant or all psychoactive medication at trial entry was counted as "yes" if patients withdrew from their medication after providing informed consent or within four weeks prior but within the pre-screening/screening process.
b In line with the inclusion/exclusion criteria no patient had HAMD17 scores in the mild range at trial entry (HAMD17 total score range at screening for the entire sample: 17-32).
c Three patients had missing values on BDI-II item level at baseline. Those missing values were imputed with the respective BDI-II scores from visit 2 (1 day before dose 1).
d The CSSRS assessment at trial entry is provided here as it provides a better impression of the disease severity through assessment of the current and recent degree of suicidality (with reference to the past 6 to 12 months before trial entry) as well as the lifetime history of suicidality.
CSSRS values of the baseline assessment (assessing the period since screening) are provided in the text and reported with respect to the post-baseline changes. e Included suicidal behaviors are preparatory actions, aborted and interrupted suicide attempts and actual suicide attempts.
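The figure caption describes the smooth time curves as penalized B-splines with eight cubic basis functions and a smoothing parameter chosen by generalized cross-validation (GCV). The following is a minimal sketch of that smoothing step only, assuming equally spaced knots and a second-order difference penalty (a common P-spline setup, not confirmed by the source). It omits the treatment, center and interaction terms and the participant-specific random intercepts of the actual GAMM. All function names are hypothetical.

```python
import numpy as np


def uniform_knots(lo, hi, n_basis=8, degree=3):
    """Equally spaced knots extended beyond [lo, hi] (assumed layout)."""
    n_seg = n_basis - degree
    h = (hi - lo) / n_seg
    return lo + h * np.arange(-degree, n_seg + degree + 1)


def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)[:, None]
    # Degree 0: indicator of the half-open knot interval containing x.
    B = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x - knots[:-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[d + 1:] - x) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B


def fit_pspline(x, y, lam, lo, hi, n_basis=8, degree=3):
    """Penalized least squares fit; returns coefficients, edf and GCV score."""
    y = np.asarray(y, dtype=float)
    B = bspline_basis(x, uniform_knots(lo, hi, n_basis, degree), degree)
    D = np.diff(np.eye(n_basis), n=2, axis=0)   # second-order differences
    A = B.T @ B + lam * (D.T @ D)
    coef = np.linalg.solve(A, B.T @ y)
    edf = np.trace(np.linalg.solve(A, B.T @ B))  # effective degrees of freedom
    rss = float(np.sum((y - B @ coef) ** 2))
    gcv = len(y) * rss / (len(y) - edf) ** 2
    return coef, edf, gcv


def select_lambda(x, y, grid, lo, hi):
    """Pick the smoothing parameter with the smallest GCV score on a grid."""
    scores = [fit_pspline(x, y, lam, lo, hi)[2] for lam in grid]
    return grid[int(np.argmin(scores))]


if __name__ == "__main__":
    # Synthetic curve for illustration only; these are not trial data.
    rng = np.random.default_rng(0)
    days = rng.uniform(0, 365, 300)
    y = np.sin(days / 60) + rng.normal(0, 0.2, days.size)
    grid = [10.0 ** k for k in range(-2, 5)]
    lam = select_lambda(days, y, grid, 0, 365)
    coef, edf, gcv = fit_pspline(days, y, lam, 0, 365)
    print(f"lambda={lam:g}, edf={edf:.2f}, GCV={gcv:.4f}")
```

In practice a mixed-model GAM library would fit the smooths and the random intercepts jointly; the sketch only shows how the basis, the penalty and the GCV criterion fit together.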
Total HAMD17 score at end-of-study (SD): 14.5 (5.99)
Footnotes: HAMD17 = Hamilton Depression Rating Scale; Response = minimum 50% reduction on the HAMD17 total score; Remission = HAMD17 total score < 8; Sustained response/remission is defined as the respective criterion being maintained since end-of-study; SD = Standard Deviation. The N refers to the number of participants within each group with at least one follow-up visit. Details on additional missing values for each follow-up visit are provided in the Supplementary Appendix, Table.
a Model 1 results are based on a generalized additive mixed regression model (GAMM) with treatment, time, treatment × time interactions, center and center × time interactions as covariates (fixed effects) and participant-specific intercepts (random effects).
b Model 2 results are based on an analogous GAMM with HAMD17 as outcome, with treatment, time, treatment × time interactions, center and center × time interactions (fixed effects), participant-specific intercepts (random effects), and psychedelic use and antidepressant pharmacotherapy since prior visit as additional fixed effects.
c Results are based on separate logistic regression models for each follow-up visit and outcome (response, sustained response). All logistic regression models included treatment, center, antidepressant pharmacotherapy and psychedelic use since prior visit as covariates.
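The outcome definitions in the table footnotes translate directly into a small classification routine. The sketch below assumes that "maintained since end-of-study" means the criterion is met at end-of-study and at every later visit up to the one in question; this reading, the function names and the example scores are illustrative assumptions, and missing visits are not handled.

```python
REMISSION_CUTOFF = 8      # remission: HAMD17 total score < 8
RESPONSE_FRACTION = 0.5   # response: at least 50% reduction from baseline


def is_response(baseline, score):
    """True if the total score dropped by at least 50% of baseline."""
    return (baseline - score) >= RESPONSE_FRACTION * baseline


def is_remission(score):
    """True if the total score is below the remission cutoff."""
    return score < REMISSION_CUTOFF


def sustained(flags):
    """Cumulative flags: met at end-of-study and at every visit since.

    Assumes flags[0] refers to end-of-study and later entries to the
    follow-up visits in chronological order.
    """
    result, still_met = [], True
    for flag in flags:
        still_met = still_met and flag
        result.append(still_met)
    return result


def classify(baseline, visit_scores):
    """Classify scores ordered as end-of-study, six months, twelve months."""
    response = [is_response(baseline, s) for s in visit_scores]
    remission = [is_remission(s) for s in visit_scores]
    return {
        "response": response,
        "remission": remission,
        "sustained_response": sustained(response),
        "sustained_remission": sustained(remission),
    }


# Hypothetical participant: baseline 24, then 10, 12 and 14 points.
example = classify(24, [10, 12, 14])
print(example["response"])            # [True, True, False]
print(example["sustained_response"])  # [True, True, False]
```

A score of exactly half the baseline counts as a response here, matching the "minimum 50% reduction" wording.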
Gründer, G., Brand, M., Mertens, L. J. et al. · Lancet Psychiatry (2024)
Nayak, S., Johnson, M. W. · Pharmacopsychiatry (2020)
Aday, J. S., Horton, D. M., Fernandes-Osterhold, G. et al. · Psychopharmacology (2024)
Smith-Apeldoorn, S. Y., Veraart, J. K. E., Spijker, J. et al. · Lancet Psychiatry (2022)
Cavarra, M., Falzone, A., Ramaekers, J. G. et al. · Frontiers in Psychology (2022)
Mccrone, P., Fisher, H., Knight, C. et al. · Psychological Medicine (2023)
Gründer, G., Mertens, L. J., Spangemacher, M. et al. · European Neuropsychopharmacology (2026)
Mueller, F., Hawrot, T., Schmid, Y. · Neuroscience Applied (2025)
Becker, A. M., Holze, F., Grandinett, T. et al. · Clinical Pharmacology and Therapeutics (2021)
Goodwin, G. M., Croal, M., Feifel, D. et al. · Neuropharmacology (2023)
Carbonaro, T. M., Bradstreet, M. P., Barrett, F. S. et al. · Journal of Psychopharmacology (2016)
Wolff, M., Evens, R., Mertens, L. J. et al. · Frontiers in Psychiatry (2020)
Stocker, K., Hartmann, M., Barrett, F. S. et al. · Religion, Brain & Behavior (2026)