This double-blind randomised controlled trial (n=120) in healthy volunteers assessed how well blinding held when people received psilocybin, MDMA, or methylphenidate as an active placebo, and found blinding was generally insufficient. Functional unblinding was highest for psilocybin, moderate for MDMA, and lowest for methylphenidate.
Maintaining effective blinding is a major methodological challenge in psychedelic research. This study provides a comprehensive evaluation of blinding integrity in 120 healthy volunteers who received either psilocybin, MDMA, or methylphenidate (active placebo) in a double-blind, randomized controlled trial. Using a multi-level assessment incorporating forced-choice substance guesses, certainty ratings, decision factors, and subjective substance effects, the analyses characterize blinding integrity and its relation to the substance experience. Results indicate that overall blinding was insufficient, with psilocybin showing the highest rates of functional unblinding, MDMA moderate levels, and methylphenidate the lowest. As an active placebo, methylphenidate provided more effective blinding for MDMA than for psilocybin. Incorporating certainty levels of substance guesses revealed a more differentiated pattern, with lower functional unblinding rates. Decision factors and subjective substance experiences were associated with phenomenological substance effects. Prior substance experiences did not influence accuracy of forced-choice substance guesses. These findings provide empirical guidance for the design and reporting of blinding procedures in psychedelic trials and underscore the value of systematic, multi-level assessment of blinding integrity.
Belinger and colleagues frame the study around a central methodological problem in psychedelic research: the strong and distinctive acute effects of substances such as psilocybin and MDMA can make it easy for participants and investigators to infer treatment allocation, undermining blinding. The authors note that weak blinding can interact with expectancy effects and potentially inflate apparent treatment effects, and that previous psychedelic studies have often assessed blinding inadequately or found it to be unsuccessful. They also highlight uncertainty about whether active placebos improve blinding and how blinding differs across substances with different phenomenological profiles. The paper therefore aims to assess blinding integrity in a randomised, double-blind, placebo-controlled trial comparing psilocybin, MDMA, and methylphenidate in healthy volunteers. The authors set out to use a multi-level approach, combining forced-choice guesses, certainty ratings, decision factors, and subjective effect measures, to characterise not only whether blinding failed but also how and why participants and investigators arrived at their guesses.
The study was conducted at the University Hospital of Psychiatry Zurich as a randomised, double-blind, placebo-controlled, parallel-group trial. Data were collected between October 2023 and September 2025. Ethical approval and regulatory authorisation were obtained, and all participants gave written informed consent. Eligible participants were German-speaking, medically and psychologically healthy adults aged 18-40 years with no more than 10 lifetime experiences of psychedelics, MDMA, or methylphenidate. Recruitment took place through online advertisements, mailing lists, flyers, and word of mouth. Of 173 enrolled participants, 120 were included in the final analysis, with 40 participants per group. Participants were randomised in a 1:1:1 ratio to receive psilocybin, MDMA, or methylphenidate, with the pharmacy handling allocation. Psilocybin and MDMA were the experimental substances, while methylphenidate served as an active placebo because it can produce overlapping stimulant and cardiovascular effects without being expected to match the primary psychedelic effects. The doses were 15 mg psilocybin, 100 mg MDMA, and 60 mg methylphenidate. Participants received standardised briefing about the expected effects of all three substances before administration, were supervised for at least eight hours afterwards, and underwent monitoring of acute effects and blood pressure. Blinding integrity was assessed after acute effects subsided. Blinding was evaluated using three main elements: certainty ratings on visual analogue scales for each substance, forced-choice substance guesses, and multiple-choice questions about the factors that informed those guesses. Investigators completed the same certainty and forced-choice assessments. Subjective experience was measured retrospectively using the 5-Dimensional Altered States of Consciousness Questionnaire (5D-ASC), which yields dimension, subscale, and overall intensity scores. 
The analysis used chi-squared tests or Fisher’s exact tests to examine whether prior substance use was associated with correct guessing. Certainty ratings were compared with one-way ANOVA and Tukey post-hoc tests. Forced-choice data were summarised in 3 × 3 confusion matrices and analysed with Bang’s Blinding Index, including a weighted version that incorporated certainty of correct guesses. Binomial tests and chi-square tests examined whether guesses differed from chance. The relationship between subjective experience and blinding was assessed with ANOVA models, including analyses of overall intensity by received substance and guessed substance, with post-hoc comparisons adjusted using Bonferroni correction.
Participants had a mean age of 27.6 years, 50.8% were female, and the sample was predominantly White/Caucasian and highly educated. Prior substance use was reported by 30.8% for psychedelics, 24.2% for MDMA, and 15.8% for methylphenidate. Prior substance use was generally not associated with correct forced-choice guesses, and the one exception reported for prior psychedelic use and methylphenidate allocation did not remain significant after correction for multiple comparisons. Certainty ratings were highest for the substance actually received. For participants, mean certainty for the correct substance was 82.8% in the psilocybin group, 60.5% in the MDMA group, and 50.6% in the methylphenidate group. Investigators showed a similar pattern, with mean certainty of 84.6% for psilocybin, 58.3% for MDMA, and 58.2% for methylphenidate. Certainty differed significantly by substance group for both participants and investigators. Forced-choice accuracy was 69.2% for participants and 70.8% for investigators overall. Accuracy was highest for psilocybin, at 92.5% for participants and 95.0% for investigators, and lower for MDMA and methylphenidate. The authors report that blinding was not fully successful, with Bang’s Blinding Index indicating strongest functional unblinding for psilocybin, intermediate unblinding for MDMA, and the lowest unblinding for methylphenidate. For participants, the BI values were 0.89 for psilocybin, 0.48 for MDMA, and 0.25 for methylphenidate; weighted values were slightly lower at 0.77, 0.40, and 0.20. For investigators, the corresponding BI values were 0.92, 0.36, and 0.40, with weighted values of 0.81, 0.30, and 0.34. Binomial and chi-square tests indicated that guesses were not independent of actual allocation. Decision factors varied by substance and by guessed allocation. For psilocybin, sensory perceptions were the most common basis for guessing. For MDMA, feelings and mood were most prominent. 
For methylphenidate, alertness featured more often, but responses were more heterogeneous. The authors interpret this as showing that subjective phenomenology influenced substance identification. Subjective effect intensity differed strongly by substance received. Psilocybin produced higher overall intensity than MDMA, which in turn was higher than methylphenidate. When guessed substance was considered, overall intensity was also higher when psilocybin or MDMA was suspected than when methylphenidate was suspected. There was no significant interaction between received and guessed substance. The authors note that participants allocated to methylphenidate who guessed psilocybin reported especially strong experiences, similar to those in the psilocybin group, which they present as a pattern contributing to misclassification.
Belinger and colleagues interpret the findings as showing that blinding in this trial was limited, but not uniformly so across substances. Psilocybin was associated with the greatest functional unblinding and highest certainty, MDMA showed intermediate values, and methylphenidate performed best as an active placebo overall. The authors argue that forced-choice accuracy alone can overstate the degree of unblinding, because certainty ratings reveal substantial heterogeneity: many correct guesses were made with only moderate or low confidence. They therefore conclude that blinding integrity should be assessed with multiple measures rather than a single guess-based outcome. The authors link blinding failure primarily to the phenomenological strength and distinctiveness of subjective drug effects. Psilocybin produced the most intense subjective experience, and participants’ reported decision factors suggest that sensory alterations, mood changes, and alertness guided their guesses in ways that matched known effects of the substances. They emphasise that MDMA and methylphenidate are more similar in some externally observable effects than in internal subjective experience, which may help explain differences in misclassification patterns between participants and investigators. They also suggest that some participants showed particularly strong placebo-like responses, especially when methylphenidate was mistaken for psilocybin. In relation to earlier research, the authors state that their accuracy rates were lower than in some previous psychedelic studies, which they interpret as indicating somewhat better blinding integrity than has often been reported. They also note that prior substance experience did not appear to meaningfully affect blinding, suggesting that both substance-naïve and previously exposed participants may be included in future work without major blinding penalties. The authors acknowledge several limitations and uncertainties. 
Blinding was measured only on the day of substance administration, so guesses may have changed later. They did not assess expectancy directly, so they cannot determine how expectancy interacted with compromised blinding or outcome interpretation. The sample was relatively homogeneous, consisting of healthy volunteers, which limits generalisability to clinical populations. They also caution that their conclusions about the suitability of methylphenidate as an active placebo are context-specific and should not be taken to apply equally across substances or trial designs. For future research and trial design, the authors recommend routine inclusion of forced-choice guesses, certainty ratings, decision factors, and expectancy measures to provide a fuller picture of blinding integrity. They suggest that methylphenidate may be a more suitable active control for MDMA than for psilocybin, while noting that no placebo strategy fully solves the blinding problem. They also discuss the possibility of multi-arm or multi-dose designs, more objective outcome measures, and alternative analytic approaches to reduce the impact of functional unblinding.
The authors conclude that multi-level assessment is essential for evaluating blinding in psychedelic trials. They state that methylphenidate appears to function reasonably well as an active placebo for MDMA, but not for psilocybin, and that no currently available control strategy fully preserves blinding when strong subjective drug effects are present. They recommend more comprehensive blinding and expectancy assessment in future psychedelic research.
The study was conducted at the University Hospital of Psychiatry Zurich, Switzerland, with a randomized, double-blind, placebo-controlled, parallel-group design. Data were collected between October 2023 and September 2025. The study followed the Revised Declaration of Helsinki and the International Council for Harmonization Good Clinical Practice guidelines. Ethical approval was granted by the Cantonal Ethics Committee in Zurich, Switzerland, and authorization by the Federal Office for Public Health (BASEC registration: 2022-02009; clinicaltrials.gov registration: NCT06081179). All participants provided written informed consent prior to participation.
German-speaking, medically and psychologically healthy participants aged 18-40 with no more than 10 lifetime experiences of psychedelics, MDMA, or methylphenidate were included (see Supplement S1 for full eligibility criteria). Participants were recruited via online advertisements, mailing lists, flyers, and word of mouth. Of 173 enrolled participants, the final sample included 120 participants, with 40 per substance group used for analyses (see Supplement S2 for the CONSORT flow diagram). Demographic information and lifetime substance use are reported in Table.
To investigate blinding integrity, we focus exclusively on the data from the substance administration visit. However, the entire study process will be explained to provide the full study context. Following a preliminary phone screening, eligible individuals attended the on-site screening visit (t0 -10d (± 7d)). After providing written informed consent, participants underwent eligibility assessment, including a physical examination and psychological evaluation, and baseline measures of prosocial behavior. Participants were then randomly assigned to receive either psilocybin, MDMA, or methylphenidate. At the first study visit, all participants received a standardized briefing on the expected effects of all three substances by trained study personnel, regardless of substance assignment, prior knowledge, or experience. Additional individual information was provided as needed. Substance administration took place during the second study visit (t0) in a living room-like setting within the research facilities of the University Hospital of Psychiatry Zurich. Participants were continuously supervised by two study team members for at least eight hours post-administration or until psychoactive effects had subsided. Participants were encouraged to focus inward and could structure their experience freely, except for using electronic devices. Acute substance effects were assessed every 1.5 h and blood pressure was monitored hourly. Retrospective effects and blinding integrity were assessed after psychoactive effects had subsided. Four weeks post-administration, participants returned for follow-up measures of prosocial behavior and sustained substance effects (t0 + 4 w (± 3d)), with longer-term effects captured via the online follow-up (t0 + 16 w (± 7d); results reported separately).
Psilocybin and MDMA were selected due to their investigation for psychiatric indications. Methylphenidate was chosen as an active placebo because it produces acute psychoactive and especially cardiovascular effects that partially overlap with those of psilocybin and MDMA and has a comparable duration of subjective effects. Psilocybin, a classic serotonergic psychedelic, primarily acts via the serotonin (5-HT) system as an agonist at 5-HT2A receptors. It induces altered states of consciousness characterized by changes in emotional and cognitive processing, self-awareness, and perception. MDMA, an entactogen, promotes the release of serotonin, norepinephrine, and dopamine. It elicits experiences of emotional openness, heightened mood, euphoria, empathy, sociability, and extroversion, typically without strong perceptual alterations. Methylphenidate, a stimulant used in the treatment of Attention-Deficit/Hyperactivity Disorder (ADHD), is a norepinephrine-dopamine re-uptake inhibitor. In healthy individuals, methylphenidate produces psychostimulant effects, including enhanced activity, concentration, cognitive performance, and cardiovascular activation. Participants received either 15 mg psilocybin, 100 mg MDMA, or 60 mg methylphenidate, doses previously shown to produce well-controlled subjective effects with good tolerability and safety. Using these doses, intensity of effects is suggested to be comparable across substances, but their qualitative effects differ. For instance, a study comparing 125 mg MDMA with 60 mg methylphenidate found similar cardiovascular effects, while MDMA induced stronger empathogenic and psychedelic effects (5D-ASC) and methylphenidate primarily enhanced concentration and activity. The duration of subjective effects is expected to be similar (around 5.5 h after 15 mg psilocybin, 4-6 h after 100 mg MDMA, and 6-8 h after 60 mg methylphenidate).
Randomization was conducted by the pharmacy providing the study substances (Pharmacy Dr. Hysek AG, Biel, Switzerland) with a 1:1:1 allocation to psilocybin, MDMA, or methylphenidate, without further stratification. Investigators and participants were blinded to the substance allocation. Blinding integrity was assessed on the day of substance administration, after subjective effects had subsided. First, participants rated their certainty of having received each substance on a visual analog scale (0-100% for each substance). Second, they made a forced-choice substance guess for one of the three substances. Lastly, they completed multiple-choice questions about decision factors influencing their forced-choice. Investigators answered the same certainty rating and forced-choice questions in close temporal proximity to the substance session (see Supplement S3 for full questionnaires).
The 5-Dimensional Altered States of Consciousness Questionnaire (5D-ASC) was used to retrospectively assess subjective substance experiences. It contains 94 visual analog items (0-100) and yields five dimensions, eleven subscales, and an overall intensity score across all items. The 5D-ASC was assessed on the day of substance administration, after psychoactive effects had subsided.
All statistical analyses and visualizations were conducted in R (version 4.5.2). Statistical significance was set at α = 0.05, two-tailed. For each substance group, Chi-squared tests or Fisher's exact tests (for small cell counts) were used to evaluate the independence between binary prior substance use (yes/no) and participants' correct forced-choice substance guesses. Analyses considered prior use of each substance group separately (psychedelics, MDMA, methylphenidate) and combined.
Participants and investigators indicated their certainty that the administered substance was psilocybin, MDMA, or methylphenidate on a visual analog scale ranging from 0 to 100%. Differences in mean certainty between the three substance groups were tested using a one-way ANOVA with Tukey's Honestly Significant Difference (HSD) post-hoc comparisons.
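As an illustration of the one-way ANOVA described above, the following minimal Python sketch computes the F statistic and its degrees of freedom from first principles. The function name `one_way_anova_f` and the toy ratings are hypothetical and are not taken from the study, whose analyses were run in R. The reported mean differences suggest that the dependent variable was the certainty assigned to the substance actually received, but that reading is an inference and is not encoded here.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of ratings per substance group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy certainty ratings (0-100), three small groups.
toy_groups = [[80, 90, 85, 70], [60, 55, 65, 50], [45, 55, 40, 60]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three groups of 40 participants, the same function yields degrees of freedom of 2 and 117, matching the form F(2, 117) used in the results. Tukey's HSD post-hoc comparisons are not covered by this sketch.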
After rating their certainty, participants and investigators were asked to select one substance (psilocybin, MDMA, or methylphenidate) as their best guess in a forced-choice format. For subsequent analyses, forced-choice data were categorized by substance allocation (received substance) and forced-choice substance choice (guessed substance), generating a 3 × 3 confusion matrix.
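The cross-tabulation of received and guessed substance can be sketched in a few lines of Python. This is only an illustration of the data structure: the names `SUBSTANCES`, `confusion_matrix`, and `correct_rates` are hypothetical, and the toy pairs are invented rather than drawn from the study's responses.

```python
from collections import Counter

SUBSTANCES = ("psilocybin", "MDMA", "methylphenidate")


def confusion_matrix(pairs):
    """Count (received, guessed) pairs; rows = received, columns = guessed."""
    counts = Counter(pairs)
    return [[counts[(r, g)] for g in SUBSTANCES] for r in SUBSTANCES]


def correct_rates(matrix):
    """Share of correct guesses per received substance (diagonal / row total)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical toy data, not the study's responses.
toy_pairs = [
    ("psilocybin", "psilocybin"),
    ("psilocybin", "psilocybin"),
    ("psilocybin", "MDMA"),
    ("MDMA", "MDMA"),
    ("MDMA", "methylphenidate"),
    ("methylphenidate", "methylphenidate"),
    ("methylphenidate", "psilocybin"),
]
toy_matrix = confusion_matrix(toy_pairs)
for name, row in zip(SUBSTANCES, toy_matrix):
    print(f"{name:>16}: {row}")
```

Correct guesses sit on the diagonal, so per-group accuracy is each diagonal cell divided by its row total, and every off-diagonal cell is a specific misclassification.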
Blinding integrity for participants and investigators was quantified with the forced-choice substance guesses using Bang's Blinding Index (BI) and weighted Bang's BI, with 0 indicating perfect blinding and ±1 indicating complete lack of blinding. The weighted Bang's BI incorporates mean certainty of correct substance guesses, thereby capturing both accuracy and confidence of guesses. For each substance group, binomial tests compared proportions of correct substance guesses with the theoretical chance level (p = 1/3). A Chi-square test of independence was applied to assess blinding across all substances.
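The text does not spell out the formula behind the index. As a hedged sketch, one chance-corrected form for a forced choice among k options, (p_correct - 1/k) / (1 - 1/k), reproduces the reported participant values to rounding when k = 3 and the correct counts are derived from the reported percentages (37, 26, and 20 of 40). This is an assumption about the computation, not the authors' stated method; note that with three options this form bottoms out at -0.5 rather than -1. The weighting function is likewise only one possible reading of "weighted by mean certainty" and has not been checked against the paper's table. Both function names are hypothetical.

```python
def chance_corrected_bi(n_correct, n_total, n_options=3):
    """Chance-corrected agreement for one arm of a forced-choice guess.

    Assumed form: (p_correct - 1/k) / (1 - 1/k). Returns 0 at chance level
    and 1 when every guess is correct.
    """
    chance = 1 / n_options
    p_correct = n_correct / n_total
    return (p_correct - chance) / (1 - chance)


def certainty_weighted_bi(bi, mean_certainty_pct):
    """Illustrative weighting (assumption): scale the index by the mean
    certainty (0-100) of correct guesses."""
    return bi * mean_certainty_pct / 100


# Correct counts derived from the reported participant accuracies
# (92.5%, 65.0%, 50.0% of 40 per group).
participant_correct = {"psilocybin": 37, "MDMA": 26, "methylphenidate": 20}
for substance, n_correct in participant_correct.items():
    print(substance, chance_corrected_bi(n_correct, 40))
```

Under this assumed form, an arm guessing at chance (one third correct) scores 0 and an arm with every guess correct scores 1.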
To indicate which drug effects informed their forced-choice guess, participants selected one or more factors from a predefined list (Sensory Perceptions, Feelings/Mood, Cognition, Alertness, Bodily Sensations, Other). Selected factors were reported as percentages for each cell of the confusion matrix.
Subjective effects (5D-ASC) were analyzed separately for each substance group and in combination with substance guesses, respectively. Overall intensity was compared between substance groups using a one-way ANOVA with Tukey's HSD post-hoc tests. We conducted a two-way ANOVA with received substance, guessed substance, and their interaction as factors, and overall intensity as the dependent variable. Post-hoc pairwise comparisons were conducted using estimated marginal means (EMMs) and Bonferroni adjustment for multiple comparisons.
Participants had a mean age of 27.6 years (SD = 5.4) and a balanced gender distribution (50.8% female, 48.3% male, 1 diverse). The sample was predominantly White / Caucasian (95.0%) and highly educated (92.5%). 30.8% of participants reported prior use of psychedelics, 24.2% had previously used MDMA, and 15.8% had used methylphenidate. Most participants had used those substances less than four times and everyone less than ten times (see Table). Prior substance use was not associated with correct substance guesses in our study (psychedelics, MDMA, methylphenidate separately, and combined; p > 0.05). The exception was participants with prior psychedelic use showing a reduced likelihood of correctly identifying their allocation to methylphenidate (p = 0.022; OR = 0.18; 95% CI [0.03, 0.82]). However, this effect did not survive correction for multiple comparisons.
Descriptively, participants' and investigators' certainty ratings were highest for the substance that participants actually received (see Fig.). Among participants who received psilocybin, mean certainty reached 82.8% for psilocybin, but was lower for MDMA (27.6%) and methylphenidate (8.5%). A similar pattern appeared in the MDMA group: certainty was 60.5% for MDMA, compared with 43.5% for psilocybin and 12.2% for methylphenidate. For those who received methylphenidate, certainty was highest for methylphenidate (50.6%) and lower for MDMA (39.2%) and psilocybin (20.7%). A similar pattern was observed in investigators' ratings. When participants received psilocybin, investigators reported high certainty for psilocybin (84.6%), with much lower certainty for MDMA (14.0%) and methylphenidate (8.0%). For the MDMA group, investigators' certainty was highest for MDMA (58.3%), compared with 30.1% for psilocybin and 20.9% for methylphenidate. For the methylphenidate group, investigators were most confident when selecting methylphenidate (58.2%) but showed lower certainty for MDMA (34.2%) and psilocybin (11.9%). We found a significant main effect of the substance group on certainty ratings for both participants (F(2, 117) = 10.49, p < 0.001) and investigators (F(2, 117) = 9.93, p < 0.001), indicating that certainty differed significantly between allocated substances. Specifically, participants and investigators were significantly more confident in choosing psilocybin compared to MDMA (participants: mean difference = -22.30, 95% CI [-39.39, -5.21], p < 0.01; investigators: mean difference = -26.35, 95% CI, p < 0.001), and methylphenidate (participants: mean difference = -32.18, 95% CI.09], p < 0.001; investigators: mean difference = -26.45, 95% CI.21], p < 0.001), with no significant difference between MDMA and methylphenidate (participants and investigators: p > 0.05).
For the forced-choice substance guesses, participants showed an overall accuracy of 69.2%, while investigators correctly guessed substance allocation in 70.8% of cases. Accuracy was highest for the psilocybin group (participants: 92.5%; investigators: 95.0%), followed by MDMA (participants: 65.0%; investigators: 57.5%) and methylphenidate (participants: 50.0%; investigators: 60.0%), as displayed in Fig. Misclassifications are shown adjacent to the diagonal, with rare or absent cases reflected by sample sizes. Although blinding was limited, participants' high uncertainty levels indicate that blinding was stronger than forced-choice substance guesses alone suggest.
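The overall accuracies follow directly from the per-group figures, and each group can be compared against the chance level of 1/3 with an exact binomial tail probability. The sketch below derives correct counts from the reported percentages (for example, 92.5% of 40 is 37) and uses a one-sided upper tail. Whether the study's binomial tests were one- or two-sided is not stated, so the sidedness here is an assumption, and the function name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(n_correct, n_total, p_chance=1 / 3):
    """Exact P(X >= n_correct) for X ~ Binomial(n_total, p_chance)."""
    return sum(
        comb(n_total, k) * p_chance**k * (1 - p_chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Correct guesses per group of 40, derived from the reported percentages.
correct = {
    "participants": {"psilocybin": 37, "MDMA": 26, "methylphenidate": 20},
    "investigators": {"psilocybin": 38, "MDMA": 23, "methylphenidate": 24},
}

overall = {rater: sum(c.values()) / 120 for rater, c in correct.items()}
p_values = {
    rater: {s: binomial_upper_tail(n, 40) for s, n in c.items()}
    for rater, c in correct.items()
}
print(overall)
print(p_values)
```

The derived counts sum to 83 and 85 correct guesses out of 120, which agrees with the reported overall accuracies of 69.2% and 70.8%.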
Blinding was not fully successful, with some substances showing better blinding integrity than others. For participants, BI indicated strongest functional unblinding for psilocybin (BI = 0.89), moderate functional unblinding for MDMA (BI = 0.48), and lowest functional unblinding for methylphenidate (BI = 0.25). When weighted by mean certainty of correct substance guesses, indices decreased slightly to 0.77, 0.40, and 0.20, respectively (see Fig. and Table). For investigators, a similar pattern emerged, with BI = 0.92 for psilocybin, BI = 0.36 for MDMA, and BI = 0.40 for methylphenidate, and corresponding weighted values of BI = 0.81, 0.30, and 0.34. Binomial tests conducted within each substance group (see Table) and a Chi-square test across groups were significant for both participants and investigators (participants: χ²(4, N = 120) = 84.12, p < 0.001; investigators: χ²(4, N = 120) = 86.99, p < 0.001), indicating that substance guesses were not independent of substance allocation.

Fig. Individual and mean certainty levels for each substance. The figure presents individual certainty levels in % (y-axis) as jittered points, grouped by substance allocation (columns) and the corresponding substance which was rated for certainty on the x-axis, separately for participants (A) and investigators (B). Certainty ratings were done for each substance (0-100%), independent of forced-choice substance guess. Colored dots indicate the substances which have been rated for certainty: orange for psilocybin, green for MDMA, and blue for methylphenidate, with black diamonds representing group means.
Participants who received psilocybin most frequently based their guesses on sensory perceptions, irrespective of which substance they believed they had received (32% for guessed psilocybin, 50% for guessed MDMA, and 50% for guessed methylphenidate), as illustrated in Fig. Among those who received MDMA, feelings and mood were the predominant decision factors (30% for guessed psilocybin, and 35% for guessed MDMA). For methylphenidate, responses were more heterogeneous, with state of alertness appearing more frequently than in the other groups (24% for guessed psilocybin, 12% for guessed MDMA, and 19% for guessed methylphenidate). Participants guessing psilocybin based their decision on sensory perceptions when they received psilocybin (32%) or methylphenidate (29%), but on feelings and mood when they received MDMA (30%). Those who guessed MDMA most often reported feelings and mood when the received substance was MDMA (35%) or methylphenidate (31%), but sensory perceptions when it was psilocybin (50%). No consistent dominant factor emerged among participants guessing methylphenidate. Factor selection did not specify whether participants referred to its presence or absence. These results indicate that the subjective substance effects influenced the factors guiding participants' forced-choice guesses. Psilocybin and MDMA induced rather robust subjective effects facilitating identification, whereas the more variable and susceptible effects of methylphenidate increased misclassifications.
We found a significant effect of substance allocation on overall intensity (F(2, 117) = 52.15, p < 0.001), with significantly higher intensities for psilocybin compared to MDMA (mean difference = -14.09, 95% CI, p < 0.001) and methylphenidate (mean difference = -25.29, 95% CI, p < 0.001), as well as for MDMA compared to methylphenidate (mean difference = -11.19, 95% CI, p < 0.001; see Fig.). The analyses incorporating forced-choice substance guesses revealed significant main effects of the received substance (F(2, 112) = 59.79, p < 0.001) and the guessed substance (F(2, 112) = 7.42, p < 0.001) on overall intensity, with no significant interaction (p > 0.05). A simplified model including only the main effects confirmed significant effects of the received substance (F(2, 115) = 57.64, p < 0.001) and the guessed substance (F(2, 115) = 7.15, p < 0.01). Pairwise comparisons for received substance revealed higher overall intensity after psilocybin compared to MDMA (t(115) = 3.98, p < 0.001) and methylphenidate (t(115) = 5.31, p < 0.001), with no difference between MDMA and methylphenidate (p > 0.05). For guessed substance, overall intensity was significantly higher when psilocybin (t(115) = 3.78, p < 0.001) or MDMA (t(115) = 2.52, p < 0.05) was suspected compared to methylphenidate, with no significant difference between psilocybin and MDMA guesses (p > 0.05). Fig. visualizes the associations between substance guesses and subjective experience. In general, the reported subjective effects reflected known, characteristic phenomenological substance effects, which usually influenced participants' forced-choice substance guesses (see Fig.). Participants who received methylphenidate but suspected psilocybin or MDMA reported subjective effects more similar to those typically associated with the guessed substance, likely contributing to their misclassification.
Specifically, the six participants allocated to methylphenidate but guessing psilocybin experienced high overall intensity (see Fig.), comparable to the participants allocated to the psilocybin group. Therefore, participants were more likely to misclassify a substance when their subjective experience did not align with the expected drug effects.
Although overall blinding was limited, accuracy rates were lower than reported in previous studies, suggesting relatively better blinding integrity. Certainty ratings, forced-choice data, and blinding indices revealed a similar pattern: psilocybin was associated with the highest functional unblinding and greatest certainty, MDMA showed intermediate values, and methylphenidate the lowest functional unblinding. Accordingly, functional unblinding was largely driven by psilocybin, which potentially inflated overall accuracy rates and complicated pooled analyses. This pattern likely reflects the unique phenomenology of psilocybin compared with MDMA and methylphenidate. In line with this, psilocybin induced the strongest overall subjective experience. Interestingly, blinding integrity was comparable between substance-naïve participants and those with previous experience, indicating that both groups can be included in future trials without compromising blinding.
Table. Blinding indices and statistical analyses of blinding integrity based on forced-choice substance guesses, reported separately for participants and investigators. Analyses include the total sample size for each substance group (n total); the number of correct and incorrect forced-choice substance guesses (n correct and n incorrect, respectively); the percentage of correct forced-choice substance guesses (correct rate); Bang's Blinding Index values; the mean certainty of correct forced-choice substance guesses (mean certainty); Bang BI values weighted by mean certainty (weighted Bang BI); and full results from binomial tests (binomial test).

Furthermore, our findings highlight that assessing blinding integrity requires more than a forced-choice substance guess. The certainty with which participants and investigators made substance guesses provides a more nuanced indication of how well blinding was maintained. Although 92.5% of participants correctly identified psilocybin, their certainty ratings showed substantial heterogeneity. Mean certainty ratings for psilocybin were 82.8%. At the same time, participants reported mean certainty levels of 27.6% and 8.5% for MDMA and methylphenidate, respectively, despite only three misclassifications in the forced-choice condition. Individual variability was even more pronounced, with a considerable proportion of participants indicating >50% certainty for receiving MDMA or methylphenidate when they were actually allocated to psilocybin. The observed heterogeneity in certainty ratings emerged across all substances and investigators. Comparison of the standard and weighted Bang BI further demonstrates the importance of incorporating certainty ratings: the unweighted index overestimates functional unblinding relative to the certainty-weighted measure.
A full characterization of blinding integrity requires the incorporation of certainty ratings, as interpretation based solely on forced-choice substance guesses may overestimate the lack of blinding. Both participants and investigators misclassified MDMA and methylphenidate more often than psilocybin, with investigators doing so slightly more often. Investigators base their guesses largely on what participants report, alongside observable behavior and clinical impressions, whereas participants rely on their own direct internal perceptual, cognitive, and affective experiences when identifying the substance. While MDMA and methylphenidate share externally observable features, such as stimulation and increased cardiovascular activation, internally, MDMA produces heightened emotionality, feelings of connectedness, and intensified sensory perception, resembling psilocybin more closely than methylphenidate. This contrast between external similarity and internal divergence may explain differences in how participants and investigators misclassified these substances.
Participants' substance guesses, and thus blinding integrity, were influenced by phenomenological substance effects, captured here by subjective effects (5D-ASC) and decision factors. Psilocybin induced more intense subjective effects than both MDMA and methylphenidate, while MDMA elicited stronger effects than methylphenidate. Participants experiencing more intense subjective effects were more prone to guessing psilocybin or MDMA than methylphenidate. The influence of the subjective experience on forced-choice substance guesses was most pronounced when participants received methylphenidate but suspected psilocybin: they reported effects stronger than in the MDMA group and comparable to psilocybin, substantially stronger than normally expected under methylphenidate. This pattern suggests that some participants exhibited pronounced placebo-like effects, which contributed to misclassifications in their forced-choice substance guesses. Decision factors were mainly based on subjective effects: experiencing sensory alterations predominantly led to guessing psilocybin, affective changes were most influential for MDMA, and heightened alertness for methylphenidate. Within the psilocybin and MDMA groups, these patterns remained largely consistent even if participants suspected a different substance. In contrast, decision factors in the methylphenidate group varied with the guessed substance: sensory alterations were the main drivers when participants guessed psilocybin, while changes in feelings and mood dominated when they guessed MDMA. Psilocybin (15 mg) and MDMA (100 mg) seem to induce robust, substance-specific effects that compromise blinding integrity, whereas methylphenidate (60 mg) functions as an active placebo, as its subjective effects were more variable and susceptible. Overall, these findings indicate that the intensity and distinctiveness of subjective drug experiences are central drivers of functional unblinding.
Accordingly, substances with pronounced and distinctive effects are more likely to lead to functional unblinding. At the same time, subjective substance effects have been associated with stronger therapeutic outcomes in some substance-assisted interventions (Yaden and Griffiths, 2020), whereas other studies report no such associations. Although the extent to which these effects are necessary for clinical efficacy remains debated, this raises an important consideration for future trials: how to balance the maintenance of effective blinding with the preservation of acute subjective effects, should these prove to be relevant for therapeutic efficacy.
Fig. Each cell shows the proportion of selected decision factors (relative to all selected factors, not participants) for each combination of received (y-axis) and guessed (x-axis) substance, displayed as pie charts with percentages. Cells on the diagonal represent correct substance guesses and are highlighted with a grey background. Misclassifications are positioned adjacent to the diagonal. The sample size (n) for each cell is indicated.
Our findings highlight the importance of multi-level assessments of blinding integrity and the influence of subjective substance effects on substance decision-making, which ultimately determines blinding success. Methylphenidate showed differential blinding performance across substances, with better blinding observed for MDMA than for psilocybin, as indicated by blinding indices and misclassification rates. This suggests that methylphenidate may be more suitable as an active control in MDMA studies, potentially due to its nonspecific psychoactive effects. Active placebos are generally considered more appropriate than inert placebos, and our findings support the use of methylphenidate as an active placebo in the context of blinding in MDMA trials. However, the inclusion of more than two experimental conditions in our design may have further enhanced blinding, and direct conclusions based solely on the comparison between methylphenidate and MDMA cannot be drawn. Multi-arm and multi-dose trial designs may offer advantages in future studies, although these potential benefits require further systematic evaluation. Overall, these control strategies do not achieve fully effective blinding. Nevertheless, they appear reasonably effective in the absence of a perfect solution. In contrast, methylphenidate did not yield favorable blinding outcomes for psilocybin. Similarly, commonly used active placebos such as niacin or diphenhydramine have not demonstrated blinding success.
Fig. Association between substance guesses and subjective substance experience. Panel 1) shows 5D-ASC scores for the substance groups. Panel 2) contains 5D-ASC scores grouped by received and guessed substances for A) the five dimensions, B) the eleven subscales, and C) overall intensity. Facets in 2A-C correspond to guessed substances, and colors within the plots indicate substance allocation: orange for psilocybin, green for MDMA, and blue for methylphenidate. Panel C includes sample size (n) indications, which also apply to the other panels in 2).
Low-dose psychedelics represent an alternative; however, blinding remains insufficient, and dose selection requires a careful balance between noticeable subjective effects and minimal therapeutic benefits. Identifying suitable placebo conditions for classic psychedelics such as psilocybin remains an important priority for future research. Furthermore, additional studies are necessary to determine whether doses higher or lower than those administered in the present study affect blinding integrity differently. When placebo-controlled comparisons are not feasible and blinding consistently fails, adapted trial designs may be required to investigate and reliably evaluate psychedelics and MDMA. Possible approaches include the use of clinician-rated assessments of the primary outcome by blinded raters who were not involved in the acute substance experience, as well as increased reliance on objectively derived measures rather than subjective outcomes. Some argue that, given the frequency and extent of functional unblinding in clinical trials, psychedelic-assisted therapies should be treated as effectively open-label and compared with open-label standard treatments such as antidepressants to better account for expectancy effects related to treatment awareness. In addition, emerging analytical approaches aim to mitigate the impact of functional unblinding, including the Correct Guess Rate Curve and the use of baseline expectancy scores as statistical covariates. While these methods do not fully resolve issues associated with functional unblinding, they may help to partially control for its effects. In the present study, blinding was measured on the day of substance administration; however, participants' substance guesses may have changed over time, warranting follow-up assessments at the time of the primary outcome assessment. Explicit expectancy assessments were not included in the present study.
Consequently, no conclusions can be drawn regarding the role of expectancy in the context of compromised blinding and its potential implications for outcome interpretation. Future studies should implement expectancy measures to further clarify how expectancy may influence treatment outcomes under conditions of compromised blinding. These assessments should include both participants and investigators and employ validated measures, such as the Stanford Expectations of Treatment Scale, which captures both positive and negative expectancies. Greater heterogeneity in the study sample would enhance the generalizability of the results. Importantly, the present findings may not generalize to clinical populations. In the absence of direct comparisons between healthy and clinical samples, it remains unclear to what extent blinding success differs across these contexts. If blinding is compromised, differences in expectancy and motivation may introduce population-specific biases in outcomes. Moreover, alterations of the serotonin system associated with psychiatric disorders may influence subjective substance effects and, consequently, blinding differently than in healthy controls. While prior antidepressant use does not appear to have a clinically meaningful impact on outcomes in psilocybin-assisted therapy, evidence regarding other psychopharmacological treatments is limited. Furthermore, dosing regimens established in healthy individuals may not be directly applicable to patient populations, as patients may exhibit differential responses; for example, individuals with alcohol dependence have been shown to exhibit attenuated subjective effects relative to healthy controls. The extent to which such differences between healthy and clinical populations affect blinding integrity warrants further investigation.
In line with our results, future research should assess blinding using forced-choice substance guesses, certainty ratings, and decision factors as well as expectancy measures to allow a comprehensive evaluation and move toward best-practice standards in psychedelic research.
Back, A., Freeman-Young, T. K., Morgan, L. et al. · JAMA Network Open (2024)
Bogenschutz, M. P., Forcehimes, A. A., Pommy, J. A. et al. · Journal of Psychopharmacology (2015)
Bogenschutz, M. P., Ross, S., Bhatt, S. R. et al. · JAMA Psychiatry (2022)
Carhart-Harris, R. L., Giribaldi, B., Watts, R. et al. · New England Journal of Medicine (2021)
Garcia-Romeu, A., Griffiths, R. R., Johnson, M. W. · Current Drug Abuse Reviews (2015)
Griffiths, R. R., Johnson, M. W., Richards, W. A. et al. · Psychopharmacology (2011)
Holze, F., Becker, A. M., Kolaczynska, K. E. et al. · Clinical Pharmacology and Therapeutics (2022)
Holze, F., Gasser, P., Müller, F. et al. · Biological Psychiatry (2023)
Holze, F., Vizeli, P., Müller, F. et al. · Neuropsychopharmacology (2019)
Liechti, M. E., Baumann, C., Gamma, A. et al. · Neuropsychopharmacology (2000)
Luquiens, A., Belahda, D., Graux, C. et al. · Addiction (2025)
Marwood, L., Croal, M., Mistry, S. et al. · Journal of Psychiatric Research (2024)
Mitchell, J., Bogenschutz, M. P., Lilienstein, A. et al. · Nature Medicine (2021)
Mitchell, J., Ot’alora G, M., van der Kolk, B. et al. · Nature Medicine (2023)
Mu, F., Zaczek, H., Becker, A. M. et al. · Med (2025)
Nayak, S., Bradley, M. K., Kleykamp, B. A. et al. · Journal of Clinical Psychiatry (2023)
Nichols, D. E. · Pharmacology and Therapeutics (2004)
Olson, D. E. · ACS Pharmacology and Translational Science (2020)
Orsini, D. K., Wong, S., Di Luch, S. et al. · JAMA Psychiatry (2026)
Papaseit, E., Pérez-Mañá, C., Mateus, J. A. et al. · Neuropsychopharmacology (2016)
Preller, K. H., Vollenweider, F. X. · Behavioral Neurobiology of Psychedelic Drugs (2016)
Raison, C. L., Sanacora, G., Woolley, J. D. et al. · JAMA (2023)
Rieser, N. M., Bitar, R., Halm, S. et al. · EClinicalMedicine (2025)
Roseman, L., Nutt, D. J., Carhart-Harris, R. L. · Frontiers in Pharmacology (2018)
Ross, S., Bossis, A. P., Guss, J. et al. · Journal of Psychopharmacology (2016)
Schindowski, E. M., Jungwirth, J., Schuldt, A. et al. · EClinicalMedicine (2023)
Straumann, I., Holze, F., Becker, A. M. et al. · Neuroscience Applied (2024)
Studerus, E., Kometer, M., Hasler, F. et al. · Journal of Psychopharmacology (2010)
Szigeti, B., Heifets, B. D. · Biological Psychiatry (2024)
Szigeti, B., Nutt, D. J., Carhart-Harris, R. L. et al. · Scientific Reports (2023)
Szigeti, B., Weiss, B., Rosas, F. E. et al. · Psychological Medicine (2024)
Taylor, J. J., Szigeti, B., Silverberg, N. D. et al. · Lancet Psychiatry (2026)
Vizeli, P., Liechti, M. E. · Journal of Psychopharmacology (2017)
Vollenweider, F. X., Preller, K. H. · Nature Reviews Neuroscience (2020)
Weiss, B., Miller, J. D., Carter, N. T. et al. · Scientific Reports (2021)
Williams, Z. J., Barnett, H., Szigeti, B. · JAMA Psychiatry (2026)
Yaden, D. B., Griffiths, R. R. · ACS Pharmacology and Translational Science (2020)