Placebo

The difference between ‘placebo group’ and ‘placebo control’: a case study in psychedelic microdosing

This 2023 study uses computational modelling to illustrate how weak blinding and positive treatment expectancy can cause an uneven distribution of expectancy effects, termed 'activated expectancy bias' (AEB). The study introduces the Correct Guess Rate Curve (CGRC), a statistical tool that estimates the results of a perfectly blinded trial from data collected in an imperfectly blinded one, and re-analyses the 'self-blinding psychedelic microdose trial' dataset (n=191) to show that placebo-microdose differences are susceptible to AEB, suggesting that microdosing can be viewed as an active placebo.

Authors

  • Carhart-Harris, R. L.
  • Erritzoe, D.
  • Nutt, D. J.

Published

Scientific Reports
Individual Study

Abstract

In medical trials, ‘blinding’ ensures the equal distribution of expectancy effects between treatment arms in theory; however, blinding often fails in practice. We use computational modelling to show how weak blinding, combined with positive treatment expectancy, can lead to an uneven distribution of expectancy effects. We call this ‘activated expectancy bias’ (AEB) and show that AEB can inflate estimates of treatment effects and create false positive findings. To counteract AEB, we introduce the Correct Guess Rate Curve (CGRC), a statistical tool that can estimate the outcome of a perfectly blinded trial based on data from an imperfectly blinded trial. To demonstrate the impact of AEB and the utility of the CGRC on empirical data, we re-analyzed the ‘self-blinding psychedelic microdose trial’ dataset. Results suggest that observed placebo-microdose differences are susceptible to AEB and are at risk of being false positive findings, hence, we argue that microdosing can be understood as active placebo. These results highlight the important difference between ‘trials with a placebo-control group’, i.e., when a placebo control group is formally present, and ‘placebo-controlled trials’, where patients are genuinely blind. We also present a new blinding integrity assessment tool that is compatible with CGRC and recommend its adoption.


Research Summary of 'The difference between ‘placebo group’ and ‘placebo control’: a case study in psychedelic microdosing'

Methods

The study re-analyses data from a previously conducted "self-blinding" citizen science microdose trial using simulation and statistical methods designed to assess the impact of imperfect blinding. In the self-blinding trial, participants packaged their own microdoses and placebo capsules (non-transparent gel capsules and empty capsules, respectively), labelled them with QR codes so investigators could log ingestion without informing participants, and followed a 4-week schedule taking two microdoses per week in the active condition. For the present analysis the investigators used only data from the first week to make datapoints independent; this yielded n = 233 datapoints. The trial enrolled people who planned to microdose of their own accord; no clinical supervision or financial compensation was provided. Ethical approval and informed consent were obtained as reported.

Outcome measures were split into acute and post-acute assessments. Acute outcomes (assessed 2–6 hours after ingestion) included the Positive and Negative Affect Schedule (PANAS), a cognitive performance score (CPS) aggregated from six computerized tasks, and visual analogue scales (VAS) for mood, energy, creativity, focus and temper. Post-acute outcomes (assessed the day after ingestion) comprised the Warwick–Edinburgh Mental Wellbeing Scale (WEMWBS), the Quick Inventory of Depressive Symptomatology (QIDS), the State–Trait Anxiety Inventory (STAI-T) and the Social Connectedness Scale (SCS). For each capsule taken, participants also made a binary guess as to whether it was placebo or microdose; this correct-guess data underpins the later analyses.

Methodologically, the paper used two complementary approaches. First, a computational "activated expectancy bias" (AEB) model was simulated; this model has three binary nodes representing treatment (TRT), perceived treatment (PT) and treatment expectancy (TE), plus a continuous outcome node.
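The generative model just described can be sketched in a few lines. The structure follows the paper's description (three binary nodes plus a continuous outcome), but the function name, parameter values and effect sizes below are illustrative assumptions, not the paper's (those live in its Supplementary table 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_aeb_trial(n=200, p_cg=0.7, dte=3.0, aeb=3.0, sd=10.0):
    """Simulate one trial from the three-node AEB model (sketch).

    TRT: true treatment (0 = placebo, 1 = active), randomized 50/50.
    PT:  perceived treatment; equals TRT with probability p_cg
         (the correct guess rate), otherwise flipped.
    TE:  treatment expectancy, here activated by perceived treatment.
    OUT: outcome = natural-history noise + direct treatment effect
         + activated expectancy bias. All magnitudes are illustrative.
    """
    trt = rng.integers(0, 2, size=n)
    correct = rng.random(n) < p_cg
    pt = np.where(correct, trt, 1 - trt)
    te = pt
    out = rng.normal(0.0, sd, n) + dte * trt + aeb * te
    return trt, pt, out

trt, pt, out = simulate_aeb_trial()
# The naive treatment estimate absorbs AEB whenever p_cg > 0.5:
naive_effect = out[trt == 1].mean() - out[trt == 0].mean()
```

With `p_cg` above 0.5 the expectancy term is unevenly distributed across arms, so the naive placebo-active difference estimates `dte + aeb * (2 * p_cg - 1)` rather than `dte` alone, which is the inflation the paper calls AEB.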
The simulations explored four configurations in which the direct treatment effect (DTE) and AEB pathways were present or absent. Treatment effects in both simulated and empirical data were estimated with an outcome ~ treatment linear model implemented via the nlme package in R. Second, the authors developed the Correct Guess Rate Curve (CGRC) adjustment, a novel resampling technique intended to estimate what trial results would look like under perfect blinding, using data from an imperfectly blinded trial. Empirical scores were stratified into the four treatment/guess combinations (PL/PL, AC/PL, PL/AC, AC/AC); kernel density estimation (KDE, implemented with scikit-learn in Python) modelled each stratum, and random samples were drawn to construct pseudo-datasets with a target correct-guess rate (for example CGR = 0.5 to mimic perfect blinding). The resampling was repeated 100 times, and the mean treatment estimate and p-value across resamples were reported as the CGR-adjusted result. The authors note that their CGR method produces pseudo-experimental, not truly randomised, data, and they discuss potential dependence and duplication issues arising from resampling.
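The resampling procedure above might be sketched as follows. This is a minimal reconstruction assuming per-stratum Gaussian KDEs with scikit-learn defaults (as the paper reports using defaults); the function name and interface are mine, not the authors' code:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def cgr_adjusted_dataset(scores, treatment, guess, target_cgr=0.5, seed=0):
    """Build one pseudo-dataset with a target correct guess rate (CGR).

    scores    : 1-D outcome values
    treatment : 0/1 array (placebo/active)
    guess     : 0/1 array (guessed placebo/active)

    A KDE (scikit-learn defaults) is fitted to each of the four
    treatment/guess strata; samples are drawn so that the combined
    pseudo-dataset has CGR = target_cgr (0.5 mimics perfect blinding).
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, float)
    treatment, guess = np.asarray(treatment), np.asarray(guess)
    n_out = len(scores)

    strata = {}
    for t in (0, 1):
        for g in (0, 1):
            x = scores[(treatment == t) & (guess == g)]
            strata[(t, g)] = (KernelDensity().fit(x.reshape(-1, 1)), len(x))

    pseudo_scores, pseudo_trt = [], []
    for keys, frac in ((((0, 0), (1, 1)), target_cgr),       # correct guesses
                       (((1, 0), (0, 1)), 1 - target_cgr)):  # incorrect guesses
        total = sum(strata[k][1] for k in keys)
        for t, g in keys:
            # split frac * n_out draws to preserve the empirical stratum ratio
            k = round(frac * n_out * strata[(t, g)][1] / total)
            if k == 0:
                continue
            draws = strata[(t, g)][0].sample(
                k, random_state=int(rng.integers(1 << 31)))
            pseudo_scores.append(draws.ravel())
            pseudo_trt.append(np.full(k, t))
    return np.concatenate(pseudo_scores), np.concatenate(pseudo_trt)
```

Repeating this resampling (the paper uses 100 repeats) and averaging the outcome ~ treatment estimates over the pseudo-datasets gives the CGR-adjusted result.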

Results

Simulation results from the AEB model showed that CGR adjustment markedly reduces false positive findings when activated expectancy bias is present. When neither direct treatment effect nor AEB was active, both traditional (non-CGR adjusted) and CGR-adjusted analyses produced significant treatment p-values at approximately the nominal 0.05 rate (5%/6%). When only a direct treatment effect was present, traditional and CGR-adjusted analyses identified a significant effect in 86% and 84% of simulations respectively (average p ≈ 0.032/0.036). When only AEB was active (no true treatment effect), the traditional analysis produced false positive treatment effects in 78% of simulations, whereas CGR adjustment produced false positives in only 3% of simulations. When both DTE and AEB were active, traditional analysis found a significant effect in 99% of trials versus 82% for CGR adjustment; moreover, traditional analysis overestimated the true effect (estimated 5.69 points vs true 3), while the CGR-adjusted estimate was 3.04 points. Overall, CGR adjustment increased the false negative rate by ~2–4% relative to traditional analysis in scenarios with a true effect, but reduced the false positive rate by roughly 75% when AEB was present and yielded more accurate effect-size estimates.

Applying the CGR method to the empirical self-blinding microdose data produced materially different conclusions from traditional analyses. Using standard (non-CGR adjusted) models, statistically significant placebo–microdose differences favoured microdosing on several measures: acute PANAS (mean difference 3.2 ± 1.3; p = 0.01), energy VAS (11.5 ± 2.7; p < 0.001), mood VAS (6.4 ± 2.7; p = 0.02), creativity VAS (6.4 ± 2.5; p = 0.01), and post-acute QIDS (−1.2 ± 0.06; p = 0.04). After CGR adjustment, none of these outcomes remained significant except the energy VAS, which retained significance at approximately p = 0.04 but with an effect size reduced by about 40% (reported Hedges' g = 0.34).
Equivalence testing reported by the authors for the outcomes that lost significance (PANAS, QIDS, mood and creativity VASs), with equivalence bounds set to average within‑subject variability, indicated statistical equivalence between placebo and microdose after CGR adjustment (details and numeric thresholds provided in supplementary materials). The cognitive performance score (CPS) was largely unaffected by CGR adjustment: both p-value and effect estimate remained approximately constant, consistent with the measure being objectively assessed rather than self‑rated. Additional empirical observations relevant to blinding were reported. A brief five‑item treatment‑guess questionnaire (included in the supplement) showed that 55% of participants cited body or perceptual sensations (for example muscle tension, stomach discomfort) as the main cue for their treatment guess, whereas only 23% cited mental or psychological benefits. The authors compared the mean placebo–microdose differences on PANAS subdomains to reported day‑to‑day within‑subject variability from a no‑intervention study, and concluded that the observed mean differences were several times smaller than natural variability, arguing they would be difficult to notice and therefore suggesting perceptual side‑effects rather than perceived efficacy were the dominant source of unblinding.
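The equivalence testing mentioned above is conventionally done with two one-sided tests (TOST). A minimal independent-samples sketch follows; the equivalence bound is supplied by the caller (the paper set it to the average within-subject variability, with the exact thresholds in its supplementary materials, so any number used here is an assumption):

```python
import numpy as np
from scipy import stats

def tost_two_sample(x, y, bound):
    """Two one-sided tests (TOST) for equivalence of two group means.

    Equivalence is declared at level alpha if both one-sided p-values
    fall below alpha, i.e. the mean difference is credibly inside
    (-bound, +bound). Uses a simple pooled-df Welch-style approximation.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / nx + y.var(ddof=1) / ny)
    df = nx + ny - 2
    p_lower = 1 - stats.t.cdf((diff + bound) / se, df)  # H0: diff <= -bound
    p_upper = stats.t.cdf((diff - bound) / se, df)      # H0: diff >= +bound
    return diff, max(p_lower, p_upper)
```

A small overall p-value here supports equivalence, which is the pattern the authors report for PANAS, QIDS, and the mood and creativity VASs after CGR adjustment.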

Discussion

The authors frame their central contribution as highlighting how imperfect blinding, coupled with a positive expectancy for the active condition, can create an "activated expectancy bias" (AEB) that inflates treatment estimates and generates false positive findings. They emphasise that a formal placebo group is not equivalent to an effectively placebo-controlled trial: a trial may include a placebo arm yet still fail to control expectancy if blinding integrity is poor. Consequently, the paper argues that claims of "placebo-controlled" efficacy should be contingent on demonstrated empirical blinding integrity.

Applied to psychedelic microdosing, the re-analysis suggests that many small positive findings observed with traditional methods are susceptible to AEB. In the self-blinding microdose data, the majority of previously significant outcomes lost significance after CGR adjustment, except for a small residual effect on self-reported energy. The authors propose that microdosing behaves as an "active placebo": it produces perceivable psychopharmacological effects that can unblind participants, but the available evidence does not yet demonstrate clear clinical benefit for mental health measures once AEB is accounted for. They acknowledge an alternative possibility: microdosing might only show efficacy at doses that necessarily produce conspicuous subjective effects, in which case placebo control may not be feasible and alternative trial designs or mechanistic evidence would be required to establish efficacy.

The discussion addresses key caveats and limitations noted by the authors. CGR adjustment depends on binary treatment-guess data, which is a crude representation of treatment belief; inclusion of guess confidence would be preferable. The method assumes that unblinding arises from perceptual cues that drive expectancy ("malicious" unblinding) rather than from genuine symptomatic improvement yielding correct guesses ("benign" unblinding).
If benign unblinding predominates, CGR adjustment could introduce collider bias and produce false negatives; the authors therefore stress the need to assess the source of unblinding before applying the method. Methodological limitations of the CGR approach include resampling dependence (points may be duplicated), potential imbalance of confounders in pseudo‑datasets, and increased error rates in small samples or extreme CGRs. The authors recommend that researchers simulate performance under their own data parameters before applying CGR adjustment. Finally, the authors recommend routine assessment and reporting of blinding integrity, propose a brief five‑item treatment‑guess questionnaire (in the supplement) compatible with CGR analysis, and argue that the scientific community should treat blinding integrity as a prerequisite for judging trials as truly placebo‑controlled. They position CGR adjustment as an approximate analytical tool that can improve inference where ideal blinding is difficult to achieve, but they underscore that it does not substitute for results from genuinely blinded randomized controlled trials.

Full paper sections

METHODS

Self-blinding microdose trial. The self-blinding microdose trial used a 'self-blinding' citizen science approach, where participants implemented their own placebo control based on online setup instructions without clinical supervision. Self-blinding involved enclosing the microdoses inside non-transparent gel capsules and using empty capsules as placebos. These capsules were then labeled with QR codes that allowed investigators to track when placebo/microdose was taken without sharing this information with participants. Participants were followed throughout a 4-week dosing period, taking 2 microdoses/week in the active group. For each capsule taken, participants made a binary guess whether their capsule was placebo or microdose, see Supplementary materials for details. Here, the trial's acute and post-acute outcomes are re-analyzed. Acute measures were completed 2-6 h after ingestion of the capsule, while post-acute measures were taken the day after a capsule was taken. Acute outcomes were: positive and negative affect schedule (PANAS), cognitive performance score (CPS) and visual analogue scale items for mood, energy, creativity, focus, and temper. The CPS is an aggregated quantification of cognitive performance based on 6 computerized tasks (spatial span, odd one out, mental rotations, spatial planning, feature match, paired associates). Post-acute outcomes were: Warwick-Edinburgh mental well-being scale (WEMWBS), quick inventory of depressive symptomatology (QIDS), state-trait anxiety inventory (STAI-T) and social connectedness scale (SCS). To simplify the current analysis, we only used data from the first week of the experiment; thus, each datapoint is independent and not confounded by order effects. This approach reduced the overall sample, but yielded almost identical qualitative conclusions as the full dataset. In the current analysis n = 233 datapoints were included. The trial only engaged people who planned to microdose on their own initiative, but who consented to incorporate placebo control into their self-experimentation. The trial team did not endorse microdosing or psychedelic use and no financial compensation was offered to participants. The study was approved by the Imperial College Research Ethics Committee and the Joint Research Compliance Office at Imperial College London (reference number 18IC4518). Informed consent was obtained from all subjects, and the trial was carried out in accordance with relevant guidelines and regulations.

Estimate of treatment effects. Throughout this work treatment effects are estimated by an outcome ~ treatment linear model, where outcome is numeric and treatment is a binary variable (placebo or active treatment). In this manuscript 'non-CGR adjusted analysis' means that this model is fitted to empirical data, while 'CGR adjusted analysis' means that this model is fitted to the CGR adjusted pseudo-experimental data, see the Correct guess rate curve section for details.

Figure. The activated expectancy bias (AEB) model, consisting of 3 binary nodes (TRT, PT and TE) and a continuous value node, the outcome (OUT). In the equations, B_X/N_X stand for a random Bernoulli/normal variable, respectively. The binary nodes (TRT, PT and TE) represent Bernoulli variables (B_TRT, B_PT, B_TE), where the values of 0/1 correspond to placebo/active. To generate AEB model data, first Treatment (TRT) is determined by Eq. 1 and then Perceived treatment (PT) by Eq. 2, where p_CG is the probability of a correct guess, i.e. the correct guess rate; Treatment expectancy (TE) is then fixed according to Eq. 3. Finally, the outcome score is calculated by Eq. 4, which has components of natural history (N_NH), direct treatment effect (N_DTE) and activated expectancy bias (N_AEB); see Supplementary table 1 for the numeric values of all parameters.
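The figure caption above references Eqs. 1–4 without reproducing them. Forms consistent with its description (a Bernoulli treatment, a guess-dependent perceived treatment, expectancy fixed to perceived treatment, and an additive outcome) would be the following; this is a reconstruction, not the paper's exact notation, and the parameter values are in the paper's Supplementary table 1:

```latex
% Plausible reconstruction of Eqs. 1-4 from the caption's description.
\begin{align}
\mathrm{TRT} &= B_{\mathrm{TRT}}, \quad B_{\mathrm{TRT}} \sim \mathrm{Bernoulli}(0.5) \tag{1}\\
\mathrm{PT}  &= \begin{cases} \mathrm{TRT} & \text{with probability } p_{CG}\\[2pt]
                              1-\mathrm{TRT} & \text{with probability } 1-p_{CG}\end{cases} \tag{2}\\
\mathrm{TE}  &= \mathrm{PT} \tag{3}\\
\mathrm{OUT} &= N_{NH} + N_{DTE}\cdot \mathrm{TRT} + N_{AEB}\cdot \mathrm{TE} \tag{4}
\end{align}
```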
The CGR-adjusted treatment estimate/p-value is therefore the estimate/p-value associated with the treatment term of this model, applied to data adjusted by the CGRC method. All linear models were implemented using the nlme package (version 3.1-155) in R (v4.0.2).

Correct guess rate curve. We developed CGR adjustment, a novel statistical technique that can estimate the outcome of a perfectly blinded trial based on data from an imperfectly blinded trial. Briefly, the scores are first separated into four strata corresponding to the four possible combinations of treatment and guess. Next, statistical models of these four strata are built using kernel density estimation (KDE). KDE estimates were implemented with the scikit-learn package (v1.0.2) in Python (v3.7); all parameters were left at their default values. Then, random samples are drawn from each stratum such that the combined sample has CGR = 0.5, mimicking a perfectly blinded trial; see the Figure for a detailed explanation. Treatment estimates for other CGR values can be obtained in a similar manner by changing the number of samples drawn from each KDE. For example, a trial with CGR = 0.6 can be approximated by drawing 0.6*n random samples from the correct-guess KDEs and 0.4*n random samples from the incorrect-guess KDEs, etc.

RESULTS

Correct guess rate (CGR) adjustment of the activated expectancy bias (AEB) model. We analyze pseudo-experimental data generated by the 2*2 = 4 configurations of the AEB model (corresponding to the direct treatment effect and activated expectancy bias each being active or not, see the Figure) with both traditional, i.e. non-CGR adjusted, and CGR-adjusted analysis. To demonstrate that the qualitative conclusions presented here do not require fine-tuning of parameters, we present a robustness analysis in the Supplementary Materials. First, the case was analyzed where neither the direct treatment effect nor the activated expectancy bias pathway is activated (top row in the Table). In this case, the outcome is a normal random variable. The treatment p-value was significant for 5%/6% of the simulated trials using the traditional/CGR-adjusted models, which is expected given the 0.05 significance level. Next, the case was analyzed where a direct treatment effect was active, but activated expectancy bias was not (second row from top in the Table). Non-CGR adjusted and CGR-adjusted analysis identified a significant treatment effect in 86%/84% of the simulations with an average p-value of 0.032/0.036, respectively. We note that this 14%/16% false negative rate is due to the small effect used in the simulations (~0.4 Hedges' g); larger effects decrease the false negative rate of both analyses, see the robustness analysis in the Supplementary materials. In both analyses the treatment estimate is within 5% of the true effect. Next, the case was analyzed where the direct treatment effect was inactive, but activated expectancy bias was active (third row from top in the Table), i.e. a scenario where there is no true treatment effect and activated expectancy is a complete mediator of the treatment. For the traditional models, 78% of the simulated trials resulted in a false positive treatment effect. For the CGR-adjusted models, only 3% of the simulated trials produced a false positive treatment effect.
Finally, the case was analyzed where both a direct treatment effect and activated expectancy bias were active (bottom row in the Table), i.e., a case where AEB is a partial mediator of the treatment. The average treatment p-value was 0.001/0.041, with 99%/82% of the trials resulting in a significant treatment effect for the traditional/CGR-adjusted analysis, respectively. Note that the CGR-adjusted analysis can at best detect a treatment effect as reliably as the unadjusted analysis does when only the DTE is active (as the adjustment aims to remove the effect of AEB). Thus, CGR-adjusted analysis detects an effect in just 4% fewer of the simulations (86% vs. 82%) than this best-case scenario, i.e. CGR adjustment only adds 4% to the false negative rate. Furthermore, the traditional analysis estimated the effect to be 5.69 points, while the CGR-adjusted estimate was 3.04 points (the true treatment effect was 3), so traditional analysis significantly overestimated the effect due to the influence of AEB. In summary, the CGR-adjusted analysis' false negative rate is ~2-4% higher than the traditional analysis' (rows 2 & 4 in the Table), but the false positive rate is ~75% lower when AEB is present (rows 1 & 3 in the Table). Furthermore, when a true effect is present, CGR provides a more reliable estimate of the effect size (row 4 in the Table) as it subtracts the influence of AEB.

Figure. Both treatment and guess are binary with potential values of placebo/active; thus, the four strata are (using the treatment/guess notation): PL/PL, AC/PL, PL/AC and AC/AC. Statistical models of these strata are built using kernel density estimation (KDE). Note that two strata correspond to correct guesses (PL/PL and AC/AC; red) and two to incorrect guesses (AC/PL, PL/AC; blue). Next, n/2 random samples are drawn from the correct-guess KDEs, such that the relative sample sizes of the correct-guess strata are preserved, i.e. the ratio n_PL/PL : n_AC/AC is the same as in the original data, see Supplementary materials for a numeric example. Similarly, n/2 random samples are drawn from the incorrect-guess KDEs, such that the ratio n_AC/PL : n_PL/AC is the same as in the original data. These random samples are then combined, resulting in a pseudo-experimental dataset with CGR = 0.5 (purple distribution at bottom), corresponding to effective blinding. The random sampling from the KDEs is repeated 100 times, and each CGR-adjusted pseudo-experimental dataset is analyzed to estimate the direct treatment effect, see Estimate of treatment effects. The 'CGR adjusted treatment effect/p-value' is the mean treatment estimate/p-value across these 100 samples. Estimates at other CGR values can be obtained similarly, e.g. a trial with CGR = 0.6 can be approximated by drawing 0.6*n random samples from the correct-guess KDEs and 0.4*n random samples from the incorrect-guess KDEs, etc.

Correct guess rate (CGR) adjusted analysis of the self-blinding microdose trial. Next, we advance from analyzing pseudo-experimental data to scrutinizing empirical data from the self-blinding microdose trial. Using traditional, i.e. non-CGR adjusted, data analysis, statistically significant placebo-microdose differences were observed on the following scales: acute emotional state (PANAS; mean difference ± SE = 3.2 ± 1.3; p = 0.01**), energy visual analogue scale (VAS; 11.5 ± 2.7; p < 0.001***), mood VAS (6.4 ± 2.7; p = 0.02*), creativity VAS (6.4 ± 2.5; p = 0.01*) and post-acute depression (QIDS; -1.2 ± 0.06; p = 0.04*). After CGR adjustment, none of these outcomes remained significant, with the exception of the energy VAS, which remained significant (p ~ 0.04) but with a ~40% reduced effect size. This finding suggests that microdosing increases self-perceived energy beyond what is explainable by expectancy effects, although the magnitude of the remaining effect is small (Hedges' g = 0.34).
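For reference, the Hedges' g values reported here are pooled-SD standardized mean differences with a small-sample bias correction; a standard implementation (not the authors' code) looks like:

```python
import numpy as np

def hedges_g(x, y):
    """Hedges' g: Cohen's d with the pooled SD and the usual
    small-sample correction factor 1 - 3 / (4 * (nx + ny) - 9)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
                        / (nx + ny - 2))
    d = (x.mean() - y.mean()) / pooled_sd
    correction = 1 - 3 / (4 * (nx + ny) - 9)
    return d * correction
```

By the usual conventions, the residual energy effect (g = 0.34) sits in the small-to-moderate range.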
Equivalence tests for all outcomes where significance changed after CGR adjustment (i.e. the PANAS, QIDS, and the mood and creativity VASs), with an equivalence bound equal to the average within-subject variability, were significant, arguing that these outcomes were equivalent in the placebo and microdose groups after CGR adjustment; see Supplementary materials for details. See the Table for numeric results and the Figure for the CGR curves of selected outcomes.

TREATMENT GUESS QUESTIONNAIRE.

In the supplementary materials we include a brief, five-item questionnaire developed to collect treatment guess and source-of-unblinding data. The resulting data are compatible with the current and planned future versions of the CGR curve.
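Once binary guess data are collected with such a questionnaire, blinding integrity can be checked directly. A minimal sketch (my own, not part of the paper's toolkit) that computes the CGR and tests it against chance-level guessing:

```python
from scipy import stats

def blinding_check(n_correct, n_total):
    """Correct guess rate with an exact one-sided binomial test vs chance.

    A CGR credibly above 0.5 signals broken blinding; the questionnaire's
    source-of-unblinding item (not modelled here) then indicates whether
    the unblinding is 'malicious' or 'benign' in the paper's terminology.
    """
    cgr = n_correct / n_total
    p = stats.binomtest(n_correct, n_total, 0.5, alternative='greater').pvalue
    return cgr, p
```

For example, 70 correct guesses out of 100 capsules would give CGR = 0.7 with a p-value far below 0.05, i.e. strong evidence that blinding failed.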

DISCUSSION

Effective blinding distributes expectancy effects equally between treatment arms. However, if blinding is ineffective, i.e. patients can deduce their treatment allocation, and if patients have a positive expectancy bias for the active arm, then expectancy effects will no longer be equally distributed and trial outcomes will be biased towards the active arm. We call this bias 'activated expectancy bias' (AEB), which can be viewed as a residual expectancy bias potentially present even in 'blinded' trials. A key consequence is that the research community needs to distinguish between trials with a placebo-control group, i.e., when a placebo control group is formally present in the trial, and placebo-controlled trials, where patients are genuinely blinded and thus AEB is not present. In other words, a placebo control group is necessary, but in itself insufficient, to control for expectancy effects.

For example, a recent trial on LSD therapy includes 'double-blind, placebo-controlled' in its title, but, as the manuscript describes, "only one patient in the LSD-first group mistook LSD as placebo" (out of 18 patients), highlighting that the trial was blinded formally, but not in practice. The implication is that 'placebo-controlled' studies are more fallible than conventionally assumed, with consequences for evidence-based medicine. Current FDA drug approval only requires two trials with a statistically significant drug-placebo difference; thus, the self-blinding microdose trial yielded evidence consistent with FDA approval, even though the findings were likely false positives driven by AEB. In our view, placebo-controlled trials should only be considered the 'gold standard' if blinding integrity is demonstrated with empirical data. This requirement would create a new, more rigorous standard for what counts as 'placebo control'.
Given the high costs and low success rate of psychiatric trials, there may be little appetite from industry and regulators to create such a new standard, but it should be embraced by the scientific community. We note that it is difficult to estimate how prevalent AEB is in medical trials, because blinding integrity has only been reported in 2-7% of trials. To understand the prevalence of AEB, more trials need to capture blinding integrity data. To aid this practice, in the supplementary materials we suggest a brief five-item questionnaire that is compatible with the method presented here and recommend its adoption. When the self-blinding microdose trial was analyzed traditionally, small but significant microdose-placebo differences were observed on emotional state, depression, mood, energy and creativity, favoring microdosing. After CGR adjustment, only the energy VAS remained significant (p ~ 0.04), with a ~40% reduced effect size; we note that another recent trial similarly found significant increases in self-perceived energy beyond what is explainable by placebo and expectancy effects. One could argue that these negative results are false negatives; however, the consistency of the negative results across measures argues against this possibility.

Table. Comparative results of traditional and CGR adjusted analysis of the AEB model. The model is analyzed with 2*2 = 4 parameter configurations, corresponding to the direct treatment effect (DTE) and activated expectancy bias being active or not, see the Figure. Results are equivalent for the two analyses in the top two rows; however, when only the activated expectancy bias is active (3rd row from top), traditional analysis produces false positive findings for 78% of the simulations. Furthermore, when both the direct treatment effect and activated expectancy bias are active (bottom row), traditional analysis overestimates the known true treatment effect (the estimate is 5.69 points, while the true effect is 3 points), see the Figure.
Furthermore, the trial had the necessary features for AEB, i.e. weak blinding and a positively biased treatment expectancy, implying that the trial is susceptible to AEB. AEB is likely to be present in other psychedelic microdose trials as well, and results should be interpreted with caution, especially if evidence for effective blinding is not presented. We hypothesize that the reported benefits of psychedelic microdosing on mood and creativity can be understood as an 'active placebo', i.e., an intervention without medical benefits, but with perceivable effects, emphasizing the difference between effects and benefits. A recent comprehensive review of microdosing concluded that: "These findings together provide clear evidence of psychopharmacological effects. That is, microdosing is doing something. A key question for researchers is whether the effects of microdosing have clinical or optimization benefits beyond what might be explained by placebo or expectation." In short, microdosing leads to perceivable effects, for example via heightened energy levels, explaining why the CGR is universally high across trials, but at this point none of these effects seem to be related to improved mental health. If our hypothesis is correct, then either improved blinding or a sample without positive expectancy would nullify the observed benefits of microdosing by nullifying AEB. An alternative possibility is that microdosing is only effective at doses where blinding integrity cannot be maintained due to conspicuous subjective effects, as in the case of psychedelic macrodosing. In this scenario the possibility of effective placebo control is abandoned and efficacy beyond expectancy needs to be established outside of blinded trials. Arguments for the merit of alternative trial designs to assess the efficacy of psychedelics have been made before; for example, mechanistic studies could also help to establish the causal effect of treatment. Recently, arguments against the value of placebo control have been raised in psychedelic trials. This article remains neutral on this issue; it merely insists that if a trial is called 'placebo controlled', then it should really control for the placebo effect and not just have a 'placebo group'.

Figure. The DTE off; AEB on case (bottom left) generates a false positive finding when CGR is not considered during the analysis (the green dashed line intersects the p-value estimate below 0.05), but CGR adjustment recovers the lack of treatment effect (the black dashed line intersects the p-value estimate above 0.05). For the DTE on; AEB on case (bottom right), both analyses correctly identify that there is a treatment effect; however, the non-CGR adjusted analysis overestimates the effect size by ~40%, see the Table for numeric results.

Our arguments above assume that the high CGR is explained by malicious unblinding, i.e. positive treatment expectancy driving the positive outcomes, rather than benign unblinding, i.e. patients correctly guessing their treatment due to noticeable health improvements. If unblinding is benign, then CGR adjustment could lead to false negative findings due to collider bias (currently the Figure represents malicious unblinding; for benign unblinding, PT → TE → OUT would need to be replaced with OUT → PT). Accordingly, investigators need to carefully assess the source of unblinding prior to using our method. To facilitate this assessment, our questionnaire in the supplementary materials captures this source-of-unblinding information. What was the source of unblinding in the self-blinding psychedelic microdose trial? Two lines of evidence point towards perceptual/side effects rather than efficacy, corresponding to malicious unblinding.
First, 55% reported that the primary cue used to formulate their treatment guess was 'body/perceptual sensations', such as muscle tension (58%) and stomach discomfort (27%); in contrast, only 23% reported 'mental/psychological benefits'. Secondly, among participants who were assessed under both placebo and microdose conditions, the mean placebo-microdose difference on the positive/negative affect subdomains of the PANAS was 2.1/0.8. In a recent study without any intervention, the mean temporal intra-individual difference, i.e. the within-subject day-to-day variability, of the same subdomains was ~10/~6. Thus, the natural within-subject variability is ~500-750% larger than the mean placebo-microdose difference, arguing that the effect is too small to be noticeable.

Limitations. CGR adjustment relies on binary treatment guess data from patients; however, treatment belief is a complex construct that cannot be reduced to a single binary variable. We focused on binary guess data due to its availability, and note that even this imperfect data is rare to find. Treatment guess could be better characterized if guess confidence were also rated. Such confidence data would make it possible to distinguish between those who truly identified their drug condition (high-confidence guesses) and those who guessed correctly by chance (low-confidence guesses). In our analysis, we treat the source of unblinding as a binary variable, being either only benign or only malicious. A more realistic scenario is that for some patients, both efficacy and non-specific effects contribute to their guesses. Relatedly, our assessment of the source of unblinding is based on retrospective self-reports, which cannot provide conclusive evidence on causation. Our AEB model assumes linear addition of the direct treatment and the activated expectancy effects to estimate the total effect; however, these effects may not be additive in all circumstances.
The CGR curve relies on resampling the observed data; thus, the resulting data cannot be considered experimentally randomized and, as a consequence, confounding variables may not be equally distributed. Despite the KDE approximation of each stratum, in practice some datapoints may appear multiple times in the pseudo-experimental samples, potentially increasing the error rate due to dependent observations.

Table. Comparison of traditional (non-CGR adjusted) and CGR-adjusted models of the self-blinding microdose trial data. Note that all outcomes that were statistically significant in the traditional models became non-significant after CGR adjustment, with the exception of the energy VAS. These results argue that the positive outcomes in the traditional analysis could be false positive findings created by AEB. The energy VAS remained significant even after CGR adjustment, although its effect size was reduced by ~ 40%. This finding suggests that microdosing increases self-perceived energy beyond what is explainable by expectancy effects, although the remaining effect is small; see Fig. for CGR curves of selected outcomes.

The error rate of our methodology is a function of the sample characteristics: generally, the smaller the sample, the more extreme the CGR, and the smaller the effects, the less reliable the results will be. In a range of these parameters that mimics microdosing and antidepressant trials (n ~ 200, CGR ~ 0.7, treatment effect ~ 0.4 Hedges' g), our method has a false negative rate comparable to that of traditional, non-CGR-adjusted analysis. However, when AEB is present, CGR-adjusted analysis has a much lower false positive rate and provides a more reliable estimate of the true effect size than non-CGR-adjusted analysis. The error rate of our methodology can be higher in other contexts, in particular if the sample is small.
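The CGR-curve idea, resampling the correct and incorrect guessers in varying proportions and reading off the effect estimate at a 50% correct guess rate, can be sketched as follows. This is a simplified illustration on synthetic data with assumed parameters; the authors' implementation additionally uses KDE approximation of the strata, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dte, expectancy = 10_000, 0.3, 0.5    # illustrative parameters

# Synthetic weakly blinded trial with AEB (observed CGR ~ 0.8).
treat = rng.integers(0, 2, n)
correct = rng.random(n) < 0.8
guess_drug = np.where(correct, treat, 1 - treat)
outcome = dte * treat + expectancy * guess_drug + rng.normal(0, 1, n)

idx_correct = np.flatnonzero(correct)
idx_wrong = np.flatnonzero(~correct)

def effect_at_cgr(target_cgr):
    """Resample so that the desired fraction of the sample guessed correctly."""
    n_c = int(target_cgr * n)
    sub = np.concatenate([
        rng.choice(idx_correct, size=n_c, replace=True),
        rng.choice(idx_wrong, size=n - n_c, replace=True),
    ])
    t, y = treat[sub], outcome[sub]
    return y[t == 1].mean() - y[t == 0].mean()

# Sweep the correct guess rate from perfect blinding (0.5) to the observed value.
curve = {c: effect_at_cgr(c) for c in (0.5, 0.6, 0.7, 0.8)}
# The estimate at CGR = 0.5 approximates a perfectly blinded trial (~ dte),
# while at the observed CGR ~ 0.8 it reproduces the inflated estimate.
```

Note that the resampling with replacement produces the duplicated datapoints and dependent observations discussed above.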
Researchers wishing to use CGR adjustment should first run simulations to determine whether it produces acceptable error rates for the parameters of their data and the application in mind. Owing to the limitations listed above, our CGR adjustment is inferior to results from a truly blind RCT; its value lies in providing an approximate answer when achieving ideal blinding is difficult or impossible.

This finding suggests that microdosing increases self-perceived energy beyond what is explainable by expectancy effects, although the remaining effect is small (Hedges' g = 0.34). Finally, CGR adjustment has little impact on the cognitive performance score, as both the p-value and the effect estimate remain close to constant. This finding suggests that this measure is not affected by AEB, possibly because cognitive performance was not self-rated, but rather measured by objective computerized tests; see Table for numerical results.

CGR adjustment can be viewed as an example of a resampling method for overcoming the challenges of imbalanced data; here we present only one particular solution to this problem, not a systematic exploration of how rebalancing of the data can be achieved. Finally, our data on microdosing were obtained from a self-selected healthy sample. Microdosing may be effective for certain conditions in a clinical population, in domains we did not assess, if used at higher doses or over longer time periods, or when co-administered with a behavioral therapy, such as cognitive training.
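A simulation of the kind recommended above can be sketched as follows: generate many synthetic trials with no direct treatment effect but with AEB present, and compare the false positive rate of the naive analysis with that of a simple rebalancing to a 50% correct guess rate. The trial parameters (n = 200, CGR = 0.7, expectancy effect 0.6 SD) are illustrative, and the rebalancing step is a simplified stand-in for the CGRC, not the authors' implementation.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(2)

def z_test(y, t):
    """Two-sided z-test p-value for a difference in means between arms."""
    a, b = y[t == 1], y[t == 0]
    se = sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def one_trial(n=200, cgr=0.7, expectancy=0.6):
    treat = rng.integers(0, 2, n)
    correct = rng.random(n) < cgr
    guess_drug = np.where(correct, treat, 1 - treat)
    # No direct treatment effect: any apparent "effect" is pure AEB.
    y = expectancy * guess_drug + rng.normal(0, 1, n)
    p_naive = z_test(y, treat)
    # Simplified rebalancing to CGR = 0.5: resample both strata equally.
    # The resampled duplicates make observations dependent, so the adjusted
    # test inherits some error inflation, as noted in the limitations.
    idx_c, idx_w = np.flatnonzero(correct), np.flatnonzero(~correct)
    sub = np.concatenate([rng.choice(idx_c, n // 2), rng.choice(idx_w, n // 2)])
    p_adj = z_test(y[sub], treat[sub])
    return p_naive < 0.05, p_adj < 0.05

results = np.array([one_trial() for _ in range(500)])
fpr_naive, fpr_adj = results.mean(axis=0)
# Under AEB with no true effect, the naive analysis rejects far more often
# than the nominal 5%, while the rebalanced analysis rejects much less often.
```

Varying n, cgr, and expectancy over the plausible range for one's own trial shows where the adjustment's error rates become unacceptable, e.g. in small samples with extreme CGR.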
