Overcoming blinding confounds in psychedelic randomized controlled trials using biomarker driven causal mediation analysis
This commentary (2023) suggests that causal mediation analysis using objective biomarkers could help establish causal pathways between treatment and outcome, providing greater confidence in the efficacy of psychedelic therapies before they are approved as regular medicines. This cautious approach is recommended to avoid potential drawbacks such as expanding indications based on low-quality evidence and unstable efficacy over time.
Abstract
There is great interest in the use of psychedelic-assisted therapies to treat a range of mental health conditions, and initial randomized controlled trials (RCTs) have generated positive results. However, the effect sizes reported in psychedelic RCTs are likely inflated by expectancy effects arising from the de-blinding of both participants and study personnel to treatment allocation, which is caused by the distinctive psychoactive effects of psychedelic drugs. This article provides an introduction to causal inference for randomized controlled trials, the underlying assumptions, and potential confounders, along with graphical illustrations. It is proposed that causal mediation analysis using objectively measured mediating biomarkers could be used to identify causal pathways between treatment and outcome in psychedelic RCTs, even with de-blinding of participants, and give greater confidence as to the mechanistic basis and efficacy of psychedelic therapies. It is argued that psychedelic therapies should not be approved as regular medicines until causal pathways are clearly established between treatment and outcome. Potential downsides of earlier approval include future indication expansion based on low-quality clinical trial evidence, the approval of other therapies based on similarly low-quality evidence, and the potential for efficacy to be unstable over time after approval.
Research Summary of 'Overcoming blinding confounds in psychedelic randomized controlled trials using biomarker driven causal mediation analysis'
Introduction
Psychedelic-assisted therapies have attracted substantial scientific, public and commercial interest for a range of psychiatric disorders, including major depressive disorder, post-traumatic stress disorder and other mood and anxiety conditions. Despite promising results from early Phase 2 and Phase 3 trials for agents such as psilocybin, MDMA and ketamine/esketamine, effect-size estimates from randomized controlled trials (RCTs) are likely confounded by de-blinding: the distinctive psychoactive effects of these drugs tend to reveal treatment allocation to participants and study personnel, producing expectancy and placebo-related responses that may inflate measured clinical benefits. Muthukumaraswamy proposes an alternative validation strategy: use causal mediation analysis (CMA) driven by objectively measured biological mediators to link treatment exposure to clinical outcome. The paper introduces the Neyman–Rubin potential outcomes framework and Directed Acyclic Graphs (DAGs) to illustrate how blinding failures generate back-door confounding, then outlines the assumptions and methodology of CMA. It argues that response biomarkers measured during or soon after the intervention, analysed within a causal mediation framework, could provide mechanistic evidence of treatment effects even when traditional blinding is compromised, and contends that psychedelic therapies should not be approved as routine medicines until causal pathways linking treatment to outcome are established.
Methods
This is a theoretical and methodological article rather than an empirical trial report. The researchers present a conceptual exposition grounded in the Neyman–Rubin potential outcomes framework, explicating key assumptions required to estimate average treatment effects: exchangeability (ignorability), positivity, and the two components of SUTVA (consistency and non-interference). Directed Acyclic Graphs are used to visualise how randomisation normally blocks back-door confounding and how de-blinding and expectancy introduce additional pathways (for example treatment -> blinding status -> expectancy -> outcome) that threaten identification of a true treatment effect. The paper then introduces causal mediation analysis within the counterfactual framework, defining average causal mediation effects (ACME), average direct effects (ADE) and their relationships to total effects. Sequential ignorability is spelled out as the central identification assumption for CMA, comprising (a) no unmeasured confounding of treatment–outcome and treatment–mediator relationships (plausible under randomisation) and (b) no unmeasured confounding of the mediator–outcome relationship and no mediator–outcome confounder affected by treatment (a strong assumption because the mediator cannot be randomised). Linear formulations of mediation models and interaction terms are provided as an accessible analytical approach when outcomes are continuous and approximately linear. To illustrate applicability, synthetic-data simulations are presented to demonstrate four mediation scenarios (treatment affects mediator and outcome with interaction; mediation present in both arms; treatment affects outcome but not mediator; treatment affects mediator but mediator not associated with outcome). 
Practical recommendations for applying CMA in psychedelic RCTs are discussed: prioritise temporal ordering of mediator and outcome measurements, select mediators grounded in biological mechanism (neuroimaging, electrophysiology, blood biomarkers), favour domain-specific clinical outcomes aligned with mechanism, control for experimental confounds (for example diurnal variation), measure relevant covariates including baselines of mediator and outcome, choose comparator conditions that ideally do not change the mediator, and follow standard statistical safeguards such as pre-registration, correction for multiple comparisons and intention-to-treat analysis. Sensitivity analyses for violations of sequential ignorability are recommended, and both frequentist and Bayesian estimation approaches are noted as compatible with CMA.
Results
As a methodological paper, empirical results are illustrative rather than data-driven. The synthetic simulations demonstrate the distinct patterns one could observe under different mediator–treatment–outcome relationships: (a) mediation plus a direct effect present only in the treatment arm (indicating mechanistic specificity); (b) mediation present in both treatment and control arms (evidence of mediation but not specificity); (c) treatment affects outcome but not the mediator (no mediation because the treatment does not change the mediator); and (d) treatment changes the mediator but the mediator is not related to outcome (no mediation, only a direct effect). These examples show that when a treatment both alters an objectively measured mediator and that mediator causally predicts subsequent clinical outcome, CMA can provide evidence linking treatment to outcome that is less likely to be attributable to expectancy or de-blinding. The paper reports no empirical biomarker data, but identifies plausible mediator modalities—functional MRI, magneto/electroencephalography, and blood-based markers of neuroplasticity—as candidates that have shown drug-related changes in previous research. It emphasises that satisfying the first three of Prentice's surrogate criteria (treatment affects outcome, treatment affects surrogate, surrogate affects outcome) is necessary for identifying mediation, but that full surrogacy (surrogate capturing all treatment effect) is unlikely in psychiatry because non-specific placebo and natural history effects commonly contribute a large proportion of measured clinical response. The authors stress that CMA identification hinges on untestable sequential ignorability assumptions and therefore recommend sensitivity analyses and careful measurement of potential mediator–outcome confounders.
Discussion
Muthukumaraswamy situates the proposed biomarker-driven CMA approach as a response to persistent critiques about blinding and expectancy confounds in psychedelic RCTs. He argues that objective, mechanistically plausible biological readouts—measured with appropriate temporal ordering and analytic controls—would constitute stronger evidence that observed clinical responses have a biological basis rather than being entirely driven by placebo or expectancy. This is presented as a way to retain the epistemic strengths of the double-blind RCT paradigm while addressing its practical weaknesses in the context of psychoactive interventions. The discussion highlights several threats to identification in psychedelic trials beyond participant de-blinding: consistency violations if therapists alter their delivery after becoming aware of allocation, contagion or non-interference problems when participants influence one another (for example via public testimonials), and sample self-selection that may accentuate expectancy bias. To mitigate these, the paper recommends careful therapy manualisation, therapist training and fidelity monitoring (for example via recordings), trial designs that measure and adjust for expectancy and therapeutic alliance, and biomarkers chosen to minimise susceptibility to non-specific effects. Regulatory implications are addressed directly: the author contends that trials failing to distinguish drug effects from placebo effects do not meet the US FDA standard for establishing effectiveness, and warns against abandoning rigorous RCT methods simply because blinding is difficult. Potential harms of premature approval based on weakly controlled evidence are discussed, including unwarranted indication expansion, approval of low-quality interventions elsewhere, and instability of effect sizes over time as public expectations change. 
The researchers acknowledge the limitations of CMA—principally the strong, partly untestable assumptions about mediator–outcome confounding—and recommend sensitivity analyses, pre-registration and domain-aligned outcome selection to strengthen causal claims. Overall, CMA using objectively measured mediators is presented as a pragmatic, mechanistically informed complement to standard RCT evidence for psychedelic medicines rather than as a substitute for rigorous experimental design.
Conclusion
The field of psychedelic clinical research remains nascent after decades of intermittent study. Muthukumaraswamy concludes that if psychedelic therapies are truly efficacious, rigorous and time-consuming research will confirm that fact; however, premature adoption driven by commercial or advocacy pressure risks exposing patients to ineffective or unstable interventions. The article offers biomarker-driven causal mediation analysis as one feasible methodological route to obtain decisive mechanistic evidence that was not available to earlier researchers, and urges that such approaches be pursued before widespread introduction of psychedelic therapies into clinical practice.
INTRODUCTION
There is considerable scientific, public and commercial interest in the potential for psychedelic drugs (footnote 1) to be used medically in the treatment of a variety of disorders including, but not limited to, major depressive disorder, depression, post-traumatic stress disorder and anxiety disorders. Currently esketamine is approved for use in a number of jurisdictions for treatment-resistant depression, and racemic ketamine is now widely given off-label for the treatment of depression. Still in clinical development, MDMA-assisted psychotherapy for the treatment of post-traumatic stress disorder (PTSD) has one published Phase 3 clinical trial, and several Phase 2 trials of psilocybin in treatment-resistant depression have been conducted, with a Phase 3 programme underway. Despite these advancements, estimates of treatment effect sizes in randomized controlled trials (RCTs) for all these interventions are confounded by de-blinding of participants and study personnel to treatment-arm allocation. It is theoretically possible that the entire estimated effect size of these interventions could be accounted for by these confounds, which would mean that the trials provide no evidence for efficacy. In this article an alternative approach for validating efficacy is proposed: using causal mediation analysis to link treatment and outcome via an objectively measured biological mediator. The article begins by introducing the Neyman-Rubin potential outcomes framework, which serves as the theoretical basis for establishing causation (of both safety and efficacy) in clinical trials. The various assumptions of the framework, including exchangeability, consistency, non-interference and positivity, are introduced. It is then considered how various methodological confounds in psychedelic RCTs violate these assumptions. In particular, Directed Acyclic Graphs (DAGs) are used to illustrate how blinding violations can generate back-door confounders of the treatment-outcome pathway.
Next, the theoretical basis of causal mediation analysis is outlined in terms of the potential outcomes framework for causal inference, and the identification assumptions required by this framework are introduced. A potential solution to the de-blinding problem in psychedelic RCTs is then proposed: using objectively measured biomarkers to link treatment and response via causal mediation analysis. Synthetic data and outcomes are considered for illustrative purposes. The criteria for the validity of candidate measurements from psychedelic RCTs that could be entered into the causal mediation framework are then considered. Finally, it is argued that psychedelic medicines should not be introduced as medicines under standard regulatory frameworks given the low-quality evidence currently provided by extant clinical trials.
THE POTENTIAL OUTCOMES FRAMEWORK FOR CAUSAL INFERENCE
The establishment of causation (both efficacy and safety) in RCTs is succinctly conceptualised within the Neyman-Rubin potential outcomes framework, sometimes called the Rubin Causal Model. Let I be a population of units with i denoting an individual unit and Y a response variable to be explained, with Y(i) being the response of an individual unit. In a simple case A is a treatment variable with two options (1 for treatment and 0 for control). The causal effect of A on i is called the individual treatment effect (ITE) and can be defined by:
$$\mathrm{ITE}(i) = Y_1(i) - Y_0(i) \qquad (1)$$
Importantly, of the two potential outcomes for any individual at any point in time ($Y_1(i)$ and $Y_0(i)$), one is counterfactual and can never be observed. This is termed the fundamental problem of causal inference and is in essence a missing data problem. Since individual treatment effects cannot be identified, in order to establish causation the average treatment effect (ATE) over the population of units (I) is estimated. The ATE can be defined as:
$$\mathrm{ATE} = \mathbb{E}[Y_1 - Y_0] = \mathbb{E}[Y_1] - \mathbb{E}[Y_0] \qquad (2)$$
where $\mathbb{E}$ is the expected value. Estimation of the ATE without bias relies on four key assumptions. The first of these is that the potential outcomes are independent of treatment assignment. This is the ignorability/exchangeability assumption: that participants are all sampled from the same population with no unmeasured confounders. Given any set of confounders (C) it can be written as:
$$(Y_0, Y_1) \perp A \mid C \qquad (3)$$
In RCTs this assumption is well satisfied by random allocation of units to treatment.
Footnote 1: There is no universally agreed definition of psychedelic drugs. For the purposes of the current work, a broad definition is used to include not only classical psychedelics such as LSD and psilocybin, but also MDMA and ketamine. The arguments made have general applicability to psychopharmaceutical interventions which dramatically alter waking consciousness.
The second assumption of the Rubin causal model is positivity, which states that all treatment states are possible, that is:
$$P(A = a \mid C = c) > 0 \quad \text{for all } a \text{ and } c \qquad (4)$$
Again, this assumption is well satisfied in RCTs by randomization. Two further assumptions of the model together form the stable unit treatment value assumption (SUTVA). The first part of SUTVA is that for each unit, there should be no hidden versions of the treatments that could lead to different potential outcomes (consistency). That is:
$$Y(i) = Y_a(i) \quad \text{if } A(i) = a \qquad (5)$$
The second part of SUTVA is that potential outcomes for a unit should not vary with treatments assigned to other units (non-interference). In practice the second part of the assumption means that participants in clinical trials should be kept as separate from each other as possible by investigators. Tchetgen et al. notate this assumption as:
$$Y_i(\mathbf{a}) = Y_i(\mathbf{a}') \quad \text{for all } i \text{ and any } \mathbf{a} \text{ and } \mathbf{a}' \text{ with } a_i = a'_i \qquad (6)$$
where the prime indicates an alternative allocation and boldface indicates that $\mathbf{a}$ is a vector describing an entire treatment allocation scheme across the units studied. In other words, for any individual the potential outcomes should be the same regardless of the treatments that other units receive in the allocation scheme. In addition to these underlying assumptions, for estimation purposes the trial sample size should be sufficient that $\mathbb{E}(Y \mid A = 1)$ and $\mathbb{E}(Y \mid A = 0)$ are good estimates of $\mathbb{E}(Y_1)$ and $\mathbb{E}(Y_0)$ respectively. Specific to RCTs, treatment allocation should be concealed and blinding maintained in participants and study personnel to reduce bias. This will be illustrated in more detail in the next section using Directed Acyclic Graphs (DAGs), which are graphical representations of cause and effect used in causal inference.
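The role of randomization in identifying the ATE can be sketched with a small simulation. This is a minimal illustration rather than anything taken from the article: the function name `simulate_ate`, the effect size, the covariate model and the sample size are all assumptions chosen for the example. Both potential outcomes are generated for every unit (something only a simulation allows), one is revealed according to a randomized assignment, and the difference in observed arm means is compared with the true ATE.

```python
import numpy as np

def simulate_ate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical example: compare the true ATE with a difference in means."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                    # baseline covariate C
    y0 = 1.0 + 0.5 * c + rng.normal(size=n)   # potential outcome under control
    y1 = y0 + true_effect                     # potential outcome under treatment
    a = rng.integers(0, 2, size=n)            # randomized assignment, independent of (y0, y1)
    y = np.where(a == 1, y1, y0)              # consistency: only one potential outcome is observed
    true_ate = float(np.mean(y1 - y0))        # knowable only because this is a simulation
    estimate = float(y[a == 1].mean() - y[a == 0].mean())
    return true_ate, estimate

true_ate, estimate = simulate_ate()
print(f"true ATE = {true_ate:.3f}, difference in means = {estimate:.3f}")
```

Because assignment is independent of the potential outcomes, the difference in means should land close to the true ATE; the gap between them is sampling error only.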
DIRECTED ACYCLIC GRAPHS (DAGS) OF BLINDING AND EXPECTANCY CONFOUNDS
It is common practice in the causal inference literature to represent the relationships between variables in a proposed causal system graphically, in a diagram termed a directed acyclic graph (DAG). In a DAG, nodes represent variables and edges represent a (proposed) causal effect between two nodes, with the arrow representing the direction of the causal effect. In the accompanying figure, DAGs are used to represent various scenarios for the effect of an intervention (A) on an outcome variable (Y) that help to illuminate the role of blinding, expectancy and the placebo response in clinical trials. In the basic scenario, participants are randomized and offered treatment (AOffer). If participants take up the offer and receive the treatment (AGet), then the subsequent outcome can be measured (Y). The pathway AOffer -> AGet -> Y represents the intention-to-treat effect, which is usually the main estimate of interest in an RCT. In DAGs, "backdoor" confounds (C) can exist which influence both the treatment (e.g. through self-selection) and the outcome. The role of randomization in clinical trials is to break the backdoor pathway C -> A, ensuring that the exchangeability assumption is met; this is represented in the figure with a dashed line to illustrate that the pathway has been eliminated. After treatment is received, its effects may alter the blinding status (B) of the participant. B can be considered as either a categorical or continuous variable but is treated as categorical here for simplicity. In a clinical trial where blinding is maintained across the two trial arms, the arrows AGet -> B, B -> ExB and ExB -> Y are nullified. Elimination of these backdoor confounders allows the treatment effect to be identified from data. In psychedelic RCTs, where de-blinding occurs, it is highly probable that the pathway AGet -> B will be present.
Prior to the trial, participants may have some expectation (E) as to how the treatment will affect them. To the extent that participants are de-blinded, this will interact with their expectation (ExB), which will influence their outcome (Y). Across the two arms of a trial this leads to an identification issue for the treatment effect. If the blind is not evenly held across the two trial arms, then it is no longer possible to determine whether differences in Y between the two arms have been caused by a treatment effect or by an expectancy-blinding interaction. The figure also illustrates the effect of a pure placebo used in clinical treatment: the outcome variable Y can be heavily influenced purely through the ExB interaction, with no direct effect of treatment. It is worth noting that pre-trial expectancy is a difficult concept to measure, and while scales do exist to measure expectancy, it is not clear that they capture what might be a complex construct (hence it is denoted with a boldface vector). Ordinarily, in RCTs with preserved blinding this is not problematic, as randomization ensures the exchangeability assumption is met and these expectancies can therefore be considered like any other covariate. Finally, for simplicity the DAGs in the figure do not include the effects of de-blinding study personnel. In terms of psychedelic medicine, the major potential confound here is violation of the consistency assumption, which is considered later. Put more colloquially, given the obvious psychoactive effects of psychedelic drugs, in a standard parallel-groups RCT those participants who are allocated to the active intervention group likely know they have received the treatment and may show a greater treatment response due to expectancy effects. Conversely, those participants who receive a placebo intervention (active or inactive) may know they have received the placebo, and disappointment may decrease their placebo response.
Thus, it is argued that psychedelic RCTs with clear blinding failures are unable to distinguish treatment effects from placebo responses. This issue will be considered later from a regulatory perspective.
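The identification problem created by the AGet -> B -> ExB -> Y pathway can be made concrete with a toy simulation. The sketch below is an illustration under stated assumptions, not the article's own model: the function `simulate_trial`, the linear outcome model, and every numeric value are hypothetical, and B is simplified to a binary indicator of whether the participant believes they received the active drug. The drug is given no true effect at all, so any difference between arms comes only from the expectancy-by-belief term.

```python
import numpy as np

def simulate_trial(p_belief_active, p_belief_placebo, n=20000,
                   drug_effect=0.0, expectancy_effect=1.5, seed=1):
    """Hypothetical toy model: return the naive difference in arm means."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n)                 # randomized allocation A
    # B = 1 if the participant believes they received the active drug
    p_b = np.where(a == 1, p_belief_active, p_belief_placebo)
    b = rng.binomial(1, p_b)
    e = rng.normal(1.0, 0.2, size=n)               # pre-trial expectancy E
    # Outcome: direct drug effect plus the expectancy-by-belief interaction (ExB)
    y = drug_effect * a + expectancy_effect * e * b + rng.normal(size=n)
    return float(y[a == 1].mean() - y[a == 0].mean())

# Blind held: belief about allocation is unrelated to the arm actually received
blind_held = simulate_trial(p_belief_active=0.5, p_belief_placebo=0.5)
# Blind broken: most participants correctly guess their arm
blind_broken = simulate_trial(p_belief_active=0.9, p_belief_placebo=0.1)
print(f"blind held:   {blind_held:.3f}")
print(f"blind broken: {blind_broken:.3f}")
```

With the blind held evenly, the estimated effect should sit near zero, matching the true drug effect. With the blind broken, a sizeable apparent effect appears even though `drug_effect` is zero, which is the scenario the DAGs describe.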
CONTAGION AND CONSISTENCY PROBLEMS IN PSYCHEDELIC RANDOMIZED CONTROLLED TRIALS
Psychedelic(-assisted) therapies, as interventions to be tested for safety and efficacy, are considerably more complex than the standard pharmaceutical interventions traditionally tested in psychiatry. Firstly, although some interventions such as esketamine/ketamine can be, and are, provided as "drug-alone" interventions, many of the therapeutic approaches being tested in psychedelic trials involve extensive use of psychotherapy using a variety of therapy models. Psychotherapy in psychedelic-assisted psychotherapy can occur before, during and after multiple treatment sessions with the investigational medicinal product being tested. This makes for a complex intervention with many variables interacting across time. For example, in the MDMA for PTSD treatment protocol currently in Phase 3 trials, over a twelve-week period from when the intervention commences, participants have three doses separated by four weeks, with each dose followed by three psychedelic integration sessions. Hence, the AGet intervention is actually a very complex interaction, unfolding over time, with many opportunities for factors such as expectancy, blinding and the participant's general knowledge about the intervention to change over time. For simplicity these complexities are largely ignored here.
Similar to the de-blinded participants involved in the trial, even if a study is double-blind "by design", study personnel involved in the delivery of psychedelic therapy can easily become aware of the treatment group allocation of the participant, given the participant's immediate reaction to the drug administered. Through either covert or overt processes, therapists might then deliver differential therapy across the groups. Any hidden variation of the treatment provided to the patient would be a violation of the consistency assumption of causal inference.
Variation in the treatments delivered to patients has the potential to bias results by differentially modifying the therapeutic alliance generated between patient and therapist. It is well known in psychotherapy that the development of therapeutic alliance has a strong effect on therapeutic success, with a meta-analysis including over 30,000 patients indicating an effect size of d = 0.58. In theory, this consistency violation could inflate the effect sizes seen in psychedelic RCTs if therapists end up delivering therapy with better fidelity to those in the active group than to those in the control group. To improve the likelihood that the consistency assumption is met, therapy delivery should be carefully manualized and therapists given appropriate training. Ultimately, verification of the delivered therapy should be undertaken (which might involve formal verification through use of video recordings). Finally, in psychedelic trials there exist many opportunities for violations of the non-interference assumption (a topic considered extensively elsewhere). Of particular concern are contagion effects, where one individual's treatment outcome can affect the outcome of another unit. For example, the positive testimonials of one patient can affect the outcomes of other or future patients if, for instance, they are shared online or at local psychedelic society or integration group events. These contagion effects might be particularly pronounced in psychedelic RCTs relative to other medical RCTs, due to the extensive media coverage of psychedelic drugs and the de-blinding/expectancy effects that can occur in these trials. How might psychedelic RCTs overcome the blinding (identification), consistency and non-interference issues they are faced with? Previously we suggested trial designs that might allow unbiased estimation of treatment effects when combined with explicit measurement of blinding, expectancy and therapeutic alliance.
Similarly, Aday et al. proposed a number of trial designs and study procedures that may help to lessen the confounding of psychedelic RCTs. However, the very realistic possibility exists that for these interventions such steps might not be adequate, and the degree of de-blinding too great, to overcome these biases. Instead, here a solution is proposed that uses mechanistically sensible biomarker measurements incorporated into causal mediation analysis to provide confirmatory evidence that the clinical responses observed are not being generated entirely by biases unrelated to the efficacy of the intervention under study.
BIOMARKER TYPES IN RANDOMIZED CONTROLLED TRIALS
Table: Definitions of biomarker types that can be measured in clinical trials.
Response Biomarker: A biomarker that shows that a biological response has occurred in an individual who has been exposed to a medical/environmental intervention.
Surrogate Biomarker: "A laboratory measurement or physical sign that is used in therapeutic trials as a substitute for a clinically meaningful endpoint that is a direct measure of how a patient feels, functions, or survives and is expected to predict the effect of the therapy"
Validated Surrogate Biomarker: "Surrogate biomarkers for which evidence has established that a drug-induced effect on the surrogate results in the desired effect on the clinical outcome of interest."
Unvalidated Surrogate Marker: "Surrogate biomarkers which are reasonably likely to predict the clinical benefit of interest, but for which there is insufficient evidence to establish that such an effect, does, in fact, result in the desired clinical outcome."
The table provides a set of definitions for various types of biomarkers that can be measured in clinical trials, including response biomarkers, surrogate biomarkers, and validated/unvalidated surrogate biomarkers. In this article we focus on response biomarkers rather than the more specific surrogate biomarkers. A surrogate biomarker is designed to replace a true clinical endpoint of interest.
An early approach to formalising surrogate biomarkers was developed by Prentice, who laid out four operational criteria for defining surrogate biomarkers. These are:
i) $f(Y \mid A) \neq f(Y)$
ii) $f(M \mid A) \neq f(M)$
iii) $f(Y \mid M) \neq f(Y)$
iv) $f(Y \mid M, A) = f(Y \mid M)$
These can be described as: i) the treatment must affect the outcome, ii) the treatment must affect the surrogate (M), iii) the surrogate (M) must affect the outcome (Y), and iv) when the outcome is conditioned on the surrogate it is independent of the treatment. The fourth criterion implies that all information on the effect of treatment on outcome is captured by the surrogate. Although statistical approaches to surrogacy identification have been improved since the original Prentice criteria, it is argued here that identification of surrogate endpoints is not required for psychedelic investigational medicines. For some areas of medicine, it is desirable to have surrogate biomarkers to replace clinical endpoints because the clinical endpoints are impractical to collect in the context of a clinical trial. For example, the desired clinical endpoints may happen too far into the future, be overly invasive, or be overly expensive to measure. In contrast, in psychedelic medicine, where the main application is in psychiatry, primary clinical outcomes are generally clinician-administered or patient-rated scales, which are cheap, non-invasive, and easily obtained multiple times in clinical trials in a timely way. Rather, in psychedelic medicine the issue at stake is that investigators are unsure of the veracity of the clinical outcome measurements, as they are highly susceptible to contamination by non-specific/placebo responses. Moreover, given the confounding effects of blinding/expectancy and the contribution of placebo/natural history effects to the observed clinical response, it is highly unlikely that criterion iv) or similar would ever be met by a biomarker in psychedelic medicine.
This is because a substantial proportion (>50%) of the measured clinical response is likely to be generated by these non-specific placebo/natural history effects. The key proposal here is that biomarkers could, however, be used to establish a causal pathway between treatment and outcomes through biologically relevant processes. Formally, these pathways can be linked by Causal Mediation Analysis (CMA), which extends the potential outcomes framework and is introduced next.
AN INTRODUCTION TO CAUSAL MEDIATION ANALYSIS
CMA extends traditional mediation analysis, conceptualising two pathways between treatment (A) and outcome (Y) (see figure): a direct pathway (r') and an indirect pathway (pq). Unlike traditional mediation analysis with purely observational data, in CMA, A is usually an experimentally manipulated treatment variable. CMA extends the counterfactual framework to explicitly, and causally, define direct and indirect (mediated) effects. Indirect causal mediation effects can be defined in terms of counterfactuals as:
$$\delta_i(a) = Y_i(a, M_i(1)) - Y_i(a, M_i(0))$$
for a = 1,0. This can be interpreted as the causal effect, for each treatment, of changing the mediator from the value it takes under one condition to the value it takes under the counterfactual condition. Average causal mediation effects (ACME) are defined as:
$$\mathrm{ACME}(a) = \mathbb{E}[Y_i(a, M_i(1)) - Y_i(a, M_i(0))]$$
for a = 1,0. Similarly, direct effects of treatment can be defined as:
$$\zeta_i(a) = Y_i(1, M_i(a)) - Y_i(0, M_i(a))$$
for a = 1,0. This can be interpreted as the causal effect of the treatment while the mediator is held constant. The average causal direct effects (ADE) are defined as:
$$\mathrm{ADE}(a) = \mathbb{E}[Y_i(1, M_i(a)) - Y_i(0, M_i(a))]$$
Total causal effects, analogous to the ITE, can be defined as:
$$\tau_i = Y_i(1, M_i(1)) - Y_i(0, M_i(0))$$
with the overall ATE defined as:
$$\mathrm{ATE} = \mathbb{E}[Y_i(1, M_i(1)) - Y_i(0, M_i(0))]$$
Sometimes ACME(0), ACME(1), ADE(0) and ADE(1) are referred to as the Pure Natural Indirect Effect (PNIE), True Natural Indirect Effect (TNIE), Pure Natural Direct Effect (PNDE) and True Natural Direct Effect (TNDE) respectively. The nomenclature table provides a summary of effect names and their respective counterfactuals and estimators. The Controlled Direct Effect (CDE) is omitted as it is unlikely to be relevant in this context. Similar to the basic Rubin causal model, estimation of these quantities can be framed as a missing data problem: identifying those counterfactuals that cannot be directly observed. Identification relies on two assumptions, often termed together the sequential ignorability (exchangeability) assumption. The sequential ignorability assumption can be stated as:
$$\{Y(a', m), M(a)\} \perp A \mid C \quad \text{for all } a', a \in \{0,1\}$$
$$Y(a', m) \perp M \mid A = a, C = c \quad \text{for all } a', a \in \{0,1\}$$
The first part of this assumption states that, given pre-treatment variables (C), treatment assignment is independent of potential outcomes and mediators. The second part states that, given pre-treatment variables and the treatment, the mediator is independent of potential outcomes. Valeri and VanderWeele describe the first part of the assumption as a) no unmeasured confounding of the treatment-outcome relationship and b) no unmeasured confounding of the treatment-mediator relationship. Under conditions of randomization of units to A these are well satisfied.
The second part of the sequential ignorability assumption assumes c) no unmeasured confounding of the mediator-outcome relationship and d) no mediator-outcome confounder that is affected by the treatment. These are strong assumptions because in this setting the mediator cannot be randomized. While this assumption cannot be explicitly tested, a number of steps including well-informed experimental design, measurement of appropriate confounders and sensitivity analysis can be used to address this concern. This will be revisited in a subsequent section. Finally, it should be noted that the identification assumptions above are nonparametric and independent of the distribution and form of the outcome variables. Given assumptions of linearity with continuous outcome variables, causal mediation can be expressed simply as a pair of linear equations that can be solved with least squares: a mediator equation, in which the coefficient p captures the effect of the treatment on the mediator, and the outcome equation
$$Y = g_2 + r'A + qM + hAM + \boldsymbol{k}\boldsymbol{C} + e_2$$
where boldface indicates potential vector quantities, h represents an interaction term between the treatment and the mediator, g are intercepts and e are error terms.
Table: The main nomenclature used in CMA (after). h represents an interaction effect term (see text for details).
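As a worked illustration of how the linear equations map onto the effects defined earlier, the pair of models and the quantities they imply can be sketched as follows. The mediator-equation intercept $g_1$, covariate coefficient vector $\boldsymbol{j}$ and error term $e_1$ are labels assumed here for illustration and are not taken from the source; the derivation also assumes that sequential ignorability holds and that both linear models are correctly specified.

```latex
\begin{align*}
M &= g_1 + pA + \boldsymbol{j}\boldsymbol{C} + e_1 \\
Y &= g_2 + r'A + qM + hAM + \boldsymbol{k}\boldsymbol{C} + e_2 \\[6pt]
\mathrm{ACME}(a) &= E[Y(a, M(1)) - Y(a, M(0))] = (q + ha)\,E[M(1) - M(0)] = p\,(q + ha) \\
\mathrm{ADE}(a)  &= E[Y(1, M(a)) - Y(0, M(a))] = r' + h\,E[M(a)]
                  = r' + h\,(g_1 + pa + \boldsymbol{j}E[\boldsymbol{C}]) \\
\mathrm{ATE}     &= \mathrm{ACME}(1) + \mathrm{ADE}(0) = \mathrm{ACME}(0) + \mathrm{ADE}(1)
\end{align*}
```

With no interaction (h = 0) the two mediation effects coincide at the familiar product pq, and the direct effect reduces to r'.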
APPLICATION OF CAUSAL MEDIATION ANALYSIS TO PSYCHEDELIC RCTS
As described, causal mediation analysis allows "front-door" identification of a causal pathway between treatment and outcome via objectively measured biomarkers (the pq pathway). The key argument here is that use of response biomarkers to link treatment and outcome via CMA would ameliorate concerns regarding de-blinding in psychedelic RCTs. The underlying assumption is that biological readouts of low-level and specific mechanistic processes are less likely to be contaminated by placebo effects than the subjective scales used for primary outcome measures in psychedelic RCTs. Evidence for mediation of clinical effects by objectively measured biological readouts would constitute strong evidence of a clinical effect of psychedelics beyond non-specific placebo responses. Is the assumption that biological readouts are less likely to be contaminated by placebo response reasonable? Here specific suggestions are made regarding mediator variable selection and general trial design for psychedelic RCTs that would make this assumption reasonable (although not certain).
a) Respect for temporal ordering. Mediation analysis implicitly assumes temporal ordering, in that treatment must precede measurement of the mediating process, with clinical outcome measured subsequently to the mediator. In terms of application to psychedelic RCTs, potential mediators are probably best measured during the intervention or immediately afterwards (hours/days), whereas clinical outcome measurement could be delayed significantly, typically by weeks. The closer in time mediator measurement comes to outcome measurement, the greater the risk of mediator-outcome confounding. If measured concurrently with outcome assessment, the mediating variable is better interpreted as a biological state variable than a mediating mechanism.
b) Mediator variable selection based on mechanism.
Mechanistic understanding of the neuropharmacology of psychedelics has advanced considerably in recent years, from receptor binding through cell signaling to effects in humans that can be measured with functional magnetic resonance imaging, magneto/electroencephalography and blood-based biomarkers, amongst others. Psychedelics have been shown to have powerful effects on these measurement modalities, making them potentially useful as tools to measure mediating variables. Selection of specific mediator variables based on both existing data and sound theoretical bases, such as neuroplasticity hypotheses or the REBUS hypothesis, would be preferable. Notably, the linear equations of CMA (Equations 19/20) are easily implemented in most neuroimaging software, and it should be possible to generate statistical parametric maps of mediation effects.
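To make the idea of a mediation map concrete, the following is a minimal sketch of a mass-univariate analysis, assuming the mediator is available as one value per participant per voxel. The function name `mediation_map`, the array shapes and the synthetic effect sizes are all hypothetical choices for illustration; covariates are omitted for brevity, and a real analysis would add them along with inference and multiple-comparison correction.

```python
import numpy as np

def mediation_map(A, M, Y):
    """Voxelwise point estimates of mediation effects (illustrative sketch).

    A: (n,) treatment indicator coded 0/1
    M: (n, V) mediator values, one column per voxel
    Y: (n,) clinical outcome
    Returns a dict of (V,) arrays: p, ACME(0) = p*q and ACME(1) = p*(q + h).
    """
    n, V = M.shape
    ones = np.ones(n)
    # Mediator model fitted for all voxels at once: M_v = g1 + p*A + e1
    coef_m, *_ = np.linalg.lstsq(np.column_stack([ones, A]), M, rcond=None)
    p = coef_m[1]
    q = np.empty(V)
    h = np.empty(V)
    for v in range(V):
        # Outcome model per voxel: Y = g2 + r'*A + q*M_v + h*A*M_v + e2
        X = np.column_stack([ones, A, M[:, v], A * M[:, v]])
        coef_y, *_ = np.linalg.lstsq(X, Y, rcond=None)
        q[v], h[v] = coef_y[2], coef_y[3]
    return {"p": p, "acme0": p * q, "acme1": p * (q + h)}

# Synthetic data (assumed values): only voxel 0 responds to treatment
# and carries the effect on the outcome, so the true ACME there is 1.0 * 0.8.
rng = np.random.default_rng(0)
n, V = 400, 50
A = rng.integers(0, 2, n).astype(float)
M = rng.normal(size=(n, V))
M[:, 0] += 1.0 * A
Y = 0.5 * A + 0.8 * M[:, 0] + rng.normal(scale=0.5, size=n)

maps = mediation_map(A, M, Y)
peak_voxel = int(np.argmax(np.abs(maps["acme0"])))
print("voxel with largest |ACME(0)|:", peak_voxel)
```

The mediator model is solved for every voxel in a single least-squares call, while the outcome model is looped because its design matrix changes with each voxel.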
c) Selection of clinical outcome measures.
The proposed approach would typically occur in RCTs of psychedelics, potentially being conducted by industry sponsors for regulatory approval. Naturally, there will be a desire for the primary outcome measure of these trials to be standard psychiatric scales (e.g., CAPS-5, MADRS, HAM-D, HAM-A). This is understandable pragmatically, but it is important to note that these scales typically rely on sum scores which combine highly heterogeneous symptoms that may be biologically non-specific. Arguably, the clinical outcome measures for CMA should be focused on domain-specific symptom measurements that could sit alongside traditional primary outcome measures for the indication under study. This would provide better alignment between mechanisms and symptoms and indeed could start to bring better congruency with frameworks such as the NIH Research Domain Criteria matrix (RDoC) and the Hierarchical Taxonomy of Psychopathology (HiTOP).
d) Experimental design to control confounds. Sound experimental design can be used to reduce potential confounds of the M-Y relationship. For example, if an outcome measure is known to be sensitive to the diurnal cycle (for example, BDNF as a marker of plasticity) then this should be controlled experimentally where feasible.
e) Inclusion of confounding variables. Alternatively, if potential confounds cannot be controlled experimentally then attempts should be made to measure them explicitly so that they can be used as covariates in analytical models. Notably, baseline measurements of both M and Y can be included in analyses. Previous work has shown that this not only gives more precision than using change scores as inputs to CMA but also removes the assumption of independence between baseline values and their change over time.
f) Choice of comparator condition. A key consideration in psychedelic RCT design is which comparator condition should be chosen.
Previous work has advocated for the use of active placebo and dose-response controls, while the use of waitlist, delayed treatment and crossover designs was not encouraged. The current work does not affect those arguments. One caveat for the current proposal is that ideally the comparator condition would not have effects on the mediator. If the comparator does affect the mediator (relative to baseline) then this might have the effect of reducing the value of p and the overall sensitivity of the CMA approach.
g) Standard statistical concerns apply regarding pre-registration of analyses, dealing with multiple-comparison issues and use of intention-to-treat analyses. Power calculations for Phase 2/3 trials can be conducted and CMA is amenable to both frequentist and Bayesian approaches. Recently a set of guidelines has been published for how to report CMA in controlled trials.
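The suggestions about baseline covariates and frequentist estimation can be sketched together as follows. This is a minimal illustration under assumed conditions, not a prescribed analysis: the helper names `fit_acme` and `bootstrap_ci`, the synthetic coefficients and the choice of a percentile bootstrap are all assumptions made here. Baseline mediator and outcome values enter both regressions as covariates rather than being used to form change scores.

```python
import numpy as np

def fit_acme(A, M, Y, C):
    """Least-squares point estimates of ACME(0) and ACME(1).

    C is an (n, k) array of pre-treatment covariates, here baseline M and Y.
    """
    ones = np.ones(len(A))
    # Mediator model: M = g1 + p*A + covariates + e1
    bm, *_ = np.linalg.lstsq(np.column_stack([ones, A, C]), M, rcond=None)
    # Outcome model: Y = g2 + r'*A + q*M + h*A*M + covariates + e2
    by, *_ = np.linalg.lstsq(np.column_stack([ones, A, M, A * M, C]), Y, rcond=None)
    p, q, h = bm[1], by[2], by[3]
    return np.array([p * q, p * (q + h)])

def bootstrap_ci(A, M, Y, C, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling participants with replacement."""
    rng = np.random.default_rng(seed)
    n = len(A)
    draws = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)
        draws[b] = fit_acme(A[idx], M[idx], Y[idx], C[idx])
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return lo, hi

# Synthetic trial (assumed values): true p = 1.0, q = 0.7, h = 0, so ACME = 0.7
rng = np.random.default_rng(1)
n = 300
A = np.repeat([0.0, 1.0], n // 2)
M0 = rng.normal(size=n)                      # baseline mediator
Y0 = rng.normal(size=n)                      # baseline outcome
M = 0.6 * M0 + 1.0 * A + rng.normal(size=n)
Y = 0.5 * Y0 + 0.3 * A + 0.7 * M + rng.normal(size=n)
C = np.column_stack([M0, Y0])

est = fit_acme(A, M, Y, C)
lo, hi = bootstrap_ci(A, M, Y, C, n_boot=500)
print("ACME(0) estimate:", round(float(est[0]), 2),
      "interval:", round(float(lo[0]), 2), "to", round(float(hi[0]), 2))
```

The bootstrap is used here because the sampling distribution of a product of coefficients is generally not normal; a Bayesian treatment would replace the resampling loop with posterior draws of the same coefficients.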
CAUSAL MEDIATION ANALYSIS -SYNTHETIC DATA SCENARIOS
For illustrative purposes, four simulation scenarios using synthetic data are displayed in the accompanying figure. Scenario a) shows a causally mediated effect (and a direct effect) in the treatment group but not in the control group, indicating the presence of an interaction (h), which allows claims of mechanistic specificity of the intervention (relative to the comparator condition) to be made. Overall, in this scenario the treatment affects both the mediator and the outcome. In scenario b) mediation is present in both groups and the treatment affects both mediator and outcome. While in this scenario mechanistic specificity cannot be claimed, there is still evidence for causal mediation by the mediating variable. This scenario might represent a case where the placebo and the treatment both affect the mediator, albeit to different extents. Scenarios a) and b) both represent cases where causal mediation effects have been found linking treatment to outcome, and they provide evidence for a link between treatment and outcome even if blinding were to fail in the trial. In scenario c) the treatment affects the outcome and the mediator variable correlates with the outcome; however, there is no mediation, as the treatment does not affect the mediator (p=0). Finally, in d), while the treatment affects the mediator and the outcome, there is no relationship between mediator and outcome (q=0) and hence no mediation, but a direct effect exists. That is, while the biological mediator has changed, this does not appear to be affecting the changes in the outcome variable. In reality, empirical data will never look like those in the simulation examples, because in real data there are likely to be significant random participant effects that will affect post-treatment outcome measures. While difficult to illustrate, baseline measurements that capture these effects can be easily included as covariates in linear models (Equations 19/20).
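The four scenarios can be reproduced in outline with a short simulation. The sketch below is not the code behind the published figure; the coefficient values in `SCENARIOS`, the sample size and the noise levels are assumptions chosen only so that each scenario shows the pattern described in the text.

```python
import numpy as np

# Assumed coefficients for each scenario (illustrative, not from the source):
# p = treatment -> mediator, q = mediator -> outcome,
# h = treatment-by-mediator interaction, r = direct treatment coefficient.
SCENARIOS = {
    "a": dict(p=1.0, q=0.0, h=0.8, r=0.5),  # mediation only under treatment
    "b": dict(p=1.0, q=0.8, h=0.0, r=0.5),  # mediation in both groups
    "c": dict(p=0.0, q=0.8, h=0.0, r=0.5),  # treatment does not move the mediator
    "d": dict(p=1.0, q=0.0, h=0.0, r=0.5),  # mediator unrelated to the outcome
}

def simulate(p, q, h, r, n=2000, seed=0):
    """Generate treatment, mediator and outcome from the linear models."""
    rng = np.random.default_rng(seed)
    A = np.repeat([0.0, 1.0], n // 2)        # balanced allocation
    M = p * A + rng.normal(size=n)
    Y = r * A + q * M + h * A * M + rng.normal(scale=0.5, size=n)
    return A, M, Y

def estimate(A, M, Y):
    """Fit both regressions by least squares and form the mediation effects."""
    ones = np.ones_like(A)
    bm, *_ = np.linalg.lstsq(np.column_stack([ones, A]), M, rcond=None)
    by, *_ = np.linalg.lstsq(np.column_stack([ones, A, M, A * M]), Y, rcond=None)
    p, q, h = bm[1], by[2], by[3]
    return {"p": p, "q": q, "h": h, "acme0": p * q, "acme1": p * (q + h)}

results = {name: estimate(*simulate(**pars)) for name, pars in SCENARIOS.items()}
for name, est in results.items():
    print(name, {k: round(float(v), 2) for k, v in est.items()})
```

Under these assumed values, scenario a) should show ACME(1) well away from zero with ACME(0) near zero, scenario b) should show both mediation effects away from zero, and scenarios c) and d) should show both near zero because either p or q is absent.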
With real data, analysts can perform sensitivity analysis to check how sensitive the obtained results are to violations of the sequential ignorability assumption. Finally, it can be noted that for causal mediation to be identified, the first three of the Prentice criteria (Equations 7-9) must be satisfied. See text for further explanation.
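One way to build intuition for why such sensitivity analysis matters is to simulate a violation directly. The sketch below is a simulation-based illustration only and is not the formal sensitivity analysis implemented in published CMA software; the confounder strength `gamma`, the function name `naive_acme` and all coefficient values are assumptions. An unmeasured variable U influences both mediator and outcome, and the naive estimate of the mediation effect (fitted without U and, for simplicity, without the interaction term) drifts away from the true value as `gamma` grows.

```python
import numpy as np

def naive_acme(gamma, p=1.0, q=0.5, r=0.3, n=5000, seed=0):
    """Estimate p*q while ignoring an unmeasured mediator-outcome confounder U."""
    rng = np.random.default_rng(seed)
    A = np.repeat([0.0, 1.0], n // 2)
    U = rng.normal(size=n)                     # unmeasured confounder
    M = p * A + gamma * U + rng.normal(size=n)
    Y = r * A + q * M + gamma * U + rng.normal(size=n)
    ones = np.ones(n)
    # Both models omit U, which is what violates sequential ignorability
    bm, *_ = np.linalg.lstsq(np.column_stack([ones, A]), M, rcond=None)
    by, *_ = np.linalg.lstsq(np.column_stack([ones, A, M]), Y, rcond=None)
    return bm[1] * by[2]

gammas = [0.0, 0.25, 0.5, 1.0]
estimates = [float(naive_acme(g)) for g in gammas]
for g, est in zip(gammas, estimates):
    print(f"gamma={g:.2f}  naive ACME={est:.2f}  (true ACME=0.50)")
```

Because treatment is randomized, the estimate of p stays unbiased in this setup; the distortion enters entirely through q, which is the coefficient that randomization does not protect.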
GENERAL DISCUSSION AND IMPLICATIONS
Since our first paper raising issues with blinding and expectancy confounds in psychedelic RCTs, these issues have been discussed in a number of academic articles, at conferences and on social media. Here some common responses to the issues raised in that paper are addressed. The causal mediation approach outlined here is to some extent a novel solution to the blinding/expectancy issues raised in that initial paper.
ON THE APPROPRIATENESS OF THE DOUBLE-BLIND RCT METHODOLOGY FOR PSYCHEDELIC MEDICINE
Schenberg proposes that the case of psychedelics shows that the double-blind RCT methodology is not epistemically fit for psychiatric interventions. He asks whether the objectivity of the double-blind RCT approach is coherent when both the psychoactive effects of psychedelics and their potential effects on symptomology are inherently subjective. The current proposal provides a potential solution to this issue. As a counter-argument to the position of abandoning double-blind RCTs for psychedelics, I argue that this is potentially dangerous with no rigorously thought-out replacement evidence-gathering methodology in place. Indeed, there are several potentially harmful downstream consequences of not being able to accurately estimate treatment effects of psychedelic interventions and, as such, to discriminate treatment effects from potential placebo effects. Firstly, if the double-blind approach is abandoned for psychedelics, psychedelic drugs could potentially be applied across an ever-expanding set of clinical indications. It is difficult to know where this expansion might stop; however, it might extend well beyond the scope of any true efficacy of psychedelics (if any). Rigorously conducted RCTs provide an objective framework for stopping inappropriate indication expansion and prevent patients being exposed to ineffective interventions, which is both unethical and potentially harmful.
Secondly, if weak evidence from a functionally unblinded intervention is the minimum standard of evidence required to introduce a medical intervention into clinical practice then who knows what spurious interventions from alternative medicine might attempt to be approved as a standard treatment. Thirdly, as we have argued before, approving interventions where treatment effects may be contaminated by expectancy effects, could lead to the concerning situation where effect sizes may be unstable over time. Usually, estimates of efficacy of medical interventions are assumed to be stable over time and not influenced by the whim of whatever the current social zeitgeist towards the treatment is -which alters participant expectancies. This issue should be particularly concerning to both regulators and payers -as these stakeholders generally assume that treatment effects are temporally stable.
ON THE REGULATORY IMPLICATIONS OF BLINDING FAILURES IN PSYCHEDELIC MEDICINE
In their provocatively titled paper "Expectancy in placebo-controlled trials of psychedelics: if so, so what?" Butler et al. argue that "Placebo-controlled RCTs are not a perfect fit for all therapeutics, and problems in blinding should not automatically disqualify medications from licensing decisions." It is important then to consider what the regulatory standards are for licensing new treatments. In Australia the Therapeutic Goods Administration recently approved the prescribing of MDMA for PTSD and psilocybin for treatment-resistant depression against the view of their own scientific assessing committee, with next to no safety or efficacy data nor a submitted data package, in my view a frighteningly low bar. While for some regulators the standards are not clear, in the USA the FDA provides a clear regulatory standard on how substantial evidence of effectiveness for drug products should be established. In particular, "To establish a drug's effectiveness, it is essential to distinguish the effect of the drug 'from other influences, such as spontaneous change in the course of the disease, placebo effect, or biased observation'" (pg 4, emphasis added). Effectively, unblinded psychedelic RCTs which fail to distinguish treatment effects from placebo effects, as demonstrated in the DAG figure in this paper, therefore fail to meet the FDA's regulatory standard. The current arguments and approaches do have relevance to the active debates around de-blinding effects in other areas of psychiatry and beyond, but it is beyond the scope of the current work to address these issues. It is also interesting to note that most medical regulators worldwide have no legal remit to approve psychotherapies and as such may have little experience in assessing therapeutic models in which psychotherapies interact with drug interventions. This seems particularly problematic given that for psychedelic-assisted therapy the drug is only one part of the intervention.
ON WHETHER PSYCHEDELIC RCTS ARE AN EXTREME CASE OF EXPECTANCY EFFECTS
Butler et al. present the example of surgical trials as a counterpoint to the argument that double-blind trials are necessary to demonstrate efficacy, as double-blind trials are difficult to conduct in surgery (and some other fields of medicine). While that argument has some persuasiveness, it is likely that no participant in a trial of surgery is enrolling by searching the internet, clinical trial registries and social media, or by actively responding to advertisements. Mostly, one presumes, patients in general medicine trials like surgery are already in hospital settings due to an injury or hospitalizing disease and are being offered entry to a trial by their usual care provider. Often these care providers are part of large clinical trial networks. Indeed, many trial sites that study psychedelics (including my own) keep databases of participants keen to enroll in psychedelic trials, and moreover I would venture that there is no psychedelic researcher who does not receive regular desperate emails from patients looking for psychedelic therapy. A cruel irony of attempting to measure treatment responses in a clinical trial of any design is that no trial is in fact studying a true representation of the disease population, but rather that subset of the population willing to engage in experimental therapy. As such, while all clinical trials suffer from self-selection bias to some degree, in the case of psychedelic medicine this bias could be particularly strong. It is not unreasonable to suggest that the expectancy effects for such a subset of the population would be particularly strong, leading to potentially severe over-estimation of effect sizes via the ExB term in the Figure. Such accentuated effect sizes may simply be a façade and not an accurate representation of the clinical effects when deployed in the wider population.
With lower expectancy may come lower effect sizes, or indeed no true underlying clinical effect, or perhaps even worse, harmful effects as less keen patients receive the intervention. It is unclear whether large swathes of patients should be prematurely exposed to an intervention with the potential that treatment effects could have such very limited generalisability due to clinical trials being conducted on a sub-population with particularly high expectancy.
CONCLUSION
Psychedelics have been used in human cultures for thousands of years, with serious medical research into their clinical use reaching an initial apex in the 1950s and 60s. Due to societal pressures and restrictions they had been seldom studied in medicine for nearly 60 years, and the field as such is still nascent. Oram notes that the decline in psychedelic research from that time was due to the difficulties researchers faced in trying to fit LSD psychotherapy into the double-blind paradigm that the FDA drug regulations now treated as the gold standard for clinical research. Novak argues that, contrary to the belief of many, the initial crackdown on psychedelic research in the 1960s was not due to social concerns and the "war on drugs" but rather due to the lack of care and rigour taken by medical scientists of the time. In the present time, if psychedelic therapies are indeed truly efficacious then they will remain so even if their introduction is not immediate. While the pressure from commercial sponsors, advocacy groups and potentially even legislators will be to accelerate their introduction to clinical practice, rigorously conducted, time-consuming research should be prioritised. With respect to the issue of blinding and expectancy in psychedelic medicine, pioneering LSD researcher Sidney Cohen presciently wrote nearly 60 years ago: "A control group of patients matched as well as possible with the LSD patients must be given the identical treatment except that LSD is not used. A placebo or drug with some minor activity identical in appearance would have to be substituted. It is quite impossible to keep the therapist in the dark about who is getting the LSD because of its pronounced action. Will he invest as much energy and dedication to his non-LSD patients? The patients themselves will quickly know whether they have received LSD or not. Their expectations of its benefits will alter their therapeutic set.
These difficulties and others are the reasons why a decisive test of the efficacy of LSD has not yet been performed. The problems are great but surmountable. Hopefully, this investigation will be done one day" (p. 199). Unfortunately, such a decisive test has not yet been conducted with psychedelic investigational medicines. The present article has aimed to provide one potential methodological approach by which such a test could be conducted using biological knowledge, measurement techniques and statistical approaches that were not available in the time of pioneering researchers such as Sidney Cohen.