This observational study (n=29) examined daily voice journals before and after a 5-MeO-DMT retreat and found that speech changed towards more cognitive and less social language, with altered voice quality. Baseline speech patterns also helped predict preparedness, emotional breakthrough and later well-being.
Background
5-Methoxy-N,N-dimethyltryptamine (5-MeO-DMT), a potent, short-acting psychedelic, induces profound shifts in cognition, affect, and self-awareness. Because language explicitly expresses these domains and voice implicitly conveys them, both may serve as potential ‘biomarkers’ of behavioural change. Aim: This study introduces a novel framework for analysing baseline language and vocal features, pre- to post-psychedelic changes, and assessing their potential to predict subjective experiences and psychological outcomes.
Methods
Daily voice journals from 29 participants were collected via “RetreatBot” for 2 weeks before and after 5-MeO-DMT (1 × 12 mg). Transcripts were analysed using natural language processing (bag-of-words for vocabulary; transformer model for textual affect), and acoustic features (e.g. pitch, jitter, shimmer) were extracted to assess vocal dynamics.
Results
Following 5-MeO-DMT, speech markers revealed increased cognitive language, decreased social words, and altered voice quality (increased jitter/shimmer). Baseline speech patterns predicted psychedelic preparedness, emotional breakthrough, and post-experience well-being.
Conclusion
This first longitudinal analysis of speech markers surrounding a psychedelic retreat reveals a shift from external focus to introspection. Speech markers predicted and tracked psychological transformation surrounding the 5-MeO-DMT retreat experience, establishing vocal journaling as a valuable framework for monitoring changes during the “preparation” and “integration” periods.
Kuc and colleagues note that psychedelics, including the short-acting compound 5-MeO-DMT, are being studied for their potential effects on cognition, emotion, self-awareness, and mental health. The authors argue that language and voice are especially promising because they can express psychological change directly and indirectly, yet no previous longitudinal studies had tracked naturally produced speech and acoustic features before and after psychedelic use in everyday settings. They therefore frame the paper as addressing a gap in ecological, time-resolved evidence about how psychedelic experiences may be reflected in language and vocal behaviour. The study set out to develop and test a framework for analysing baseline speech, pre-to-post psychedelic change, and the extent to which speech markers might predict subjective experience and later psychological outcomes. In particular, the authors aimed to examine whether vocabulary, emotional expression, and voice quality changed around a single 5-MeO-DMT session, whether these changes unfolded over time, and whether pre-session speech features related to preparedness, acute subjective effects, emotional breakthrough, and post-experience well-being. The paper presents this as the first longitudinal investigation of both textual and acoustic speech markers around a psychedelic retreat, using daily voice journals collected through a chatbot-based ecological momentary assessment approach.
The researchers conducted a naturalistic study at the Tandava Retreat Centre with healthy adult participants recruited globally through the F.I.V.E. platform, social media, and word of mouth. The final cohort comprised 29 participants, most of whom were White/Caucasian and non-religious, and all had prior experience with 5-MeO-DMT to varying degrees. Participants attended one of six three-day retreats. Day 1 involved orientation and briefing, Day 2 involved the 5-MeO-DMT session and primary data collection, and Day 3 focused on integration through facilitator-guided discussion. On the dosing day, participants received a single 12 mg dose of synthetic 5-MeO-DMT by vaporisation under supervised, standardised conditions. The compound was inhaled in one breath while participants were in a semi-supine position on a padded recliner. The retreat setting was described as controlled but naturalistic rather than laboratory-based. Speech data were collected via RetreatBot, a custom Telegram chatbot designed to prompt daily voice journals from seven days before dosing to seven days after, with analysis extended to a 28-day window centred on the session because engagement continued beyond the planned period. The chatbot used a rule-based decision-tree structure rather than an adaptive conversational AI. Voice entries were transcribed automatically and checked manually, with sensitive information redacted. The researchers analysed transcribed text using LIWC-22 bag-of-words categories and a RoBERTa model trained on GoEmotions to classify sentence-level emotional content. Acoustic features were extracted with openSMILE/eGeMAPS, yielding 88 voice parameters including pitch, jitter, and shimmer. For analysis, the authors used linear mixed-effects models to compare pre- and post-session language features, paired t-tests for acoustic means, and day-by-day as well as weekly temporal analyses. 
They also examined whether pre-dosing language predicted five psychometric outcomes: psychedelic preparedness, ego dissolution, oceanic boundlessness, emotional breakthrough, and well-being. Pearson correlations were corrected for multiple comparisons using the Benjamini-Hochberg false discovery rate procedure, and significant features were entered into regression models. Ridge regression with leave-one-out cross-validation was used to test whether multivariate prediction generalised. In addition, principal component analysis was applied to pre-dosing emotion scores to derive broader emotional profiles. Sensitivity analyses tested whether findings were explained by participants’ prior 5-MeO-DMT exposure. The study had ethical approval, participants gave informed consent, and the researchers used safeguards including manual transcription checks and anonymisation of the voice journals.
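The Benjamini-Hochberg correction mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the sketch only shows the standard step-up adjustment that such a correction typically applies.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four features (illustrative only).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In a study like this one, the adjustment would be applied separately within each outcome, so that the number of tests `m` equals the number of features tested against that outcome.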
The study collected 288 unique voice journals from 29 participants, comprising 4127 sentences and just over 506 minutes of audio. On average, each participant contributed 10 voice journals, although engagement varied. Most participants provided both pre- and post-ceremony recordings, but sample sizes differed somewhat across analyses because some questionnaires or recordings were missing. Following 5-MeO-DMT, several clear language shifts were observed. Cognitive language increased, driven by more use of cognitive process, insight, tentative, and certitude words. Social language decreased, particularly words related to communication and social referents. In the linguistic category, adverb use increased and conversational markers declined. Time-related language shifted towards the past and away from the future. Emotion classification with the RoBERTa model showed reduced excitement, nervousness, and fear, alongside increases in admiration, relief, joy, and realisation. In other words, the post-session speech sounded less anticipatory or anxious and more reflective or meaning-oriented. Acoustic analyses found significant post-dosing increases in average local jitter and shimmer, alongside a decrease in normalised jitter variability. When the authors ranked effects by Cohen’s d, the largest changes were the decrease in excitement, the decrease in jitter variability, and the increase in past-focused language. Across the full set of features, 13 increased and 9 decreased after dosing. The day-by-day analyses showed a crossover pattern: social language was initially above baseline and then dropped after dosing, while cognitive language rose above baseline and stayed elevated into the second post-dosing week. Vocal deviation analyses showed that pitch, jitter, and shimmer followed distinct trajectories around the retreat. Pitch deviations were more common before dosing and became less frequent afterwards. 
Jitter and shimmer showed more pronounced post-session changes, with high deviations peaking shortly after the session and low deviations becoming less common in the post-period. The authors report that these temporal patterns were broadly consistent in individual-level analyses, although shimmer was more variable and did not always reach corrected significance. For prediction, pre-dosing language was most clearly related to psychedelic preparedness. Positive tone, work-related language, and lifestyle references were each correlated with higher preparedness scores, but only positive tone remained an independent predictor in the joint regression model. In the principal component analysis of pre-session emotions, PC1 reflected positive emotional engagement and higher scores on this component predicted higher preparedness and better well-being. PC2 reflected negative emotional states but did not predict outcomes. PC3 reflected reflective integration and predicted stronger emotional breakthrough. The ridge regression models using all LIWC features did not generalise well to held-out participants, suggesting overfitting. The sensitivity analyses indicated that the main pre-post effects were not explained by lifetime 5-MeO-DMT use. No baseline speech features were significantly associated with prior lifetime use after correction, the direction of the main effects was preserved in both lower- and higher-experience groups, and the effects remained after including lifetime use as a covariate. Only one interaction, past-focused language, was moderated by prior experience, with a larger post-session increase among more experienced users.
Kuc and colleagues interpret the findings as evidence that naturalistic speech and voice can capture psychologically meaningful change around a psychedelic retreat. They argue that the post-session rise in cognitive language and fall in social language suggest a shift away from externally oriented discourse towards introspection, self-examination, and meaning-making during the two weeks after dosing. The authors present the observed increase in past-focused language and the reduction in future-oriented expression as consistent with a move from anticipation to retrospective processing. They also interpret the emotional-language findings as showing that speech did not simply become more positive overall, but changed in the specific kinds of emotions being expressed. Anticipatory emotions such as excitement, nervousness, and fear declined, whereas joy, admiration, and realisation increased. The authors link this to theories that psychedelics may promote insight and perspective change, and they note that these patterns may reflect an active integration process rather than a straightforward mood elevation. Similarly, the increase in jitter and shimmer is discussed as potentially reflecting a transient, psychologically malleable state rather than distress, especially since anxiety was not elevated in the sample. Relative to earlier research, the authors state that their work extends prior language-based psychedelic studies by moving beyond acute effects, retrospective trip reports, or single-time-point therapy transcripts. They argue that the study adds two major advances: longitudinal, ecologically valid voice journalling and simultaneous analysis of lexical, emotional, and acoustic markers. They also suggest that these multimodal methods may be useful for tracking preparation and integration processes in future psychedelic research. 
In discussing prediction, the authors say that positively valenced pre-dosing language appears to index readiness for the experience and may be associated with better outcomes. They further interpret the PCA findings as indicating that a broader pattern of positive emotional engagement predicts both preparedness and well-being, while reflective emotional complexity may facilitate emotional breakthrough. They suggest that some experiential outcomes are not well captured by simple word categories, which is why transformer-based emotion modelling added value. The main limitations they acknowledge are the small, self-selecting sample; the participants’ extensive prior psychedelic experience; the fact that the sample was made up of healthy volunteers rather than clinical patients; and the retreat context, which may have shaped responses independently of the drug. They also note the open-label design, absence of a control group, possible limitations of dictionary-based and model-based language tools, and the lack of follow-up assessments to establish durability. The authors stress that the findings are hypothesis-generating and require replication in larger, blinded, controlled studies. They also note that journalling itself could have influenced the observed changes. For future work, the authors propose larger studies with active controls or placebo arms, comparisons with non-psychedelic retreats or other intense experiences, multimodal links to neuroimaging and physiological measures, and longer follow-up to determine persistence. They suggest that automated speech monitoring could eventually help guide preparation, support integration, and personalise psychedelic-assisted care.
The authors conclude that everyday language and vocal patterns are sensitive indicators of psychological change surrounding a 5-MeO-DMT retreat, although controlled studies are still needed to separate drug effects from contextual factors. They state that the observed move towards introspective, cognitively oriented speech, together with changes in voice quality, points to substantial shifts in self-related processing after a single retreat experience. They also conclude that longitudinal voice analysis may become a useful tool for mapping preparation and integration, and for providing future real-time support in psychedelic-assisted therapy or retreat settings.
The resurgence of interest in psychedelics has led to growing efforts to understand how these substances affect phenomenological experiences and how this knowledge can be applied to mental health treatments. The growing body of evidence suggests that, when administered in controlled settings, psychedelics can catalyse psychological transformation. At the molecular level, psychedelics are associated with increased (neuro)plasticity, as well as changes in brain network dynamics. At the experiential level, they alter fundamental aspects of consciousness (such as perception, self-awareness, and emotional processing), with the intensity and quality of the subjective experiences predicting the long-term clinical outcomes. 5-MeO-DMT is a naturally occurring tryptamine, found in the secretions of the Bufo alvarius (Incilius alvarius), the Sonoran Desert toad, and various plant species such as Anadenanthera peregrina, and also produced synthetically. Compared to psilocybin or lysergic acid diethylamide (LSD), 5-MeO-DMT stands out for producing very intense but relatively brief alterations in consciousness, typically lasting 15-20 minutes when vaporised. The subjective experience on 5-MeO-DMT is characterised by a rapid loss of sense of self and bodily awareness, often resulting in "mystical-type" experiences, defined by a sense of oneness, intuitive or revelatory qualities, and an ineffable, paradoxical, and sacred nature, together with temporal and spatial distortions. Distinctively, the 5-MeO-DMT experience is often described as "content-free," featuring sensory deprivation frequently depicted as immersion in all-white light or total darkness. Similar to other psychedelics, 5-MeO-DMT is being evaluated for its potential beneficial long-term effects on mental health and wellbeing. These outcomes are thought to be enhanced through adequate preparation prior to the session (psychoeducation, intention-setting, and establishing trust and comfort) and integration afterwards (e.g. guided reflection and emotional processing).
Modern views of complex psychological functions, like memory and emotional processing, suggest that they are supported by multiple networks, widely distributed throughout the whole brain. This is also true of the neurobiology of language, perhaps even more so. That is, humans use language ubiquitously, encountering about 150,000 words a day, beginning in utero. These words are intimately tied to all of our psychological processes, including sensorimotor and social-emotional processing, and cognitive processes like attention and memory. For example, language is used to direct attention to and arrange the world categorically, organising sensory experiences into colours (e.g. "blue"), objects ("banana"), and emotions ("love"). It also organises our memories into "higher-level" constructs like the "narrative self" using words like "I" and "me". This explains why language has a broad neurobiological reach: word processing is connected to widely distributed regions of the brain involved in all of these processes. Because of its pervasive role in scaffolding a wide range of psychological processes, language may serve as a valuable proxy for understanding and predicting mental health. Written diaries, for example, often capture preoccupations and facets of an individual's inner dialogue, proving valuable in clinical psychology. More recently, ecological momentary assessments (EMAs), methodologies involving repeated sampling of experiences in real-world environments, have gained traction for their ability to capture naturalistic data outside the typical research setting, thereby minimising recall bias. These EMAs can be administered through smartphone apps, prompting participants to respond to questionnaires or, increasingly, to provide open-ended text responses in real time. EMAs are now used both as a monitoring tool and as an intervention in mental health contexts, illustrating their versatility in capturing evolving linguistic markers of psychological well-being.
A common analytical approach to studying language use is the bag-of-words method, which treats language as a collection of individual words (without considering grammar or word order). Tools like Linguistic Inquiry and Word Count (LIWC), which capture the frequency of specific words in text, have demonstrated that word category frequencies can serve as indicators of various mental health conditions. For instance, depression correlates with increased use of absolutist terms and self-referential language (Al-Mosaiwi and …), psychological well-being is reflected in patterns of emotional word use and cognitive processing terms, stress responses manifest in shifts in function words and emotional expression, and personality traits are marked by distinctive word choice and grammatical structures. While many of these findings stem from static text analyses, language-based EMAs expand these insights into real-world, real-time environments, for example, aiding in the understanding of social aspects of alcohol consumption. Alongside the word count approaches, the advent of transformer-based large language models (LLMs) has opened new frontiers in language analysis. These models excel at contextual understanding and scaling to vast datasets, making them particularly powerful for capturing subtle semantic and pragmatic nuances. This is especially relevant in psychology, where factors such as register, metaphor, or sarcasm can reflect deeper emotional and cognitive states. Recent applications of LLMs in mental health research include enhancing online psychological consultations, classifying mental health conditions like depression or anxiety, monitoring social media language for broader mental health surveillance, or detecting crisis signals in chat-based interactions. Moreover, LLMs, which operate solely on language, can convincingly simulate diverse psychological profiles, leading users to empathise with them and perceive these models as psychologically similar to themselves.
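As a rough illustration of the bag-of-words idea described above, the sketch below counts how often words from a category appear in a transcript and reports each category as a percentage of all words. The word lists in `TOY_CATEGORIES` are hypothetical stand-ins and are not the LIWC dictionary, which is proprietary and also handles word stems and many more categories.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only (not the LIWC word lists).
TOY_CATEGORIES = {
    "cognitive": {"because", "know", "if", "felt"},
    "social": {"they", "friends", "talk", "she"},
}


def category_percentages(text, categories=TOY_CATEGORIES):
    """Percentage of tokens in `text` that fall into each word category."""
    # Word order and grammar are ignored: only token counts matter.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter(tokens)
    return {
        name: 100.0 * sum(counts[word] for word in words) / len(tokens)
        for name, words in categories.items()
    }


example = category_percentages("I know they talk because friends talk")
print(example)
```

Expressing counts as a percentage of total words makes journals of different lengths comparable, which matters when recordings vary widely in duration.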
These examples underscore the significant role of language in revealing psychological processes and the growing sophistication of language modelling methods in uncovering them. In addition to lexical and semantic analyses, examining vocal features provides a complementary approach to understanding psychological states, offering objective, physiological markers that may reflect both acute and sustained changes in mental states. Fundamental frequency, typically measured in Hertz (Hz), reflects emotional arousal and stress levels; jitter, measured as a percent deviation from normal periodicity, indicates irregular changes in sound quality in a short time; and shimmer reflects irregular variation in amplitude. These parameters have been linked to a range of psychological conditions: lower vocal variability, reduced pitch, and changes in loudness and tempo are associated with depression, while increased pitch variability has been observed in social anxiety disorder. Nevertheless, methodologies across studies remain heterogeneous, both in analytical approaches and data sources (e.g. online posts, diaries, interviews). Given that language use evolves over time, across locations, and varies by contexts and populations, the findings from one dataset may not generalise well to another, especially with small population sizes. To ensure interpretability and relevance, language models must be attuned to the linguistic and contextual nuances of the populations they aim to represent.
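To make the jitter and shimmer definitions above more concrete, here is a minimal sketch using one common "local" formulation: the mean absolute difference between consecutive glottal cycles, expressed as a percentage of the mean. The function name `local_perturbation` and the example cycle values are hypothetical, and toolkits such as openSMILE may compute these measures with different windowing and normalisation.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percent of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical measurements for four consecutive voice cycles.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # cycle durations in milliseconds
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitudes (arbitrary units)

jitter_percent = local_perturbation(periods_ms)    # period irregularity
shimmer_percent = local_perturbation(amplitudes)   # amplitude irregularity
# Fundamental frequency in Hz is the inverse of the mean period.
f0_hz = 1000.0 / (sum(periods_ms) / len(periods_ms))
print(jitter_percent, shimmer_percent, f0_hz)
```

The same function serves both measures because jitter and shimmer differ only in what is being compared across cycles: period length for jitter, amplitude for shimmer.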
Psychedelic experiences present a particularly compelling context for such tailored approaches, given their distinct effects on language and cognition. Building on the successful use of language-based methodologies to reveal underlying psychological processes, these techniques can likewise help us understand the mechanisms underlying these states and the mental health changes that may follow. Supporting this, a wide variety of studies have demonstrated that psychedelics alter language acutely, mostly by increasing ineffability and changing word semantics to be less predictable and more bizarre. A number of studies have also used language to analyse reconstructed post-experience reflections (e.g. "trip reports") and therapy transcripts to predict treatment outcomes. Linguistic differences across substances have been identified through analysis of trip reports, with one study using a bag-of-words approach to uncover variations in analytical language and emotional expression within the Erowid corpus. Another linked semantic similarities in trip reports to binding affinity profiles, bridging subjective language with pharmacology. A further study developed a mystical-experience dictionary, demonstrating correlations between mystical word usage and self-reported experience intensity. Large-scale natural language processing of 11,000 Reddit trip reports revealed that psychedelics were more strongly associated with inferred emotions, such as "Realization," "Curiosity," "Confusion," "Surprise," and "Amusement" compared to other drug classes. Beyond self-reports, sentiment analysis has been applied to therapy transcripts to predict treatment response. One study used a custom "emotional analysis" algorithm with machine learning, achieving 85% accuracy in predicting psilocybin treatment response from 12 baseline interviews. Another applied a transformer-based sentiment model to integration therapy sessions of 90 patients, achieving 85%-88% accuracy in predicting long-term outcomes at 3 weeks post-dosing.
Current study
In summary, language is intimately connected to psychological functioning, making it a powerful marker for understanding psychological and mental health changes. There is clear evidence that language is impacted by psychedelics, though there seem to be few consistent results across studies. Existing work has focused on acute effects during the psychedelic state or analysed therapy sessions or post-experience reconstructions at a single time point, which may not capture language use in genuinely naturalistic settings and thus raise concerns about demand characteristics and ecological validity. Moreover, no prior studies have explicitly tracked longitudinal changes in natural language before and after psychedelic ingestion, and none have examined acoustic measures in these contexts, let alone tracked them from pre- to post-experience. Filling these gaps is crucial for uncovering objective markers of psychedelic-induced shifts in cognitive, affective, and social cognition, potentially shedding light on their mechanism of action and guiding more nuanced therapeutic applications. Thus, we conduct the first longitudinal investigation into both textual and acoustic features of natural speech surrounding a single high dose of 5-MeO-DMT in a retreat setting. We focus on three key questions:
• Pre- to post-5-MeO-DMT language and voice changes: How do vocabulary, emotions conveyed in language, and voice shift following 5-MeO-DMT exposure? What patterns emerge in emotional expression, and how do these changes relate to psychological outcomes?
• Temporal dynamics: What patterns exist in language use across the study period? Are there systematic changes in vocal features before and after the experience, and can we identify markers that indicate successful preparation and integration?
• Predictive markers: Can pre-experience language patterns predict experience intensity? Which linguistic features correlate with optimal outcomes, and how do vocal characteristics differ between preparation and integration periods?
Three main hypotheses guide this investigation. First, we predict that the psychedelic experience will induce detectable changes in language use, with certain linguistic shifts predicting psychological outcomes and subjective experience qualities. Second, we hypothesise that pre-experience language patterns, including both vocabulary usage and emotional expression, can predict the nature of the psychedelic experience and subsequent well-being outcomes. Third, we expect to observe distinct patterns in vocal features across different periods surrounding the psychedelic experience, reflecting phases of preparation and integration. To test our hypotheses, we adopted an EMA approach using daily voice journals via a custom-built journaling bot (see Figure). We analysed data collected within a ±14-day window around the single 12 mg 5-MeO-DMT session. We extracted bag-of-words features from transcribed speech using LIWC, classified text emotions with a RoBERTa model trained on the GoEmotions dataset, and performed acoustic analysis (e.g. pitch, jitter, shimmer) using the openSMILE library.
Over a 28-day window centred on the 5-MeO-DMT session (±14 days), participants (n = 29) submitted a total of 288 unique voice journals comprising 4127 sentences. These recordings represent 506.07 minutes (8.44 hours) of audio content. On average, each participant contributed 10 ± 6 voice journals and 142 ± 115 sentences (mean ± standard deviation (SD); see Figure). The mean voice journal duration was 105.4 ± 68.8 seconds. Submission frequency peaked shortly before the retreat, with heightened activity in the days immediately surrounding dosing. Although participants were prompted to submit entries from days -7 to +7, we extended the analysis window to ±14 days based on observed engagement. Of the initial 29 participants, 25 provided both pre- and post-ceremony recordings, and 23 contributed at least 2 recordings in each period. Psychometric questionnaires were completed by 26 participants, with 24 providing at least 2 pre-dosing recordings plus questionnaire data. Sample sizes for each analysis are noted throughout. For more details on demographics, see Supplemental Table.
To examine individual changes (n = 29) in vocabulary use from before to after the 5-MeO-DMT experience, we analysed the LIWC categories that reflect linguistic, psychological, and social aspects of language. For full category definitions and base rates, see Supplemental Table. Overall, a number of significant linguistic shifts across multiple LIWC categories were observed in participants' voice journals after the 5-MeO-DMT experience (Figure, left panel; for full statistical results see: Supplemental Table). Cognitive word use increased significantly (Coef.: 1.96, false discovery rate-adjusted p-value (pFDR) < 0.001), driven by greater frequency of cognitive processes words (e.g. "because," "felt," "pretty," "if," "want"; Coef.: 1.93, pFDR < 0.001), insight words (e.g. "felt," "know," "feeling," "reflecting"; Coef.: 0.63, pFDR < 0.05), tentative words (e.g. "pretty," "if," "any," "sometimes," "hope," "might"; Coef.: 0.74, pFDR < 0.01), and certitude words (e.g. "really," "real," "actually," "incredibly," "totally"; Coef.: 0.43, pFDR < 0.05). By contrast, Social words (e.g. "they," "life," "she," "we're," "he's"; Coef.: -1.69, pFDR < 0.05) decreased, with most reductions seen in the communication (e.g. "thank," "talk," "meeting," "say," "said," "told"; Coef.: -1.22, pFDR < 0.05) and social referents (e.g. "they," "she," "friends," "everybody"; Coef.: -0.64, pFDR < 0.05) subcategories. In the Linguistic category, the use of adverbs increased significantly (e.g. "pretty," "when," "so," "now," "there," "here"; Coef.: 1.04, pFDR < 0.05), while conversational markers declined (e.g. "um," "okay," "yes," "yeah"; Coef.: -1.64, pFDR < 0.05). Within the Time category, there was a significant increase in past-oriented language (e.g. "went," "made," "yesterday," "realised"; Coef.: 1.6, pFDR < 0.001), and a significant decrease in future-oriented language (e.g. "tomorrow," "ready to," "hoping," "we'll"; Coef.: -0.58, pFDR < 0.01).
Text-based emotion detection using a RoBERTa-GoEmotions model (see Figure, right panel; for full statistical results see: Supplemental Table) revealed that after 5-MeO-DMT administration, participants (n = 29) showed significantly decreased excitement (Coef.: -0.022, pFDR < 0.001), nervousness (Coef.: -0.007, pFDR < 0.01), and fear (Coef.: -0.004, pFDR < 0.05), while admiration (Coef.: 0.028, pFDR < 0.01), relief (Coef.: 0.003, pFDR < 0.01), joy (Coef.: 0.022, pFDR < 0.05), and realisation (Coef.: 0.009, pFDR < 0.05) increased.
Vocal feature changes. Three of 88 vocal features derived from openSMILE (n = 23) demonstrated significant post-dosing alterations in jitter and shimmer. Average local jitter (Coef.: 0.776, pFDR < 0.05) and shimmer (Coef.: 0.712, pFDR < 0.05) increased, while normalised jitter variability decreased (Coef.: -0.910, pFDR < 0.05). Full statistical results are provided in Supplemental Table and visualised in Supplemental Figure.
Temporal dynamics of cognitive and social language. Consistent with overall linguistic analysis (see Figure), the day-by-day analyses (n = 25; see Figure) showed a pronounced crossover in cognitive versus social word usage after the 5-MeO-DMT session. Whereas social language, initially above baseline, fell below baseline levels post-dosing, cognitive language rose above baseline. This switch in trajectories persisted into the second post-dosing week, reflecting a sustained transition from socially oriented discourse toward a more cognitively focused language profile following the psychedelic experience (see Figure). Word clouds revealed characteristic terms in cognitive language (e.g. "because," "felt," "if," "want," "trying," "know") and social language (e.g. "they," "life," "she," "we're," "talk," "city," "friends") that drove these opposing trajectories (see Figure, middle).
This pattern was confirmed statistically, with a significant increase from weeks -1 to 1 and 2 for cognitive words and a significant decrease in social words from weeks -1 to 1 (see Figure, bottom). For detailed statistics, see Supplemental Tables.
Temporal dynamics of vocal feature deviations. Pitch, jitter, and shimmer, selected for their associations with mental health and emotional regulation, showed distinct shifts around the 5-MeO-DMT session (n = 22; see Figure). For each feature, deviations from individual baselines (±1 SD) were classified as "high" or "low" and tracked daily. Day-by-day analyses (Figure, top) revealed that high pitch deviations peaked at Day -8 (frequency = 5.28), whereas low pitch deviations peaked at Day +2 (frequency = 2.98). Group-level comparisons across predefined weekly periods (Figure, bottom) confirmed a significant reduction in high pitch deviations from Week -1 (frequency = 2.23) to Week +2 (frequency = 0.51; pFDR = 0.0353), and a significant increase in low pitch deviations from Week -2 (frequency = 0.76) to Week +1 (frequency = 1.98; pFDR = 0.0079). Similar patterns were observed for jitter and shimmer, with elevated low deviation frequencies during the post-session period. Jitter presented the opposite pattern, with high jitter deviations peaking at Day +2 (frequency = 4.95) and low jitter deviations at Day -9 (frequency = 2.63). Over weekly intervals, high jitter deviations rose from Week -2 (frequency = 0.74) to Week -1 (frequency = 1.63; pFDR = 0.0320) and subsequently to Week +1 (frequency = 3.09; pFDR = 0.0147). Low jitter deviations decreased from Week -2 (frequency = 1.61) to Week +1 (frequency = 0.68; pFDR = 0.0486) and further to Week +2 (frequency = 0.33; pFDR = 0.0213). Finally, high shimmer deviations peaked at Day +3 (frequency = 3.92), whereas low shimmer deviations were most frequent at Day -8 (frequency = 3.73).
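The "high" and "low" labelling of vocal deviations can be sketched as follows. This is an illustrative reading of the ±1 SD rule rather than the authors' procedure: the function name `classify_deviations`, the choice of which recordings form the baseline, and the example pitch values are all assumptions, and the text does not specify how the daily "frequency" values were aggregated.

```python
from statistics import mean, stdev


def classify_deviations(baseline_values, daily_values):
    """Label each day's value relative to an individual baseline mean ± 1 SD."""
    mu = mean(baseline_values)
    sd = stdev(baseline_values)  # sample SD of this participant's baseline
    labels = {}
    for day, value in daily_values.items():
        if value > mu + sd:
            labels[day] = "high"
        elif value < mu - sd:
            labels[day] = "low"
        else:
            labels[day] = "within"
    return labels


# Hypothetical mean pitch values (Hz) for one participant.
baseline = [100.0, 110.0, 90.0, 100.0]      # assumed baseline recordings
daily = {-2: 100.0, 1: 115.0, 2: 85.0}      # day relative to dosing -> value
labels = classify_deviations(baseline, daily)
print(labels)
```

Using each participant's own baseline means that a naturally high-pitched or irregular voice is not mistaken for a deviation; only departures from that person's usual range are flagged.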
Weekly comparisons showed high shimmer deviations rising from Week -2 (frequency = 0.69) to Week -1 (frequency = 1.50; p FDR = 0.0175) and to Week +1 (frequency = 3.12; p FDR = 0.0002), before falling back by Week +2 (frequency = 1.27; p FDR = 0.0025). Low shimmer deviations decreased from Week -1 (frequency = 2.25) to Week +1 (frequency = 0.60; p FDR = 0.0124) and further to Week +2 (frequency = 0.37; p FDR = 0.0112). We conducted an additional analysis to examine the temporal dynamics of vocal features in a less categorical manner, at the individual level and using the same methods as those applied for the vocabulary dynamics (see Supplemental Figure). This confirmed the main temporal trends in pitch and jitter observed at the group level. Shimmer followed a similar trajectory but did not reach corrected statistical significance.
Pre-post effects ranked by Cohen's d. Effect sizes ranged from small to large, with three features exceeding the large-effect threshold (|d| ⩾ 0.8): the decrease in excitement (d = -1.01), the decrease in jitter variability (d = -0.91), and the increase in past-focused language (d = +0.87). Eleven features showed medium effects (0.5 ⩽ |d| < 0.8), including cognitive processes, certitude, shimmer, fear, and nervousness, and eight showed small effects. Overall, 13 features increased and 9 decreased post-dosing (for full details, see Supplemental Table).
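The effect-size binning described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `effect_size_label` is hypothetical, and because the text gives only the 0.5 and 0.8 cut-offs, anything below 0.5 is simply labelled "small" here as an assumption.

```python
def effect_size_label(d: float) -> str:
    """Bin |d| using the two thresholds quoted in the text (0.5 and 0.8)."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    # Assumption: no lower bound for "small" is stated in the text.
    return "small"


# The three large effects reported in the text.
reported = {
    "excitement": -1.01,
    "jitter variability": -0.91,
    "past-focused language": 0.87,
}

labels = {name: effect_size_label(d) for name, d in reported.items()}

# Rank by absolute magnitude, largest first, ignoring the direction of change.
ranked = sorted(reported, key=lambda name: abs(reported[name]), reverse=True)
```

Ranking by absolute value keeps decreases and increases on the same scale, which is what a "ranked by Cohen's d" display would typically need.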
Associations between pre-dosing vocabulary patterns and psychometric outcomes. We examined bivariate Pearson correlations between weighted means of pre-dosing LIWC features (118 categories) and 5 psychometric measures (n = 26 after removing cases with missing data): preparedness (Psychedelic Preparedness Scale, PPS), negative ego dissolution (Altered States of Consciousness, Dread of Ego Dissolution subscale, ASC-DED), unitive experience (Altered States of Consciousness, Oceanic Boundlessness subscale, ASC-OBN), emotional breakthroughs (Emotional Breakthrough Inventory, EBI), and mental well-being (short Warwick-Edinburgh Mental Well-Being Scale, sWEMWBS). Together, these domains were chosen a priori to reflect both immediate experiential qualities and post-experience psychological outcomes. False discovery rate (FDR) correction (Benjamini-Hochberg) was applied across all 118 features within each outcome to control for multiple comparisons. PPS showed the strongest and most consistent associations with pre-dosing language, and was the only outcome for which features survived FDR correction. Three LIWC features reached significance (see Figure): positive tone vocabulary (e.g. "good," "well," "beautiful," "thank," "excited"; r = 0.654, p FDR = 0.028), work-related language (e.g. "work," "session," "meeting," "study," "school"; r = 0.636, p FDR = 0.028), and lifestyle references (e.g. "work," "home," "spent," "bed," "session," "weekend"; r = 0.612, p FDR = 0.035). A follow-up linear regression including all three FDR-surviving features (R² = 0.575, F(3,22) = 9.920, p < 0.001) revealed that only positive tone remained a significant independent predictor (β = 2.492, p FDR = 0.025), while work-related language (β = 5.343, p FDR = 0.400) and lifestyle references (β = 0.562, p FDR = 0.850) did not reach significance, suggesting their contributions are partly shared with positive tone.
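For readers unfamiliar with the Benjamini-Hochberg procedure mentioned above, the following is a minimal pure-Python sketch of how adjusted p-values can be computed for one outcome's set of feature-level tests. It is not the authors' implementation; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four features tested against one outcome.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
survivors = [i for i, p in enumerate(adj) if p < 0.05]
```

In the study the correction was applied across all 118 LIWC features within each outcome, so `raw` would hold 118 values per outcome rather than four.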
Beyond the FDR-significant PPS associations, 14 additional large bivariate associations (|r| > 0.5) were identified across 4 of the 5 outcomes, but did not survive FDR correction (e.g. sWEMWBS and Cognition (r = -0.598, p FDR = 0.074), DED and quantity words (r = +0.628, p FDR = 0.07); for more details, see Supplemental Figure). We additionally explored whether multivariate ridge regression using all 118 LIWC features could improve prediction, but leave-one-out cross-validation (LOOCV) showed that no model generalised to held-out participants (R² ranged from -0.27 to 0.22, with all but PPS yielding negative values; Supplemental Table), consistent with overfitting given the unfavourable predictor-to-observation ratio (p = 118, n = 26).
Figure caption (temporal dynamics of vocal feature deviations). (Top) Line plots showing the smoothed daily frequency of high (red) and low (blue) deviations in pitch, jitter, and shimmer over a 28-day period, centred around the dosing day (Day 0, vertical dashed line). Deviations were classified as significant when participants' vocal measurements exceeded one standard deviation above (high) or below (low) their individual mean values. Vertical dotted lines demarcate weekly periods: Week -2 (-14 to -7 days), Week -1 (-7 to 0 days), Week +1 (0 to 7 days), and Week +2 (7-14 days). Coloured dots indicate peak frequency points for high and low deviations, with smoothing applied using a Gaussian filter (sigma = 1). (Bottom) Bar plots depicting mean frequency of smoothed deviations during each weekly period, with error bars representing SEM. Statistical significance between periods is based on independent t-tests with FDR correction for multiple comparisons (*p < 0.05; **p < 0.01; ***p < 0.001). Only participants with at least two recordings in both pre- and post-periods were included in the analysis. SEM: standard error of the mean; FDR: false discovery rate; 5-MeO-DMT: 5-methoxy-N,N-dimethyltryptamine.
Pre-dosing emotional states and their relationship to outcome measures.
Principal component analysis (PCA) performed on the pre-dosing weighted mean emotion scores derived from the RoBERTa-GoEmotions model identified three key patterns of emotional expression combinations, together explaining 49.1% of the total variance (see Supplemental Figure). Each of the identified principal components (PCs) was then used as a predictor in multiple regression analyses to examine their relationship with selected psychological outcomes (for details, see Supplemental Table). PC1 explained 25.5% of the variance and we refer to it as "positive emotional engagement" (see Figure). This is because it represented a contrast between positive emotional expression, characterised by gratitude (0.303), pride (0.266), joy (0.258), excitement (0.234), caring (0.223), and love (0.206), versus negative emotional states, marked by disapproval (-0.282), confusion (-0.264), embarrassment (-0.257), disgust (-0.250), disappointment (-0.245), and annoyance (-0.235). When mapping vocabulary categories onto PC1, we found significant positive correlations with positive tone words (r = 0.596, p FDR = 0.032) and significant negative correlations with negations (e.g. "didn't," "haven't," "not really," "not"; r = -0.658, p FDR = 0.011), cognition (r = -0.743, p FDR = 0.001), cognitive processes (r = -0.748, p FDR = 0.001), and differentiation (e.g. "if," "didn't," "than," "but," "else," "without"; r = -0.647, p FDR = 0.012). As shown in the bottom panel of Figure, linear regressions confirmed that higher PC1 scores significantly predicted greater preparedness as measured by the PPS (β = 2.33, p = 0.002, p FDR = 0.017, R² = 0.363) and higher sWEMWBS scores (β = 0.66, p = 0.005, p FDR = 0.020, R² = 0.303).
PC2 (15.3% variance), referred to as "negative emotional states," contrasted intense emotional distress, including anger (0.366), sadness (0.355), grief (0.353), annoyance (0.288), desire (0.267), nervousness (0.263), and disappointment (0.224), with emotional neutrality (-0.317) and approval (-0.212). When mapping LIWC categories onto PC2, we found significant positive correlations with 3rd person singular (r = 0.639, p FDR = 0.019), negative tone (e.g. "tired," "nervous," "lost," "anxiety," "sick"; r = 0.659, p FDR = 0.016), emotion (e.g. "good," "tired," "excited," "happy," "nervous"; r = 0.613, p FDR = 0.027), negative emotion (e.g. "tired," "nervous," "anxiety," "sad"; r = 0.601, p FDR = 0.029), and anger (e.g. "frustrating," "angry," "revenge," "pissed off"; r = 0.725, p FDR = 0.004). PC2 did not predict any of the outcome variables.
PC3 (8.3% variance), "reflective integration," captured the tension between reflective emotions, expressed through remorse (0.427), relief (0.287), approval (0.276), pride (0.230), and embarrassment (0.203), versus exploratory emotions characterised by amusement (-0.309), fear (-0.267), curiosity (-0.242), and gratitude. No significant correlations were found when mapping LIWC categories onto PC3 after FDR correction. However, higher PC3 scores predicted a stronger emotional breakthrough, as measured by the EBI (see Figure, bottom panel; β = 73.47, p = 0.001, p FDR = 0.010, R² = 0.377).
To assess whether pre-post effects were driven by heterogeneity in participants' prior 5-MeO-DMT experience (range = 1-300 lifetime uses), we first examined whether pre-dosing levels of the 19 significant LIWC and GoEmotion features correlated with lifetime 5-MeO-DMT use (Spearman rank correlations, Supplemental Table). No features showed significant baseline associations with lifetime use after FDR correction (all p FDR > 0.05), though fear (rho = -0.50, p FDR = 0.065) and nervousness (rho = -0.49, p FDR = 0.065) showed the strongest trends. We then conducted two complementary sensitivity analyses. A median split (median = 6 uses) divided participants into a lower-experience group (1-6 uses, n = 16) and a higher-experience group (more than 6 uses, n = 13; Supplemental Figure). Re-running the pre-post mixed-effects models within each subgroup confirmed that all 19 significant effects maintained the same direction of change across both groups (Supplemental Table; Supplemental Figure). While reduced statistical power in the smaller subgroups meant that fewer individual effects reached corrected significance, no category reversed direction, and the cognitive-social crossover pattern was preserved in both subgroups (Supplemental Figure). When log-transformed lifetime use was instead included as a covariate, all 19 pre-post effects remained significant after FDR correction (all p FDR < 0.05), with negligible changes in coefficient magnitude (all below 5%; median change 0.6%; Supplemental Table). Interaction models (Category ~ PrePost × log(lifetime use) + (1 | ParticipantID)) revealed that only 1 of 19 interaction terms reached significance: past-focused language (p FDR = 0.004), where the pre-post increase was amplified in more experienced users. Nervousness showed a trend toward attenuation in more experienced users (p FDR = 0.065).
The remaining 18 interactions were not significant, and Akaike information criterion (AIC) comparisons favoured the base model (Pre-Post only) for 16 of 19 categories. Together, these results indicate that the observed pre-post changes are robust across varying levels of lifetime 5-MeO-DMT use.
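The AIC comparison reported above follows the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and L the maximised likelihood; the model with the lower AIC is preferred. The sketch below illustrates that comparison only. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study, and the helper names are assumptions.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def preferred_model(fits):
    """fits maps model name -> (log_likelihood, n_params); returns the lowest-AIC name."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for one category.
# Base model: intercept, PrePost, random-intercept variance, residual variance (k = 4).
# Interaction model: adds log(lifetime use) and its interaction with PrePost (k = 6).
fits = {
    "base": (-120.0, 4),
    "interaction": (-119.5, 6),
}

best = preferred_model(fits)
```

With these placeholder numbers the small likelihood gain of the interaction model does not offset its two extra parameters, so the base model is retained, which mirrors the pattern reported for 16 of the 19 categories.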
Our investigation into participants' naturalistic language and voice surrounding a single 5-MeO-DMT session provides the first documentation that measurable longitudinal changes in verbal and vocal expression accompany a psychedelic retreat, revealing psychologically meaningful dynamics in real-world contexts. By capturing voice journals for 2 weeks before and after dosing, we identified significant shifts in cognitive and social language use, observed altered linguistic markers of emotional expression, and uncovered distinct vocal markers, together pointing to a reorientation of participants' internal and external focus. These findings suggest that linguistic and vocal markers may reflect core psychological mechanisms of integration and offer potential predictors of therapeutic response. This study also introduces a set of methodological innovations: the use of voice-note diaries as an EMA tool, a custom chatbot for automated longitudinal data capture, and a multimodal analytical framework combining bag-of-words vocabulary analysis, transformer-based textual emotion detection, and acoustic feature extraction. These approaches enabled fine-grained, low-burden monitoring of psychological states in the periods leading up to and following the psychedelic experience (often termed "preparation" and "integration" in therapeutic contexts), and offer a scalable framework for future clinical and naturalistic research.
A key finding was the marked increase in cognitive and decrease in social words following 5-MeO-DMT, suggesting a reorientation from externally focused discourse toward more introspective modes of expression (Figure, left). Day-by-day trajectories (Figure) showed a crossover effect: social language, initially elevated relative to baseline, declined sharply post-dosing, while cognitive language rose and remained elevated for up to 2 weeks. Such a pattern points to a cognitively active, "slightly detached" phase that may persist during the 2 weeks after the retreat, as participants work to contextualise and process their experience. Despite the absence of formal integration therapy in this study, this interval may still support heightened self-examination and meaning-making. This process may relate to post-dosing changes in brain function, including increased neuroplasticity in regions like the prefrontal cortex, which is involved in emotional regulation, though our behavioural data alone cannot confirm such neural mechanisms. Indeed, increased use of cognitive process terms (e.g. because, felt, pretty, if, want) and cognitive differentiation vocabulary (e.g. if, didn't, than, but, else, without) may reflect intensified reflective thinking or problem-solving. This aligns with evidence that psychedelics may enhance cognitive flexibility and suggests that language could serve as a useful marker for tracking such processes.
While the overall number of affective words and the balance between positive and negative emotion terms (as measured by a bag-of-words approach; Figure, left) remained relatively stable from pre- to post-dosing, a more nuanced analysis using a transformer-based model (Figure, right) reveals a substantial shift in the specific types of emotions verbally expressed across timepoints. Emotions related to anticipation and uncertainty (e.g. excitement, nervousness, fear) gave way to expressions reflecting positive affect and meaning-making (e.g. joy, admiration, realisation), suggesting a shift from looking forward to looking inward. Notably, the rise in "realisation" aligns with theories that psychedelics may catalyse insight and perspective shifts, suggesting an active process of integrating and making sense of the experience rather than a simple elevation of mood, though the open-label design warrants replication in controlled settings. These text-inferred affective changes, which coincided with a significant increase in past-focused language and a decrease in future-oriented expression, suggest a shift from anticipatory thought to retrospective processing. Together with reduced social and increased cognitive word use, this pattern points to a post-acute phase of introspective engagement. It may reflect temporary distancing from interpersonal concerns and anticipatory thinking, replaced by a focus on immediate, introspective, and emotionally salient content.
Alongside these linguistic dynamics, we observed significant shifts in vocal parameters, particularly jitter and shimmer, that offer further insight into the changes surrounding the 5-MeO-DMT retreat. Both jitter and shimmer increased on average from pre- to post-session, while jitter variability decreased, reflecting a distinctive pattern of amplitude and frequency modulation (Figure). Although higher jitter and shimmer have been linked to anxiety and depression, the absence of elevated anxiety (see Supplemental Table) in our dataset suggests that these changes may reflect a temporary "turbulent" or liminal state, rather than pathology. This interpretation is exploratory and draws tentative support from emerging, primarily preclinical, evidence that psychedelics can reopen "critical periods" of neuroplasticity, which could allow for the revision of entrenched cognitive and emotional patterns.
In our day-by-day analyses over a 28-day window, pitch was frequently elevated prior to dosing (a "build-up" phase), while jitter and shimmer reached their highest deviations in the immediate post-session period, implying a subsequent "integration" phase of sustained emotional engagement (Figure). Although additional single-participant analyses confirmed the main pitch and jitter trends, shimmer showed more variability across individuals, suggesting personal differences in amplitude modulation. These vocal changes, when considered alongside shifts in cognitive and affective language use, highlight a multifaceted process of "transition" following a 5-MeO-DMT retreat. While participants' textual narratives increasingly focused on introspection and meaning-making, elevated jitter and shimmer may reflect an embodied, physiological dimension of heightened engagement with emotional and cognitive content. The reported patterns were robust to heterogeneity in prior 5-MeO-DMT experience: sensitivity analyses confirmed that all pre-post effects were directionally consistent across lower- and higher-experience subgroups, and all features remained significant when lifetime use was controlled for as a covariate. The only significant moderation was for past-focused language, where experienced users showed a larger post-dosing increase, possibly reflecting a richer repertoire of prior psychedelic experiences to draw upon for comparison and contextualisation (see Supplemental Figures S5 to S7; Supplemental Tables S10, S11, and S12). Taken together, these findings underscore that elevated jitter and shimmer might not necessarily indicate clinical distress but rather a transient window of malleability and openness following a profound psychedelic event. Importantly, this divergence between surface-level emotional stability and deeper shifts in emotional tone highlights the complementary strengths of different analytic approaches.
While bag-of-words methods capture overall linguistic trends, transformer-based models uncover subtle context-dependent transformations in emotional expression, and the observed changes in vocal parameters reveal additional psychophysiological nuances. These convergent measures shed new light on the extended psychological processes that follow a single 5-MeO-DMT session and underscore the value of multimodal approaches in understanding psychedelic experiences.
Language-based insights into self-reported outcomes. Our findings also suggest that pre-dosing language use may provide a meaningful indicator of preparedness as measured by the PPS. Although positive tone, work-, and lifestyle-related language were each correlated with higher preparedness scores, only positive tone remained a significant independent predictor in the joint regression model, implying that the work- and lifestyle-related associations were largely accounted for by shared variance with positive emotional expression. This is consistent with the PCA-based analysis, in which participants scoring higher on PC1 (a component reflecting positive emotional expression and characterised by greater use of positive tone language and lower use of negations, cognitive vocabulary and cognitive differentiation words) also had higher preparedness scores. Post-retreat self-reported well-being (sWEMWBS) was also significantly and positively associated with PC1. Together, these findings suggest that positively valenced language reflects greater psychological readiness for the psychedelic experience and may be associated with more favourable therapeutic outcomes. An intriguing finding arose in predicting emotional breakthrough, a state of emotional catharsis often linked with beneficial therapeutic outcomes in psychedelic-assisted treatments for depression.
While no specific vocabulary clusters were associated with EBI scores after FDR correction, higher PC3 scores, reflecting greater expression of reflective emotions such as remorse, relief, approval, and embarrassment, significantly predicted stronger emotional breakthroughs. Notably, these emotions are not uniformly positive; rather, they suggest a capacity to engage with complex or ambivalent inner states. This supports the view that openness to emotionally challenging or integrative content may facilitate deeper therapeutic processing. It also highlights how some experiential outcomes may elude capture by discrete vocabulary categories, underscoring the value of transformer-based approaches in detecting more subtle emotional dynamics. In summary, these findings support emerging models of psychedelic preparedness, highlighting how baseline emotional and linguistic patterns shape participants' readiness and their subsequent experiences. Rather than relying solely on retrospective self-report, natural language offers a dynamic, unobtrusive window into the psychological state, one that may help identify individuals who could benefit from tailored preparation. Ongoing, voice-based assessments may further illuminate how readiness and risk evolve over time, ultimately enabling more responsive support before, during, and after psychedelic sessions.
A key limitation of this study is its sample composition: a small, self-selecting group of individuals with extensive prior experience using psychedelics, who are not representative of the general population. Psychedelic users tend to exhibit higher openness and extraversion, and lower neuroticism, traits that may influence spoken language complexity and expressivity. Moreover, participants' high familiarity with 5-MeO-DMT (mean lifetime use: 39 occasions; SD = 72.91, range = 1-300) may have also influenced baseline linguistic patterns, potentially obscuring session-specific effects. Because all participants were healthy volunteers without diagnosed mental health conditions, the generalisability of these findings to clinical populations remains limited, an important consideration given the increasing use of psychedelics in therapeutic contexts, where language (and speech) are being studied as potential diagnostic markers. Contextual factors may also have influenced participants' expressions of their psychedelic experiences. The retreat setting, including group dynamics, interpersonal interactions, and heightened suggestibility, could amplify the phenomenon Durkheim described as "collective effervescence", potentially shaping responses in ways unlikely to occur in controlled clinical settings. Methodologically, this study was not preregistered, reflecting its status as the first investigation of longitudinal voice-journal analysis around a psychedelic session; while the analytical framework was hypothesis-driven and multiple safeguards were applied (FDR correction, cross-validation, a priori feature exclusion), the specific markers identified should be treated as hypothesis-generating pending confirmatory replication. Additionally, the reliance on predefined word dictionaries, while a common analytical approach, may have missed nuances in slang, non-native speech, or culturally specific language.
Trained on Reddit comments, the automated emotion detection model may not fully capture the specialised language and emotional states associated with psychedelic integration, and imbalances in the training data could bias the detection of underrepresented emotions. Moreover, the absence of a control group prevents us from ruling out alternative explanations for the observed changes, such as the known therapeutic benefits of journaling itself, which has been shown to improve mental health and well-being independent of any psychedelic intervention. Crucially, the absence of a non-psychedelic control condition (e.g. a similar retreat without drug administration) limits our ability to attribute the observed linguistic and vocal changes specifically to the pharmacological effects of 5-MeO-DMT, rather than to contextual or reflective processes associated with an emotionally salient, time-limited event. Many reported shifts, including decreased anticipatory emotions, increased past focus, and changes in social versus cognitive language, might partly relate to a transition from anticipation to reflection irrespective of drug effects. A larger, blinded, placebo-controlled study, ideally complemented by laboratory-based administration, would enable clearer tests of dose dependence and moderators (e.g. acute experience intensity and prior psychedelic exposure) while better separating pharmacological effects from expectancy and retreat context. Finally, several data collection constraints should be noted. The voice journal format, though ecologically valid for capturing natural speech, may have led to self-censorship or altered expression due to participants' awareness of being recorded. Inconsistencies in recording conditions, such as ambient noise, microphone distance, or device quality, could affect the reliability of acoustic data.
Disruptions to participants' schedules due to time zone differences, along with the absence of follow-up assessments, limit conclusions about the persistence of observed linguistic changes. Taken together, these factors suggest that the findings, while informative, should be interpreted with caution regarding their generalisability and precision.
This chatbot-based, speech-focused protocol exemplifies a scalable method for capturing nuanced psychological transformations around psychedelic sessions. By leveraging the widespread accessibility of smartphones and messaging applications, such automated approaches can reduce participant burden while enabling finer-grained tracking of integration processes in real-world settings. Future studies could expand this approach across diverse cultural contexts and larger sample sizes, or adapt it to other short-acting psychedelics (e.g. DMT) and longer-acting compounds (e.g. psilocybin, LSD). The inclusion of active control conditions, such as journaling without psychedelic use, or placebo arms, will be essential for isolating the specific effects of psychedelic compounds from other therapeutic elements. Comparison with language data from non-psychedelic retreats, meditation intensives, or other significant life events would further help establish whether the observed patterns are specific to psychedelic pharmacology. Linking language or acoustic changes to neuroimaging, physiological, and behavioural measures could further elucidate the underlying mechanisms. Longitudinal data collection at multiple post-dosing intervals (e.g. 1 and 3 months) will be critical for determining the persistence of observed linguistic and acoustic changes and their potential relation to lasting neuroplastic adaptations. Methodological refinements may include the use of advanced language models (e.g. GPT-4o, Llama 3, Claude 3.7) capable of detecting nuanced emotional tone, metaphorical richness, and non-standard linguistic constructions. However, the computational demands, "black box" nature, inherent biases, and ethical considerations associated with such models warrant careful attention. 
Transparent validation procedures, open-source dataset sharing, and interdisciplinary collaboration will be essential to ensuring that emerging computational tools contribute meaningfully and responsibly to both psychedelic science and clinical translation. Finally, further investigation is needed into how different integration modalities shape post-session language trajectories, and whether personalised support can enhance therapeutic outcomes. These approaches should be tested across diverse populations, including clinical groups such as individuals with depression, who may engage with psychedelic experiences and their integration differently than healthy, experienced users. Ultimately, speech-based monitoring holds promise for guiding preparation, predicting outcomes, and supporting integration in both clinical and retreat settings.
In conclusion, our data support the notion that everyday language and vocal patterns are sensitive indicators of psychological change surrounding psychedelic retreat experiences, though future controlled studies are needed to better isolate pharmacological effects from contextual factors. The observed shift toward introspective, cognitively oriented language, coupled with distinct vocal transitions, demonstrates substantial changes in one's verbal communication and sense of self after a single 5-MeO-DMT retreat experience. Not only do these methods offer new ways to map the trajectory of "preparation" and "integration," but they also suggest a future where automated, real-time feedback might guide and support individuals undergoing psychedelic-assisted therapies or retreat experiences. By establishing the feasibility and utility of longitudinal voice analysis, our findings set the stage for broader efforts to illuminate the pathways through which psychedelics effect enduring personal growth and therapeutic benefits.
This study adhered to the principles of the Helsinki Declaration and received approval from UCL's Ethics Committee (ID: 19437/004). It was conducted in collaboration with the Tandava Retreat Centre (TRC), which supplied both facilitators and facilities. Participants reviewed the study details and provided informed consent online. Participation was voluntary, without compensation, and individuals could withdraw at any time without repercussion. All participants were provided with information about additional support resources in case of any study-related distress.
Healthy, adult participants were recruited globally through the F.I.V.E. platform (www.five-meo.education), social media, and word of mouth. The final cohort (N = 29) was predominantly White/Caucasian (86.21%) and non-religious (93.10%), with prior experience with 5-MeO-DMT (mean lifetime use: 39 occasions (SD = 72.91, range = 1-300)).
Three-day retreat and 5-MeO-DMT session. The study was conducted in a naturalistic setting, at the TRC. Participants were assigned to one of the six 3-day retreats with the following schedule: Day 1 focused on orientation and participant briefing, Day 2 on 5-MeO-DMT administration and primary data collection, and Day 3 on integration featuring facilitator-guided group discussions. On Day 2, participants were administered synthetic 5-MeO-DMT (1 × 12 mg) individually. The compound was vaporised using an argon gas piston vaporiser at 203°C-210°C over approximately 120 seconds, and participants inhaled the fully vaporised compound in a single breath following a standardised protocol. Administration took place in a controlled setting under the supervision of experienced facilitators, with participants in a semi-supine position on a padded recliner. A comprehensive overview of the study design and additional study components, such as EEG protocols, is reported elsewhere.
Automated chatbot-based voice journal collection system. Voice journals were collected via RetreatBot, a custom-built chatbot hosted on the Telegram platform. Developed for Telegram using the OneReach.ai software platform Generative Studio X, this chatbot functioned as an EMA tool to capture voice journals of participants' thoughts and feelings. Unlike more recent, dynamic conversational AIs (e.g. ChatGPT) that adapt in real time, the chatbot employed in this study was rule based, following a decision-tree model with predefined paths and functioning more like a structured app. A primary reason for choosing a chatbot was its inherently conversational interface (see Figure), which can reduce user burden and enhance engagement. In addition, the chatbot featured speech-based data entry, an approach suggested to be more intuitive and efficient than typing in mobile health applications.
By integrating these design elements, RetreatBot aimed to streamline participation and minimise the technological barriers or effort required from users. Upon registration, each participant used a unique study ID, received an overview of the study procedures, and was guided on how to record voice journals. The default prompt was: Take a moment to reflect on your day and share your thoughts, feelings, and experiences. Record a voice note for about one minute. Notifications were sent daily from 7 days before (Day -7) to 7 days after (Day +7) the dosing session. However, participants could also submit entries outside this core window to capture additional relevant experiences.
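As an illustration of the notification window described above (Day -7 to Day +7 around the dosing session), the following sketch generates the daily prompt dates for one participant. It is not RetreatBot's implementation, which was built on a proprietary platform; the function name and the example dosing date are hypothetical.

```python
from datetime import date, timedelta


def notification_days(dosing_day: date, before: int = 7, after: int = 7):
    """Return (day_offset, calendar_date) pairs from Day -before to Day +after."""
    return [
        (offset, dosing_day + timedelta(days=offset))
        for offset in range(-before, after + 1)
    ]


# Hypothetical dosing date, used only to show the shape of the schedule.
schedule = notification_days(date(2024, 3, 15))
```

The window includes the dosing day itself (offset 0), so the default settings yield 15 daily prompts; entries submitted outside this window would need to be handled separately.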
Participants completed a broader battery of psychometric questionnaires as part of the parent study. Five outcome measures were selected a priori for the present analyses to capture theoretically relevant constructs across three phases of the psychedelic experience: pre-dosing preparedness, acute subjective effects, and post-dosing well-being. The remaining scales in the battery assessed constructs reserved for other components of the broader research project (e.g. inner speech, interoception). The Depression, Anxiety and Stress Scale was administered pre- and post-dosing but was retained as a clinical descriptor rather than an outcome variable, as the non-clinical sample was unlikely to show substantial changes in clinical symptomatology. We examined the relationships between these measures and participants' voice journal data (both text- and audio-based features).
The PPS includes 20 items rated on a seven-point Likert scale (1-7) and is designed to assess an individual's readiness for a psychedelic experience across four key domains of preparatory readiness: knowledge-expectation, psychophysical readiness, intention-preparation, and support-planning. The total PPS score was calculated as the sum of all 20 items, resulting in a total range from 20 to 140.
The OBN subscale of the 5-Dimensional Altered States of Consciousness Questionnaire (5D-ASC) includes 27 items rated on a visual analogue scale from 0 ("Not at all") to 100 ("Extremely"). It measures feelings of unity, spiritual experiences, blissful states, and insightfulness, capturing positively experienced depersonalisation and derealisation associated with heightened mood or euphoric exaltation. The OBN score was calculated as the average of all OBN items, providing a score in the 0-100 range.
The DED is another subscale of the 5D-ASC that assesses experiences related to ego disintegration, such as impaired control and cognition, anxiety, and disembodiment. It reflects negatively experienced derealisation and depersonalisation, including cognitive disturbances and loss of self-control. Scores for the DED subscale are calculated in the same manner as for OBN, yielding a score in the 0-100 range.
The EBI includes six items rated on a six-point Likert scale (0-5), designed to assess the degree of emotional release or catharsis following a psychedelic experience. The total EBI score was calculated as the sum of all six items, yielding a score in the 0-30 range.
The sWEMWBS is a short version of the WEMWBS and includes 7 of the WEMWBS's 14 items rated on a five-point Likert scale (1-5), assessing both hedonic (e.g. happiness) and eudaimonic (e.g. purpose, connection) aspects of mental well-being. The total sWEMWBS score was calculated as the sum of all 7 items, with a resulting range of 7-35.
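The scoring rules above reduce to sums and means over fixed item sets. The following is a minimal sketch of those rules, not the authors' scoring code; the function names are hypothetical, and the range checks are an added safeguard rather than something described in the study.

```python
from statistics import mean


def sum_score(items, n_items, low, high):
    """Sum Likert items after checking the item count and per-item range."""
    if len(items) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(items)}")
    if any(not (low <= x <= high) for x in items):
        raise ValueError(f"item outside the {low}-{high} range")
    return sum(items)


def pps_total(items):
    """PPS: 20 items rated 1-7, summed (range 20-140)."""
    return sum_score(items, 20, 1, 7)


def ebi_total(items):
    """EBI: 6 items rated 0-5, summed (range 0-30)."""
    return sum_score(items, 6, 0, 5)


def swemwbs_total(items):
    """sWEMWBS: 7 items rated 1-5, summed (range 7-35)."""
    return sum_score(items, 7, 1, 5)


def asc_subscale_mean(items):
    """5D-ASC subscale (OBN or DED): mean of 0-100 visual analogue items."""
    if not items or any(not (0 <= x <= 100) for x in items):
        raise ValueError("items must be non-empty and within 0-100")
    return mean(items)
```

Checking the extremes (all items at their minimum or maximum) reproduces the score ranges stated in the text, which is a quick way to catch a mis-specified item count.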
Voice journal preprocessing. All voice recordings were automatically transcribed upon submission using the AssemblyAI API. Two researchers (A.S. and J.K.) subsequently performed manual checks to ensure transcription accuracy. Journal entries containing no spoken text were removed. To maintain participant confidentiality, sensitive information (e.g. names, locations, organisations, dates) was anonymised. An automated first pass used the transformers library's TokenClassificationPipeline with a pre-trained named entity recognition model (Davlan/bert-base-multilingual-cased-ner-hrl). Any remaining identifiable elements were then redacted manually.
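The automated redaction step can be illustrated without loading the model. The sketch below assumes entity dictionaries shaped like those a token-classification pipeline returns when an aggregation strategy is set (keys `entity_group`, `start`, `end`); the `redact` helper, the placeholder format, and the example sentence and names are hypothetical and are not taken from the study.

```python
def redact(text, entities, placeholder="[{label}]"):
    """Replace each detected entity span with a bracketed label.

    Spans are processed from right to left so that the character
    offsets of earlier spans remain valid after each substitution.
    """
    out = text
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        tag = placeholder.format(label=ent["entity_group"])
        out = out[:ent["start"]] + tag + out[ent["end"]:]
    return out


# Hypothetical transcript and hand-written entity spans standing in
# for the output of a named entity recognition pipeline.
example_text = "I spoke with Anna in Tulum yesterday."
example_entities = [
    {"entity_group": "PER", "start": 13, "end": 17},
    {"entity_group": "LOC", "start": 21, "end": 26},
]
redacted = redact(example_text, example_entities)
print(redacted)
```

Working from character offsets rather than string replacement avoids redacting unrelated words that happen to match an entity's surface form.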
Coordinated Universal Time (UTC) timestamps were converted to the local retreat timezone (Mexico City). For consistency, the retreat start time was standardised to 8:00 AM. Each voice entry's relative date was then determined as the difference (in days) between its local-adjusted timestamp and the retreat start. Entries were labelled "Pre" or "Post" depending on whether they occurred before or after the 5-MeO-DMT session (Day 0). Recordings made on the same calendar day as the dosing but referring to anticipated or immediate post-experience content were manually assigned slightly adjusted relative dates (-0.5 or +0.5) to ensure accurate pre-post categorisation. Voice entries from Day -14 through Day +14 were retained for analysis.
Text-based feature extraction. After transcription, each journal entry was split into sentences using the Natural Language Toolkit sentence tokeniser. Two complementary approaches were used:
1. Categorical word frequency. Each sentence was processed by LIWC-22 software. LIWC scores represent the proportion of words within each text that fall into a specific category. For example, in the sentence "I am happy," the score for the Affect category would be 0.33, as one out of the three words ("happy") pertains to affective content. Categories with an average frequency below 1% across all data were excluded to allow for reliable analysis of the coefficients.
2. Textual affect. A Robustly Optimised BERT Pretraining Approach (RoBERTa) model, an optimised variant of the BERT model developed by Facebook AI, fine-tuned on the GoEmotions dataset, was used to generate probability scores for 27 emotion categories plus a neutral class (huggingface.co/SamLowe/roberta-base-go_emotions). These emotions included positive (admiration, amusement, approval, caring, desire, excitement, gratitude, joy, love, optimism, pride, relief), negative (anger, annoyance, disappointment, disapproval, disgust, embarrassment, fear, grief, nervousness, remorse, sadness), and ambiguous (confusion, curiosity, realisation, surprise) ones, following the GoEmotions taxonomy.
To represent these features for analysis, two main types of scores were computed:
• Per-sentence scores: Raw LIWC category or emotion scores calculated for each sentence.
• Weighted mean scores: Individual pre- and post-dosing averages weighted by sentence length (in words).
The weighted mean was calculated using this formula:
$$\text{WeightedMean} = \frac{\sum_{i} \text{WordCount}_i \times \text{FactorValue}_i}{\sum_{i} \text{WordCount}_i}$$
where $\text{WordCount}_i$ is the number of words in sentence $i$, and $\text{FactorValue}_i$ represents the LIWC or emotional category score for that sentence. This formula gives more influence to sentences with higher word counts, ensuring that longer sentences contribute proportionally to the overall score. By applying this weighting, we achieved a more representative measure of language patterns over the entire study period.
Acoustic features. Acoustic features were extracted from voice recordings using the OpenSMILE toolkit (eGeMAPSv02), producing 88 parameters (e.g. pitch, intensity, speech rate, voice quality, spectral measures). For each participant, we calculated simple mean values of key parameters in the 2-week pre-session (Days -14 to 0) versus post-session (Days 0 to +14) windows.
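The length-weighted mean used for the text features can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's code: `category_proportion` mimics a proportion-of-words score using a toy one-word lexicon (LIWC-22's dictionaries are proprietary and are not reproduced here), and both function names and the example sentences are hypothetical.

```python
import re


def category_proportion(sentence, category_words):
    """Share of words in a sentence that belong to a category lexicon."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(tok in category_words for tok in tokens) / len(tokens)


def weighted_mean(word_counts, factor_values):
    """Mean of per-sentence scores, weighted by sentence length in words."""
    total_words = sum(word_counts)
    if total_words == 0:
        raise ValueError("no words to weight by")
    weighted_sum = sum(w * f for w, f in zip(word_counts, factor_values))
    return weighted_sum / total_words


# Toy lexicon and sentences, for illustration only.
toy_affect_lexicon = {"happy"}
sentences = ["I am happy", "Today I went for a long walk"]

counts = [len(s.split()) for s in sentences]
scores = [category_proportion(s, toy_affect_lexicon) for s in sentences]
overall = weighted_mean(counts, scores)
print(counts, scores, overall)
```

Here the three-word sentence scores 1/3 and the seven-word sentence scores 0, so the weighted mean is one category word out of ten words in total. Weighting by word count makes the result equal to the proportion computed over the pooled text, which an unweighted average of sentence scores would not guarantee.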
All statistical tests were two-tailed unless stated otherwise, with p-values adjusted via the Benjamini-Hochberg FDR method within each analysis family (e.g. LIWC features, emotion features, acoustic features). We used adjusted R² and the Bayesian Information Criterion (BIC) to assess regression model performance and guide model selection.
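For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be written out directly. The sketch below is a plain-Python illustration of the standard step-up method, not the study's analysis code; the function name and the example p-values are hypothetical, and in practice a statistics library would normally be used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), and a running
    minimum taken from the largest rank downwards keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for one analysis family.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Applying the function once per family (for example, separately to LIWC, emotion, and acoustic p-values) matches the within-family correction described above, since the number of tests m is then specific to each family.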
Text-based features. Linear mixed-effects models were employed to examine changes in LIWC and emotion category usage from pre- to post-session:
$$\text{Category} \sim \text{Pre-Post} + (1 \mid \text{ParticipantID})$$
Here, Category represents a LIWC or RoBERTa-GoEmotions measure, Pre-Post is a fixed effect, and ParticipantID is a random intercept to account for repeated measures. Statistically significant shifts were identified based on FDR-corrected p-values.
Acoustic metrics. Paired t-tests were used to compare pre- versus post-dosing means of each acoustic feature. Differences were standardised as z-scores, and 95% confidence intervals were computed for each parameter.
Effect size comparison. To enable comparison of effect magnitudes across modalities and feature types, Cohen's d was computed for each significant feature as the mean pre-to-post difference divided by the pooled standard deviation of that feature's pre- and post-dosing values.
Temporal analyses. We conducted a day-by-day examination of how certain linguistic and vocal features evolved from Day -14 to Day +14. For linguistic metrics (i.e. "Cognition" and "Social" words), we calculated participant-level z-scores relative to each individual's overall mean, then aggregated these daily across the entire sample. Smoothing with a Gaussian kernel of 1.5 days was applied for visualisation. To facilitate discrete comparisons, the 28-day window was divided into 4 periods: Week -2 (Days -14 to -7), Week -1 (Days -7 to 0), Week +1 (Days 0 to 7), and Week +2 (Days 7 to 14). Mean standardised usage of linguistic categories in each period was compared using pairwise t-tests. Similarly, for acoustic characteristics (pitch, jitter, and shimmer), baseline values for all participants were calculated as the mean and standard deviation (±1 SD) across all available recordings. Deviations beyond this range were classified as "High" or "Low" deviations for each acoustic feature.
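The "High"/"Low" classification of acoustic values can be sketched as follows. This is an illustration under assumptions the text does not settle: it uses the sample standard deviation, computes the baseline over whatever list of values it is given (the text leaves open whether the baseline is pooled or per participant), and adds a "Within" label for values inside the band. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def classify_deviations(values):
    """Label each value relative to a mean ± 1 SD band.

    Values above mean + SD are "High", values below mean - SD are "Low",
    and everything else is "Within". Sample SD is an assumption here.
    """
    centre = mean(values)
    spread = stdev(values)
    labels = []
    for value in values:
        if value > centre + spread:
            labels.append("High")
        elif value < centre - spread:
            labels.append("Low")
        else:
            labels.append("Within")
    return labels


# Hypothetical pitch values (Hz) from one set of recordings.
pitch_hz = [100, 102, 98, 101, 99, 120, 80]
labels = classify_deviations(pitch_hz)
print(labels)
```

Counting the "High" and "Low" labels per day would then give the deviation frequencies that are aggregated and smoothed in the temporal analysis.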
These deviations were calculated separately for pitch (in Hz, derived from semitone values relative to 27.5 Hz), jitter (local), and shimmer (local dB). Deviation frequencies were aggregated at the daily level, then smoothed using a Gaussian kernel (σ = 1). Group-level comparisons were performed across four time periods (Week -2: Days -14 to -7; Week -1: Days -7 to 0; Week +1: Days 0 to 7; Week +2: Days 7 to 14) using independent two-sample t-tests. FDR correction was applied across all pairwise tests. Deviations were visualised as daily time series and as period-averaged bar plots, annotated with statistical significance where applicable.
Sensitivity analysis accounting for prior 5-MeO-DMT experience. Given the heterogeneity in participants' prior 5-MeO-DMT use (range = 1-300 occasions), we conducted a series of sensitivity analyses to assess whether the reported pre-post effects were robust to differences in lifetime experience. First, we assessed whether pre-dosing levels of the 19 significant features were associated with lifetime use using Spearman rank correlations (FDR-corrected across the 19 tests) to characterise any baseline differences between experience groups. Second, we performed a median split, dividing participants into lower-experience and higher-experience subgroups, and reran the pre-post mixed-effects models within each subgroup to verify that effect directions and patterns were preserved. Third, to avoid the information loss inherent in dichotomisation, we re-estimated the original models with log-transformed lifetime 5-MeO-DMT use as an additional covariate. Model fit was compared using the AIC. FDR correction was applied across all tests within each analysis family.
Predictive modelling. To explore whether pre-session linguistic patterns predicted subjective experiences, we examined bivariate Pearson correlations between weighted means of pre-dosing LIWC features (118 categories) and 5 psychometric outcomes (PPS, ASC-OBN, ASC-DED, EBI, and sWEMWBS). FDR correction (Benjamini-Hochberg) was applied across all 118 features within each outcome to control for multiple comparisons. Features surviving FDR correction (p_FDR < 0.05) were entered into ordinary least-squares linear regression models to quantify their joint and independent predictive contributions, with FDR correction applied to the regression coefficients. Associations with large effect sizes (|r| > 0.5) that did not survive FDR correction were additionally reported as suggestive. To assess whether multivariate models could improve prediction, we performed ridge regression using all 118 LIWC features for each outcome. Regularisation parameters were selected via fivefold cross-validation (RidgeCV), and model generalisability was evaluated using leave-one-out cross-validation (LOOCV), with feature scaling and ridge fitting performed independently within each fold to prevent data leakage. For the emotion-based features, we applied PCA to the weighted mean pre-dosing RoBERTa-GoEmotions scores. Components were selected based on an elbow plot, and emotions with absolute loadings ⩾ 0.2 were used to interpret each component. Regression analyses tested whether these component scores predicted the five self-report measures. Statistically significant models were identified through adjusted R², BIC, and p-values.
and Joel Brierre for their key contributions in facilitating the 5-MeO-DMT sessions. We are also grateful to Luis Fabian Rodriguez, Otto Maier, James Sanders, and George Deane for their support during data collection. Our deepest appreciation goes to the participants who volunteered for this study, generously contributing their time and personal voice journals.
This research would not be possible without their openness. Additionally, we thank the contributors to the crowdfunding campaign, whose support made this research possible.
Carrillo, F., Sigman, M., Fernández Slezak, D. et al. · Journal of Affective Disorders (2018)
Davis, A. K., So, S., Lancelotta, R. et al. · The American Journal of Drug and Alcohol Abuse (2019)
Doss, M. K., Považan, M., Rosenberg, M. D. et al. · Translational Psychiatry (2021)
Dougherty, R. F., Clarke, P., Kuc, J. et al. · Psychopharmacology (2023)
Ermakova, A. O., Dunbar, F., Rucker, J. et al. · Journal of Psychopharmacology (2021)
Friedman, S. F., Ballentine, G. · bioRxiv (2022)
Kraehenmann, R., Pokorny, D., Vollenweider, L. et al. · Psychopharmacology (2017)
Lyons, T., Spriggs, M. J., Kerkelä, L. et al. · Nature Communications (2026)
Milliere, R., Carhart-Harris, R. L., Roseman, L. et al. · Frontiers in Psychology (2018)
Murphy, R., Murphy-Beiner, A., Kettner, H. et al. · Frontiers in Pharmacology (2022)
Nardou, R., Sawyer, E., Song, Y. J. et al. · Nature (2023)
Nichols, C. D., Nichols, D. E., Johnson, M. W. · Clinical Pharmacology and Therapeutics (2016)
Nutt, D. J., Erritzoe, D., Carhart-Harris, R. L. · Cell (2020)
Pahnke, W. N. · Psychedelic Review (1969)
Peill, J. M., Trinci, K. E., Kettner, H. et al. · Journal of Psychopharmacology (2022)
Preller, K. H., Vollenweider, F. X. · Behavioral Neurobiology of Psychedelic Drugs (2016)
Roseman, L., Haijen, E. C. H. M., Idialu-Ikato, K. et al. · Journal of Psychopharmacology (2019)
Roseman, L., Nutt, D. J., Carhart-Harris, R. L. · Frontiers in Pharmacology (2018)
Sjöström, D. K., Claesdotter-Knutsson, E., Kajonius, P. J. · Scientific Reports (2024)
Studerus, E., Gamma, A., Vollenweider, F. X. · PLOS ONE (2010)
Swift, T. C., Belser, A. B., Agin-Liebes, G. et al. · Journal of Humanistic Psychology (2017)
Tagliazucchi, E. · Frontiers in Pharmacology (2022)
Tagliazucchi, E., Roseman, L., Kaelen, M. et al. · Current Biology (2016)
van Elk, M., Yaden, D. B. · Neuroscience and Biobehavioral Reviews (2022)
Weiss, B., Sleep, C., Beller, N. et al. · Journal of Psychedelic Studies (2023)
Zamberlan, F., Sanz, C., Pallavicini, C. et al. · Frontiers in Integrative Neuroscience (2018)