Search | VHL Regional Portal

1.

Iodide-Dependent Selective Dehydroaromatization Affording Maleimide-Fused 9,10-Phenanthrenes and Their Analogues.

Wang, Shuowen; Chen, Zhuohao; Chen, Shanping; Shao, Wen; Chen, Ya; Deng, Guo-Jun.

Org Lett ; 25(39): 7142-7147, 2023 Oct 06.

Article in English | MEDLINE | ID: mdl-37732874

ABSTRACT

A novel and selective synthesis of polycyclic fused maleimides from easily available raw materials under metal-free conditions is presented. This cascade protocol involves self-condensation of cyclohexanones, followed by Diels-Alder reaction with maleimides, intramolecular dehydration, and selective dehydroaromatization in a one-pot fashion, affording maleimide-fused 9,10-phenanthrenes and their analogues in satisfactory yields. Notably, iodide reagents play a critical role in switching the selectivity toward full or partial dehydrogenation compounds.

2.

A Machine-Learning Algorithm for the Automated Perceptual Evaluation of Dysphonia Severity.

van der Woerd, Benjamin; Chen, Zhuohao; Flemotomos, Nikolaos; Oljaca, Maria; Sund, Lauren Timmons; Narayanan, Shrikanth; Johns, Michael M.

J Voice ; 2023 Jul 08.

Article in English | MEDLINE | ID: mdl-37429808

ABSTRACT

OBJECTIVES: Auditory-perceptual assessments are the gold standard for assessing voice quality. This project aims to develop a machine-learning model for measuring perceptual dysphonia severity of audio samples consistent with assessments by expert raters. METHODS: The Perceptual Voice Qualities Database samples were used, including sustained vowel and Consensus Auditory-Perceptual Evaluation of Voice sentences, which were previously expertly rated on a 0-100 scale. The OpenSMILE (audEERING GmbH, Gilching, Germany) toolkit was used to extract acoustic (Mel-Frequency Cepstral Coefficient-based, n = 1428) and prosodic (n = 152) features, pitch onsets, and recording duration. We utilized a support vector machine and these features (n = 1582) for automated assessment of dysphonia severity. Recordings were separated into vowels (V) and sentences (S) and features were extracted separately from each. Final voice quality predictions were made by combining the features extracted from the individual components with the whole audio (WA) sample (three file sets: S, V, WA). RESULTS: This algorithm has a high correlation (r = 0.847) with estimates of expert raters. The root mean square error was 13.36. Increasing signal complexity resulted in better estimation of dysphonia, whereby combining the features outperformed WA, S, and V sets individually. CONCLUSION: A novel machine-learning algorithm was able to perform perceptual estimates of dysphonia severity using standardized audio samples on a 100-point scale. This was highly correlated to expert raters. This suggests that ML algorithms could offer an objective method for evaluating voice samples for dysphonia severity.

3.

An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates.

Chen, Zhuohao; Flemotomos, Nikolaos; Singla, Karan; Creed, Torrey A; Atkins, David C; Narayanan, Shrikanth.

Comput Speech Lang ; 752022 Sep.

Article in English | MEDLINE | ID: mdl-35479611

ABSTRACT

Text-based computational approaches for assessing the quality of psychotherapy are being developed to support quality assurance and clinical training. However, due to the long durations of typical conversation based therapy sessions, and due to limited annotated modeling resources, computational methods largely rely on frequency-based lexical features or dialogue acts to assess the overall session level characteristics. In this work, we propose a hierarchical framework to automatically evaluate the quality of transcribed Cognitive Behavioral Therapy (CBT) interactions. Given the richly dynamic nature of the spoken dialog within a talk therapy session, to evaluate the overall session level quality, we propose to consider modeling it as a function of local variations across the interaction. To implement that empirically, we divide each psychotherapy session into conversation segments and initialize the segment-level qualities with the session-level scores. First, we produce segment embeddings by fine-tuning a BERT-based model, and predict segment-level (local) quality scores. These embeddings are used as the lower-level input to a Bidirectional LSTM-based neural network to predict the session-level (global) quality estimates. In particular, we model the global quality as a linear function of the local quality scores, which allows us to update the segment-level quality estimates based on the session-level quality prediction. These newly estimated segment-level scores benefit the BERT fine-tuning process, which in turn results in better segment embeddings. We evaluate the proposed framework on automatically derived transcriptions from real-world CBT clinical recordings to predict session-level behavior codes. The results indicate that our approach leads to improved evaluation accuracy for most codes when used for both regression and classification tasks.

4.

Automated evaluation of psychotherapy skills using speech and language technologies.

Flemotomos, Nikolaos; Martinez, Victor R; Chen, Zhuohao; Singla, Karan; Ardulov, Victor; Peri, Raghuveer; Caperton, Derek D; Gibson, James; Tanana, Michael J; Georgiou, Panayiotis; Van Epps, Jake; Lord, Sarah P; Hirsch, Tad; Imel, Zac E; Atkins, David C; Narayanan, Shrikanth.

Behav Res Methods ; 54(2): 690-711, 2022 04.

Article in English | MEDLINE | ID: mdl-34346043

ABSTRACT

With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services. Traditionally, quality assessment is addressed by human raters who evaluate recorded sessions along specific dimensions, often codified through constructs relevant to the approach and domain. This is, however, a cost-prohibitive and time-consuming method that leads to poor feasibility and limited use in real-world settings. To facilitate this process, we have developed an automated competency rating tool able to process the raw recorded audio of a session, analyzing who spoke when, what they said, and how the health professional used language to provide therapy. Focusing on a use case of a specific type of psychotherapy called "motivational interviewing", our system gives comprehensive feedback to the therapist, including information about the dynamics of the session (e.g., therapist's vs. client's talking time), low-level psychological language descriptors (e.g., type of questions asked), as well as other high-level behavioral constructs (e.g., the extent to which the therapist understands the clients' perspective). We describe our platform and its performance using a dataset of more than 5000 recordings drawn from its deployment in a real-world clinical setting used to assist training of new therapists. Widespread use of automated psychotherapy rating tools may augment experts' capabilities by providing an avenue for more effective training and skill improvement, eventually leading to more positive clinical outcomes.

Subject(s)

Professional-Patient Relations , Speech , Humans , Language , Psychotherapy/methods

5.

Feature Fusion Strategies for End-to-End Evaluation of Cognitive Behavior Therapy Sessions.

Chen, Zhuohao; Flemotomos, Nikolaos; Ardulov, Victor; Creed, Torrey A; Imel, Zac E; Atkins, David C; Narayanan, Shrikanth.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 1836-1839, 2021 11.

Article in English | MEDLINE | ID: mdl-34891644

ABSTRACT

Cognitive Behavioral Therapy (CBT) is a goal-oriented psychotherapy for mental health concerns implemented in a conversational setting. The quality of a CBT session is typically assessed by trained human raters who manually assign pre-defined session-level behavioral codes. In this paper, we develop an end-to-end pipeline that converts speech audio to diarized and transcribed text and extracts linguistic features to code the CBT sessions automatically. We investigate both word-level and utterance-level features and propose feature fusion strategies to combine them. The utterance level features include dialog act tags as well as behavioral codes drawn from another well-known talk psychotherapy called Motivational Interviewing (MI). We propose a novel method to augment the word-based features with the utterance level tags for subsequent CBT code estimation. Experiments show that our new fusion strategy outperforms all the studied features, both when used individually and when fused by direct concatenation. We also find that incorporating a sentence segmentation module can further improve the overall system given the preponderance of multi-utterance conversational turns in CBT sessions.

Subject(s)

Cognitive Behavioral Therapy , Motivational Interviewing , Humans , Psychotherapy

6.

Automated quality assessment of cognitive behavioral therapy sessions through highly contextualized language representations.

Flemotomos, Nikolaos; Martinez, Victor R; Chen, Zhuohao; Creed, Torrey A; Atkins, David C; Narayanan, Shrikanth.

PLoS One ; 16(10): e0258639, 2021.

Article in English | MEDLINE | ID: mdl-34679105

ABSTRACT

During a psychotherapy session, the counselor typically adopts techniques which are codified along specific dimensions (e.g., 'displays warmth and confidence', or 'attempts to set up collaboration') to facilitate the evaluation of the session. Those constructs, traditionally scored by trained human raters, reflect the complex nature of psychotherapy and highly depend on the context of the interaction. Recent advances in deep contextualized language models offer an avenue for accurate in-domain linguistic representations which can lead to robust recognition and scoring of such psychotherapy-relevant behavioral constructs, and support quality assurance and supervision. In this work, we propose a BERT-based model for automatic behavioral scoring of a specific type of psychotherapy, called Cognitive Behavioral Therapy (CBT), where prior work is limited to frequency-based language features and/or short text excerpts which do not capture the unique elements involved in a spontaneous long conversational interaction. The model focuses on the classification of therapy sessions with respect to the overall score achieved on the widely-used Cognitive Therapy Rating Scale (CTRS), but is trained in a multi-task manner in order to achieve higher interpretability. BERT-based representations are further augmented with available therapy metadata, providing relevant non-linguistic context and leading to consistent performance improvements. We train and evaluate our models on a set of 1,118 real-world therapy sessions, recorded and automatically transcribed. Our best model achieves an F1 score equal to 72.61% on the binary classification task of low vs. high total CTRS.

Subject(s)

Cognitive Behavioral Therapy/methods , Mental Disorders/therapy , Clinical Competence , Data Interpretation, Statistical , Female , Humans , Male , Models, Psychological , Natural Language Processing , Psychiatric Status Rating Scales

7.

Towards End-2-end Learning for Predicting Behavior Codes from Spoken Utterances in Psychotherapy Conversations.

Singla, Karan; Chen, Zhuohao; Atkins, David C; Narayanan, Shrikanth.

Proc Conf Assoc Comput Linguist Meet ; 2020: 3797-3803, 2020 Jul.

Article in English | MEDLINE | ID: mdl-36751434

ABSTRACT

Spoken language understanding tasks usually rely on pipelines involving complex processing blocks such as voice activity detection, speaker diarization and Automatic speech recognition (ASR). We propose a novel framework for predicting utterance level labels directly from speech features, thus removing the dependency on first generating transcripts, and transcription free behavioral coding. Our classifier uses a pretrained Speech-2-Vector encoder as bottleneck to generate word-level representations from speech features. This pre-trained encoder learns to encode speech features for a word using an objective similar to Word2Vec. Our proposed approach just uses speech features and word segmentation information for predicting spoken utterance-level target labels. We show that our model achieves competitive results to other state-of-the-art approaches which use transcribed text for the task of predicting psychotherapy-relevant behavior codes.

8.

IMPROVING THE PREDICTION OF THERAPIST BEHAVIORS IN ADDICTION COUNSELING BY EXPLOITING CLASS CONFUSIONS.

Chen, Zhuohao; Singla, Karan; Gibson, James; Can, Dogan; Imel, Zac E; Atkins, David C; Georgiou, Panayiotis; Narayanan, Shrikanth.

Proc IEEE Int Conf Acoust Speech Signal Process ; 2019: 6605-6609, 2019 May.

Article in English | MEDLINE | ID: mdl-36704712

ABSTRACT

In this work we address the problem of joint prosodic and lexical behavioral annotation for addiction counseling. We expand on past work that employed Recurrent Neural Networks (RNNs) on multimodal features by grouping and classifying subsets of classes. We propose two implementations: One is hierarchical classification, which uses the behavior confusion matrix to cluster similar classes and makes the prediction based on a tree structure. The second is a graph-based method which uses the result of the original classification just to find a certain subset of the most probable candidate classes, where the candidate sets of different predicted classes are determined by the class confusions. We make a second prediction with simpler classifier to discriminate the candidates. The evaluation shows that the strict hierarchical approach degrades performance, likely due to error propagation, while the graph-based hierarchy provides significant gains.

9.

Using Prosodic and Lexical Information for Learning Utterance-level Behaviors in Psychotherapy.

Singla, Karan; Chen, Zhuohao; Flemotomos, Nikolaos; Gibson, James; Can, Dogan; Atkins, David C; Narayanan, Shrikanth.

Interspeech ; 2018: 3413-3417, 2018 Sep.

Article in English | MEDLINE | ID: mdl-34307639

ABSTRACT

In this paper, we present an approach for predicting utterance level behaviors in psychotherapy sessions using both speech and lexical features. We train long short term memory (LSTM) networks with an attention mechanism using words, both manually and automatically transcribed, and prosodic features, at the word level, to predict the annotated behaviors. We demonstrate that prosodic features provide discriminative information relevant to the behavior task and show that they improve prediction when fused with automatically derived lexical features. Additionally, we investigate the weights of the attention mechanism to determine words and prosodic patterns which are of importance to the behavior prediction task.

10.

Social exclusion leads to attentional bias to emotional social information: Evidence from eye movement.

Chen, Zhuohao; Du, Jinchen; Xiang, Min; Zhang, Yan; Zhang, Shuyue.

PLoS One ; 12(10): e0186313, 2017.

Article in English | MEDLINE | ID: mdl-29040279

ABSTRACT

Social exclusion has many effects on individuals, including the increased need to belong and elevated sensitivity to social information. Using a self-reporting method, and an eye-tracking technique, this study explored people's need to belong and attentional bias towards the socio-emotional information (pictures of positive and negative facial expressions compared to those of emotionally-neutral expressions) after experiencing a brief episode of social exclusion. We found that: (1) socially-excluded individuals reported higher negative emotions, lower positive emotions, and stronger need to belong than those who were not socially excluded; (2) compared to a control condition, social exclusion caused a longer response time to probe dots after viewing positive or negative face images; (3) social exclusion resulted in a higher frequency ratio of first attentional fixation on both positive and negative emotional facial pictures (but not on the neutral pictures) than the control condition; (4) in the social exclusion condition, participants showed shorter first fixation latency and longer first fixation duration to positive pictures than neutral ones but this effect was not observed for negative pictures; (5) participants who experienced social exclusion also showed longer gazing duration on the positive pictures than those who did not; although group differences also existed for the negative pictures, the gaze duration bias from both groups showed no difference from chance. This study demonstrated the emotional response to social exclusion as well as characterising multiple eye-movement indicators of attentional bias after experiencing social exclusion.

Subject(s)

Attentional Bias , Emotions , Eye Movements/physiology , Social Behavior , Adolescent , Adult , Facial Expression , Female , Humans , Male , Reaction Time

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL