Results 1 - 9 of 9
1.
Front Psychiatry ; 14: 1160291, 2023.
Article in English | MEDLINE | ID: mdl-37398577

ABSTRACT

Introduction: To assist mental health care providers with the assessment of depression, research to develop a standardized, accessible, and non-invasive technique has garnered considerable attention. Our study focuses on the application of deep learning models for automatic assessment of depression severity based on clinical interview transcriptions. Despite the recent success of deep learning, the lack of large-scale high-quality datasets is a major performance bottleneck for many mental health applications.

Methods: A novel approach is proposed to address the data scarcity problem for depression assessment. It leverages both pretrained large language models and parameter-efficient tuning techniques. The approach adapts a small set of tunable parameters, known as prefix vectors, to guide a pretrained model towards predicting a person's Patient Health Questionnaire (PHQ)-8 score. Experiments were conducted on the Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) benchmark dataset with 189 subjects, partitioned into training, development, and test sets. Models were trained on the training set. The mean and standard deviation of each model's prediction performance over five randomly initialized runs were reported on the development set. Finally, optimized models were evaluated on the test set.

Results: The proposed model with prefix vectors outperformed all previously published methods, including models that used multiple data modalities, and achieved the best reported performance on the test set of DAIC-WOZ with a root mean square error of 4.67 and a mean absolute error of 3.80 on the PHQ-8 scale. Compared to conventionally fine-tuned baseline models, prefix-enhanced models were less prone to overfitting because they used far fewer trainable parameters (<6% in relative terms).

Discussion: While transfer learning through pretrained large language models can provide a good starting point for downstream learning, prefix vectors can further adapt the pretrained models effectively to the depression assessment task by adjusting only a small number of parameters. The improvement is due in part to the fine-grained flexibility of prefix vector size in adjusting the model's learning capacity. Our results provide evidence that prefix-tuning can be a useful approach in developing tools for automatic depression assessment.
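
The two error measures quoted in the Results, root mean square error (RMSE) and mean absolute error (MAE), can be sketched as below. This is only an illustration of the standard definitions, not the authors' evaluation code; the function names and the four score pairs are hypothetical and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length score sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length score sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical PHQ-8 scores (the scale runs from 0 to 24); illustration only.
true_scores = [2, 10, 17, 5]
pred_scores = [4, 7, 15, 5]

print(f"RMSE = {rmse(true_scores, pred_scores):.2f}")
print(f"MAE  = {mae(true_scores, pred_scores):.2f}")
```

Because RMSE squares each error before averaging, it penalizes large individual misses more heavily than MAE, which is one reason the two are often reported together.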

2.
JASA Express Lett ; 3(3): 035207, 2023 03.
Article in English | MEDLINE | ID: mdl-37003704

ABSTRACT

Many existing speech intelligibility prediction (SIP) algorithms can only account for acoustic factors affecting speech intelligibility and cannot predict intelligibility across corpora with different linguistic predictability. To address this, a linguistic component was added to five existing SIP algorithms by estimating linguistic corpus predictability using a pre-trained language model. The results showed improved SIP performance in terms of correlation and prediction error over a mixture of four datasets, each with a different English open-set corpus.


Subjects
Linguistics; Speech Intelligibility; Language; Cognition; Algorithms
3.
Hear Res ; 426: 108620, 2022 12.
Article in English | MEDLINE | ID: mdl-36175300

ABSTRACT

We compare two alternative speech intelligibility prediction algorithms: time-frequency glimpse proportion (GP) and spectro-temporal glimpsing index (STGI). Both algorithms hypothesize that listeners understand speech in challenging acoustic environments by "glimpsing" partially available information from degraded speech. GP defines glimpses as those time-frequency regions whose local signal-to-noise ratio is above a certain threshold and estimates intelligibility as the proportion of the time-frequency regions glimpsed. STGI, on the other hand, applies glimpsing to the spectro-temporal modulation (STM) domain and uses a similarity measure based on the normalized cross-correlation between the STM envelopes of the clean and degraded speech signals to estimate intelligibility as the proportion of the STM channels glimpsed. Our experimental results demonstrate that STGI extends the notion of glimpsing proportion to a wider range of distortions, including non-linear signal processing, and outperforms GP for the additive uncorrelated noise datasets we tested. Furthermore, the results show that spectro-temporal modulation analysis enables STGI to account for the effects of masker type on speech intelligibility, leading to superior performance over GP in modulated noise datasets.
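
The glimpse proportion idea described above (count the time-frequency regions whose local signal-to-noise ratio exceeds a threshold, then divide by the total) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published GP algorithm: the 3 dB threshold, the function name, the `eps` guard, and the tiny 2x2 power grids are all hypothetical, and a real implementation would first compute an auditory time-frequency representation of separately available speech and noise signals.

```python
import math


def glimpse_proportion(speech_power, noise_power, threshold_db=3.0, eps=1e-12):
    """Fraction of time-frequency cells whose local SNR exceeds threshold_db.

    speech_power and noise_power are equally shaped 2-D lists of non-negative
    power values (rows = frequency channels, columns = time frames).
    threshold_db is an assumed value; the abstract only says "a certain threshold".
    """
    glimpsed = 0
    total = 0
    for speech_row, noise_row in zip(speech_power, noise_power):
        for s, n in zip(speech_row, noise_row):
            # eps avoids division by zero and log of zero in silent cells.
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            if local_snr_db > threshold_db:
                glimpsed += 1
            total += 1
    if total == 0:
        raise ValueError("empty time-frequency representation")
    return glimpsed / total


# Hypothetical power values; local SNRs are about 6, 0, -3 and 9 dB.
speech = [[4.0, 1.0], [0.5, 8.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]

print(glimpse_proportion(speech, noise))
```

STGI, as the abstract describes it, replaces the local SNR test with a normalized cross-correlation between clean and degraded modulation envelopes, so it does not need the noise signal in isolation.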


Subjects
Speech Intelligibility; Speech Perception; Noise/adverse effects; Signal-To-Noise Ratio; Perceptual Masking; Acoustic Stimulation
4.
Front Psychiatry ; 12: 738466, 2021.
Article in English | MEDLINE | ID: mdl-34616322

ABSTRACT

Introduction: Electronic health records (EHR) and administrative healthcare data (AHD) are frequently used in geriatric mental health research to answer various health research questions. However, the increasing amount and complexity of available data may lend itself to alternative analytic approaches using machine learning (ML) or artificial intelligence (AI) methods. We performed a systematic review of the current application of ML or AI approaches to the analysis of EHR and AHD in geriatric mental health.

Methods: We searched MEDLINE, Embase, and PsycINFO to identify potential studies. We included all articles that used ML or AI methods on topics related to geriatric mental health utilizing EHR or AHD data. We assessed study quality using either the Prediction model Risk OF Bias ASsessment Tool (PROBAST) or the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklist.

Results: We initially identified 391 articles through an electronic database and reference search, and 21 articles met inclusion criteria. Among the selected studies, EHR was the most used data type, and the datasets were mainly structured. A variety of ML and AI methods were used; prediction or classification was the main application, and random forest was the most common ML technique. Dementia was the most common mental health condition observed. The relative advantages of ML or AI techniques compared to biostatistical methods were generally not assessed. Low risk of bias (ROB) across all PROBAST domains was observed in only three studies, and in no study across the QUADAS-2 domains. The quality of study reporting could be further improved.

Conclusion: There are currently relatively few studies using ML and AI in geriatric mental health research with EHR and AHD, although this field is expanding. Aside from dementia, there are few studies of other geriatric mental health conditions. The lack of consistent information in the selected studies precludes precise comparisons between them. Improving the quality of reporting of ML and AI work in the future would help improve research in the field. Other avenues for improvement include using common data models to collect and organize data, and common datasets for ML model validation.

5.
Article in English | MEDLINE | ID: mdl-33748329

ABSTRACT

Spectro-temporal modulations are believed to mediate the analysis of speech sounds in the human primary auditory cortex. Inspired by humans' robustness in comprehending speech in challenging acoustic environments, we propose an intrusive speech intelligibility prediction (SIP) algorithm, wSTMI, for normal-hearing listeners based on spectro-temporal modulation analysis (STMA) of the clean and degraded speech signals. In the STMA, each of 55 modulation frequency channels contributes an intermediate intelligibility measure. A sparse linear model with parameters optimized using Lasso regression results in combining the intermediate measures of 8 of the most salient channels for SIP. In comparison with a suite of 10 SIP algorithms, wSTMI performs consistently well across 13 datasets, which together cover degradation conditions including modulated noise, noise reduction processing, reverberation, near-end listening enhancement, and speech interruption. We show that the optimized parameters of wSTMI may be interpreted in terms of modulation transfer functions of the human auditory system. Thus, the proposed approach offers evidence affirming previous studies of the perceptual characteristics underlying speech signal intelligibility.
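
To illustrate how Lasso regression can keep only a few salient channels out of many, here is a toy coordinate-descent sketch. It is not wSTMI and not the authors' code: the three synthetic "channels", the target values, the penalty `alpha = 0.1`, and all function names are assumptions chosen so that the result is easy to check by hand. The objective minimized is (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.

```python
def soft_threshold(value, alpha):
    """Shrink value towards zero by alpha; values within [-alpha, alpha] become 0."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Plain-Python Lasso solver (no intercept) using cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


# Synthetic, mutually orthogonal "channel" measures (illustration only).
X = [[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]]
# Target depends strongly on channel 0 and only weakly on channel 1.
y = [2.05, 1.95, -1.95, -2.05]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(weights, selected)
```

In this toy case the weak and irrelevant channels are driven exactly to zero while the strong one is kept with a slightly shrunken weight, which mirrors in miniature how an L1 penalty can reduce 55 candidate channels to a handful.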

6.
J Acoust Soc Am ; 147(5): EL396, 2020 05.
Article in English | MEDLINE | ID: mdl-32486791

ABSTRACT

Individual acoustic parameters of reverberation have the potential to affect both the intelligibility of speech and the degree of perceived reverberation. The current experiments used monaural acoustic simulations to investigate the effect of reverberation time (RT) and direct-to-reverberant ratio (DRR) on word and sentence intelligibility at different levels of analysis (phonemes, words, and sentences). Perceived reverberation and recall of sentences were also assessed. Intelligibility and perceived reverberation decreased with increasing RT and decreasing DRR (particularly between 0 and -10 dB). Results indicate consistent effects of both RT and DRR on the intelligibility and perceived reverberation of words and sentences.
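
For readers unfamiliar with the direct-to-reverberant ratio, a minimal sketch of the usual energy-ratio definition follows. The abstract does not say how DRR was computed in the simulations, so this is an assumption-laden illustration: the function name, the choice of where the direct sound ends (`direct_end`), and the five-sample impulse responses are all hypothetical.

```python
import math


def direct_to_reverberant_ratio_db(impulse_response, direct_end):
    """DRR in dB: energy before index direct_end over energy from it onwards.

    direct_end marks where the direct sound is assumed to stop; in practice
    this boundary is a convention (a short window around the main peak).
    """
    direct_energy = sum(h * h for h in impulse_response[:direct_end])
    reverb_energy = sum(h * h for h in impulse_response[direct_end:])
    if direct_energy <= 0 or reverb_energy <= 0:
        raise ValueError("both parts of the response must carry energy")
    return 10.0 * math.log10(direct_energy / reverb_energy)


# Hypothetical room impulse responses: one direct sample, four tail samples.
balanced = [1.0, 0.5, 0.5, 0.5, 0.5]   # tail energy equals direct energy
reverberant = [1.0, 1.0, 1.0, 1.0, 1.0]  # tail energy is four times larger

print(direct_to_reverberant_ratio_db(balanced, direct_end=1))
print(direct_to_reverberant_ratio_db(reverberant, direct_end=1))
```

Negative values mean the reverberant tail carries more energy than the direct sound, which corresponds to the 0 to -10 dB range where the abstract reports the largest effects.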


Subjects
Speech Intelligibility; Speech Perception; Acoustic Stimulation; Acoustics
7.
Article in English | MEDLINE | ID: mdl-19964897

ABSTRACT

Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.


Subjects
Diagnosis, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Sound Spectrography/methods; Speech Disorders/diagnosis; Speech Disorders/rehabilitation; Speech Production Measurement/methods; Speech, Alaryngeal; Aged; Algorithms; Artificial Intelligence; Humans; Male; Middle Aged; Reproducibility of Results; Sensitivity and Specificity
8.
Ophthalmic Physiol Opt ; 29(1): 49-57, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19154280

ABSTRACT

PURPOSE: To compare the effectiveness of cleaning soft contact lenses with and without rubbing.

METHODS: Three hundred new biweekly disposable hydrogel lenses (Ocufilcon D, FDA Group IV; 55% water content) were artificially deposited with serum albumin, hand cream (semi-transparent deposits) and mascara (black deposits). The treated lenses were randomly divided into three groups, each cleaned by one of three methods: Rubbing (R), No-Rub following the manufacturer's instruction on duration of rinsing (NR1), and No-Rub with a shorter duration of rinsing (NR2). Four commercially available multipurpose solutions (MPS) and a saline were used. Cleaning effectiveness was determined by the amount of deposits remaining on the contact lenses after cleaning, assessed with the aid of a slit-lamp. The level of deposits remaining (in terms of coverage of the lens surface) was graded on a five-point scale [0 (no observable deposits) to 4 (>80% deposits remained)] for semi-transparent deposits (protein and hand cream) and black deposits (mascara). The investigators were masked as to the solutions used (except for one MPS, which has a different rinsing time than the other MPS), and the investigator who assessed the deposits left on the lenses did not know which solution or cleaning method was used to clean each lens.

RESULTS: Lenses cleaned by the R method were significantly cleaner than those cleaned by methods NR1 and NR2. No significant difference was found between lenses cleaned by the NR1 and NR2 methods. The median grade of deposits for lenses cleaned by the R method was 0.5 for both semi-transparent and black deposits. For lenses cleaned by the NR1 and NR2 methods, the median grade of deposits left on lens surfaces was 4.0 for both types of deposits. The solution used did not affect the level of deposits left on lens surfaces for any of the three cleaning methods.

CONCLUSIONS: Cleaning a soft lens without rubbing is ineffective in removing loosely bound deposits. A longer rinse, as recommended by the manufacturers, does not remove significantly more deposits than a shorter rinse with the MPS. This work supports the view that contact lens wearers should be encouraged to rub their lenses when cleaning.


Subjects
Contact Lens Solutions/standards; Contact Lenses, Hydrophilic/adverse effects; Disinfection/methods; Albumins; Contact Lens Solutions/chemistry; Cosmetics; Disposable Equipment; Friction; Humans
9.
Article in English | MEDLINE | ID: mdl-19163050

ABSTRACT

Separation of heart and lung sounds from breath sound recordings is a challenging task due to the temporal and spectral overlap of the two signals. In this paper, the use of a spectro-temporal representation to improve signal separation is investigated. The representation is obtained by means of a frequency decomposition (termed modulation frequency) of temporal trajectories of short-term spectral components. Experiments described herein suggest that improved separability of heart (HS) and lung sounds (LS) is attained in the modulation frequency domain. Bandpass and bandstop modulation filters are designed to separate HS and LS signals from breath sound recordings, respectively. Visual and auditory inspection, quantitative analysis, as well as algorithm execution time are used to assess algorithm performance. Log-spectral distances below 1 dB corroborate our listening test which found no audible artifacts in separated heart and lung sound signals.
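
The log-spectral distance mentioned above can be sketched with one common definition: the root mean square, over frequency bins, of the dB difference between a reference and an estimated power spectrum. The abstract does not state the exact formula the authors used, so treat this as an assumption; the function name, the `eps` guard, and the three-bin spectra are hypothetical.

```python
import math


def log_spectral_distance_db(ref_power, est_power, eps=1e-12):
    """RMS difference in dB between two power spectra of equal length."""
    if len(ref_power) != len(est_power) or not ref_power:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_db = [
        (10.0 * math.log10((r + eps) / (e + eps))) ** 2
        for r, e in zip(ref_power, est_power)
    ]
    return math.sqrt(sum(squared_db) / len(squared_db))


# Hypothetical single-frame power spectra (illustration only).
reference = [1.0, 2.0, 4.0]
identical = [1.0, 2.0, 4.0]
ten_times_louder = [10.0, 20.0, 40.0]

print(log_spectral_distance_db(reference, identical))
print(log_spectral_distance_db(reference, ten_times_louder))
```

A value below 1 dB, as reported in the abstract, therefore indicates that the separated signal's spectrum stays within roughly 1 dB of the reference on an RMS basis under this definition.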


Subjects
Heart Sounds; Respiratory Sounds; Acoustics; Adult; Auscultation/statistics & numerical data; Biomedical Engineering; Heart Auscultation/statistics & numerical data; Humans; Signal Processing, Computer-Assisted