Results 1 - 9 of 9
1.
Neural Netw ; 143: 171-182, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34157642

ABSTRACT

In this paper, we propose a visual embedding approach to improve embedding aware speech enhancement (EASE) by synchronizing visual lip frames at the phone and place of articulation levels. We first extract visual embedding from lip frames using a pre-trained phone or articulation place recognizer for visual-only EASE (VEASE). Next, we extract audio-visual embedding from noisy speech and lip frames in an information intersection manner, utilizing the complementarity of audio and visual features for multi-modal EASE (MEASE). Experiments on the TCD-TIMIT corpus corrupted by simulated additive noises show that our proposed subword based VEASE approach is more effective than conventional embedding at the word level. Moreover, visual embedding at the articulation place level, leveraging the high correlation between place of articulation and lip shapes, performs even better than embedding at the phone level. Finally, the experiments establish that the proposed MEASE framework, incorporating both audio and visual embeddings, yields significantly better speech quality and intelligibility than those obtained with the best visual-only and audio-only EASE systems.


Subject(s)
Speech Perception , Speech , Lip , Noise
2.
Sci Rep ; 9(1): 3840, 2019 03 07.
Article in English | MEDLINE | ID: mdl-30846758

ABSTRACT

We propose using faster regions with convolutional neural network features (faster R-CNN) in the TensorFlow tool package to detect and number teeth in dental periapical films. To improve detection precision, we propose three post-processing techniques to supplement the baseline faster R-CNN according to certain prior domain knowledge. First, a filtering algorithm is constructed to delete overlapping boxes detected by faster R-CNN associated with the same tooth. Next, a neural network model is implemented to detect missing teeth. Finally, a rule-based module built on a teeth numbering system is proposed to match labels of detected teeth boxes to modify detected results that violate certain intuitive rules. The intersection-over-union (IOU) values between detected and ground truth boxes are calculated to obtain precision and recall on a test dataset. Results demonstrate that both precision and recall exceed 90% and that the mean IOU between detected boxes and ground truths reaches 91%. Moreover, three dentists were invited to independently annotate the test dataset manually, and their annotations were compared to the labels obtained by our proposed algorithms. The results indicate that machines already perform close to the level of a junior dentist.


Subject(s)
Deep Learning , Radiography, Dental/methods , Tooth/diagnostic imaging , Algorithms , Forensic Dentistry/methods , Humans , Neural Networks, Computer , Supervised Machine Learning
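The IOU criterion and the overlapping-box filter mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the box layout (x_min, y_min, x_max, y_max), the 0.5 threshold, the use of detection scores, and the function names iou and drop_overlaps are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_overlaps(boxes: List[Box], scores: List[float],
                  threshold: float = 0.5) -> List[int]:
    """Greedy filter (an assumed strategy): keep the highest-scoring box and
    discard any later box whose IOU with an already kept box exceeds the
    threshold. Returns the indices of the kept boxes in ascending order."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in kept):
            kept.append(i)
    return sorted(kept)
```

For example, two unit-offset 2 x 2 boxes share an area of 1 out of a union of 7, so their IOU is 1/7; a pair of heavily overlapping detections on the same tooth would fall above the threshold and only the higher-scoring one would be kept.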
3.
Ear Hear ; 39(4): 795-809, 2018.
Article in English | MEDLINE | ID: mdl-29360687

ABSTRACT

OBJECTIVE: We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach under noisy conditions with challenging noise types at low signal to noise ratio (SNR) levels for Mandarin-speaking cochlear implant (CI) recipients. DESIGN: The deep learning-based NR approach used in this study consists of two modules, a noise classifier (NC) and a deep denoising autoencoder (DDAE), and is thus termed NC + DDAE. In a series of comprehensive experiments, we conduct qualitative and quantitative analyses on the NC module and the overall NC + DDAE approach. Moreover, we evaluate the speech recognition performance of the NC + DDAE NR and classical single-microphone NR approaches for Mandarin-speaking CI recipients under different noisy conditions. The testing set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise and construction jackhammer noise, at 0 and 5 dB SNR levels. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We qualitatively compare the NR approaches by the amplitude envelope and spectrogram plots of the processed utterances. Quantitative objective measures include (1) normalized covariance measure to test the intelligibility of the utterances processed by each of the NR approaches; and (2) speech recognition tests conducted by nine Mandarin-speaking CI recipients. These nine CI recipients use their own clinical speech processors during testing. RESULTS: The experimental results of objective evaluation and listening tests indicate that under challenging listening conditions, the proposed NC + DDAE NR approach yields higher intelligibility scores than the two compared classical NR techniques, under both matched and mismatched training-testing conditions.
CONCLUSIONS: When compared to the two well-known conventional NR techniques under challenging listening conditions, the proposed NC + DDAE NR approach has superior noise suppression capabilities and introduces less distortion of the key speech envelope information, thus improving speech recognition more effectively for Mandarin CI recipients. The results suggest that the proposed deep learning-based NR approach can potentially be integrated into existing CI signal processors to overcome the degradation of speech perception caused by noise.


Subject(s)
Cochlear Implantation , Cochlear Implants , Deafness/rehabilitation , Deep Learning , Noise , Speech Perception , Adult , Child , Female , Humans , Male , Middle Aged , Signal-To-Noise Ratio , Young Adult
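The abstract above tests sentences corrupted by maskers at 0 and 5 dB SNR. As a rough sketch of how such a mixture is commonly constructed (the abstract does not describe the authors' procedure), the masker can be scaled so that the ratio of speech power to masker power matches the target, using SNR_dB = 10 log10(P_speech / P_noise). The function names power and mix_at_snr are hypothetical.

```python
import math
from typing import List, Sequence


def power(x: Sequence[float]) -> float:
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech: Sequence[float], noise: Sequence[float],
               snr_db: float) -> List[float]:
    """Scale `noise` so that speech power / scaled-noise power equals the
    target SNR in dB, then add it to `speech` sample by sample."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # From SNR_dB = 10 * log10(P_speech / (gain**2 * P_noise)):
    gain = math.sqrt(power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Calling mix_at_snr(speech, masker, 0.0) gives a mixture in which speech and masker carry equal power; 5.0 leaves the speech about 3.16 times more powerful than the masker.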
4.
IEEE Trans Biomed Eng ; 65(2): 254-263, 2018 02.
Article in English | MEDLINE | ID: mdl-29035206

ABSTRACT

OBJECTIVE: In this paper, we explore the dependence of sliding window correlation (SWC) results on different parameters of correlating signals. The SWC is extensively used to explore the dynamics of functional connectivity (FC) networks using resting-state functional MRI (rsfMRI) scans. These scanned signals often contain multiple amplitudes, frequencies, and phases. However, the exact values of these parameters are unknown. Two recent studies explored the relationship between window length and the frequencies (minimum/maximum) in the correlating signals. METHODS: We extend the findings of these studies by using two deterministic signals with multiple amplitudes, frequencies, and phases. Afterward, we modulate one of the signals to introduce dynamics (nonstationarity) in their relationship. We also explore the relationship between window length and frequency band for real rsfMRI data. RESULTS: For deterministic signals, the spurious fluctuations due to the method itself are minimized, and the SWC estimates the stationary correlation, when the frequencies in the signals have a specific relationship. For a dynamic relationship as well, the undesirable frequencies were removed under specific conditions on the frequencies. For real rsfMRI data, the SWC results varied with frequencies and window length. CONCLUSION: In the absence of any "ground truth" for different parameters in real rsfMRI signals, the SWC with a constant window size may not be a reliable method to study the dynamics of the FC. SIGNIFICANCE: This study reveals the parametric dependencies of the SWC and its limitation as a method to analyze dynamics of FC networks in the absence of any ground truth.


Subject(s)
Magnetic Resonance Imaging/methods , Signal Processing, Computer-Assisted , Adult , Algorithms , Brain/diagnostic imaging , Brain/physiology , Female , Humans , Male , Middle Aged , Young Adult
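To make the window-length dependence described in the abstract above concrete, here is a small sketch of sliding window correlation on two toy deterministic signals. The signals (a sine and a cosine at the same frequency, whose stationary correlation is zero), the period of 40 samples, and the function names are assumptions chosen for illustration; they are not the signals or the specific frequency relationship analyzed in the paper.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def sliding_window_correlation(x: Sequence[float], y: Sequence[float],
                               window: int, step: int = 1) -> List[float]:
    """Correlation of x and y inside a rectangular window moved by `step`."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    if not 2 <= window <= len(x) or step < 1:
        raise ValueError("invalid window or step")
    return [pearson(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, step)]


# Toy signals (assumed): same frequency, 90 degrees apart, zero correlation.
PERIOD = 40
x = [math.sin(2 * math.pi * k / PERIOD) for k in range(400)]
y = [math.cos(2 * math.pi * k / PERIOD) for k in range(400)]

full_period = sliding_window_correlation(x, y, window=PERIOD)
quarter_period = sliding_window_correlation(x, y, window=PERIOD // 4)
```

In this toy case a window spanning exactly one period recovers the stationary value of zero in every position, whereas a quarter-period window swings between strongly negative and strongly positive values even though nothing about the relationship between the signals changes over time.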
5.
IEEE Trans Biomed Eng ; 64(7): 1568-1578, 2017 07.
Article in English | MEDLINE | ID: mdl-28113304

ABSTRACT

OBJECTIVE: In a cochlear implant (CI) speech processor, noise reduction (NR) is a critical component for enabling CI users to attain improved speech perception under noisy conditions. Identifying an effective NR approach has long been a key topic in CI research. METHOD: Recently, a deep denoising autoencoder (DDAE) based NR approach was proposed and shown to be effective in restoring clean speech from noisy observations. It was also shown that DDAE could provide better performance than several existing NR methods in standardized objective evaluations. Following this success with normal speech, this paper further investigated the performance of DDAE-based NR in improving the intelligibility of envelope-based vocoded speech, which simulates speech signal processing in existing CI devices. RESULTS: We compared the speech intelligibility achieved by DDAE-based NR and conventional single-microphone NR approaches using the noise vocoder simulation. The results of both objective evaluations and listening tests showed that, under the conditions of nonstationary noise distortion, DDAE-based NR yielded higher intelligibility scores than conventional NR approaches. CONCLUSION AND SIGNIFICANCE: This study confirmed that DDAE-based NR could potentially be integrated into a CI processor to provide more benefits to CI users under noisy conditions.


Subject(s)
Cochlear Implants , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Sound Spectrography/methods , Speech Intelligibility/physiology , Speech Production Measurement/methods , Algorithms , Humans , Reproducibility of Results , Sensitivity and Specificity , Signal-To-Noise Ratio , Sound Spectrography/instrumentation , Speech Production Measurement/instrumentation
6.
Neuroimage ; 133: 111-128, 2016 06.
Article in English | MEDLINE | ID: mdl-26952197

ABSTRACT

A promising recent development in the study of brain function is the dynamic analysis of resting-state functional MRI scans, which can enhance understanding of normal cognition and alterations that result from brain disorders. One widely used method of capturing the dynamics of functional connectivity is sliding window correlation (SWC). However, in the absence of a "gold standard" for comparison, evaluating the performance of the SWC in typical resting-state data is challenging. This study uses simulated networks (SNs) with known transitions to examine the effects of parameters such as window length, window offset, window type, noise, filtering, and sampling rate on the SWC performance. The SWC time course was calculated for all node pairs of each SN and then clustered using the k-means algorithm to determine how resulting brain states match known configurations and transitions in the SNs. The outcomes show that the detection of state transitions and durations in the SWC is most strongly influenced by the window length and offset, followed by noise and filtering parameters. The effect of the image sampling rate was relatively insignificant. Tapered windows provide less sensitivity to state transitions than rectangular windows, which could be the result of the sharp transitions in the SNs. Overall, the SWC gave poor estimates of correlation for each brain state. Clustering based on the SWC time course did not reliably reflect the underlying state transitions unless the window length was comparable to the state duration, highlighting the need for new adaptive window analysis techniques.


Subject(s)
Algorithms , Brain Mapping/methods , Brain/physiology , Data Interpretation, Statistical , Magnetic Resonance Imaging/methods , Nerve Net/physiology , Adult , Female , Humans , Male , Middle Aged , Reproducibility of Results , Sensitivity and Specificity , Statistics as Topic
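The clustering step described in the abstract above (grouping windowed correlation patterns with k-means to recover brain states) can be sketched as follows. This is a bare-bones k-means under stated assumptions, not the study's pipeline: each point is assumed to be the vector of node-pair correlations from one window, initial centroids are taken at evenly spaced positions in the sequence to keep the result deterministic, and the name kmeans and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points: Sequence[Vector], k: int,
           iterations: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means. Returns one cluster label per point and the centroids."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    # Assumed deterministic start: k points evenly spaced through the series.
    centroids = [list(points[(i * n) // k]) for i in range(k)]
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids
```

With one label per window, a state transition shows up as a change in label between consecutive windows, which is what can then be compared against the known transitions of a simulated network.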
7.
Article in English | MEDLINE | ID: mdl-25570125

ABSTRACT

Different regions in the resting brain exhibit non-stationary functional connectivity (FC) over time. In this paper, a simple and efficient framework of clustering the variability in FC of a rat's brain at rest is proposed. This clustering process reveals areas that are always connected with a chosen region, called seed voxel, along with the areas exhibiting variability in the FC. This addresses an issue common to most dynamic FC analysis techniques, which is the assumption that the spatial extent of a given network remains constant over time. We increase the voxel size and reduce the spatial resolution to analyze variable FC of the whole resting brain. We hypothesize that the adjacent voxels in resting state functional magnetic resonance imaging (rsfMRI), just as in task-based fMRI, exhibit similar intensities, so they can be averaged to obtain larger voxels without any significant loss of information. Sliding window correlation is used to compute variable patterns of the rat's whole brain FC with the seed voxel in the sensorimotor cortex. These patterns are grouped based on their spatial similarities using binary transformed feature vectors in k-means clustering, not only revealing the variable and nonvariable portions of FC in the resting brain but also detecting the extent of the variability of these patterns.


Subject(s)
Neural Pathways/physiology , Animals , Brain Mapping , Cluster Analysis , Magnetic Resonance Imaging , Rats , Somatosensory Cortex/physiology
8.
IEEE Trans Biomed Eng ; 60(6): 1477-87, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23268374

ABSTRACT

In color flow imaging, it is challenging to accurately extract blood flow information from ultrasound Doppler echoes dominated by strong clutter components. In this paper, we provide an in-depth analysis of ridge ensemble empirical mode decomposition (R-EEMD) and compare it with the conventional empirical mode decomposition (EMD) framework. R-EEMD facilitates nonuniform and trial-dependent weights obtained by an optimization procedure during ensemble combination and results in fewer decomposition errors when compared with the conventional ensemble empirical mode decomposition techniques. A theoretical result is then extended to demonstrate that R-EEMD can solve the mode mixing problem frequently encountered in EMD and improve the decomposition performance with adequate noise strength when separating a composite two-tone signal. Based on the proposed R-EEMD framework, a novel clutter rejection filter for ultrasound color flow imaging is designed. In a series of simulations, the R-EEMD-based filter achieves a significant improvement in blood flow velocity estimation over the state-of-the-art regression filters and decomposition-based filters, such as eigen-based and EMD filters. An experiment on human carotid artery data also verifies that the R-EEMD algorithm achieves minimum clutter energy and maximum blood-to-clutter energy ratio among all the tested techniques.


Subject(s)
Algorithms , Image Processing, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Ultrasonography, Doppler, Color/methods , Carotid Arteries/diagnostic imaging , Carotid Arteries/physiology , Computer Simulation , Humans , Regional Blood Flow
9.
J Med Ultrason (2001) ; 40(2): 99-105, 2013 Apr.
Article in English | MEDLINE | ID: mdl-27277097

ABSTRACT

PURPOSE: Clutter regarded as ultrasound Doppler echoes of soft tissue interferes with the primary objective of color flow imaging (CFI): measurement and display of blood flow. Multi-ensemble samples based clutter filters degrade resolution or frame rate of CFI. The prevalent single-ensemble clutter rejection filter is based on a single rejection criterion and fails to achieve a high accuracy for estimating both the low- and high-velocity blood flow components. METHODS: The Bilinear Hankel-SVD achieved more exact signal decomposition than the conventional Hankel-SVD. Furthermore, the correlation between two arbitrary eigen-components obtained by the B-Hankel-SVD was demonstrated. In the hybrid approach, the input ultrasound Doppler signal first passes through a low-order regression filter, and then the output is properly decomposed into a collection of eigen-components under the framework of B-Hankel-SVD. The blood flow components are finally extracted based on a frequency threshold. RESULTS: In a series of simulations, the proposed B-Hankel-SVD filter reduced the estimation bias of the blood flow over the conventional Hankel-SVD filter. The hybrid algorithm was shown to be more effective than regression or Hankel-SVD filters alone in rejecting the undesirable clutter components with single-ensemble (S-E) samples. It achieved a significant improvement in blood flow frequency estimation and estimation variance over the other competing filters.
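The first stage of the hybrid approach in the abstract above is a low-order regression filter. As a hedged illustration of that stage only (the B-Hankel-SVD decomposition is not reproduced here), the sketch below fits a first-order polynomial to a complex Doppler ensemble by least squares and subtracts it. The filter order, the polynomial basis, and the function name linear_regression_filter are assumptions; the abstract does not state the order the authors used.

```python
from typing import List, Sequence


def linear_regression_filter(signal: Sequence[complex]) -> List[complex]:
    """Remove the least-squares straight line (constant plus linear trend)
    from a slow-time ensemble of complex Doppler samples.

    Slowly varying tissue clutter is approximated by the fitted line; the
    residual is passed on as the candidate blood flow signal."""
    n = len(signal)
    if n < 3:
        raise ValueError("need at least 3 samples for a first-order fit")
    t_mean = (n - 1) / 2
    s_mean = sum(signal) / n
    # Closed-form least-squares slope; valid for complex samples because the
    # regressor (sample index) is real.
    slope_num = sum((k - t_mean) * (v - s_mean) for k, v in enumerate(signal))
    slope_den = sum((k - t_mean) ** 2 for k in range(n))
    slope = slope_num / slope_den
    return [v - (s_mean + slope * (k - t_mean)) for k, v in enumerate(signal)]
```

A purely linear clutter ensemble is removed completely by this filter, while a faster-varying component passes through with only its projection onto the constant and linear terms taken out; in the hybrid scheme the residual would then be decomposed further.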
