ABSTRACT
In this paper, we investigated and evaluated several machine learning-based approaches for automatically detecting wheezing sounds. We compared the proposed systems by assessing their classification performance in terms of Sensitivity, Specificity, and Accuracy. A machine learning-based system for classifying respiratory sounds combines a technique for extracting features from an unknown input sound with a classification method that determines the class to which the sound belongs. The characterization techniques used in this study are based on cepstral analysis, which has been extensively employed in the field of automatic speech recognition. While MFCC (Mel-Frequency Cepstral Coefficients) feature extraction is commonly used in respiratory sound classification, the novelty of our study is the use of GFCC (Gammatone-Frequency Cepstral Coefficients) and BFCC (Bark-Frequency Cepstral Coefficients) for this purpose. For the classification task, we employed two types of neural networks: the MLP (Multilayer Perceptron), a feedforward neural network, and the BiLSTM (Bidirectional LSTM), a variant of the LSTM (Long Short-Term Memory) recurrent neural network. The proposed classification systems were evaluated using a database of 497 wheezing segments and 915 normal respiratory segments, recorded from individuals diagnosed with asthma and from individuals without any respiratory issues, respectively. The highest classification performance was achieved by the BFCC-BiLSTM model, with an accuracy of 99.8%.
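The three metrics named above can be computed from the counts of a binary confusion matrix. The following is a minimal sketch in Python, assuming labels where 1 marks a wheezing segment and 0 a normal one; the function name and the toy labels are illustrative and are not taken from the study.

```python
def sensitivity_specificity_accuracy(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # wheeze correctly detected
            else:
                fn += 1  # wheeze missed
        else:
            if pred == positive:
                fp += 1  # normal segment flagged as wheeze
            else:
                tn += 1  # normal segment correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Toy example (hypothetical labels): 4 wheezing and 6 normal segments.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec, acc = sensitivity_specificity_accuracy(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because the database is unbalanced (497 wheezing versus 915 normal segments), reporting Sensitivity and Specificity alongside Accuracy shows how each class is handled separately.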
Subjects
Asthma; Respiratory Sounds; Humans; Respiratory Sounds/diagnosis; Signal Processing, Computer-Assisted; Neural Networks, Computer; Machine Learning; Asthma/diagnosis
ABSTRACT
Monitoring blue and fin whales summering in the St. Lawrence Estuary with passive acoustics requires call recognition algorithms that can cope with the heavy shipping noise of the St. Lawrence Seaway and with multipath propagation that generates overlapping copies of the calls. In this paper, the performance of three time-frequency methods for such automatic detection and classification is tested on more than 2000 calls and compared at several signal-to-noise ratios using typical recordings collected in this area. For all methods, image processing techniques are used to reduce the noise in the spectrogram. The first approach consists of matching the spectrogram with binary time-frequency templates of the calls (coincidence of spectrograms). The second approach is based on extracting the frequency contours of the calls and classifying them with the dynamic time warping (DTW) and vector quantization (VQ) algorithms. The coincidence of spectrograms was the fastest method and performed better for blue whale A and B calls. VQ detected more 20 Hz fin whale calls, but with a higher false alarm rate. DTW and VQ performed better for the more variable blue whale D calls.
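Classifying a frequency contour with DTW amounts to finding the reference contour with the smallest warped distance. The following is a minimal sketch in Python of the textbook DTW recurrence with an absolute-difference local cost; the local cost, the template names, and the contour values in Hz are all hypothetical and are not the settings used in the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a sample
                cost[i][j - 1],      # b advances, a repeats a sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_contour(contour, templates):
    """Return the label of the template closest to the contour under DTW."""
    return min(templates, key=lambda label: dtw_distance(contour, templates[label]))


# Hypothetical reference contours (frequency in Hz per time frame).
templates = {
    "downsweep": [30, 26, 22, 18],
    "tonal": [17, 17, 17, 17],
}
observed = [29, 27, 24, 21, 19]  # hypothetical extracted contour
print(classify_contour(observed, templates))
```

DTW tolerates contours of different lengths and local timing differences, which is consistent with the abstract's finding that it suits the more variable calls.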
Subjects
Acoustics; Algorithms; Automation; Signal Processing, Computer-Assisted; Vocalization, Animal; Animals; Atlantic Ocean; Balaenoptera; Databases as Topic; Fin Whale; Quebec; Sound Spectrography/methods; Time Factors
ABSTRACT
In this paper, we present pattern recognition methods proposed to classify respiratory sounds into normal and wheeze classes. Using receiver operating characteristic curves, we evaluate and compare feature extraction techniques based on the Fourier transform, linear predictive coding, the wavelet transform, and Mel-frequency cepstral coefficients (MFCC), in combination with classification methods based on vector quantization, Gaussian mixture models (GMM), and artificial neural networks. We propose the use of an optimized threshold to discriminate the wheezing class from the normal one. A post-processing filter is also employed to considerably improve the classification accuracy. Experimental results show that our approach, based on MFCC combined with GMM, is well suited to classifying respiratory sounds into normal and wheeze classes. McNemar's test demonstrated a significant difference between the results obtained by the presented classifiers (p<0.05).
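McNemar's test compares two classifiers evaluated on the same samples by looking only at the samples where they disagree. The following is a minimal sketch in Python of the exact (binomial) two-sided variant; the abstract does not state which variant was used, so this choice and the discordant counts below are assumptions for illustration only.

```python
from math import comb


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value.

    b: samples classifier A got right and classifier B got wrong
    c: samples classifier A got wrong and classifier B got right
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, discordant samples split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts between two classifiers.
p_value = mcnemar_exact_p(10, 2)
print(f"p = {p_value:.4f}")
```

A p-value below 0.05, as reported in the abstract, indicates that the disagreement between two classifiers is unlikely to be due to chance alone.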