Results 1 - 7 of 7
1.
J Imaging ; 9(10)2023 Oct 07.
Article in English | MEDLINE | ID: mdl-37888321

ABSTRACT

Colour correction is the process of converting RAW RGB pixel values of digital cameras to a standard colour space such as CIE XYZ. A range of regression methods including linear, polynomial and root-polynomial least-squares have been deployed. However, in recent years, various neural network (NN) models have also started to appear in the literature as an alternative to classical methods. In the first part of this paper, a leading neural network approach is compared and contrasted with regression methods. We find that, although the neural network model supports improved colour correction compared with simple least-squares regression, it performs less well than the more advanced root-polynomial regression. Moreover, the relative improvement afforded by NNs, compared to linear least-squares, is diminished when the regression methods are adapted to minimise a perceptual colour error. Problematically, unlike linear and root-polynomial regressions, the NN approach is tied to a fixed exposure (and when exposure changes, the afforded colour correction can be quite poor). We explore two solutions that make NNs more exposure-invariant. First, we use data augmentation to train the NN for a range of typical exposures and second, we propose a new NN architecture which, by construction, is exposure-invariant. Finally, we look into how the performance of these algorithms is influenced when models are trained and tested on different datasets. As expected, the performance of all methods drops when tested with completely different datasets. However, we noticed that the regression methods still outperform the NNs in terms of colour correction, even though the relative performance of the regression methods does change based on the train and test datasets.
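The exposure argument in this abstract can be illustrated with a minimal sketch. It assumes the common degree-2 root-polynomial expansion (r, g, b, √(rg), √(gb), √(rb)); the abstract does not state which degree the authors used, and the 3x6 matrix below is a hypothetical placeholder, not a fitted result. Every root-polynomial term scales linearly with exposure, so any linear map applied to the expansion commutes with an exposure change, whereas ordinary polynomial terms such as r² do not.

```python
import math

def root_polynomial_features(rgb):
    """Degree-2 root-polynomial expansion; each term is degree 1 in exposure."""
    r, g, b = rgb
    return [r, g, b, math.sqrt(r * g), math.sqrt(g * b), math.sqrt(r * b)]

def polynomial_features(rgb):
    """Ordinary degree-2 polynomial expansion, shown for contrast."""
    r, g, b = rgb
    return [r, g, b, r * r, g * g, b * b, r * g, g * b, r * b]

def apply_matrix(matrix, features):
    """Multiply a row-major matrix by a feature vector."""
    return [sum(m * f for m, f in zip(row, features)) for row in matrix]

# Hypothetical 3x6 correction matrix (illustrative values only).
# In practice it would be fitted by least squares on RAW RGB / XYZ pairs.
ROOT_POLY_MATRIX = [
    [0.60, 0.25, 0.10, 0.03, 0.01, 0.01],
    [0.25, 0.65, 0.05, 0.02, 0.02, 0.01],
    [0.02, 0.10, 0.80, 0.01, 0.04, 0.03],
]

def correct_colour(rgb):
    """Map a RAW RGB triple to an XYZ estimate with the root-polynomial model."""
    return apply_matrix(ROOT_POLY_MATRIX, root_polynomial_features(rgb))
```

With this construction, `correct_colour([k * c for c in rgb])` equals `k` times `correct_colour(rgb)` for any positive `k`, which is the exposure invariance that a fixed-exposure neural network lacks.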

2.
Article in English | MEDLINE | ID: mdl-33972890

ABSTRACT

This work proposes a convolutional recurrent neural network (CRNN) based direction of arrival (DOA) angle estimation method, implemented on an Android smartphone for hearing aid applications. The proposed app provides a 'visual' indication of the direction of a talker on the screen of Android smartphones for improving the hearing of people with hearing disorders. We use the real and imaginary parts of the short-time Fourier transform (STFT) as the feature set for the proposed CRNN architecture for DOA angle estimation. Real smartphone recordings are used to assess the performance of the proposed method. The accuracy of the proposed method reaches 87.33% for unseen (untrained) environments. This work also presents real-time inference of the proposed method, which is done on an Android smartphone using only its two built-in microphones and no additional component or external hardware. The real-time implementation also proves the generalization and robustness of the proposed CRNN based model.
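The feature set described above (real and imaginary parts of the STFT) can be sketched as follows. The frame length, hop size, Hann window, and the way the two microphone channels are concatenated are assumptions for illustration; the abstract does not give these details.

```python
import cmath
import math

def stft_real_imag(signal, frame_len=8, hop=4):
    """Return one feature vector per frame: real parts, then imaginary parts,
    of the one-sided DFT of a Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(n_bins)
        ]
        features.append([c.real for c in spectrum] + [c.imag for c in spectrum])
    return features

def two_channel_features(left, right, frame_len=8, hop=4):
    """Concatenate per-frame features of both microphones (assumed layout)."""
    left_feats = stft_real_imag(left, frame_len, hop)
    right_feats = stft_real_imag(right, frame_len, hop)
    return [a + b for a, b in zip(left_feats, right_feats)]
```

Keeping the real and imaginary parts preserves the inter-microphone phase information that a magnitude-only feature would discard, which is the information a DOA estimator relies on.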

3.
IEEE Access ; 8: 197047-197058, 2020.
Article in English | MEDLINE | ID: mdl-33981519

ABSTRACT

In this article, we present a real-time convolutional neural network (CNN)-based speech source localization (SSL) algorithm that is robust to realistic background acoustic conditions (noise and reverberation). We have implemented and tested the proposed method on a prototype (Raspberry Pi) for real-time operation. We have used the combination of the imaginary-real coefficients of the short-time Fourier transform (STFT) and Spectral Flux (SF) with delay-and-sum (DAS) beamforming as the input feature. We trained the CNN model using noisy speech recordings collected from different rooms and ran inference on an unseen room. We provide a quantitative comparison with five other previously published SSL algorithms under several realistic noisy conditions, and show significant improvements by incorporating the Spectral Flux (SF) with beamforming as an additional feature to learn temporal variation in speech spectra. We perform real-time inferencing of our CNN model on the prototyped platform with low latency (21 milliseconds (ms) per frame with a frame length of 30 ms) and high accuracy (i.e. 89.68% under Babble noise condition at 5 dB SNR). Lastly, we provide a detailed explanation of real-time implementation and on-device performance (including peak power consumption metrics) that sets this work apart from previously published works. This work has several notable implications for improving the audio-processing algorithms for portable battery-operated smart loudspeakers and hearing improvement (HI) devices.
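A minimal sketch of the two extra ingredients named above, delay-and-sum beamforming and spectral flux, is given here. It assumes an integer-sample steering delay, non-overlapping rectangular frames, and the squared-difference definition of spectral flux; the paper may use a different variant, so treat the function names and parameters as hypothetical.

```python
import cmath
import math

def delay_and_sum(ch_a, ch_b, delay):
    """Average channel A with channel B advanced by `delay` samples.
    Samples of B that fall outside the recording are treated as zero."""
    out = []
    for n in range(len(ch_a)):
        m = n + delay
        b = ch_b[m] if 0 <= m < len(ch_b) else 0.0
        out.append(0.5 * (ch_a[n] + b))
    return out

def magnitude_spectrum(frame):
    """One-sided DFT magnitudes of a single frame."""
    size = len(frame)
    return [
        abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size)))
        for k in range(size // 2 + 1)
    ]

def spectral_flux(signal, frame_len=8):
    """Squared change in magnitude spectrum between consecutive frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    mags = [magnitude_spectrum(f) for f in frames]
    return [sum((c - p) ** 2 for c, p in zip(cur, prev))
            for prev, cur in zip(mags, mags[1:])]
```

Spectral flux is large where the spectrum changes (for example at a speech onset) and close to zero while the spectrum is stationary, which is why it can serve as a cue for temporal variation in speech spectra.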

4.
Proc Meet Acoust ; 39(1)2019 Dec 02.
Article in English | MEDLINE | ID: mdl-32742552

ABSTRACT

Deep neural network (DNN) techniques are gaining popularity due to the performance boost they provide in many applications. In this work we propose a DNN-based method for finding the direction of arrival (DOA) of a speech source for hearing study improvement and hearing aid applications, using a popular smartphone with no external components as a cost-effective stand-alone platform. We treat DOA estimation as a classification problem and use the magnitude and phase of the speech signal as the feature set for training the DNN and obtaining an appropriate model. The model is trained using real speech and real noisy speech data recorded on a smartphone in different noisy environments at low signal-to-noise ratios (SNRs). The DNN-based DOA method with the pre-trained model is implemented and run on an Android smartphone in real time. The performance of the proposed method is evaluated objectively and subjectively in both training and unseen environments. The test results show the superior performance of the proposed method over conventional methods.
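Framing DOA estimation as classification means discretizing the angle into classes and feeding magnitude and phase features to the network. The sketch below assumes a hypothetical 20-degree class resolution over the full circle; the abstract does not state the resolution or angular range actually used.

```python
import cmath

def magnitude_phase_features(spectrum):
    """Feature vector: magnitudes of all bins, then phases (radians)."""
    return [abs(c) for c in spectrum] + [cmath.phase(c) for c in spectrum]

def angle_to_class(angle_deg, resolution_deg=20):
    """Map an angle in degrees to a class index (assumed resolution)."""
    return int((angle_deg % 360) // resolution_deg)

def class_to_angle(class_index, resolution_deg=20):
    """Return the centre angle of a class, used to report the prediction."""
    return class_index * resolution_deg + resolution_deg / 2
```

A classifier trained on such labels outputs one class per frame, and `class_to_angle` converts the winning class back into an angle for display.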

5.
IEEE Access ; 7: 169969-169978, 2019.
Article in English | MEDLINE | ID: mdl-32754421

ABSTRACT

In this paper, we present a real-time convolutional neural network (CNN) based approach for speech source localization (SSL) using an Android-based smartphone and its two built-in microphones under noisy conditions. We propose a new input feature set for CNN-based SSL: the real and imaginary parts of the short-time Fourier transform (STFT). To train our CNN model, we use simulated noisy data from popular datasets, augmented with a few hours of real recordings collected on smartphones. We compare the proposed method to recent CNN-based SSL methods that are trained on our dataset and show that our CNN-based SSL method offers higher accuracy on identical test datasets. Another unique aspect of this work is that we perform real-time inferencing of our CNN model on an Android smartphone with low latency (14 milliseconds (ms) for single frame-based estimation and 180 ms for multi frame-based estimation, with a frame length of 20 ms in both cases) and high accuracy (i.e. 88.83% at 0 dB SNR). We show that our CNN model is rather robust to smartphone hardware mismatch, hence we may not need to retrain the entire model for use with different smartphones. The proposed application provides a 'visual' indication of the direction of a talker on the screen of Android smartphones for improving the hearing of people with hearing disorders.

6.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 417-420, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30440422

ABSTRACT

This paper presents the minimum variance distortionless response (MVDR) beamformer combined with a Speech Enhancement (SE) gain function as a real-time application running on smartphones that work as assistive devices for hearing aids. It has been shown that beamforming techniques improve the Signal to Noise Ratio (SNR) in noisy conditions. In the proposed algorithm, the MVDR beamformer is used as an SNR booster for the SE method. The proposed SE gain is based on the Log-Spectral Amplitude estimator to improve the speech quality in the presence of different background noises. Objective evaluation and intelligibility measures support the theoretical analysis and show significant improvements of the proposed method in comparison with existing methods. Subjective test results show the effectiveness of the application in real-world noisy conditions at SNR levels of -5 dB, 0 dB, and 5 dB.
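For reference, the textbook MVDR weights for a single frequency bin are w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance matrix and d the steering vector. The sketch below works this out for two microphones; the covariance values, frequency, and inter-microphone delay are hypothetical, and the Log-Spectral Amplitude gain stage of the paper is not included.

```python
import cmath
import math

def steering_vector(freq_hz, delay_s):
    """Two-microphone far-field steering vector for one frequency bin."""
    return [1 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay_s)]

def invert_2x2(m):
    """Closed-form inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mvdr_weights(noise_cov, steering):
    """w = R^-1 d / (d^H R^-1 d)."""
    r_inv = invert_2x2(noise_cov)
    r_inv_d = [sum(r_inv[i][j] * steering[j] for j in range(2))
               for i in range(2)]
    denom = sum(steering[i].conjugate() * r_inv_d[i] for i in range(2))
    return [x / denom for x in r_inv_d]

def beamform(weights, mic_bins):
    """Beamformer output w^H x for one frequency bin."""
    return sum(w.conjugate() * x for w, x in zip(weights, mic_bins))

def noise_power(weights, noise_cov):
    """Residual noise power w^H R w."""
    total = sum(weights[i].conjugate() * noise_cov[i][j] * weights[j]
                for i in range(2) for j in range(2))
    return total.real
```

The distortionless constraint means `beamform(w, d)` equals 1 for the look direction, while `noise_power(w, R)` is no larger than that of plain averaging weights `d / 2`; this is the sense in which the beamformer acts as an SNR booster ahead of the SE gain.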


Subjects
Algorithms , Hearing Aids , Smartphone , Software , Humans , Noise , Assistive Technology , Signal-to-Noise Ratio , Speech Intelligibility , Speech Perception
7.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 433-436, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30440427

ABSTRACT

In this paper, we present an improved version of a speech source localization method for Direction of Arrival (DOA) estimation using only two microphones. We also present a real-time Android application on a recent smartphone to help improve the spatial awareness of hearing-impaired users. Unlike earlier methods, the proposed method is computationally more efficient and fully adaptive to dynamically changing background noise. We compare the performance of the proposed method with similar earlier methods and demonstrate significantly lower DOA estimation errors as well as lower computation times. People who find it difficult to localize speech sources during group conversations or social activities can use the 'easy-to-use' Android application. The proposed implementation does not need any additional hardware or external microphone attachments, and can run on any dual-microphone device, such as a smartphone or tablet.
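The abstract does not describe the algorithm itself, so the following is only a generic illustration of two-microphone DOA estimation, not the authors' method: estimate the inter-microphone delay by cross-correlation, then convert it to an angle with the far-field relation cos θ = c·τ / d. The sign convention, microphone spacing, and sample rate are assumptions.

```python
import math

def best_lag(sig_a, sig_b, max_lag):
    """Lag (in samples) that maximizes sum_n a[n] * b[n + lag]."""
    def correlation(lag):
        return sum(sig_a[n] * sig_b[n + lag]
                   for n in range(len(sig_a))
                   if 0 <= n + lag < len(sig_b))
    return max(range(-max_lag, max_lag + 1), key=correlation)

def lag_to_angle(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field angle in degrees between the source and the microphone axis."""
    tau = lag / sample_rate
    cos_theta = speed_of_sound * tau / mic_spacing
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))
```

With only two microphones the estimate is ambiguous between the front and back half-planes, so an angle from this relation covers 0 to 180 degrees.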


Subjects
Hearing Aids , Mobile Applications , Smartphone , Sound Localization , Environment , Equipment Design , Hearing , Hearing Loss/rehabilitation , Humans , Noise , Assistive Technology , Speech Perception