ABSTRACT
Extraction of tonal signals embedded in background noise is a crucial step before classification and separation of low-frequency sounds of baleen whales. This work reports results of comparing five tonal detectors, namely the instantaneous frequency estimator, YIN estimator, harmonic product spectrum, cost-function-based detector, and ridge detector. The comparisons, based on a low-frequency adaptation of the Silbido scoring feature, employ five metrics that quantify both how effectively the detectors retrieve tonal signals over a wide range of signal-to-noise ratios (SNRs) and the quality of the detection results. Ground-truth data were generated by embedding 20 synthetic Antarctic blue whale (Balaenoptera musculus intermedia) calls in randomly extracted 30-min noise segments from a 79-h library recorded by an ocean bottom seismometer in the Indian Ocean during 2012-2013. Monte Carlo simulations were performed using 20 trials per SNR, with SNRs ranging from 0 dB to 15 dB. Overall, the tonal detection results show that the cost-function-based and ridge detectors are superior to the other detectors for all SNR values. In particular, for lower SNRs (⩽3 dB), these two methods outperformed the other three, with high recall, low fragmentation, and high coverage scores. For SNRs ⩾7 dB, the five methods performed similarly.
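As a rough illustration of how a synthetic call can be embedded in a noise segment at a prescribed SNR, the sketch below scales the call so that its mean power, relative to the mean power of the whole noise segment, matches the target value in dB. The function names (`mean_power`, `embed_at_snr`), the broadband power-ratio definition of SNR, and the toy signals are assumptions made for illustration; the abstract does not state how SNR was defined or how the calls were scaled.

```python
import math
import random


def mean_power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def embed_at_snr(call, noise, snr_db, offset=0):
    """Add `call` to a copy of `noise`, scaled to `snr_db` (hypothetical helper).

    Assumption: SNR = 10*log10(P_call / P_noise), with P_noise taken over
    the whole noise segment. Returns the mixture and the applied gain.
    """
    target_ratio = 10 ** (snr_db / 10)
    gain = math.sqrt(target_ratio * mean_power(noise) / mean_power(call))
    mixed = list(noise)
    for i, v in enumerate(call):
        mixed[offset + i] += gain * v
    return mixed, gain


# Toy data: white Gaussian noise and a short tonal "call".
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
call = [math.sin(0.2 * n) for n in range(400)]

mixed, gain = embed_at_snr(call, noise, snr_db=3.0, offset=500)
print(f"gain applied to the call: {gain:.3f}")
```

Repeating the call to `embed_at_snr` with freshly drawn noise segments and a sweep of `snr_db` values gives a Monte Carlo loop of the general kind described above.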
Subject(s)
Balaenoptera/psychology, Signal Processing, Computer-Assisted, Sound Spectrography, Transducers, Vocalization, Animal, Animals, Signal-To-Noise Ratio
ABSTRACT
This letter presents an improvement of the image source method for geoacoustic inversion. The algorithm is based on the Teager-Kaiser energy operator, which amplifies discontinuities in a signal while attenuating smooth transitions. This property is exploited for accurate detection of arrival times and thus for locating the image sources. The effectiveness of the method is shown on both synthetic and real data. The inversion results are, overall, in good agreement with ground truth and with other inversion results, and are obtained with a significant reduction in computation time.
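The property described above can be seen with the standard discrete form of the Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The sketch below is a minimal illustration and not the inversion algorithm of the letter: the function name `tkeo` and the test signal (a slow sinusoid with a unit jump at sample 100) are assumptions chosen for the example.

```python
import math


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]**2 - x[n-1]*x[n+1].

    Returns len(x) - 2 values, one per interior sample of x.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Smooth part: a slow sinusoid. Discontinuity: a unit jump at sample 100.
omega = 0.05
signal = [math.cos(omega * n) + (1.0 if n >= 100 else 0.0) for n in range(200)]
energy = tkeo(signal)

# Locate the largest absolute response; +1 maps back to signal indices
# because energy[k] corresponds to signal sample k + 1.
peak = max(range(len(energy)), key=lambda k: abs(energy[k])) + 1
print("largest operator response at sample", peak)
```

On the smooth sinusoid alone the operator output stays at the small constant value sin^2(omega), so the jump stands out clearly. In an arrival-time setting, a peak of this kind would serve as the candidate time of arrival.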
ABSTRACT
In this paper, a speech denoising strategy based on time-adaptive thresholding of the intrinsic mode functions (IMFs) of the signal, extracted by empirical mode decomposition (EMD), is introduced. The denoised signal is reconstructed by superposing its adaptively thresholded IMFs. The adaptive thresholds are estimated using the Teager-Kaiser energy operator (TKEO) of the signal IMFs. More precisely, the TKEO identifies the type of each frame by amplifying the differences between speech and non-speech frames in each IMF. Because it is based on the EMD, the proposed speech denoising scheme is a fully data-driven approach. The method is tested on speech signals with different noise levels, and the results are compared to those of EMD-shrinkage and of the wavelet transform (WT) coupled with the TKEO. Speech enhancement performance is evaluated using the output signal-to-noise ratio (SNR) and the perceptual evaluation of speech quality (PESQ) measure. For the speech signals analyzed, the proposed enhancement scheme performs better than the WT-TKEO and EMD-shrinkage approaches in terms of output SNR and PESQ. Time-adaptive thresholding reduces the noise considerably more than universal thresholding does. The study is limited to signals corrupted by additive white Gaussian noise.
Subject(s)
Models, Theoretical, Noise, Signal Processing, Computer-Assisted, Speech Acoustics, Speech Production Measurement/methods, Voice Quality, Humans, Nonlinear Dynamics, Signal-To-Noise Ratio, Sound Spectrography, Speech Intelligibility, Speech Perception, Time Factors, Wavelet Analysis
ABSTRACT
In this paper, two methods for signal detection and time-delay estimation based on the cross Psi(B)-energy operator are proposed. These methods are well suited to mono-component AM-FM signals. The Psi(B) energy operator measures how much one signal is present in another. The peak of the Psi(B) operator corresponds to the maximum interaction between the two signals. Compared to the cross-correlation function, the Psi(B) operator includes temporal information and the relative changes of the signals, which are reflected in their first and second derivatives. The discrete version of the continuous-time form of the Psi(B) operator, which is used in the implementation, is presented. The methods are illustrated on synthetic and real signals, and the results are compared to those of the matched filter and the cross-correlation. The real signals correspond to impulse responses of buried objects obtained by active sonar in iso-speed, single-path environments.
Subject(s)
Hearing, Radio, Algorithms, Auditory Perception, Humans, Image Interpretation, Computer-Assisted, Models, Biological, Neural Networks, Computer, Signal Detection, Psychological
ABSTRACT
We extend and generalize the Teager-Kaiser [in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (1993), Vol. 3, p. 149] and the higher-order differential energy operators [IEEE Signal Process. Lett. 2, 152 (1995)] to a large class of operators called higher-order energy operators. We show that, for AM-FM signal demodulation, the introduced partial derivative orders have to satisfy certain conditions. These operators are parameterized for local processing of AM-FM signals. The operators are illustrated using synthetic signals and a real signal from light scanning interferometry.
ABSTRACT
We describe a new method for segmenting dynamic nuclear medicine images based on the cross-Psi(B)-energy operator. Psi(B) is a nonlinear measure that quantifies the interaction between two time signals, taking into account their first and second derivatives. A similarity measure, denoted SimilB, is calculated between the time activity curve (TAC) of each pixel and the mean of the TACs of a reference region of the scintigraphic image series. The resulting SimilB map is a functional image representing regions with different temporal dynamics. Some new properties of Psi(B) are presented. In particular, we show that Psi(B) as a similarity measure is robust to both scaling and time shift. The proposed method is applied to nuclear cardiac sequences for visualization and analysis of the ventricular emptying pattern, which may be useful in studying motion or conduction abnormalities. Results for a normal subject and four patients with abnormal ventricular contraction patterns are presented to highlight the suitability of this operator for studying non-stationary TAC series.
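A minimal sketch of a cross-energy measure of this kind is given below. It assumes the symmetric real-signal form Psi(B)(x, y) = x'y' - (x y'' + x'' y)/2, which reduces to the Teager-Kaiser operator when x = y, and a simple discrete analogue built by analogy with the discrete Teager-Kaiser operator. Both the discretization and the function name `cross_energy` are assumptions for illustration; this is not necessarily the discrete form derived in the papers, and the SimilB normalization is not reproduced here.

```python
import math


def cross_energy(x, y):
    """Assumed discrete cross-energy of two equal-length real sequences.

    psi[n] = x[n]*y[n] - 0.5*(x[n-1]*y[n+1] + x[n+1]*y[n-1])
    Symmetric in x and y; equals the discrete Teager-Kaiser energy if x is y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [
        x[n] * y[n] - 0.5 * (x[n - 1] * y[n + 1] + x[n + 1] * y[n - 1])
        for n in range(1, len(x) - 1)
    ]


# Two toy time activity curves: a reference curve and a scaled, delayed copy.
reference = [math.exp(-0.02 * n) * math.sin(0.1 * n) for n in range(300)]
pixel = [0.5 * math.exp(-0.02 * (n - 5)) * math.sin(0.1 * (n - 5)) for n in range(300)]

interaction = cross_energy(reference, pixel)
print("summed interaction:", sum(interaction))
```

Because this form is bilinear, scaling either input scales the output by the same factor, so any similarity score built on it needs a normalization step before curves of different amplitude can be compared.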