Results 1 - 4 of 4
1.
J Acoust Soc Am ; 122(2): 1138-49, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17672660

ABSTRACT

The role of transient speech components in speech intelligibility was investigated. Speech was decomposed into two components, quasi-steady-state (QSS) and transient, using a set of time-varying filters whose center frequencies and bandwidths were controlled to identify the strongest formant components in speech. The relative energy and intelligibility of the QSS and transient components were compared to those of the original speech. Most of the speech energy was in the QSS component, but this component had low intelligibility. The transient component had much lower energy but was almost as intelligible as the original speech, suggesting that the transient component included speech elements important to speech perception. A modified version of speech was produced by amplifying the transient component and recombining it with the original speech. The intelligibility of the modified speech in background noise was compared to that of the original speech, using a psychoacoustic procedure based on the modified rhyme protocol. Word recognition rates for the modified speech were significantly higher at low signal-to-noise ratios (SNRs), with minimal effect on intelligibility at higher SNRs. These results suggest that amplification of transient information may improve the intelligibility of speech in noise and that this improvement is more effective in severe noise conditions.


Subject(s)
Noise , Speech Intelligibility , Algorithms , Environment , Filtration , Humans , Psychoacoustics , Sound
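
The recombination step described in the abstract of entry 1 (amplify the transient component, then add it back to the original speech) can be illustrated with a minimal sketch. It assumes the transient component has already been extracted by some other means; the time-varying formant filters are not reproduced here. The function names, the default gain, and the energy renormalization are illustrative assumptions, not details taken from the paper.

```python
def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)


def emphasize_transient(original, transient, gain=2.0):
    """Add an amplified transient component back to the original signal.

    Assumption: the result is rescaled to the energy of the original so
    that a listening comparison is not confounded by loudness. The
    abstract does not say whether the authors did this.
    """
    if len(original) != len(transient):
        raise ValueError("signals must have the same length")
    mixed = [o + gain * t for o, t in zip(original, transient)]
    e_mixed = energy(mixed)
    if e_mixed == 0.0:
        return mixed
    scale = (energy(original) / e_mixed) ** 0.5
    return [scale * m for m in mixed]


# Toy samples (hypothetical values, not speech data).
original = [1.0, 0.0, -1.0, 0.0]
transient = [0.0, 0.5, 0.0, -0.5]
modified = emphasize_transient(original, transient, gain=2.0)
```

With matched energy, any difference in word recognition between `original` and `modified` would come from the redistribution of energy toward the transient, which is the comparison the abstract describes.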
2.
Conf Proc IEEE Eng Med Biol Soc ; 2006: 1727-30, 2006.
Article in English | MEDLINE | ID: mdl-17946476

ABSTRACT

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. were successful in identifying speech transients and, by emphasizing them, in improving the intelligibility of speech in noise. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. The VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection is used to identify unvoiced consonant intervals, and the transitivity function is amplified during these intervals. The wavelet coefficients are multiplied by the transitivity function for that packet, amplifying the coefficients localized at times when they are changing and attenuating coefficients at times when they are steady. The inverse transform of the modified wavelet packet coefficients produces a signal corresponding to speech transients, similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently than those earlier methods.


Subject(s)
Algorithms , Diagnosis, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Sound Spectrography/methods , Speech Production Measurement/methods , Humans , Reproducibility of Results , Sensitivity and Specificity
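
The weighting idea in the abstract of entry 2 (multiply each packet's coefficients by a function that is large where the coefficients are changing and small where they are steady) can be sketched as follows. This is a simplified stand-in, not the authors' VFR-based definition: the change measure, the `floor` and `ceiling` parameters, and the function names are assumptions made for illustration. The wavelet packet transform and the unvoiced consonant detector are omitted, so the input is taken to be one packet's coefficient sequence.

```python
def transitivity(coeffs, floor=0.1, ceiling=2.0):
    """Weight per coefficient: `floor` where steady, up to `ceiling` where changing.

    Change is measured as the sum of absolute differences to both
    neighbours (a hypothetical choice, not the paper's VFR measure).
    """
    n = len(coeffs)
    change = []
    for i in range(n):
        back = abs(coeffs[i] - coeffs[i - 1]) if i > 0 else 0.0
        fwd = abs(coeffs[i + 1] - coeffs[i]) if i < n - 1 else 0.0
        change.append(back + fwd)
    peak = max(change, default=0.0)
    if peak == 0.0:
        return [floor] * n
    return [floor + (ceiling - floor) * c / peak for c in change]


def emphasize_changes(coeffs, floor=0.1, ceiling=2.0):
    """Multiply coefficients by their transitivity weights."""
    weights = transitivity(coeffs, floor, ceiling)
    return [c * w for c, w in zip(coeffs, weights)]


# A steady run with one abrupt change at index 4 (toy values).
packet = [1.0, 1.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0]
weights = transitivity(packet)
emphasized = emphasize_changes(packet)
```

In a full pipeline this weighting would be applied to every packet before the inverse transform, with the weights raised further during detected unvoiced consonant intervals.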
3.
Conf Proc IEEE Eng Med Biol Soc ; 2005: 6273-6, 2005.
Article in English | MEDLINE | ID: mdl-17281701

ABSTRACT

Automatic detection of regions of interest (ROIs) in a complex image or video, such as an angiogram or endoscopic neurosurgery video, is a critical task in many medical image and video processing applications. In this paper, we present a new method that addresses several challenges in the automatic detection of ROIs in neurosurgical video for ROI coding, which is used in neurophysiological intraoperative monitoring (IOM) systems. This method is based on an object tracking technique with multivariate density estimation theory, combined with the shape information of the object. By defining the ROIs for neurosurgical video, this method produces a smooth and convex emphasis region within which surgical procedures are performed. A large bandwidth budget is assigned within the ROI to achieve high-fidelity Internet transmission. Outside the ROI, a small bandwidth budget is allocated to use the bandwidth resource efficiently. We believe that, after some improvement, this method could also be applied to image-guided surgery (IGS) systems to track the positions of surgical instruments in the physical space occupied by the patient.
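
The bandwidth split described in the abstract of entry 3 (a large budget inside the ROI, a small one outside) can be sketched as a per-block bit allocation. This assumes the ROI mask is already available from the tracker; the function name, the block-level granularity, and the 80/20 default split are hypothetical and do not come from the paper.

```python
def allocate_bits(roi_mask, total_bits, roi_share=0.8):
    """Split a frame's bit budget between ROI and background blocks.

    roi_mask   -- 2-D list of booleans, True for blocks inside the ROI
    total_bits -- bit budget for the whole frame
    roi_share  -- fraction of the budget reserved for ROI blocks (assumed)
    """
    if not 0.0 <= roi_share <= 1.0:
        raise ValueError("roi_share must be between 0 and 1")
    flat = [cell for row in roi_mask for cell in row]
    n_roi = sum(1 for cell in flat if cell)
    n_bg = len(flat) - n_roi
    if n_roi == 0 or n_bg == 0:
        # Only one kind of block present: spread the budget evenly.
        per_block = total_bits / len(flat) if flat else 0.0
        return [[per_block for _ in row] for row in roi_mask]
    roi_bits = total_bits * roi_share / n_roi
    bg_bits = total_bits * (1.0 - roi_share) / n_bg
    return [[roi_bits if cell else bg_bits for cell in row] for row in roi_mask]


# A 2x2 frame with one ROI block (toy example).
mask = [[True, False], [False, False]]
budget = allocate_bits(mask, 1000.0)
```

A real encoder would more likely steer quantization parameters than assign raw bit counts, but the proportions express the same trade-off.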

4.
Opt Lett ; 27(2): 89-91, 2002 Jan 15.
Article in English | MEDLINE | ID: mdl-18007721

ABSTRACT

Blind source separation of two electromagnetic fields is investigated. The difficulty of this task lies in the fact that only the power, which is the square of the sum of the electromagnetic fields, can be directly measured; the cross term of the electromagnetic fields is inevitable, and a strong correlation occurs in blind deconvolution. However, the relative phase is physically different from the field intensities, and, hence, extracting the phase during separation seems inconceivable. Our results demonstrate that the intensities and the relative phase of two electromagnetic waves can be determined with an eigenvalue-problem formalism even when the mixing processes are completely unknown.
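
The cross term mentioned in the abstract of entry 4 can be made concrete with a short numerical check. The sketch assumes two fully coherent scalar fields with intensities I1 and I2 and relative phase phi, so that the measured power is |E1 + E2|^2 = I1 + I2 + 2*sqrt(I1*I2)*cos(phi). It only illustrates why the measurement mixes intensities and phase; it does not implement the eigenvalue-based separation, and the function names are illustrative.

```python
import cmath
import math


def measured_power(i1, i2, phase):
    """Power of the summed field for two coherent scalar fields."""
    e1 = math.sqrt(i1)
    e2 = math.sqrt(i2) * cmath.exp(1j * phase)
    return abs(e1 + e2) ** 2


def power_with_cross_term(i1, i2, phase):
    """Same quantity written as two intensities plus the cross term."""
    return i1 + i2 + 2.0 * math.sqrt(i1 * i2) * math.cos(phase)


# Hypothetical intensities and relative phase.
direct = measured_power(1.0, 4.0, 0.7)
expanded = power_with_cross_term(1.0, 4.0, 0.7)
```

Because a single power reading depends on I1, I2, and phi together, one measurement cannot determine them; this is the coupling that the eigenvalue-problem formalism in the abstract is reported to resolve.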
