1.
Front Neurosci ; 14: 590164, 2020.
Article in English | MEDLINE | ID: mdl-33324153

ABSTRACT

The combination of neuromorphic visual sensors and spiking neural networks offers a highly efficient bio-inspired solution to real-world applications. However, processing event-based sequences remains challenging because of their asynchronous and sparse nature. In this paper, a novel spiking convolutional recurrent neural network (SCRNN) architecture is presented that takes advantage of both the convolution operation and recurrent connectivity to maintain the spatial and temporal relations in event-based sequence data. The use of a recurrent architecture enables the network to have a sampling window of arbitrary length, allowing the network to exploit temporal correlations between event collections. Rather than standard ANN-to-SNN conversion techniques, the network utilizes a supervised Spike Layer Error Reassignment (SLAYER) training mechanism that allows the network to adapt to neuromorphic (event-based) data directly. The network structure is validated on the DVS gesture dataset and achieves a 10-class gesture recognition accuracy of 96.59% and an 11-class gesture recognition accuracy of 90.28%.

2.
Front Neurorobot ; 14: 568319, 2020.
Article in English | MEDLINE | ID: mdl-33192434

ABSTRACT

Traditionally the Perception Action cycle is the first stage of building an autonomous robotic system and a practical way to implement a low latency reactive system within a low Size, Weight and Power (SWaP) package. However, within complex scenarios, this method can lack contextual understanding about the scene, such as object recognition-based tracking or system attention. Object detection, identification and tracking along with semantic segmentation and attention are all modern computer vision tasks in which Convolutional Neural Networks (CNN) have shown significant success, although such networks often have a large computational overhead and power requirements, which are not ideal in smaller robotics tasks. Furthermore, cloud computing and massively parallel processing like in Graphic Processing Units (GPUs) are outside the specification of many tasks due to their respective latency and SWaP constraints. In response to this, Spiking Convolutional Neural Networks (SCNNs) look to provide the feature extraction benefits of CNNs, while maintaining low latency and power overhead thanks to their asynchronous spiking event-based processing. A novel Neuromorphic Perception Understanding Action (PUA) system is presented, that aims to combine the feature extraction benefits of CNNs with low latency processing of SCNNs. The PUA utilizes a Neuromorphic Vision Sensor for Perception that facilitates asynchronous processing within a Spiking fully Convolutional Neural Network (SpikeCNN) to provide semantic segmentation and Understanding of the scene. The output is fed to a spiking control system providing Actions. With this approach, the aim is to bring features of deep learning into the lower levels of autonomous robotics, while maintaining a biologically plausible STDP rule throughout the learned encoding part of the network. The network will be shown to provide a more robust and predictable management of spiking activity with an improved thresholding response. 
The reported experiments show that this system can deliver robust results of over 96% and 81% for accuracy and Intersection over Union, respectively, indicating that such a system can be successfully used for object recognition, classification and tracking problems. This demonstrates that the attention of the system can be tracked accurately, while the asynchronous processing means the controller can give precise track updates with minimal latency.

3.
Article in English | MEDLINE | ID: mdl-30440307

ABSTRACT

Acoustic analysis using signal processing tools can be used to extract voice features to distinguish whether a voice is pathological or healthy. The proposed work uses spectrograms of voice recordings from a voice database as the input to a Convolutional Neural Network (CNN) for automatic feature extraction and classification of disordered and normal voices. The novel classifier achieved 88.5%, 66.2% and 77.0% accuracy on the training, validation and testing data sets, respectively, using 482 normal and 482 organic dysphonia speech files. The results on the Saarbruecken Voice Database indicate that the proposed algorithm can be used effectively for screening pathological voice recordings.


Subject(s)
Dysphonia/physiopathology , Voice , Acoustics , Algorithms , Databases, Factual , Humans , Neural Networks, Computer , Signal Processing, Computer-Assisted
4.
Int J Lang Commun Disord ; 53(4): 875-887, 2018 07.
Article in English | MEDLINE | ID: mdl-29774624

ABSTRACT

BACKGROUND: Stress production is important for effective communication, but this skill is frequently impaired in people with motor speech disorders. The literature reports successful treatment of these deficits in this population, thus highlighting the therapeutic potential of this area. However, no specific guidance is currently available to clinicians about whether any of the stress markers are more effective than others, to what degree they have to be manipulated, and whether strategies need to differ according to the underlying symptoms. AIMS: In order to provide detailed information on how stress production problems can be addressed, the study investigated (1) the minimum amount of change in a single stress marker necessary to achieve significant improvement in stress target identification; and (2) whether stress can be signalled more effectively with a combination of stress markers. METHODS & PROCEDURES: Data were sourced from a sentence stress task performed by 10 speakers with ataxic dysarthria and 10 healthy matched control participants. Fifteen utterances perceived as having incorrect stress patterns (no stress, all words stressed or inappropriate word stressed) were selected and digitally manipulated in a stepwise fashion based on typical speaker performance. Manipulations were performed on F0, intensity and duration, either in isolation or in combination with each other. In addition, pitch contours were modified for some utterances. A total of 50 naïve listeners scored which word they perceived as being stressed. OUTCOMES & RESULTS: Results showed that increases in duration and intensity at levels smaller than produced by the control participants resulted in significant improvements in listener accuracy. The effectiveness of F0 increases depended on the underlying error pattern. Overall intensity showed the most stable effects. Modifications of the pitch contour also resulted in significant improvements, but not to the same degree as amplification. 
Integration of two or more stress markers did not result in better results than manipulation of individual stress markers, unless they were combined with pitch contour modifications. CONCLUSIONS & IMPLICATIONS: The results highlight the potential for improvement of stress production in speakers with motor speech disorders. The fact that individual parameter manipulation is as effective as combining them will facilitate the therapeutic process considerably, as will the result that amplification at lower levels than seen in typical speakers is sufficient. The difference in results across utterance sets highlights the need to investigate the underlying error pattern in order to select the most effective compensatory strategy for clients.


Subject(s)
Dysarthria , Speech Acoustics , Adult , Aged , Dysarthria/physiopathology , Dysarthria/psychology , Female , Humans , Male , Middle Aged , Speech Intelligibility , Speech Perception , Speech Production Measurement
5.
IEEE Trans Neural Syst Rehabil Eng ; 25(10): 1832-1842, 2017 10.
Article in English | MEDLINE | ID: mdl-28436879

ABSTRACT

Advanced forearm prosthetic devices employ classifiers to recognize different electromyography (EMG) signal patterns, in order to identify the user's intended motion gesture. The classification accuracy is one of the main determinants of real-time controllability of a prosthetic limb, hence the necessity to achieve as high an accuracy as possible. In this paper, we study the effects of the temporal and spatial information provided to the classifier on its off-line performance and analyze their inter-dependencies. EMG data associated with seven practical hand gestures were recorded from partial-hand and trans-radial amputee volunteers as well as able-bodied volunteers. An extensive investigation was conducted to study the effect of analysis window length, window overlap, and the number of electrode channels on the classification accuracy as well as their interactions. Our main findings are that the effect of analysis window length on classification accuracy is practically independent of the number of electrodes for all participant groups; window overlap has no direct influence on classifier performance, irrespective of the window length, number of channels, or limb condition; the type of limb deficiency and the existing channel count influence the reduction in classification error achieved by adding more channels; and partial-hand amputees outperform trans-radial amputees, with classification accuracies only 11.3% below the values achieved by able-bodied volunteers.
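
To make the terms "analysis window length" and "window overlap" concrete, the following is a minimal sketch of how a single-channel recording might be segmented before feature extraction. The function name `sliding_windows` and its parameters are illustrative assumptions, not taken from the paper, and lengths are expressed in samples rather than milliseconds.

```python
def sliding_windows(samples, window_len, overlap):
    """Split a 1-D sequence into fixed-length windows.

    Consecutive windows share `overlap` samples (hypothetical helper,
    not the authors' implementation). Trailing samples that do not
    fill a whole window are dropped.
    """
    if not 0 <= overlap < window_len:
        raise ValueError("overlap must satisfy 0 <= overlap < window_len")
    step = window_len - overlap  # how far the window advances each time
    last_start = len(samples) - window_len
    return [samples[i:i + window_len] for i in range(0, last_start + 1, step)]


# Example: 10 samples, windows of 4 samples overlapping by 2.
windows = sliding_windows(list(range(10)), window_len=4, overlap=2)
```

A larger overlap yields more windows (and more frequent decisions) from the same recording without changing how much signal each window contains.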


Subject(s)
Artificial Limbs , Electromyography/statistics & numerical data , Prosthesis Design , Adolescent , Adult , Aged , Algorithms , Amputees , Electrodes , Electromyography/classification , Electromyography/methods , Extremities/physiology , Female , Forearm/physiology , Gestures , Hand , Humans , Male , Middle Aged , Movement , Reproducibility of Results , Signal Processing, Computer-Assisted
6.
J Acoust Soc Am ; 137(5): EL360-6, 2015 May.
Article in English | MEDLINE | ID: mdl-25994734

ABSTRACT

Sound sources at the same angle in front of or behind a two-microphone array (e.g., bilateral hearing aids) produce the same time delay and two estimates for the direction of arrival: a front-back confusion. The auditory system can resolve this issue using head movements. To resolve front-back confusion for hearing-aid algorithms, head movement was measured using an inertial sensor. Successive time-delay estimates between the microphones were shifted clockwise and counterclockwise by the head movement between estimates and aggregated in two histograms. The histogram with the largest peak after multiple estimates predicted the correct hemifield for the source, eliminating the front-back confusions.
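
The two-histogram idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: time delays are assumed to be already converted to a lateral angle in degrees, head yaw is assumed to be an absolute angle from the inertial sensor, and the names `resolve_front_back` and `bin_width_deg` are hypothetical. Each estimate is mapped to world coordinates under a "front" and a "back" hypothesis; the correct hypothesis maps every estimate to the same direction, so its histogram develops the taller peak.

```python
from collections import Counter


def resolve_front_back(lateral_angles_deg, head_yaw_deg, bin_width_deg=5):
    """Pick the hemifield whose compensated estimates agree most.

    lateral_angles_deg: direction estimates relative to the head,
        each in [-90, 90], ambiguous between front and back.
    head_yaw_deg: head orientation at the time of each estimate.
    """
    front_hist = Counter()
    back_hist = Counter()
    for angle, yaw in zip(lateral_angles_deg, head_yaw_deg):
        # Front hypothesis: source lies at `angle` relative to the head.
        front_world = (yaw + angle) % 360
        # Back hypothesis: source lies at the mirror image, 180 - angle.
        back_world = (yaw + 180 - angle) % 360
        front_hist[int(front_world // bin_width_deg)] += 1
        back_hist[int(back_world // bin_width_deg)] += 1
    front_peak = max(front_hist.values())
    back_peak = max(back_hist.values())
    return "front" if front_peak >= back_peak else "back"


yaws = [0, 10, 20, -10]
# A fixed source 30 degrees to the front appears at (30 - yaw).
front_case = resolve_front_back([30 - y for y in yaws], yaws)
# A fixed source at 150 degrees (behind) appears at the mirrored angle (30 + yaw).
back_case = resolve_front_back([30 + y for y in yaws], yaws)
```

Without head movement both histograms would peak equally, which is why the method depends on the listener turning between estimates.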


Subject(s)
Biomimetics , Correction of Hearing Impairment/instrumentation , Hearing Aids , Persons With Hearing Impairments/rehabilitation , Sound Localization , Acoustic Stimulation , Algorithms , Equipment Design , Fourier Analysis , Head Movements , Humans , Models, Theoretical , Motion , Persons With Hearing Impairments/psychology , Sound , Time Factors
7.
Article in English | MEDLINE | ID: mdl-26736782

ABSTRACT

A new algorithm for 3D throat region segmentation from magnetic resonance imaging (MRI) is presented. The proposed algorithm initially pre-processes the MRI data to increase the contrast between the throat region and its surrounding tissues and to reduce artifacts. An isotropic 3D volume is reconstructed using Fourier interpolation. Furthermore, a cube encompassing the throat region is evolved using a level set method to form a smooth 3D boundary of the throat region. The results of the proposed algorithm on real and synthetic MRI data are used to validate the robustness and accuracy of the algorithm.


Subject(s)
Algorithms , Fourier Analysis , Imaging, Three-Dimensional/methods , Pharynx , Artifacts , Humans , Magnetic Resonance Imaging , Reproducibility of Results
8.
Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 482-5, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26736304

ABSTRACT

This paper presents a technique to improve the performance of an LDA classifier by determining if the predicted classification output is a misclassification and thereby rejecting it. This is achieved by automatically computing a class specific threshold with the help of ROC curves. If the posterior probability of a prediction is below the threshold, the classification result is discarded. This method of minimizing false positives is beneficial in the control of electromyography (EMG) based upper-limb prosthetic devices. It is hypothesized that a unique EMG pattern is associated with a specific hand gesture. In reality, however, EMG signals are difficult to distinguish, particularly in the case of multiple finger motions, and hence classifiers are trained to recognize a set of individual gestures. However, it is imperative that misclassifications be avoided because they result in unwanted prosthetic arm motions which are detrimental to device controllability. This warrants the need for the proposed technique wherein a misclassified gesture prediction is rejected resulting in no motion of the prosthetic arm. The technique was tested using surface EMG data recorded from thirteen amputees performing seven hand gestures. Results show the number of misclassifications was effectively reduced, particularly in cases with low original classification accuracy.
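
The rejection step described above can be sketched in a few lines. This is a minimal illustration, assuming the class posteriors and the class-specific thresholds are already available; deriving the thresholds from ROC curves is not shown. The function name `classify_with_rejection` and the gesture labels are hypothetical.

```python
def classify_with_rejection(posteriors, thresholds):
    """Return the most probable class, or None if it should be rejected.

    posteriors: dict mapping class label -> posterior probability.
    thresholds: dict mapping class label -> class-specific threshold
        (assumed to have been chosen beforehand, e.g. from ROC curves).
    """
    label = max(posteriors, key=posteriors.get)
    if posteriors[label] < thresholds[label]:
        return None  # rejected: the prosthesis performs no motion
    return label


thresholds = {"fist": 0.7, "pinch": 0.6}
confident = classify_with_rejection({"fist": 0.9, "pinch": 0.1}, thresholds)
uncertain = classify_with_rejection({"fist": 0.55, "pinch": 0.45}, thresholds)
```

Returning `None` rather than the best guess trades a missed motion for the avoidance of an unintended one, which is the trade-off the abstract argues for.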


Subject(s)
ROC Curve , Algorithms , Artificial Limbs , Electromyography , Pattern Recognition, Automated
9.
IEEE Trans Neural Syst Rehabil Eng ; 22(5): 1003-12, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24802139

ABSTRACT

This paper presents a new electromyography activity detection technique in which 1-D local binary pattern histograms are used to distinguish between periods of activity and inactivity in myoelectric signals. The algorithm is tested on forearm surface myoelectric signals occurring due to hand gestures. The novel features of the presented method are that: 1) activity detection is performed across multiple channels using few parameters and without the need for majority vote mechanisms, 2) there are no per-channel thresholds to be tuned, which makes the process of activity detection easier and simpler to implement and less prone to errors, 3) it is not necessary to measure the properties of the signal during a quiescent period before using the algorithm. The algorithm is compared to other offline single- and double-threshold activity detection methods and, for the data sets tested, it is shown to have a better overall performance with greater tolerance to the noise in the real data set used.
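
For readers unfamiliar with 1-D local binary patterns, the following sketch shows one common way to compute the codes and their histogram for a single channel. It is a generic illustration, not the paper's detector: the neighbourhood size `half_width`, the `>=` comparison, and the function names are assumptions, and the decision rule that turns histograms into activity/inactivity labels is not shown.

```python
def lbp_1d_codes(signal, half_width=2):
    """Compute a 1-D LBP code for each sample that has a full neighbourhood.

    Each of the 2 * half_width neighbours contributes one bit, set when
    the neighbour is greater than or equal to the centre sample.
    """
    signal = list(signal)
    codes = []
    for i in range(half_width, len(signal) - half_width):
        centre = signal[i]
        neighbours = signal[i - half_width:i] + signal[i + 1:i + half_width + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= centre:
                code |= 1 << bit
        codes.append(code)
    return codes


def lbp_histogram(codes, half_width=2):
    """Count how often each of the 2**(2 * half_width) codes occurs."""
    hist = [0] * (1 << (2 * half_width))
    for code in codes:
        hist[code] += 1
    return hist


flat_codes = lbp_1d_codes([0, 0, 0, 0, 0, 0])
peak_codes = lbp_1d_codes([1, 2, 5, 2, 1])
flat_hist = lbp_histogram(flat_codes)
```

Because the codes depend only on the ordering of neighbouring samples, not on their amplitude, no per-channel amplitude threshold is involved at this stage.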


Subject(s)
Arm/physiology , Electromyography/methods , Electromyography/statistics & numerical data , Algorithms , Data Interpretation, Statistical , Electromyography/instrumentation , False Positive Reactions , Hand/physiology , Humans , Movement/physiology
10.
IEEE Trans Neural Syst Rehabil Eng ; 22(4): 774-83, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24760926

ABSTRACT

The ability to recognize various forms of contaminants in surface electromyography (EMG) signals and to ascertain the overall quality of such signals is important in many EMG-enabled rehabilitation systems. In this paper, new methods for the automatic identification of commonly occurring contaminant types in surface EMG signals are presented. Such methods are advantageous because the contaminant type is typically not known in advance. The presented approach uses support vector machines as the main classification system. Both simulated and real EMG signals are used to assess the performance of the methods. The contaminants considered include: 1) electrocardiogram interference; 2) motion artifact; 3) power line interference; 4) amplifier saturation; and 5) additive white Gaussian noise. Results show that the contaminants can readily be distinguished at lower signal to noise ratios, with a growing degree of confusion at higher signal to noise ratios, where their effects on signal quality are less significant.


Subject(s)
Action Potentials/physiology , Algorithms , Artifacts , Electromyography/methods , Muscle Contraction/physiology , Muscle, Skeletal/physiology , Pattern Recognition, Automated/methods , Data Interpretation, Statistical , Humans , Reproducibility of Results , Sensitivity and Specificity
11.
J Cardiovasc Magn Reson ; 15: 28, 2013 Mar 30.
Article in English | MEDLINE | ID: mdl-23548176

ABSTRACT

BACKGROUND: T2-weighted cardiovascular magnetic resonance (CMR) is clinically useful for imaging the ischemic area-at-risk and amount of salvageable myocardium in patients with acute myocardial infarction (MI). However, to date, quantification of oedema is user-defined and potentially subjective. METHODS: We describe a highly automatic framework for quantifying myocardial oedema from bright blood T2-weighted CMR in patients with acute MI. Our approach retains user input (i.e. clinical judgment) to confirm the presence of oedema on an image which is then subjected to an automatic analysis. The new method was tested on 25 consecutive acute MI patients who had a CMR within 48 hours of hospital admission. Left ventricular wall boundaries were delineated automatically by variational level set methods followed by automatic detection of myocardial oedema by fitting a Rayleigh-Gaussian mixture statistical model. These data were compared with results from manual segmentation of the left ventricular wall and oedema, the current standard approach. RESULTS: The mean perpendicular distances between automatically detected left ventricular boundaries and corresponding manually delineated boundaries were in the range of 1-2 mm. Dice similarity coefficients for agreement (0=no agreement, 1=perfect agreement) between manual delineation and automatic segmentation of the left ventricular wall boundaries and oedema regions were 0.86 and 0.74, respectively. CONCLUSION: Compared to standard manual approaches, the new highly automatic method for estimating myocardial oedema is accurate and straightforward. It has potential as a generic software tool for physicians to use in clinical practice.
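
The Dice similarity coefficient quoted in the results is twice the size of the overlap between two regions divided by the sum of their sizes. A minimal sketch, assuming each segmentation is represented as a collection of pixel coordinates (the function name and the representation are illustrative choices, not the authors' code):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two regions given as pixel-coordinate sets.

    Returns 0.0 for no overlap and 1.0 for identical regions.
    Two empty regions are treated as identical by convention.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


manual = {(0, 0), (0, 1)}
automatic = {(0, 1), (1, 1)}
score = dice_coefficient(manual, automatic)  # one shared pixel out of 2 + 2
```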


Subject(s)
Edema, Cardiac/diagnosis , Heart Ventricles/pathology , Image Interpretation, Computer-Assisted , Magnetic Resonance Imaging, Cine , Myocardial Infarction/diagnosis , Myocardium/pathology , Adult , Aged , Automation , Edema, Cardiac/pathology , Female , Humans , Male , Middle Aged , Models, Statistical , Myocardial Infarction/pathology , Observer Variation , Predictive Value of Tests , Reproducibility of Results , Software
12.
J Acoust Soc Am ; 131(3): EL268-74, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22423819

ABSTRACT

Hearing-aid wearers have reported sound source locations as being perceptually internalized (i.e., inside their head). The contribution of hearing-aid design to internalization has, however, received little attention. This experiment compared the sensitivity of hearing-impaired (HI) and normal-hearing listeners to externalization cues when listening with their own ears and simulated behind-the-ear hearing-aids in increasingly complex listening situations and reduced pinna cues. Participants rated the degree of externalization using a multiple-stimulus listening test for mixes of internalized and externalized speech stimuli presented over headphones. The results showed that HI listeners had a contracted perception of externalization correlated with high-frequency hearing loss.


Subject(s)
Auditory Perception/physiology , Cues , Hearing Aids , Hearing Loss/physiopathology , Acoustic Stimulation , Adult , Female , Humans , Male , Middle Aged , Sound Localization/physiology , Speech Perception/physiology , Young Adult
13.
Article in English | MEDLINE | ID: mdl-22256202

ABSTRACT

Viability assessment of heart muscle after a myocardial infarction is an important step for diagnosis and therapy planning. It is important to quantify the area of edema because it can differentiate between viable and dead myocardial tissues. In this paper an automatic method to quantify cardiac edema is presented. The method is based on a combination of morphological operations and statistical thresholding. Using real MRI data it is demonstrated that the proposed method can delineate the edema region comparably to manual segmentation, with a linear correlation coefficient r=0.76 and a mean difference of around 9.95%. The quantification result is also used to generate a 3D visualisation model showing the normal myocardial wall and edema region, which will enhance clinicians' diagnostic capability by showing the real pattern of edema distribution together with a quantitative description.


Subject(s)
Edema, Cardiac/pathology , Imaging, Three-Dimensional/methods , Magnetic Resonance Imaging/methods , Myocardium/pathology , Automation , Heart Ventricles/pathology , Humans , Image Processing, Computer-Assisted
14.
Otol Neurotol ; 31(3): 486-91, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20042993

ABSTRACT

OBJECTIVE: To produce a reliable objective method of assessing the House-Brackmann (H-B) and regional grades of facial palsy with the results produced and presented in a time and manner suitable for a routine clinical setting. STUDY DESIGN: Analysis of video pixel data using artificial neural networks (ANNs). SETTING: Tertiary-referral neuro-otologic center. SUBJECTS: Subjects with varying degrees of unilateral facial palsy. METHOD: Clinicians assessed videos of subjects with varying degrees of facial palsy performing prescribed movements. The results of their overall and regional assessments were used to train ANNs. These were then tested for consistency, accuracy, and ability to identify clinical changes in grading. RESULTS: A group of subjects had their objective computer assessment repeated, and consistent H-B and regional grades were obtained. A second group had both subjective clinical and objective computer assessments performed. The program gave results that were within the expected level of agreement with the subjective clinical assessment for both H-B and regional grades. A third group had repeated clinical and computer assessments from the time of onset to recovery of facial function. The changes in the computer results both for H-B and regional grades tracked the clinical change. CONCLUSION: It is possible to measure consistently and objectively the H-B and regional grades of facial palsy using trained ANNs to analyze video pixel data, and this can be done in a routine clinical environment by a technician. The results from each region of the face are presented as a Facogram along with the H-B grade.


Subject(s)
Facial Paralysis/diagnosis , Facial Paralysis/physiopathology , Image Interpretation, Computer-Assisted/methods , Severity of Illness Index , Electrodiagnosis , Humans , Movement , Video Recording
15.
IEEE Trans Biomed Eng ; 56(7): 1864-70, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19336281

ABSTRACT

Facial paralysis is the loss of voluntary muscle movement of one side of the face. A quantitative, objective, and reliable assessment system would be an invaluable tool for clinicians treating patients with this condition. This paper presents a novel framework for objective measurement of facial paralysis. The motion information in the horizontal and vertical directions and the appearance features on the apex frames are extracted based on the local binary patterns (LBPs) on the temporal-spatial domain in each facial region. These features are temporally and spatially enhanced by the application of novel block processing schemes. A multiresolution extension of uniform LBP is proposed to efficiently combine the micropatterns and large-scale patterns into a feature vector. The symmetry of facial movements is measured by the resistor-average distance (RAD) between LBP features extracted from the two sides of the face. Support vector machine is applied to provide quantitative evaluation of facial paralysis based on the House-Brackmann (H-B) scale. The proposed method is validated by experiments with 197 subject videos, which demonstrates its accuracy and efficiency.
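
The resistor-average distance combines the two directed Kullback-Leibler divergences in the same way that parallel resistors combine: 1/RAD = 1/D(p||q) + 1/D(q||p). The sketch below applies it to two feature histograms. The smoothing constant `eps`, the handling of identical histograms, and the function names are assumptions made for this illustration and are not taken from the paper.

```python
import math


def normalise(hist, eps=1e-9):
    """Turn raw counts into probabilities, smoothing to avoid zero bins."""
    total = sum(hist) + eps * len(hist)
    return [(h + eps) / total for h in hist]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def resistor_average_distance(hist_left, hist_right):
    """Symmetric distance between two histograms (0 when they match)."""
    p, q = normalise(hist_left), normalise(hist_right)
    d_pq, d_qp = kl_divergence(p, q), kl_divergence(q, p)
    if d_pq <= 0 or d_qp <= 0:
        return 0.0  # identical (or numerically indistinguishable) inputs
    return 1.0 / (1.0 / d_pq + 1.0 / d_qp)


symmetric_face = resistor_average_distance([4, 3, 2, 1], [4, 3, 2, 1])
asymmetric_face = resistor_average_distance([4, 3, 2, 1], [1, 2, 3, 4])
```

Unlike either directed divergence alone, the result does not depend on which side of the face is passed first, which suits a symmetry measure.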


Subject(s)
Face/physiopathology , Facial Paralysis/physiopathology , Image Processing, Computer-Assisted/methods , Movement , Video Recording/methods , Algorithms , Artificial Intelligence , Data Interpretation, Statistical , Humans , Reproducibility of Results
16.
Article in English | MEDLINE | ID: mdl-19163791

ABSTRACT

This paper presents a novel framework for objective measurement of facial paralysis in biomedical videos. The motion information in the horizontal and vertical directions and the appearance features on the apex frames are extracted based on the Local Binary Patterns (LBP) on the temporal-spatial domain in each facial region. These features are temporally and spatially enhanced by the application of block schemes. A multi-resolution extension of uniform LBP is proposed to efficiently combine the micro-patterns and large-scale patterns into a feature vector, which increases the algorithmic robustness and reduces noise effects while still retaining computational simplicity. The symmetry of facial movements is measured by the Resistor-Average Distance (RAD) between LBP features extracted from the two sides of the face. Support Vector Machine (SVM) is applied to provide quantitative evaluation of facial paralysis based on the House-Brackmann (H-B) Scale. The proposed method is validated by experiments with 197 subject videos, which demonstrates its accuracy and efficiency.


Subject(s)
Artificial Intelligence , Facial Paralysis/diagnosis , Facial Paralysis/physiopathology , Image Interpretation, Computer-Assisted/methods , Movement , Photography/methods , Video Recording/methods , Algorithms , Facial Expression , Humans , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
17.
Appl Opt ; 46(21): 4579-86, 2007 Jul 20.
Article in English | MEDLINE | ID: mdl-17609703

ABSTRACT

Three principal strategies for the compression of phase-shifting digital holograms (interferogram domain-, hologram domain-, and reconstruction domain-based strategies) are reviewed and their effects in the reconstruction domain are investigated. Images of the reconstructions are provided to visually compare the performances of the methods. In addition to single reconstructions the compression effects on different depth reconstructions and reconstructions corresponding to different viewing angles are investigated so that a range of the 3D aspects of the holograms may be considered. Although comparable at low compression rates, it is found that depth and perspective information is degraded in different ways with the different techniques at high compression rates. A hologram of an object with sufficient details at different depths is used so that both parallax and depth effects can be illustrated.

18.
Appl Opt ; 46(3): 351-6, 2007 Jan 20.
Article in English | MEDLINE | ID: mdl-17228380

ABSTRACT

Phase-shifting digital hologram compression has been mainly studied in the recording domain, where data possess a rather randomlike appearance, yielding reduced compression efficiency. We carry out the compression of such data in the reconstruction domain, which benefits from the spatial correlation of the data, yielding increased efficiency. Real holographic data are used to demonstrate the performance of the new approach. It is also shown that the reconstruction is not limited to the initially obtained view, as additional views can still be obtained with appropriate postprocessing.

19.
IEEE Trans Image Process ; 15(12): 3804-11, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17153953

ABSTRACT

Fresnelets are wavelet-like base functions specially tailored for digital holography applications. We introduce their use in phase-shifting interferometry (PSI) digital holography for the compression of such holographic data. Two compression methods are investigated. One uses uniform quantization of the Fresnelet coefficients followed by lossless coding, and the other uses set partitioning in hierarchical trees (SPIHT) coding. Quantization and lossless coding of the original data is used to compare the performance of the proposed algorithms. The comparison reveals that the Fresnelet transform of phase-shifting holograms in combination with SPIHT or uniform quantization can be used very effectively for the compression of holographic data. The performance of the new compression schemes is demonstrated on real PSI digital holographic data.
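
As a rough illustration of the uniform quantization step, the sketch below maps a list of real-valued coefficients onto a fixed number of equally spaced levels and back. It is a generic example under stated assumptions: the function names are hypothetical, the Fresnelet transform and the lossless coder are not shown, and complex coefficients would need their real and imaginary parts handled separately (how the paper treats them is not stated in the abstract).

```python
def quantize_uniform(values, n_levels):
    """Map each value to the index of the nearest of n_levels equal steps.

    Returns (indices, lowest_value, step) so the values can be rebuilt.
    """
    lo, hi = min(values), max(values)
    if hi == lo or n_levels < 2:
        return [0] * len(values), lo, 0.0
    step = (hi - lo) / (n_levels - 1)
    indices = [round((v - lo) / step) for v in values]
    return indices, lo, step


def dequantize(indices, lo, step):
    """Rebuild approximate values from quantization indices."""
    return [lo + i * step for i in indices]


coefficients = [0.0, 0.24, 0.5, 1.0]
indices, lo, step = quantize_uniform(coefficients, n_levels=5)
restored = dequantize(indices, lo, step)
max_error = max(abs(a - b) for a, b in zip(coefficients, restored))
```

The integer indices are what a lossless coder would then compress; fewer levels give smaller indices and a larger reconstruction error, bounded by half a step.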


Subject(s)
Algorithms , Data Compression/methods , Holography/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity
20.
Appl Opt ; 45(11): 2437-43, 2006 Apr 10.
Article in English | MEDLINE | ID: mdl-16623240

ABSTRACT

A compression method of phase-shifting digital holographic data is presented. Three interference patterns are recorded, and holographic information is extracted from them by phase-shifting interferometry. The scheme uses standard baseline Joint Photographic Experts Group (JPEG) or standard JPEG-2000 image compression techniques on the recorded interference patterns to reduce the amount of data to be stored. High compression rates are achieved for good reconstructed object image quality. The utility of the proposed method is experimentally verified with real holographic data. Results for compression rates using JPEG-2000 and JPEG of approximately 27 and 20, respectively, for a normalized root-mean-square error of approximately 0.7 are demonstrated.
