Results 1 - 20 of 26
1.
Sci Rep ; 14(1): 6981, 2024 Mar 24.
Article in English | MEDLINE | ID: mdl-38523168

ABSTRACT

Stitched images can offer a broader field of view, but their boundaries can be irregular and visually unpleasant. To address this issue, current image rectangling methods distort local grids multiple times to obtain rectangular images with regular boundaries. However, these methods can introduce content distortion and lose boundary information. We have developed an image rectangling solution using a reparameterized Transformer structure that requires only a single distortion. Additionally, we have designed an assisted learning network to aid the image rectangling network. To improve the network's parallel efficiency, we have introduced a local thin-plate spline transform strategy to achieve efficient local deformation. Ultimately, the proposed method achieves state-of-the-art performance in stitched image rectangling with few parameters while maintaining high content fidelity. The code is available at https://github.com/MelodYanglc/TransRectangling.
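The local thin-plate spline strategy above builds on the classic TPS radial kernel. A minimal sketch of that kernel and a 1-D displacement evaluation (function names and the displacement form are ours, not the paper's; the full TPS solve for weights is omitted):

```python
import math

def tps_kernel(r):
    """Thin-plate spline radial basis U(r) = r^2 * log(r), with U(0) = 0."""
    return 0.0 if r == 0 else r * r * math.log(r)

def tps_warp(point, controls, weights, affine):
    """Evaluate a 1-D TPS displacement at `point` given control points,
    precomputed kernel weights, and affine terms (a0 + a1*x + a2*y)."""
    x, y = point
    a0, a1, a2 = affine
    disp = a0 + a1 * x + a2 * y
    for (cx, cy), w in zip(controls, weights):
        disp += w * tps_kernel(math.hypot(x - cx, y - cy))
    return disp
```

In practice the weights and affine terms come from solving a small linear system over the control-point displacements; here they are taken as given.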

2.
Front Hum Neurosci ; 17: 1253211, 2023.
Article in English | MEDLINE | ID: mdl-37727862

ABSTRACT

Introduction: Speech production involves neurological planning and articulatory execution. How speakers prepare for articulation is a significant aspect of speech production research. Previous studies have focused on isolated words or short phrases to explore speech planning mechanisms linked to articulatory behaviors, including investigating the eye-voice span (EVS) during text reading. However, these experimental paradigms do not replicate real-world speech processes. Additionally, our understanding of the neurological dimension of speech planning remains limited. Methods: This study examines speech planning mechanisms during continuous speech production by analyzing behavioral (eye movement and speech) and neurophysiological (EEG) data within a continuous speech production task. The study specifically investigates the influence of semantic consistency on speech planning and the occurrence of "look ahead" behavior. Results: The outcomes reveal the pivotal role of semantic coherence in facilitating fluent speech production. Speakers access lexical representations and phonological information before initiating speech, emphasizing the significance of semantic processing in speech planning. Behaviorally, the EVS decreases progressively during continuous reading of regular sentences, with a slight increase for non-regular sentences. Moreover, eye movement pattern analysis identifies two distinct speech production modes, highlighting the importance of semantic comprehension and prediction in higher-level lexical processing. Neurologically, the dual pathway model of speech production is supported, indicating a dorsal information flow and frontal lobe involvement. The brain network linked to semantic understanding exhibits a negative correlation with semantic coherence, with significant activation during semantic incoherence and suppression in regular sentences.
Discussion: The study's findings enhance comprehension of speech planning mechanisms and offer insights into the role of semantic coherence in continuous speech production. Furthermore, the research methodology establishes a valuable framework for future investigations in this domain.
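The eye-voice span discussed above is, at its core, the lag between when the eyes reach a word and when the voice utters it. A toy per-word computation under that reading (function name is ours):

```python
def eye_voice_span(fixation_onsets, voice_onsets):
    """Per-word eye-voice span: voice onset minus fixation onset for each
    word, i.e. how far the eyes lead the voice (positive = eyes ahead)."""
    return [v - f for f, v in zip(fixation_onsets, voice_onsets)]
```

With fixation onsets [0.0, 0.5] s and voice onsets [0.8, 1.2] s, the spans are roughly 0.8 s and 0.7 s; a shrinking span over a sentence matches the progressive decrease reported for regular sentences.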

3.
Article in English | MEDLINE | ID: mdl-37624721

ABSTRACT

Speech emotion recognition (SER) plays an important role in human-computer interaction, as it can provide better interactivity to enhance user experiences. Existing approaches tend to apply deep learning networks directly to distinguish emotions. Among them, the convolutional neural network (CNN) is the most commonly used method to learn emotional representations from spectrograms. However, CNNs do not explicitly model the associations of features along the spectral, temporal, and channel axes or their relative relevance, which limits representation learning. In this article, we propose a deep spectro-temporal-channel network (DSTCNet) to improve the representational ability for speech emotion. The proposed DSTCNet integrates several spectro-temporal-channel (STC) attention modules into a general CNN. Specifically, we propose the STC module that infers a 3-D attention map along the dimensions of time, frequency, and channel. The STC attention can focus more on the regions of crucial time frames, frequency ranges, and feature channels. Finally, experiments were conducted on the Berlin emotional database (EmoDB) and the interactive emotional dyadic motion capture (IEMOCAP) database. The results reveal that our DSTCNet outperforms traditional CNN-based methods and several state-of-the-art methods.
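For intuition only, a 3-D attention map over time, frequency, and channel can be approximated by per-axis average pooling followed by sigmoid gating. The real STC module is learned; this pure-Python sketch is our own simplification:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def stc_attention(x):
    """Toy spectro-temporal-channel attention on a tensor x[c][f][t]:
    pool each axis, squash to a gate, and reweight the input by the
    product of the three gates."""
    C, F, T = len(x), len(x[0]), len(x[0][0])
    ch = [sigmoid(sum(x[c][f][t] for f in range(F) for t in range(T)) / (F * T))
          for c in range(C)]
    fr = [sigmoid(sum(x[c][f][t] for c in range(C) for t in range(T)) / (C * T))
          for f in range(F)]
    tm = [sigmoid(sum(x[c][f][t] for c in range(C) for f in range(F)) / (C * F))
          for t in range(T)]
    return [[[x[c][f][t] * ch[c] * fr[f] * tm[t] for t in range(T)]
             for f in range(F)] for c in range(C)]
```

Bins that sit in strong channels, frequency bands, and time frames are amplified relative to the rest, which is the qualitative behavior the abstract describes.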

4.
Cereb Cortex ; 33(13): 8620-8632, 2023 06 20.
Article in English | MEDLINE | ID: mdl-37118893

ABSTRACT

Sentence oral reading requires not only a coordinated effort among the visual, articulatory, and cognitive processes but also a top-down influence from linguistic knowledge on visual-motor behavior. Despite gradual recognition of a predictive coding effect in this process, there is currently no comprehensive demonstration of the time-varying brain dynamics that underlie the oral reading strategy. To address this, our study used a multimodal approach, combining real-time recording of electroencephalography, eye movements, and speech with a comprehensive examination of regional, inter-regional, sub-network, and whole-brain responses. Our study identified the top-down predictive effect with a phrase-grouping phenomenon in the fixation interval and eye-voice span. This effect was associated with delta and theta band synchronization in the prefrontal, anterior temporal, and inferior frontal lobes. We also observed early activation of the cognitive control network and its recurrent interactions with the visual-motor networks at the phrase rate. Finally, our study emphasizes the importance of cross-frequency coupling as a promising neural realization of hierarchical sentence structuring and calls for further investigation.


Subjects
Language; Reading; Electroencephalography; Brain/physiology; Linguistics
5.
Sci Rep ; 13(1): 2478, 2023 Feb 11.
Article in English | MEDLINE | ID: mdl-36774391

ABSTRACT

Multi-focus image fusion fuses multiple images with different focus areas into a single all-in-focus image, which has important application value. To address the shortcomings of current fusion methods in retaining detail information from the original images, a two-stage fusion architecture is designed. In the training phase, an encoder-decoder network combining a polarized self-attention module with the DenseNet structure is designed for image reconstruction tasks, to enhance the model's ability to retain original information. In the fusion stage, a fusion strategy based on an edge feature map, computed from the encoded feature maps, is designed to strengthen attention to detail information during fusion. Compared with 9 classical fusion algorithms, the proposed algorithm achieves advanced fusion performance in both subjective and objective evaluations, and the fused image better retains the information of the original images.
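One plain, hypothetical reading of an edge-guided fusion rule: per pixel, keep the source with the stronger local edge response, since in-focus regions have sharper edges. The paper's strategy operates on encoded feature maps; this toy version works directly on raw intensities:

```python
def grad_mag(img, i, j):
    """Simple finite-difference edge response at pixel (i, j)."""
    h, w = len(img), len(img[0])
    gx = img[i][min(j + 1, w - 1)] - img[i][max(j - 1, 0)]
    gy = img[min(i + 1, h - 1)][j] - img[max(i - 1, 0)][j]
    return abs(gx) + abs(gy)

def edge_guided_fuse(a, b):
    """Per-pixel selection: take the pixel from whichever source image
    has the larger local edge response (ties go to `a`)."""
    return [[a[i][j] if grad_mag(a, i, j) >= grad_mag(b, i, j) else b[i][j]
             for j in range(len(a[0]))] for i in range(len(a))]
```

A real pipeline would smooth the decision map and reconstruct through the decoder; the selection principle is the same.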

6.
J Neural Eng ; 20(1)2023 02 14.
Article in English | MEDLINE | ID: mdl-36720164

ABSTRACT

Objective. Constructing an efficient human emotion recognition model based on electroencephalogram (EEG) signals is significant for realizing emotional brain-computer interaction and improving machine intelligence. Approach. In this paper, we present a spatial-temporal feature fused convolutional graph attention network (STFCGAT) model based on multi-channel EEG signals for human emotion recognition. First, we combined the single-channel differential entropy (DE) feature with the cross-channel functional connectivity (FC) feature to extract both the temporal variation and spatial topological information of EEG. After that, a novel convolutional graph attention network was used to fuse the DE and FC features and further extract higher-level graph structural information with sufficient expressive power for emotion recognition. Furthermore, we introduced a multi-headed attention mechanism in graph neural networks to improve the generalization ability of the model. Main results. We evaluated the emotion recognition performance of our proposed model on the public SEED and DEAP datasets, achieving a classification accuracy of 99.11% ± 0.83% and 94.83% ± 3.41% in the subject-dependent and subject-independent experiments on the SEED dataset, and an accuracy of 91.19% ± 1.24% and 92.03% ± 4.57% for discrimination of arousal and valence in subject-independent experiments on the DEAP dataset. Notably, our model achieved state-of-the-art performance on cross-subject emotion recognition tasks for both datasets. In addition, we gained insight into the proposed framework through both ablation experiments and analysis of the spatial patterns of the FC and DE features. Significance. All these results prove the effectiveness of the STFCGAT architecture for emotion recognition and also indicate that there are significant differences in the spatial-temporal characteristics of the brain under different emotional states.
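The single-channel DE feature has a standard closed form under a Gaussian assumption on a band-passed EEG segment: 0.5 * ln(2*pi*e*var). A minimal sketch:

```python
import math

def differential_entropy(samples):
    """Differential entropy of an EEG band segment assumed Gaussian:
    0.5 * ln(2*pi*e*var), the usual DE feature in EEG emotion work."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return 0.5 * math.log(2 * math.pi * math.e * var)
```

In a full pipeline this is computed per channel and per frequency band, then fed to the graph network alongside the cross-channel FC features.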


Subjects
Emotions; Recognition, Psychology; Humans; Brain; Electroencephalography; Artificial Intelligence
7.
Front Comput Neurosci ; 16: 919215, 2022.
Article in English | MEDLINE | ID: mdl-35874316

ABSTRACT

In recent years, electroencephalograph (EEG) studies on speech comprehension have been extended from a controlled paradigm to a natural paradigm. Under the hypothesis that the brain can be approximated as a linear time-invariant system, the neural response to natural speech has been investigated extensively using temporal response functions (TRFs). However, most studies have modeled TRFs in the electrode space, which is a mixture of brain sources and thus cannot fully reveal the functional mechanism underlying speech comprehension. In this paper, we propose methods for investigating the brain networks of natural speech comprehension using TRFs on the basis of EEG source reconstruction. We first propose a functional hyper-alignment method with an additive average method to reduce EEG noise. Then, we reconstruct neural sources within the brain based on the EEG signals to estimate TRFs from speech stimuli to source areas, and then investigate the brain networks in the neural source space on the basis of the community detection method. To evaluate TRF-based brain networks, EEG data were recorded in story listening tasks with normal speech and time-reversed speech. To obtain reliable structures of brain networks, we detected TRF-based communities from multiple scales. As a result, the proposed functional hyper-alignment method could effectively reduce the noise caused by individual settings in an EEG experiment and thus improve the accuracy of source reconstruction. The detected brain networks for normal speech comprehension were clearly distinctive from those for non-semantically driven (time-reversed speech) audio processing. Our result indicates that the proposed source TRFs can reflect the cognitive processing of spoken language and that the multi-scale community detection method is powerful for investigating brain networks.
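A TRF models the response at time t as a weighted sum of recent stimulus history, so the first step is a lagged design matrix. A minimal sketch (zero-padded lags; the estimation itself, typically ridge regression, is omitted):

```python
def lagged_design(stim, n_lags):
    """Design matrix for a TRF: row t holds stim[t], stim[t-1], ...,
    stim[t-n_lags+1] (zero-padded at the start), so response[t] can be
    modeled as a weighted sum of the recent stimulus history."""
    return [[stim[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)]
            for t in range(len(stim))]
```

The TRF weights are then the regression coefficients mapping these rows to the (source-reconstructed) neural response; the fitted lag profile is what gets interpreted.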

8.
Comput Intell Neurosci ; 2022: 9948218, 2022.
Article in English | MEDLINE | ID: mdl-35685160

ABSTRACT

On-board system fault knowledge base (KB) is a collection of fault causes, maintenance methods, and interrelationships among on-board modules and components of high-speed railways, which plays a crucial role in knowledge-driven dynamic operation and maintenance (O&M) decisions for on-board systems. To solve the problem of multi-source heterogeneity of on-board system O&M data, an entity matching (EM) approach using the BERT model and semi-supervised incremental learning is proposed. The heterogeneous knowledge fusion task is formulated as a pairwise binary classification task of entities in the knowledge units. Firstly, the deep semantic features of fault knowledge units are obtained by BERT. We also investigate the effectiveness of knowledge unit features extracted from different hidden layers of the model on heterogeneous knowledge fusion during model fine-tuning. To further improve the utilization of unlabeled test samples, a semi-supervised incremental learning strategy based on pseudo labels is devised. By selecting entity pairs with high confidence to generate pseudo labels, the labeled sample set is expanded to realize incremental learning and enhance the knowledge fusion ability of the model. Furthermore, the model's robustness is strengthened by embedding-based adversarial training in the fine-tuning stage. Based on the on-board system's O&M data, this paper constructs the fault KB and compares the model with other solutions developed for related matching tasks, which verifies the effectiveness of this model in the heterogeneous knowledge fusion task of the on-board system.
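The pseudo-label strategy described above can be sketched as a generic self-training loop. `model_fit` and `model_prob` below are placeholders for BERT fine-tuning and pair scoring, not the paper's API:

```python
def pseudo_label_rounds(model_fit, model_prob, labeled, unlabeled,
                        thresh=0.95, rounds=3):
    """Semi-supervised incremental loop: train on the labeled set, score
    the unlabeled pool, promote predictions with confidence >= thresh
    (either class) to pseudo-labels, and repeat."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = model_fit(labeled)
        keep = []
        for x in pool:
            p = model_prob(model, x)  # P(match) for entity pair x
            if p >= thresh or p <= 1 - thresh:
                labeled.append((x, 1 if p >= thresh else 0))  # pseudo-label
            else:
                keep.append(x)  # still uncertain; stays unlabeled
        pool = keep
    return labeled, pool
```

The threshold and round count are illustrative; in practice both control how much pseudo-label noise leaks into the training set.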

9.
IEEE Trans Cybern ; 52(3): 1364-1376, 2022 Mar.
Article in English | MEDLINE | ID: mdl-32356771

ABSTRACT

Spikes are the currency in central nervous systems for information transmission and processing. They are also believed to play an essential role in the low power consumption of biological systems, whose efficiency attracts increasing attention in the field of neuromorphic computing. However, efficient processing and learning of discrete spikes remain a challenging problem. In this article, we make our contributions toward this direction. A simplified spiking neuron model is first introduced, with the effects of both synaptic input and firing output on the membrane potential modeled with an impulse function. An event-driven scheme is then presented to further improve the processing efficiency. Based on the neuron model, we propose two new multispike learning rules which demonstrate better performance over other baselines on various tasks, including association, classification, and feature detection. In addition to efficiency, our learning rules demonstrate high robustness against strong noise of different types. They can also be generalized to different spike coding schemes for the classification task, and notably, a single neuron is capable of solving multicategory classifications with our learning rules. In the feature detection task, we re-examine the ability of unsupervised spike-timing-dependent plasticity, present its limitations, and find a new phenomenon of losing selectivity. In contrast, our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied. Moreover, our rules can not only detect features but also discriminate among them. The improved performance of our methods makes them a preferable choice for neuromorphic computing.
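An event-driven neuron with impulse-like synaptic and reset effects, in the spirit of the simplified model above, only needs to update its state when an input spike arrives. All parameter values below are illustrative, not the paper's:

```python
import math

def run_neuron(events, weight=1.0, tau=20.0, threshold=1.0, reset_drop=1.0):
    """Event-driven LIF-like neuron: the membrane decays exponentially
    between input spike events and jumps by `weight` at each event
    (impulse-function input); crossing `threshold` emits an output spike
    and drops the membrane by `reset_drop` (impulse-function reset)."""
    v, t_last, out = 0.0, 0.0, []
    for t in sorted(events):
        v *= math.exp(-(t - t_last) / tau)  # decay applied lazily, per event
        v += weight
        if v >= threshold:
            out.append(t)
            v -= reset_drop
        t_last = t
    return out
```

Because nothing is computed between events, cost scales with the number of spikes rather than with simulated time steps, which is the efficiency argument for event-driven schemes.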


Subjects
Neural Networks, Computer; Neurons; Learning; Neurons/physiology
10.
IEEE Trans Neural Netw Learn Syst ; 33(4): 1714-1726, 2022 04.
Article in English | MEDLINE | ID: mdl-33471769

ABSTRACT

Spiking neural networks (SNNs) are considered a potential candidate to overcome current challenges, such as the high power consumption encountered by artificial neural networks (ANNs); however, there is still a gap between them with respect to recognition accuracy on various tasks. A conversion strategy was thus introduced recently to bridge this gap by mapping a trained ANN to an SNN. However, it is still unclear to what extent the resulting SNN retains both the accuracy advantage of the ANN and the high efficiency of the spike-based paradigm of computation. In this article, we propose two new conversion methods, namely TerMapping and AugMapping. TerMapping is a straightforward extension of a typical threshold-balancing method with a double-threshold scheme, while AugMapping additionally incorporates a new scheme of augmented spikes that employs a spike coefficient to carry the number of typical all-or-nothing spikes occurring at a time step. We examine the performance of our methods on the MNIST, Fashion-MNIST, and CIFAR10 data sets. The results show that the proposed double-threshold scheme can effectively improve the accuracy of the converted SNNs. More importantly, the proposed AugMapping is more advantageous for constructing accurate, fast, and efficient deep SNNs compared with other state-of-the-art approaches. Our study therefore provides new approaches for further integration of advanced techniques in ANNs to improve the performance of SNNs, which could be of great merit to applied developments with spike-based neuromorphic computing.
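TerMapping extends threshold balancing, a standard conversion step in which a layer's weights are normalized by the maximum activation observed on calibration data, so the SNN's firing threshold of 1.0 is rarely exceeded in one step. A sketch of that baseline step only (the double-threshold extension is not shown):

```python
def threshold_balance(weights, activations):
    """Classic threshold-balancing step for ANN->SNN conversion: divide a
    layer's weights by the maximum activation seen on calibration data,
    so post-conversion inputs stay within the unit firing threshold."""
    max_act = max(activations)
    return [w / max_act for w in weights]
```

Applied layer by layer, this keeps firing rates in a range where spike counts approximate the ANN's activations; TerMapping adds a second, negative threshold on top of this scaling.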


Subjects
Neural Networks, Computer; Neurons; Recognition, Psychology
11.
Neuroscience ; 469: 46-58, 2021 08 10.
Article in English | MEDLINE | ID: mdl-34119576

ABSTRACT

Being able to accurately perceive the emotion expressed by facial or verbal expressions from others is critical to successful social interaction. However, only a few studies have examined multimodal interactions in speech emotion, and there is no consistency across studies of speech emotion perception. It remains unclear how speech emotion of different valence is perceived from multimodal stimuli by the human brain. In this paper, we conducted a functional magnetic resonance imaging (fMRI) study with an event-related design, using dynamic facial expressions and emotional speech stimuli to express different emotions, in order to explore the perception mechanism of speech emotion in the audio-visual modality. Representational similarity analysis (RSA), whole-brain searchlight analysis, and conjunction analysis of emotion were used to interpret the representation of speech emotion in different aspects. Notably, a weighted RSA approach was proposed to evaluate the contribution of each candidate model to the best fitted model, providing a supplement to RSA. The results of weighted RSA indicated that the fitted models were superior to all candidate models and that the weights could be used to explain the representation of ROIs. The bilateral amygdala was shown to be associated with the processing of both positive and negative emotions but not neutral emotion. The results indicate that the left posterior insula and the left anterior superior temporal gyrus (STG) play important roles in the perception of multimodal speech emotion.


Subjects
Brain Mapping; Speech Perception; Brain; Emotions; Facial Expression; Humans; Magnetic Resonance Imaging; Speech; Temporal Lobe/diagnostic imaging
12.
Neural Netw ; 142: 205-212, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34023641

ABSTRACT

Efficient learning of spikes plays a valuable role in training spiking neural networks (SNNs) to have desired responses to input stimuli. However, current learning rules are limited to a binary form of spikes. The seemingly ubiquitous phenomenon of bursting in nervous systems suggests a way to carry more information with spike bursts in addition to spike times. Based on this, we introduce an advanced form, the augmented spike, where spike coefficients are used to carry additional information. How neurons could learn from and benefit from augmented spikes remains unclear. In this paper, we propose two new efficient learning rules to process spatiotemporal patterns composed of augmented spikes. Moreover, we examine the learning abilities of our methods with a synthetic recognition task of augmented spike patterns and two practical tasks of image classification. Experimental results demonstrate that our rules are capable of extracting information carried by both the timing and the coefficient of spikes. Our proposed approaches achieve remarkable performance and good robustness under various noise conditions, as compared to benchmarks. The improved performance indicates the merits of augmented spikes and our learning rules, which could be beneficial to and generalized across a broad range of spike-based platforms.
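An augmented spike can be represented as a (time, coefficient) pair, with the coefficient carrying what would otherwise be several all-or-nothing spikes at the same step. A minimal encoding sketch (representation and names are ours):

```python
def to_augmented(spike_counts):
    """Compress a per-timestep spike-count train into augmented spikes:
    (time, coefficient) pairs, where the coefficient replaces a burst of
    all-or-nothing spikes at that step."""
    return [(t, c) for t, c in enumerate(spike_counts) if c > 0]

def weighted_drive(aug_spikes, weight):
    """Total synaptic drive from augmented spikes: each event contributes
    weight * coefficient instead of just weight."""
    return sum(weight * c for _, c in aug_spikes)
```

The coefficient thus scales the synaptic effect of an event, which is what lets a learning rule exploit burst size in addition to timing.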


Subjects
Models, Neurological; Neural Networks, Computer; Action Potentials; Learning; Neurons
13.
Neural Netw ; 140: 261-273, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33838592

ABSTRACT

Continuous dimensional emotion recognition from speech helps robots or virtual agents capture the temporal dynamics of a speaker's emotional state in natural human-robot interactions. Temporal modulation cues obtained directly from the time-domain model of auditory perception can better reflect temporal dynamics than the acoustic features usually processed in the frequency domain. Feature extraction, which can reflect temporal dynamics of emotion from temporal modulation cues, is challenging because of the complexity and diversity of the auditory perception model. A recent neuroscientific study suggests that human brains derive multi-resolution representations through temporal modulation analysis. This study investigates multi-resolution representations of an auditory perception model and proposes a novel feature called multi-resolution modulation-filtered cochleagram (MMCG) for predicting valence and arousal values of emotional primitives. The MMCG is constructed by combining four modulation-filtered cochleagrams at different resolutions to capture various temporal and contextual modulation information. In addition, to model the multi-temporal dependencies of the MMCG, we designed a parallel long short-term memory (LSTM) architecture. The results of extensive experiments on the RECOLA and SEWA datasets demonstrate that MMCG provides the best recognition performance in both datasets among all evaluated features. The results also show that the parallel LSTM can build multi-temporal dependencies from the MMCG features, and the performance on valence and arousal prediction is better than that of a plain LSTM method.


Subjects
Emotions; Models, Neurological; Speech Perception; Speech Recognition Software; Cochlea/physiology; Cues; Humans; Machine Learning
14.
IEEE Trans Neural Netw Learn Syst ; 32(2): 625-638, 2021 02.
Article in English | MEDLINE | ID: mdl-32203038

ABSTRACT

The capability for environmental sound recognition (ESR) can determine the fitness of individuals, enabling them to avoid dangers or pursue opportunities when critical sound events occur. The fundamental principles of biological systems that give rise to such a remarkable ability remain mysterious. Additionally, the practical importance of ESR has attracted an increasing amount of research attention, but the chaotic and nonstationary nature of environmental sounds continues to make it a challenging task. In this article, we propose a spike-based framework for the ESR task from a more brain-like perspective. Our framework is a unifying system that consistently integrates three major functional parts: sparse encoding, efficient learning, and robust readout. We first introduce a simple sparse encoding, where key points are used for feature representation, and demonstrate its generalization to both spike- and nonspike-based systems. Then, we evaluate the learning properties of different learning rules in detail, adding our own contributions for improvements. Our results highlight the advantages of multispike learning, providing a selection reference for various spike-based developments. Finally, we combine the multispike readout with the other parts to form a complete ESR system. Experimental results show that our framework performs best compared with other baseline approaches. In addition, we show that our spike-based framework has several advantageous characteristics, including early decision making, learning from small datasets, and ongoing dynamic processing. Our framework is the first attempt to apply the multispike characteristic of biological neurons to ESR. Its performance may draw more research effort toward pushing the boundaries of the spike-based paradigm.
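One plain reading of key-point sparse encoding is local-maximum selection on the time-frequency plane: keep only bins that dominate their neighborhood, and let those drive spikes. This is our simplification, not necessarily the paper's exact criterion:

```python
def key_points(spec):
    """Sparse encoding by local maxima: return (row, col, value) for every
    time-frequency bin that strictly dominates its 8-neighborhood."""
    H, W = len(spec), len(spec[0])
    pts = []
    for i in range(H):
        for j in range(W):
            neigh = [spec[a][b]
                     for a in range(max(0, i - 1), min(H, i + 2))
                     for b in range(max(0, j - 1), min(W, j + 2))
                     if (a, b) != (i, j)]
            if neigh and spec[i][j] > max(neigh):
                pts.append((i, j, spec[i][j]))
    return pts
```

Because only a handful of bins survive, the downstream spiking layer sees a sparse, noise-robust representation rather than the full spectrogram.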


Subjects
Environment; Machine Learning; Neural Networks, Computer; Pattern Recognition, Automated/methods; Sound; Algorithms; Brain/physiology; Humans; Models, Neurological; Neurons
15.
J Speech Lang Hear Res ; 63(7): 2245-2254, 2020 07 20.
Article in English | MEDLINE | ID: mdl-32579867

ABSTRACT

Purpose The primary purpose of this study was to explore the audiovisual speech perception strategies adopted by normal-hearing and deaf people in processing familiar and unfamiliar languages. Our primary hypothesis was that they would adopt different perception strategies due to different sensory experiences at an early age, limitations of physical devices, developmental gaps in language, and other factors. Method Thirty normal-hearing adults and 33 prelingually deaf adults participated in the study. They were asked to perform judgment and listening tasks while watching videos of a Uygur-Mandarin bilingual speaker in a familiar language (Standard Chinese) or an unfamiliar language (Modern Uygur) while their eye movements were recorded by eye-tracking technology. Results Task had a slight influence on the distribution of selective attention, whereas subject group and language had significant influences. Specifically, the normal-hearing and deaf participants mainly gazed at the speaker's eyes and mouth, respectively; moreover, while the normal-hearing participants stared longer at the speaker's mouth when confronted with the unfamiliar language Modern Uygur, the deaf participants did not change their attention allocation pattern when perceiving the two languages. Conclusions Normal-hearing and deaf adults adopt different audiovisual speech perception strategies: Normal-hearing adults mainly look at the eyes, and deaf adults mainly look at the mouth. Additionally, language and task can also modulate the speech perception strategy.


Subjects
Deafness; Speech Perception; Adult; Eye-Tracking Technology; Hearing; Humans; Language; Mouth
16.
Neuroscience ; 415: 70-76, 2019 09 01.
Article in English | MEDLINE | ID: mdl-31330232

ABSTRACT

Understanding brain processing mechanisms from the perception of speech sounds to high-level semantic processing is vital for effective human-robot communication. In this study, 128-channel electroencephalography (EEG) signals were recorded while subjects listened to real words and pseudowords in Mandarin. By using an EEG source reconstruction method and a sliding-window Granger causality analysis, we analyzed the dynamic brain connectivity patterns. Results showed that the bilateral temporal cortex (lTC and rTC), the bilateral motor cortex (lMC and rMC), the frontal cortex (FC), and the occipital cortex (OC) were recruited in the process, with more complex patterns in the real word condition than in the pseudoword condition. The spatial pattern is basically consistent with previous functional MRI studies on the understanding of spoken Chinese. For the real word condition, speech perception and processing involved different connection patterns in the initial phoneme perception and processing phase, the phonological processing and lexical selection phase, and the semantic integration phase. Specifically, compared with pseudowords, a hub region in the FC and unique patterns of lMC → rMC and lTC → FC connectivity were found during processing real words after 180 ms, while a distributed network of temporal, motor, and frontal brain areas was involved after 300 ms. This may be related to semantic processing and integration. The involvement of both bottom-up input and top-down modulation in real word processing may support the previously proposed TRACE model. In sum, the findings of this study suggest that representations of speech involve dynamic interactions among distributed brain regions that communicate through time-specific functional networks.


Subjects
Auditory Perception/physiology; Brain/physiology; Semantics; Speech Perception/physiology; Adult; Brain Mapping; Electroencephalography; Female; Humans; Male; Phonetics
17.
Comput Intell Neurosci ; 2019: 3831809, 2019.
Article in English | MEDLINE | ID: mdl-31933621

ABSTRACT

Aspect-level sentiment classification aims to identify the sentiment polarity of a review expressed toward a target. In recent years, neural network-based methods have achieved success in aspect-level sentiment classification, and these methods fall into two types: the first takes the target information into account for context modelling, and the second models the context without considering the target information. The former has been shown to outperform the latter. However, most target-related models focus only on the impact of the target on context modelling, while ignoring the role of context in target modelling. In this study, we introduce an interactive neural network model named LT-T-TR, which divides a review into three parts: the left context with the target phrase, the target phrase itself, and the right context with the target phrase. The interaction between the left/right context and the target phrase is exploited by an attention mechanism to learn the representations of the left/right context and the target phrase separately. As a result, the most important words in the left/right context or in the target phrase are captured, and results on laptop and restaurant datasets demonstrate that our model outperforms state-of-the-art methods.
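The attention-based interaction between a target phrase and its context reduces, in its simplest form, to single-query dot-product attention: score each context word against a target vector, softmax, and take the weighted sum. A self-contained sketch of that primitive:

```python
import math

def attend(query, keys, values):
    """Single-query dot-product attention: softmax over query-key scores,
    then a weighted sum of value vectors. `query` could be a target-phrase
    representation and keys/values the context word representations."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

In LT-T-TR this runs in both directions (context attending to target and target attending to context), which is what "interactive" refers to.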


Subjects
Data Mining/methods; Emotions; Neural Networks, Computer; Psycholinguistics; Computers; Consumer Behavior; Humans; Machine Learning; Restaurants; Semantics
18.
Sensors (Basel) ; 18(7)2018 Jul 23.
Article in English | MEDLINE | ID: mdl-30041441

ABSTRACT

In this paper, a novel imperceptible, fragile, and blind watermark scheme is proposed for speech tampering detection and self-recovery. The embedded watermark data for content recovery are calculated from the discrete cosine transform (DCT) coefficients of the original host speech. The watermark information is shared across a group of frames instead of being stored in a single frame, trading off between the data-waste problem and the tampering-coincidence problem. When part of a watermarked speech signal is tampered with, the tampered area can be accurately localized, while the watermark data in the unmodified area can still be extracted. A compressive sensing technique is then employed to retrieve the coefficients by exploiting sparseness in the DCT domain. The smaller the tampered area, the better the quality of the recovered signal. Experimental results show that the watermarked signal is imperceptible and the recovered signal is intelligible for tampering rates of up to 47.6%. A deep learning-based enhancement method is also proposed and implemented to increase the SNR of the recovered speech signal.
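Deriving watermark data from DCT coefficients starts with a per-frame DCT-II. A minimal orthonormal implementation (the embedding, frame-group sharing, and compressive-sensing recovery are all omitted):

```python
import math

def dct2(frame):
    """Orthonormal DCT-II of one speech frame; the watermark bits in a
    scheme like the one above are derived from these coefficients."""
    N = len(frame)
    out = []
    for k in range(N):
        s = sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n, x in enumerate(frame))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out
```

Speech frames are sparse in this domain (a few large coefficients carry most of the energy), which is exactly the property the compressive-sensing recovery step exploits.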

19.
Neuroscience ; 359: 183-195, 2017 09 17.
Article in English | MEDLINE | ID: mdl-28729063

ABSTRACT

One of the long-standing issues in neurolinguistic research concerns the neural basis of word representation: whether grammatical classification or semantic difference causes the dissociation of brain activity patterns when processing different word categories, especially nouns and verbs. To disentangle this puzzle, four orthogonalized word categories in Chinese were adopted in an auditory task: unambiguous nouns (UN), unambiguous verbs (UV), ambiguous words with noun-biased semantics (AN), and ambiguous words with verb-biased semantics (AV). Electroencephalographic (EEG) signals were recorded from 128 electrodes on the scalps of twenty-two subjects. Using an advanced current density reconstruction (CDR) algorithm with the constraint of standardized low-resolution electromagnetic tomography, the spatiotemporal brain dynamics of word processing were explored. In multiple time periods, including P1 (60-90 ms), N1 (100-140 ms), P200 (150-250 ms), and N400 (350-450 ms), noun-verb dissociation over the parietal-occipital and frontal-central cortices appeared not only between the UN-UV grammatical classes but also between the grammatically identical but semantically different AN-AV pairs. The apparent semantic dissociation within one grammatical class strongly suggests that semantic difference rather than grammatical classification is the origin of the noun-verb neural dissociation. Our results also revealed that semantic dissociation occurs from an early stage and repeats in multiple phases, supporting a functionally hierarchical word processing mechanism.


Subjects
Cerebral Cortex/physiology; Semantics; Speech Perception/physiology; Acoustic Stimulation; Adult; Electroencephalography; Evoked Potentials, Auditory; Female; Humans; Male; Signal Processing, Computer-Assisted; Young Adult
20.
J Craniomaxillofac Surg ; 44(11): 1800-1805, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27713053

ABSTRACT

PURPOSE: Endoscope-assisted surgery has been widely adopted as a basic surgical procedure, and various training systems using virtual reality have been developed for it. In the present study, a basic virtual reality training system for the removal of submandibular glands under endoscope assistance was developed, and its efficacy was verified in novice oral surgeons. MATERIAL AND METHODS: A virtual reality training system was developed using existing haptic devices. Virtual reality models were constructed from computed tomography data to ensure anatomical accuracy. Novice oral surgeons were trained using the developed virtual reality training system. RESULTS: The developed virtual reality training system included models of the submandibular gland and the surrounding connective tissues and blood vessels entering the submandibular gland. Cutting or abrasion of the connective tissue and manipulations such as elevation of blood vessels were reproduced by the virtual reality system. A training program using the developed system was devised, and novice oral surgeons were trained in accordance with it. CONCLUSIONS: Our virtual reality training system for endoscope-assisted removal of the submandibular gland is effective in training novice oral surgeons in endoscope-assisted surgery.


Subjects
Computer-Assisted Instruction; Natural Orifice Endoscopic Surgery/education; Submandibular Gland/surgery; Computer Simulation; Computer-Assisted Instruction/instrumentation; Humans; Natural Orifice Endoscopic Surgery/instrumentation; User-Computer Interface