Results 1 - 5 of 5
1.
Bioengineering (Basel) ; 11(3)2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38534567

ABSTRACT

The 12-lead electrocardiogram (ECG) is crucial for clinical assessment and decision-making. However, portable ECG devices capable of acquiring a complete 12-lead ECG are scarce. For the first time, a deep learning-based method is proposed to reconstruct the 12-lead ECG from Frank leads (VX, VY, and VZ) or EASI leads (VES, VAS, and VAI). The proposed ECG reconstruction network, M2Eformer, is composed of a 2D-ECGblock and a ProbDecoder module. The 2D-ECGblock adaptively segments the EASI leads into multiple periods based on frequency-domain energy, transforming the 1D time series into a 2D tensor that represents within-cycle and between-cycle variations. The ProbDecoder applies ProbSparse self-attention and produces the target leads in a single step. Experimental comparisons between recorded and reconstructed 12-lead ECGs based on Frank leads indicate that M2Eformer outperforms traditional ECG reconstruction methods on a public database. In this study, a self-constructed database (10 healthy individuals + 15 patients) was used for clinical diagnostic validation of the ECG reconstructed from EASI leads. Both the EASI-reconstructed and the recorded 12-lead ECGs were then evaluated in a double-blind diagnostic experiment by three cardiologists. The overall diagnostic consensus among the three cardiologists reached 96%, indicating the significant utility of the EASI-reconstructed 12-lead ECG in facilitating the diagnosis of cardiac conditions.
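
The period-folding step of the 2D-ECGblock lends itself to a small illustration. The NumPy sketch below (not the authors' code) estimates the dominant period of a single lead from its frequency spectrum and folds the 1-D series into a 2-D array whose rows and columns correspond to between-cycle and within-cycle variation; the function name and the synthetic test signal are assumptions for illustration.

```python
import numpy as np

def fold_by_dominant_period(x: np.ndarray) -> np.ndarray:
    """Fold a 1-D signal into a (cycles, samples-per-cycle) 2-D array."""
    spectrum = np.abs(np.fft.rfft(x))
    spectrum[0] = 0.0                          # ignore the DC component
    k = int(np.argmax(spectrum))               # dominant frequency bin
    period = max(1, len(x) // max(k, 1))       # approximate samples per cycle
    cycles = len(x) // period
    return x[: cycles * period].reshape(cycles, period)

# Synthetic stand-in for one lead: a 1.2 Hz pseudo-periodic signal sampled at 500 Hz
fs, seconds = 500, 10
t = np.arange(fs * seconds) / fs
lead = np.sin(2 * np.pi * 1.2 * t) + 0.05 * np.random.randn(t.size)
print(fold_by_dominant_period(lead).shape)     # (12, 416): between-cycle x within-cycle
```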

2.
Brain Sci ; 13(3)2023 Mar 11.
Article in English | MEDLINE | ID: mdl-36979287

ABSTRACT

Clinical studies have shown that speech pauses can reflect differences in cognitive function between Alzheimer's disease (AD) and non-AD patients, yet the value of pause information for AD detection has not been fully explored. Herein, we propose a speech pause feature extraction and encoding strategy for AD detection based solely on the acoustic signal. First, a voice activity detection (VAD) method was constructed to detect pause/non-pause segments and encode them into binary pause sequences that are easy to compute. Then, an ensemble machine-learning approach was proposed to classify AD from participants' spontaneous speech, based on the VAD Pause feature sequence and common acoustic feature sets (ComParE and eGeMAPS). The proposed pause feature sequence was verified with five machine-learning models. The validation data included two public challenge datasets (ADReSS and ADReSSo, English speech) and a local dataset (10 audio recordings from five patients and five controls, Chinese speech). Results showed that the VAD Pause feature was more effective than the common feature sets (ComParE: 6373 features; eGeMAPS: 88 features) for AD classification, and that the ensemble method improved accuracy by more than 5% over several baseline methods (8% on the ADReSS dataset; 5.9% on the ADReSSo dataset). Moreover, the pause-sequence-based AD detection method achieved 80% accuracy on the local dataset. Our study further demonstrates the potential of pause information for speech-based AD detection and contributes a more accessible and general pause feature extraction and encoding method.
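
The pause-encoding step can be illustrated with a deliberately simple energy-based VAD. The NumPy sketch below is not the authors' VAD; the frame sizes and relative threshold are illustrative assumptions, but it shows how per-frame voice activity decisions become a binary pause/non-pause sequence.

```python
import numpy as np

def binary_pause_sequence(signal, fs, frame_ms=25, hop_ms=10, rel_threshold=0.1):
    """Return one value per frame: 1 = pause (low energy), 0 = speech."""
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    energies = np.array([np.mean(signal[i:i + frame] ** 2)
                         for i in range(0, len(signal) - frame, hop)])
    threshold = rel_threshold * energies.max()   # illustrative relative threshold
    return (energies < threshold).astype(np.int8)

# Toy input: 1 s of a 200 Hz tone (speech stand-in) followed by 1 s of silence
fs = 16000
t = np.arange(fs) / fs
signal = np.concatenate([np.sin(2 * np.pi * 200 * t), np.zeros(fs)])
pauses = binary_pause_sequence(signal, fs)
print(pauses[:5], pauses[-5:])                   # [0 0 0 0 0] ... [1 1 1 1 1]
```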

3.
Math Biosci Eng ; 19(12): 13214-13226, 2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36654043

ABSTRACT

As an advanced technique, compressed sensing has been used for rapid magnetic resonance imaging in recent years. The Two-step Iterative Shrinkage/Thresholding (TwIST) algorithm, based on the Iterative Shrinkage-Thresholding Algorithm (ISTA), is popular for fast MR image reconstruction. However, TwIST cannot dynamically adjust its shrinkage factor according to the degree of convergence, so it is difficult to balance speed and accuracy. In this paper, we propose an algorithm that dynamically adjusts the shrinkage factor to rebalance the fidelity and regularization terms during the TwIST iterative process. The adjustment is guided by the previously reconstructed results throughout the iteration cycle, which greatly accelerates convergence while ensuring convergence accuracy. Simulations on MR images of two body parts at different sampling rates show that the proposed algorithm has a faster convergence rate and better reconstruction performance. Further simulations on 60 MR images of different body parts confirm the general superiority of the proposed algorithm.


Subject(s)
Algorithms; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Phantoms, Imaging; Computer Simulation
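
To make the adaptive-shrinkage idea concrete, the sketch below shows a generic ISTA-style iteration (not the paper's TwIST variant) in which the soft-threshold is relaxed whenever the change between successive reconstructions stops shrinking quickly. The adaptation rule, its constants, and the toy sparse-recovery problem are assumptions for illustration only.

```python
import numpy as np

def soft(v, tau):
    """Soft-thresholding (shrinkage) operator."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def adaptive_ista(A, b, lam=0.1, iters=200):
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    tau = lam * step                              # initial shrinkage threshold
    prev_change = np.inf
    for _ in range(iters):
        x_new = soft(x - step * A.T @ (A @ x - b), tau)
        change = np.linalg.norm(x_new - x)
        if change > 0.9 * prev_change:            # convergence slowing down:
            tau *= 0.9                            # relax shrinkage, favor data fidelity
        prev_change, x = change, x_new
    return x

# Toy compressed-sensing problem: recover a 10-sparse vector from 80 measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 10, replace=False)] = 1.0
x_hat = adaptive_ista(A, A @ x_true)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative error
```
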
4.
Entropy (Basel) ; 25(1)2022 Dec 26.
Article in English | MEDLINE | ID: mdl-36673182

ABSTRACT

Text-to-speech (TTS) synthesizers are widely used as a vital assistive tool in various fields. Traditional sequence-to-sequence (seq2seq) TTS models such as Tacotron2 use a single soft attention mechanism for encoder-decoder alignment; their biggest shortcoming is that they generate words incorrectly or repeatedly when dealing with long sentences. They may also produce run-on sentences or misplaced breaks that ignore punctuation marks, which makes the synthesized waveform lack emotion and sound unnatural. In this paper, we propose an end-to-end neural generative TTS model based on a deep-inherited attention (DIA) mechanism with an adjustable local-sensitive factor (LSF). The inheritance mechanism allows multiple iterations of the DIA by sharing the same training parameters, which tightens the token-frame correlation and speeds up the alignment process. The LSF is adopted to enhance context connections by expanding the DIA concentration region. In addition, a multi-RNN block is used in the decoder for better acoustic feature extraction and generation, and hidden-state information from the multi-RNN layers is used for attention alignment. The collaborative work of the DIA and multi-RNN layers yields high-quality prediction of phrase breaks in the synthesized speech. We used WaveGlow as the vocoder for real-time, human-like audio synthesis. Human subjective experiments show that DIA-TTS achieved a mean opinion score (MOS) of 4.48 for naturalness. Ablation studies further confirm the superiority of the DIA mechanism in enhancing phrase breaks and attention robustness.
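
The local-sensitive idea can be illustrated with a generic locality-constrained attention weighting: scores far from the previously attended encoder position are penalized, and an adjustable factor controls how wide the attended region is. The NumPy sketch below is not the DIA mechanism itself; the Gaussian penalty and all names are assumptions for illustration.

```python
import numpy as np

def local_sensitive_weights(scores, prev_pos, lsf=5.0):
    """Soft-mask raw attention scores around the previously attended position."""
    positions = np.arange(scores.shape[0])
    penalty = (positions - prev_pos) ** 2 / (2.0 * lsf ** 2)  # larger lsf -> wider focus
    masked = scores - penalty
    e = np.exp(masked - masked.max())                         # numerically stable softmax
    return e / e.sum()

scores = np.random.randn(50)                  # raw scores over 50 encoder tokens
narrow = local_sensitive_weights(scores, prev_pos=20, lsf=2.0)
wide = local_sensitive_weights(scores, prev_pos=20, lsf=10.0)
print(int(np.argmax(narrow)), int(np.argmax(wide)))  # narrow focus stays near position 20
```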

5.
Front Neurosci ; 14: 615435, 2020.
Article in English | MEDLINE | ID: mdl-33519365

ABSTRACT

Medical image fusion, which aims to derive complementary information from multi-modality medical images, plays an important role in many clinical applications, such as medical diagnosis and treatment. We propose LatLRR-FCNs, a hybrid medical image fusion framework consisting of latent low-rank representation (LatLRR) and fully convolutional networks (FCNs). Specifically, the LatLRR module decomposes the multi-modality medical images into low-rank and saliency components, which provide fine-grained details and preserve energy, respectively. The FCN module aims to preserve both global and local information by generating a weighting map for each modality image. The final weighting map is obtained using the weighted local energy and the weighted sum of the eight-neighborhood-based modified Laplacian. The fused low-rank component is generated by combining the low-rank components of each modality image according to the guidance provided by the final weighting map within a pyramid-based fusion scheme. A simple sum strategy is used for the saliency components. The usefulness and efficiency of the proposed framework are thoroughly evaluated on four medical image fusion tasks, including computed tomography (CT) and magnetic resonance (MR), T1- and T2-weighted MR, positron emission tomography and MR, and single-photon emission CT and MR. The results demonstrate that by leveraging LatLRR for image detail extraction and the FCNs for global and local information description, the framework achieves performance superior to state-of-the-art methods in terms of both objective assessment and visual quality in some cases. Furthermore, our method has competitive computational cost compared to other baselines.
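
One ingredient of the weighting-map construction, the eight-neighborhood modified Laplacian, can be sketched in a few lines of NumPy. The example below computes that activity measure for two registered images and uses a simple choose-max rule to build a binary weighting map; this is a generic illustration, not the full LatLRR-FCN pipeline, and the choose-max fusion rule is an assumption.

```python
import numpy as np

def modified_laplacian(img):
    """Eight-neighborhood modified-Laplacian activity measure per pixel."""
    p = np.pad(img.astype(float), 1, mode="edge")
    c = p[1:-1, 1:-1]
    return (np.abs(2 * c - p[1:-1, :-2] - p[1:-1, 2:])    # horizontal neighbors
            + np.abs(2 * c - p[:-2, 1:-1] - p[2:, 1:-1])  # vertical neighbors
            + np.abs(2 * c - p[:-2, :-2] - p[2:, 2:])     # one diagonal pair
            + np.abs(2 * c - p[:-2, 2:] - p[2:, :-2]))    # the other diagonal pair

def fuse_choose_max(img_a, img_b):
    """Pick, per pixel, the modality with the stronger local activity."""
    w = (modified_laplacian(img_a) >= modified_laplacian(img_b)).astype(float)
    return w * img_a + (1.0 - w) * img_b

a = np.random.rand(64, 64)   # stand-ins for two registered modality images
b = np.random.rand(64, 64)
print(fuse_choose_max(a, b).shape)   # (64, 64)
```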
