Results 1 - 20 of 27
1.
Comput Med Imaging Graph ; 113: 102341, 2024 04.
Article in English | MEDLINE | ID: mdl-38277769

ABSTRACT

Breast cancer continues to be a significant cause of mortality among women globally. Timely identification and precise diagnosis of breast abnormalities are critical for enhancing patient prognosis. In this study, we focus on improving the early detection and accurate diagnosis of breast abnormalities, which is crucial for improving patient outcomes and reducing the mortality rate of breast cancer. To address the limitations of traditional screening methods, a novel unsupervised feature correlation network was developed to predict maps indicating breast abnormal variations using longitudinal 2D mammograms. The proposed model utilizes the reconstruction process of current year and prior year mammograms to extract tissue from different areas and analyze the differences between them to identify abnormal variations that may indicate the presence of cancer. The model incorporates a feature correlation module, an attention suppression gate, and a breast abnormality detection module, all working together to improve prediction accuracy. The proposed model not only provides breast abnormal variation maps but also distinguishes between normal and cancer mammograms, making it more advanced compared to the state-of-the-art baseline models. The results of the study show that the proposed model outperforms the baseline models in terms of Accuracy, Sensitivity, Specificity, Dice score, and cancer detection rate.


Subject(s)
Breast Neoplasms , Mammography , Female , Humans , Mammography/methods , Breast Neoplasms/diagnostic imaging , Prognosis
2.
BMC Bioinformatics ; 25(1): 27, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38225583

ABSTRACT

BACKGROUND: The recent development of high-throughput sequencing has created a large collection of multi-omics data, which enables researchers to better investigate cancer molecular profiles and cancer taxonomy based on molecular subtypes. Integrating multi-omics data has been proven to be effective for building more precise classification models. Most current multi-omics integrative models use either an early fusion in the form of concatenation or late fusion with a separate feature extractor for each omic, which are mainly based on deep neural networks. Due to the nature of biological systems, graphs are a better structural representation of bio-medical data. Although a few graph neural network (GNN) based multi-omics integrative methods have been proposed, they suffer from three common disadvantages. First, most of them use only one type of connection, either inter-omics or intra-omic; second, they consider only one kind of GNN layer, either graph convolution network (GCN) or graph attention network (GAT); and third, most of these methods have not been tested on a more complex classification task, such as cancer molecular subtypes. RESULTS: In this study, we propose a novel end-to-end multi-omics GNN framework for accurate and robust cancer subtype classification. The proposed model utilizes multi-omics data in the form of heterogeneous multi-layer graphs, which combine both inter-omics and intra-omic connections from established biological knowledge. The proposed model incorporates learned graph features and global genome features for accurate classification. We tested the proposed model on the Cancer Genome Atlas (TCGA) Pan-cancer dataset and TCGA breast invasive carcinoma (BRCA) dataset for molecular subtype and cancer subtype classification, respectively. The proposed model shows superior performance compared to four current state-of-the-art baseline models in terms of accuracy, F1 score, precision, and recall.
The comparative analysis of GAT-based models and GCN-based models reveals that GAT-based models are preferred for smaller graphs with less information and GCN-based models are preferred for larger graphs with extra information.
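The GCN layer mentioned in the abstract above can be illustrated with the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W). The following is a minimal pure-Python sketch of that generic rule, not the authors' framework; the function name `gcn_layer` and the toy graph are assumptions made for illustration only.

```python
import math


def gcn_layer(adj, feats, weight):
    """One generic graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj    : n x n adjacency matrix (list of lists, 0/1 entries)
    feats  : n x f node-feature matrix H
    weight : f x o weight matrix W (hypothetical, normally learned)
    """
    n = len(adj)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_feat, n_out = len(feats[0]), len(weight[0])
    # Aggregate neighbour features.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n))
            for f in range(n_feat)] for i in range(n)]
    # Linear map followed by ReLU.
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_feat)))
             for o in range(n_out)] for i in range(n)]


# Toy graph: two connected nodes with one feature each.
print(gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]]))  # [[2.0], [2.0]]
```

A GAT layer differs in that the fixed normalised weights in `norm` are replaced by attention coefficients learned per edge.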


Subject(s)
High-Throughput Nucleotide Sequencing , Neoplasms , Knowledge , Learning , Neural Networks, Computer , Neoplasms/genetics
3.
ACM BCB ; 2023, 2023 Sep.
Article in English | MEDLINE | ID: mdl-39006863

ABSTRACT

In various applications, such as computer vision, medical imaging and robotics, three-dimensional (3D) image registration is a significant step. It enables the alignment of various datasets into a single coordinate system, consequently providing a consistent perspective that allows further analysis. By precisely aligning images we can compare, analyze, and combine data collected in different situations. This paper presents a novel approach for 3D or z-stack microscopy and medical image registration, utilizing a combination of conventional and deep learning techniques for feature extraction and adaptive likelihood-based methods for outlier detection. The proposed method uses the Scale-invariant Feature Transform (SIFT) and the Residual Network (ResNet50) deep neural network to extract effective features and obtain precise and exhaustive representations of image contents. The registration approach also employs the adaptive Maximum Likelihood Estimation SAmple Consensus (MLESAC) method that optimizes outlier detection and increases noise and distortion resistance to improve the efficacy of these combined extracted features. This integrated approach demonstrates robustness, flexibility, and adaptability across a variety of imaging modalities, enabling the registration of complex images with higher precision. Experimental results show that the proposed algorithm outperforms state-of-the-art image registration methods, including conventional SIFT, SIFT with Random Sample Consensus (RANSAC), and Oriented FAST and Rotated BRIEF (ORB) methods, as well as registration software packages such as bUnwrapJ and TurboReg, in terms of Mutual Information (MI), Phase Congruency-Based (PCB) metrics, and Gradient-based metrics (GBM), using 3D MRI and 3D serial sections of multiplex microscopy images.
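To make the sample-consensus idea concrete, here is a minimal sketch of plain RANSAC estimating a 2D translation from keypoint matches that contain outliers. It is not the adaptive MLESAC method of the paper: MLESAC scores each hypothesis by a likelihood rather than by the raw inlier count used here. The function name `ransac_translation`, the tolerance, and the toy matches are all assumptions for illustration.

```python
import random


def ransac_translation(matches, tol=1.0, n_iters=50, seed=0):
    """Estimate a 2D shift (dx, dy) from point matches that include outliers.

    matches : list of ((x, y), (x', y')) keypoint pairs (hypothetical data)
    Returns the refined shift and the size of the consensus set.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        # Minimal sample: a single match fully determines a translation.
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        # Consensus set: matches whose residual is within the tolerance.
        inliers = [
            (a, b) for a, b in matches
            if ((b[0] - a[0] - dx) ** 2 + (b[1] - a[1] - dy) ** 2) ** 0.5 <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), n


# Five matches shifted by (2, 3) plus two gross mismatches.
toy_matches = [
    ((0, 0), (2, 3)), ((1, 5), (3, 8)), ((4, 2), (6, 5)),
    ((7, 7), (9, 10)), ((3, 1), (5, 4)),
    ((2, 2), (40, -7)), ((5, 5), (-20, 30)),
]
print(ransac_translation(toy_matches))
```

The two mismatched pairs are excluded from the final average because they never agree with any hypothesis other than their own.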

4.
Cancers (Basel) ; 14(21), 2022 Oct 29.
Article in English | MEDLINE | ID: mdl-36358753

ABSTRACT

Breast cancer is among the most common and fatal diseases for women, and no permanent treatment has been discovered. Thus, early detection is a crucial step to control and cure breast cancer that can save the lives of millions of women. For example, in 2020, more than 65% of breast cancer patients were diagnosed in an early stage of cancer, from which all survived. Although early detection is the most effective approach for cancer treatment, breast cancer screening conducted by radiologists is very expensive and time-consuming. More importantly, conventional methods of analyzing breast cancer images suffer from high false-detection rates. Different breast cancer imaging modalities are used to extract and analyze the key features affecting the diagnosis and treatment of breast cancer. These imaging modalities can be divided into subgroups such as mammograms, ultrasound, magnetic resonance imaging, histopathological images, or any combination of them. Radiologists or pathologists analyze images produced by these methods manually, which leads to an increase in the risk of wrong decisions for cancer detection. Thus, the utilization of new automatic methods to analyze all kinds of breast screening images to assist radiologists to interpret images is required. Recently, artificial intelligence (AI) has been widely utilized to automatically improve the early detection and treatment of different types of cancer, specifically breast cancer, thereby enhancing the survival chance of patients. Advances in AI algorithms, such as deep learning, and the availability of datasets obtained from various imaging modalities have opened an opportunity to surpass the limitations of current breast cancer analysis methods. In this article, we first review breast cancer imaging modalities, and their strengths and limitations. Then, we explore and summarize the most recent studies that employed AI in breast cancer detection using various breast imaging modalities. 
In addition, we report available datasets on the breast-cancer imaging modalities which are important in developing AI-based algorithms and training deep learning models. In conclusion, this review paper tries to provide a comprehensive resource to help researchers working in breast cancer imaging analysis.

5.
Med Phys ; 49(6): 3654-3669, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35271746

ABSTRACT

PURPOSE: Automatic detection of very small and nonmass abnormalities from mammogram images has remained challenging. In clinical practice for each patient, radiologists commonly not only screen the mammogram images obtained during the examination, but also compare them with previous mammogram images to make a clinical decision. To design an artificial intelligence (AI) system to mimic radiologists for better cancer detection, in this work we proposed an end-to-end enhanced Siamese convolutional neural network to detect breast cancer using previous year and current year mammogram images. METHODS: The proposed Siamese-based network uses high-resolution mammogram images and fuses features of pairs of previous year and current year mammogram images to predict cancer probabilities. The proposed approach is developed based on the concept of one-shot learning that learns the abnormal differences between current and prior images instead of abnormal objects, and as a result can perform better with small sample size data sets. We developed two variants of the proposed network. In the first model, to fuse the features of current and previous images, we designed an enhanced distance learning network that considers not only the overall distance, but also the pixel-wise distances between the features. In the other model, we concatenated the features of current and previous images to fuse them. RESULTS: We compared the performance of the proposed models with those of some baseline models that use current images only (ResNet and VGG) and also use current and prior images (long short-term memory [LSTM] and vanilla Siamese) in terms of accuracy, sensitivity, precision, F1 score, and area under the curve (AUC). Results show that the proposed models outperform the baseline models and the proposed model with the distance learning network performs the best (accuracy: 0.92, sensitivity: 0.93, precision: 0.91, specificity: 0.91, F1: 0.92 and AUC: 0.95). 
CONCLUSIONS: Integrating prior mammogram images improves automatic cancer classification, especially for very small and nonmass abnormalities. For classification models that integrate current and prior mammogram images, using an enhanced and effective distance learning network can advance the performance of the models.
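The metrics reported above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts in the example are purely hypothetical (a balanced set of 100 cancer and 100 normal cases chosen for illustration) and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, NOT the study's data.
for name, value in classification_metrics(tp=93, fp=9, tn=91, fn=7).items():
    print(f"{name}: {value:.2f}")
```

AUC is not included because it requires the full set of predicted probabilities rather than a single thresholded confusion matrix.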


Subject(s)
Breast Neoplasms , Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Female , Humans , Machine Learning , Mammography/methods , Neural Networks, Computer
6.
Adv Exp Med Biol ; 1361: 55-74, 2022.
Article in English | MEDLINE | ID: mdl-35230683

ABSTRACT

Copy number variation (CNV), which is deletion and multiplication of segments of a genome, is an important genomic alteration that has been associated with many diseases including cancer. In cancer, CNVs are mostly somatic aberrations that occur during cancer evolution. Advances in sequencing technologies and arrival of next-generation sequencing data (whole-genome sequencing and whole-exome sequencing or targeted sequencing) have opened up an opportunity to detect CNVs with higher accuracy and resolution. Many computational methods have been developed for somatic CNV detection, which is a challenging task due to complexity of cancer sequencing data, high level of noise and biases in the sequencing process, and big data nature of sequencing data. Nevertheless, computational detection of CNV in sequencing data has resulted in the discovery of actionable cancer-specific CNVs to be used to guide cancer therapeutics, contributing to significant progress in precision oncology. In this chapter, we start by introducing CNVs. Then, we discuss the main approaches and methods developed for detecting somatic CNV for next-generation sequencing data, along with its challenges. Finally, we describe the overall workflow for CNV detection and introduce the most common publicly available software tools developed for somatic CNV detection and analysis.


Subject(s)
DNA Copy Number Variations , Neoplasms , Algorithms , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/genetics , Precision Medicine , Software
7.
BMC Bioinformatics ; 22(1): 364, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34238220

ABSTRACT

BACKGROUND: Analyzing single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the identification of cell types. With the availability of a huge amount of single cell sequencing data and discovering more and more cell types, classifying cells into known cell types has become a priority nowadays. Several methods have been introduced to classify cells utilizing gene expression data. However, incorporating biological gene interaction networks has been proved valuable in cell classification procedures. RESULTS: In this study, we propose a multimodal end-to-end deep learning model, named sigGCN, for cell classification that combines a graph convolutional network (GCN) and a neural network to exploit gene interaction networks. We used standard classification metrics to evaluate the performance of the proposed method on the within-dataset classification and the cross-dataset classification. We compared the performance of the proposed method with those of the existing cell classification tools and traditional machine learning classification methods. CONCLUSIONS: Results indicate that the proposed method outperforms other commonly used methods in terms of classification accuracy and F1 scores. This study shows that the integration of prior knowledge about gene interactions with gene expressions using GCN methodologies can extract effective features improving the performance of cell classification.


Subject(s)
Machine Learning , Neural Networks, Computer , Gene Regulatory Networks
8.
Med Image Anal ; 71: 102049, 2021 07.
Article in English | MEDLINE | ID: mdl-33901993

ABSTRACT

The relatively recent reintroduction of deep learning has been a revolutionary force in the interpretation of diagnostic imaging studies. However, the technology used to acquire those images is undergoing a revolution itself at the very same time. Digital breast tomosynthesis (DBT) is one such technology, which has transformed the field of breast imaging. DBT, a form of three-dimensional mammography, is rapidly replacing the traditional two-dimensional mammograms. These parallel developments in both the acquisition and interpretation of breast images present a unique case study in how modern AI systems can be designed to adapt to new imaging methods. They also present a unique opportunity for co-development of both technologies that can better improve the validity of results and patient outcomes. In this review, we explore the ways in which deep learning can be best integrated into breast cancer screening workflows using DBT. We first explain the principles behind DBT itself and why it has become the gold standard in breast screening. We then survey the foundations of deep learning methods in diagnostic imaging, and review the current state of research into AI-based DBT interpretation. Finally, we present some of the limitations of integrating AI into clinical practice and the opportunities these present in this burgeoning field.


Subject(s)
Breast Neoplasms , Deep Learning , Breast/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer , Female , Humans , Mammography
9.
BMC Bioinformatics ; 21(Suppl 1): 192, 2020 Dec 09.
Article in English | MEDLINE | ID: mdl-33297952

ABSTRACT

BACKGROUND: Automatic segmentation and localization of lesions in mammogram (MG) images are challenging even when employing advanced methods such as deep learning (DL). We developed a new model based on the architecture of the semantic segmentation U-Net model to precisely segment mass lesions in MG images. The proposed end-to-end convolutional neural network (CNN) based model extracts contextual information by combining low-level and high-level features. We trained the proposed model using large publicly available databases (CBIS-DDSM, BCDR-01, and INbreast) and a private database from the University of Connecticut Health Center (UCHC). RESULTS: We compared the performance of the proposed model with those of the state-of-the-art DL models including the fully convolutional network (FCN), SegNet, Dilated-Net, original U-Net, and Faster R-CNN models and the conventional region growing (RG) method. The proposed Vanilla U-Net model outperforms the Faster R-CNN model significantly in terms of the runtime and the Intersection over Union metric (IOU). Training with digitized film-based and fully digitized MG images, the proposed Vanilla U-Net model achieves a mean test accuracy of 92.6%. The proposed model achieves a mean Dice coefficient index (DI) of 0.951 and a mean IOU of 0.909 that show how close the output segments are to the corresponding lesions in the ground truth maps. Data augmentation has been very effective in our experiments, resulting in an increase in the mean DI and the mean IOU from 0.922 to 0.951 and 0.856 to 0.909, respectively. CONCLUSIONS: The proposed Vanilla U-Net based model can be used for precise segmentation of masses in MG images. This is because the segmentation process incorporates more multi-scale spatial context, and captures more local and global context to predict a precise pixel-wise segmentation map of an input full MG image.
These detected maps can help radiologists differentiate benign and malignant lesions depending on the lesion shapes. We show that using transfer learning, introducing augmentation, and modifying the architecture of the original model result in better performance in terms of the mean accuracy, the mean DI, and the mean IOU in detecting mass lesions compared to the other DL and the conventional models.
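The Dice index and IOU cited above are both overlap measures between a predicted mask and a ground-truth mask. A minimal sketch of the standard definitions on flattened binary masks follows; the function name `dice_and_iou` and the toy masks are assumptions for illustration.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and Intersection over Union for flat binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_n = sum(1 for p in pred if p)
    truth_n = sum(1 for t in truth if t)
    union = pred_n + truth_n - inter
    if union == 0:
        # Both masks are empty: treat as perfect agreement.
        return 1.0, 1.0
    dice = 2 * inter / (pred_n + truth_n)
    iou = inter / union
    return dice, iou


# Toy 6-pixel masks: 2 overlapping pixels, 3 predicted, 3 true.
dice, iou = dice_and_iou([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0])
print(round(dice, 4), iou)  # 0.6667 0.5
```

For a single pair of masks the two measures are linked by Dice = 2 * IOU / (1 + IOU), so Dice is never lower than IOU. The relation does not hold exactly for means taken over many images.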


Subject(s)
Image Processing, Computer-Assisted/methods , Mammography , Neural Networks, Computer , Automation , Databases, Factual , Humans
10.
BMC Cancer ; 20(1): 197, 2020 Mar 12.
Article in English | MEDLINE | ID: mdl-32164626

ABSTRACT

BACKGROUND: BRCA1/2 germline mutation related cancers are candidates for new immune therapeutic interventions. This study was a hypothesis-generating exploration of genomic data collected at diagnosis for 19 patients. The prominent tumor mutation burden (TMB) in hereditary breast and ovarian cancers in this cohort was not correlated with high global immune activity in their microenvironments. More information is needed about the relationship between genomic instability, phenotypes and immune microenvironments of these hereditary tumors in order to find appropriate markers of immune activity and the most effective anticancer immune strategies. METHODS: Mining and statistical analyses of the original DNA and RNA sequencing data and The Cancer Genome Atlas data were performed. To interpret the data, we have used published literature and web-available resources such as Gene Ontology, The Cancer Immunome Atlas and the Cancer Research Institute iAtlas. RESULTS: We found that BRCA1/2 germline related breast and ovarian cancers do not represent a unique phenotypic identity, but they express a range of phenotypes similar to sporadic cancers. All breast and ovarian BRCA1/2 related tumors are characterized by high homologous recombination deficiency (HRD) and low aneuploidy. Interestingly, all sporadic high grade serous ovarian cancers (HGSOC) and most of the subtypes of triple negative breast cancers (TNBC) also express a high degree of HRD. CONCLUSIONS: TMB is not associated with the magnitude of the immune response in hereditary BRCA1/2 related breast and ovarian cancers or in sporadic TNBC and sporadic HGSOC. Hereditary tumors express phenotypes as heterogeneous as sporadic tumors with various degrees of "BRCAness" and various characteristics of the immune microenvironments. The subtyping criteria developed for sporadic tumors can be applied for the classification of hereditary tumors and possibly also characterization of their immune microenvironment.
A high HRD score may be a good candidate biomarker for response to platinum, and potentially PARP-inhibition. TRIAL REGISTRATION: Phase I Study of the Oral PI3kinase Inhibitor BKM120 or BYL719 and the Oral PARP Inhibitor Olaparib in Patients With Recurrent TNBC or HGSOC (NCT01623349), first posted on June 20, 2012. The design and the outcome of the clinical trial are not in the scope of this study.


Subject(s)
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Cystadenocarcinoma, Serous/genetics , Gene Expression Profiling/methods , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Ovarian Neoplasms/genetics , Triple Negative Breast Neoplasms/genetics , Cystadenocarcinoma, Serous/pathology , Data Mining , Female , Genomic Instability , Germ-Line Mutation , Hereditary Breast and Ovarian Cancer Syndrome/pathology , Homologous Recombination , Humans , Ovarian Neoplasms/pathology , Sequence Analysis, RNA , Triple Negative Breast Neoplasms/pathology , Tumor Microenvironment , Whole Genome Sequencing
11.
Article in English | MEDLINE | ID: mdl-30222580

ABSTRACT

Copy number variation (CNV) is a type of genomic/genetic variation that plays an important role in phenotypic diversity, evolution, and disease susceptibility. Next generation sequencing (NGS) technologies have created an opportunity for more accurate detection of CNVs with higher resolution. However, efficient and precise detection of CNVs remains challenging due to high levels of noise and biases, data heterogeneity, and the "big data" nature of NGS data. Sequence coverage (readcount) data are mostly used for detecting CNVs, especially for whole exome sequencing data. Readcount data are contaminated with several types of biases and noise that hinder accurate detection of CNVs. In this work, we introduce a novel preprocessing pipeline for reducing noise and biases to improve the detection accuracy of CNVs in heterogeneous NGS data, such as cancer whole exome sequencing data. We have employed several normalization methods to reduce readcount biases that are due to GC content of reads, read alignment problems, and sample impurity. We have also developed a novel efficient and effective smoothing approach based on Taut String to reduce noise and increase CNV detection power. Using simulated and real data we showed that employing the proposed preprocessing pipeline significantly improves the accuracy of CNV detection.
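One common way to reduce GC-content bias in readcount data is median scaling: each region's count is rescaled by the ratio of the overall median to the median of regions with similar GC content. The sketch below illustrates that generic approach only; the abstract does not specify which normalization the authors used, and the function name `gc_normalize`, the bin count, and the toy values are assumptions.

```python
from collections import defaultdict
from statistics import median


def gc_normalize(read_counts, gc_fractions, n_bins=10):
    """Median-scale read counts within bins of similar GC fraction.

    read_counts  : per-region read counts
    gc_fractions : per-region GC fraction in [0, 1]
    """
    def gc_bin(gc):
        # Clamp so that gc == 1.0 falls in the last bin.
        return min(int(gc * n_bins), n_bins - 1)

    groups = defaultdict(list)
    for rc, gc in zip(read_counts, gc_fractions):
        groups[gc_bin(gc)].append(rc)
    overall = median(read_counts)
    bin_median = {b: median(v) for b, v in groups.items()}

    corrected = []
    for rc, gc in zip(read_counts, gc_fractions):
        m = bin_median[gc_bin(gc)]
        # Leave the count unchanged if the bin median is zero.
        corrected.append(rc * overall / m if m > 0 else float(rc))
    return corrected


# Toy data: high-GC regions are systematically over-covered.
print(gc_normalize([100, 100, 200, 200], [0.32, 0.35, 0.62, 0.65]))
# [150.0, 150.0, 150.0, 150.0]
```

After scaling, coverage differences that track GC content are removed, so remaining deviations are more likely to reflect copy number.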


Subject(s)
DNA Copy Number Variations/genetics , Exome Sequencing/methods , Genomics/methods , Genome, Human/genetics , Humans , Neoplasms/genetics , Signal Processing, Computer-Assisted
12.
J Cancer Res Clin Oncol ; 146(2): 503-514, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31745703

ABSTRACT

PURPOSE: Fusion genes can be therapeutically relevant if they result in constitutive activation of oncogenes or repression of tumor suppressors. However, the prevalence and role of fusion genes in female cancers remain largely unexplored. Here, we investigate the fusion gene landscape in triple-negative breast cancer (TNBC) and high-grade serous ovarian cancer (HGSOC), two subtypes of female cancers with high molecular similarity but limited treatment options at present. METHODS: RNA-seq was utilized to identify fusion genes in a cohort of 18 TNBC and HGSOC patients treated with the PI3K inhibitor buparlisib and the PARP inhibitor olaparib in a phase I clinical trial (NCT01623349). Differential gene expression analysis was performed to assess the function of fusion genes in silico. Finally, these findings were correlated with the reported clinical outcomes. RESULTS: A total of 156 fusion genes were detected, of which 44/156 (28%) occurred in more than one patient. Low recurrence across samples indicated that the majority of fusion genes were private passenger events. The long non-coding RNA MALAT1 was involved in 97/156 (62%) fusion genes, followed in prevalence by MUC16, FOXP1, WWOX and XIST. Gene expression of FOXP1 was significantly elevated in patients with vs. without FOXP1 fusion (P = 0.02). From a clinical perspective, FOXP1 fusions were associated with a favorable overall survival. CONCLUSIONS: In summary, this study provides the first characterization of fusion genes in a cohort of TNBC and HGSOC patients. An improved mechanistic understanding of fusion genes will support the future identification of innovative therapeutic approaches for these challenging diseases.


Subject(s)
Cystadenocarcinoma, Serous/genetics , Gene Fusion , Ovarian Neoplasms/genetics , Triple Negative Breast Neoplasms/genetics , Adult , Aged , Aminopyridines/administration & dosage , Aminopyridines/adverse effects , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Clinical Trials, Phase I as Topic , Cystadenocarcinoma, Serous/drug therapy , Female , Forkhead Transcription Factors/genetics , Gene Expression Profiling , Humans , Middle Aged , Morpholines/administration & dosage , Morpholines/adverse effects , Ovarian Neoplasms/drug therapy , Phthalazines/administration & dosage , Phthalazines/adverse effects , Piperazines/administration & dosage , Piperazines/adverse effects , RNA-Seq/methods , Repressor Proteins/genetics , Triple Negative Breast Neoplasms/drug therapy
13.
Mol Cancer Res ; 17(12): 2492-2507, 2019 12.
Article in English | MEDLINE | ID: mdl-31537618

ABSTRACT

The major obstacle in successfully treating triple-negative breast cancer (TNBC) is resistance to cytotoxic chemotherapy, the mainstay of treatment in this disease. Previous preclinical models of chemoresistance in TNBC have suffered from a lack of clinical relevance. Using a single high dose chemotherapy treatment, we developed a novel MDA-MB-436 cell-based model of chemoresistance characterized by a unique and complex morphologic phenotype, which consists of polyploid giant cancer cells giving rise to neuron-like mononuclear daughter cells filled with smaller but functional mitochondria and numerous lipid droplets. This resistant phenotype is associated with metabolic reprogramming with a shift to a greater dependence on fatty acids and oxidative phosphorylation. We validated both the molecular and histologic features of this model in a clinical cohort of primary chemoresistant TNBCs and identified several metabolic vulnerabilities including a dependence on PLIN4, a perilipin coating the observed lipid droplets, expressed both in the TNBC-resistant cells and clinical chemoresistant tumors treated with neoadjuvant doxorubicin-based chemotherapy. These findings thus reveal a novel mechanism of chemotherapy resistance that has therapeutic implications in the treatment of drug-resistant cancer. IMPLICATIONS: These findings underscore the importance of a novel morphologic-metabolic phenotype associated with chemotherapy resistance in TNBC, and bring to light novel therapeutic targets resulting from vulnerabilities in this phenotype, including the expression of PLIN4 essential for stabilizing lipid droplets in resistant cells.


Subject(s)
Cellular Reprogramming/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Perilipin-4/genetics , Triple Negative Breast Neoplasms/drug therapy , Antineoplastic Agents/pharmacology , Apoptosis/drug effects , Cell Line, Tumor , Cell Proliferation/drug effects , Cellular Reprogramming/genetics , Doxorubicin/pharmacology , Drug Resistance, Neoplasm/genetics , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , Lipid Droplets/drug effects , Metabolic Networks and Pathways/drug effects , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology
14.
BMC Bioinformatics ; 20(Suppl 11): 281, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167642

ABSTRACT

BACKGROUND: The limitations of traditional computer-aided detection (CAD) systems for mammography, the extreme importance of early detection of breast cancer and the high impact of the false diagnosis of patients drive researchers to investigate deep learning (DL) methods for mammograms (MGs). Recent breakthroughs in DL, in particular convolutional neural networks (CNNs), have achieved remarkable advances in the medical fields. Specifically, CNNs are used in mammography for lesion localization and detection, risk assessment, image retrieval, and classification tasks. CNNs also help radiologists provide more accurate diagnoses by delivering precise quantitative analysis of suspicious lesions. RESULTS: In this survey, we conducted a detailed review of the strengths, limitations, and performance of the most recent CNN applications in analyzing MG images. It summarizes 83 research studies that apply CNNs to various tasks in mammography. It focuses on finding the best practices used in these research studies to improve diagnostic accuracy. This survey also provides a deep insight into the architecture of CNNs used for various tasks. Furthermore, it describes the most common publicly available MG repositories and highlights their main features and strengths. CONCLUSIONS: The mammography research community can utilize this survey as a basis for their current and future studies. The given comparison among common publicly available MG repositories guides the community to select the most appropriate database for their application(s). Moreover, this survey lists the best practices that improve the performance of CNNs including the pre-processing of images and the use of multi-view images. In addition, other listed techniques like transfer learning (TL), data augmentation, batch normalization, and dropout are appealing solutions to reduce overfitting and increase the generalization of the CNN models.
Finally, this survey identifies the research challenges and directions that require further investigations by the community.


Subject(s)
Deep Learning , Mammography/methods , Neural Networks, Computer , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Databases, Factual , Female , Humans , Image Processing, Computer-Assisted , Publications , Surveys and Questionnaires
15.
BMC Bioinformatics ; 20(1): 40, 2019 Jan 18.
Article in English | MEDLINE | ID: mdl-30658573

ABSTRACT

BACKGROUND: The analysis of single-cell RNA sequencing (scRNAseq) data plays an important role in understanding the intrinsic and extrinsic cellular processes in biological and biomedical research. One significant effort in this area is the detection of differentially expressed (DE) genes. scRNAseq data, however, are highly heterogeneous and have a large number of zero counts, which introduces challenges in detecting DE genes. Addressing these challenges requires employing new approaches beyond the conventional ones, which are based on a nonzero difference in average expression. Several methods have been developed for differential gene expression analysis of scRNAseq data. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to evaluate and compare the performance of differential gene expression analysis methods for scRNAseq data. RESULTS: In this study, we conducted a comprehensive evaluation of the performance of eleven differential gene expression analysis software tools, which are designed for scRNAseq data or can be applied to them. We used simulated and real data to evaluate the accuracy and precision of detection. Using simulated data, we investigated the effect of sample size on the detection accuracy of the tools. Using real data, we examined the agreement among the tools in identifying DE genes, the run time of the tools, and the biological relevance of the detected DE genes. CONCLUSIONS: In general, agreement among the tools in calling DE genes is not high. There is a trade-off between true-positive rates and the precision of calling DE genes. Methods with higher true positive rates tend to show low precision due to their introducing false positives, whereas methods with high precision show low true positive rates due to identifying few DE genes. We observed that current methods designed for scRNAseq data do not tend to show better performance compared to methods designed for bulk RNAseq data. 
Data multimodality and an abundance of zero read counts are the main characteristics of scRNAseq data; they play important roles in the performance of differential gene expression analysis methods and need to be considered when developing new methods.


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Humans
16.
ACM BCB ; 2019: 423-428, 2019 Sep.
Article in English | MEDLINE | ID: mdl-32515750

ABSTRACT

Next-generation sequencing (NGS) technologies offer new opportunities for precise and accurate identification of genomic aberrations, including copy number variations (CNVs). For high-throughput NGS data, using depth of coverage has become a major approach to identify CNVs, especially for whole exome sequencing (WES) data. Due to the high level of noise and biases in read-count data and the complexity of WES data, existing CNV detection tools identify many false CNV segments. In addition, NGS generates a huge amount of data, which requires effective and efficient methods. In this work, we propose a novel segmentation algorithm based on the total variation approach to detect CNVs more precisely and efficiently using WES data. The proposed method also filters out outlier read-counts and identifies significant change points to reduce false positives. We used real and simulated data to evaluate the performance of the proposed method and compare its performance with those of other commonly used CNV detection methods. Using simulated and real data, we show that the proposed method outperforms the existing CNV detection methods in terms of accuracy and false discovery rate and has a faster runtime compared to the circular binary segmentation method.

17.
BMC Bioinformatics ; 19(Suppl 11): 361, 2018 Oct 22.
Article in English | MEDLINE | ID: mdl-30343665

ABSTRACT

BACKGROUND: Due to recent advances in sequencing technologies, sequence-based analysis has been widely applied to detecting copy number variations (CNVs). There are several techniques for identifying CNVs using next generation sequencing (NGS) data; however, methods employing depth of coverage or read depth (RD) have recently become a main technique to identify CNVs. The main assumption of the RD-based CNV detection methods is that the readcount value at a specific genomic location is correlated with the copy number at that location. However, noise and biases in readcount data distort the association between the readcounts and copy numbers. For more accurate CNV identification, these biases and noise need to be mitigated. In this work, to detect CNVs more precisely and efficiently, we propose a novel denoising method based on the total variation approach and the Taut String algorithm. RESULTS: To investigate the performance of the proposed denoising method, we computed sensitivities, false discovery rates and specificities of CNV detection when employing denoising, using both simulated and real data. We also compared the performance of the proposed denoising method, Taut String, with that of commonly used approaches such as moving average (MA) and discrete wavelet transforms (DWT) in terms of sensitivity of detecting true CNVs and time complexity. The results show that Taut String works better than DWT and MA and has better power to identify very narrow CNVs. The ability of Taut String denoising to preserve the breakpoints of CNV segments and narrow CNVs increases the detection accuracy of segmentation algorithms, resulting in higher sensitivities and lower false discovery rates. CONCLUSIONS: In this study, we proposed a new denoising method for sequence-based CNV detection based on a signal processing technique. Existing CNV detection algorithms identify many false CNV segments and fail to detect short CNV segments due to noise and biases. 
Employing an effective and efficient denoising method can significantly enhance the detection accuracy of the CNV segmentation algorithms. Advanced denoising methods from the signal processing field can be employed to implement such algorithms. We showed that non-linear denoising methods that consider sparsity and piecewise constant characteristics of CNV data result in better performance in CNV detection.
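
The difficulty that linear smoothing has with narrow CNVs can be illustrated with a small sketch. This is not the authors' Taut String method or their evaluation code; it only shows, on a hypothetical noise-free read-depth signal, how a centered moving average (the MA baseline named above) flattens and widens a narrow segment. The signal values, the window size, and the function name `moving_average` are assumptions chosen for illustration.

```python
def moving_average(signal, window):
    """Centered moving average; near the edges only the in-range part of the window is used."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical read-depth signal: flat baseline with a narrow 3-bin gain of height 1.0.
signal = [0.0] * 20 + [1.0] * 3 + [0.0] * 20

smoothed = moving_average(signal, window=9)

# Height of the gain after smoothing, and how many bins it now touches.
peak = max(smoothed)
affected_bins = sum(1 for value in smoothed if value > 0)

print(f"peak after smoothing: {peak:.3f}")
print(f"bins with nonzero signal: {affected_bins} (originally 3)")
```

In this toy case the 3-bin gain is reduced to a third of its height and spread over 11 bins, so both its amplitude and its breakpoints are lost. A denoiser that favors piecewise-constant output, as the abstract argues, is intended to avoid this kind of blurring.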


Subject(s)
DNA Copy Number Variations/genetics , High-Throughput Nucleotide Sequencing/methods , Algorithms , Breast Neoplasms/genetics , Computer Simulation , Female , Genomics , Humans , Signal Processing, Computer-Assisted , Time Factors , Wavelet Analysis
18.
Methods ; 145: 25-32, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29702224

ABSTRACT

Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Because scRNAseq data exhibit multimodality, large numbers of zero counts, and sparsity, they differ from traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large numbers of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method performs well overall in terms of detection precision, sensitivity, and specificity.
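
As a rough illustration of the distance underlying the nonparametric step, the sketch below computes the one-dimensional Earth Mover's Distance between the expression values of one gene in two groups of cells. It is not the SigEMD implementation, which also includes imputation, regression, and network adjustment. It relies on the fact that for two equal-size, equally weighted samples the 1D Earth Mover's Distance equals the mean absolute difference of the sorted values. The function name `emd_1d` and the expression values are hypothetical.

```python
def emd_1d(sample_a, sample_b):
    """1D Earth Mover's Distance between two equal-size, equally weighted samples.

    In one dimension the optimal transport plan matches sorted values,
    so the distance is the mean absolute difference of the order statistics.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch assumes samples of equal size")
    if not sample_a:
        raise ValueError("samples must not be empty")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(x - y) for x, y in pairs) / len(sample_a)


# Hypothetical expression values of one gene in two groups of five cells each.
group_a = [0, 0, 1, 2, 5]
group_b = [0, 3, 4, 6, 7]

distance = emd_1d(group_a, group_b)
print(f"EMD between groups: {distance:.2f}")
```

Because the distance compares whole distributions rather than means, it can separate groups whose averages are similar but whose shapes differ, which is the motivation given above for using it on multimodal data. Turning the distance into a DE call would additionally need a significance test, for example by permuting group labels.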


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software , High-Throughput Nucleotide Sequencing/methods , Sensitivity and Specificity , Statistics, Nonparametric
19.
Cancer Causes Control ; 29(3): 305-314, 2018 03.
Article in English | MEDLINE | ID: mdl-29427260

ABSTRACT

PURPOSE: The purpose of the study was to assess the feasibility of quantifying long-term trends in breast tumor DNA copy number variation (CNV) profiles. METHODS: We evaluated CNV profiles in formalin-fixed paraffin-embedded (FFPE) tumor specimens from 30 randomly selected female members of the Kaiser Permanente Northern California health plan diagnosed with breast cancer from 1950 to 2010. Assays were conducted for five cases per decade who had available tumor blocks and pathology reports. RESULTS: As compared to the tumors from the 1970s to 2000s, the older tumors dating back to the 1950s and 1960s were much more likely to (1) fail quality control, and (2) have fewer CNV events (average 23 and 31 vs. 58 to 69), fewer CNV genes (average 5.1 and 3.7k vs. 8.1 to 10.3k), shorter CNV length (average 2,440 and 3,300k vs. 5,740 to 9,280k), fewer high frequency Del genes (37 and 25% vs. 54 to 76%), and fewer high frequency high_Amp genes (20% vs. 56 to 73%). On average, assay interpretation took an extra 60 min/specimen for cases from the 1960s versus 20 min/specimen for the most recent tumors. CONCLUSIONS: Assays conducted in the mid-2010s for CNVs may be feasible for FFPE tumor specimens dating back to the 1980s, but less feasible for older specimens.


Subject(s)
Breast Neoplasms/genetics , DNA Copy Number Variations , DNA , Specimen Handling , Female , Formaldehyde , Humans , Paraffin Embedding , Time Factors , Tissue Fixation
20.
BMC Bioinformatics ; 18(1): 286, 2017 May 31.
Article in English | MEDLINE | ID: mdl-28569140

ABSTRACT

BACKGROUND: Recently, copy number variation (CNV) has gained considerable interest as a type of genomic/genetic variation that plays an important role in disease susceptibility. Advances in sequencing technology have created an opportunity for detecting CNVs more accurately. Whole exome sequencing (WES) has recently become the primary strategy for sequencing patient samples and studying their genomic aberrations. However, compared to whole genome sequencing, WES introduces more biases and noise that make CNV detection very challenging. Additionally, the complexity of tumors makes the detection of cancer-specific CNVs even more difficult. Although many CNV detection tools have been developed since the introduction of NGS data, there are few tools for somatic CNV detection for WES data in cancer. RESULTS: In this study, we evaluated the performance of the most recent and commonly used CNV detection tools for WES data in cancer to address their limitations and provide guidelines for developing new ones. We focused on the tools that have been designed for, or have the ability to detect, cancer somatic aberrations. We compared the performance of the tools in terms of sensitivity and false discovery rate (FDR) using real data and simulated data. Comparative analysis of the results of the tools showed that there is a low consensus among the tools in calling CNVs. Using real data, tools show moderate sensitivity (~50% - ~80%), fair specificity (~70% - ~94%) and poor FDRs (~27% - ~60%). Also, using simulated data we observed that increasing the coverage beyond 10× in exonic regions does not improve the detection power of the tools significantly. CONCLUSIONS: The limited performance of the current CNV detection tools for WES data in cancer indicates the need for developing more efficient and precise CNV detection methods. 
Due to the complexity of tumors and the high level of noise and biases in WES data, it is necessary to employ advanced novel segmentation, normalization and de-noising techniques that are designed specifically for cancer data. Also, the development of CNV detection methods suffers from the lack of a gold standard for performance evaluation. Finally, developing tools with user-friendly interfaces and visualization features can enhance CNV studies for a broader range of users.
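
For readers comparing the sensitivity, specificity, and FDR figures quoted above, the following sketch shows how the three metrics are conventionally derived from confusion-matrix counts. It is a minimal illustration under stated assumptions, not the evaluation pipeline used in the study. The function name `detection_metrics` and the counts are hypothetical and are not taken from the paper.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def detection_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts.

    sensitivity = TP / (TP + FN)   share of true CNVs that were called
    specificity = TN / (TN + FP)   share of non-CNV regions left uncalled
    fdr         = FP / (TP + FP)   share of calls that are false
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "fdr": _ratio(fp, tp + fp),
    }


# Hypothetical counts for one tool on one sample (not from the study).
metrics = detection_metrics(tp=60, fp=30, tn=270, fn=40)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that specificity and FDR can diverge sharply when non-CNV regions greatly outnumber true CNVs: a tool can leave most negatives uncalled and still have a large share of false calls, which is consistent with the combination of fair specificity and poor FDR reported above.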


Subject(s)
DNA Copy Number Variations , Exome , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Software , Algorithms , Female , Genome, Human , Humans , Sequence Analysis, DNA/methods