Results 1 - 13 of 13
1.
Cell Rep Methods ; 4(6): 100797, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38889685

ABSTRACT

Cancer of unknown primary (CUP) is metastatic cancer whose primary site remains unidentified despite standard diagnostic procedures. To determine the tumor origin in such cases, we developed BPformer, a deep learning method that integrates the transformer model with prior knowledge of biological pathways. Trained on transcriptomes from 10,410 primary tumors across 32 cancer types, BPformer achieved accuracies of 94%, 92%, and 89% in primary tumors and in the primary and metastatic sites of metastatic tumors, respectively, surpassing existing methods. Additionally, BPformer was validated in a retrospective study, demonstrating consistency with tumor sites diagnosed through immunohistochemistry and histopathology. Furthermore, BPformer was able to rank pathways by their contribution to tumor origin identification, which helped to classify oncogenic signaling pathways into those that are highly conserved among different cancers and those that vary strongly with tumor origin.


Subject(s)
Neoplasms, Unknown Primary , Humans , Neoplasms, Unknown Primary/genetics , Neoplasms, Unknown Primary/pathology , Neoplasms, Unknown Primary/metabolism , Neoplasms, Unknown Primary/diagnosis , Signal Transduction/genetics , Transcriptome , Deep Learning , Retrospective Studies
2.
Comput Struct Biotechnol J ; 23: 1469-1476, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38623560

ABSTRACT

RNA plays an extensive role in a multi-dimensional regulatory system, and its biomedical relationships are scattered across numerous biological studies. However, text-mining work dedicated to the extraction of RNA biomedical relations remains limited. In this study, we established a comprehensive and reliable corpus of RNA biomedical relations, comprising over 30,000 sentences manually curated from more than 15,000 biomedical articles. We also present RIscoper 2.0, an updated BERT-based deep learning tool for extracting RNA biomedical relation sentences from the literature. Benefiting from approximately 100,000 annotated named entities, we integrated the text classification and named entity recognition tasks in this tool. RIscoper 2.0 outperformed the original tool in both tasks and can discover new RNA biomedical relations. We also provide a user-friendly online search tool that enables rapid scanning of RNA biomedical relationships using local and online resources. Both the online tools and data resources of RIscoper 2.0 are available at http://www.rnainter.org/riscoper.

3.
Front Health Serv ; 3: 1168429, 2023.
Article in English | MEDLINE | ID: mdl-37621376

ABSTRACT

Background: Medical training through specialization and even subspecialization has contributed significantly to clinical excellence in treating single acute conditions. However, the needs of complex patients go beyond single diseases, and there is a need to identify a group of generalists who are able to deliver cost-effective, holistic care to patients with multiple comorbidities and multi-faceted needs. Community hospitals (CHs) are a critical part of Singapore's shift toward a community-centric care model as the population ages. Community Hospitals of the Future ("CHoF") represent a series of emerging conversations around approaches to reimagine and redesign care delivery in a CH setting in response to changing care needs. Methods: An environmental scan of the CH landscape was conducted using semi-structured interviews with 26 senior management, management, and working-level staff from seven community hospitals in Singapore. This environmental scan aims to understand the current barriers and future opportunities for CHs; to guide how CHs would have to shift in terms of (i) care delivery and resourcing, (ii) information flow, and (iii) financing; and to conceptualize CHoF to meet the changing care needs in Singapore. Findings: The analysis of all transcripts revealed four broad sections of themes: (i) current care delivery in CHs, (ii) current challenges of CHs, (iii) future opportunities, and (iv) challenges in reimagining CHs. An emerging theme regarding the current key performance indicators used also surfaced. Resource limitations and the financing structure of CHs surfaced as barriers to expanding their capabilities. However, room to expand CH roles by tapping on current expertise was acknowledged and shared. Conclusion: With the current issues of (i) a rapidly aging population, (ii) a specialist-centric healthcare system, and (iii) fragmentation of the care ecosystem, there is a need to further understand how CHoF can be modeled to better tackle them. Several important questions were therefore devised to give a detailed view of how to develop CHoF within the right constructs. Demographic changes, patient segmentation, service and regulatory parameters, the patient's perspective, care delivery, and financial levers (or the lack thereof) are some of the categories that the interview questions looked into. The data gathered will be used to guide and refine the concept of CHoF.

4.
Brief Bioinform ; 24(3), 2023 05 19.
Article in English | MEDLINE | ID: mdl-36946415

ABSTRACT

Colorectal cancer (CRC) is one of the most common gastrointestinal malignancies. There are few recurrence risk signatures for CRC patients. Single-cell RNA-sequencing (scRNA-seq) provides a high-resolution platform for prognostic signature detection. However, scRNA-seq is not practical in large cohorts because of its high cost, and most single-cell experiments lack clinical phenotype information. Few studies have used external bulk transcriptomes with survival time to guide the detection of key cell subtypes in scRNA-seq data. We proposed scRankXMBD, a computational framework to prioritize prognosis-associated cell subpopulations based on within-cell relative expression orderings of gene pairs from single-cell transcriptomes. scRankXMBD achieves higher precision and concordance than five existing methods. Moreover, we developed single-cell gene pair signatures to predict recurrence risk for individual patients. Our work facilitates the application of the rank-based method to scRNA-seq data for prognostic biomarker discovery and precision oncology. scRankXMBD is available at https://github.com/xmuyulab/scRank-XMBD. (XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China.)


Subject(s)
Colorectal Neoplasms , Transcriptome , Humans , Gene Expression Profiling/methods , Prognosis , Precision Medicine , Software , Colorectal Neoplasms/genetics , Sequence Analysis, RNA
5.
Stem Cell Res Ther ; 13(1): 115, 2022 03 21.
Article in English | MEDLINE | ID: mdl-35313979

ABSTRACT

BACKGROUND: Stemness is defined as the potential of cells for self-renewal and differentiation. Many transcriptome-based methods for stemness evaluation have been proposed. However, all these methods showed low negative correlations with differentiation time and cannot leverage existing experimentally validated stem cells to recognize stem-like cells. METHODS: Here, we constructed a stemness index for single-cell samples (StemSC) based on relative expression orderings (REOs) of gene pairs. First, we identified the stemness-related genes by selecting the genes significantly related to differentiation time. Then, we used 13 RNA-seq datasets from both bulk and single-cell embryonic stem cell (ESC) samples to construct the reference REOs. Finally, the StemSC value of a given sample was calculated as the percentage of gene pairs with the same REOs as the ESC samples. RESULTS: We validated StemSC by its higher negative correlations with differentiation time in eight normal datasets and its higher positive correlations with tumor dedifferentiation in three colorectal cancer datasets and four glioma datasets. In addition, the robustness of StemSC to batch effects enabled us to leverage existing experimentally validated cancer stem cells to recognize stem-like cells in other independent tumor datasets. The recognized stem-like tumor cells had fewer interactions with anti-tumor immune cells. Further survival analysis showed that immunotherapy-treated patients with high stemness had worse survival than those with low stemness. CONCLUSIONS: StemSC is a better index for calculating stemness across datasets, which can help researchers explore the effect of stemness on other biological processes.
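
The final step described in the METHODS above (the score as the percentage of gene pairs whose ordering matches the reference) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `reo_score`, the gene labels, and the expression values are all hypothetical, and the sketch assumes the reference REOs are already available as a list of (higher gene, lower gene) pairs.

```python
def reo_score(sample, reference_pairs):
    """Return the fraction of reference gene pairs whose ordering is preserved.

    sample: dict mapping gene name -> expression value in one sample.
    reference_pairs: list of (gene_a, gene_b) tuples, each meaning that
        gene_a is expressed higher than gene_b in the reference samples.
    """
    # Only pairs with both genes measured in the sample can be evaluated.
    usable = [(a, b) for a, b in reference_pairs if a in sample and b in sample]
    if not usable:
        raise ValueError("no reference gene pair is measured in this sample")
    # Count the pairs whose within-sample ordering agrees with the reference.
    concordant = sum(1 for a, b in usable if sample[a] > sample[b])
    return concordant / len(usable)


# Hypothetical toy data: four reference pairs and one cell.
reference = [("G1", "G2"), ("G3", "G4"), ("G1", "G4"), ("G5", "G2")]
cell = {"G1": 9.0, "G2": 2.5, "G3": 1.0, "G4": 4.0, "G5": 3.0}
score = reo_score(cell, reference)
print(score)  # 3 of the 4 pairs keep the reference ordering -> 0.75
```

Because the score depends only on within-sample orderings, any monotone rescaling of a sample's expression values leaves it unchanged, which is consistent with the robustness to batch effects claimed in the abstract.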


Subject(s)
Glioma , Neoplastic Stem Cells , Cell Differentiation/genetics , Glioma/metabolism , Humans , Neoplastic Stem Cells/metabolism , Transcriptome
6.
Epigenetics ; 16(8): 908-916, 2021 08.
Article in English | MEDLINE | ID: mdl-32965167

ABSTRACT

Accurate diagnosis of the origin of brain metastases (BMs) is crucial for tailoring an effective therapy to improve patients' prognosis. BMs of unknown origin account for approximately 2-14% of patients with BMs. Hence, the aim of this study was to identify the original cancer type of BMs based on their DNA methylation profiles. The DNA methylation profiles of glioma (GM), BM, and seven other types of primary cancers were collected. In comparison with GM, the reversal CpG site pairs were identified for each of the seven other types of primary cancers based on the within-sample relative methylation orderings (RMOs) of the CpG sites. Then, using the reversal CpG site pairs, GMs were distinguished from BMs and the seven other types of primary cancers. All 61 of the GM samples were correctly identified as GM. The cancer type was also identified for the non-GM samples. For the seven other types of primary cancers, greater than 93% of samples of each cancer type were correctly identified as their corresponding cancer type, except for breast cancer, which had an 88% accuracy. For 133 BM samples, 132 BM samples were identified as non-GM, and 95% of the 133 BM samples were correctly classified into their corresponding original cancer types. The RMO-based method can accurately identify the origin of BMs, which is important for precision treatment.


Subject(s)
Brain Neoplasms , Breast Neoplasms , Brain Neoplasms/genetics , CpG Islands , DNA Methylation , Female , Humans , Prognosis
7.
J Gastroenterol Hepatol ; 36(6): 1714-1720, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33150986

ABSTRACT

BACKGROUND: Pancreatic ductal adenocarcinoma (PDAC) accounts for about 90% of pancreatic cancers; pancreatic cancer is one of the most aggressive malignant neoplasms, with a five-year survival rate of 9.3%. Pathological biopsy is the current gold standard for confirming suspicious lesions of PDAC, but it is not entirely reliable because of insufficient sampling amounts and inaccurate sampling locations. Therefore, developing a robust signature to aid the accurate diagnosis of PDAC is critical. METHODS: Based on the within-sample relative expression orderings of gene pairs, we identified a qualitative signature to discriminate both PDAC and adjacent samples from both chronic pancreatitis and normal samples in the training datasets and validated it in other independent datasets produced by different laboratories with different measuring platforms. RESULTS: A six-gene-pair signature was identified in the training data and validated in eight independent datasets. For surgical samples, 96.63% of 356 PDAC tissues, 100% of 11 pancreatitis tissues of non-cancer patients, and 23 of 24 normal pancreatic tissues were correctly classified. Notably, 59 of 60 cancer-adjacent normal tissues of PDAC patients were correctly identified as PDAC. For biopsy samples, all 11 PDAC biopsy tissues were correctly classified as PDAC. CONCLUSION: The signature can distinguish both PDAC and PDAC-adjacent normal tissues from both chronic pancreatitis and normal tissues of non-cancer patients even when the sampling locations are inaccurate, which can aid the diagnosis of PDAC.
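
A qualitative gene-pair signature of this kind is typically applied by checking, within a single sample, how many pairs show the disease-associated ordering. The sketch below illustrates that idea under explicit assumptions: the abstract does not state the decision rule, so the vote threshold `min_votes`, the function name, the class labels, and the gene labels are all hypothetical.

```python
def classify_by_gene_pairs(sample, signature, min_votes):
    """Classify one sample from the within-sample orderings of gene pairs.

    sample: dict mapping gene name -> expression value.
    signature: list of (gene_a, gene_b) tuples; gene_a > gene_b counts as
        one vote for the disease class (assumed rule, not from the source).
    min_votes: number of votes needed to call the sample "disease".
    """
    votes = sum(1 for a, b in signature if sample[a] > sample[b])
    label = "disease" if votes >= min_votes else "control"
    return label, votes


# Hypothetical three-pair signature and two toy samples.
signature = [("A", "B"), ("C", "D"), ("E", "F")]
tumor_like = {"A": 8, "B": 3, "C": 7, "D": 2, "E": 1, "F": 5}
normal_like = {"A": 2, "B": 6, "C": 1, "D": 4, "E": 3, "F": 9}

print(classify_by_gene_pairs(tumor_like, signature, min_votes=2))
print(classify_by_gene_pairs(normal_like, signature, min_votes=2))
```

Since each decision uses only comparisons inside one sample, no cross-sample normalization is needed, which is why such signatures can be applied to individual samples from different measurement platforms.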


Subject(s)
Biopsy/methods , Carcinoma, Pancreatic Ductal/diagnosis , Carcinoma, Pancreatic Ductal/genetics , Diagnostic Techniques, Digestive System , Gene Expression Profiling/methods , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/genetics , Specimen Handling/methods , Transcriptome , Carcinoma, Pancreatic Ductal/pathology , Datasets as Topic , Diagnosis, Differential , Humans , Pancreatic Neoplasms/pathology
8.
Radiother Oncol ; 155: 65-72, 2021 02.
Article in English | MEDLINE | ID: mdl-33065189

ABSTRACT

BACKGROUND AND PURPOSE: Currently, 5-fluorouracil (5-FU)-based adjuvant chemoradiotherapy (ACRT) is a preferred regimen for post-surgery gastric cancer (GC). However, the survival outcome of 5-FU-based ACRT varies greatly among GC patients. Thus, it is necessary to identify which patients may benefit from 5-FU-based ACRT. MATERIALS AND METHODS: We collected 577 GC and 84 adjacent normal samples for training and 675 GC samples for validation. Based on the within-sample relative expression orderings (REOs) of gene expression levels, reversal gene pairs were selected, and the pairs correlating with overall survival (OS) of GC patients receiving 5-FU-based ACRT were identified as candidates. Finally, an optimized set of candidate gene pairs was selected as a classification signature in the training data and validated in the validation data. RESULTS: A signature consisting of 34 gene pairs was identified in the training data and validated in three independent datasets. The classified low-risk group had better OS than the classified high-risk group. We also analyzed the recurrence-free survival or disease-free survival (RFS/DFS) of the validation datasets and observed similar results. Furthermore, although the signature was identified based on the OS of GC patients receiving ACRT, it was not a prognostic signature for patients treated with surgery alone, but may be a potential signature for 5-FU-based chemotherapy alone. CONCLUSIONS: The signature can accurately classify GC patients who may benefit from 5-FU-based ACRT, which could aid clinicians in tailoring more effective GC treatments.


Subject(s)
Stomach Neoplasms , Chemoradiotherapy, Adjuvant , Chemotherapy, Adjuvant , Disease-Free Survival , Fluorouracil/therapeutic use , Humans , Prognosis , Stomach Neoplasms/drug therapy
9.
Biomed Res Int ; 2020: 6418460, 2020.
Article in English | MEDLINE | ID: mdl-32802863

ABSTRACT

The within-sample relative expression orderings (REOs) of genes, which are stable qualitative transcriptional characteristics, can provide abundant information about a disease. Methods based on REO comparisons have been proposed for identifying differentially expressed genes (DEGs) at the individual level and for detecting disease-associated genes based on one-phenotype disease data by reusing data of normal samples from other sources. Here, we evaluated the effects of common potential confounding factors, including age, cigarette smoking, sex, and race, on the REOs of gene pairs within the normal lung tissue transcriptome. Our results showed that age has little effect on REOs within lung tissues. We found that about 0.23% of the significantly stable REOs of gene pairs in nonsmokers' lung tissues are reversed in smokers' lung tissues, introduced by 344 DEGs between the two groups of samples (RankCompV2, FDR <0.05), which are enriched in metabolism of xenobiotics by cytochrome P450, glutathione metabolism, and other pathways (hypergeometric test, FDR <0.05). Comparison between the normal lung tissue samples of males and females revealed fewer reversal REOs, introduced by 24 DEGs between the sex groups, among which 19 DEGs are located on sex chromosomes and 5 DEGs involved in spermatogenesis and regulation of oocyte are located on autosomes. Between the normal lung tissue samples of white and black people, we identified 22 DEGs (RankCompV2, FDR <0.05), which introduced a few reversal REOs between the two races. In summary, REO-based studies should take into account the confounding factors of cigarette smoking, sex, and race.
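
The notion of a stable REO that is reversed between two groups can be sketched as below. This is an illustration only: the abstract refers to "significantly stable" REOs, which implies a statistical test, whereas this sketch substitutes a simple frequency threshold (`min_fraction`). The function names, gene labels, and expression values are hypothetical.

```python
from itertools import combinations


def stable_orderings(samples, genes, min_fraction=0.95):
    """Find gene pairs with a consistent ordering across a group of samples.

    Returns a dict mapping (a, b) -> True if a > b in at least min_fraction
    of the samples, or False if a < b in at least min_fraction of them.
    Pairs without a consistent ordering are omitted.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    stable = {}
    for a, b in combinations(genes, 2):
        higher = sum(1 for s in samples if s[a] > s[b])
        lower = sum(1 for s in samples if s[a] < s[b])
        if higher / n >= min_fraction:
            stable[(a, b)] = True
        elif lower / n >= min_fraction:
            stable[(a, b)] = False
    return stable


def reversed_pairs(stable_a, stable_b):
    """List pairs that are stable in both groups but with opposite ordering."""
    return sorted(
        pair for pair, direction in stable_a.items()
        if pair in stable_b and stable_b[pair] != direction
    )


# Hypothetical toy data for two groups of samples.
genes = ["G1", "G2", "G3"]
group_a = [{"G1": 5, "G2": 3, "G3": 1}, {"G1": 6, "G2": 2, "G3": 1}]
group_b = [{"G1": 5, "G2": 7, "G3": 1}, {"G1": 4, "G2": 8, "G3": 2}]

stable_a = stable_orderings(group_a, genes, min_fraction=1.0)
stable_b = stable_orderings(group_b, genes, min_fraction=1.0)
flipped = reversed_pairs(stable_a, stable_b)
print(flipped, len(flipped) / len(stable_a))
```

In this toy example only the G1/G2 ordering flips between the groups, so one of the three stable pairs is reversed.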


Subject(s)
Cigarette Smoking/genetics , Computational Biology/methods , Lung/physiology , Adult , Age Factors , Aged , Aged, 80 and over , Algorithms , Cigarette Smoking/adverse effects , Databases, Genetic , Female , Gene Expression Profiling/methods , Humans , Lung/drug effects , Lung/metabolism , Male , Middle Aged , Race Factors , Sex Factors , Transcriptome , Young Adult
10.
Bioinformatics ; 36(15): 4283-4290, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32428201

ABSTRACT

MOTIVATION: For some specific tissues, such as the heart and brain, normal controls are difficult to obtain. Thus, studies with only a particular type of disease samples (one phenotype) cannot be analyzed using common methods, such as significance analysis of microarrays, edgeR and limma. The RankComp algorithm, which was mainly developed to identify individual-level differentially expressed genes (DEGs), can be applied to identify population-level DEGs for the one-phenotype data but cannot identify the dysregulation directions of DEGs. RESULTS: Here, we optimized the RankComp algorithm, termed PhenoComp. Compared with RankComp, PhenoComp provided the dysregulation directions of DEGs and had more robust detection power in both simulated and real one-phenotype data. Moreover, using the DEGs detected by common methods as the 'gold standard', the results showed that the DEGs detected by PhenoComp using only one-phenotype data were comparable to those identified by common methods using case-control samples, independent of the measurement platform. PhenoComp also exhibited good performance for weakly differential expression signal data. AVAILABILITY AND IMPLEMENTATION: The PhenoComp algorithm is available on the web at https://github.com/XJJ-student/PhenoComp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Phenotype
11.
Front Genet ; 11: 573787, 2020.
Article in English | MEDLINE | ID: mdl-33519891

ABSTRACT

It is meaningful to assess the risk of cancer incidence among patients with precancerous colorectal lesions. By comparing the within-sample relative expression orderings (REOs) of colorectal cancer patients measured by multiple platforms with those of normal colorectal tissues, we identified a qualitative transcriptional signature consisting of 1,840 gene pairs in the training data. Within an evaluation dataset of 16 active and 18 inactive (remissive) ulcerative colitis subjects, the median incidence risk score of colorectal carcinoma was 0.6402 in active ulcerative colitis subjects, significantly higher than that in remissive subjects (0.3114). Evaluation of two other independent datasets yielded similar results. Moreover, we found that the score was significantly positively correlated with the degree of dysplasia in colorectal adenomas. In the merged dataset, the median incidence risk score was 0.9027 among high-grade adenoma samples, significantly higher than that among low-grade adenomas (0.8565). In summary, the developed incidence risk score could well predict the incidence risk of precancerous colorectal lesions and has value in clinical application.

12.
Front Genet ; 10: 1228, 2019.
Article in English | MEDLINE | ID: mdl-31850075

ABSTRACT

The heterogeneity of cancer is a major obstacle to cancer diagnosis and treatment. Prioritizing combinations of driver genes that mutate in most patients of a specific cancer or a subtype of this cancer is a promising way to tackle this problem. Here, we developed an empirical algorithm, named PathMG, to identify common and subtype-specific mutated sub-pathways for a cancer. By analyzing mutation data of 408 samples (Lung-data1) for lung cancer, three sub-pathways each covering at least 90% of samples were identified as the common sub-pathways of lung cancer. These sub-pathways were enriched with mutated cancer genes and drug targets and were validated in two independent datasets (Lung-data2 and Lung-data3). In particular, applying PathMG to two major subtypes of lung cancer, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LSCC), we identified 13 subtype-specific sub-pathways with a mutation frequency difference of at least 0.25 between LUAD and LSCC samples in Lung-data1, and 12 of the 13 sub-pathways were reproducible in Lung-data2 and Lung-data3. Similar analyses were done for colorectal cancer. Together, PathMG provides a novel tool to identify potential common and subtype-specific sub-pathways for a cancer, which can provide candidates for cancer diagnoses and sub-pathway targeted treatments.
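
The two quantities mentioned in this abstract, the share of samples covered by a sub-pathway and the mutation frequency difference between subtypes, can be illustrated with a short sketch. It is not the PathMG algorithm itself: it assumes, as a simplification, that a sample counts as covered when at least one gene of the sub-pathway is mutated in it, and the function name, gene labels, and sample data are hypothetical.

```python
def coverage(sub_pathway, mutated_genes_by_sample):
    """Fraction of samples with at least one mutated gene in the sub-pathway.

    sub_pathway: iterable of gene names.
    mutated_genes_by_sample: dict mapping sample id -> set of mutated genes.
    """
    if not mutated_genes_by_sample:
        raise ValueError("at least one sample is required")
    genes = set(sub_pathway)
    covered = sum(
        1 for mutated in mutated_genes_by_sample.values() if genes & set(mutated)
    )
    return covered / len(mutated_genes_by_sample)


# Hypothetical toy mutation data for two subtypes.
subtype_1 = {
    "s1": {"GENE_A"},
    "s2": {"GENE_B", "GENE_C"},
    "s3": {"GENE_C"},
    "s4": {"GENE_A"},
}
subtype_2 = {
    "t1": {"GENE_C"},
    "t2": {"GENE_D"},
    "t3": {"GENE_C"},
    "t4": {"GENE_A"},
}
sub_pathway = ["GENE_A", "GENE_B"]

freq_1 = coverage(sub_pathway, subtype_1)
freq_2 = coverage(sub_pathway, subtype_2)
# The 0.25 cutoff is the frequency difference quoted in the abstract.
is_subtype_specific = abs(freq_1 - freq_2) >= 0.25
print(freq_1, freq_2, is_subtype_specific)
```

Under the same simplification, a sub-pathway would count as common when `coverage` over all samples of the cancer reaches the 90% level quoted in the abstract.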

13.
Cancer Sci ; 110(10): 3225-3234, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31335996

ABSTRACT

Currently, using biopsy specimens for the early diagnosis of colorectal cancer (CRC) is not entirely reliable because of insufficient sampling amounts and inaccurate sampling locations. Thus, it is necessary to develop a signature that can accurately identify patients with CRC under these clinical scenarios. Based on the relative expression orderings of genes within individual samples, we developed a qualitative transcriptional signature to discriminate CRC tissues, including CRC-adjacent normal tissues, from tissues of non-CRC individuals. The signature was validated using multiple microarray and RNA sequencing datasets from different sources. In the training data, a signature consisting of 7 gene pairs was identified. It was well validated in both biopsy and surgical resection specimens from multiple datasets measured by different platforms. For biopsy specimens, 97.6% of 42 CRC tissues and 94.5% of 163 non-CRC (normal or inflammatory bowel disease) tissues were correctly classified. For surgically resected specimens, 99.5% of 854 CRC tissues and 96.3% of 81 CRC-adjacent normal tissues were correctly identified as CRC. Notably, we additionally measured 33 CRC biopsy specimens by the Affymetrix platform and 13 CRC surgical resection specimens, with different proportions of tumor epithelial cells, ranging from 40% to 100%, by the RNA sequencing platform, and all these samples were correctly identified as CRC. The signature can be used for the early diagnosis of CRC and is also suitable for minimal biopsy specimens and inaccurately sampled specimens, and thus has potential value for clinical application.


Subject(s)
Biomarkers, Tumor/genetics , Colorectal Neoplasms/diagnosis , Early Detection of Cancer/methods , Gene Expression Profiling/methods , Biopsy , Case-Control Studies , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Gene Expression Regulation, Neoplastic , Humans , Oligonucleotide Array Sequence Analysis/methods , Sensitivity and Specificity , Sequence Analysis, RNA/methods