Results 1 - 20 of 61
1.
Braz. oral res. (Online) ; 37: e063, 2023. tab, graf
Article in English | LILACS-Express | LILACS, BBO | ID: biblio-1439735

ABSTRACT

This study aimed to analyze the molecular characteristics of oral epithelial dysplasia (OED), highlighting the pathways and variants of genes that are frequently mutated in oral squamous cell carcinoma (OSCC) and other cancers. Ten archival OED cases were retrieved for retrospective clinicopathological analysis and exome sequencing. Comparative genomic analysis was performed between high-grade dysplasia (HGD) and low-grade dysplasia (LGD), focusing on 57 well-known cancer genes, of which 10 were previously described as the most mutated in OSCC. HGD cases had significantly more variants; however, a mutational landscape similar to that of OSCC was observed in both groups. CASP8+FAT1/HRAS, TP53, and miscellaneous molecular signatures were also present. FAT1 was the gene most affected by pathogenic variants. Hierarchical divisive clustering separated the cases into two groups: an "HGD-like cluster" with 4 HGD and 2 LGD cases and an "LGD-like cluster" with 4 LGD cases. MLL4 pathogenic variants were found exclusively in the "LGD-like cluster". TP53 was affected in one case of HGD; however, its pathway was usually altered. We describe new insights into the genetic basis of epithelial malignant transformation by genomic analysis, highlighting those associated with FAT1 and TP53. Some LGDs presented a mutational landscape similar to that of HGD after cluster analysis. Perhaps molecular alterations have not yet been reflected in histomorphology. The relative risk of malignant transformation in this molecular subgroup should be addressed in future studies.

2.
Acta Pharmaceutica Sinica ; (12): 3130-3139, 2023.
Article in Chinese | WPRIM | ID: wpr-999062

ABSTRACT

This study analyzed the changes in gene expression profiles during the release of Panax ginseng seed dormancy and screened for differentially expressed genes, providing a basis for analyzing the mechanism of P. ginseng seed dormancy release. Comparative transcriptome analysis was conducted using RNA-Seq on P. ginseng seeds stored at different low temperatures. A total of 80.97 Gb of raw reads and 80.19 Gb of clean reads were obtained from the transcriptome. Principal component analysis and correlation analysis showed significant differences in gene expression patterns at different developmental stages. UpSet analysis showed that 46 248 unigenes were co-expressed in all four stages, and 414, 445, 400 and 389 unigenes were specifically expressed at 0, 8, 14 and 28 days, respectively. Gene Ontology functional annotation showed that the differentially expressed genes were mainly involved in the unsaturated fatty acid biosynthetic process, nuclear body and oxidoreductase activity. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the differentially expressed genes were mainly involved in peroxisome, mitogen-activated protein kinase signaling pathway-plant, plant hormone signal transduction, ribosome, biosynthesis of unsaturated fatty acids, circadian rhythm-plant and other metabolic pathways. During P. ginseng seed dormancy release, multiple biological processes, such as unsaturated fatty acid biosynthesis and plant hormone signal transduction, must be regulated in coordination, forming a complex regulatory network for dormancy release. Transcriptome analysis and differential gene screening of P. ginseng seeds at different sand storage times lay a foundation for analyzing the mechanism of P. ginseng seed dormancy release and for molecular breeding.
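
The counts of shared and stage-specific unigenes reported above come down to set operations on the lists of unigenes expressed at each time point. A minimal Python sketch, assuming made-up unigene IDs rather than the study's data (the stage labels follow the abstract; the function names are hypothetical):

```python
# Toy illustration only: unigene IDs are invented, not taken from the study.
expressed = {
    "0d": {"u1", "u2", "u3", "u9"},
    "8d": {"u1", "u2", "u4"},
    "14d": {"u1", "u2", "u5", "u9"},
    "28d": {"u1", "u2", "u6"},
}


def shared_by_all(sets_by_stage):
    """Unigenes expressed in every stage (intersection of all sets)."""
    return set.intersection(*sets_by_stage.values())


def specific_to(stage, sets_by_stage):
    """Unigenes expressed in `stage` and in no other stage."""
    others = set().union(
        *(s for name, s in sets_by_stage.items() if name != stage)
    )
    return sets_by_stage[stage] - others


core = shared_by_all(expressed)
specific = {stage: specific_to(stage, expressed) for stage in expressed}
```

Here `core` plays the role of the unigenes co-expressed in all four stages, and `specific` holds the unigenes unique to each stage; a unigene seen in two or three stages (such as `u9`) falls into neither group.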

3.
Chinese Journal of Digestion ; (12): 328-335, 2022.
Article in Chinese | WPRIM | ID: wpr-934153

ABSTRACT

Objective: To explore and construct a safe, standardized, scientific and rigorous database for digestive endoscopy based on artificial intelligence (AI) technology in endoscopy and the internet platform, and to provide reference and evidence for the data quality control of AI in digestive endoscopy in China. Methods: After referring to relevant guidelines and standards, data collection and labelling standards for digestive endoscopy of 12 common gastrointestinal diseases were determined. Software for online collection and labelling of multi-center digestive endoscopy data in Shandong Province was developed. Endoscopic equipment with a domestic market share of >5% was used, and dozens of experienced endoscopists from 9 medical centers in Shandong Province were uniformly trained for data labelling. From July 2019 to July 2020, endoscopic examination data from 9 medical centers, namely Qilu Hospital of Shandong University, Shandong Provincial Hospital, Liaocheng People's Hospital, Linyi People's Hospital, Weihai Municipal Hospital, Taian City Central Hospital, Binzhou Medical University Hospital, Yantai Yuhuangding Hospital and Qilu Hospital of Shandong University (Qingdao), were prospectively and continuously collected and labeled. The optimized, desensitized, and generalized data were uploaded to the server. After file synchronization, data processing, and expert review, a multi-center digestive endoscopy AI database with standard data collection and labelling in Shandong Province, namely a cloud platform, was constructed. Descriptive methods were used for statistical analysis. Results: The collection and labelling standards for multi-center digestive endoscopy AI data in Shandong Province were established. The software for online collection and labelling of multi-center digestive endoscopy AI data in Shandong Province was developed. The database in Shandong Province was successfully constructed. In the database, 43 010 lesions, 40 353 images, and 11 289 examinations were labeled. Among them, there were 2 906 cases of early esophageal cancer, 2 912 cases of early gastric cancer, 2 397 cases of early colorectal cancer, and 9 773 cases of colorectal polyps (5 539 cases of adenomatous polyps, 1 161 cases of non-adenomatous polyps and 3 073 cases of undetermined polyps). Conclusions: The multi-center AI cloud platform for digestive endoscopy in Shandong Province adopts unified standards and unified collection and labelling software, which ensures the safety and standardization of endoscopy data. It provides a reference and basis for constructing a quality control system for standardized collection and labelling of digestive endoscopy AI data in China and for third-party data supervision.

4.
Acta Pharmaceutica Sinica ; (12): 2216-2223, 2022.
Article in Chinese | WPRIM | ID: wpr-936583

ABSTRACT

Lu Dangshen is a geoherb of Shanxi Province. The content of Codonopsis pilosula polysaccharides (CPP) in Lu Dangshen is higher than that in Codonopsis Radix from other regions. Glycosyltransferase is the key enzyme for the synthesis of bioactive components such as CPP and tangshenoside I. Based on transcriptome data of C. pilosula [Codonopsis pilosula (Franch.) Nannf.] from different producing areas, this study carried out GO and KEGG functional annotation, conserved domain analysis, phylogenetic tree analysis and expression pattern analysis of glycosyltransferase genes in C. pilosula to provide a theoretical basis for exploring the mechanism of genuineness formation in Lu Dangshen. In this study, 98 glycosyltransferase genes were screened and identified, which belonged to GT family 1, GT family 2, GT family 90 and other families. GO functional annotation showed that most of the glycosyltransferase genes had catalytic activity. KEGG functional annotation showed that C. pilosula glycosyltransferases were mainly involved in glycan biosynthesis and metabolism and in the metabolism of terpenoids and polyketides. Among them, the conserved domain of 42 glycosyltransferase genes in GT family 1 was [X]-W-[2X]-Q-[3X]-[LH]-[5X]-[FLTHCGWNS]-[2X]-E-[4X]-[GVP]-[4X]-P-[4X]-Q-[2X]-[NAK]. Phylogenetic tree analysis based on the glycosyltransferase sequences of Arabidopsis thaliana showed that C. pilosula glycosyltransferases were mainly located in the Arabidopsis thaliana UGT73, UGT72 and UGT85 branches. Gene expression pattern analysis showed that expression of CpUGT73AH2 was higher in Lu Dangshen than in Baitiaodang and that the gene could respond to drought and low temperature stress. In conclusion, a glycosyltransferase gene, CpUGT73AH2, which is involved in the metabolism of terpenoids and polyketides and can respond to environmental stress, was screened from C. pilosula glycosyltransferase family 1, laying a theoretical foundation for further study of the role of C. pilosula glycosyltransferases in the formation of Lu Dangshen.
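
The conserved-domain notation reported above can be turned into a regular expression for scanning protein sequences. The Python sketch below assumes one particular reading of the notation, which the abstract does not spell out: `[nX]` means any n residues, a bracketed set such as `[LH]` means one residue from that set, and a bare letter is a fixed residue. The function names are hypothetical.

```python
import re

# Motif string copied from the abstract; its interpretation below is an assumption.
MOTIF = (
    "[X]-W-[2X]-Q-[3X]-[LH]-[5X]-[FLTHCGWNS]-[2X]-E-"
    "[4X]-[GVP]-[4X]-P-[4X]-Q-[2X]-[NAK]"
)


def motif_to_regex(motif: str) -> str:
    """Translate the dash-separated motif into a regular expression string."""
    parts = []
    for token in motif.split("-"):
        wildcard = re.fullmatch(r"\[(\d*)X\]", token)
        if wildcard:
            # [X] or [nX]: n arbitrary residues (n defaults to 1)
            count = int(wildcard.group(1) or 1)
            parts.append("." if count == 1 else ".{%d}" % count)
        else:
            # Fixed residue or residue set; both are already valid regex syntax
            parts.append(token)
    return "".join(parts)


PATTERN = re.compile(motif_to_regex(MOTIF))


def find_motif(protein: str):
    """Return (start, matched_text) for the first motif hit, or None."""
    hit = PATTERN.search(protein)
    return (hit.start(), hit.group()) if hit else None
```

Under this reading the motif spans 36 residues. If `[FLTHCGWNS]` was meant as a fixed stretch rather than a set of alternatives, that token would need different handling.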

5.
Frontiers of Medicine ; (4): 275-291, 2021.
Article in English | WPRIM | ID: wpr-880954

ABSTRACT

Although genome-wide association studies have identified more than eighty genetic variants associated with non-small cell lung cancer (NSCLC) risk, the biological mechanisms of these variants remain largely unknown. By integrating large-scale genotype data from 15 581 lung adenocarcinoma (AD) cases, 8350 squamous cell carcinoma (SqCC) cases, and 27 355 controls with multiple transcriptome and epigenomic databases, we conducted histology-specific meta-analyses and functional annotations of both reported and novel susceptibility variants. We identified 3064 credible risk variants for NSCLC, which were overrepresented in enhancer-like and promoter-like histone modification peaks as well as DNase I hypersensitive sites. Transcription factor enrichment analysis revealed that USF1 was AD-specific while CREB1 was SqCC-specific. Functional annotation and gene-based analysis implicated 894 target genes, including 274 specific to AD and 123 specific to SqCC, which were overrepresented in somatic driver genes (ER = 1.95, P = 0.005). Pathway enrichment analysis and Gene-Set Enrichment Analysis revealed that AD genes were primarily involved in immune-related pathways, while SqCC genes were related to homologous recombination deficiency. Our results illustrate the molecular basis of both well-studied and new susceptibility loci of NSCLC, providing not only novel insights into the genetic heterogeneity between AD and SqCC but also a set of plausible gene targets for post-GWAS functional experiments.


Subjects
Humans; Adenocarcinoma of Lung/genetics; Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Squamous Cell/genetics; Genetic Heterogeneity; Genetic Predisposition to Disease; Genome-Wide Association Study; Lung Neoplasms/genetics; Polymorphism, Single Nucleotide
6.
Electron. j. biotechnol ; 45: 30-37, May 15, 2020. ilus, graf
Article in Spanish | LILACS | ID: biblio-1177412

ABSTRACT

BACKGROUND: Traditionally, microbial genome sequencing has been restricted to species grown in pure culture. The development of culture-independent techniques over the last decade allows scientists to sequence microbial communities directly from environmental samples. Metagenomics is the study of complex genomes through isolation of the DNA of the whole community. Next generation sequencing (NGS) of metagenomic DNA gives information about the microbial and taxonomical characterization of a particular niche. The objective of the present research is to study, through NGS, the microbial and taxonomical characterization of metagenomic DNA isolated from a frozen soil sample of a glacier in the north western Himalayas. RESULTS: The glacier community comprised 16 phyla, with representation of members belonging to Proteobacteria and Acidobacteria. The numbers of genes annotated through the Kyoto Encyclopedia of Genes and Genomes (KEGG), GO, Pfam, Clusters of Orthologous Groups of proteins (COGs), and FIG databases were generated by COGNIZER. The genes assigned to each group through the COG database and the number of genes annotated in different pathways through the KEGG database were reported. CONCLUSION: The results indicate that the glacier soil examined in the present study harbors taxonomically and metabolically diverse communities. The major bacterial group present in the niche is Proteobacteria, followed by Acidobacteria, Actinobacteria and others. Different genes were annotated through the COG and KEGG databases, which integrate genomic, chemical, and systemic functional information.


Subjects
Soil Microbiology; Bacteria/classification; High-Throughput Nucleotide Sequencing; Microbiota/genetics; Bacteria/isolation & purification; Cold Climate; Computational Biology; Ice Cover; Metagenomics; Genome, Microbial; India
7.
Acta Pharmaceutica Sinica ; (12): 160-167, 2020.
Article in Chinese | WPRIM | ID: wpr-780570

ABSTRACT

In order to explore MYB transcription factors related to developmental processes and secondary metabolism in Morinda officinalis, we analyzed MoMYB expression based on transcriptome data from three tissues (root, stem and leaf). We used this analysis to provide a theoretical foundation for regulating the metabolism of M. officinalis. RNA-seq data along with the five databases including PFAM and plantTFDB and others were used to screen and classify MoMYB, including GO functional annotation and classification, subcellular localization, signal peptide prediction, conserved motif discovery, and comparative phylogenetic analysis. RT-qPCR was carried out to detect tissue-specific expression differences of MoMYB genes. According to transcriptome data, 109 MoMYB sequences were identified and divided into four classes, containing 51 sequences related to R2R3-MYB. Subcellular localization analysis indicated that a majority of sequences were located in nucleus. Blast2GO analysis showed that 109 MoMYB sequences were classified into three major functional ontologies including molecular function (112), biological processes (76) and cellular components (239). The R2-MYB conserved motif of 51 R2R3-MYB sequences possessed three significantly conserved tryptophan residues, whereas a phenylalanine replaced the first tryptophan in R3-MYB. The results of multiple sequence alignment and phylogenetic analysis revealed that the R2R3-MYB was distributed in all subgroups, apart from the S10, S19 and S21 subgroups. RT-qPCR indicated that several R2R3-MYB genes were differentially expressed among the three tissues, and this finding was consistent with transcriptome data. The 109 MoMYB sequences were annotated and divided into different classes, which lays the foundation for further study on MYB transcriptional factors in M. officinalis.

8.
Chinese Traditional and Herbal Drugs ; (24): 1052-1059, 2020.
Article in Chinese | WPRIM | ID: wpr-846607

ABSTRACT

Objective: To explore the function of genes related to terpenoid synthesis and metabolism, and to support interacting-protein screening and fingerprint analysis of Antrodia cinnamomea mycelium, a cDNA library from A. cinnamomea mycelia was constructed and the EST sequences were analyzed. Methods: The cDNA library from the A. cinnamomea mycelium was constructed by the Gateway technique. A subset of EST sequences was analyzed for bioinformatics, functional annotation and EST-SSRs. Results: The cDNA library of the A. cinnamomea mycelium was constructed successfully. The recombination rate of the cDNA library was 95%, the titer of the library was 6.1 × 10^6 cfu/mL, the total number of clones was 1.2 × 10^7 cfu, and the cDNA length ranged from 300 to 2 000 bp with an average of 1 000 bp. Clones were randomly sequenced and 65 valid ESTs were obtained. After comparison against the GenBank database, 45 ESTs had a definite annotation, and 18 ESTs were unnamed or hypothetical proteins. GO functional annotation showed that the ESTs were involved in cell composition, transport, catalytic activity, regulatory functions, etc. In total, 271 SSRs were found across all the ESTs. Nucleotide repeats in A. cinnamomea were abundant; dinucleotide and trinucleotide repeat units were the most common, accounting for 94.23%. Conclusion: The cDNA library from the A. cinnamomea mycelium and the biological information related to its ESTs were preliminarily characterized, which will provide a theoretical foundation for research on the mycelium genomics of A. cinnamomea.

9.
Acta Pharmaceutica Sinica B ; (6): 374-382, 2020.
Article in English | WPRIM | ID: wpr-787622

ABSTRACT

Background: () (2n = 2x = 16) is a genus of flowering plants belonging to the Gelsemiaceae family. Method: A high-quality genome assembly was produced using the Oxford Nanopore Technologies (ONT) platform and high-throughput chromosome conformation capture (Hi-C). Results: A total of 56.11 Gb of raw GridION X5 platform ONT reads (6.23 Gb per cell) were generated. After filtering, 53.45 Gb of clean reads were obtained, giving 160× coverage depth. A genome assembly of 335.13 Mb, close to the 338 Mb estimated by k-mer analysis, was generated with a contig N50 of 10.23 Mb. The vast majority (99.2%) of the assembled sequence was anchored onto 8 pseudo-chromosomes. Genome completeness was then evaluated, and 1338 of the 1440 conserved genes (92.9%) could be found in the assembly. Genome annotation revealed that 43.16% of the genome is composed of repetitive elements and 23.9% is composed of long terminal repeat elements. We predicted 26,768 protein-coding genes, of which 84.56% were functionally annotated. Conclusion: The genomic sequences could be a valuable source for comparative genomic analysis in the Gelsemiaceae family and will be useful for understanding the phylogenetic relationships of indole alkaloid metabolism.
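
The coverage and completeness figures above follow from simple ratios. The Python sketch below reproduces them from the numbers in the abstract; it assumes that coverage depth was computed as clean bases divided by assembly size, which the abstract does not state explicitly, and the helper names are hypothetical.

```python
def coverage_depth(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


def percent(part: float, whole: float) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    return 100.0 * part / whole


# Figures taken from the abstract
clean_bases = 53.45e9        # 53.45 Gb of clean reads
assembly_size = 335.13e6     # 335.13 Mb assembly
kmer_estimate = 338e6        # 338 Mb estimated by k-mer analysis

depth = coverage_depth(clean_bases, assembly_size)    # roughly 160x
completeness = percent(1338, 1440)                    # conserved genes found
assembly_vs_estimate = percent(assembly_size, kmer_estimate)
```

Dividing by the k-mer estimate instead of the assembly size gives a slightly lower depth, so the reported 160× is best read as a rounded figure.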

10.
Article | IMSEAR | ID: sea-205150

ABSTRACT

Noncoding RNAs (ncRNAs) are an important class of genes and play an important role in human cellular activities and serious diseases. Many computational intelligence algorithms (CIAs) have been developed in past studies to predict ncRNA structure. However, many studies suggest that numerous structures still cannot be predicted. In this paper, CIAs for predicting ncRNA structures are comprehensively reviewed. The advantages and disadvantages of CIAs with respect to ncRNA genes are briefly discussed. Moreover, the latest software tools for identifying ncRNA structure by mining deep sequencing data are compared and reviewed. The study focuses mainly on conventional machine learning algorithms and also describes future trends in ncRNA structure prediction. The paper concludes that CIAs need to be improved by using deep learning architectures, in terms of layers and computational complexity, to predict ncRNA structures.

11.
J Genet ; 2019 Sep; 98: 1-5
Article | IMSEAR | ID: sea-215402

ABSTRACT

Oryza rufipogon. dw was first discovered at Dongxiang, Jiangxi in 1978. It is recognized as rich in genetic resources, with characteristics such as cold and insect resistance. A total of 100.15 Gb of raw data was obtained from seven paired-end libraries on the Illumina HiSeq 4000 platform. Subsequently, a draft genome assembly of O. rufipogon. dw was generated with a final size of 422.7 Mb, a contig N50 of 15 kb and a scaffold N50 of 296.2 kb. The assembled genome size was larger than the genome size estimated by k-mer analysis (413 Mb). The identified repeat sequences accounted for 40.09% of the entire genome, and 32,521 protein-coding genes with an average of 4.59 exons per gene were annotated in five databases. Phylogenetic analysis of 1460 single-copy genes using the Bayes method showed that O. rufipogon. dw was close to O. rufipogon. The divergence of the wild rice O. rufipogon. dw was estimated at ∼0.3 million years ago (Mya) from O. rufipogon and ∼0.6 Mya from O. sativa. The draft genome of O. rufipogon. dw provides an essential resource for studying its origin and evolution.
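
The contig and scaffold N50 values quoted above are summary statistics of sequence lengths: N50 is the length L such that sequences of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of that standard definition, using toy lengths rather than the study's data:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; N50 is the length of the
    sequence at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("n50() needs at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), not taken from the assembly
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_contigs)
```

Because scaffolds join contigs across gaps, a scaffold N50 is normally larger than the contig N50 of the same assembly.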

12.
J Genet ; 2019 Aug; 98: 1-12
Article | IMSEAR | ID: sea-215404

ABSTRACT

Camelus dromedarius has played a pivotal role in both culture and way of life in the Arabian peninsula, particularly in arid regions where other domestic animals cannot be easily domesticated. Although the mitochondrial genomes of several camelid species have recently been sequenced, wider phylogenetic studies are yet to be performed. Conserved gene elements, a rapid evolutionary rate, and rare recombination make the mitochondrial genome a useful molecular marker for phylogenetic studies of closely related species. Here we carried out a comparative analysis of previously sequenced mitochondrial genomes of camelids with an emphasis on C. dromedarius, revealing a number of noticeable findings. First, the arrangement of mitochondrial genes in C. dromedarius is similar to that of the other camelids. Second, multiple sequence alignment of intergenic regions shows up to 90% similarity across different kinds of camels, reaching 99% among dromedary camels. Third, we successfully identified the three domains (termination-associated sequence, conserved domain and conserved sequence block) of the control region structure. The phylogenetic tree analysis showed that C. dromedarius mitogenomes were significantly clustered in the same clade with the Lama pacos mitogenome. These findings will enhance our understanding of the nucleotide composition and molecular evolution of the mitogenomes of the genus Camelus, and provide more data for comparative mitogenomics in the family Camelidae.

13.
Article | IMSEAR | ID: sea-204945

ABSTRACT

Functional annotation of non-coding RNAs (ncRNAs) remains poorly understood, largely because of the complex functionalities of ncRNAs. They play a vital part in the operation of the cell. Many ncRNAs, such as microRNAs and long non-coding RNAs, perform important functions in the cell. In practice, the functions of RNAs are tightly coupled, which must be considered when developing computational intelligence techniques. Comprehensive modeling of the structure of an ncRNA is essential and may provide the first clue to understanding its functions. Many computational techniques have been developed to predict ncRNA structures, but few have focused on the functions of ncRNA genes. The accuracy of functional annotation of ncRNAs still faces computational challenges, and results are not satisfactory. This paper describes many computational intelligence methods for predicting the functional annotation of ncRNAs. The literature review suggests that there is still a dire need to develop advanced computational techniques for the functional annotation of ncRNA genes in terms of accuracy and computational time.

14.
J Genet ; 2019 Feb; 98: 1-11
Article | IMSEAR | ID: sea-215480

ABSTRACT

Stem gall (Protomyces macrosporus Unger) is a serious disease that affects the leaves, petioles, stems and fruits of coriander (Coriandrum sativum L.), causing heavy yield loss. Genetic improvement of coriander for stem gall resistance is indispensable. Leaf samples of a stem gall-resistant (ACr-1) and a susceptible (CS-6) coriander cultivar were transcriptome-sequenced on the Illumina NextSeq500 platform. After trimming low-quality reads and adapter sequences, a total of 49,163,108 and 43,746,120 high-quality reads were retained, and assembly yielded 59,933 and 56,861 validated transcripts. We predicted 52,506 and 48,858 coding sequences (CDS), of which 50,506 and 46,945 were annotated using the NCBI nr database. Gene ontology analysis annotated 19,099 and 17,625 terms; pathway analysis yielded 24 functional pathway categories, dominated by signal transduction, transport, catabolism, translation and carbohydrate metabolism. Differential expression analysis predicted 13,123 commonly expressed CDS, of which 431 and 400 genes were significantly upregulated and downregulated, respectively; these included R genes, stress-inducible transcription factors such as ERF, NAC, bZIP, MYB, DREB and WRKY, and antifungal-related genes. Real-time PCR analysis showed that HSP20 expression was upregulated 10-fold in the resistant sample relative to the susceptible sample, with 18S used as the housekeeping gene for normalization. The present results provide insights into various aspects underlying the development of resistance to stem gall in coriander.
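
The abstract does not say how the real-time PCR fold change was calculated. One widely used approach for relative quantification against a housekeeping gene is the 2^-ΔΔCt method, sketched below in Python as an assumption rather than as the authors' procedure. The Ct values are hypothetical and the function name is invented for illustration.

```python
def fold_change_ddct(ct_target_test, ct_reference_test,
                     ct_target_control, ct_reference_control):
    """Relative expression by the 2^-ddCt method.

    Each sample's target Ct is first normalized to its reference
    (housekeeping) gene, then the test sample is compared with the control.
    """
    delta_test = ct_target_test - ct_reference_test
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_test - delta_control)


# Hypothetical Ct values: target gene vs. reference gene in a test sample
# (e.g. resistant cultivar) and a control sample (e.g. susceptible cultivar).
example_fold = fold_change_ddct(22.0, 15.0, 25.0, 15.0)
```

With these made-up values the target crosses threshold three cycles earlier in the test sample after normalization, which corresponds to an 8-fold increase; a 10-fold change corresponds to a ΔΔCt of about -3.32.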

15.
Practical Oncology Journal ; (6): 115-121, 2019.
Article in Chinese | WPRIM | ID: wpr-752824

ABSTRACT

Objective: The aim of this study was to investigate the expression of miR-455-5p in epithelial ovarian cancer and its effect on the development of epithelial ovarian cancer. Methods: The miRNA expression data set GSE83693, covering normal ovarian epithelial tissues and epithelial ovarian cancer tissues, was downloaded from the GEO database. Differential expression analysis was used to obtain the miRNAs differentially expressed in epithelial ovarian cancer, and miR-455-5p expression was compared between normal ovarian epithelium and epithelial ovarian cancer tissues. qRT-PCR was used to verify the predicted differential expression. Bioinformatics software was used to perform KEGG pathway enrichment and GO functional annotation of miR-455-5p target genes and to explore the role of dysregulated miR-455-5p in the development of epithelial ovarian cancer. Results: A total of 101 differentially expressed miRNAs were screened; 34 were up-regulated and 67 were down-regulated. Among them, miR-455-5p was significantly down-regulated (P<0.01), with a fold change of -2.9019. The qRT-PCR results showed that the expression of miR-455-5p in epithelial ovarian cancer cells (SKOV-3, OVCAR-3 and A2780) was significantly lower than that in normal ovarian epithelial cells (IOSE-80), and the difference was statistically significant (P<0.05). KEGG pathway enrichment analysis showed that the target genes regulated by miR-455-5p were mainly involved in five tumor-associated pathways: the TGF-β signaling pathway, the Hippo signaling pathway, ECM-receptor interaction, transcriptional dysregulation in cancer, and chronic myeloid leukemia. GO functional annotation showed that the target genes regulated by miR-455-5p in these pathways were mainly involved in protein phosphorylation, promotion of cell proliferation and migration, inhibition of apoptosis, promotion of epithelial-mesenchymal transition, regulation of transcription and regulation of the cell cycle, all of which are associated with tumorigenesis. Conclusion: The expression of miR-455-5p is down-regulated in epithelial ovarian cancer. The miR-455-5p target genes are involved in the pathogenesis and function of epithelial ovarian cancer and are associated with the development of epithelial ovarian cancer.

16.
Journal of Zhejiang University. Science. B ; (12): 476-487, 2019.
Article in English | WPRIM | ID: wpr-847032

ABSTRACT

Life may have begun in an RNA world, which is supported by increasing evidence of the vital role that RNAs perform in biological systems. In the human genome, most genes actually do not encode proteins; they are noncoding RNA genes. The largest class of noncoding genes is known as long noncoding RNAs (lncRNAs), which are transcripts greater in length than 200 nucleotides, but with no protein-coding capacity. While some lncRNAs have been demonstrated to be key regulators of gene expression and 3D genome organization, most lncRNAs are still uncharacterized. We thus propose several data mining and machine learning approaches for the functional annotation of human lncRNAs by leveraging the vast amount of data from genetic and genomic studies. Recent results from our studies and those of other groups indicate that genomic data mining can give insights into lncRNA functions and provide valuable information for experimental studies of candidate lncRNAs associated with human disease.

17.
Genomics, Proteomics & Bioinformatics ; (4): 305-310, 2019.
Article in English | WPRIM | ID: wpr-772935

ABSTRACT

Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames, start sites, splice sites, and related structural features. The source of these inconsistencies is often traced back to integration across text file formats designed to describe long read alignments and predicted gene structures. In addition, the majority of gene prediction frameworks do not provide robust downstream filtering to remove problematic gene annotations, nor do they represent these annotations in a format consistent with current file standards. These frameworks also lack consideration for functional attributes, such as the presence or absence of protein domains that can be used for gene model validation. To provide oversight to the increasing number of published genome annotations, we present a software package, the Gene Filtering, Analysis, and Conversion (gFACs), to filter, analyze, and convert predicted gene models and alignments. The software operates across a wide range of alignment, analysis, and gene prediction files with a flexible framework for defining gene models with reliable structural and functional attributes. gFACs supports common downstream applications, including genome browsers, and generates extensive details on the filtering process, including distributions that can be visualized to further assess the proposed gene space. gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.

18.
Journal of Zhejiang University. Science. B ; (12): 476-487, 2019.
Article in English | WPRIM | ID: wpr-776715

ABSTRACT

Life may have begun in an RNA world, which is supported by increasing evidence of the vital role that RNAs perform in biological systems. In the human genome, most genes actually do not encode proteins; they are noncoding RNA genes. The largest class of noncoding genes is known as long noncoding RNAs (lncRNAs), which are transcripts greater in length than 200 nucleotides, but with no protein-coding capacity. While some lncRNAs have been demonstrated to be key regulators of gene expression and 3D genome organization, most lncRNAs are still uncharacterized. We thus propose several data mining and machine learning approaches for the functional annotation of human lncRNAs by leveraging the vast amount of data from genetic and genomic studies. Recent results from our studies and those of other groups indicate that genomic data mining can give insights into lncRNA functions and provide valuable information for experimental studies of candidate lncRNAs associated with human disease.


Subjects
Humans; Autism Spectrum Disorder; Genetics; Data Mining; Genomics; Machine Learning; RNA, Long Noncoding; Physiology; Support Vector Machine
19.
Genomics & Informatics ; : 46-2019.
Article in English | WPRIM | ID: wpr-785795

ABSTRACT

The implications of germline de novo variants (DNVs) in diseases are well documented. Despite extensive research, inconsistencies between studies remain a challenge, and the distribution and genetic characteristics of DNVs need to be precisely evaluated. To address this issue at the whole-genome scale, a large number of DNVs identified from the whole-genome sequencing of 1,902 healthy trios (i.e., parents and progeny) from the Simons Foundation for Autism Research Initiative study and 20 healthy Korean trios were analyzed. These apparently nonpathogenic DNVs were enriched in functional elements of the genome but relatively depleted in regions of common copy number variants, implying their potential function as triggers of evolution even in healthy groups. No strong mutational hotspots were identified. The pathogenicity of the DNVs was not strongly elevated, reflecting the health status of the cohort. The mutational signatures were consistent with previous studies. This study will serve as a reference for future DNV studies.


Subjects
Humans; Autistic Disorder; Cohort Studies; Genome; Parents; Virulence
20.
Mem. Inst. Oswaldo Cruz ; 114: e180438, 2019. tab, graf
Article in English | LILACS | ID: biblio-1040619

ABSTRACT

Leishmania braziliensis is the etiological agent of American mucosal leishmaniasis, one of the most severe clinical forms of leishmaniasis. Here, we report the assembly of the L. braziliensis (M2904) genome into 35 continuous chromosomes. Also, the annotation of 8395 genes is provided. The public availability of this information will contribute to a better knowledge of this pathogen and help in the search for vaccines and novel drug targets aimed to control the disease caused by this Leishmania species.


Subjects
Leishmania braziliensis/genetics; DNA, Protozoan/genetics; Sequence Analysis, DNA