Results 1 - 20 of 31
1.
Article in English | MEDLINE | ID: mdl-38862433

ABSTRACT

During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. The KoNA is available at https://www.kobic.re.kr/kona/.


Subject(s)
Databases, Nucleic Acid , Republic of Korea , Humans , High-Throughput Nucleotide Sequencing/methods
2.
BMC Genomics ; 25(1): 318, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38549092

ABSTRACT

BACKGROUND: Detecting structural variations (SVs) at the population level using next-generation sequencing (NGS) requires substantial computational resources and processing time. Here, we compared the performances of 11 SV callers: Delly, Manta, GridSS, Wham, Sniffles, Lumpy, SvABA, Canvas, CNVnator, MELT, and INSurVeyor. These SV callers have been recently published and have been widely employed for processing massive whole-genome sequencing datasets. We evaluated the accuracy, sequence depth, running time, and memory usage of the SV callers. RESULTS: Notably, several callers exhibited better calling performance for deletions than for duplications, inversions, and insertions. Among the SV callers, Manta identified deletion SVs with better performance and efficient computing resources, and both Manta and MELT demonstrated relatively good precision regarding calling insertions. We confirmed that the copy number variation callers, Canvas and CNVnator, exhibited better performance in identifying long duplications as they employ the read-depth approach. Finally, we also verified the genotypes inferred from each SV caller using a phased long-read assembly dataset, and Manta showed the highest concordance in terms of the deletions and insertions. CONCLUSIONS: Our findings provide a comprehensive understanding of the accuracy and computational efficiency of SV callers, thereby facilitating integrative analysis of SV profiles in diverse large-scale genomic datasets.
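The accuracy comparison described above can be illustrated with a minimal sketch of how precision, recall, and F1 are commonly computed for one caller against a truth set. The abstract does not state the matching criterion the authors used; the 50% reciprocal-overlap rule, the `SV` record, and the function names below are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not the authors' data model)."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions; 0.0 if chromosome or type differ."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def evaluate(calls, truth, min_ro=0.5):
    """Greedy one-to-one matching of calls to truth SVs (assumed criterion)."""
    matched_truth = set()
    true_positives = 0
    for call in calls:
        for i, t in enumerate(truth):
            if i not in matched_truth and reciprocal_overlap(call, t) >= min_ro:
                matched_truth.add(i)
                true_positives += 1
                break
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


truth = [SV("chr1", 100, 200, "DEL"), SV("chr1", 1000, 2000, "DUP")]
calls = [SV("chr1", 110, 210, "DEL"), SV("chr2", 100, 200, "DEL")]
print(evaluate(calls, truth))
```

Insertions are usually reported as a single breakpoint rather than an interval, so a real benchmark would likely match them by breakpoint distance instead of overlap; that case is left out of this sketch.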


Subject(s)
DNA Copy Number Variations , Genomics , Humans , Whole Genome Sequencing , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Genome, Human , Genomic Structural Variation
3.
Mol Cells ; 47(3): 100033, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38403196

ABSTRACT

Considering the recent increase in the number of colorectal cancer (CRC) cases in South Korea, we aimed to clarify the molecular characteristics of CRC unique to the Korean population. To gain insights into the complexities of CRC and promote the exchange of critical data, RNA-sequencing analysis was performed to reveal the molecular mechanisms that drive the development and progression of CRC; this analysis is critical for developing effective treatment strategies. We performed RNA-sequencing analysis of CRC and adjacent normal tissue samples from 214 Korean participants (comprising a total of 381 including 169 normal and 212 tumor samples) to investigate differential gene expression between the groups. We identified 19,575 genes expressed in CRC and normal tissues, with 3,830 differentially expressed genes (DEGs) between the groups. Functional annotation analysis revealed that the upregulated DEGs were significantly enriched in pathways related to the cell cycle, DNA replication, and IL-17, whereas the downregulated DEGs were enriched in metabolic pathways. We also analyzed the relationship between clinical information and subtypes using the Consensus Molecular Subtype (CMS) classification. Furthermore, we compared groups clustered within our dataset to CMS groups and performed additional analysis of the methylation data between DEGs and CMS groups to provide comprehensive biological insights from various perspectives. Our study provides valuable insights into the molecular mechanisms underlying CRC in Korean patients and serves as a platform for identifying potential target genes for this disease. The raw data and processed results have been deposited in a public repository for further analysis and exploration.


Subject(s)
Colorectal Neoplasms , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Colorectal Neoplasms/metabolism , Gene Expression Regulation, Neoplastic , Computational Biology/methods , RNA
4.
BMB Rep ; 57(2): 110-115, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37605617

ABSTRACT

Alterations in DNA methylation play an important pathophysiological role in the development and progression of colorectal cancer. We comprehensively profiled DNA methylation alterations in 165 Korean patients with colorectal cancer (CRC), and conducted an in-depth investigation of cancer-specific methylation patterns. Our analysis of the tumor samples revealed a significant presence of hypomethylated probes, primarily within the gene body regions; few hypermethylated sites were observed, which were mostly enriched in promoter-like and CpG island regions. The CpG Island Methylator Phenotype-High (CIMP-H) group exhibited notable enrichment of microsatellite instability-high (MSI-H). Additionally, our findings indicated a significant correlation between methylation of the MLH1 gene and MSI-H status. Furthermore, we found that CIMP-H had a higher tendency to affect the right side of the colon and was slightly more prevalent among older patients. Through our methylome profile analysis, we successfully verified the methylation patterns and clinical characteristics of Korean patients with CRC. This valuable dataset lays a strong foundation for exploring novel molecular insights and potential therapeutic targets for the treatment of CRC. [BMB Reports 2024; 57(2): 110-115].


Subject(s)
Colorectal Neoplasms , DNA Methylation , Humans , DNA Methylation/genetics , Microsatellite Instability , Mutation , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Republic of Korea , CpG Islands/genetics , Phenotype
5.
BMB Rep ; 57(3): 161-166, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37964634

ABSTRACT

Aberrant DNA methylation plays a critical role in the development and progression of colorectal cancer (CRC), which has high incidence and mortality rates in Korea. Various CRC-associated methylation markers for cancer diagnosis and prognosis have been developed; however, they have not been validated for Korean patients owing to the lack of comprehensive clinical and methylome data. Here, we obtained reliable methylation profiles for 228 tumor, 103 adjacent normal, and two unmatched normal colon tissues from Korean patients with CRC using an Illumina Infinium EPIC array; the data were corrected for biological and experimental biases. A comparative methylome analysis confirmed the previous findings that hypermethylated positions in the tumor were highly enriched in CpG island and promoter, 5' untranslated, and first exon regions. However, hypomethylated positions were enriched in the open-sea regions considerably distant from CpG islands. After applying a CpG island methylator phenotype (CIMP) to the methylome data of tumor samples to stratify the CRC patients, we consolidated the previously established clinicopathological findings that the tumors with high CIMP signatures were significantly enriched in the right colon. The results showed a higher prevalence of microsatellite instability status and MLH1 methylation in tumors with high CIMP signatures than in those with low or non-CIMP signatures. Therefore, our methylome analysis and dataset provide insights into applying CRC-associated methylation markers for Korean patients regarding cancer diagnosis and prognosis. [BMB Reports 2024; 57(3): 161-166].


Subject(s)
Colorectal Neoplasms , Epigenome , Humans , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , DNA Methylation/genetics , CpG Islands/genetics , Phenotype , Republic of Korea
6.
BMC Genomics ; 24(1): 613, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37828501

ABSTRACT

BACKGROUND: The domestic dog, Canis lupus familiaris, is a companion animal for humans as well as an animal model in cancer research due to similar spontaneous occurrence of cancers as humans. Despite the social and biological importance of dogs, the catalogue of genomic variations and transcripts for dogs is relatively incomplete. RESULTS: We developed CanISO, a new database to hold a large collection of transcriptome profiles and genomic variations for domestic dogs. CanISO provides 87,692 novel transcript isoforms and 60,992 known isoforms from whole transcriptome sequencing of canine tumors (N = 157) and their matched normal tissues (N = 64). CanISO also provides genomic variation information for 210,444 unique germline single nucleotide polymorphisms (SNPs) from the whole exome sequencing of 183 dogs, with a query system that searches gene- and transcript-level information as well as covered SNPs. Transcriptome profiles can be compared with corresponding human transcript isoforms at a tissue level, or between sample groups to identify tumor-specific gene expression and alternative splicing patterns. CONCLUSIONS: CanISO is expected to increase understanding of the dog genome and transcriptome, as well as its functional associations with humans, such as shared/distinct mechanisms of cancer. CanISO is publicly available at https://www.kobic.re.kr/caniso/ .


Subject(s)
Neoplasms , Wolves , Dogs , Animals , Humans , Transcriptome , Wolves/genetics , Genome , Genomics , Neoplasms/genetics , Neoplasms/veterinary , Protein Isoforms/genetics
7.
Nat Genet ; 55(2): 221-231, 2023 02.
Article in English | MEDLINE | ID: mdl-36624345

ABSTRACT

Despite advances in predicting physical peptide-major histocompatibility complex I (pMHC I) binding, it remains challenging to identify functionally immunogenic neoepitopes, especially for MHC II. By using the results of >36,000 immunogenicity assays, we developed a method to identify pMHC whose structural alignment facilitates T cell reaction. Our method predicted neoepitopes for MHC II and MHC I that were responsive to checkpoint blockade when applied to >1,200 samples of various tumor types. To investigate selection by spontaneous immunity at the single epitope level, we analyzed the frequency spectrum of >25 million mutations in >9,000 treatment-naive tumors with >100 immune phenotypes. MHC II immunogenicity specifically lowered variant frequencies in tumors under high immune pressure, particularly with high TCR clonality and MHC II expression. A similar trend was shown for MHC I neoepitopes, but only in particular tissue types. In summary, we report immune selection imposed by MHC II-restricted natural or therapeutic T cell reactivity.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Neoplasms/therapy , Epitopes/genetics , T-Lymphocytes , Peptides/chemistry , Peptides/metabolism
8.
Exp Mol Med ; 54(11): 1862-1871, 2022 11.
Article in English | MEDLINE | ID: mdl-36323850

ABSTRACT

Despite substantial advances in disease genetics, studies to date have largely focused on individuals of European descent. This limits further discoveries of novel functional genetic variants in other ethnic groups. To alleviate the paucity of East Asian population genome resources, we established the Korean Variant Archive 2 (KOVA 2), which is composed of 1896 whole-genome sequences and 3409 whole-exome sequences from healthy individuals of Korean ethnicity. This is the largest genome database from the ethnic Korean population to date, surpassing the 1909 Korean individuals deposited in gnomAD. The variants in KOVA 2 displayed all the known genetic features of those from previous genome databases, and we compiled data from Korean-specific runs of homozygosity, positively selected intervals, and structural variants. In doing so, we found loci, such as the loci of ADH1A/1B and UHRF1BP1, that are strongly selected in the Korean population relative to other East Asian populations. Our analysis of allele ages revealed a correlation between variant functionality and evolutionary age. The data can be browsed and downloaded from a public website ( https://www.kobic.re.kr/kova/ ). We anticipate that KOVA 2 will serve as a valuable resource for genetic studies involving East Asian populations.


Subject(s)
Asian People , Exome , Humans , Asian People/genetics , Republic of Korea , Polymorphism, Single Nucleotide
9.
Genomics ; 113(6): 4136-4148, 2021 11.
Article in English | MEDLINE | ID: mdl-34715294

ABSTRACT

Hereditary Spastic Paraplegias (HSP) are a group of rare inherited neurological disorders characterized by progressive loss of corticospinal motor-tract function. Numerous patients with HSP remain undiagnosed despite screening for known genetic causes of HSP. Therefore, identification of novel genetic variations related to HSP is needed. In this study, we identified 88 genetic variants in 54 genes from whole-exome data of 82 clinically well-defined Korean HSP families. Fifty-six percent were known HSP genes, and 44% were composed of putative candidate HSP genes involved in the HSPome and originally reported neuron-related genes, not previously diagnosed in HSP patients. Their inheritance modes were 39, de novo; 33, autosomal dominant; and 10, autosomal recessive. Notably, ALDH18A1 showed the second highest frequency. Fourteen known HSP genes were reported in Koreans for the first time, with some of their variants being predictive of HSP-causing protein malfunction. SPAST and REEP1 mutants with unknown function induced neurite abnormality. Further, 54 HSP-related genes were closely linked to the HSP progression-related network. Additionally, the genetic spectrum and variation of known HSP genes differed across ethnic groups. These results expand the genetic spectrum for HSP and may contribute to the accurate diagnosis and treatment for rare HSP.


Subject(s)
Spastic Paraplegia, Hereditary , Asian People , Exome , Humans , Membrane Transport Proteins/genetics , Mutation , Republic of Korea , Spastic Paraplegia, Hereditary/diagnosis , Spastic Paraplegia, Hereditary/genetics , Spastin/genetics
11.
Gastroenterology ; 160(4): 1194-1207.e28, 2021 03.
Article in English | MEDLINE | ID: mdl-32946903

ABSTRACT

BACKGROUND & AIMS: Squalene epoxidase (SQLE), a rate-limiting enzyme in cholesterol biosynthesis, is suggested as a proto-oncogene. Paradoxically, SQLE is degraded by excess cholesterol, and low SQLE is associated with aggressive colorectal cancer (CRC). Therefore, we studied the functional consequences of SQLE reduction in CRC progression. METHODS: Gene and protein expression data and clinical features of CRCs were obtained from public databases and 293 human tissues, analyzed by immunohistochemistry. In vitro studies showed underlying mechanisms of CRC progression mediated by SQLE reduction. Mice were fed a 2% high-cholesterol or a control diet before and after cecum implantation of SQLE genetic knockdown/control CRC cells. Metastatic dissemination and circulating cancer stem cells were demonstrated by in vivo tracking and flow cytometry analysis, respectively. RESULTS: In vitro studies showed that SQLE reduction helped cancer cells overcome constraints by inducing the epithelial-mesenchymal transition required to generate cancer stem cells. Surprisingly, SQLE interacted with GSK3β and p53. Active GSK3β contributes to the stability of SQLE, thereby increasing cell cholesterol content, whereas SQLE depletion disrupted the GSK3β/p53 complex, resulting in a metastatic phenotype. This was confirmed in a spontaneous CRC metastasis mouse model, where SQLE reduction, by a high-cholesterol regimen or genetic knockdown, strikingly promoted CRC aggressiveness through the production of migratory cancer stem cells. CONCLUSIONS: We showed that SQLE reduction caused by cholesterol accumulation aggravates CRC progression via the activation of the β-catenin oncogenic pathway and deactivation of the p53 tumor suppressor pathway. Our findings provide new insights into the link between cholesterol and CRC, identifying SQLE as a key regulator in CRC aggressiveness and a prognostic biomarker.


Subject(s)
Cholesterol/metabolism , Colorectal Neoplasms/pathology , Squalene Monooxygenase/metabolism , Adult , Aged , Animals , Cell Line, Tumor , Colon/pathology , Disease Models, Animal , Female , Gene Knockdown Techniques , Glycogen Synthase Kinase 3 beta/metabolism , Humans , Intestinal Mucosa/pathology , Male , Mice , Middle Aged , Neoplastic Stem Cells/pathology , Oxidation-Reduction , Proto-Oncogene Mas , Rectum/pathology , Squalene Monooxygenase/genetics , Tumor Suppressor Protein p53/metabolism , Young Adult , beta Catenin/metabolism
12.
Front Genet ; 11: 590924, 2020.
Article in English | MEDLINE | ID: mdl-33584793

ABSTRACT

Lennox-Gastaut syndrome (LGS) is a severe type of childhood-onset epilepsy characterized by multiple types of seizures, specific discharges on electroencephalography, and intellectual disability. Most patients with LGS do not respond well to drug treatment and show poor long-term prognosis. Approximately 30% of patients without brain abnormalities have unidentifiable causes. Therefore, accurate diagnosis and treatment of LGS remain challenging. To identify causative mutations of LGS, we analyzed the whole-exome sequencing data of 17 unrelated Korean families, including patients with LGS and LGS-like epilepsy without brain abnormalities, using the Genome Analysis Toolkit. We identified 14 mutations in 14 genes as causes of LGS or LGS-like epilepsy. Sixty-four percent of the identified genes were reported as LGS or epilepsy-related genes. Many of these variations were novel and considered as pathogenic or likely pathogenic. Network analysis was performed to classify the identified genes into two network clusters: neuronal signal transmission or neuronal development. Additionally, knockdown of two candidate genes with insufficient evidence of neuronal functions, SLC25A39 and TBC1D8, decreased neurite outgrowth and the expression level of MAP2, a neuronal marker. These results expand the spectrum of genetic variations and may aid the diagnosis and management of individuals with LGS.

13.
Genet Test Mol Biomarkers ; 24(1): 54-58, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31829726

ABSTRACT

Aim: Lennox-Gastaut syndrome (LGS) is a severe type of childhood-onset epilepsy with multiple types of seizures, specific discharges on electroencephalography, and intellectual disability. However, LGS-related genes are largely unknown. To identify causative genes related to LGS, we collected and analyzed data from a three-generation Korean family in which one member had LGS and two had intellectual disability. Methods: Genomic DNAs were extracted from blood samples of all participants and used in whole-exome sequencing (WES). Genetic variants were detected by the Genome Analysis Toolkit and confirmed by Sanger sequencing. Variant pathogenicity was evaluated by prediction programs and the American College of Medical Genetics criteria. The LGS patient had generalized slow spike-and-wave discharges, multiple types of seizures, and developmental delay. Results: Analyses of the WES data from the family revealed a novel variant (c.1048G>A, p.Ala350Thr) in the IQ motif and Sec7 domain 2 (IQSEC2). This variant is within a highly evolutionarily conserved IQ-like motif, indicating a decrease in the calmodulin-binding capacity or α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid transmission. The hemizygous variant in the male with LGS was a maternally inherited X-linked variant from the heterozygous maternal grandmother and mother, both of whom had intellectual disability. Conclusion: These findings indicate that the variant of IQSEC2 triggered both LGS and intellectual disability dependent on sex in this family. We report a novel X-linked inherited IQSEC2 variant for LGS and intellectual disability, which enhances the spectrum of variants in the IQ-like motif of IQSEC2.


Subject(s)
Guanine Nucleotide Exchange Factors/genetics , Intellectual Disability/genetics , Lennox Gastaut Syndrome/genetics , Adult , Child , Epilepsy/genetics , Family , Female , Genes, X-Linked/genetics , Guanine Nucleotide Exchange Factors/metabolism , Humans , Male , Pedigree , Republic of Korea , Exome Sequencing
14.
Cell Death Dis ; 10(8): 570, 2019 07 29.
Article in English | MEDLINE | ID: mdl-31358734

ABSTRACT

The initiation of centrosome duplication is regulated by the Plk4/STIL/hsSAS-6 axis; however, the involvement of other centrosomal proteins in this process remains unclear. In this study, we demonstrate that Cep131 physically interacts with Plk4 following phosphorylation of residues S21 and T205. Localizing at the centriole, phosphorylated Cep131 has an increased capability to interact with STIL, leading to further activation and stabilization of Plk4 for initiating centrosome duplication. Moreover, we found that Cep131 overexpression resulted in centrosome amplification by excessive recruitment of STIL to the centriole and subsequent stabilization of Plk4, contributing to centrosome amplification. The xenograft mouse model also showed that both centrosome amplification and colon cancer growth were significantly increased by Cep131 overexpression. These findings demonstrate that Cep131 is a novel substrate of Plk4, and that phosphorylation or dysregulated Cep131 overexpression promotes Plk4 stabilization and therefore centrosome amplification, establishing a perspective in understanding a relationship between centrosome amplification and cancer development.


Subject(s)
Cell Cycle Proteins/genetics , Centrosome/metabolism , Colonic Neoplasms/genetics , Cytoskeletal Proteins/genetics , Protein Serine-Threonine Kinases/genetics , Animals , Cell Line, Tumor , Colonic Neoplasms/pathology , Disease Progression , Gene Expression Regulation, Neoplastic/genetics , HCT116 Cells , HEK293 Cells , Heterografts , Humans , Intracellular Signaling Peptides and Proteins/genetics , Mice , Phosphorylation/genetics
15.
Plant Physiol ; 171(1): 452-67, 2016 05.
Article in English | MEDLINE | ID: mdl-26966169

ABSTRACT

Plant leaves, harvesting light energy and fixing CO2, are a major source of foods on the earth. Leaves undergo developmental and physiological shifts during their lifespan, ending with senescence and death. We characterized the key regulatory features of the leaf transcriptome during aging by analyzing total- and small-RNA transcriptomes throughout the lifespan of Arabidopsis (Arabidopsis thaliana) leaves at multidimensions, including age, RNA-type, and organelle. Intriguingly, senescing leaves showed more coordinated temporal changes in transcriptomes than growing leaves, with sophisticated regulatory networks comprising transcription factors and diverse small regulatory RNAs. The chloroplast transcriptome, but not the mitochondrial transcriptome, showed major changes during leaf aging, with a strongly shared expression pattern of nuclear transcripts encoding chloroplast-targeted proteins. Thus, unlike animal aging, leaf senescence proceeds with tight temporal and distinct interorganellar coordination of various transcriptomes that would be critical for the highly regulated degeneration and nutrient recycling contributing to plant fitness and productivity.


Subject(s)
Arabidopsis/genetics , Gene Expression Regulation, Plant , Plant Leaves/physiology , Transcriptome , Antisense Elements (Genetics) , Arabidopsis/physiology , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Chloroplasts/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , Organelles/genetics , Organelles/metabolism , Plant Leaves/cytology , RNA, Small Untranslated/genetics , Time Factors , Transcription Factors/genetics , Transcription Factors/metabolism
16.
Nucleic Acids Res ; 43(12): 5716-29, 2015 Jul 13.
Article in English | MEDLINE | ID: mdl-26001967

ABSTRACT

Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes.


Subject(s)
Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genes, Neoplasm , Transcription, Genetic , Bayes Theorem , Cell Line, Tumor , Enhancer Elements, Genetic , Gene Expression Regulation, Neoplastic/drug effects , Genetic Variation , Genomics/methods , Humans , MCF-7 Cells , Risk , Transcription Factors/metabolism
17.
Bioinformatics ; 31(4): 596-8, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25322835

ABSTRACT

SUMMARY: Deep sequencing of small RNAs has become a routine process in recent years, but no dedicated viewer is as yet available to explore the sequence features simultaneously along with secondary structure and gene expression of microRNA (miRNA). We present a highly interactive application that visualizes the sequence alignment, secondary structure and normalized read counts in synchronous multipanel windows. This helps users to easily examine the relationships between the structure of precursor and the sequences and abundance of final products and thereby will facilitate the studies on miRNA biogenesis and regulation. The project manager handles multiple samples of multiple groups. The read alignment is imported in BAM file format. Implemented features comprise sorting, zooming, highlighting, editing, filtering, saving, exporting, etc. Currently, miRseqViewer supports 84 organisms whose annotation is available at miRBase. AVAILABILITY AND IMPLEMENTATION: miRseqViewer, implemented in Java, is available at https://github.com/insoo078/mirseqviewer or at http://msv.kobic.re.kr. CONTACT: sanghyuk@ewha.ac.kr.
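The abstract above mentions normalized read counts without saying which normalization miRseqViewer applies. As a hedged illustration only, the sketch below uses reads per million (RPM), a common choice for small-RNA data; the function name and the example miRNA labels are hypothetical and are not taken from the tool.

```python
def reads_per_million(raw_counts):
    """Scale raw per-miRNA read counts so that the sample total is one million.

    RPM is an assumed normalization for illustration; it is not confirmed
    to be the scheme used by miRseqViewer.
    """
    total = sum(raw_counts.values())
    if total == 0:
        return {name: 0.0 for name in raw_counts}
    return {name: count * 1_000_000 / total for name, count in raw_counts.items()}


# Hypothetical sample with two miRNAs.
sample = {"miR-a": 250, "miR-b": 750}
print(reads_per_million(sample))
```

Scaling by library size makes abundances comparable between samples sequenced to different depths, which is what a side-by-side, multi-sample view of read counts relies on.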


Subject(s)
Computational Biology/methods , Computer Graphics , Databases, Nucleic Acid , MicroRNAs/genetics , Sequence Analysis, RNA/methods , Software , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment
18.
Genomics Inform ; 12(1): 42-7, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24748860

ABSTRACT

Asian populations contain a variety of ethnic groups that have ethnically specific genetic differences. Ethnic variants may be highly relevant in disease and human differentiation studies. Here, we identified ethnically specific variants and then investigated their distribution across Asian ethnic groups. We obtained 58,960 Pan-Asian single nucleotide polymorphisms of 1,953 individuals from 72 ethnic groups of 11 Asian countries. We selected 9,306 ethnic variant single nucleotide polymorphisms (ESNPs) and 5,167 ethnic variant copy number polymorphisms (ECNPs) using the nearest shrunken centroid method. We analyzed ESNPs and ECNPs in 3 hierarchical levels: superpopulation, subpopulation, and ethnic population. We also identified ESNP- and ECNP-related genes and their features. This study represents the first attempt to identify Asian ESNP and ECNP markers, which can be used to identify genetic differences and predict disease susceptibility and drug effectiveness in Asian ethnic populations.

19.
BMB Rep ; 46(6): 305-9, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23790973

ABSTRACT

The determination of relatedness between individuals in a family is crucial in analysis of common complex diseases. We present a method to infer close inter-familial relationships based on SNP genotyping data and provide the relationship coefficient of kinship in Korean families. We obtained blood samples from 43 Korean individuals in two families. SNP data was obtained using the Affymetrix Genome-wide Human SNP array 6.0 and the Illumina Human 1M-Duo chip. To measure the kinship coefficient with the SNP genotyping data, we considered all possible pairs of individuals in each family. The genetic distance between two individuals in a pair was determined using the allele sharing distance method. The results show that genetic distance is proportional to the kinship coefficient and that a close degree of kinship can be confirmed with SNP genotyping data. This study represents the first attempt to identify the genetic distance between very closely related individuals.
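The allele sharing distance named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: genotypes are assumed to be coded as the count of one reference allele (0, 1, or 2) per SNP, missing calls are `None`, and the per-SNP distance is 2 minus the number of alleles shared identical-by-state, which equals the absolute genotype difference. Some formulations divide the result by 2 to bound it at 1.

```python
def allele_sharing_distance(genotypes_a, genotypes_b):
    """Mean per-SNP allele sharing distance between two individuals.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    Per-SNP distance: 0 if both alleles are shared, 1 if one, 2 if none.
    """
    if len(genotypes_a) != len(genotypes_b):
        raise ValueError("genotype vectors must cover the same SNPs")
    total = 0
    compared = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip SNPs not genotyped in both individuals
        total += abs(a - b)
        compared += 1
    if compared == 0:
        raise ValueError("no SNPs genotyped in both individuals")
    return total / compared


# Hypothetical genotypes at five SNPs for two individuals.
person_1 = [0, 1, 2, 2, None]
person_2 = [0, 2, 0, 2, 1]
print(allele_sharing_distance(person_1, person_2))
```

Computing this value for every pair within a family gives a distance matrix that can then be compared against the expected kinship coefficients from the pedigree, which is the kind of comparison the abstract reports.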


Subject(s)
Asian People/genetics , Polymorphism, Single Nucleotide , Alleles , Family , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Republic of Korea
20.
PLoS One ; 8(2): e55596, 2013.
Article in English | MEDLINE | ID: mdl-23405175

ABSTRACT

BACKGROUND: Deep sequencing techniques provide a remarkable opportunity for comprehensive understanding of tumorigenesis at the molecular level. As omics studies become popular, integrative approaches need to be developed to move from a simple cataloguing of mutations and changes in gene expression to dissecting the molecular nature of carcinogenesis at the systemic level and understanding the complex networks that lead to cancer development. RESULTS: Here, we describe a high-throughput, multi-dimensional sequencing study of primary lung adenocarcinoma tumors and adjacent normal tissues of six Korean female never-smoker patients. Our data encompass results from exome-seq, RNA-seq, small RNA-seq, and MeDIP-seq. We identified and validated novel genetic aberrations, including 47 somatic mutations and 19 fusion transcripts. One of the fusions involves the c-RET gene, which was recently reported to form fusion genes that may function as drivers of carcinogenesis in lung cancer patients. We also characterized gene expression profiles, which we integrated with genomic aberrations and gene regulations into functional networks. The most prominent gene network module that emerged indicates that disturbances in G2/M transition and mitotic progression are causally linked to tumorigenesis in these patients. Also, results from the analysis strongly suggest that several novel microRNA-target interactions represent key regulatory elements of the gene network. CONCLUSIONS: Our study not only provides an overview of the alterations occurring in lung adenocarcinoma at multiple levels from genome to transcriptome and epigenome, but also offers a model for integrative genomics analysis and proposes potential target pathways for the control of lung adenocarcinoma.


Subject(s)
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Carcinoma, Non-Small-Cell Lung/genetics , High-Throughput Nucleotide Sequencing , Lung Neoplasms/genetics , Smoking/genetics , Case-Control Studies , Female , Gene Expression Profiling , Humans , MicroRNAs/genetics , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction