1.
BMC Bioinformatics ; 23(1): 297, 2022 Jul 25.
Article in English | MEDLINE | ID: mdl-35879669

ABSTRACT

Since the completion of the Human Genome Project at the turn of the century, there has been an unprecedented proliferation of sequencing data. One consequence is that it has become extremely difficult to store, back up, and migrate the enormous volume of genomic datasets, which continue to expand as the cost of sequencing decreases. Hence, a much more efficient and scalable program for genome compression is urgently required. In this manuscript, we propose a new Apache Spark-based genome compression method called SparkGC that can run efficiently and cost-effectively on a scalable computational cluster to compress large collections of genomes. SparkGC uses Spark's in-memory computation capabilities to reduce compression time by keeping data active in memory between the first-order and second-order compression. The evaluation shows that the compression ratio of SparkGC is at least 30% better than that of the best state-of-the-art methods. The compression speed is also at least 3.8 times that of the best state-of-the-art methods on only one worker node, and it scales well with the number of nodes. SparkGC is of significant benefit to genomic data storage and transmission. The source code of SparkGC is publicly available at https://github.com/haichangyao/SparkGC .


Subjects
Algorithms , Data Compression , Data Compression/methods , Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, DNA/methods , Software
2.
Biomed Res Int ; 2019: 3108950, 2019.
Article in English | MEDLINE | ID: mdl-31915686

ABSTRACT

With the maturity of genome sequencing technology, huge amounts of sequence reads as well as assembled genomes are being generated. With this explosive growth, the storage and transmission of genomic data face enormous challenges. FASTA, one of the main storage formats for genome sequences, is widely used in the Gene Bank because it eases sequence analysis and gene research and is easy to read. Many compression methods for FASTA genome sequences have been proposed, but they still have room for improvement: the compression ratio and speed are neither high nor robust enough, and memory consumption is not ideal. Therefore, it is of great significance to improve the efficiency, robustness, and practicability of genomic data compression, to further reduce the storage and transmission cost of genomic data, and to promote the research and development of genomic technology. In this manuscript, a hybrid referential compression method (HRCM) for FASTA genome sequences is proposed. HRCM is a lossless compression method able to compress a single sequence as well as large collections of sequences. It is implemented in three stages: sequence information extraction, sequence information matching, and sequence information encoding. A large number of experiments evaluated the performance of HRCM. Experimental verification shows that HRCM is superior to the best-known methods in genome batch compression. Moreover, HRCM's memory consumption is relatively low, and it can be deployed on standard PCs.
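
As a rough illustration of what a referential "matching" stage can look like, the following is a minimal Python sketch. It is not HRCM's actual algorithm: the function names (`build_index`, `compress`, `decompress`), the k-mer hash index, the greedy longest-match strategy, and the token format are all assumptions made for this example. The target sequence is rewritten as a list of (reference position, length) matches plus literal strings for the bases that do not match.

```python
from typing import Dict, List, Tuple, Union

# A token is either a (reference_position, length) match or a literal string.
Token = Union[Tuple[int, int], str]


def build_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer of the reference to the positions where it occurs."""
    index: Dict[str, List[int]] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def compress(target: str, reference: str, k: int = 4) -> List[Token]:
    """Greedily encode the target as matches against the reference (toy version)."""
    index = build_index(reference, k)
    tokens: List[Token] = []
    literal: List[str] = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Try every reference position that shares the next k-mer, keep the longest.
        for pos in index.get(target[i:i + k], []):
            length = k
            while (i + length < len(target)
                   and pos + length < len(reference)
                   and target[i + length] == reference[pos + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= k:
            if literal:  # flush pending unmatched bases first
                tokens.append("".join(literal))
                literal = []
            tokens.append((best_pos, best_len))
            i += best_len
        else:
            literal.append(target[i])
            i += 1
    if literal:
        tokens.append("".join(literal))
    return tokens


def decompress(tokens: List[Token], reference: str) -> str:
    """Rebuild the target from the tokens and the same reference (lossless)."""
    parts: List[str] = []
    for token in tokens:
        if isinstance(token, str):
            parts.append(token)
        else:
            pos, length = token
            parts.append(reference[pos:pos + length])
    return "".join(parts)
```

In this sketch, a target identical to the reference collapses to a single match token, and `decompress(compress(t, r), r)` returns `t` for any strings `t` and `r`. A real compressor would additionally entropy-code the tokens, which corresponds to the encoding stage named in the abstract.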


Subjects
Big Data , Data Compression/methods , Genomics/methods , Software , Databases, Genetic , Humans
3.
Acta Biochim Biophys Sin (Shanghai) ; 45(8): 692-9, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23709205

ABSTRACT

Keloids are tumor-like skin scars that grow as a result of the aberrant healing of skin injuries and have no effective treatment. The molecular mechanism underlying keloid pathogenesis is still largely unknown. In this study, we compared microRNA (miRNA) expression profiles between keloid-derived fibroblasts and normal fibroblasts (including fetal and adult dermal fibroblasts) by miRNA microarray analysis. We found that the miRNA profiles in keloid-derived fibroblasts differ from those in normal fibroblasts. Nine miRNAs were differentially expressed: six were significantly up-regulated in keloid fibroblasts (KFs), namely miR-152, miR-23b-3p, miR-31-5p, miR-320c, miR-30a-5p, and hsv1-miR-H7, and three were significantly down-regulated, namely miR-4328, miR-145-5p, and miR-143-3p. Functional annotation of the targets of the differentially expressed miRNAs revealed that they were enriched in several signaling pathways important for scar wound healing. In conclusion, we demonstrate that the miRNA expression profile is altered in KFs compared with fetal and adult dermal fibroblasts, and that this expression profile may provide a useful clue for exploring the pathogenesis of keloids. miRNAs might partially contribute to the etiology of keloids by affecting several signaling pathways relevant to scar wound healing.


Subjects
Gene Expression Profiling , Keloid/pathology , MicroRNAs/genetics , Cells, Cultured , Cluster Analysis , Fibroblasts/pathology , Humans , Keloid/metabolism , Reverse Transcriptase Polymerase Chain Reaction