2.
Nature ; 618(7967): 981-985, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37225998

ABSTRACT

Soils store more carbon than other terrestrial ecosystems [1,2]. How soil organic carbon (SOC) forms and persists remains uncertain [1,3], which makes it challenging to understand how it will respond to climatic change [3,4]. It has been suggested that soil microorganisms play an important role in SOC formation, preservation and loss [5-7]. Although microorganisms affect the accumulation and loss of soil organic matter through many pathways [4,6,8-11], microbial carbon use efficiency (CUE) is an integrative metric that can capture the balance of these processes [12,13]. Although CUE has the potential to act as a predictor of variation in SOC storage, the role of CUE in SOC persistence remains unresolved [7,14,15]. Here we examine the relationship between CUE and the preservation of SOC, and interactions with climate, vegetation and edaphic properties, using a combination of global-scale datasets, a microbial-process explicit model, data assimilation, deep learning and meta-analysis. We find that CUE is at least four times as important as other evaluated factors, such as carbon input, decomposition or vertical transport, in determining SOC storage and its spatial variation across the globe. In addition, CUE shows a positive correlation with SOC content. Our findings point to microbial CUE as a major determinant of global SOC storage. Understanding the microbial processes underlying CUE and their environmental dependence may help the prediction of SOC feedback to a changing climate.


Subject(s)
Carbon Sequestration , Carbon , Ecosystem , Soil Microbiology , Soil , Carbon/analysis , Carbon/metabolism , Climate Change , Plants , Soil/chemistry , Datasets as Topic , Deep Learning
3.
IEEE Robot Autom Lett ; 8(8): 5055-5060, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38283263

ABSTRACT

The clinical efficacy of robotic rehabilitation interventions hinges on appropriate neuromuscular recruitment from the patient. The first purpose of this study was to evaluate the use of supervised machine learning techniques to predict neuromuscular recruitment of the ankle plantar flexors during walking with ankle exoskeleton resistance in individuals with cerebral palsy (CP). The second goal of this study was to utilize the predictive models of plantar flexor recruitment in the design of a personalized biofeedback framework intended to improve (i.e., increase) user engagement when walking with resistance. First, we developed and trained multilayer perceptrons (MLPs), a type of artificial neural network (ANN), utilizing features extracted exclusively from the exoskeleton's onboard sensors, and demonstrated 85-87% accuracy, on average, in predicting muscle recruitment from electromyography measurements. Next, our participants completed a gait training session while receiving audio-visual biofeedback of their personalized real-time plantar flexor recruitment predictions from the online MLP. We found that adding biofeedback to resistance elevated plantar flexor recruitment by 24 ± 16% compared to resistance alone. This study highlights the potential for online machine learning frameworks to improve the effectiveness and delivery of robotic rehabilitation systems in clinical populations.

4.
Biol Methods Protoc ; 7(1): bpac022, 2022.
Article in English | MEDLINE | ID: mdl-36157711

ABSTRACT

Building realistically complex models of infectious disease transmission that are relevant for informing public health is conceptually challenging and requires knowledge of coding architecture that can implement key modeling conventions. For example, many of the models built to understand COVID-19 dynamics have included stochasticity, transmission dynamics that change throughout the epidemic due to changes in host behavior or public health interventions, and spatial structures that account for important spatio-temporal heterogeneities. Here we introduce an R package, SPARSEMODr, that allows users to simulate disease models that are stochastic and spatially explicit, including a model for COVID-19 that was useful in the early phases of the epidemic. SPARSEMOD stands for SPAtial Resolution-SEnsitive Models of Outbreak Dynamics, and our goal is to demonstrate particular conventions for rapidly simulating the dynamics of more complex, spatial models of infectious disease. In this report, we outline the features and workflows of our software package that allow for user-customized simulations. We believe the example models provided in our package will be useful in educational settings, as the coding conventions are adaptable, and will help new modelers to better understand important assumptions that were built into sophisticated COVID-19 models.
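
As a rough illustration of two of the conventions named above (stochasticity and transmission that changes over the course of the epidemic), the following Python sketch simulates a single-population chain-binomial SIR model. It is not the SPARSEMODr API: the function `simulate_sir`, its parameters, and the example values are all hypothetical, and spatial structure is deliberately left out to keep the sketch short.

```python
import math
import random


def simulate_sir(s, i, r, beta, gamma, steps, seed=0):
    """Chain-binomial SIR with a time-varying transmission rate.

    beta is a function of the time step (hypothetical interface);
    gamma is the per-step recovery rate. Returns a list of (S, I, R).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = s + i + r
    history = [(s, i, r)]
    for t in range(steps):
        # Per-step probabilities derived from the rates.
        p_infect = 1.0 - math.exp(-beta(t) * i / n)
        p_recover = 1.0 - math.exp(-gamma)
        # Binomial draws written as sums of Bernoulli trials.
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        new_recoveries = sum(rng.random() < p_recover for _ in range(i))
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Transmission drops after step 20, e.g. to mimic an intervention.
history = simulate_sir(
    s=990, i=10, r=0,
    beta=lambda t: 0.5 if t < 20 else 0.1,
    gamma=0.2, steps=50, seed=1,
)
```

Passing `beta` as a function of time is one simple way to express a transmission rate that changes during the epidemic; a spatial version would run one such update per location and couple the locations through movement.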

5.
BMC Bioinformatics ; 22(1): 323, 2021 Jun 14.
Article in English | MEDLINE | ID: mdl-34126932

ABSTRACT

BACKGROUND: Histone modification constitutes a basic mechanism for the genetic regulation of gene expression. In the early 2000s, a powerful technique emerged that couples chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq). This technique provides a direct survey of the DNA regions associated with these modifications. In order to realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed or adapted to analyze the massive amount of data it generates. Many of these algorithms were built around natural assumptions such as the Poisson distribution to model the noise in the count data. In this work we start from these natural assumptions and show that it is possible to improve upon them. RESULTS: Our comparisons on seven reference datasets of histone modifications (H3K36me3 & H3K4me3) suggest that natural assumptions are not always realistic under application conditions. We show that the unconstrained multiple changepoint detection model with alternative noise assumptions and supervised learning of the penalty parameter reduces the over-dispersion exhibited by count data. These models, implemented in the R package CROCS ( https://github.com/aLiehrmann/CROCS ), detect the peaks more accurately than algorithms which rely on natural assumptions. CONCLUSION: The segmentation models we propose can benefit researchers in the field of epigenetics by providing new high-quality peak prediction tracks for H3K36me3 and H3K4me3 histone modifications.


Subject(s)
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , Algorithms , Chromatin Immunoprecipitation , DNA Sequence Analysis
6.
Comput Biol Med ; 130: 104208, 2021 03.
Article in English | MEDLINE | ID: mdl-33484946

ABSTRACT

The electrocardiogram (ECG) signal is the most widely used non-invasive tool for the investigation of cardiovascular diseases. Automatic delineation of ECG fiducial points, in particular the R-peak, serves as the basis for ECG processing and analysis. This study proposes a new method of ECG signal analysis by introducing a new class of graphical models based on optimal changepoint detection models, named the graph-constrained changepoint detection (GCCD) model. The GCCD model treats fiducial points delineation in the non-stationary ECG signal as a changepoint detection problem. The proposed model exploits the sparsity of changepoints to detect abrupt changes within the ECG signal; thereby, the R-peak detection task can be relaxed from any preprocessing step. In this novel approach, prior biological knowledge about the expected sequence of changes is incorporated into the model using the constraint graph, which can be defined manually or automatically. First, we define the constraint graph manually; then, we present a graph learning algorithm that can search for an optimal graph in a greedy scheme. Finally, we compare the manually defined graphs and learned graphs in terms of graph structure and detection accuracy. We evaluate the performance of the algorithm using the MIT-BIH Arrhythmia Database. The proposed model achieves an overall sensitivity of 99.64%, positive predictivity of 99.71%, and detection error rate of 0.19 for the manually defined constraint graph and overall sensitivity of 99.76%, positive predictivity of 99.68%, and detection error rate of 0.55 for the automatic learning constraint graph.
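
For readers unfamiliar with the three figures reported above, the sketch below computes them from counts of true positives, false negatives and false positives. The abstract does not define the detection error rate; the version here, which divides the errors by the number of annotated beats, is one common convention and is an assumption. The function name and the example counts are hypothetical.

```python
def detection_metrics(tp, fn, fp):
    """Return (sensitivity, positive predictivity, detection error rate) in percent.

    tp: detected beats that match an annotated beat
    fn: annotated beats that were missed
    fp: detections with no matching annotated beat
    """
    sensitivity = 100.0 * tp / (tp + fn)
    positive_predictivity = 100.0 * tp / (tp + fp)
    # Assumed convention: all errors relative to the number of annotated beats.
    detection_error_rate = 100.0 * (fp + fn) / (tp + fn)
    return sensitivity, positive_predictivity, detection_error_rate


# Hypothetical counts, not taken from the MIT-BIH evaluation.
sen, ppv, der = detection_metrics(tp=990, fn=10, fp=5)
```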


Subject(s)
Algorithms , Computer-Assisted Signal Processing , Cardiac Arrhythmias/diagnostic imaging , Factual Databases , Electrocardiography , Humans
7.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 332-336, 2020 07.
Article in English | MEDLINE | ID: mdl-33017996

ABSTRACT

The electrocardiogram (ECG) signal is the most commonly used non-invasive tool in the assessment of cardiovascular diseases. Segmentation of the ECG signal to locate its constitutive waves, in particular the R-peaks, is a key step in ECG processing and analysis. Over the years, several segmentation and QRS complex detection algorithms have been proposed with different features; however, their performance depends heavily on preprocessing steps, which makes them unreliable for real-time data analysis in ambulatory care settings and remote monitoring systems, where the collected data are highly noisy. Moreover, some issues remain with the current algorithms regarding the diverse morphological categories of the ECG signal and their high computation cost. In this paper, we introduce a novel graph-based optimal changepoint detection (GCCD) method for reliable detection of R-peak positions without employing any preprocessing step. The proposed model is guaranteed to compute the globally optimal changepoint detection solution. It is also generic in nature and can be applied to other time-series biomedical signals. Based on the MIT-BIH arrhythmia (MIT-BIH-AR) database, the proposed method achieves overall sensitivity Sen = 99.76, positive predictivity PPR = 99.68, and detection error rate DER = 0.55, which are comparable to other state-of-the-art approaches.


Subject(s)
Electrocardiography , Computer-Assisted Signal Processing , Algorithms , Cardiac Arrhythmias/diagnosis , Factual Databases , Humans
8.
Biostatistics ; 21(4): 709-726, 2020 10 01.
Article in English | MEDLINE | ID: mdl-30753436

ABSTRACT

Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously. However, determining the exact moment in time at which a neuron spikes, from a calcium imaging data set, amounts to a non-trivial deconvolution problem which is of critical importance for downstream analyses. While a number of formulations have been proposed for this task in the recent literature, in this article, we focus on a formulation recently proposed in Jewell and Witten (2018. Exact spike train inference via $\ell_0$ optimization. The Annals of Applied Statistics 12(4), 2457-2482) that can accurately estimate not just the spike rate, but also the specific times at which the neuron spikes. We develop a much faster algorithm that can be used to deconvolve a fluorescence trace of 100 000 timesteps in less than a second. Furthermore, we present a modification to this algorithm that precludes the possibility of a "negative spike". We demonstrate the performance of this algorithm for spike deconvolution on calcium imaging datasets that were recently released as part of the spikefinder challenge (http://spikefinder.codeneuro.org/). The algorithm presented in this article was used in the Allen Institute for Brain Science's "platform paper" to decode neural activity from the Allen Brain Observatory; this is the main scientific paper in which their data resource is presented. Our C++ implementation, along with R and python wrappers, is publicly available. R code is available on CRAN and Github, and python wrappers are available on Github; see https://github.com/jewellsean/FastLZeroSpikeInference.
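
The abstract does not restate the optimization problem it refers to. As a hedged sketch, assuming the usual first-order autoregressive model of calcium decay (the symbols $y_t$, $c_t$, $s_t$, $\gamma$ and $\lambda$ are notation introduced here, not taken from the abstract), the $\ell_0$ problem is commonly written as:

```latex
\min_{c_1,\dots,c_T,\; s_2,\dots,s_T}\;
\frac{1}{2}\sum_{t=1}^{T}\left(y_t - c_t\right)^2
\;+\; \lambda \sum_{t=2}^{T} \mathbf{1}\left(s_t \neq 0\right)
\quad \text{subject to} \quad s_t = c_t - \gamma\, c_{t-1}
```

Here $y_t$ is the observed fluorescence, $c_t$ the unobserved calcium concentration, $\gamma \in (0,1)$ the decay rate, and $\lambda \ge 0$ a penalty that controls how many spikes are estimated. Under this reading, ruling out a "negative spike" would correspond to adding the constraint $s_t \ge 0$.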


Subject(s)
Calcium , Neurons , Algorithms , Brain/diagnostic imaging , Diagnostic Imaging , Humans
9.
Pac Symp Biocomput ; 25: 367-378, 2020.
Article in English | MEDLINE | ID: mdl-31797611

ABSTRACT

Joint peak detection is a central problem when comparing samples in epigenomic data analysis, but current algorithms for this task are unsupervised and limited to at most 2 sample types. We propose PeakSegPipeline, a new genome-wide multi-sample peak calling pipeline for epigenomic data sets. It performs peak detection using a constrained maximum likelihood segmentation model with essentially only one free parameter that needs to be tuned: the number of peaks. To select the number of peaks, we propose to learn a penalty function based on user-provided labels that indicate genomic regions with or without peaks in specific samples. In comparisons with state-of-the-art peak detection algorithms, PeakSegPipeline achieves similar or better accuracy, and a more interpretable model with overlapping peaks that occur in exactly the same positions across all samples. Our novel approach is able to learn that predicted peak sizes vary by experiment type.


Subject(s)
Algorithms , Computational Biology , Genomics , Machine Learning
10.
Sci Data ; 5: 180240, 2018 10 30.
Article in English | MEDLINE | ID: mdl-30375995

ABSTRACT

Neuroblastoma, a pediatric tumor of the sympathetic nervous system, is predominantly driven by copy number aberrations, which predict survival outcome in global neuroblastoma cohorts and in low-risk cases. For high-risk patients there is still a need for better prognostic biomarkers. Via an international collaboration, we collected copy number profiles of 556 high-risk neuroblastomas generated on different array platforms. This manuscript describes the composition of the dataset, the methods used to process the data, including segmentation and aberration calling, and data validation. t-SNE analysis shows that samples cluster according to MYCN status, and shows a difference between array platforms. 97.3% of samples are characterized by the presence of segmental aberrations, in regions frequently affected in neuroblastoma. Focal aberrations affect genes known to be involved in neuroblastoma, such as ALK and LIN28B. To conclude, we compiled a unique large copy number dataset of high-risk neuroblastoma tumors, available via R2 and a Shiny web application. The availability of patient survival data allows further investigation of the prognostic value of copy number aberrations.


Subject(s)
DNA Copy Number Variations , Nervous System Neoplasms/genetics , Neuroblastoma/genetics , Tumor Biomarkers/genetics , Child , Preschool Child , Neoplasm DNA/genetics , Humans
11.
Am J Hum Genet ; 103(4): 474-483, 2018 10 04.
Article in English | MEDLINE | ID: mdl-30220433

ABSTRACT

Advances in high-throughput DNA sequencing have revolutionized the discovery of variants in the human genome; however, interpreting the phenotypic effects of those variants is still a challenge. While several computational approaches to predict variant impact are available, their accuracy is limited and further improvement is needed. Here, we introduce ClinPred, an efficient tool for identifying disease-relevant nonsynonymous variants. Our predictor incorporates two machine learning algorithms that use existing pathogenicity scores and, notably, benefits from inclusion of normal population allele frequency from the gnomAD database as an input feature. Another major strength of our approach is the use of ClinVar, a rapidly growing database that allows selection of confidently annotated disease-causing variants, as a training set. Compared to other methods, ClinPred showed superior accuracy for predicting pathogenicity, achieving the highest area under the curve (AUC) score and increasing both the specificity and sensitivity in different test datasets. It also obtained the best performance according to various other metrics. Moreover, ClinPred performance remained robust with respect to disease type (cancer or rare disease) and mechanism (gain or loss of function). Importantly, we observed that adding allele frequency as a predictive feature, as opposed to setting fixed allele frequency cutoffs, boosts the performance of prediction. We provide pre-computed ClinPred scores for all possible human missense variants in the exome to facilitate its use by the community.
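
The AUC used above to compare predictors can be computed without plotting a ROC curve: it equals the fraction of (pathogenic, benign) pairs in which the pathogenic variant receives the higher score, with ties counted as one half. The sketch below uses that identity; the function name, scores and labels are hypothetical and are not ClinPred outputs.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. pathogenic), 0 for the negative class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correctly ordered pair
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: one positive (0.4) ranks below one negative (0.6).
example_auc = auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of examples; a rank-based version is preferable for large test sets, but the value computed is the same.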


Subject(s)
Computational Biology/methods , Disease/genetics , Single Nucleotide Polymorphism/genetics , Algorithms , Area Under Curve , Exome/genetics , Gene Frequency/genetics , Human Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Machine Learning , Software
12.
J Natl Cancer Inst ; 110(10): 1084-1093, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29514301

ABSTRACT

Background: Neuroblastoma is characterized by substantial clinical heterogeneity. Despite intensive treatment, the survival rates of high-risk neuroblastoma patients are still disappointingly low. Somatic chromosomal copy number aberrations have been shown to be associated with patient outcome, particularly in low- and intermediate-risk neuroblastoma patients. To improve outcome prediction in high-risk neuroblastoma, we aimed to design a prognostic classification method based on copy number aberrations. Methods: In an international collaboration, normalized high-resolution DNA copy number data (arrayCGH and SNP arrays) from 556 high-risk neuroblastomas obtained at diagnosis were collected from nine collaborative groups and segmented using the same method. We applied logistic and Cox proportional hazard regression to identify genomic aberrations associated with poor outcome. Results: In this study, we identified two types of copy number aberrations that are associated with extremely poor outcome. Distal 6q losses were detected in 5.9% of patients and were associated with a 10-year survival probability of only 3.4% (95% confidence interval [CI] = 0.5% to 23.3%, two-sided P = .002). Amplifications of regions not encompassing the MYCN locus were detected in 18.1% of patients and were associated with a 10-year survival probability of only 5.8% (95% CI = 1.5% to 22.2%, two-sided P < .001). Conclusions: Using a unique large copy number data set of high-risk neuroblastoma cases, we identified a small subset of high-risk neuroblastoma patients with extremely low survival probability that might be eligible for inclusion in clinical trials of new therapeutics. The amplicons may also nominate alternative treatments that target the amplified genes.


Subject(s)
Chromosome Deletion , Human Chromosome Pair 6 , Gene Amplification , Genomics , Neuroblastoma/genetics , Neuroblastoma/mortality , Tumor Biomarkers , Preschool Child , DNA Copy Number Variations , Genetic Association Studies , Genetic Predisposition to Disease , Genomics/methods , Humans , Infant , N-Myc Proto-Oncogene Protein/genetics , Neoplasm Staging , Neuroblastoma/pathology , Neuroblastoma/therapy , Prognosis
13.
Stat Comput ; 27(2): 519-533, 2017.
Article in English | MEDLINE | ID: mdl-32355427

ABSTRACT

Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. The standard implementations of these dynamic programming methods have a computational cost that scales at least quadratically in the length of the time-series. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms, whilst still being guaranteed to be optimal, in that they find the true minimum of the cost function. Here we extend these pruning methods, and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is even competitive with that of binary segmentation, but can give much more accurate segmentations.
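
To make the "cost over segmentations" concrete, the sketch below implements the standard, unpruned dynamic program with quadratic running time that the abstract takes as its baseline. It is not FPOP or SNIP. It assumes a squared-error segment cost and a constant penalty per changepoint, and the function name and example data are hypothetical.

```python
def optimal_partitioning(y, penalty):
    """Exact penalised segmentation by dynamic programming, O(n^2) time.

    Returns (changepoints, total_cost), where each changepoint is the
    index at which a new segment starts.
    """
    n = len(y)
    # Prefix sums give the squared-error cost of any segment in O(1).
    sum1, sum2 = [0.0], [0.0]
    for value in y:
        sum1.append(sum1[-1] + value)
        sum2.append(sum2[-1] + value * value)

    def segment_cost(i, j):
        """Squared error of y[i:j] around its own mean."""
        total = sum1[j] - sum1[i]
        return (sum2[j] - sum2[i]) - total * total / (j - i)

    best = [0.0] * (n + 1)
    best[0] = -penalty        # so that the first segment is not penalised
    last_start = [0] * (n + 1)
    for t in range(1, n + 1):
        # Try every possible start s of the final segment y[s:t].
        best[t], last_start[t] = min(
            (best[s] + segment_cost(s, t) + penalty, s) for s in range(t)
        )

    # Walk back through the stored segment starts.
    changepoints = []
    t = n
    while t > 0:
        s = last_start[t]
        if s > 0:
            changepoints.append(s)
        t = s
    return sorted(changepoints), best[n]


# Hypothetical data with one obvious change in mean at index 5.
changepoints, total_cost = optimal_partitioning([0.0] * 5 + [10.0] * 5, penalty=1.0)
```

The inner minimum over every earlier index `s` is what makes this version quadratic; pruning methods aim to skip candidates that can never be optimal while returning the same segmentation.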

14.
Bioinformatics ; 33(4): 491-499, 2017 02 15.
Article in English | MEDLINE | ID: mdl-27797775

ABSTRACT

Motivation: Many peak detection algorithms have been proposed for ChIP-seq data analysis, but it is not obvious which algorithm and what parameters are optimal for any given dataset. In contrast, regions with and without obvious peaks can be easily labeled by visual inspection of aligned read counts in a genome browser. We propose a supervised machine learning approach for ChIP-seq data analysis, using labels that encode qualitative judgments about which genomic regions contain or do not contain peaks. The main idea is to manually label a small subset of the genome, and then learn a model that makes consistent peak predictions on the rest of the genome. Results: We created 7 new histone mark datasets with 12 826 visually determined labels, and analyzed 3 existing transcription factor datasets. We observed that default peak detection parameters yield high false positive rates, which can be reduced by learning parameters using a relatively small training set of labeled data from the same experiment type. We also observed that labels from different people are highly consistent. Overall, these data indicate that our supervised labeling method is useful for quantitatively training and testing peak detection algorithms. Availability and Implementation: Labeled histone mark data http://cbio.ensmp.fr/~thocking/chip-seq-chunk-db/ , R package to compute the label error of predicted peaks https://github.com/tdhock/PeakError. Contacts: toby.hocking@mail.mcgill.ca or guil.bourque@mcgill.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
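
The idea of scoring predicted peaks against visually assigned labels can be sketched as follows. This is a simplified illustration and not the PeakError package: it assumes only two label kinds, given here the illustrative names "peaks" (at least one predicted peak should overlap the region) and "noPeaks" (none should), and all names and coordinates are hypothetical.

```python
def label_errors(peaks, labels):
    """Count labels that the predicted peaks disagree with.

    peaks:  list of (start, end) predicted peak intervals
    labels: list of (start, end, kind), kind in {"peaks", "noPeaks"}
    Returns (false_positives, false_negatives).
    """
    false_positives = 0
    false_negatives = 0
    for label_start, label_end, kind in labels:
        # Half-open interval overlap test against every predicted peak.
        overlaps = any(
            peak_start < label_end and label_start < peak_end
            for peak_start, peak_end in peaks
        )
        if kind == "noPeaks" and overlaps:
            false_positives += 1
        elif kind == "peaks" and not overlaps:
            false_negatives += 1
    return false_positives, false_negatives


# Hypothetical predictions and labels on one chromosome.
predicted = [(100, 200), (500, 600)]
labelled = [
    (50, 150, "peaks"),      # satisfied by the first peak
    (250, 400, "noPeaks"),   # satisfied: nothing predicted here
    (550, 700, "noPeaks"),   # false positive: second peak overlaps
    (800, 900, "peaks"),     # false negative: no peak predicted
]
errors = label_errors(predicted, labelled)
```

Summing these errors over the labelled regions for each candidate parameter value, and keeping the value with the fewest errors, is one way to read the parameter learning described above.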


Subject(s)
Chromatin Immunoprecipitation/methods , DNA Sequence Analysis/methods , Software , Supervised Machine Learning , Animals , Genomics/methods , Humans
15.
Clin Cancer Res ; 22(22): 5564-5573, 2016 Nov 15.
Article in English | MEDLINE | ID: mdl-27440268

ABSTRACT

PURPOSE: The tumor genomic copy number profile is of prognostic significance in neuroblastoma patients. We have studied the genomic copy number profile of cell-free DNA (cfDNA) and compared this with primary tumor arrayCGH (aCGH) at diagnosis. EXPERIMENTAL DESIGN: In 70 patients, cfDNA genomic copy number profiling was performed using the OncoScan platform. The profiles were classified according to the overall pattern, including numerical chromosome alterations (NCA), segmental chromosome alterations (SCA), and MYCN amplification (MNA). RESULTS: Interpretable and dynamic cfDNA profiles were obtained in 66 of 70 and 52 of 70 cases, respectively. An overall identical genomic profile between tumor aCGH and cfDNA was observed in 47 cases (3 NCAs, 22 SCAs, 22 MNAs). In one case, cfDNA showed an additional SCA not detected by tumor aCGH. In 4 of 8 cases with a silent tumor aCGH profile, cfDNA analysis revealed a dynamic profile (3 SCAs, 1 NCA). In 14 cases, cfDNA analysis did not reveal any copy number changes. A total of 378 breakpoints common to the primary tumor and cfDNA of any given patient were identified, 27 breakpoints were seen by tumor aCGH, and 54 breakpoints were seen in cfDNA only, including two cases with interstitial IGFR1 gains and two alterations targeting TERT. CONCLUSIONS: These results demonstrate the feasibility of cfDNA copy number profiling in neuroblastoma patients, with a concordance of the overall genomic profile in aCGH and cfDNA dynamic cases of 97% and a sensitivity of 77%, respectively. Furthermore, neuroblastoma heterogeneity is highlighted, suggesting that cfDNA might reflect genetic alterations of more aggressive cell clones. Clin Cancer Res; 22(22); 5564-73. ©2016 AACR. See related commentary by Janku and Kurzrock, p. 5400.


Subject(s)
Circulating Tumor DNA/genetics , Gene Dosage/genetics , Neuroblastoma/blood , Neuroblastoma/genetics , Adolescent , Child , Preschool Child , Chromosome Aberrations , Comparative Genomic Hybridization/methods , Female , Gene Amplification/genetics , Genomics/methods , Humans , Infant , Male , Oligonucleotide Array Sequence Analysis/methods , Prognosis , Prospective Studies
16.
Cancer Sci ; 105(7): 897-904, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24815991

ABSTRACT

Clonal heterogeneity in lymphoid malignancies has been recently reported in adult T-cell lymphoma/leukemia, peripheral T-cell lymphoma, not otherwise specified, and mantle cell lymphoma. Our analysis was extended to other types of lymphoma including marginal zone lymphoma, follicular lymphoma and diffuse large B-cell lymphoma. To determine the presence of clonal heterogeneity, 332 cases were examined using array comparative genomic hybridization analysis. Results showed that incidence of clonal heterogeneity varied from 25% to 69% among different types of lymphoma. Survival analysis revealed that mantle cell lymphoma and diffuse large B-cell lymphoma with clonal heterogeneity showed significantly poorer prognosis, and that clonal heterogeneity was confirmed as an independent predictor of poor prognosis for both types of lymphoma. Interestingly, 8q24.1 (MYC) gain, 9p21.3 (CDKN2A/2B) loss and 17p13 (TP53, ATP1B2, SAT2, SHBG) loss were recurrent genomic lesions among various types of lymphoma with clonal heterogeneity, suggesting at least in part that alterations of these genes may play a role in clonal heterogeneity.


Subject(s)
Lymphoma/genetics , Lymphoma/mortality , Lymphoma/pathology , Burkitt Lymphoma/genetics , Burkitt Lymphoma/mortality , Burkitt Lymphoma/pathology , Human Chromosome Pair 8 , Comparative Genomic Hybridization , Cyclin-Dependent Kinase Inhibitor p16/genetics , Gene Deletion , Gene Dosage , Humans , Marginal Zone B-Cell Lymphoma/genetics , Marginal Zone B-Cell Lymphoma/mortality , Marginal Zone B-Cell Lymphoma/pathology , Follicular Lymphoma/genetics , Follicular Lymphoma/mortality , Follicular Lymphoma/pathology , Diffuse Large B-Cell Lymphoma/genetics , Diffuse Large B-Cell Lymphoma/mortality , Diffuse Large B-Cell Lymphoma/pathology , Prognosis
17.
Bioinformatics ; 30(11): 1539-46, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24493034

ABSTRACT

MOTIVATION: DNA copy number profiles characterize regions of chromosome gains, losses and breakpoints in tumor genomes. Although many models have been proposed to detect these alterations, it is not clear which model is appropriate before visual inspection of the signal, noise and models for a particular profile. RESULTS: We propose SegAnnDB, a Web-based computer vision system for genomic segmentation: first, visually inspect the profiles and manually annotate altered regions, then SegAnnDB determines the precise alteration locations using a mathematical model of the data and annotations. SegAnnDB facilitates collaboration between biologists and bioinformaticians, and uses the University of California, Santa Cruz genome browser to visualize copy number alterations alongside known genes. AVAILABILITY AND IMPLEMENTATION: The breakpoints project on INRIA GForge hosts the source code, an Amazon Machine Image can be launched and a demonstration Web site is http://bioviz.rocq.inria.fr.


Subject(s)
DNA Copy Number Variations , Software , Algorithms , Chromosome Breakpoints , Genomics/methods , Internet
18.
BMC Bioinformatics ; 14: 164, 2013 May 22.
Article in English | MEDLINE | ID: mdl-23697330

ABSTRACT

BACKGROUND: Many models have been proposed to detect copy number alterations in chromosomal copy number profiles, but it is usually not obvious which is most effective for a given data set. Furthermore, most methods have a smoothing parameter that determines the number of breakpoints and must be chosen using various heuristics. RESULTS: We present three contributions for copy number profile smoothing model selection. First, we propose to select the model and degree of smoothness that maximizes agreement with visual breakpoint region annotations. Second, we develop cross-validation procedures to estimate the error of the trained models. Third, we apply these methods to compare 17 smoothing models on a new database of 575 annotated neuroblastoma copy number profiles, which we make available as a public benchmark for testing new algorithms. CONCLUSIONS: Whereas previous studies have been qualitative or limited to simulated data, our annotation-guided approach is quantitative and suggests which algorithms are fastest and most accurate in practice on real data. In the neuroblastoma data, the equivalent pelt.n and cghseg.k methods were the best breakpoint detectors, and exhibited reasonable computation times.


Subject(s)
Chromosome Breakpoints , Gene Dosage/genetics , Gene Expression Profiling/methods , Genetic Models , Algorithms , Chromosome Mapping/methods , Humans
19.
PLoS One ; 5(8): e11913, 2010 Aug 02.
Article in English | MEDLINE | ID: mdl-20689851

ABSTRACT

BACKGROUND: The recent advent of high-throughput SNP genotyping technologies has opened new avenues of research for population genetics. In particular, a growing interest in the identification of footprints of selection, based on genome scans for adaptive differentiation, has emerged. METHODOLOGY/PRINCIPAL FINDINGS: The purpose of this study is to develop an efficient model-based approach to perform Bayesian exploratory analyses for adaptive differentiation in very large SNP data sets. The basic idea is to start with a very simple model for neutral loci that is easy to implement under a Bayesian framework and to identify selected loci as outliers via Posterior Predictive P-values (PPP-values). Applications of this strategy are considered using two different statistical models. The first one was initially interpreted in the context of populations evolving respectively under pure genetic drift from a common ancestral population while the second one relies on populations under migration-drift equilibrium. Robustness and power of the two resulting Bayesian model-based approaches to detect SNPs under selection are further evaluated through extensive simulations. An application to a cattle data set is also provided. CONCLUSIONS/SIGNIFICANCE: The procedure described turns out to be much faster than former Bayesian approaches and also reasonably efficient, especially for detecting loci under positive selection.


Subject(s)
Genetic Databases , Genomics/methods , Single Nucleotide Polymorphism , Genetic Selection , Physiological Adaptation , Animals , Bayes Theorem , Cattle , Gene Deletion , Genetic Loci/genetics , Genotype , Likelihood Functions , Genetic Models , Reproducibility of Results
20.
Nat Biotechnol ; 26(6): 702-8, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18500334

ABSTRACT

We describe the use of zinc-finger nucleases (ZFNs) for somatic and germline disruption of genes in zebrafish (Danio rerio), in which targeted mutagenesis was previously intractable. ZFNs induce a targeted double-strand break in the genome that is repaired to generate small insertions and deletions. We designed ZFNs targeting the zebrafish golden and no tail/Brachyury (ntl) genes and developed a budding yeast-based assay to identify the most active ZFNs for use in vivo. Injection of ZFN-encoding mRNA into one-cell embryos yielded a high percentage of animals carrying distinct mutations at the ZFN-specified position and exhibiting expected loss-of-function phenotypes. Over half the ZFN mRNA-injected founder animals transmitted disrupted ntl alleles at frequencies averaging 20%. The frequency and precision of gene-disruption events observed suggest that this approach should be applicable to any loci in zebrafish or in other organisms that allow mRNA delivery into the fertilized egg.


Subject(s)
Genetically Modified Animals/physiology , Gene Targeting/methods , Genetic Engineering/methods , Site-Directed Mutagenesis/methods , Zebrafish Proteins/genetics , Zebrafish/genetics , Zinc Fingers/genetics , Animals , Deoxyribonucleases/genetics , Protein Engineering/methods