1.
Bioinformatics ; 36(13): 4047-4057, 2020 07 01.
Article in English | MEDLINE | ID: mdl-31860066

ABSTRACT

MOTIVATION: The advent of in vivo automated techniques for single-cell lineaging, sequencing and analysis of gene expression has begun to dramatically increase our understanding of organismal development. We applied novel meta-analysis and visualization techniques to the EPIC single-cell-resolution developmental gene expression dataset for Caenorhabditis elegans from Bao, Murray, Waterston et al. to gain insights into regulatory mechanisms governing the timing of development. RESULTS: Our meta-analysis of the EPIC dataset revealed that a simple linear combination of the expression levels of the developmental genes is strongly correlated with the developmental age of the organism, irrespective of the cell division rate of different cell lineages. We uncovered a pattern of collective sinusoidal oscillation in gene activation, in multiple dominant frequencies and in multiple orthogonal axes of gene expression, pointing to the existence of a coordinated, multi-frequency global timing mechanism. We developed a novel method based on Fisher's Discriminant Analysis to identify gene expression weightings that maximally separate traits of interest, and found that remarkably, simple linear gene expression weightings are capable of producing sinusoidal oscillations of any frequency and phase, adding to the growing body of evidence that oscillatory mechanisms likely play an important role in the timing of development. We cross-linked EPIC with gene ontology and anatomy ontology terms, employing Fisher's Discriminant Analysis methods to identify previously unknown positive and negative genetic contributions to developmental processes and cell phenotypes. This meta-analysis demonstrates new evidence for direct linear and/or sinusoidal mechanisms regulating the timing of development. We uncovered a number of previously unknown positive and negative correlations between developmental genes and developmental processes or cell phenotypes. 
Our results highlight both the continued relevance of the EPIC technique, and the value of meta-analysis of previously published results. The presented analysis and visualization techniques are broadly applicable across developmental and systems biology. AVAILABILITY AND IMPLEMENTATION: Analysis software available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
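
The abstract's use of Fisher's Discriminant Analysis to find expression weightings that separate a trait can be illustrated with a minimal two-class sketch. This is not the authors' software (which is available only on request). The two-gene toy data and all function names below are hypothetical, and the code shows only the textbook Fisher direction w = S_W^-1 (m_b - m_a) for two features.

```python
# Minimal sketch of Fisher's linear discriminant for two classes and two
# features (e.g. expression levels of two hypothetical genes per cell).

def mean_vector(rows):
    """Component-wise mean of a list of (x, y) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter_matrix(rows, mean):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_weights(class_a, class_b):
    """Weights w = S_W^-1 (m_b - m_a) that maximally separate the two classes."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, ma), scatter_matrix(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, row):
    """Linear combination of the two features using the weights w."""
    return w[0] * row[0] + w[1] * row[1]

# Hypothetical toy data: cells without (a) and with (b) a trait of interest.
cells_a = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
cells_b = [(4.0, 5.0), (5.0, 4.0), (4.5, 5.5), (5.5, 4.5)]

weights = fisher_weights(cells_a, cells_b)
scores_a = [project(weights, c) for c in cells_a]
scores_b = [project(weights, c) for c in cells_b]
```

On this toy data every projected score in `scores_b` lies above every score in `scores_a`, which is the kind of separation such a weighting is meant to produce.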


Subject(s)
Caenorhabditis elegans Proteins, Caenorhabditis elegans, Animals, Caenorhabditis elegans/genetics, Caenorhabditis elegans/metabolism, Caenorhabditis elegans Proteins/genetics, Caenorhabditis elegans Proteins/metabolism, Cell Lineage, Gene Expression Regulation, Developmental, Transcriptional Activation
2.
J Chem Theory Comput ; 13(11): 5255-5264, 2017 Nov 14.
Article in English | MEDLINE | ID: mdl-28926232

ABSTRACT

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of 13 electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to ∼118k distinct molecules. Molecular structures and properties at the hybrid density functional theory (DFT) level of theory come from the QM9 database [Ramakrishnan et al. Sci. Data 2014, 1, 140022] and include enthalpies and free energies of atomization, HOMO/LUMO energies and gap, dipole moment, polarizability, zero point vibrational energy, heat capacity, and the highest fundamental vibrational frequency. Various molecular representations have been studied (Coulomb matrix, bag of bonds, BAML and ECFP4, molecular graphs (MG)), as well as newly developed distribution based variants including histograms of distances (HD), angles (HDA/MARAD), and dihedrals (HDAD). Regressors include linear models (Bayesian ridge regression (BR) and linear regression with elastic net regularization (EN)), random forest (RF), kernel ridge regression (KRR), and two types of neural networks, graph convolutions (GC) and gated graph networks (GG). Out-of-sample errors depend strongly on the choice of representation, regressor, and molecular property. Electronic properties are typically best accounted for by MG and GC, while energetic properties are better described by HDAD and KRR. The specific combinations with the lowest out-of-sample errors in the ∼118k training set size limit are (free) energies and enthalpies of atomization (HDAD/KRR), HOMO/LUMO eigenvalue and gap (MG/GC), dipole moment (MG/GC), static polarizability (MG/GG), zero point vibrational energy (HDAD/KRR), heat capacity at room temperature (HDAD/KRR), and highest fundamental vibrational frequency (BAML/RF).
We present numerical evidence that ML model predictions deviate from DFT (B3LYP) less than DFT (B3LYP) deviates from experiment for all properties. Furthermore, out-of-sample prediction errors with respect to hybrid DFT reference are on par with, or close to, chemical accuracy. The results suggest that ML models could be more accurate than hybrid DFT if explicitly electron correlated quantum (or experimental) data were available.
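
The learning curves described above (out-of-sample error as a function of training set size) can be sketched with a toy kernel ridge regression. This is an illustration under loud assumptions: the target is a one-dimensional sine function rather than a QM9 property, the "representation" is a single scalar, and the kernel width, regularization strength, and all function names are hypothetical choices, not the study's settings.

```python
import math
import random

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def krr_fit(xs, ys, sigma, lam):
    """Regression coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(xs)
    k = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(k, ys)

def krr_predict(x_new, xs, alphas, sigma):
    """Prediction as a kernel-weighted sum over the training points."""
    return sum(a * gaussian_kernel(x_new, x, sigma) for a, x in zip(alphas, xs))

def learning_curve(train_sizes, sigma=0.5, lam=1e-4, seed=0):
    """Return (training set size, out-of-sample mean absolute error) pairs."""
    rng = random.Random(seed)
    target = math.sin  # toy stand-in for a molecular property
    test_xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
    curve = []
    for n in train_sizes:
        train_xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
        alphas = krr_fit(train_xs, [target(x) for x in train_xs], sigma, lam)
        mae = sum(abs(krr_predict(x, train_xs, alphas, sigma) - target(x))
                  for x in test_xs) / len(test_xs)
        curve.append((n, mae))
    return curve

curve = learning_curve([4, 16, 64])
```

Each entry of `curve` pairs a training set size with the error on held-out points. Plotting such pairs on log-log axes gives the kind of learning curve the abstract uses to compare regressor/representation combinations.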

3.
Hum Genomics ; 2(4): 212-35, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16460647

ABSTRACT

The ability to infer personal genetic ancestry is being increasingly utilised in certain medical and forensic situations. Herein, the unsupervised Bayesian clustering algorithm structure is employed to analyse 377 autosomal short tandem repeats typed on 1,056 individuals from the Centre d'Etude du Polymorphisme Humain Human Diversity Panel. Individuals of known geographical origin were hierarchically classified into a framework of increasingly homogeneous clusters to serve as reference populations into which individuals of unknown ancestry can be assigned. The groupings were characterised by the geographical affinities of cluster members and the accuracy of these procedures was verified using several genetic indices. Fine-scale substructure was detectable beyond the broad population level classifications that previously have been explored in this dataset. Metrics indicated that within certain lines, the strongest structuring signals were detected at the leaves of the hierarchy where lineage-specific groupings were identified. The accuracy of unknown assignment was assessed at each level of the hierarchy using a 'leave one out' strategy in which each individual was stripped of cluster membership and then re-assigned using the supervised Bayesian clustering algorithm implemented in GeneClass2. Although most clusters at all levels of resolution experienced highly accurate assignment, a decline was observed in the finer levels due to the mixed membership characteristics of some individuals. The parameters defined by this study allowed for assignment of unknown individuals to genetically defined clusters with measured likelihood. Shared ancestry data can then be inferred for the unknown individual.
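
The 'leave one out' strategy described above can be sketched as follows. Note the substitution: the study re-assigns individuals with the Bayesian method in GeneClass2, whereas this sketch uses a plain nearest-centroid rule as a stand-in. The two-dimensional feature vectors, cluster labels, and function names are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def leave_one_out_accuracy(labelled):
    """Strip each individual of its label, re-assign it, and score the result."""
    correct = 0
    for i, (features, label) in enumerate(labelled):
        # Reference set: every individual except the one being tested.
        reference = labelled[:i] + labelled[i + 1:]
        grouped = {}
        for ref_features, ref_label in reference:
            grouped.setdefault(ref_label, []).append(ref_features)
        centroids = {lab: centroid(pts) for lab, pts in grouped.items()}
        # Stand-in assignment rule: nearest cluster centroid.
        predicted = min(centroids,
                        key=lambda lab: squared_distance(features, centroids[lab]))
        if predicted == label:
            correct += 1
    return correct / len(labelled)

# Hypothetical individuals: (feature vector, reference cluster).
individuals = [
    ((0.10, 0.20), "cluster_A"), ((0.20, 0.10), "cluster_A"), ((0.15, 0.15), "cluster_A"),
    ((0.80, 0.90), "cluster_B"), ((0.90, 0.80), "cluster_B"), ((0.85, 0.85), "cluster_B"),
]

accuracy = leave_one_out_accuracy(individuals)
```

Because the held-out individual never contributes to its own cluster's centroid, the resulting accuracy estimates how well a genuinely unknown individual would be assigned.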


Subject(s)
Genetics, Population/methods, Population Groups/genetics, Algorithms, Bayes Theorem, Genetics, Medical, Geography, Humans, Reference Values, Repetitive Sequences, Nucleic Acid
4.
Cleft Palate Craniofac J ; 42(5): 539-47, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16149837

ABSTRACT

OBJECTIVES: The aims of the study were: (1) to develop a technique to quantify plagiocephaly that is safe, accurate, objective, easy to use, well tolerated, and inexpensive; and (2) to compare this method with tracings from a flexicurve ruler. DESIGN: A case-control study of 31 case infants recruited from outpatient plagiocephaly clinics and 29 control infants recruited from other pediatric outpatient clinics. PARTICIPANTS: Infants in the study had been diagnosed with nonsynostotic plagiocephaly or brachycephaly and were between 2 and 12 months old. INTERVENTIONS: Infants' head shapes were measured using (a) digital photographs of a head circumference band and (b) a flexicurve ruler. Flexicurve tracings were scanned, and both the digital photos and the scanned flexicurve tracings were analyzed using a custom-written computer program. MAIN OUTCOME MEASURES: The oblique cranial length ratio was used to quantify cranial asymmetry, and the cephalic index was used to quantify the degree of brachycephaly. RESULTS: The infants tolerated the photo technique better than the flexicurve. Also, mothers preferred the photo technique. There was less within-subject variance for the photos than for the flexicurve measurements. The results suggested that an oblique cranial length ratio of ≥ 106% can define plagiocephaly and that a cephalic index of ≥ 93% can define brachycephaly. CONCLUSIONS: The photographic technique was better accepted and more repeatable than the flexicurve measuring system. We propose that "normal" head shape is indicated in infants with both an oblique cranial length ratio of less than 106% and a cephalic index of less than 93%.
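
The proposed cut-offs can be expressed as a small classification rule. This is only a sketch of the thresholds stated in the abstract, not the authors' custom-written program. The function and constant names are hypothetical, and both measures are assumed to be supplied already computed, as percentages.

```python
# Thresholds taken from the abstract; names are illustrative assumptions.
PLAGIOCEPHALY_OCLR_THRESHOLD = 106.0  # oblique cranial length ratio, percent
BRACHYCEPHALY_CI_THRESHOLD = 93.0     # cephalic index, percent

def classify_head_shape(oclr_percent, cephalic_index_percent):
    """Return the labels implied by the two proposed cut-offs."""
    labels = []
    if oclr_percent >= PLAGIOCEPHALY_OCLR_THRESHOLD:
        labels.append("plagiocephaly")
    if cephalic_index_percent >= BRACHYCEPHALY_CI_THRESHOLD:
        labels.append("brachycephaly")
    # "Normal" requires both measures to fall below their thresholds.
    return labels or ["normal"]

example_normal = classify_head_shape(102.6, 81.6)
example_asymmetric = classify_head_shape(106.0, 90.0)
example_both = classify_head_shape(109.4, 102.6)
```

The comparison is inclusive (`>=`) because the abstract defines each condition at or above its threshold and "normal" as strictly below both.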


Subject(s)
Photography/methods, Skull/abnormalities, Case-Control Studies, Cephalometry/instrumentation, Cephalometry/methods, Cephalometry/statistics & numerical data, Ethnicity, Female, Humans, Image Processing, Computer-Assisted, Infant, Male, Photography/statistics & numerical data, Reproducibility of Results, Skull/pathology, Software, Time Factors
5.
Pediatrics ; 114(4): 970-80, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15466093

ABSTRACT

OBJECTIVES: Although referrals for nonsynostotic plagiocephaly (NSP) have increased in recent years, the prevalence, natural history, and determinants of the condition have been unclear. The objective of this study was to assess the prevalence and natural history of NSP in normal infants in the first 2 years of life and to identify factors that may contribute to the development of NSP. METHODS: Two hundred infants were recruited at birth. At 6 weeks, 4 months, 8 months, 12 months, and 2 years, the head circumference shape was digitally photographed, and head shape was quantified using custom-written software. At each age, infants were classified as cases when the cephalic index was ≥ 93% and/or the oblique cranial length ratio was ≥ 106%. Neck rotation and a range of infant, infant care, socioeconomic, and obstetric factors were assessed. RESULTS: Ninety-six percent of infants were followed to 12 months, and 90.5% were followed to 2 years. Prevalence of plagiocephaly and/or brachycephaly at 6 weeks and 4, 8, 12, and 24 months was 16.0%, 19.7%, 9.2%, 6.8%, and 3.3% respectively. The mean cephalic index by 2 years was 81.6% (range: 72.0%-102.6%); the mean oblique cranial length ratio was 102.6% (range: 100.1%-109.4%). Significant univariate risk factors of NSP at 6 weeks include limited passive neck rotation at birth, preferential head orientation, supine sleep position, and head position not varied when put to sleep. At 4 months, risk factors were male gender, firstborn, limited passive neck rotation at birth, limited active head rotation at 4 months, supine sleeping at birth and 6 weeks, lower activity level, and trying unsuccessfully to vary the head position when putting the infant down to sleep. CONCLUSIONS: There is a wide range of head shapes in infants, and prevalence of NSP increases to 4 months but diminishes as infants grow older. The majority of cases will have resolved by 2 years of age. Limited head rotation, lower activity levels, and supine sleep position seem to be important determinants.


Subject(s)
Plagiocephaly, Nonsynostotic/epidemiology, Skull/abnormalities, Supine Position, Beds, Child, Preschool, Cohort Studies, Female, Humans, Infant, Logistic Models, Male, Multivariate Analysis, New Zealand/epidemiology, Prevalence, Prospective Studies, Risk Factors, Sex Factors