Results 1 - 12 of 12
1.
J Comput Biol ; 15(1): 1-20, 2008.
Article in English | MEDLINE | ID: mdl-18257674

ABSTRACT

Obtaining and analyzing biological interaction networks is at the core of systems biology. To help understand these complex networks, many recent works have suggested focusing on motifs that occur more frequently than expected at random. To identify such exceptional motifs in a given network, we propose a statistical and analytical method that does not require any simulation. For this, we first provide an analytical expression of the mean and variance of the count under any exchangeable random graph model. Then we approximate the motif count distribution by a compound Poisson distribution whose parameters are derived from the mean and variance of the count. Using simulations, we show that the compound Poisson approximation outperforms the Gaussian approximation. The compound Poisson distribution can then be used to get an approximate p-value and to decide whether an observed count is significantly high. Our methodology is applied to protein-protein interaction (PPI) networks, and statistical issues related to exceptional motif detection are discussed.
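
The moment-matching step described in this abstract can be illustrated with a small sketch. The abstract does not say which compound Poisson family is used, so the code below assumes a geometric-Poisson (Pólya-Aeppli) distribution, in which a Poisson(lam) number of clumps each have a geometric size with parameter a. The function names are hypothetical, and the mean and variance are taken as given inputs rather than computed from a random graph model.

```python
import math


def polya_aeppli_params(mean, var):
    """Match a geometric-Poisson law to a given mean and variance.

    Assumed model: mean = lam / (1 - a), var = lam * (1 + a) / (1 - a)**2.
    Requires var >= mean > 0 (the law cannot be underdispersed).
    """
    if mean <= 0 or var < mean:
        raise ValueError("need var >= mean > 0")
    ratio = var / mean
    a = (ratio - 1.0) / (ratio + 1.0)
    lam = mean * (1.0 - a)
    return lam, a


def polya_aeppli_pmf(n, lam, a):
    """P(N = n): sum over the number k of clumps making up a total count n."""
    if n == 0:
        return math.exp(-lam)
    total = 0.0
    for k in range(1, n + 1):
        # Poisson probability of k clumps, computed in log space.
        poisson_k = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        # Probability that k geometric clump sizes add up to n.
        sizes = math.comb(n - 1, k - 1) * (1.0 - a) ** k * a ** (n - k)
        total += poisson_k * sizes
    return total


def upper_tail_p_value(n_obs, lam, a):
    """Approximate p-value P(N >= n_obs) for an observed motif count."""
    return 1.0 - sum(polya_aeppli_pmf(n, lam, a) for n in range(n_obs))
```

For example, a motif with mean 10 and variance 30 gives `lam = 5` and `a = 0.5`. This direct summation is only meant for moderate counts: very small p-values lose precision in the subtraction from 1, and very large counts would call for a recursive or log-space evaluation.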


Subjects
Computational Biology , Protein Interaction Mapping , Computer Simulation , Escherichia coli/metabolism , Helicobacter pylori/metabolism , Models, Biological , Poisson Distribution , Saccharomyces cerevisiae/metabolism
2.
Animal ; 2(8): 1203-14, 2008 Aug.
Article in English | MEDLINE | ID: mdl-22443733

ABSTRACT

Research in animal sciences, especially nutrition, increasingly requires processing and modeling of databases. In certain areas of research, the number of publications and results per publication is increasing, thus periodically requiring quantitative summarizations of literature data. In such instances, statistical methods dealing with the analysis of summary (literature) data, known as meta-analyses, must be used. The implementation of a meta-analysis is done in several phases. The first phase concerns the definition of the study objectives and the identification of the criteria to be used in the selection of prior publications to be used in the construction of the database. Publications must be scrupulously evaluated before being entered into the database. During this phase, it is important to carefully encode each record with pertinent descriptive attributes (experiments, treatments, etc.) to serve as important reference points for the rest of the analysis. Databases from literature data are inherently unbalanced statistically, leading to considerable analytical and interpretation difficulties; missing data are frequent, and data structures are not the outcomes of a classical experimental system. An initial graphical examination of the data is recommended to enhance a global view as well as to identify specific relationships to be investigated. This phase is followed by a study of the meta-system made up of the database to be interpreted. These steps condition the definition of the applied statistical model. Variance decomposition must account for inter- and intrastudy sources; dependent and independent variables must be identified either as discrete (qualitative) or continuous (quantitative). Effects must be defined as either fixed or random. Often, observations must be weighted to account for differences in the precision of the reported means. Once model parameters are estimated, extensive analyses of residual variations must be performed. The roles of the different treatments and studies in the results obtained must be identified. Often, this requires returning to an earlier step in the process. Thus, meta-analyses have inherent heuristic qualities.
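
The weighting step mentioned in this abstract can be illustrated in its simplest form. The sketch below assumes inverse-variance weights (each reported mean weighted by 1 / SE²) and a single pooled mean with no study effect; the abstract itself describes richer models with fixed and random effects, so this is only a minimal illustration with a hypothetical function name.

```python
def inverse_variance_mean(means, standard_errors):
    """Pool reported means, giving more weight to more precise ones.

    Assumption: weights are 1 / SE**2 and the studies share one true mean.
    Returns the pooled mean and its standard error.
    """
    if len(means) != len(standard_errors) or not means:
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5
    return pooled, pooled_se
```

With two reported means of 10 (SE 1) and 20 (SE 2), the pooled mean is 12 rather than 15, because the first study is four times as heavily weighted.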

3.
Biometrics ; 63(3): 758-66, 2007 Sep.
Article in English | MEDLINE | ID: mdl-17825008

ABSTRACT

Microarray-CGH (comparative genomic hybridization) experiments are used to detect and map chromosomal imbalances. A CGH profile can be viewed as a succession of segments that represent homogeneous regions in the genome whose representative sequences share the same relative copy number on average. Segmentation methods constitute a natural framework for the analysis, but they do not provide a biological status for the detected segments. We propose a new model for this segmentation/clustering problem, combining a segmentation model with a mixture model. We present a new hybrid algorithm called dynamic programming-expectation maximization (DP-EM) to estimate the parameters of the model by maximum likelihood. This algorithm combines DP and the EM algorithm. We also propose a model selection heuristic to select the number of clusters and the number of segments. An example of our procedure is presented, based on publicly available data sets. We compare our method to segmentation methods and to hidden Markov models, and we show that the new segmentation/clustering model is a promising alternative that can be applied in the more general context of signal processing.


Subjects
Chromosome Mapping/methods , Cluster Analysis , Gene Dosage/genetics , Models, Genetic , Pattern Recognition, Automated/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Artificial Intelligence , Computer Simulation , Data Interpretation, Statistical , Databases, Genetic , Information Storage and Retrieval/methods , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods
4.
Eur J Nutr ; 45(2): 79-87, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16003590

ABSTRACT

BACKGROUND: While a relationship between alcohol and cardiovascular risk factors is well established, data suggest that the type of alcoholic beverage could modulate this relationship. AIM OF THE STUDY: To determine whether drinking patterns modulate the relationship between alcohol and cardiovascular risk factors. METHODS: We tested the relationship between preference of alcoholic beverages and atherosclerotic risk factors in a cross-sectional study of 2,126 men. A hierarchical clustering method determined six drinking patterns, 'low drinkers', 'high quality wines', 'beer and cider', 'digestives', 'local wines', and 'table wines', according to the preferential intake of alcoholic beverages. Logistic models estimated the relative risk of abnormal markers in the drinking patterns compared with low drinkers. Unadjusted estimates investigated the relationship with the cluster as a group, while adjustment on alcohol, nutritional and socio-demographic factors investigated the relationship with the preference of alcoholic beverage in itself. RESULTS: Abstainers had high total plasma homocysteine (tHcy), even after full adjustment (odds ratio (OR) = 1.6, 95% confidence interval (CI): 1.0, 2.8). Drinkers of high quality wine had low lipoprotein(a), high tHcy and high body mass index; beer and cider drinkers had high tHcy and waist circumference. Drinkers of digestives had high triacylglycerol; after adjustment they were at risk of low apolipoprotein A-I (OR = 3.1, 95% CI: 1.2, 7.3), and high tHcy (OR = 4.9, 95% CI: 1.2, 33.3). Local wine drinkers were similar to low drinkers. Table wine drinkers had high apolipoprotein B, high triacylglycerol, and high waist-to-hip ratio. CONCLUSION: Our data suggest that preference of alcoholic beverage could indicate groups at specific risks of atherosclerotic disease.


Subjects
Alcohol Drinking , Alcoholic Beverages/classification , Atherosclerosis/epidemiology , Homocysteine/blood , Lipids/blood , Alcohol Drinking/adverse effects , Alcoholic Beverages/analysis , Atherosclerosis/blood , Beer , Body Mass Index , Cluster Analysis , Confidence Intervals , Cross-Sectional Studies , Dose-Response Relationship, Drug , France/epidemiology , Humans , Logistic Models , Male , Middle Aged , Odds Ratio , Risk Factors , Wine
6.
Prev Vet Med ; 68(2-4): 91-102, 2005 May 10.
Article in English | MEDLINE | ID: mdl-15820109

ABSTRACT

A prospective study was carried out on 92 randomly selected beef herds in the Midi-Pyrenees region in France. The objective was to determine factors associated with time to neonatal gastroenteritis. By taking into account the "intra-herd" correlation in failure time (in the semiparametric Cox model), we identified 12 management risk factors associated with hazard of diarrhoea. Some had been identified previously, but "new" risk factors were feeding of corn silage and the incidence of diarrhoea in the last season. We used the two main approaches which are often reviewed: marginal and frailty Cox models. Our results show that these two models give different parameter estimates, so the choice of the model remains crucial.


Subjects
Animal Husbandry/methods , Cattle Diseases/etiology , Diarrhea/veterinary , Models, Biological , Animals , Animals, Newborn , Cattle , Cattle Diseases/epidemiology , Diarrhea/etiology , Female , France/epidemiology , Pregnancy , Proportional Hazards Models , Prospective Studies , Risk Factors
7.
BMC Bioinformatics ; 5: 125, 2004 Sep 06.
Article in English | MEDLINE | ID: mdl-15350197

ABSTRACT

BACKGROUND: Thousands of genes in a genomewide data set are tested against some null hypothesis, for detecting differentially expressed genes in microarray experiments. The expected proportion of false positive genes in a set of genes, called the False Discovery Rate (FDR), has been proposed to measure the statistical significance of this set. Various procedures exist for controlling the FDR. However, the threshold (generally 5%) is arbitrary and a specific measure associated with each gene would be worthwhile. RESULTS: Using process intensity estimation methods, we define and give estimates of the local FDR, which may be considered as the probability for a gene to be a false positive. After a global assessment rule controlling the false positive error, the local FDR is a valuable guideline for deciding whether a gene is differentially expressed. The interest of the method is illustrated on three well known data sets. An R routine for computing local FDR estimates from p-values is available at http://www.inapg.fr/ens_rech/mathinfo/recherche/mathematique/outil.html. CONCLUSIONS: The local FDR associated with each gene measures the probability that it is a false positive. It gives the opportunity to compute the FDR of any given group of clones (of the same gene) or genes pertaining to the same regulation network or the same chromosomic region.
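
As background for the global FDR control that this abstract contrasts with the local FDR, the sketch below shows the Benjamini-Hochberg step-up adjustment, one widely used FDR-controlling procedure. It is not the local FDR estimator of the paper, whose intensity estimation method is not detailed in the abstract, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    A gene is declared significant at FDR level q when its adjusted
    p-value is at most q (for example q = 0.05).
    """
    m = len(p_values)
    # Gene indices sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

The adjusted values describe sets of genes: thresholding them at 5% bounds the expected proportion of false positives in the selected set, but says nothing about an individual gene, which is the gap the local FDR is meant to fill.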


Subjects
Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation, Neoplastic/genetics , Gene Expression Regulation/genetics , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Acute Disease , Animals , Apolipoprotein A-I/genetics , Breast Neoplasms/genetics , Cholesterol, HDL/blood , Cholesterol, HDL/genetics , Data Interpretation, Statistical , Humans , Leukemia, Myeloid/genetics , Mice , Mice, Knockout , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Research Design/statistics & numerical data
8.
J Comput Biol ; 9(6): 761-73, 2002.
Article in English | MEDLINE | ID: mdl-12614545

ABSTRACT

The problem of extracting, from a set of nucleic acid sequences, motifs that may have a biological function is increasingly important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrence of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.


Subjects
Base Sequence , Computational Biology , Transcription, Genetic , Algorithms , Bacillus subtilis/genetics , Computer Simulation , Escherichia coli/genetics , Models, Genetic , Probability , Promoter Regions, Genetic
9.
J Comput Biol ; 8(4): 373-80, 2001.
Article in English | MEDLINE | ID: mdl-11571073

ABSTRACT

Let $X_1, \dots, X_n$ be a sequence of i.i.d. positive or negative integer-valued random variables and $H_n = \max_{0 \le i \le j \le n} (X_i + \dots + X_j)$ be the local score of the sequence. The exact distribution of $H_n$ is obtained using a simple Markov chain. This result is applied to the scoring of DNA and protein sequences in molecular biology.
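
The local score and a Markov chain computation of its distribution can be sketched as follows. The abstract does not spell out the chain, so this sketch assumes a standard construction: the running score U_k = max(0, U_{k-1} + X_k) is a Markov chain, and making level a absorbing turns P(H_n >= a) into the probability of being in state a after n steps. Function names and the dictionary input format are hypothetical.

```python
def local_score(scores):
    """Best segment sum of an integer score sequence, floored at 0."""
    best = current = 0
    for x in scores:
        current = max(0, current + x)  # restart when the sum drops below 0
        best = max(best, current)
    return best


def local_score_tail(score_probs, n, a):
    """P(H_n >= a) for n i.i.d. integer scores.

    score_probs maps each possible score to its probability.
    States 0..a-1 track the running score; state a is absorbing.
    """
    if a <= 0:
        return 1.0
    dist = [0.0] * (a + 1)
    dist[0] = 1.0  # the running score starts at 0
    for _ in range(n):
        new = [0.0] * (a + 1)
        new[a] = dist[a]  # once level a is reached, stay there
        for u in range(a):
            if dist[u] == 0.0:
                continue
            for x, p in score_probs.items():
                v = min(a, max(0, u + x))
                new[v] += dist[u] * p
        dist = new
    return dist[a]
```

For a fair +1/-1 score, `local_score_tail({1: 0.5, -1: 0.5}, 2, 1)` gives 0.75, since the local score of two steps is 0 only when both scores are -1. Each step costs on the order of a times the number of distinct scores, so the computation stays cheap for the thresholds typical of sequence scoring.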


Subjects
Computational Biology , Sequence Analysis, DNA/statistics & numerical data , Sequence Analysis, Protein/statistics & numerical data , DNA-Binding Proteins , Markov Chains , Nuclear Proteins , Sequence Alignment/statistics & numerical data
10.
Genome Biol ; 2(6): RESEARCH0019, 2001.
Article in English | MEDLINE | ID: mdl-11423008

ABSTRACT

BACKGROUND: In global gene expression profiling experiments, variation in the expression of genes of interest can often be hidden by general noise. To determine how biologically significant variation can be distinguished under such conditions we have analyzed the differences in gene expression when Bacillus subtilis is grown either on methionine or on methylthioribose as sulfur source. RESULTS: An unexpected link between arginine metabolism and sulfur metabolism was discovered, enabling us to identify a high-affinity arginine transport system encoded by the yqiXYZ genes. In addition, we tentatively identified a methionine/methionine sulfoxide transport system which is encoded by the operon ytmIJKLMhisP and is presumably used in the degradation of methionine sulfoxide to methane sulfonate for sulfur recycling. Experimental parameters resulting in systematic biases in gene expression were also uncovered. In particular, we found that the late competence operons comE, comF and comG were associated with subtle variations in growth conditions. CONCLUSIONS: Using variance analysis it is possible to distinguish between systematic biases and relevant gene-expression variation in transcriptome experiments. Co-variation of metabolic gene expression pathways was thus uncovered linking nitrogen and sulfur metabolism in B. subtilis.


Subjects
Arginine/metabolism , Bacillus subtilis/genetics , Bacillus subtilis/metabolism , Gene Expression Regulation, Bacterial , Methionine/analogs & derivatives , Methionine/metabolism , Bacillus subtilis/growth & development , Gene Expression Profiling , Genes, Bacterial , Genetic Variation , Membrane Transport Proteins/genetics , Mutation , Oligonucleotide Array Sequence Analysis , Operon , RNA, Bacterial/biosynthesis , Sulfur/metabolism , Thioglycosides/metabolism
11.
Ann Microbiol (Paris) ; 134A(3): 399-409, 1983.
Article in French | MEDLINE | ID: mdl-6638756

ABSTRACT

Yeasts are found to a large extent in cheeses, more particularly in soft cheeses such as Camembert. The proximity between two species previously identified by standard methods was studied using a factorial discriminant analysis on 326 strains. Twenty-three fermentation and assimilation tests (discriminant variables) gave a fairly good discrimination between species. This treatment has allowed us to confirm the present tendencies noticed in yeast classification and has also enabled us to group some of the species.


Subjects
Yeasts/physiology , Analysis of Variance , Cheese , Species Specificity , Yeasts/classification , Yeasts/isolation & purification
12.
Int J Biomed Comput ; 10(6): 507-18, 1979 Nov.
Article in English | MEDLINE | ID: mdl-511387

ABSTRACT

In this paper, an iterative algorithm is proposed for computing estimates of parameters and sums of squares in non-orthogonal multivariate analysis of variance, without inverting any matrix. It is useful in the case of a large design matrix, for it saves memory and computation time. It was first proposed by Stevens (1948) for 3 factors and is here generalised to any number of factors and interactions of any order. Convergence properties are studied. The more orthogonal the design, the faster the convergence. Several examples are provided.


Subjects
Analysis of Variance , Computers , Models, Theoretical