Results 1 - 14 of 14
1.
Sensors (Basel) ; 22(15)2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35898068

ABSTRACT

Multiscale PCA (MSPCA) is a well-established fault detection and isolation (FDI) technique that uses wavelet analysis and PCA to extract important features from process data. This study demonstrates limitations of the conventional MSPCA fault detection algorithm and proposes an enhanced MSPCA (EMSPCA) FDI algorithm that uses a new wavelet thresholding criterion. The new criterion improves the projection of faults in the residual space and the threshold estimation of the fault detection statistic. When tested with a synthetic model, EMSPCA resulted in a 30% improvement in detection rate at equal false alarm rates. The EMSPCA algorithm also relies on the novel application of reconstruction-based fault isolation at multiple scales, which reduces fault smearing and consequently improves fault isolation performance. The paper further investigates the use of soft vs. hard wavelet thresholding, decimated vs. undecimated wavelet transforms, and the choice of wavelet decomposition depth, along with their implications for FDI performance. The FDI performance of the developed EMSPCA method is illustrated for sensor faults using synthetic data, simulated data from a continuously stirred tank reactor (CSTR), and experimental data from a packed-bed pilot plant. The results of these examples show the advantages of EMSPCA over existing techniques.


Subjects
Algorithms , Principal Component Analysis/methods , Wavelet Analysis
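
The soft versus hard wavelet thresholding choice mentioned in the abstract of entry 1 can be illustrated with a minimal sketch. It assumes the textbook definitions of the two rules, not the new thresholding criterion proposed in the paper; the function names, coefficient values, and threshold are hypothetical.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the survivors toward zero by t."""
    out = []
    for c in coeffs:
        mag = abs(c) - t
        if mag <= 0:
            out.append(0.0)
        else:
            out.append(mag if c > 0 else -mag)
    return out


# Hypothetical wavelet detail coefficients and threshold (illustration only).
detail = [0.2, -1.5, 3.0, -0.4]
hard = hard_threshold(detail, 1.0)
soft = soft_threshold(detail, 1.0)
```

Hard thresholding leaves the retained coefficients untouched, whereas soft thresholding also reduces their magnitude, which is one reason the two rules can affect detection statistics differently.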
2.
IEEE Trans Nanobioscience ; 17(4): 498-506, 2018 10.
Article in English | MEDLINE | ID: mdl-30296237

ABSTRACT

In this paper, we develop an improved fault detection (FD) technique to enhance the monitoring of nonlinear biological processes. Generalized likelihood ratio test (GLRT)-based kernel principal component analysis (KPCA), also called kernel GLRT, is an effective data-driven technique for monitoring nonlinear processes. However, it is well known that data collected from complex multivariate processes are multiscale, because the changes that can occur in a process are localized differently in time and frequency. Thus, to enhance process monitoring, we propose to combine the advantages of kernel GLRT and multiscale representation using wavelets by developing a multiscale kernel GLRT (MS-KGLRT) detection chart. In the proposed approach, KPCA is used to compute the model in the feature space and the MS-KGLRT chart is applied to detect the faults. The detection performance of the new chart is studied using two examples, one using synthetic data and the other using a biological process represented by a Cad System in E. coli (CSEC) model, for detecting small and moderate shifts (offset or bias, and drift). The MS-KGLRT chart is used to enhance fault detection in the CSEC model by monitoring some of the key variables involved in this model, such as enzymes, lysine, and cadaverine.


Subjects
Algorithms , Biological Models , Statistical Models , Systems Biology/methods , Escherichia coli/physiology , Principal Component Analysis
3.
Environ Res ; 160: 183-194, 2018 01.
Article in English | MEDLINE | ID: mdl-28987729

ABSTRACT

Quick validation and detection of faults in measured air quality data is a crucial step towards achieving the objectives of air quality networks. The objectives of this paper are therefore threefold: (i) to develop a modeling technique that can predict the normal behavior of air quality variables and provide an accurate reference for monitoring purposes; (ii) to develop a fault detection method that can effectively and quickly detect anomalies in measured air quality data. For this purpose, a new fault detection method based on the combination of the generalized likelihood ratio test (GLRT) and the exponentially weighted moving average (EWMA) will be developed. GLRT is a well-known statistical fault detection method that relies on maximizing the detection probability for a given false alarm rate. The proposed GLRT-based EWMA fault detection method is intended to detect changes in the values of certain air quality variables; (iii) to develop a fault isolation and identification method that identifies the fault source(s) so that appropriate corrective actions can be applied. To this end, a reconstruction approach based on a Midpoint-Radii Principal Component Analysis (MRPCA) model will be developed to handle the types of data and models associated with air quality monitoring networks. All air quality modeling, fault detection, fault isolation, and reconstruction methods developed in this paper will be validated using real air quality data (such as particulate matter, ozone, and nitrogen and carbon oxide measurements).


Subjects
Air Pollution , Environmental Monitoring , Theoretical Models
4.
IEEE Trans Nanobioscience ; 16(6): 504-512, 2017 09.
Article in English | MEDLINE | ID: mdl-28708564

ABSTRACT

In our previous work, we demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale PCA (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) with different window sizes. However, most real systems are nonlinear, which makes the linear PCA method unable to handle nonlinearity adequately. Thus, in this paper, we first apply nonlinear PCA, using the kernel principal component analysis (KPCA) model, to obtain accurate principal components of a data set and handle a wide range of nonlinearities. KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that applies exponential weights to the residuals in the moving window (instead of equal weights), which may further improve fault detection performance by reducing the FAR using the exponentially weighted moving average (EWMA). The developed detection method, called EWMA-GLRT, provides improved properties, such as smaller missed detection rates and FARs and a smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data in an exponentially decreasing fashion, giving more weight to the more recent data. This provides a more accurate estimate of the GLRT statistic and a stronger memory that enables better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and used to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring of the process mean. The idea behind the KPCA-based EWMA-GLRT fault detection algorithm is to combine the advantages of the proposed EWMA-GLRT fault detection chart with those of the KPCA model. It is used to enhance fault detection in the Cad System in E. coli model by monitoring some of the key variables involved in this model, such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over the Q, GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed in terms of FAR, missed detection rate, and average run length (ARL1) values.


Subjects
Statistical Data Interpretation , Escherichia coli/physiology , Biological Models , Statistical Models , Nonlinear Dynamics , Principal Component Analysis , Animals , Computer Simulation , Humans
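
The exponential weighting described in the abstract of entry 4, where recent residuals count more than older ones, can be sketched with the standard EWMA recursion. This is only the smoothing step under assumed values; it is not the full EWMA-GLRT statistic from the paper, and the function name, smoothing factor, and residual sequence are hypothetical.

```python
def ewma(residuals, lam=0.5, start=0.0):
    """Return the EWMA sequence z_t = lam * r_t + (1 - lam) * z_{t-1}."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    z = start
    smoothed = []
    for r in residuals:
        # The newest residual gets weight lam; older ones decay geometrically.
        z = lam * r + (1.0 - lam) * z
        smoothed.append(z)
    return smoothed


# Hypothetical residuals: zero under normal operation, then a bias of 1.0.
residuals = [0.0, 0.0, 1.0, 1.0, 1.0]
z = ewma(residuals, lam=0.5)
```

A smaller `lam` gives the statistic a longer memory and smoother behavior, while a larger `lam` makes it react faster to a new shift.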
5.
Math Biosci ; 249: 75-91, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24524881

ABSTRACT

A central challenge in computational modeling of biological systems is the determination of the model parameters. In such cases, estimating these variables or parameters from other easily obtained measurements can be extremely useful. For example, time-series dynamic genomic data can be used to develop models representing dynamic genetic regulatory networks, which can be used to design intervention strategies to cure major diseases and to better understand the behavior of biological systems. Unfortunately, biological measurements are usually heavily corrupted by errors that hide the important characteristics in the data. Therefore, these noisy measurements need to be filtered to enhance their usefulness in practice. This paper addresses the problem of state and parameter estimation of biological phenomena modeled by S-systems using Bayesian approaches, where the nonlinear observed system is assumed to progress according to a probabilistic state space model. The performances of various conventional and state-of-the-art state estimation techniques are compared. These techniques include the extended Kalman filter (EKF), unscented Kalman filter (UKF), particle filter (PF), and the developed variational Bayesian filter (VBF). Specifically, two comparative studies are performed. In the first comparative study, the state variables (the enzyme CadA, the model cadBA, the cadaverine Cadav and the lysine Lys for a model of the Cad System in Escherichia coli (CSEC)) are estimated from noisy measurements of these variables, and the various estimation techniques are compared by computing the estimation root mean square error (RMSE) with respect to the noise-free data. In the second comparative study, the state variables as well as the model parameters are simultaneously estimated. In this case, in addition to comparing the performances of the various state estimation techniques, the effect of the number of estimated model parameters on the accuracy and convergence of these techniques is also assessed. The results of both comparative studies show that the UKF provides higher accuracy than the EKF, owing to the limited ability of the EKF to accurately estimate the mean and covariance matrix of the estimated states through linearization of the nonlinear process model. The results also show that the VBF provides a relative improvement over the PF. This is because, unlike the PF, which depends on the choice of sampling distribution used to estimate the posterior distribution, the VBF yields an optimum choice of the sampling distribution, which also utilizes the observed data. The results of the second comparative study show that, for all techniques, estimating more model parameters affects the estimation accuracy as well as the convergence of the estimated states and parameters. The VBF, however, still provides advantages over the other methods with respect to estimation accuracy as well as convergence.


Subjects
Biological Models , Nonlinear Dynamics , Algorithms , Bayes Theorem , Cadaverine/metabolism , Escherichia coli/metabolism , Mathematical Concepts , Metabolic Networks and Pathways , Statistical Models , Systems Biology
6.
Bioinformatics ; 29(19): 2410-8, 2013 Oct 01.
Article in English | MEDLINE | ID: mdl-23940252

ABSTRACT

MOTIVATION: Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of gene expression data and prior information about transcription factor (TF)-gene regulation. Most contemporary algorithms suffer either from inconsistency and poor reliability or from prohibitive computational complexity. In addition, the existing algorithms cannot counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. RESULTS: We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed-form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). On synthetic data, ROBNCA estimates the TF activity profiles as well as the TF-gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation, and/or number of outliers. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and ROBNCA is hundreds of times faster than NI-NCA. AVAILABILITY: The ROBNCA software is available at http://people.tamu.edu/~amina/ROBNCA


Subjects
Algorithms , Transcription Factors/analysis , Cell Cycle , Escherichia coli/chemistry , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression , Computer Neural Networks , Nonlinear Dynamics , Reproducibility of Results , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
7.
Adv Bioinformatics ; 2013: 205763, 2013.
Article in English | MEDLINE | ID: mdl-23737768

ABSTRACT

This paper proposes a novel algorithm for inferring gene regulatory networks that uses cubature Kalman filter (CKF) and Kalman filter (KF) techniques in conjunction with compressed sensing methods. The gene network is described using a state-space model. A nonlinear model for the evolution of gene expression is considered, while the gene expression data are assumed to follow a linear Gaussian model. The hidden states are estimated using the CKF. The system parameters are modeled as a Gauss-Markov process and are estimated using a compressed sensing-based KF. These parameters provide insight into the regulatory relations among the genes. The Cramér-Rao lower bound of the parameter estimates is calculated for the system model and used as a benchmark to assess the estimation accuracy. The proposed algorithm is evaluated rigorously using synthetic data in different scenarios, which include different numbers of genes and varying numbers of sample points. In addition, the algorithm is tested on the DREAM4 in silico data sets as well as the in vivo data sets from the IRMA network. The proposed algorithm shows superior performance in terms of accuracy, robustness, and scalability.

9.
Adv Bioinformatics ; 2013: 953814, 2013.
Article in English | MEDLINE | ID: mdl-23509452

ABSTRACT

The large influx of data from high-throughput genomic and proteomic technologies has encouraged researchers to seek approaches for understanding the structure of gene regulatory networks and proteomic networks. This work reviews some of the most important statistical methods used for modeling gene regulatory networks (GRNs) and protein-protein interaction (PPI) networks. The paper focuses on recent advances in the statistical graphical modeling techniques, state-space representation models, and information-theoretic methods that have been proposed for inferring the topology of GRNs. It appears that the problem of inferring the structure of PPI networks is quite different from that of GRNs. Clustering and probabilistic graphical modeling techniques are of prime importance in the statistical inference of PPI networks, and some of the recent approaches using these techniques are also reviewed in this paper. Performance evaluation criteria for the approaches used for modeling GRNs and PPI networks are also discussed.

10.
Article in English | MEDLINE | ID: mdl-23221089

ABSTRACT

An important objective of modeling biological phenomena is to develop therapeutic intervention strategies that move an undesirable state of a diseased network toward a more desirable one. Such transitions can be achieved by using drugs to act on some genes/metabolites that affect the undesirable behavior. Because biological phenomena are complex processes with nonlinear dynamics that are impossible to represent perfectly with a mathematical model, there is often a need for model-free nonlinear intervention strategies that are capable of guiding the target variables to their desired values. In many applications, fuzzy systems have been found to be very useful for parameter estimation, model development, and control design of nonlinear processes. In this paper, a model-free fuzzy intervention strategy (one that does not require a mathematical model of the biological phenomenon) is proposed to guide the target variables of biological systems to their desired values. The proposed fuzzy intervention strategy is applied to three different biological models: a glycolytic-glycogenolytic pathway model, a purine metabolism pathway model, and a generic pathway model. The simulation results for all models demonstrate the effectiveness of the proposed scheme.


Subjects
Computational Biology/methods , Fuzzy Logic , Metabolic Networks and Pathways , Biological Models , Computer Simulation , Glycogenolysis , Glycolysis , Monte Carlo Method , Nonlinear Dynamics , Purines/chemistry , Purines/metabolism
11.
Adv Bioinformatics ; 2012: 534810, 2012.
Article in English | MEDLINE | ID: mdl-23209459

ABSTRACT

The problems of modeling biological phenomena and intervening in them have captured the interest of many researchers in the past few decades. The aim of therapeutic intervention strategies is to move an undesirable state of a diseased network towards a more desirable one. Such an objective can be achieved by applying drugs that act on some genes/metabolites that experience the undesirable behavior. For the design and analysis of intervention strategies, mathematical models that can capture the complex dynamics of biological systems are needed. S-systems, which offer a good compromise between accuracy and mathematical flexibility, are a promising framework for modeling the dynamical behavior of biological phenomena. Because the biological phenomena represented by S-systems have complex nonlinear dynamics, nonlinear intervention schemes are needed to cope with the complexity of the nonlinear S-system models. Here, we present an intervention technique based on feedback linearization for biological phenomena modeled by S-systems. This technique assumes perfect knowledge of the S-system model. The proposed intervention technique is applied to the glycolytic-glycogenolytic pathway, and the simulation results presented demonstrate the effectiveness of the proposed technique.

12.
EURASIP J Bioinform Syst Biol ; 2012(1): 18, 2012 Nov 27.
Article in English | MEDLINE | ID: mdl-23186305

ABSTRACT

Reference-assisted assembly requires the use of a reference sequence, as a model, to assist in the assembly of the novel genome. The standard method for identifying the best reference sequence for the assembly of a novel genome counts the number of reads that align to each reference sequence and then chooses the reference sequence with the highest number of aligned reads. This article explores the use of the minimum description length (MDL) principle and two of its variants, the two-part MDL and Sophisticated MDL, in identifying the optimal reference sequence for genome assembly. The article compares the proposed MDL-based scheme with the standard method and concludes that "counting the number of reads of the novel genome present in the reference sequence" is not a sufficient condition. The proposed MDL scheme therefore encompasses the standard method of "counting the number of reads that align to the reference sequence" and additionally takes the model, that is, the reference sequence, into account when identifying the optimal reference sequence. The proposed MDL-based scheme not only provides a sufficient criterion for identifying the optimal reference sequence for genome assembly but also improves the reference sequence so that it becomes more suitable for the assembly of the novel genome.
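
The standard read-counting method that the abstract of entry 12 takes as its baseline can be sketched as follows. This is a toy illustration, assuming exact substring matching as a stand-in for read alignment; it does not implement the MDL scheme, and the function names and sequences are hypothetical.

```python
def count_aligned_reads(reads, reference):
    """Count reads that occur verbatim in the reference (a stand-in for alignment)."""
    return sum(1 for read in reads if read in reference)


def pick_reference(reads, references):
    """Return the name of the candidate reference with the most matching reads."""
    return max(references, key=lambda name: count_aligned_reads(reads, references[name]))


# Hypothetical toy reads and candidate reference sequences.
reads = ["ACGT", "GTTA", "TTAC"]
references = {
    "ref_a": "ACGTTACC",
    "ref_b": "ACGTGGGG",
}
best = pick_reference(reads, references)
```

The score here depends only on the reads, which is the limitation the abstract points to: nothing about the reference sequence itself enters the criterion.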

13.
Article in English | MEDLINE | ID: mdl-22350207

ABSTRACT

This paper considers the problem of learning the structure of gene regulatory networks from gene expression time series data. It considers a more realistic scenario in which the state space model representing a gene network evolves nonlinearly, while a linear model is assumed for the microarray data. To capture the nonlinearity, a particle filter-based state estimation algorithm is used instead of the contemporary linear approximation-based approaches. The parameters characterizing the regulatory relations among various genes are estimated online using a Kalman filter. Since a particular gene interacts with only a few other genes, the parameter vector is expected to be sparse. The state estimates delivered by the particle filter and the observed microarray data are then subjected to a LASSO-based least squares regression operation, which yields a parsimonious and efficient description of the regulatory network by setting the irrelevant coefficients to zero. The performance of this algorithm is compared with that of the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), using the mean square error (MSE) as the fidelity criterion in recovering the parameters of gene regulatory networks from synthetic data and real biological data. Extensive computer simulations illustrate that the proposed particle filter-based network inference algorithm outperforms the EKF and UKF and can therefore serve as a natural framework for modeling gene regulatory networks with nonlinear and sparse structure.


Subjects
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Nonlinear Dynamics , Animals , Computer Simulation , Genetic Databases , Drosophila melanogaster , Gene Expression Profiling , Genetic Models , Oligonucleotide Array Sequence Analysis
14.
IEEE Trans Biomed Eng ; 58(5): 1260-7, 2011 May.
Article in English | MEDLINE | ID: mdl-21172748

ABSTRACT

Recent years have witnessed extensive research activity in modeling biological phenomena as well as in developing intervention strategies for such phenomena. S-systems, which offer a good compromise between accuracy and mathematical flexibility, are a promising framework for modeling the dynamical behavior of biological phenomena. In this paper, two different intervention strategies, namely direct and indirect, are proposed for the S-system model. In the indirect approach, the prespecified desired values for the target variables are used to compute the reference values for the control inputs, and two control algorithms, namely simple sampled-data control and model predictive control (MPC), are developed for transferring the control variables from their initial values to the computed reference ones. In the direct approach, an MPC algorithm is developed that directly guides the target variables to their desired values. The proposed intervention strategies are applied to the glycolytic-glycogenolytic pathway, and the simulation results presented demonstrate the effectiveness of the proposed schemes.


Subjects
Biological Models , Systems Biology/methods , Algorithms , Gene Regulatory Networks , Gluconeogenesis , Glycolysis , Statistical Models
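
The S-system framework named in the abstract of entry 14 is not written out there. A minimal sketch follows, assuming the standard power-law form dx_i/dt = alpha_i * prod_j x_j^g_ij - beta_i * prod_j x_j^h_ij and a plain explicit Euler integrator. The one-variable system, parameter values, and function names are hypothetical and are not taken from the paper's glycolytic-glycogenolytic pathway model.

```python
def s_system_rhs(x, alpha, beta, g, h):
    """Evaluate dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]."""
    rates = []
    for i in range(len(x)):
        production = alpha[i]
        degradation = beta[i]
        for j, xj in enumerate(x):
            production *= xj ** g[i][j]
            degradation *= xj ** h[i][j]
        rates.append(production - degradation)
    return rates


def euler_step(x, dt, alpha, beta, g, h):
    """Advance the state by one explicit Euler step of size dt."""
    rates = s_system_rhs(x, alpha, beta, g, h)
    return [xi + dt * ri for xi, ri in zip(x, rates)]


# Hypothetical one-variable system: dx/dt = 2 - x, with steady state x = 2.
alpha, beta = [2.0], [1.0]
g, h = [[0.0]], [[1.0]]
x = [1.0]
for _ in range(200):
    x = euler_step(x, 0.1, alpha, beta, g, h)
```

An intervention strategy of the kind the abstract describes would act on inputs to such a model so that selected target variables settle at desired values rather than at the uncontrolled steady state.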