ABSTRACT
In this paper, we propose a new methodology for the analysis of microarray images. First, a new gridding algorithm is proposed for determining the individual spots and their borders. Then, a Gaussian mixture model (GMM) approach is presented for the analysis of the individual spot images. The main advantages of the proposed methodology are modeling flexibility and adaptability to the data, which are well-known strengths of GMMs. The maximum likelihood and maximum a posteriori approaches are used to estimate the GMM parameters via the expectation-maximization algorithm. The proposed approach can detect and compensate for artifacts that might occur in microarray images. This is accomplished by a model-based criterion that selects the number of mixture components. We present numerical experiments with artificial and real data in which we compare the proposed approach with previous ones and with existing software tools for microarray image analysis, and demonstrate its advantages.
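The estimation and model-selection steps described in this abstract can be illustrated with a minimal sketch. It assumes one-dimensional pixel intensities, uses plain maximum-likelihood EM only (no MAP priors), and substitutes the Bayesian information criterion (BIC) for the model-based criterion, which the abstract does not specify. All function names are hypothetical, and this is not the authors' implementation.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    total = 0.0
    for x in data:
        p = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def fit_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D GMM by maximum-likelihood EM (illustrative only)."""
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    var_floor = 1e-6 * grand_var + 1e-12  # keeps components from collapsing
    ordered = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = max(sum(p), 1e-300)
            resp.append([pj / s for pj in p])
        # M-step: closed-form updates of weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return weights, means, variances


def bic(data, k, params):
    """BIC score (lower is better); assumed stand-in for the paper's criterion."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return -2.0 * log_likelihood(data, *params) + n_params * math.log(len(data))


def select_num_components(data, k_max=4):
    """Fit mixtures with 1..k_max components and return the lowest-BIC one."""
    best = None
    for k in range(1, k_max + 1):
        params = fit_gmm_1d(data, k)
        score = bic(data, k, params)
        if best is None or score < best[1]:
            best = (k, score, params)
    return best  # (k, bic_score, (weights, means, variances))
```

Under these assumptions, calling `select_num_components(intensities)` on the pixel intensities of a single spot would return the chosen number of components with its fitted parameters. A selected count above the expected foreground and background pair could then be read as a hint of an artifact, in the spirit of the abstract.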
Subject(s)
Algorithms , Artificial Intelligence , Gene Expression Profiling/methods , Image Interpretation, Computer-Assisted/methods , Microscopy, Fluorescence/methods , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Animals , Computer Simulation , Humans , In Situ Hybridization/methods , Models, Statistical
ABSTRACT
Gaussian mixture models (GMMs) constitute a well-known type of probabilistic neural network. One of their many successful applications is image segmentation, where spatially constrained mixture models have been trained using the expectation-maximization (EM) framework. In this letter, we elaborate on this method and propose a new methodology for the M-step of the EM algorithm that is based on a novel constrained optimization formulation. Numerical experiments using simulated images illustrate the superior performance of our method, in terms of both the attained maximum value of the objective function and segmentation accuracy, compared to previous implementations of this approach.
Subject(s)
Neural Networks, Computer
ABSTRACT
OBJECTIVES: This paper proposes a greedy algorithm for learning a mixture of motifs model through likelihood maximization, in order to discover common substrings, known as motifs, in a given collection of related biosequences. METHODS: The approach sequentially adds a new motif component to a mixture model by performing a combined scheme of global and local search to initialize the component parameters appropriately. A hierarchical clustering scheme is also applied initially, which identifies candidate motif models and speeds up the global search procedure. RESULTS: The performance of the proposed algorithm has been studied on both artificial and real biological datasets. In comparison with the well-known MEME approach, the algorithm is advantageous since it identifies motifs with significant conservation and produces larger protein fingerprints. CONCLUSION: The proposed greedy algorithm constitutes a promising approach for discovering multiple probabilistic motifs in biological sequences. By using an effective incremental mixture modeling strategy, our technique overcomes the limitation of the MEME scheme, which erases motif occurrences each time a new motif is discovered.