Results 1 - 14 of 14
1.
J Comput Biol ; 29(12): 1397-1411, 2022 12.
Article in English | MEDLINE | ID: mdl-36450118

ABSTRACT

Single-step nonadaptive group testing approaches for reducing the number of tests required to detect a small subset of positive samples from a larger set require solving two algorithmic problems. First, how to design the samples-to-tests measurement matrix, and second, how to decode the results of the tests to uncover positive samples. In this study, we focus on the first challenge. We introduce real-valued group testing, which matches the characteristics of existing PCR testing pipelines more closely than combinatorial group testing or compressed sensing settings. We show a set of conditions that allow measurement matrices to guarantee unambiguous decoding of positives in this new setting. For small matrix sizes, we also propose an algorithm for constructing matrices that meet the proposed condition. On simulated data sets, we show that the matrices resulting from the algorithm can successfully recover positive samples at higher positivity rates than matrices designed for combinatorial group testing setting. We use wet laboratory experiments involving SARS-CoV-2 nasopharyngeal swab samples to further validate the approach.
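To make the two problems named in the abstract concrete (designing the samples-to-tests matrix and decoding the test results), here is a minimal sketch. It pools hypothetical viral loads through a small binary matrix and decodes with the standard combinatorial COMP rule, which clears every sample that appears in a negative pool. This is not the paper's real-valued construction or decoder; the matrix, the loads, and the function names are assumptions made for illustration.

```python
# Illustrative only: a small binary pooling design with COMP decoding.
# The matrix and loads below are hypothetical, not taken from the paper.

def run_pools(matrix, loads):
    """Each test measures the summed load of the samples pooled into it."""
    return [sum(a * x for a, x in zip(row, loads)) for row in matrix]

def comp_decode(matrix, results, threshold=0.0):
    """COMP rule: a sample is cleared if it appears in any negative pool."""
    n_samples = len(matrix[0])
    possible = [True] * n_samples
    for row, y in zip(matrix, results):
        if y <= threshold:  # negative pool
            for j, in_pool in enumerate(row):
                if in_pool:
                    possible[j] = False
    return [j for j in range(n_samples) if possible[j]]

# 4 tests for 6 samples; every sample sits in a distinct pair of pools.
A = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
loads = [0.0, 0.0, 0.0, 3.5, 0.0, 0.0]  # only sample 3 is positive
results = run_pools(A, loads)
positives = comp_decode(A, results)
```

With a single positive sample this design decodes unambiguously. With two or more positives, COMP can report false positives, which is the kind of ambiguity that conditions on the measurement matrix are meant to rule out.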


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19 Testing , COVID-19/diagnosis , Polymerase Chain Reaction , Sensitivity and Specificity
2.
Cells ; 11(14)2022 07 20.
Article in English | MEDLINE | ID: mdl-35883687

ABSTRACT

Cytogenetics laboratory tests are among the most important procedures for the diagnosis of genetic diseases, especially in the area of hematological malignancies. Manual chromosomal karyotyping methods are time consuming and labor intensive and, hence, expensive. Therefore, to alleviate the process of analysis, several attempts have been made to enhance karyograms. The current chromosomal image enhancement is based on classical image processing. This approach has its limitations, one of which is that it has a mandatory application to all chromosomes, where customized application to each chromosome is ideal. Moreover, each chromosome needs a different level of enhancement, depending on whether a given area is from the chromosome itself or it is just an artifact from staining. The analysis of poor-quality karyograms, which is a difficulty faced often in preparations from cancer samples, is time consuming and might result in missing the abnormality or difficulty in reporting the exact breakpoint within the chromosome. We developed ChromoEnhancer, a novel artificial-intelligence-based method to enhance neoplastic karyogram images. The method is based on Generative Adversarial Networks (GANs) with a data-centric approach. GANs are known for the conversion of one image domain to another. We used GANs to convert poor-quality karyograms into good-quality images. Our method of karyogram enhancement led to robust routine cytogenetic analysis and, therefore, to accurate detection of cryptic chromosomal abnormalities. To evaluate ChromoEnhancer, we randomly assigned a subset of the enhanced images and their corresponding original (unenhanced) images to two independent cytogeneticists to measure the karyogram quality and the elapsed time to complete the analysis, using four rating criteria, each scaled from 1 to 5. Furthermore, we compared the enhanced images with our method to the original ones, using quantitative measures (PSNR and SSIM metrics).
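For readers unfamiliar with the first of the two quantitative measures, the sketch below computes PSNR (peak signal-to-noise ratio) from its standard definition, 10 * log10(MAX^2 / MSE). It assumes 8-bit grayscale images stored as nested lists; the pixel values and the function name are hypothetical and are not taken from the ChromoEnhancer evaluation. SSIM is omitted because it needs windowed local statistics.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [p for row in reference for p in row]
    flat_test = [p for row in test for p in row]
    if len(flat_ref) != len(flat_test):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 images: two pixels differ by 10 gray levels.
original = [[100, 100], [100, 100]]
enhanced = [[100, 110], [100, 90]]
score = psnr(original, enhanced)
```

Higher PSNR means the two images are closer pixel by pixel. It does not by itself say that an image is easier to read, which is presumably why the study also used ratings from cytogeneticists.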


Subject(s)
Chromosome Aberrations , Image Processing, Computer-Assisted , Cytogenetics , Humans , Image Processing, Computer-Assisted/methods , Intelligence , Karyotyping
3.
BMC Bioinformatics ; 21(1): 122, 2020 Mar 23.
Article in English | MEDLINE | ID: mdl-32293263

ABSTRACT

BACKGROUND: Cancer is caused by genetic mutations, but not all somatic mutations in human DNA drive the emergence or growth of cancers. While many frequently-mutated cancer driver genes have already been identified and are being utilized for diagnostic, prognostic, or therapeutic purposes, identifying driver genes that harbor mutations occurring with low frequency in human cancers is an ongoing endeavor. Typically, mutations that do not confer growth advantage to tumors - passenger mutations - dominate the mutation landscape of tumor cell genome, making identification of low-frequency driver mutations a challenge. The leading approach for discovering new putative driver genes involves analyzing patterns of mutations in large cohorts of patients and using statistical methods to discriminate driver from passenger mutations. RESULTS: We propose a novel cancer driver gene detection method, QuaDMutNetEx. QuaDMutNetEx discovers cancer drivers with low mutation frequency by giving preference to genes encoding proteins that are connected in human protein-protein interaction networks, and that at the same time show low deviation from the mutual exclusivity pattern that characterizes driver mutations occurring in the same pathway or functional gene group across a cohort of cancer samples. CONCLUSIONS: Evaluation of QuaDMutNetEx on four different tumor sample datasets shows that the proposed method finds biologically-connected sets of low-frequency driver genes, including many genes that are not found if the network connectivity information is not considered. Improved quality and interpretability of the discovered putative driver gene sets compared to existing methods shows that QuaDMutNetEx is a valuable new tool for detecting driver genes. QuaDMutNetEx is available for download from https://github.com/bokhariy/QuaDMutNetEx under the GNU GPLv3 license.
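The network side of this idea can be illustrated with a simple check of whether a candidate gene set forms one connected component in a protein-protein interaction graph. The sketch below is a plain breadth-first search over a hypothetical edge list. The gene names and the function are assumptions, and this is not how QuaDMutNetEx itself scores connectivity.

```python
from collections import deque

def is_connected(gene_set, interactions):
    """True if the genes form one component using only edges inside the set."""
    genes = set(gene_set)
    if not genes:
        return True
    # Build adjacency restricted to the candidate genes.
    neighbors = {g: set() for g in genes}
    for a, b in interactions:
        if a in genes and b in genes:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Breadth-first search from an arbitrary gene.
    start = next(iter(genes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == genes

# Hypothetical interaction edges.
ppi_edges = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_D", "GENE_E")]
linked = is_connected({"GENE_A", "GENE_B", "GENE_C"}, ppi_edges)
split = is_connected({"GENE_A", "GENE_D"}, ppi_edges)
```

A method that prefers connected sets would favor the first candidate over the second, all else being equal.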


Subject(s)
Algorithms , Computational Biology/methods , Neoplasms/genetics , Humans , Mutation
4.
Physiol Meas ; 39(12): 124002, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30524050

ABSTRACT

OBJECTIVE: The healing of wounds is critical in protecting the human body against environmental factors. The mechanisms involving protein expression during this complex physiological process have not been fully elucidated. APPROACH: Here, we use reverse-phase protein microarrays (RPPA) involving 94 phosphoproteins to study tissue samples from tubes implanted in healing dermal wounds in seven human subjects tracked over two weeks. We compare the proteomic profiles to proteomes of controls obtained from skin biopsies from the same subjects. MAIN RESULTS: Compared to previous proteomic studies of wound healing, our approach focuses on wound tissue instead of wound fluid, and has the sensitivity to go beyond measuring only highly abundant proteins. To study the temporal dynamics of networks involved in wound healing, we applied two network analysis methods that integrate the experimental results with prior knowledge about protein-protein physical and regulatory interactions, as well as higher-level biological processes and associated pathways. SIGNIFICANCE: We uncovered densely connected networks of proteins that are up- or down-regulated during human wound healing, as well as their relationships to microRNAs and to proteins outside of our set of targets that we measured with proteomic microarrays.


Subject(s)
Proteomics , Skin Physiological Phenomena , Skin/metabolism , Wound Healing , Down-Regulation , Humans , Phosphoproteins/metabolism , Protein Array Analysis , Up-Regulation
5.
BMC Bioinformatics ; 18(1): 458, 2017 Oct 24.
Article in English | MEDLINE | ID: mdl-29065872

ABSTRACT

BACKGROUND: Somatic mutations accumulate in human cells throughout life. Some may have no adverse consequences, but some of them may lead to cancer. A cancer genome is typically unstable, and thus more mutations can accumulate in the DNA of cancer cells. An ongoing problem is to figure out which mutations are drivers - play a role in oncogenesis, and which are passengers - do not play a role. One way of addressing this question is through inspection of somatic mutations in DNA of cancer samples from a cohort of patients and detection of patterns that differentiate driver from passenger mutations. RESULTS: We propose QuaDMutEx, a method that incorporates three novel elements: a new gene set penalty that includes non-linear penalization of multiple mutations in putative sets of driver genes, an ability to adjust the method to handle slow- and fast-evolving tumors, and a computationally efficient method for finding gene sets that minimize the penalty, through a combination of heuristic Monte Carlo optimization and exact binary quadratic programming. Compared to existing methods, the proposed algorithm finds sets of putative driver genes that show higher coverage and lower excess coverage in eight sets of cancer samples coming from brain, ovarian, lung, and breast tumors. CONCLUSIONS: Superior ability to improve on both coverage and excess coverage on different types of cancer shows that QuaDMutEx is a tool that should be part of a state-of-the-art toolbox in the driver gene discovery pipeline. It can detect genes harboring rare driver mutations that may be missed by existing methods. QuaDMutEx is available for download from https://github.com/bokhariy/QuaDMutEx under the GNU GPLv3 license.
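The two evaluation quantities named in the results, coverage and excess coverage, can be illustrated on a toy cohort. The sketch below uses the common linear definitions: coverage counts patients with at least one mutation in the gene set, and excess coverage counts the additional hits beyond the first in each covered patient. The abstract describes a non-linear penalty, which is not reproduced here; patient and gene names are hypothetical.

```python
def coverage_stats(mutations, gene_set):
    """mutations maps each patient to the set of genes mutated in that patient."""
    covered = 0
    excess = 0
    for genes in mutations.values():
        hits = len(genes & gene_set)
        if hits > 0:
            covered += 1
            excess += hits - 1  # extra hits beyond the first break exclusivity
    return covered, excess

# Hypothetical cohort of four patients.
cohort = {
    "P1": {"GENE_A"},
    "P2": {"GENE_B"},
    "P3": {"GENE_A", "GENE_C"},
    "P4": {"GENE_D"},
}
covered, excess = coverage_stats(cohort, {"GENE_A", "GENE_B", "GENE_C"})
```

A good candidate driver set has high coverage (most patients carry a mutation in it) and low excess coverage (few patients carry more than one), which is the mutual exclusivity pattern.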


Subject(s)
Algorithms , Databases, Factual , Humans , Internet , Monte Carlo Method , Mutation , Neoplasms/genetics , Neoplasms/pathology , User-Computer Interface
6.
BMC Syst Biol ; 7: 106, 2013 Oct 22.
Article in English | MEDLINE | ID: mdl-24148309

ABSTRACT

BACKGROUND: The regulation of gene expression by transcription factors is a key determinant of cellular phenotypes. Deciphering genome-wide networks that capture which transcription factors regulate which genes is one of the major efforts towards understanding and accurate modeling of living systems. However, reverse-engineering the network from gene expression profiles remains a challenge, because the data are noisy, high dimensional and sparse, and the regulation is often obscured by indirect connections. RESULTS: We introduce a gene regulatory network inference algorithm ENNET, which reverse-engineers networks of transcriptional regulation from a variety of expression profiles with a superior accuracy compared to the state-of-the-art methods. The proposed method relies on the boosting of regression stumps combined with a relative variable importance measure for the initial scoring of transcription factors with respect to each gene. Then, we propose a technique for using a distribution of the initial scores and information about knockouts to refine the predictions. We evaluated the proposed method on the DREAM3, DREAM4 and DREAM5 data sets and achieved higher accuracy than the winners of those competitions and other established methods. CONCLUSIONS: Superior accuracy achieved on the three different benchmark data sets shows that ENNET is a top contender in the task of network inference. It is a versatile method that uses information about which gene was knocked-out in which experiment if it is available, but remains the top performer even without such information. ENNET is available for download from https://github.com/slawekj/ennet under the GNU GPLv3 license.
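The core ingredient named in the results, boosting of regression stumps with a relative variable importance measure, can be sketched generically. The code below fits stumps to the residuals of one target gene under squared loss and credits each candidate regulator with the error reduction of the stumps that split on it. It is not the ENNET implementation: the importance definition, the parameters, and the toy data are assumptions, and the refinement step using knockout information is not shown.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    n = len(X)
    base_error = sum(r * r for r in residuals) - sum(residuals) ** 2 / n
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            error = (sum(r * r for r in left) - sum(left) ** 2 / len(left)
                     + sum(r * r for r in right) - sum(right) ** 2 / len(right))
            if best is None or error < best[0]:
                best = (error, j, threshold,
                        sum(left) / len(left), sum(right) / len(right))
    if best is None:  # every feature is constant, nothing to split on
        return None
    error, j, threshold, left_mean, right_mean = best
    return j, threshold, left_mean, right_mean, base_error - error

def boosted_importance(X, y, rounds=20, learning_rate=0.5):
    """Relative importance of each feature for predicting y."""
    mean_y = sum(y) / len(y)
    residuals = [v - mean_y for v in y]
    importance = [0.0] * len(X[0])
    for _ in range(rounds):
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, threshold, left_mean, right_mean, gain = stump
        importance[j] += gain  # credit the feature the stump split on
        for i, row in enumerate(X):
            step = left_mean if row[j] <= threshold else right_mean
            residuals[i] -= learning_rate * step
    total = sum(importance)
    return [v / total for v in importance] if total > 0 else importance

# Hypothetical expression data: columns are two transcription factors,
# y is a target gene that tracks the first factor only.
X = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [0.3, 0.5], [0.7, 0.4]]
y = [1.0, 1.1, 3.0, 3.2, 1.2, 2.9]
scores = boosted_importance(X, y)
```

Repeating this for every gene as the target gives one score per (transcription factor, gene) pair, which can then be ranked to propose regulatory links.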


Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Transcription, Genetic , Transcriptome
7.
Mol Cancer Res ; 11(6): 676-85, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23635402

ABSTRACT

The NCI-60 cell line set is likely the most molecularly profiled set of human tumor cell lines in the world. However, a critical missing component of previous analyses has been the inability to place the massive amounts of "-omic" data in the context of functional protein signaling networks, which often contain many of the drug targets for new targeted therapeutics. We used reverse-phase protein array (RPPA) analysis to measure the activation/phosphorylation state of 135 proteins, with a total analysis of nearly 200 key protein isoforms involved in cell proliferation, survival, migration, adhesion, etc., in all 60 cell lines. We aggregated the signaling data into biochemical modules of interconnected kinase substrates for 6 key cancer signaling pathways: AKT, mTOR, EGF receptor (EGFR), insulin-like growth factor-1 receptor (IGF-1R), integrin, and apoptosis signaling. The net activation state of these protein network modules was correlated to available individual protein, phosphoprotein, mutational, metabolomic, miRNA, transcriptional, and drug sensitivity data. Pathway activation mapping identified reproducible and distinct signaling cohorts that transcended organ-type distinctions. Direct correlations with the protein network modules involved largely protein phosphorylation data but we also identified direct correlations of signaling networks with metabolites, miRNA, and DNA data. The integration of protein activation measurements into biochemically interconnected modules provided a novel means to align the functional protein architecture with multiple "-omic" data sets and therapeutic response correlations. This approach may provide a deeper understanding of how cellular biochemistry defines therapeutic response. Such "-omic" portraits could inform rational anticancer agent screenings and drive personalized therapeutic approaches.


Subject(s)
Neoplasm Proteins/metabolism , Neoplasms/drug therapy , Neoplasms/metabolism , Proteomics/methods , Signal Transduction , Systems Biology , Cell Line, Tumor , Cluster Analysis , ErbB Receptors/metabolism , Humans , Integrins/metabolism , Models, Biological , Protein Array Analysis
8.
J Theor Biol ; 330: 1-8, 2013 Aug 07.
Article in English | MEDLINE | ID: mdl-23541620

ABSTRACT

New folds of protein structures emerge in evolution as a result of insertions, deletions or shuffling of fragments of underlying gene sequences, and from aggregated effects of point mutations. The result of these evolutionary processes is a rich and complex universe of protein sequences and structures, with characteristic features such as heavy-tailed distribution of fold occurrences, and a distinct shape of relationship between sequence identity and structure similarity. Better understanding of how the protein universe evolved to its present form can be achieved by creating models of protein structure evolution. Here we introduce a stochastic model of evolution that involves residue substitutions as the sole source of structure innovation, and is nonetheless able to reproduce the diversity of the protein domains repertoire, its cluster structure with heavy-tailed distribution of family sizes, and presence of the twilight zone populated with remote homologs.


Subject(s)
Evolution, Molecular , Models, Molecular , Point Mutation , Protein Conformation , Protein Folding , Proteins/chemistry , Proteins/genetics
9.
Adv Wound Care (New Rochelle) ; 2(9): 499-509, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24527361

ABSTRACT

OBJECTIVE: The wound healing process is well-understood on the cellular and tissue level; however, its complex molecular mechanisms are not yet uncovered in their entirety. Viewing wounds as perturbed molecular networks provides the tools for analyzing and optimizing the healing process. It helps to answer specific questions that lead to better understanding of the complexity of the process. What are the molecular pathways involved in wound healing? How do these pathways interact with each other during the different stages of wound healing? Is it possible to grasp the entire mechanism of regulatory interactions in the healing of a wound? APPROACH: Networks are structures composed of nodes connected by links. A network describing the state of a cell taking part in the healing process may contain nodes representing genes, proteins, microRNAs, metabolites, and drug molecules. The links connecting nodes represent interactions such as binding, regulation, co-expression, chemical reaction, and others. Both nodes and links can be weighted by numbers related to molecular concentration and the intensity of intermolecular interactions. Proceeding from data and from molecular profiling experiments, different types of networks are built to characterize the stages of the healing process. Network nodes having a higher degree of connectivity and centrality usually play more important roles for the functioning of the system they describe. RESULTS: We describe here the algorithms and software packages for building, manipulating and analyzing networks proceeding from information available from a literature or database search or directly extracted from experimental gene expression, metabolic, and proteomic data. Network analysis identifies genes/proteins most differentiated during the healing process, and their organization in functional pathways or modules, and their distribution into gene ontology categories of biological processes, molecular functions, and cellular localization. 
We provide an example of how network analysis can be used to reach better understanding of regulation of key wound healing mediators and microRNAs that regulate them. INNOVATION: Univariate statistical tests widely used in clinical studies are not enough to improve understanding and optimize the processes of wound healing. Network methods of analysis of patients' "omics" data, such as transcriptomes, proteomes, and others, can provide a better insight into the healing processes and help in development of better treatment practices. We review several articles that are examples of this emergent approach to the study of wound healing. CONCLUSION: Network analysis has the potential to considerably contribute to the better understanding of the molecular mechanisms of wound healing and to the discovery of means to control and optimize that process.
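The statement that highly connected, central nodes tend to matter most can be illustrated with degree centrality, the simplest such measure. The sketch below works on an unweighted, undirected edge list with hypothetical node names. The weighted nodes and links described in the abstract, and the software packages it reviews, are not modeled here.

```python
def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly linked to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical interaction network with one hub.
links = [("HUB", "N1"), ("HUB", "N2"), ("HUB", "N3"), ("N1", "N2")]
centrality = degree_centrality(links)
top_node = max(centrality, key=centrality.get)
```

Ranking nodes by such a score is one way to shortlist candidate regulators for closer study.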

10.
Proteins ; 80(7): 1780-90, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22434500

ABSTRACT

Inspection of structure changes in proteins borne by altering their sequences brings understanding of physics, functioning and evolution of existing proteins, and helps engineer modified ones. On single amino acid substitutions, the most frequent mutation type, shifts in backbone conformation are typically small, raising doubts if and how such minor modifications could drive evolutionary divergence. Here, we report that the distribution of magnitudes of structure change on such substitutions is heavy-tailed: whereas protein structures are robust to most substitutions, changes much larger than average occur with raised odds compared to what would be expected for exponential distribution with the same mean. This nonexponential behavior allows for reconciling the apparent contradiction between the observed conservation of protein structures and the substantial evolutionary plasticity implied in their diversity. The presence of the heavy tail in the distribution promotes structure divergence, facilitating exploration of new functionality, and conformations within folds, as well as exploration of structure space for new folds.
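The comparison with an exponential distribution of the same mean can be made concrete. For an exponential distribution with mean m, the probability of exceeding k times the mean is exp(-k), so an observed tail fraction well above exp(-k) indicates a heavier tail. The sketch below computes that ratio on made-up values; the data and the function name are assumptions and are not measurements from the paper.

```python
import math

def tail_ratio(values, multiple):
    """Observed P(X > multiple * mean) divided by the exponential prediction."""
    mean = sum(values) / len(values)
    cutoff = multiple * mean
    observed = sum(1 for v in values if v > cutoff) / len(values)
    expected = math.exp(-multiple)  # exponential with the same mean
    return observed / expected

# Hypothetical magnitudes of structure change: mostly small, a few large.
changes = [0.1] * 95 + [2.0] * 5
ratio = tail_ratio(changes, 5)
```

A ratio above 1 means large changes occur more often than an exponential model with the same mean would predict.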


Subject(s)
Point Mutation , Proteins/chemistry , Proteins/genetics , Amino Acid Substitution , Animals , Computational Biology , Computer Simulation , Databases, Protein , Evolution, Molecular , Humans , Models, Molecular , Protein Conformation
11.
Comb Chem High Throughput Screen ; 9(3): 213-28, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16533155

ABSTRACT

Virtual filtering and screening of combinatorial libraries have recently gained attention as methods complementing the high-throughput screening and combinatorial chemistry. These chemoinformatic techniques rely heavily on quantitative structure-activity relationship (QSAR) analysis, a field with established methodology and successful history. In this review, we discuss the computational methods for building QSAR models. We start with outlining their usefulness in high-throughput screening and identifying the general scheme of a QSAR model. Following, we focus on the methodologies in constructing three main components of QSAR model, namely the methods for describing the molecular structure of compounds, for selection of informative descriptors and for activity prediction. We present both the well-established methods as well as techniques recently introduced into the QSAR domain.


Subject(s)
Models, Molecular , Quantitative Structure-Activity Relationship
12.
J Chem Inf Model ; 46(1): 416-23, 2006.
Article in English | MEDLINE | ID: mdl-16426075

ABSTRACT

We propose a new classification method for the prediction of drug properties, called random feature subset boosting for linear discriminant analysis (LDA). The main novelty of this method is the ability to overcome the problems with constructing ensembles of linear discriminant models based on generalized eigenvectors of covariance matrices. Such linear models are popular in building classification-based structure-activity relationships. The introduction of ensembles of LDA models allows for an analysis of more complex problems than by using single LDA, for example, those involving multiple mechanisms of action. Using four data sets, we show experimentally that the method is competitive with other recently studied chemoinformatic methods, including support vector machines and models based on decision trees. We present an easy scheme for interpreting the model despite its apparent sophistication. We also outline theoretical evidence as to why, contrary to the conventional AdaBoost ensemble algorithm, this method is able to increase the accuracy of LDA models.


Subject(s)
Computer Simulation , Models, Chemical , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , ATP Binding Cassette Transporter, Subfamily B, Member 1/metabolism , Algorithms , Biological Transport, Active , Humans , Linear Models , Structure-Activity Relationship , Substrate Specificity
13.
Comput Methods Programs Biomed ; 81(1): 56-65, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16310282

ABSTRACT

The most frequent symptoms of ductal carcinoma recognised by mammography are clusters of microcalcifications. Their detection from mammograms is difficult, especially for glandular breasts. We present a new computer-aided detection system for small field digital mammography in planning of breast biopsy. The system processes the mammograms in several steps. First, we filter the original picture with a filter that is sensitive to microcalcification contrast shape. Then, we enhance the mammogram contrast by using wavelet-based sharpening algorithm. Afterwards, we present to radiologist, for visual analysis, such a contrast-enhanced mammogram with suggested positions of microcalcification clusters. We have evaluated the usefulness of the system with the help of four experienced radiologists, who found that it significantly improves the detection of microcalcifications in small field digital mammography.


Subject(s)
Breast Neoplasms/diagnosis , Calcinosis/pathology , Mammography/methods , Algorithms , Breast Diseases/diagnosis , Cluster Analysis , Diagnosis, Computer-Assisted , Humans , Image Processing, Computer-Assisted , ROC Curve , Radiographic Image Enhancement , Radiographic Image Interpretation, Computer-Assisted , Reproducibility of Results
14.
Comput Methods Programs Biomed ; 79(2): 135-49, 2005 Aug.
Article in English | MEDLINE | ID: mdl-15925425

ABSTRACT

We have employed two pattern recognition methods used commonly for face recognition in order to analyse digital mammograms. The methods are based on novel classification schemes, the AdaBoost and the support vector machines (SVM). A number of tests have been carried out to evaluate the accuracy of these two algorithms under different circumstances. Results for the AdaBoost classifier method are promising, especially for classifying mass-type lesions. In the best case the algorithm achieved accuracy of 76% for all lesion types and 90% for masses only. The SVM based algorithm did not perform as well. In order to achieve a higher accuracy for this method, we should choose image features that are better suited for analysing digital mammograms than the currently used ones.


Subject(s)
Breast/abnormalities , Mammography/methods , Pattern Recognition, Automated , Algorithms , Breast Neoplasms/diagnostic imaging , Female , Humans , Reproducibility of Results