Results 1 - 14 of 14
1.
J Assoc Inf Sci Technol ; 73(2): 225-239, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35873357

ABSTRACT

This article considers the interdisciplinary opportunities and challenges of working with digital cultural heritage, such as digitized historical newspapers, and proposes an integrated digital hermeneutics workflow to combine purely disciplinary research approaches from computer science, humanities, and library work. Common interests and motivations of the above-mentioned disciplines have resulted in interdisciplinary projects and collaborations such as the NewsEye project, which is working on novel solutions for how digital heritage data is (re)searched, accessed, used, and analyzed. We argue that collaborations of different disciplines can benefit from a good understanding of the workflows and traditions of each of the disciplines involved but must find integrated approaches to successfully exploit the full potential of digitized sources. The paper furthermore provides insight into digital tools, methods, and hermeneutics in action, showing that integrated interdisciplinary research needs to build something in between the disciplines while respecting and understanding each other's expertise and expectations.

2.
Phys Life Rev ; 34-35: 52-53, 2020 12.
Article in English | MEDLINE | ID: mdl-32595075
3.
Bioinformatics ; 35(24): 5385-5388, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31233141

ABSTRACT

SUMMARY: Biomine Explorer is a web application that enables interactive exploration of large heterogeneous biological networks constructed from selected publicly available biological knowledge sources. It is built on top of Biomine, a system which integrates cross-references from several biological databases into a large heterogeneous probabilistic network. Biomine Explorer offers user-friendly interfaces for search, visualization, exploration and manipulation as well as public and private storage of discovered subnetworks with permanent links suitable for inclusion into scientific publications. A JSON-based web API for network search queries is also available for advanced users. AVAILABILITY AND IMPLEMENTATION: Biomine Explorer is implemented as a web application, which is publicly available at https://biomine.ijs.si. Registration is not required but registered users can benefit from additional features such as private network repositories.


Subject(s)
Software , Databases, Factual , Internet
4.
IEEE J Biomed Health Inform ; 19(6): 1945-52, 2015 Nov.
Article in English | MEDLINE | ID: mdl-24691540

ABSTRACT

We present a method for measuring beat-to-beat heart rate from ballistocardiograms acquired with force sensors. First, a model for the heartbeat shape is adaptively inferred from the signal using hierarchical clustering. Then, beat-to-beat intervals are detected by finding positions where the heartbeat shape best fits the signal. The method was validated with overnight recordings from 46 subjects in varying setups (sleep clinic, home, single bed, double bed, two sensor types). The mean beat-to-beat interval error was 13 ms and on average 54% of the beat-to-beat intervals were detected. The method is part of a home-use e-health system for unobtrusive sleep measurement.
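
The second step described above, finding positions where a heartbeat shape best fits the signal, can be illustrated with a minimal template-matching sketch. It makes several assumptions that are not taken from the article: the template is supplied directly rather than inferred by hierarchical clustering, the fit is scored with a plain Pearson correlation, and the function names and the `threshold` parameter are hypothetical.

```python
import math


def normalized_correlation(window, template):
    """Pearson correlation between a signal window and the template."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A flat window has zero variance; treat it as "no fit".
    return num / den if den > 0 else 0.0


def detect_beats(signal, template, threshold=0.8):
    """Return start indices where the template fits best.

    A position counts as a beat when its correlation score is a local
    maximum and reaches the (assumed) threshold.
    """
    n = len(template)
    scores = [
        normalized_correlation(signal[i:i + n], template)
        for i in range(len(signal) - n + 1)
    ]
    beats = []
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else -1.0
        right = scores[i + 1] if i + 1 < len(scores) else -1.0
        if score >= threshold and score > left and score >= right:
            beats.append(i)
    return beats


def beat_to_beat_intervals_ms(beats, sampling_rate_hz):
    """Convert consecutive beat positions (in samples) to intervals in ms."""
    return [(b - a) * 1000.0 / sampling_rate_hz for a, b in zip(beats, beats[1:])]
```

With a real ballistocardiogram the template would have to adapt over time, which is the role the abstract assigns to the clustering step; this sketch covers only the matching step.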


Subject(s)
Ballistocardiography/methods , Heart Rate/physiology , Signal Processing, Computer-Assisted , Adult , Algorithms , Female , Humans , Male , Models, Statistical , Telemedicine , Young Adult
5.
PLoS One ; 9(3): e89618, 2014.
Article in English | MEDLINE | ID: mdl-24619061

ABSTRACT

There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution.


Subject(s)
Genome , Genomics , Algorithms , Archaea/genetics , Archaea/metabolism , Bacteria/genetics , Bacteria/metabolism , Biological Evolution , Enzymes/genetics , Enzymes/metabolism , Genome, Archaeal , Genome, Bacterial , Glycolysis , Isoenzymes , Metabolic Networks and Pathways , Models, Biological , Mutation
6.
BMC Bioinformatics ; 13: 119, 2012 Jun 06.
Article in English | MEDLINE | ID: mdl-22672646

ABSTRACT

BACKGROUND: Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. RESULTS: Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. CONCLUSIONS: The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. 
In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable conditions, Biomine can also perform well when no such information is available. The Biomine system is a proof of concept. Its current version contains 1.1 million entities and 8.1 million relations between them, with focus on human genetics. Some of its functionalities are available in a public query interface at http://biomine.cs.helsinki.fi, allowing searching for and visualizing connections between given biological entities.
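
The abstract bases link prediction on a proximity measure computed over the weighted graph, without naming the measures here. As one possible illustration, the sketch below assumes edge weights are probabilities in (0, 1] and scores a node pair by the probability of its best path, found with Dijkstra's algorithm on negative log weights. The measure choice, function names, and toy graph are assumptions for illustration, not Biomine's actual API or data.

```python
import heapq
import math


def best_path_probability(graph, source, target):
    """Probability of the most probable path between two nodes.

    graph maps each node to a dict {neighbor: edge probability in (0, 1]}.
    Maximizing a product of probabilities equals minimizing the sum of
    -log(p), so ordinary Dijkstra applies.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            return math.exp(-d)
        for neighbor, p in graph.get(node, {}).items():
            candidate = d - math.log(p)
            if candidate < dist.get(neighbor, math.inf):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return 0.0  # target unreachable


def rank_candidates(graph, query, candidates):
    """Order candidate nodes by decreasing proximity to the query node."""
    return sorted(
        candidates,
        key=lambda c: best_path_probability(graph, query, c),
        reverse=True,
    )


# Hypothetical toy graph; node names and weights are made up.
toy_graph = {
    "geneA": {"proteinP": 0.9, "diseaseD": 0.5},
    "proteinP": {"geneA": 0.9, "diseaseD": 0.8},
    "diseaseD": {"geneA": 0.5, "proteinP": 0.8},
    "geneB": {},
}
```

In this toy graph the indirect path from `geneA` to `diseaseD` through `proteinP` (0.9 × 0.8) outscores the direct edge (0.5), which shows why integrating several edge types can reveal stronger connections than any single source.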


Subject(s)
Databases, Genetic/statistics & numerical data , Genetic Association Studies/methods , Genetic Predisposition to Disease , Software , Algorithms , Humans , Kruppel-Like Factor 6 , Kruppel-Like Transcription Factors/genetics , Phosphatidylinositol 3-Kinases/genetics , Proteins/genetics , Proteins/metabolism , Proto-Oncogene Proteins/genetics , Stomach Neoplasms/genetics
7.
Article in English | MEDLINE | ID: mdl-23366752

ABSTRACT

We describe an online sleep monitoring service, based on unobtrusive ballistocardiography (BCG) measurement in an ordinary bed. The novelty of the system is that the sleep tracking web application is based on measurements from a fully unobtrusive sensor. The BCG signal is measured with a piezoelectric film sensor under the mattress topper, and sent to the web server for analysis. Heart rate and respiratory variation, activity, sleep stages, and stress reactions are inferred based on the signal. The sleep information is presented to the user along with measurements of the sleeping environment (temperature, noise, luminosity) and user-logged tags (e.g. stress, alcohol, exercise). The approach is designed for long-term use at home, allowing users to follow the development of their sleep over months and years. The service also has a medical use, as sleep disorder patients can be measured for long periods before and after interventions.


Subject(s)
Monitoring, Physiologic/methods , Online Systems , Sleep/physiology , Ballistocardiography , Heart Rate/physiology , Humans , Internet , Respiration , Signal Processing, Computer-Assisted , Telemetry , Time Factors
8.
BMC Bioinformatics ; 12: 416, 2011 Oct 26.
Article in English | MEDLINE | ID: mdl-22029475

ABSTRACT

BACKGROUND: In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. RESULTS: We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. CONCLUSIONS: Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.


Subject(s)
Algorithms , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis/methods , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Software , Adipose Tissue/pathology , Autophagy , Cellular Senescence , Humans , Mesenchymal Stem Cells/pathology , Stem Cells/pathology , Workflow
9.
J Biol Chem ; 286(21): 18375-82, 2011 May 27.
Article in English | MEDLINE | ID: mdl-21324892

ABSTRACT

The expression levels of caspase-8 inhibitory c-FLIP proteins play an important role in regulating death receptor-mediated apoptosis, as their concentration at the moment when the death-inducing signaling complex (DISC) is formed determines the outcome of the DISC signal. Experimental studies have shown that c-FLIP proteins are subject to dynamic turnover and that their stability and expression levels can be rapidly altered. Even though the influence of c-FLIP on the apoptotic behavior of a single cell has been captured in mathematical simulation studies, the effect of c-FLIP turnover and stability has not been investigated. In this study, a mathematical model of apoptosis was developed to analyze how the dynamic turnover and stability of the c-FLIP isoforms regulate apoptotic signaling for both individual cells and cell populations. Intercellular parameter and concentration distributions were used to describe the behavior of cell populations. Monte-Carlo simulations of cell populations showed that c-FLIP turnover is a key determinant of death receptor responses. The fact that the developed model simulates the state of whole cell populations makes it possible to validate it by comparison with empirical data. The proposed modeling approach can be used to further determine limiting factors in the DISC signaling process.


Subject(s)
Apoptosis/physiology , CASP8 and FADD-Like Apoptosis Regulating Protein/metabolism , Cell Communication/physiology , Models, Biological , Signal Transduction/physiology , fas Receptor/metabolism , CASP8 and FADD-Like Apoptosis Regulating Protein/genetics , Cell Line, Tumor , Humans , Monte Carlo Method , fas Receptor/genetics
10.
BMC Bioinformatics ; 8 Suppl 2: S9, 2007 May 03.
Article in English | MEDLINE | ID: mdl-17493258

ABSTRACT

BACKGROUND: Haplotype Reconstruction is the problem of resolving the hidden phase information in genotype data obtained from laboratory measurements. Solving this problem is an important intermediate step in gene association studies, which seek to uncover the genetic basis of complex diseases. We propose a novel approach for haplotype reconstruction based on constrained hidden Markov models. Models are constructed by incrementally refining and regularizing the structure of a simple generative model for genotype data under Hardy-Weinberg equilibrium. RESULTS: The proposed method is evaluated on real-world and simulated population data. Results show that it is competitive with other recently proposed methods in terms of reconstruction accuracy, while offering a particularly good trade-off between computational costs and quality of results for large datasets. CONCLUSION: Relatively simple probabilistic approaches for haplotype reconstruction based on structured hidden Markov models are competitive with more complex, well-established techniques in this field.
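
For orientation, the Hardy-Weinberg assumption mentioned above is commonly written as follows. The notation is introduced here for illustration and is not taken from the article: $g$ is an unphased genotype, $h$ and $h'$ are haplotypes, $h \oplus h' = g$ means the pair is compatible with $g$, and $\theta$ stands for the parameters of whatever haplotype model is used (a constrained hidden Markov model in the article).

```latex
P(g \mid \theta) = \sum_{(h, h') \,:\, h \oplus h' = g} P(h \mid \theta)\, P(h' \mid \theta),
\qquad
(\hat{h}, \hat{h}') = \arg\max_{(h, h') \,:\, h \oplus h' = g} P(h \mid \theta)\, P(h' \mid \theta)
```

The first expression says that the two haplotypes of an individual are drawn independently from the population; the second selects the most probable compatible pair as the reconstruction.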


Subject(s)
Artificial Intelligence , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Genetics, Population , Models, Genetic , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Algorithms , Base Sequence , Genetic Linkage/genetics , Haplotypes , Markov Chains , Models, Statistical , Molecular Sequence Data , Polymorphism, Single Nucleotide/genetics
11.
BMC Bioinformatics ; 7: 542, 2006 Dec 22.
Article in English | MEDLINE | ID: mdl-17187677

ABSTRACT

BACKGROUND: Haplotypes extracted from human DNA can be used for gene mapping and other analysis of genetic patterns within and across populations. A fundamental problem is, however, that current practical laboratory methods do not give haplotype information. Estimation of phased haplotypes of unrelated individuals given their unphased genotypes is known as the haplotype reconstruction or phasing problem. RESULTS: We define three novel statistical models and give an efficient algorithm for haplotype reconstruction, jointly called HaploRec. HaploRec is based on exploiting local regularities conserved in haplotypes: it reconstructs haplotypes so that they have maximal local coherence. This approach--not assuming statistical dependence for remotely located markers--has two useful properties: it is well-suited for sparse marker maps, such as those used in gene mapping, and it can actually take advantage of long maps. CONCLUSION: Our experimental results with simulated and real data show that HaploRec is a powerful method for the large scale haplotyping needed in association studies. With sample sizes large enough for gene mapping it appeared to be the best compared to all other tested methods (Phase, fastPhase, PL-EM, Snphap, Gerbil; simulated data), with small samples it was competitive with the best available methods (real data). HaploRec is several orders of magnitude faster than Phase and comparable to the other methods; the running times are roughly linear in the number of subjects and the number of markers. HaploRec is publicly available at http://www.cs.helsinki.fi/group/genetics/haplotyping.html.
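
The phasing problem stated above can be made concrete by enumerating the haplotype pairs that are compatible with one unphased genotype. The sketch below is a plain enumeration and not the HaploRec algorithm; the function name and data layout (one unordered allele pair per marker) are assumptions for illustration.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with a genotype.

    genotype is a list with one 2-tuple of alleles per marker. The order
    inside each tuple carries no phase information.
    """
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    pairs = set()
    for flips in product((False, True), repeat=len(het_sites)):
        h1 = [a for a, _ in genotype]
        h2 = [b for _, b in genotype]
        # Swapping alleles at a heterozygous site yields another phasing.
        for site, flip in zip(het_sites, flips):
            if flip:
                h1[site], h2[site] = h2[site], h1[site]
        # Sort the two haplotypes so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)
```

A genotype with k heterozygous markers has 2^(k-1) distinct phasings for k ≥ 1, so exhaustive enumeration quickly becomes infeasible on long marker maps. This is why statistical models of haplotype frequencies are needed to pick the most plausible pair.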


Subject(s)
Haplotypes/genetics , Models, Genetic , Software , Chromosome Mapping/methods , Databases, Genetic , Humans , Linkage Disequilibrium/genetics , Markov Chains
12.
Article in English | MEDLINE | ID: mdl-17048403

ABSTRACT

We describe TreeDT, a novel association-based gene mapping method. Given a set of disease-associated haplotypes and a set of control haplotypes, TreeDT predicts likely locations of a disease susceptibility gene. TreeDT extracts, essentially in the form of haplotype trees, information about historical recombinations in the population: A haplotype tree constructed at a given chromosomal location is an estimate of the genealogy of the haplotypes. TreeDT constructs these trees for all locations on the given haplotypes and performs a novel disequilibrium test on each tree: Is there a small set of subtrees with relatively high proportions of disease-associated chromosomes, suggesting shared genetic history for those and a likely disease gene location? We give a detailed description of TreeDT and the tree disequilibrium tests, we analyze the algorithm formally, and we evaluate its performance experimentally on both simulated and real data sets. Experimental results demonstrate that TreeDT has high accuracy on difficult mapping tasks and comparisons to other methods (EATDT, HPM, TDT) show that TreeDT is very competitive.


Subject(s)
Chromosome Mapping/methods , Computational Biology/methods , Genetic Predisposition to Disease/genetics , Pedigree , Algorithms , Computer Simulation , Diabetes Mellitus, Type 1/genetics , Haplotypes/genetics , Humans , Linkage Disequilibrium/genetics , Recombination, Genetic/genetics , Statistics, Nonparametric
13.
Hum Genomics ; 2(5): 336-40, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16595078

ABSTRACT

Data mining methods are gaining more interest as potential tools in mapping and identification of complex disease loci. The methods are well suited to large numbers of genetic marker loci produced by high-throughput laboratory analyses, but also might be useful for clarifying the phenotype definitions prior to more traditional mapping analyses. Here, the current data mining-based methods for linkage disequilibrium mapping and phenotype analyses are reviewed.


Subject(s)
Chromosome Mapping/methods , Linkage Disequilibrium , Software , Gene Frequency , Humans
14.
Bioinformatics ; 19(10): 1183-93, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12835260

ABSTRACT

MOTIVATION: The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology Challenge (PTC) competition was organized to test the state-of-the-art in applying machine learning to form such predictive models. RESULTS: Fourteen machine learning groups generated 111 models. The use of Receiver Operating Characteristic (ROC) space allowed the models to be uniformly compared regardless of the error cost function. We developed a statistical method to test if a model performs significantly better than random in ROC space. Using this test as the criterion, five models performed better than random guessing at a significance level p of 0.05 (not corrected for multiple testing). Statistically the best predictor was the Viniti model for female mice, with p value below 0.002. The toxicologically most interesting models were Leuven2 for male mice, and Kwansei for female rats. These models performed well in the statistical analysis and they are in the middle of ROC space, i.e. distant from extreme cost assumptions. These predictive models were also independently judged by domain experts to be among the three most interesting, and are believed to include a small but significant amount of empirically learned toxicological knowledge. AVAILABILITY: PTC details and data can be found at: http://www.predictive-toxicology.org/ptc/.
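
To make the ROC-space comparison concrete, the sketch below places a classifier in ROC space from its confusion-matrix counts and measures how far it lies above the diagonal of random guessing. The abstract does not specify the article's own significance test, so a one-sided hypergeometric (Fisher-style) tail probability is used here as a stand-in; all function names are hypothetical.

```python
import math


def roc_point(tp, fn, fp, tn):
    """Return (false positive rate, true positive rate) for one classifier."""
    return fp / (fp + tn), tp / (tp + fn)


def height_above_diagonal(tp, fn, fp, tn):
    """TPR minus FPR: zero for random guessing, one for a perfect model."""
    fpr, tpr = roc_point(tp, fn, fp, tn)
    return tpr - fpr


def better_than_random_p_value(tp, fn, fp, tn):
    """One-sided hypergeometric tail probability (stand-in test).

    Probability of seeing at least tp true positives if the same number of
    positive predictions were assigned to the compounds at random.
    """
    total = tp + fn + fp + tn
    actual_pos = tp + fn
    predicted_pos = tp + fp
    denom = math.comb(total, predicted_pos)
    upper = min(actual_pos, predicted_pos)
    tail = sum(
        math.comb(actual_pos, k) * math.comb(total - actual_pos, predicted_pos - k)
        for k in range(tp, upper + 1)
    )
    return tail / denom
```

Because a point in ROC space does not depend on the assumed error costs, models tuned for different cost functions can be compared on the same plot, which is the property the abstract relies on.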


Subject(s)
Artificial Intelligence , Carcinogenicity Tests/methods , Carcinogens/chemistry , Carcinogens/toxicity , Models, Biological , Models, Statistical , Neoplasms/chemically induced , Risk Assessment/methods , Algorithms , Animals , Data Collection , Databases, Factual , Environmental Exposure/adverse effects , Female , Government Programs/organization & administration , Male , Mice , Rats , Reproducibility of Results , Sensitivity and Specificity , Sex Factors , Species Specificity , Structure-Activity Relationship , Toxicology/methods , United States