Results 1 - 8 of 8
1.
Chinese Journal of Experimental Traditional Medical Formulae ; (24): 114-122, 2023.
Article in Chinese | WPRIM | ID: wpr-975163

Abstract

Objective: To achieve high-dimensional prediction of class-imbalanced adverse drug reactions (ADR) of traditional Chinese medicine (TCM), and to classify and identify the risk factors affecting the occurrence of ADR, based on post-marketing safety data of TCM collected through centralized monitoring in real-world hospitals. Method: Ensemble clustering resampling combined with regularized Group Lasso regression was used to balance the high-dimensional, class-imbalanced ADR data; the balanced datasets were then integrated to predict ADR and to identify risk factors by category. Result: In a practical example applying the proposed method to monitoring data for a TCM injection, the accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of the ADR prediction were all above 0.8 on the test set. Meanwhile, 40 risk factors affecting the occurrence of ADR were screened out from a total of 600 high-dimensional variables, and the effect of the risk factors on the occurrence of ADR was identified by classification weighting. The important risk factors fell into the following categories: past history, medication information, names of combined drugs, disease status, number of combined drugs, and personal data. Conclusion: For real-world data on rare ADR with a large number of clinical variables, this paper achieved accurate ADR prediction under high-dimensional, class-imbalanced conditions, and classified and identified the key risk factors and the clinical significance of their categories, thereby providing early risk warning for rational clinical drug use and combined drug use, as well as a scientific basis for the post-marketing safety reevaluation of TCM.
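
The balancing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: plain random undersampling is swapped in for their ensemble clustering resampling, and the function name `balanced_subsets`, the 0/1 label coding, and the toy data are all assumptions made for illustration. The sketch only shows how a rare-event dataset can be split into several balanced subsets, each of which could then be passed to a penalized regression such as Group Lasso.

```python
import random


def balanced_subsets(labels, n_subsets, seed=0):
    """Build index lists that pair every minority case (label 1) with an
    equally sized random draw of majority cases (label 0).

    Hypothetical helper: random undersampling stands in for the
    clustering-based resampling named in the abstract.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    subsets = []
    for _ in range(n_subsets):
        # Draw without replacement so each subset is exactly balanced.
        drawn = rng.sample(majority, len(minority))
        subsets.append(sorted(minority + drawn))
    return subsets


# Toy data (assumed): 5 ADR cases among 100 monitored patients.
labels = [1] * 5 + [0] * 95
subsets = balanced_subsets(labels, n_subsets=10)
```

Each subset keeps all rare cases, so no ADR information is discarded, while the repeated draws let an ensemble see most of the majority class across subsets.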

2.
Journal of Xi'an Jiaotong University(Medical Sciences) ; (6): 628-632, 2021.
Article in Chinese | WPRIM | ID: wpr-1006702

Abstract

【Objective】 To compare the performance of five commonly used variable selection methods in screening variables from high-dimensional biomedical data, to explore the effects of sample size and of association among candidate variables on the screening results, and to provide evidence for developing variable selection strategies in high-dimensional biomedical data analysis. 【Methods】 The variable selection algorithms were implemented in the R programming language. The Monte Carlo method was used to simulate high-dimensional biomedical data under different conditions in order to evaluate and compare the performance of the variable selection methods. Performance was evaluated by the true positive rate and the true negative rate of screening. 【Results】 For the specified high-dimensional data, the variable selection performance of all methods improved as the sample size increased, and the association between candidate variables did affect the screening results. The simulation results indicated that the elastic net algorithm yielded the best screening performance, the LASSO algorithm came second, and the ridge algorithm did not work at all. 【Conclusion】 The elastic net algorithm is an ideal variable screening method for high-dimensional data.
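
The two evaluation metrics named in this abstract, and one reason a ridge penalty can fail as a screening tool, can be shown in a small sketch. It is an illustration under stated assumptions, not the study's R code: it assumes an orthonormal design, where the lasso estimate is a soft-thresholded least-squares estimate and the ridge estimate (with the penalty scaled as λ/2 times the squared norm) is the least-squares estimate divided by 1 + λ. The estimates in `ols`, the penalty `lam`, and all function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Lasso coefficient for one variable under an orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Ridge coefficient under the same design: shrunk, never exactly zero."""
    return z / (1.0 + lam)


def screening_rates(selected, truth):
    """Return (true positive rate, true negative rate) of a selection."""
    tp = sum(s and t for s, t in zip(selected, truth))
    tn = sum((not s) and (not t) for s, t in zip(selected, truth))
    positives = sum(truth)
    negatives = len(truth) - positives
    return tp / positives, tn / negatives


# Hypothetical least-squares estimates; only the first two are true signals.
ols = [2.0, -1.5, 0.3, -0.2, 0.1]
truth = [True, True, False, False, False]
lam = 0.5

lasso_selected = [soft_threshold(z, lam) != 0.0 for z in ols]
ridge_selected = [ridge_shrink(z, lam) != 0.0 for z in ols]

lasso_rates = screening_rates(lasso_selected, truth)  # (1.0, 1.0)
ridge_rates = screening_rates(ridge_selected, truth)  # (1.0, 0.0)
```

In this toy case the lasso zeroes out the three noise variables, while ridge keeps every variable, so its true negative rate is zero. Ridge shrinks coefficients without setting any to exactly zero, which is consistent with the abstract's finding that it did not work for screening.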

3.
Korean Journal of Nuclear Medicine ; : 99-108, 2018.
Article in English | WPRIM | ID: wpr-786980

Abstract

Radiomics uses high-dimensional imaging data to discover associations with diagnostic, prognostic, or predictive endpoints, or with genomic data (radiogenomics). It is an emerging field of study that can potentially depict intratumoral heterogeneity from quantitative and classified high-throughput data. The radiomics approach follows an analytic pipeline in which imaging features are extracted, processed, and analyzed. At this point, special data handling is essential because, unlike a single-biomarker approach, radiomics faces the issues that come with high-dimensional biomarkers. This article describes the potential role of radiomics in oncologic studies, the basic analytic pipeline, and the special handling that high-dimensional data requires, in order to facilitate the radiomics approach as a tool for personalized medicine in oncology.


Subject(s)
Population Characteristics, Precision Medicine
4.
Journal of Zhejiang University. Science. B ; (12): 935-947, 2018.
Article in English | WPRIM | ID: wpr-1010434

Abstract

OBJECTIVE: As one of the most popular designs used in genetic research, the family-based design is well recognized for its advantages, such as robustness against population stratification and admixture. With vast amounts of genetic data collected from family-based studies, there is great interest in studying the role of genetic markers in risk prediction. This study aims to develop a new statistical approach for family-based risk prediction analysis with improved prediction accuracy compared with existing methods based on family history. METHODS: In this study, we propose an ensemble-based likelihood ratio (ELR) approach, Fam-ELR, for family-based genomic risk prediction. Fam-ELR incorporates a clustered receiver operating characteristic (ROC) curve method to account for correlations among family samples, and uses a computationally efficient tree-assembling procedure for variable selection and model building. RESULTS: In simulations, Fam-ELR is robust across various underlying disease models and pedigree structures, and attains better performance than two existing family-based risk prediction methods. In a real-data application to a family-based genome-wide dataset of conduct disorder, Fam-ELR demonstrates its ability to integrate potential risk predictors and interactions into the model for improved accuracy, especially at the genome-wide level. CONCLUSIONS: Compared with existing approaches, such as the genetic risk-score approach, Fam-ELR is able to incorporate genetic variants with small or moderate marginal effects, and their interactions, into an improved risk prediction model. It is therefore a robust and useful approach for high-dimensional family-based risk prediction, especially for complex diseases whose etiology is unknown or poorly understood.


Subject(s)
Female, Humans, Male, Area Under Curve, Computer Simulation, Conduct Disorder/physiopathology, Family Health, Genetic Markers, Genetic Predisposition to Disease, Genetic Variation, Human Genome, Genome-Wide Association Study, Genomics, Likelihood Functions, Genetic Models, Odds Ratio, Pedigree, ROC Curve, Reproducibility of Results, Risk Factors
5.
Chinese Journal of Epidemiology ; (12): 679-683, 2017.
Article in Chinese | WPRIM | ID: wpr-737706

Abstract

With the rapid development of genome sequencing technology and bioinformatics in recent years, it has become possible to measure thousands of omics variables that might be associated with the progression of diseases, i.e., "high-dimensional data". A common feature of this type of omics data is that the number of variables p is usually greater than the number of observations n, and the independent variables are often highly correlated. It is therefore a great statistical challenge to identify the truly meaningful variables in omics data. This paper summarizes the methods of Bayesian variable selection used in the analysis of high-dimensional data.

6.
Chinese Journal of Epidemiology ; (12): 679-683, 2017.
Article in Chinese | WPRIM | ID: wpr-736238

Abstract

With the rapid development of genome sequencing technology and bioinformatics in recent years, it has become possible to measure thousands of omics variables that might be associated with the progression of diseases, i.e., "high-dimensional data". A common feature of this type of omics data is that the number of variables p is usually greater than the number of observations n, and the independent variables are often highly correlated. It is therefore a great statistical challenge to identify the truly meaningful variables in omics data. This paper summarizes the methods of Bayesian variable selection used in the analysis of high-dimensional data.

7.
Genomics & Informatics ; : 129-132, 2006.
Article in English | WPRIM | ID: wpr-61948

Abstract

Toxicogenomics has recently emerged in the field of toxicology, and the DNA microarray technique has become a common strategy in predictive toxicology, which studies the molecular mechanisms triggered by exposure to chemical or environmental stress. Although microarray experiments offer researchers extensive genomic information, the high-dimensional character of the data often makes it hard to extract meaningful results. We therefore developed toxicant enrichment analysis, which is similar to the common enrichment approach. We also developed a web-based system, graPT, to enable prediction of the toxic endpoints of experimental chemicals.


Subject(s)
Oligonucleotide Array Sequence Analysis, Toxicogenetics, Toxicology
8.
Genomics & Informatics ; : 65-74, 2003.
Article in English | WPRIM | ID: wpr-197484

Abstract

Data mining differs from traditional data analysis primarily on one important dimension, namely the scale of the data. That is why principles from computer science, and not only from statistics, are needed to extract information from large data sets. In this paper we briefly review data mining, its characteristics, typical data mining algorithms, and potential and ongoing applications of data mining in the biopharmaceutical industry. The distinguishing characteristics of data mining lie in its understandability, its scalability, its problem-driven nature, and its analysis of retrospective or observational data, in contrast to data from designed experiments. At a high level, one can identify three types of problems for which data mining is useful: description, prediction, and search. The brief review of data mining algorithms covers decision trees and rules, nonlinear classification methods, memory-based methods, model-based clustering, and graphical dependency models. The application areas covered are discovery compound libraries, clinical trial and disease management data, genomics and proteomics, structural databases for candidate drug compounds, and other applications of pharmaceutical relevance.


Subject(s)
Classification, Data Mining, Dataset, Decision Trees, Disease Management, Drug Discovery, Genomics, Proteomics, Retrospective Studies, Statistics as Topic