1.
Commun Med (Lond) ; 3(1): 139, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37803172

ABSTRACT

BACKGROUND: Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but is non-trivial. Missing data is found in most real-world datasets and these missing values are typically imputed using established methods, followed by classification of the now complete samples. The focus of the machine learning researcher is to optimise the classifier's performance. METHODS: We utilise three simulated and three real-world clinical datasets with different feature types and missingness patterns. Initially, we evaluate how the downstream classifier performance depends on the choice of classifier and imputation methods. We employ ANOVA to quantitatively evaluate how the choice of missingness rate, imputation method, and classifier method influences the performance. Additionally, we compare commonly used methods for assessing imputation quality and introduce a class of discrepancy scores based on the sliced Wasserstein distance. We also assess the stability of the imputations and the interpretability of models built on the imputed data. RESULTS: The performance of the classifier is most affected by the percentage of missingness in the test data, with a considerable performance decline observed as the test missingness rate increases. We also show that the commonly used measures for assessing imputation quality tend to lead to imputed data which poorly matches the underlying data distribution, whereas our new class of discrepancy scores performs much better on this measure. Furthermore, we show that the interpretability of classifier models trained using poorly imputed data is compromised. CONCLUSIONS: It is imperative to consider the quality of the imputation when performing downstream classification as the effects on the classifier can be considerable.


Many artificial intelligence (AI) methods aim to classify samples of data into groups, e.g., patients with disease vs. those without. This often requires datasets to be complete, i.e., that all data has been collected for all samples. However, in clinical practice this is often not the case and some data can be missing. One solution is to 'complete' the dataset using a technique called imputation to replace those missing values. However, assessing how well the imputation method performs is challenging. In this work, we demonstrate why people should care about imputation, develop a new method for assessing imputation quality, and demonstrate that if we build AI models on poorly imputed data, the model can give different results to those we would hope for. Our findings may improve the utility and quality of AI models in the clinic.
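The abstract above mentions discrepancy scores based on the sliced Wasserstein distance but does not define them. As a rough illustration only, the sketch below estimates a generic sliced 1-Wasserstein distance between two equally sized samples: it projects both samples onto random directions, sorts the projections, and averages the absolute differences. The function names, the number of projections, and the equal-sample-size restriction are assumptions made for this sketch; it is not the authors' scoring method.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere (normalised Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sliced_wasserstein(a, b, n_projections=200, seed=0):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance.

    `a` and `b` are lists of equal length whose rows are feature vectors.
    Assumption of this sketch: both samples have the same number of rows,
    so the 1D distance reduces to comparing sorted projections pairwise.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equally sized samples")
    rng = random.Random(seed)
    dim = len(a[0])
    total = 0.0
    for _ in range(n_projections):
        direction = random_unit_vector(dim, rng)
        # Project every row onto the random direction and sort the results.
        proj_a = sorted(sum(x * w for x, w in zip(row, direction)) for row in a)
        proj_b = sorted(sum(x * w for x, w in zip(row, direction)) for row in b)
        # 1D Wasserstein-1 distance between two equally sized samples.
        total += sum(abs(x - y) for x, y in zip(proj_a, proj_b)) / len(proj_a)
    return total / n_projections
```

In an imputation setting, `a` could hold fully observed rows and `b` the corresponding imputed rows; a smaller value would suggest that the imputed data follow the original distribution more closely.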

3.
Br J Cancer ; 125(6): 884-892, 2021 09.
Article in English | MEDLINE | ID: mdl-34168297

ABSTRACT

BACKGROUND: This study investigates whether quantitative breast density (BD) serves as an imaging biomarker for more intensive breast cancer screening by predicting interval and node-positive cancers. METHODS: This case-control study of 1204 women aged 47-73 includes 599 cancer cases (302 screen-detected, 297 interval; 239 node-positive, 360 node-negative) and 605 controls. Automated BD software calculated fibroglandular volume (FGV), volumetric breast density (VBD) and density grade (DG). A radiologist assessed BD using a visual analogue scale (VAS) from 0 to 100. Logistic regression and area under the receiver operating characteristic curves (AUC) determined whether BD could predict mode of detection (screen-detected or interval); node-negative cancers; node-positive cancers; and all cancers vs. controls. RESULTS: FGV, VBD, VAS, and DG all discriminated interval cancers (all p < 0.01) from controls. Only FGV-quartile discriminated screen-detected cancers (p < 0.01). Based on AUC, FGV discriminated all cancer types better than VBD or VAS. FGV showed a significantly greater discrimination of interval cancers, AUC = 0.65, than of screen-detected cancers, AUC = 0.61 (p < 0.01), as did VBD (0.63 and 0.53, respectively, p < 0.001). CONCLUSION: FGV, VBD, VAS and DG discriminate interval cancers from controls, reflecting some masking risk. Only FGV discriminates screen-detected cancers, perhaps adding a unique component of breast cancer risk.
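The AUC values quoted in the abstract above can be read as the probability that a randomly chosen case has a higher density score than a randomly chosen control. The sketch below computes that quantity directly from pairwise comparisons, counting ties as half. The function name and inputs are hypothetical, and this is only an illustration of what an AUC measures; the study itself reports using logistic regression and ROC curves.

```python
def pairwise_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Equivalent to the normalised Mann-Whitney U statistic.
    Ties between a case and a control count as half a correct ranking.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which puts the reported values of 0.53 to 0.65 in the modest range.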


Subject(s)
Breast Density , Breast Neoplasms/diagnostic imaging , Mammography/methods , Aged , Case-Control Studies , Early Detection of Cancer , Female , Humans , Middle Aged , Randomized Controlled Trials as Topic , Visual Analog Scale
4.
Urology ; 149: e37-e39, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33129874

ABSTRACT

In most cases an ectopic ureter is associated with a duplicated renal collecting system, while in only a few cases is a single system found. Bilateral single system ureteral ectopia is even rarer. A 9-year-old girl presented with urinary incontinence. Investigations pointed towards bilateral single system ectopic ureters with ectopic openings into the vagina and a hypoplastic bladder. The left ureteric system was tortuous with a malrotated and hypoplastic left kidney. A 4 × 2 cm hard calculus was found in the vagina. Right ureteric reimplantation with left-to-right uretero-ureterostomy was done with satisfactory postoperative daytime continence at 6 months without the need for bladder reconstruction or urinary diversion.


Subject(s)
Abnormalities, Multiple , Ureter/abnormalities , Vagina/abnormalities , Abnormalities, Multiple/classification , Child , Female , Humans , Ureter/pathology
5.
Nat Rev Drug Discov ; 12(1): 35-50, 2013 01.
Article in English | MEDLINE | ID: mdl-23274470

ABSTRACT

Selecting the best targets is a key challenge for drug discovery, and achieving this effectively, efficiently and systematically is particularly important for prioritizing candidates from the sizeable lists of potential therapeutic targets that are now emerging from large-scale multi-omics initiatives, such as those in oncology. Here, we describe an objective, systematic, multifaceted computational assessment of biological and chemical space that can be applied to any human gene set to prioritize targets for therapeutic exploration. We use this approach to evaluate an exemplar set of 479 cancer-associated genes, reveal the tension between biological relevance and chemical tractability, and describe major gaps in available knowledge that could be addressed to aid objective decision-making. We also propose drug repurposing opportunities and identify potentially druggable cancer-associated proteins that have been poorly explored with regard to the discovery of small-molecule modulators, despite their biological relevance.


Subject(s)
Antineoplastic Agents/pharmacology , Drug Discovery/methods , Molecular Targeted Therapy , Neoplasms/drug therapy , Decision Making , Drug Design , Humans , Neoplasms/genetics , Neoplasms/pathology
6.
Nucleic Acids Res ; 40(Database issue): D947-56, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22013161

ABSTRACT

canSAR is a fully integrated cancer research and drug discovery resource developed to utilize the growing publicly available biological annotation, chemical screening, RNA interference screening, expression, amplification and 3D structural data. Scientists can, in a single place, rapidly identify biological annotation of a target, its structural characterization, expression levels and protein interaction data, as well as suitable cell lines for experiments, potential tool compounds and similarity to known drug targets. canSAR has, from the outset, been completely use-case driven, which has dramatically influenced the design of the back-end and the functionality provided through the interfaces. The Web interface at http://cansar.icr.ac.uk provides flexible, multipoint entry into canSAR. This allows easy access to the multidisciplinary data within, including target and compound synopses, bioactivity views and expert tools for chemogenomic, expression and protein interaction network data.


Subject(s)
Antineoplastic Agents/chemistry , Databases, Genetic , Neoplasms/genetics , Neoplasms/metabolism , Antineoplastic Agents/pharmacology , Cell Line, Tumor , Drug Discovery , Gene Expression , Genetic Variation , Humans , Internet , Models, Molecular , Protein Interaction Maps , RNA Interference , Systems Integration , Translational Research, Biomedical
7.
Biophys J ; 99(7): 2190-9, 2010 Oct 06.
Article in English | MEDLINE | ID: mdl-20923653

ABSTRACT

Microtubules are supramolecular structures that make up the cytoskeleton and strongly affect the mechanical properties of the cell. Among the cytoskeletal filaments, the microtubule (MT) exhibits by far the highest bending stiffness. Bending stiffness depends on the mechanical properties and intermolecular interactions of the tubulin dimers (the MT building blocks). Computational molecular modeling has the potential for obtaining quantitative insights into this area. However, to our knowledge, standard molecular modeling techniques, such as molecular dynamics (MD) and normal mode analysis (NMA), are not yet able to simulate large molecular structures like the MTs; in fact, their possibilities are normally limited to much smaller protein complexes. In this work, we developed a multiscale approach by merging the modeling contribution from MD and NMA. In particular, MD simulations were used to refine the molecular conformation and arrangement of the tubulin dimers inside the MT lattice. Subsequently, NMA was used to investigate the vibrational properties of MTs modeled as an elastic network. The coarse-grain model developed here can describe systems of hundreds of interacting tubulin monomers (corresponding to up to 1,000,000 atoms). In particular, we were able to simulate coarse-grain models of entire MTs, with lengths up to 350 nm. A quantitative mechanical investigation was performed; from the bending and stretching modes, we estimated MT macroscopic properties such as bending stiffness, Young's modulus, and persistence length, thus allowing a direct comparison with experimental data.
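The three macroscopic quantities named at the end of the abstract above are linked by standard continuum relations: for a hollow cylinder, bending stiffness is EI with I = (pi/4)(r_outer^4 - r_inner^4), and persistence length is EI / (k_B T). The sketch below shows the arithmetic only. The Young's modulus, radii, and temperature in the usage lines are illustrative assumptions, not values reported in the paper, and treating the MT as a uniform isotropic hollow cylinder is a simplification of this sketch.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def hollow_cylinder_second_moment(outer_radius_m, inner_radius_m):
    """Second moment of area I of a hollow circular cross-section, in m^4."""
    return math.pi / 4.0 * (outer_radius_m ** 4 - inner_radius_m ** 4)


def bending_stiffness(youngs_modulus_pa, outer_radius_m, inner_radius_m):
    """Bending stiffness EI in N*m^2 for a uniform hollow cylinder."""
    return youngs_modulus_pa * hollow_cylinder_second_moment(
        outer_radius_m, inner_radius_m
    )


def persistence_length(stiffness_n_m2, temperature_k=300.0):
    """Persistence length L_p = EI / (k_B * T), in metres."""
    return stiffness_n_m2 / (BOLTZMANN_J_PER_K * temperature_k)


# Illustrative inputs only (assumed, not taken from the paper):
# E = 1 GPa, outer radius 12.5 nm, inner radius 7.5 nm, T = 300 K.
example_stiffness = bending_stiffness(1.0e9, 12.5e-9, 7.5e-9)
example_persistence_length = persistence_length(example_stiffness)
```

With these assumed inputs the persistence length comes out in the millimetre range, far longer than the 350 nm models described in the abstract, which is consistent with MTs behaving as stiff rods at that length scale.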


Subject(s)
Elasticity , Microtubules/metabolism , Models, Biological , Anisotropy , Molecular Dynamics Simulation , Protein Multimerization , Reference Standards , Tubulin/chemistry