1.
Forensic Sci Int Synerg ; 8: 100458, 2024.
Article in English | MEDLINE | ID: mdl-38487302

ABSTRACT

In forensic and security scenarios, accurate facial recognition in surveillance videos, often challenged by variations in pose, illumination, and expression, is essential. Traditional manual comparison methods lack standardization, revealing a critical gap in evidence reliability. We propose an enhanced images-to-video recognition approach, pairing facial images with attributes like pose and quality. Utilizing datasets such as ENFSI 2015, SCFace, XQLFW, ChokePoint, and ForenFace, we assess evidence strength using calibration methods for likelihood ratio estimation. Three models (ArcFace, FaceNet, and QMagFace) undergo validation, with the log-likelihood ratio cost (Cllr) as a key metric. Results indicate that prioritizing high-quality frames and aligning attributes with reference images optimizes recognition, yielding Cllr values similar to those of the top 25% best frames approach. A combined embedding weighted by frame quality emerges as the second-best method. Preprocessing facial images with the super-resolution model CodeFormer unexpectedly increased Cllr, undermining evidence reliability; we therefore advise against its use in such forensic applications.
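The log-likelihood ratio cost named in the abstract above is commonly defined as the average of two penalties, one over same-source comparisons and one over different-source comparisons. The following is a minimal sketch of that standard definition, not the authors' code; the function name `cllr` and its argument names are illustrative assumptions.

```python
import math


def cllr(same_source_lrs, different_source_lrs):
    """Log-likelihood ratio cost (Cllr) for two lists of likelihood ratios.

    Standard definition; the function and argument names are illustrative.
    Lower is better, and a system that always reports LR = 1 scores 1.0.
    """
    if not same_source_lrs or not different_source_lrs:
        raise ValueError("both LR lists must be non-empty")
    # Same-source pairs are penalised when the LR is small (misleading evidence).
    ss_penalty = sum(math.log2(1.0 + 1.0 / lr) for lr in same_source_lrs)
    ss_penalty /= len(same_source_lrs)
    # Different-source pairs are penalised when the LR is large.
    ds_penalty = sum(math.log2(1.0 + lr) for lr in different_source_lrs)
    ds_penalty /= len(different_source_lrs)
    return 0.5 * (ss_penalty + ds_penalty)
```

Under this definition, an increase in Cllr after a preprocessing step means the reported likelihood ratios became less informative or more misleading.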

2.
Med Image Anal ; 82: 102603, 2022 11.
Article in English | MEDLINE | ID: mdl-36116297

ABSTRACT

Automating report generation for medical imaging promises to minimize labor and aid diagnosis in clinical practice. Deep learning algorithms have recently been shown to be capable of captioning natural photos. However, doing the same for medical data is difficult due to the variety in reports written by different radiologists with fluctuating levels of knowledge and experience. Current methods for automatic report generation tend to merely copy one of the training samples in the created report. To tackle this issue, we propose variational topic inference, a probabilistic approach for automatic chest X-ray report generation. Specifically, we introduce a probabilistic latent variable model where a latent variable defines a single topic. The topics are inferred in a conditional variational inference framework by aligning vision and language modalities in a latent space, with each topic governing the generation of one sentence in the report. We further adopt a visual attention module that enables the model to attend to different locations in the image while generating the descriptions. We conduct extensive experiments on two benchmarks, namely Indiana U. Chest X-rays and MIMIC-CXR. The results demonstrate that our proposed variational topic inference method can generate reports with novel sentence structure, rather than mere copies of reports used in training, while still achieving comparable performance to state-of-the-art methods in terms of standard language generation criteria.


Subject(s)
Algorithms , Models, Theoretical , Humans , X-Rays , Uncertainty
3.
Forensic Sci Int ; 334: 111239, 2022 May.
Article in English | MEDLINE | ID: mdl-35364422

ABSTRACT

Forensic facial image comparison lacks methodological standardization and empirical validation. We aim to address this problem by assessing the potential of machine learning to support the human expert in the courtroom. To yield valid evidence in court, decision-making systems for facial image comparison should not only be accurate, they should also provide a calibrated confidence measure. This confidence is best conveyed using a score-based likelihood ratio. In this study we compare the performance of different calibrations for such scores. The score, either a distance or a similarity, is converted to a likelihood ratio using three types of calibration, following techniques similar to those applied in forensic fields such as speaker comparison and DNA matching, but which have not yet been tested in facial image comparison. The calibration types tested are: naive, quality score based on typicality, and feature-based. As transparency is essential in forensics, we focus on state-of-the-art open software and study its power compared to a state-of-the-art commercial system. With the European Network of Forensic Science Institutes (ENFSI) Proficiency tests as benchmark, calibration results on three public databases, namely Labeled Faces in the Wild, SC Face and ForenFace, show that both quality score and feature-based calibration outperform naive calibration. Overall, the commercial system outperforms open software when evaluating these likelihood ratios. In general, we conclude that calibration implemented before likelihood ratio estimation is recommended. Furthermore, in terms of performance the commercial system is preferred over open software. As open software is more transparent, we urge more research on open software.


Subject(s)
Forensic Sciences , Software , Calibration , Forensic Medicine , Forensic Sciences/methods , Humans , Machine Learning
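The score-based likelihood ratio described in this entry divides the density of a score among same-source pairs by its density among different-source pairs. The sketch below illustrates that idea with Gaussian kernel density estimates built from reference scores. The density model, the bandwidth value, and the names `gaussian_kde` and `score_to_lr` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def score_to_lr(score, same_source_scores, different_source_scores,
                bandwidth=0.05):
    """Score-based LR: p(score | same source) / p(score | different source).

    The bandwidth default is an arbitrary illustrative value.
    """
    p_same = gaussian_kde(same_source_scores, bandwidth)(score)
    p_diff = gaussian_kde(different_source_scores, bandwidth)(score)
    floor = 1e-300  # avoids division by zero far from all reference scores
    return max(p_same, floor) / max(p_diff, floor)
```

With similarity scores, a value close to the same-source reference scores gives an LR above 1 and a value close to the different-source scores gives an LR below 1. For a system that returns distances, the same code applies as long as both reference sets are distances as well.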
4.
IEEE Trans Vis Comput Graph ; 27(2): 422-431, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33074815

ABSTRACT

In this paper, we introduce II-20 (Image Insight 2020), a multimedia analytics approach for analytic categorization of image collections. Advanced visualizations for image collections exist, but they need tight integration with a machine model to support the task of analytic categorization. Directly employing computer vision and interactive learning techniques gravitates towards search. Analytic categorization, however, is not machine classification (the difference between the two is called the pragmatic gap): a human adds/redefines/deletes categories of relevance on the fly to build insight, whereas the machine classifier is rigid and non-adaptive. Analytic categorization that truly brings the user to insight requires a flexible machine model that allows dynamic sliding on the exploration-search axis, as well as semantic interactions: a human thinks about image data mostly in semantic terms. II-20 brings three major contributions to multimedia analytics on image collections and towards closing the pragmatic gap. Firstly, a new machine model that closely follows the user's interactions and dynamically models her categories of relevance. II-20's machine model, in addition to matching and exceeding the state of the art's ability to produce relevant suggestions, allows the user to dynamically slide on the exploration-search axis without any additional input from her side. Secondly, the dynamic, 1-image-at-a-time Tetris metaphor that synergizes with the model. It allows a well-trained model to analyze the collection by itself with minimal interaction from the user and complements the classic grid metaphor. Thirdly, the fast-forward interaction, allowing the user to harness the model to quickly expand ("fast-forward") the categories of relevance, expands the multimedia analytics semantic interaction dictionary. Automated experiments show that II-20's machine model outperforms the existing state of the art and also demonstrate the Tetris metaphor's analytic quality. User studies further confirm that II-20 is an intuitive, efficient, and effective multimedia analytics tool.

5.
IEEE Trans Vis Comput Graph ; 27(2): 550-560, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33048721

ABSTRACT

Many processes, from gene interaction in biology to computer networks to social media, can be modeled more precisely as temporal hypergraphs than by regular graphs. This is because hypergraphs generalize graphs by extending edges to connect any number of vertices, allowing complex relationships to be described more accurately and their behavior over time to be predicted. However, the interactive exploration and seamless refinement of such hypergraph-based prediction models still pose a major challenge. We contribute Hyper-Matrix, a novel visual analytics technique that addresses this challenge through a tight coupling between machine-learning and interactive visualizations. In particular, the technique incorporates a geometric deep learning model as a blueprint for problem-specific models while integrating visualizations for graph-based and category-based data with a novel combination of interactions for an effective user-driven exploration of hypergraph models. To eliminate demanding context switches and ensure scalability, our matrix-based visualization provides drill-down capabilities across multiple levels of semantic zoom, from an overview of model predictions down to the content. We facilitate a focused analysis of relevant connections and groups based on interactive user-steering for filtering and search tasks, a dynamically modifiable partition hierarchy, various matrix reordering techniques, and interactive model feedback. We evaluate our technique in a case study and through formative evaluation with law enforcement experts using real-world internet forum communication data. The results show that our approach surpasses existing solutions in terms of scalability and applicability, enables the incorporation of domain knowledge, and allows for fast search-space traversal. With the proposed technique, we pave the way for the visual analytics of temporal hypergraphs in a wide variety of domains.
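To make the data structure concrete: a hyperedge connects any number of vertices, and a temporal hypergraph attaches a time to each hyperedge. The sketch below shows one possible encoding together with a vertex-by-hyperedge incidence matrix. It is an illustration only and is not taken from Hyper-Matrix; the names `Hyperedge`, `incidence_matrix`, and `edges_at`, and the use of an integer time step, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """One hyperedge: any number of vertices, observed at one time step."""
    vertices: frozenset
    time: int


def incidence_matrix(vertices, hyperedges):
    """Rows are vertices, columns are hyperedges; 1 marks membership."""
    return [[1 if v in e.vertices else 0 for e in hyperedges] for v in vertices]


def edges_at(hyperedges, time):
    """Select the hyperedges observed at a given time step."""
    return [e for e in hyperedges if e.time == time]
```

A regular graph is the special case in which every column of the incidence matrix contains exactly two ones.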

6.
J Forensic Sci ; 65(4): 1169-1183, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32396227

ABSTRACT

In this study, we aim to compare the performance of systems and forensic facial comparison experts in terms of likelihood ratio computation to assess the potential of the machine to support the human expert in the courtroom. In forensics, transparency in the methods is essential. Consequently, state-of-the-art free software was preferred over commercial software. Three open-source automated systems were chosen for their availability and clarity: OpenFace, SeetaFace, and FaceNet. All three are based on convolutional neural networks that return a distance (OpenFace, FaceNet) or a similarity (SeetaFace). The returned distance or similarity is converted to a likelihood ratio using three different distribution fits: a parametric fit with a Weibull distribution, a nonparametric fit with kernel density estimation, and isotonic regression with the pool adjacent violators algorithm. The results show that with low-quality frontal images, automated systems detect nonmatches better than investigators: 100% precision and specificity in the confusion matrix against 89% and 86% obtained by investigators. With good-quality images, however, forensic experts have better results. The rank correlation between investigators and software is around 80%. We conclude that the software can assist reporting officers, as it can perform faster and more reliable comparisons with full-frontal images, which can help the forensic expert in casework.


Subject(s)
Automated Facial Recognition/methods , Likelihood Functions , Neural Networks, Computer , Forensic Sciences/methods , Humans , Models, Statistical , Sensitivity and Specificity , Software
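Of the three conversions listed in this entry, isotonic regression with the pool adjacent violators (PAV) algorithm is the easiest to show in a few lines. The sketch below fits a non-decreasing posterior probability of "same source" to labelled reference scores and divides the posterior odds by the prior odds of the reference set to obtain likelihood ratios. It is a generic illustration, not the study's implementation. It assumes similarity scores (negate distances first), and the clipping constant `eps` and the function names are assumptions.

```python
def pav(values):
    """Pool adjacent violators: non-decreasing least-squares fit to values."""
    blocks = []  # each block holds [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def pav_lrs(scores, labels, eps=1e-6):
    """Likelihood ratio per reference score; labels are 1 (same) or 0 (different)."""
    n_same = sum(labels)
    n_diff = len(labels) - n_same
    if n_same == 0 or n_diff == 0:
        raise ValueError("need both same-source and different-source scores")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    posteriors = pav([labels[i] for i in order])
    prior_odds = n_same / n_diff
    lrs = [0.0] * len(scores)
    for rank, i in enumerate(order):
        # Clip so that posteriors of exactly 0 or 1 give finite, non-zero LRs.
        p = min(max(posteriors[rank], eps), 1.0 - eps)
        lrs[i] = (p / (1.0 - p)) / prior_odds
    return lrs
```

The resulting likelihood ratios are non-decreasing in the score by construction, which is the property that makes PAV attractive for calibration.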
8.
IEEE Trans Pattern Anal Mach Intell ; 28(10): 1678-89, 2006 Oct.
Article in English | MEDLINE | ID: mdl-16986547

ABSTRACT

This paper presents the semantic pathfinder architecture for generic indexing of multimedia archives. The semantic pathfinder extracts semantic concepts from video by exploring different paths through three consecutive analysis steps, which we derive from the observation that produced video is the result of an authoring-driven process. We exploit this authoring metaphor for machine-driven understanding. The pathfinder starts with the content analysis step. In this analysis step, we follow a data-driven approach of indexing semantics. The style analysis step is the second analysis step. Here, we tackle the indexing problem by viewing a video from the perspective of production. Finally, in the context analysis step, we view semantics in context. The virtue of the semantic pathfinder is its ability to learn the best path of analysis steps on a per-concept basis. To show the generality of this novel indexing approach, we develop detectors for a lexicon of 32 concepts and we evaluate the semantic pathfinder against the 2004 NIST TRECVID video retrieval benchmark, using a news archive of 64 hours. Top ranking performance in the semantic concept detection task indicates the merit of the semantic pathfinder for generic indexing of multimedia archives.


Subject(s)
Abstracting and Indexing/methods , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Multimedia/classification , Natural Language Processing , Pattern Recognition, Automated/methods , Video Recording/methods , Algorithms , Artificial Intelligence , Semantics , Vocabulary, Controlled
9.
IEEE Trans Image Process ; 13(11): 1432-43, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15540453

ABSTRACT

This paper presents a method for dense optical flow estimation in which the motion field within patches that result from an initial intensity segmentation is parametrized with models of different order. We propose a novel formulation which introduces regularization constraints between the model parameters of neighboring patches. In this way, we provide the additional constraints for very small patches and for patches whose intensity variation cannot sufficiently constrain the estimation of their motion parameters. In order to preserve motion discontinuities, we use robust functions as a means of regularization. We adopt a three-frame approach and control the balance between the backward and forward constraints by a real-valued direction field on which regularization constraints are applied. An iterative deterministic relaxation method is employed in order to solve the corresponding optimization problem. Experimental results show that the proposed method deals successfully with large-magnitude motions and motion discontinuities, and produces accurate piecewise-smooth motion fields.


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Movement/physiology , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Cluster Analysis , Computer Graphics , Computer Simulation , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Models, Biological , Models, Statistical , Motion , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity , User-Computer Interface , Walking/physiology
10.
IEEE Trans Image Process ; 11(9): 1081-91, 2002.
Article in English | MEDLINE | ID: mdl-18249729

ABSTRACT

We propose a new method for contour tracking in video. The inverted distance transform of the edge map is used as an edge indicator function for contour detection. Using the concept of topographical distance, the watershed segmentation can be formulated as a minimization. This new viewpoint gives a way to combine the results of the watershed algorithm on different surfaces. In particular, our algorithm determines the contour as a combination of the current edge map and the contour predicted from the tracking result in the previous frame. We also show that the problem of background clutter can be relaxed by taking the object motion into account. Compensating for object motion allows spurious edges in the background to be detected and removed. The experimental results confirm the expected advantages of the proposed method over the existing approaches.
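As a rough illustration of the edge indicator mentioned above, the sketch below computes a city-block distance transform of a binary edge map with the classic two-pass scan and then inverts it so that edge pixels receive the highest value. The abstract does not state the distance metric or how the inversion is defined, so the city-block metric, the "maximum minus distance" inversion, and the function names are all assumptions.

```python
def distance_transform(edge_map):
    """City-block distance from each pixel to the nearest edge pixel (value 1)."""
    rows, cols = len(edge_map), len(edge_map[0])
    inf = rows + cols  # exceeds any city-block distance inside the grid
    dist = [[0 if edge_map[r][c] else inf for c in range(cols)]
            for r in range(rows)]
    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


def inverted_distance_transform(edge_map):
    """Edge indicator that peaks on edges (assumed form: max distance minus distance)."""
    dist = distance_transform(edge_map)
    peak = max(max(row) for row in dist)
    return [[peak - d for d in row] for row in dist]
```

For a 3x3 map with a single edge pixel in the centre, the indicator is 2 at the centre, 1 at the four direct neighbours, and 0 at the corners.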
