Search | VHL Regional Portal

1.

Behavioral meaures of psychotic disorders: Using automatic facial coding to detect nonverbal expressions in video.

Martin, Elizabeth A; Lian, Wenxuan; Oltmanns, Joshua R; Jonas, Katherine G; Samaras, Dimitris; Hallquist, Michael N; Ruggero, Camilo J; Clouston, Sean A P; Kotov, Roman.

J Psychiatr Res ; 176: 9-17, 2024 May 30.

Article in English | MEDLINE | ID: mdl-38830297

ABSTRACT

Emotional deficits in psychosis are prevalent and difficult to treat. In particular, much remains unknown about facial expression abnormalities, and a key reason is that expressions are very labor-intensive to code. Automatic facial coding (AFC) can remove this barrier. The current study sought to both provide evidence for the utility of AFC in psychosis for research purposes and to provide evidence that AFC are valid measures of clinical constructs. Changes of facial expressions and head position of participants-39 with schizophrenia/schizoaffective disorder (SZ), 46 with other psychotic disorders (OP), and 108 never psychotic individuals (NP)-were assessed via FaceReader, a commercially available automated facial expression analysis software, using video recorded during a clinical interview. We first examined the behavioral measures of the psychotic disorder groups and tested if they can discriminate between the groups. Next, we evaluated links of behavioral measures with clinical symptoms, controlling for group membership. We found the SZ group was characterized by significantly less variation in neutral expressions, happy expressions, arousal, and head movements compared to NP. These measures discriminated SZ from NP well (AUC = 0.79, sensitivity = 0.79, specificity = 0.67) but discriminated SZ from OP less well (AUC = 0.66, sensitivity = 0.77, specificity = 0.46). We also found significant correlations between clinician-rated symptoms and most behavioral measures (particularly happy expressions, arousal, and head movements). Taken together, these results suggest that AFC can provide useful behavioral measures of psychosis, which could improve research on non-verbal expressions in psychosis and, ultimately, enhance treatment.

2.

Keratin 17 modulates the immune topography of pancreatic cancer.

Delgado-Coka, Lyanne; Horowitz, Michael; Torrente-Goncalves, Mariana; Roa-Peña, Lucia; Leiton, Cindy V; Hasan, Mahmudul; Babu, Sruthi; Fassler, Danielle; Oentoro, Jaymie; Bai, Ji-Dong K; Petricoin, Emanuel F; Matrisian, Lynn M; Blais, Edik Matthew; Marchenko, Natalia; Allard, Felicia D; Jiang, Wei; Larson, Brent; Hendifar, Andrew; Chen, Chao; Abousamra, Shahira; Samaras, Dimitris; Kurc, Tahsin; Saltz, Joel; Escobar-Hoyos, Luisa F; Shroyer, Kenneth R.

J Transl Med ; 22(1): 443, 2024 May 10.

Article in English | MEDLINE | ID: mdl-38730319

ABSTRACT

BACKGROUND: The immune microenvironment impacts tumor growth, invasion, metastasis, and patient survival and may provide opportunities for therapeutic intervention in pancreatic ductal adenocarcinoma (PDAC). Although never studied as a potential modulator of the immune response in most cancers, Keratin 17 (K17), a biomarker of the most aggressive (basal) molecular subtype of PDAC, is intimately involved in the histogenesis of the immune response in psoriasis, basal cell carcinoma, and cervical squamous cell carcinoma. Thus, we hypothesized that K17 expression could also impact the immune cell response in PDAC, and that uncovering this relationship could provide insight to guide the development of immunotherapeutic opportunities to extend patient survival. METHODS: Multiplex immunohistochemistry (mIHC) and automated image analysis based on novel computational imaging technology were used to decipher the abundance and spatial distribution of T cells, macrophages, and tumor cells, relative to K17 expression in 235 PDACs. RESULTS: K17 expression had profound effects on the exclusion of intratumoral CD8+ T cells and was also associated with decreased numbers of peritumoral CD8+ T cells, CD16+ macrophages, and CD163+ macrophages (p < 0.0001). The differences in the intratumor and peritumoral CD8+ T cell abundance were not impacted by neoadjuvant therapy, tumor stage, grade, lymph node status, histologic subtype, nor KRAS, p53, SMAD4, or CDKN2A mutations. CONCLUSIONS: Thus, K17 expression correlates with major differences in the immune microenvironment that are independent of any tested clinicopathologic or tumor intrinsic variables, suggesting that targeting K17-mediated immune effects on the immune system could restore the innate immunologic response to PDAC and might provide novel opportunities to restore immunotherapeutic approaches for this most deadly form of cancer.

Subject(s)

Keratin-17 , Pancreatic Neoplasms , Humans , Keratin-17/metabolism , Pancreatic Neoplasms/immunology , Pancreatic Neoplasms/pathology , Tumor Microenvironment/immunology , Female , Carcinoma, Pancreatic Ductal/immunology , Carcinoma, Pancreatic Ductal/pathology , Male , CD8-Positive T-Lymphocytes/immunology , Macrophages/metabolism , Macrophages/immunology , Middle Aged , Aged , Receptors, Cell Surface , Antigens, Differentiation, Myelomonocytic , Antigens, CD

3.

PathLDM: Text conditioned Latent Diffusion Model for Histopathology.

Yellapragada, Srikar; Graikos, Alexandros; Prasanna, Prateek; Kurc, Tahsin; Saltz, Joel; Samaras, Dimitris.

IEEE Winter Conf Appl Comput Vis ; 2024: 5170-5179, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38808304

ABSTRACT

To achieve high-quality results, diffusion models must be trained on large datasets. This can be notably prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training. Therefore, histopathology reports, which are rich in valuable clinical information, are an ideal choice as guidance for a histopathology generative model. In this paper, we introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images. Leveraging the rich contextual information provided by pathology text reports, our approach fuses image and textual data to enhance the generation process. By utilizing GPT's capabilities to distill and summarize complex text reports, we establish an effective conditioning mechanism. Through strategic conditioning and necessary architectural enhancements, we achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.

4.

Keratin 17 modulates the immune topography of pancreatic cancer.

Delgado-Coka, Lyanne A; Horowitz, Michael; Torrente-Goncalves, Mariana; Roa-Peña, Lucia; Leiton, Cindy V; Hasan, Mahmudul; Babu, Sruthi; Fassler, Danielle; Oentoro, Jaymie; Karen Bai, Ji-Dong; Petricoin, Emanuel F; Matrisian, Lynn M; Blais, Edik Matthew; Marchenko, Natalia; Allard, Felicia D; Jiang, Wei; Larson, Brent; Hendifar, Andrew; Chen, Chao; Abousamra, Shahira; Samaras, Dimitris; Kurc, Tahsin; Saltz, Joel; Escobar-Hoyos, Luisa F; Shroyer, Kenneth.

Res Sq ; 2024 Feb 20.

Article in English | MEDLINE | ID: mdl-38464123

ABSTRACT

Background: The immune microenvironment impacts tumor growth, invasion, metastasis, and patient survival and may provide opportunities for therapeutic intervention in pancreatic ductal adenocarcinoma (PDAC). Although never studied as a potential modulator of the immune response in most cancers, Keratin 17 (K17), a biomarker of the most aggressive (basal) molecular subtype of PDAC, is intimately involved in the histogenesis of the immune response in psoriasis, basal cell carcinoma, and cervical squamous cell carcinoma. Thus, we hypothesized that K17 expression could also impact the immune cell response in PDAC, and that uncovering this relationship could provide insight to guide the development of immunotherapeutic opportunities to extend patient survival. Methods: Multiplex immunohistochemistry (mIHC) and automated image analysis based on novel computational imaging technology were used to decipher the abundance and spatial distribution of T cells, macrophages, and tumor cells, relative to K17 expression in 235 PDACs. Results: K17 expression had profound effects on the exclusion of intratumoral CD8 + T cells and was also associated with decreased numbers of peritumoral CD8 + T cells, CD16 + macrophages, and CD163 + macrophages (p < 0.0001). The differences in the intratumor and peritumoral CD8 + T cell abundance were not impacted by neoadjuvant therapy, tumor stage, grade, lymph node status, histologic subtype, nor KRAS, p53, SMAD4, or CDKN2A mutations. Conclusions: Thus, K17 expression correlates with major differences in the immune microenvironment that are independent of any tested clinicopathologic or tumor intrinsic variables, suggesting that targeting K17-mediated immune effects on the immune system could restore the innate immunologic response to PDAC and might provide novel opportunities to restore immunotherapeutic approaches for this most deadly form of cancer.

5.

Computational pathology: A survey review and the way forward.

Hosseini, Mahdi S; Bejnordi, Babak Ehteshami; Trinh, Vincent Quoc-Huy; Chan, Lyndon; Hasan, Danial; Li, Xingwen; Yang, Stephen; Kim, Taehyo; Zhang, Haochen; Wu, Theodore; Chinniah, Kajanan; Maghsoudlou, Sina; Zhang, Ryan; Zhu, Jiadai; Khaki, Samir; Buin, Andrei; Chaji, Fatemeh; Salehi, Ala; Nguyen, Bich Ngoc; Samaras, Dimitris; Plataniotis, Konstantinos N.

J Pathol Inform ; 15: 100357, 2024 Dec.

Article in English | MEDLINE | ID: mdl-38420608

ABSTRACT

Computational Pathology (CPath) is an interdisciplinary science that augments developments of computational approaches to analyze and model medical histopathology images. The main objective for CPath is to develop infrastructure and workflows of digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that are mainly address by CPath tools. With evergrowing developments in deep learning and computer vision algorithms, and the ease of the data flow from digital pathology, currently CPath is witnessing a paradigm shift. Despite the sheer volume of engineering and scientific works being introduced for cancer image analysis, there is still a considerable gap of adopting and integrating these algorithms in clinical practice. This raises a significant question regarding the direction and trends that are undertaken in CPath. In this article we provide a comprehensive review of more than 800 papers to address the challenges faced in problem design all-the-way to the application and implementation viewpoints. We have catalogued each paper into a model-card by examining the key works and challenges faced to layout the current landscape in CPath. We hope this helps the community to locate relevant works and facilitate understanding of the field's future directions. In a nutshell, we oversee the CPath developments in cycle of stages which are required to be cohesively linked together to address the challenges associated with such multidisciplinary science. We overview this cycle from different perspectives of data-centric, model-centric, and application-centric problems. We finally sketch remaining challenges and provide directions for future technical developments and clinical integration of CPath. For updated information on this survey review paper and accessing to the original model cards repository, please refer to GitHub. Updated version of this draft can also be found from arXiv.

6.

Attention De-sparsification Matters: Inducing diversity in digital pathology representation learning.

Kapse, Saarthak; Das, Srijan; Zhang, Jingwei; Gupta, Rajarsi R; Saltz, Joel; Samaras, Dimitris; Prasanna, Prateek.

Med Image Anal ; 93: 103070, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38176354

ABSTRACT

We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging. Self-supervised learning (SSL) techniques, such as contrastive and non-contrastive approaches, have been shown to learn rich and effective representations of digitized tissue samples with limited pathologist supervision. Our analysis of vanilla SSL-pretrained models' attention distribution reveals an insightful observation: sparsity in attention, i.e, models tends to localize most of their attention to some prominent patterns in the image. Although attention sparsity can be beneficial in natural images due to these prominent patterns being the object of interest itself, this can be sub-optimal in digital pathology; this is because, unlike natural images, digital pathology scans are not object-centric, but rather a complex phenotype of various spatially intermixed biological components. Inadequate diversification of attention in these complex images could result in crucial information loss. To address this, we leverage cell segmentation to densely extract multiple histopathology-specific representations, and then propose a prior-guided dense pretext task, designed to match the multiple corresponding representations between the views. Through this, the model learns to attend to various components more closely and evenly, thus inducing adequate diversification in attention for capturing context-rich representations. Through quantitative and qualitative analysis on multiple tasks across cancer types, we demonstrate the efficacy of our method and observe that the attention is more globally distributed.

Subject(s)

Image Processing, Computer-Assisted , Machine Learning , Pathology , Humans , Phenotype , Pathology/methods

7.

A systematic study of key elements underlying molecular property prediction.

Deng, Jianyuan; Yang, Zhibo; Wang, Hehe; Ojima, Iwao; Samaras, Dimitris; Wang, Fusheng.

Nat Commun ; 14(1): 6395, 2023 Oct 13.

Article in English | MEDLINE | ID: mdl-37833262

ABSTRACT

Artificial intelligence (AI) has been widely applied in drug discovery with a major task as molecular property prediction. Despite booming techniques in molecular representation learning, key elements underlying molecular property prediction remain largely unexplored, which impedes further advancements in this field. Herein, we conduct an extensive evaluation of representative models using various representations on the MoleculeNet datasets, a suite of opioids-related datasets and two additional activity datasets from the literature. To investigate the predictive power in low-data and high-data space, a series of descriptors datasets of varying sizes are also assembled to evaluate the models. In total, we have trained 62,820 models, including 50,220 models on fixed representations, 4200 models on SMILES sequences and 8400 models on molecular graphs. Based on extensive experimentation and rigorous comparison, we show that representation learning models exhibit limited performance in molecular property prediction in most datasets. Besides, multiple key elements underlying molecular property prediction can affect the evaluation results. Furthermore, we show that activity cliffs can significantly impact model prediction. Finally, we explore into potential causes why representation learning models can fail and show that dataset size is essential for representation learning models to excel.

8.

Topology-Guided Multi-Class Cell Context Generation for Digital Pathology.

Abousamra, Shahira; Gupta, Rajarsi; Kurc, Tahsin; Samaras, Dimitris; Saltz, Joel; Chen, Chao.

Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit ; 2023: 3323-3333, 2023 Jun.

Article in English | MEDLINE | ID: mdl-38741683

ABSTRACT

In digital pathology, the spatial context of cells is important for cell classification, cancer diagnosis and prognosis. To model such complex cell context, however, is challenging. Cells form different mixtures, lineages, clusters and holes. To model such structural patterns in a learnable fashion, we introduce several mathematical tools from spatial statistics and topological data analysis. We incorporate such structural descriptors into a deep generative model as both conditional inputs and a differentiable loss. This way, we are able to generate high quality multi-class cell layouts for the first time. We show that the topology-rich cell layouts can be used for data augmentation and improve the performance of downstream tasks such as cell classification.

9.

Unsupervised Stain Decomposition via Inversion Regulation for Multiplex Immunohistochemistry Images.

Abousamra, Shahira; Fassler, Danielle; Yao, Jiachen; Gupta, Rajarsi; Kurc, Tahsin; Escobar-Hoyos, Luisa; Samaras, Dimitris; Shroyer, Kenneth; Saltz, Joel; Chen, Chao.

Proc Mach Learn Res ; 227: 74-94, 2023 Jul.

Article in English | MEDLINE | ID: mdl-38817539

ABSTRACT

Multiplex Immunohistochemistry (mIHC) is a cost-effective and accessible method for in situ labeling of multiple protein biomarkers in a tissue sample. By assigning a different stain to each biomarker, it allows the visualization of different types of cells within the tumor vicinity for downstream analysis. However, to detect different types of stains in a given mIHC image is a challenging problem, especially when the number of stains is high. Previous deep-learning-based methods mostly assume full supervision; yet the annotation can be costly. In this paper, we propose a novel unsupervised stain decomposition method to detect different stains simultaneously. Our method does not require any supervision, except for color samples of different stains. A main technical challenge is that the problem is underdetermined and can have multiple solutions. To conquer this issue, we propose a novel inversion regulation technique, which eliminates most undesirable solutions. On a 7-plexed IHC images dataset, the proposed method achieves high quality stain decomposition results without human annotation.

10.

Token Sparsification for Faster Medical Image Segmentation.

Zhou, Lei; Liu, Huidong; Bae, Joseph; He, Junjun; Samaras, Dimitris; Prasanna, Prateek.

Inf Process Med Imaging ; 13939: 743-754, 2023 Jun.

Article in English | MEDLINE | ID: mdl-38680428

ABSTRACT

Can we use sparse tokens for dense prediction, e.g., segmentation? Although token sparsification has been applied to Vision Transformers (ViT) to accelerate classification, it is still unknown how to perform segmentation from sparse tokens. To this end, we reformulate segmentation as a sparse encoding â token completion â dense decoding (SCD) pipeline. We first empirically show that naïvely applying existing approaches from classification token pruning and masked image modeling (MIM) leads to failure and inefficient training caused by inappropriate sampling algorithms and the low quality of the restored dense features. In this paper, we propose Soft-topK Token Pruning (STP) and Multi-layer Token Assembly (MTA) to address these problems. In sparse encoding, STP predicts token importance scores with a lightweight sub-network and samples the topK tokens. The intractable topK gradients are approximated through a continuous perturbed score distribution. In token completion, MTA restores a full token sequence by assembling both sparse output tokens and pruned multi-layer intermediate ones. The last dense decoding stage is compatible with existing segmentation decoders, e.g., UNETR. Experiments show SCD pipelines equipped with STP and MTA are much faster than baselines without token pruning in both training (up to 120% higher throughput) and inference (up to 60.6% higher throughput) while maintaining segmentation quality. Code is available here: https://github.com/cvlab-stonybrook/TokenSparse-for-MedSeg.

11.

Weighting the factors affecting attention guidance during free viewing and visual search: The unexpected role of object recognition uncertainty.

Chakraborty, Souradeep; Samaras, Dimitris; Zelinsky, Gregory J.

J Vis ; 22(4): 13, 2022 03 02.

Article in English | MEDLINE | ID: mdl-35323870

ABSTRACT

The factors determining how attention is allocated during visual tasks have been studied for decades, but few studies have attempted to model the weighting of several of these factors within and across tasks to better understand their relative contributions. Here we consider the roles of saliency, center bias, target features, and object recognition uncertainty in predicting the first nine changes in fixation made during free viewing and visual search tasks in the OSIE and COCO-Search18 datasets, respectively. We focus on the latter-most and least familiar of these factors by proposing a new method of quantifying uncertainty in an image, one based on object recognition. We hypothesize that the greater the number of object categories competing for an object proposal, the greater the uncertainty of how that object should be recognized and, hence, the greater the need for attention to resolve this uncertainty. As expected, we found that target features best predicted target-present search, with their dominance obscuring the use of other features. Unexpectedly, we found that target features were only weakly used during target-absent search. We also found that object recognition uncertainty outperformed an unsupervised saliency model in predicting free-viewing fixations, although saliency was slightly more predictive of search. We conclude that uncertainty in object recognition, a measure that is image computable and highly interpretable, is better than bottom-up saliency in predicting attention during free viewing.

Subject(s)

Visual Perception , Bias , Humans , Uncertainty

12.

An Expandable Informatics Framework for Enhancing Central Cancer Registries with Digital Pathology Specimens, Computational Imaging Tools, and Advanced Mining Capabilities.

Foran, David J; Durbin, Eric B; Chen, Wenjin; Sadimin, Evita; Sharma, Ashish; Banerjee, Imon; Kurc, Tahsin; Li, Nan; Stroup, Antoinette M; Harris, Gerald; Gu, Annie; Schymura, Maria; Gupta, Rajarsi; Bremer, Erich; Balsamo, Joseph; DiPrima, Tammy; Wang, Feiqiao; Abousamra, Shahira; Samaras, Dimitris; Hands, Isaac; Ward, Kevin; Saltz, Joel H.

J Pathol Inform ; 13: 5, 2022.

Article in English | MEDLINE | ID: mdl-35136672

ABSTRACT

BACKGROUND: Population-based state cancer registries are an authoritative source for cancer statistics in the United States. They routinely collect a variety of data, including patient demographics, primary tumor site, stage at diagnosis, first course of treatment, and survival, on every cancer case that is reported across all U.S. states and territories. The goal of our project is to enrich NCI's Surveillance, Epidemiology, and End Results (SEER) registry data with high-quality population-based biospecimen data in the form of digital pathology, machine-learning-based classifications, and quantitative histopathology imaging feature sets (referred to here as Pathomics features). MATERIALS AND METHODS: As part of the project, the underlying informatics infrastructure was designed, tested, and implemented through close collaboration with several participating SEER registries to ensure consistency with registry processes, computational scalability, and ability to support creation of population cohorts that span multiple sites. Utilizing computational imaging algorithms and methods to both generate indices and search for matches makes it possible to reduce inter- and intra-observer inconsistencies and to improve the objectivity with which large image repositories are interrogated. RESULTS: Our team has created and continues to expand a well-curated repository of high-quality digitized pathology images corresponding to subjects whose data are routinely collected by the collaborating registries. Our team has systematically deployed and tested key, visual analytic methods to facilitate automated creation of population cohorts for epidemiological studies and tools to support visualization of feature clusters and evaluation of whole-slide images. As part of these efforts, we are developing and optimizing advanced search and matching algorithms to facilitate automated, content-based retrieval of digitized specimens based on their underlying image features and staining characteristics. CONCLUSION: To meet the challenges of this project, we established the analytic pipelines, methods, and workflows to support the expansion and management of a growing repository of high-quality digitized pathology and information-rich, population cohorts containing objective imaging and clinical attributes to facilitate studies that seek to discriminate among different subtypes of disease, stratify patient populations, and perform comparisons of tumor characteristics within and across patient cohorts. We have also successfully developed a suite of tools based on a deep-learning method to perform quantitative characterizations of tumor regions, assess infiltrating lymphocyte distributions, and generate objective nuclear feature measurements. As part of these efforts, our team has implemented reliable methods that enable investigators to systematically search through large repositories to automatically retrieve digitized pathology specimens and correlated clinical data based on their computational signatures.

13.

Artificial intelligence in drug discovery: applications and techniques.

Deng, Jianyuan; Yang, Zhibo; Ojima, Iwao; Samaras, Dimitris; Wang, Fusheng.

Brief Bioinform ; 23(1)2022 01 17.

Article in English | MEDLINE | ID: mdl-34734228

ABSTRACT

Artificial intelligence (AI) has been transforming the practice of drug discovery in the past decade. Various AI techniques have been used in many drug discovery applications, such as virtual screening and drug design. In this survey, we first give an overview on drug discovery and discuss related applications, which can be reduced to two major tasks, i.e. molecular property prediction and molecule generation. We then present common data resources, molecule representations and benchmark platforms. As a major part of the survey, AI techniques are dissected into model architectures and learning paradigms. To reflect the technical development of AI in drug discovery over the years, the surveyed works are organized chronologically. We expect that this survey provides a comprehensive review on AI in drug discovery. We also provide a GitHub repository with a collection of papers (and codes, if applicable) as a learning resource, which is regularly updated.

Subject(s)

Artificial Intelligence , Drug Discovery , Drug Design , Drug Discovery/methods

14.

Physics-Based Shadow Image Decomposition for Shadow Removal.

Le, Hieu; Samaras, Dimitris.

IEEE Trans Pattern Anal Mach Intell ; 44(12): 9088-9101, 2022 Dec.

Article in English | MEDLINE | ID: mdl-34735336

ABSTRACT

We propose a novel deep learning method for shadow removal. Inspired by physical models of shadow formation, we use a linear illumination transformation to model the shadow effects in the image that allows the shadow image to be expressed as a combination of the shadow-free image, the shadow parameters, and a matte layer. We use two deep networks, namely SP-Net and M-Net, to predict the shadow parameters and the shadow matte respectively. This system allows us to remove the shadow effects from images. We then employ an inpainting network, I-Net, to further refine the results. We train and test our framework on the most challenging shadow removal dataset (ISTD). Our method improves the state-of-the-art in terms of mean absolute error (MAE) for the shadow area by 20%. Furthermore, this decomposition allows us to formulate a patch-based weakly-supervised shadow removal method. This model can be trained without any shadow- free images (that are cumbersome to acquire) and achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. Last, we introduce SBU-Timelapse, a video shadow removal dataset for evaluating shadow removal methods.

15.

Target-absent Human Attention.

Yang, Zhibo; Mondal, Sounak; Ahn, Seoyoung; Zelinsky, Gregory; Hoai, Minh; Samaras, Dimitris.

Comput Vis ECCV ; 13664: 52-68, 2022 Oct.

Article in English | MEDLINE | ID: mdl-38144433

ABSTRACT

The prediction of human gaze behavior is important for building human-computer interaction systems that can anticipate the user's attention. Computer vision models have been developed to predict the fixations made by people as they search for target objects. But what about when the target is not in the image? Equally important is to know how people search when they cannot find a target, and when they would stop searching. In this paper, we propose a data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images. We model visual search as an imitation learning problem and represent the internal knowledge that the viewer acquires through fixations using a novel state representation that we call Foveated Feature Maps (FFMs). FFMs integrate a simulated foveated retina into a pretrained ConvNet that produces an in-network feature pyramid, all with minimal computational overhead. Our method integrates FFMs as the state representation in inverse reinforcement learning. Experimentally, we improve the state of the art in predicting human target-absent search behavior on the COCO-Search18 dataset. Code is available at: https://github.com/cvlab-stonybrook/Target-absent-Human-Attention.

16.

Radiologically Defined Tumor-habitat Adjacency as a Prognostic Biomarker in Glioblastoma.

Xu, Xuan; Samaras, Dimitris; Prasanna, Prateek.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 3998-4001, 2021 11.

Article in English | MEDLINE | ID: mdl-34892107

ABSTRACT

Intratumor heterogeneity in glioblastoma (GBM) has been linked to adverse clinical outcomes including poor survival and sub-optimal response to therapies. Different techniques, such as radiomics, have been used to characterize GBM phenotype. However, the spatial diversity and the interaction between different sub-regions within the tumor (habitats) and its microenvironment has been relatively unexplored. Besides, existing approaches have mainly focused on the radiomic analysis within globally defined regions without considering local heterogeneity. In this paper, we developed a 3D spatial co-localization descriptor based on the adjacency of "habitats" to quantify the diversity of physiologically similar sub-regions on multi-protocol magnetic resonance imaging. We demonstrated the utility of this spatial phenotype descriptor in predicting overall patient survival. Our experimental results on N=236 treatment-naïve MRI scans suggest that the co-localization features in conjunction with traditional clinical measures, such as age and tumor volume, outperform texture based radiomic features. The presented descriptor provides a tool for more complete characterization of intratumor heterogeneity in solid cancers.

Subject(s)

Brain Neoplasms , Glioblastoma , Biomarkers , Brain Neoplasms/diagnostic imaging , Ecosystem , Glioblastoma/diagnostic imaging , Humans , Prognosis , Tumor Microenvironment

17.

COCO-Search18 fixation dataset for predicting goal-directed attention control.

Chen, Yupei; Yang, Zhibo; Ahn, Seoyoung; Samaras, Dimitris; Hoai, Minh; Zelinsky, Gregory.

Sci Rep ; 11(1): 8776, 2021 04 22.

Article in English | MEDLINE | ID: mdl-33888734

ABSTRACT

Attention control is a basic behavioral process that has been studied for decades. The currently best models of attention control are deep networks trained on free-viewing behavior to predict bottom-up attention control - saliency. We introduce COCO-Search18, the first dataset of laboratory-quality goal-directed behavior large enough to train deep-network models. We collected eye-movement behavior from 10 people searching for each of 18 target-object categories in 6202 natural-scene images, yielding [Formula: see text] 300,000 search fixations. We thoroughly characterize COCO-Search18, and benchmark it using three machine-learning methods: a ResNet50 object detector, a ResNet50 trained on fixation-density maps, and an inverse-reinforcement-learning model trained on behavioral search scanpaths. Models were also trained/tested on images transformed to approximate a foveated retina, a fundamental biological constraint. These models, each having a different reliance on behavioral training, collectively comprise the new state-of-the-art in predicting goal-directed search fixations. Our expectation is that future work using COCO-Search18 will far surpass these initial efforts, finding applications in domains ranging from human-computer interactive systems that can anticipate a person's intent and render assistance to the potentially early identification of attention-related clinical disorders (ADHD, PTSD, phobia) based on deviation from neurotypical fixation behavior.

Subject(s)

Attention , Fixation, Ocular , Goals , Datasets as Topic , Deep Learning , Humans , Man-Machine Systems

18.

Sequence-to-Segments Networks for Detecting Segments in Videos.

Wei, Zijun; Wang, Boyu; Hoai, Minh; Zhang, Jianming; Shen, Xiaohui; Lin, Zhe; Mech, Radomir; Samaras, Dimitris.

IEEE Trans Pattern Anal Mach Intell ; 43(3): 1009-1021, 2021 Mar.

Article in English | MEDLINE | ID: mdl-31514124

ABSTRACT

Detecting segments of interest from videos is a common problem for many applications. And yet it is a challenging problem as it often requires not only knowledge of individual target segments, but also contextual understanding of the entire video and the relationships between the target segments. To address this problem, we propose the Sequence-to-Segments Network (S2N), a novel and general end-to-end sequential encoder-decoder architecture. S2N first encodes the input video into a sequence of hidden states that capture information progressively, as it appears in the video. It then employs the Segment Detection Unit (SDU), a novel decoding architecture, that sequentially detects segments. At each decoding step, the SDU integrates the decoder state and encoder hidden states to detect a target segment. During training, we address the problem of finding the best assignment of predicted segments to ground truth using the Hungarian Matching Algorithm with Lexicographic Cost. Additionally we propose to use the squared Earth Mover's Distance to optimize the localization errors of the segments. We show the state-of-the-art performance of S2N across numerous tasks, including video highlighting, video summarization, and human action proposal generation.

19.

Large Scale Shadow Annotation and Detection Using Lazy Annotation and Stacked CNNs.

Hou, Le; Vicente, Tomas F Yago; Hoai, Minh; Samaras, Dimitris.

IEEE Trans Pattern Anal Mach Intell ; 43(4): 1337-1351, 2021 04.

Article in English | MEDLINE | ID: mdl-31634124

ABSTRACT

Recent shadow detection algorithms have shown initial success on small datasets of images from specific domains. However, shadow detection on broader image domains is still challenging due to the lack of annotated training data, caused by the intense manual labor required for annotating shadow data. In this paper we propose "lazy annotation", an efficient annotation method where an annotator only needs to mark the important shadow areas and some non-shadow areas. This yields data with noisy labels that are not yet useful for training a shadow detector. We address the problem of label noise by jointly learning a shadow region classifier and recovering the labels in the training set. We consider the training labels as unknowns and formulate label recovery as the minimization of the sum of squared leave-one-out errors of a Least Squares SVM, which can be efficiently optimized. Experimental results show that a classifier trained with recovered labels achieves comparable performance to a classifier trained on the properly annotated data. These results motivated us to collect a new dataset that is 20 times larger than existing datasets and contains a large variety of scenes and image types. Naturally, such a large dataset is appropriate for training deep learning methods. Thus, we propose a stacked Convolutional Neural Network architecture that efficiently trains on patch level shadow examples while incorporating image level semantic information. This means that the detected shadow patches are refined based on image semantics. Our proposed pipeline, trained on recovered labels, performs at state-of-the art level. Furthermore, the proposed model performs exceptionally well on a cross dataset task, proving the generalization power of the proposed architecture and dataset.

20.

Deep Learning-Based Mapping of Tumor Infiltrating Lymphocytes in Whole Slide Images of 23 Types of Cancer.

Abousamra, Shahira; Gupta, Rajarsi; Hou, Le; Batiste, Rebecca; Zhao, Tianhao; Shankar, Anand; Rao, Arvind; Chen, Chao; Samaras, Dimitris; Kurc, Tahsin; Saltz, Joel.

Front Oncol ; 11: 806603, 2021.

Article in English | MEDLINE | ID: mdl-35251953

ABSTRACT

The role of tumor infiltrating lymphocytes (TILs) as a biomarker to predict disease progression and clinical outcomes has generated tremendous interest in translational cancer research. We present an updated and enhanced deep learning workflow to classify 50x50 um tiled image patches (100x100 pixels at 20x magnification) as TIL positive or negative based on the presence of 2 or more TILs in gigapixel whole slide images (WSIs) from the Cancer Genome Atlas (TCGA). This workflow generates TIL maps to study the abundance and spatial distribution of TILs in 23 different types of cancer. We trained three state-of-the-art, popular convolutional neural network (CNN) architectures (namely VGG16, Inception-V4, and ResNet-34) with a large volume of training data, which combined manual annotations from pathologists (strong annotations) and computer-generated labels from our previously reported first-generation TIL model for 13 cancer types (model-generated annotations). Specifically, this training dataset contains TIL positive and negative patches from cancers in additional organ sites and curated data to help improve algorithmic performance by decreasing known false positives and false negatives. Our new TIL workflow also incorporates automated thresholding to convert model predictions into binary classifications to generate TIL maps. The new TIL models all achieve better performance with improvements of up to 13% in accuracy and 15% in F-score. We report these new TIL models and a curated dataset of TIL maps, referred to as TIL-Maps-23, for 7983 WSIs spanning 23 types of cancer with complex and diverse visual appearances, which will be publicly available along with the code to evaluate performance. Code Available at: https://github.com/ShahiraAbousamra/til_classification.

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL