Results 1 - 20 of 34
1.
Cancers (Basel) ; 16(13)2024 Jun 26.
Article in English | MEDLINE | ID: mdl-39001410

ABSTRACT

BACKGROUND: Bladder cancer (BC) segmentation on MRI images is the first step to determining the presence of muscular invasion. This study aimed to assess the tumor segmentation performance of three deep learning (DL) models on multi-parametric MRI (mp-MRI) images. METHODS: We studied 53 patients with bladder cancer. Bladder tumors were segmented on each slice of T2-weighted (T2WI), diffusion-weighted imaging/apparent diffusion coefficient (DWI/ADC), and T1-weighted contrast-enhanced (T1WI) images acquired on a 3 Tesla MRI scanner. We trained Unet, MAnet, and PSPnet using three loss functions: cross-entropy (CE), dice similarity coefficient loss (DSC), and focal loss (FL). We evaluated the model performances using DSC, Hausdorff distance (HD), and expected calibration error (ECE). RESULTS: The MAnet algorithm with the CE+DSC loss function gave the highest DSC values on the ADC, T2WI, and T1WI images. PSPnet with CE+DSC obtained the smallest HDs on the ADC, T2WI, and T1WI images. The segmentation accuracy overall was better on the ADC and T1WI than on the T2WI. The ECEs were the smallest for PSPnet with FL on the ADC images, while they were the smallest for MAnet with CE+DSC on the T2WI and T1WI. CONCLUSIONS: Compared to Unet, MAnet and PSPnet with a hybrid CE+DSC loss function displayed better performances in BC segmentation depending on the choice of the evaluation metric.
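
As an illustration of the quantities named in this abstract, the following minimal sketch computes a soft Dice coefficient, a hybrid CE+DSC loss, and an expected calibration error for a binary segmentation given as flat lists. It is not the study's code: the function names, the equal weighting of the two loss terms, and the ten equal-width confidence bins are all assumptions made for the example.

```python
import math


def dice_coefficient(pred, target, eps=1e-7):
    """Soft Dice overlap between foreground probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def ce_dice_loss(pred, target, weight=0.5):
    """Hybrid loss; the 50/50 weighting is an assumption, not from the abstract."""
    dice_term = 1.0 - dice_coefficient(pred, target)
    return weight * cross_entropy(pred, target) + (1.0 - weight) * dice_term


def expected_calibration_error(pred, target, n_bins=10):
    """ECE: bin-size-weighted average gap between accuracy and confidence."""
    bins = [[] for _ in range(n_bins)]
    for p, t in zip(pred, target):
        confidence = max(p, 1.0 - p)  # confidence of the predicted label
        correct = 1.0 if (p >= 0.5) == (t == 1) else 0.0
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(a for _, a in members) / len(members)
            ece += len(members) / len(pred) * abs(accuracy - avg_conf)
    return ece
```

A model that predicts 0.9 for four pixels of which three are foreground is 75% accurate at 90% confidence, so its ECE under this sketch is 0.15.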

2.
Med Image Anal ; 91: 103015, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37918314

ABSTRACT

Most segmentation losses are arguably variants of the Cross-Entropy (CE) or Dice losses. On the surface, these two categories of losses (i.e., distribution based vs. geometry based) seem unrelated, and there is no clear consensus as to which category is a better choice, with varying performances for each across different benchmarks and applications. Furthermore, it is widely argued within the medical-imaging community that Dice and CE are complementary, which has motivated the use of compound CE-Dice losses. In this work, we provide a theoretical analysis, which shows that CE and Dice share a much deeper connection than previously thought. First, we show that, from a constrained-optimization perspective, they both decompose into two components, i.e., a similar ground-truth matching term, which pushes the predicted foreground regions towards the ground-truth, and a region-size penalty term imposing different biases on the size (or proportion) of the predicted regions. Then, we provide bound relationships and an information-theoretic analysis, which uncover hidden region-size biases: Dice has an intrinsic bias towards specific extremely imbalanced solutions, whereas CE implicitly encourages the ground-truth region proportions. Our theoretical results explain the wide experimental evidence in the medical-imaging literature, whereby Dice losses bring improvements for imbalanced segmentation. They also explain why CE dominates natural-image problems with diverse class proportions, in which case Dice might have difficulty adapting to different region-size distributions. Based on our theoretical analysis, we propose a principled and simple solution, which enables explicit control of the region-size bias. The proposed method integrates CE with explicit terms based on L1 or the KL divergence, which encourage segmenting region proportions to match target class proportions, thereby mitigating class imbalance but without losing generality. 
Comprehensive experiments and ablation studies over different losses and applications validate our theoretical analysis, as well as the effectiveness of explicit and simple region-size terms. The code is available at https://github.com/by-liu/SegLossBias .
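
The idea of pairing CE with an explicit region-size term can be sketched as follows for a binary problem. This is an illustration only and not the code from the linked repository: the function names, the weight `lam`, the use of the mean soft prediction as the predicted proportion, and the direction of the KL term are all assumptions.

```python
import math


def predicted_proportion(probs):
    """Share of the image assigned to foreground (mean soft prediction)."""
    return sum(probs) / len(probs)


def l1_size_penalty(probs, target_proportion):
    """L1 gap between predicted and target foreground proportions."""
    return abs(predicted_proportion(probs) - target_proportion)


def kl_size_penalty(probs, target_proportion, eps=1e-7):
    """KL(target || predicted) over the two-class proportion distributions."""
    q = min(max(predicted_proportion(probs), eps), 1.0 - eps)
    p = min(max(target_proportion, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def ce_with_size_term(probs, labels, lam=0.1, use_kl=False, eps=1e-7):
    """Binary CE plus a weighted region-size penalty (lam is hypothetical)."""
    ce = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        ce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    ce /= len(probs)
    target = sum(labels) / len(labels)  # ground-truth foreground proportion
    if use_kl:
        penalty = kl_size_penalty(probs, target)
    else:
        penalty = l1_size_penalty(probs, target)
    return ce + lam * penalty
```

Both penalties vanish when the predicted foreground proportion equals the target proportion, so in that case the loss reduces to plain CE.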

3.
Med Image Anal ; 91: 103011, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37924752

ABSTRACT

Semi-supervised learning relaxes the need for large pixel-wise labeled datasets for image segmentation by leveraging unlabeled data. A prominent way to exploit unlabeled data is to regularize model predictions. Since the predictions of unlabeled data can be unreliable, uncertainty-aware schemes are typically employed to gradually learn from meaningful and reliable predictions. Uncertainty estimation methods, however, rely on multiple inferences from the model predictions that must be computed for each training step, which is computationally expensive. Moreover, these uncertainty maps capture pixel-wise disparities and do not consider global information. This work proposes a novel method to estimate segmentation uncertainty by leveraging global information from the segmentation masks. More precisely, an anatomically-aware representation is first learnt to model the available segmentation masks. The learnt representation thereupon maps the prediction of a new segmentation into an anatomically-plausible segmentation. The deviation from the plausible segmentation aids in estimating the underlying pixel-level uncertainty in order to further guide the segmentation network. The proposed method consequently estimates the uncertainty using a single inference from our representation, thereby reducing the total computation. We evaluate our method on two publicly available segmentation datasets of left atria in cardiac MRIs and of multiple organs in abdominal CTs. Our anatomically-aware method improves the segmentation accuracy over the state-of-the-art semi-supervised methods in terms of two commonly used evaluation metrics.


Subject(s)
Benchmarking , Heart Atria , Humans , Uncertainty , Supervised Machine Learning , Image Processing, Computer-Assisted
4.
Sci Rep ; 13(1): 13259, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37582862

ABSTRACT

Neonatal MRIs are used increasingly in preterm infants. However, it is not always feasible to analyze this data. Having a tool that assesses brain maturation during this period of extraordinary changes would be immensely helpful. Approaches based on deep learning could solve this task since, once properly trained and validated, they can be used in practically any system and provide holistic quantitative information in a matter of minutes. However, one major deterrent for radiologists is that these tools are not easily interpretable. Indeed, it is important that structures driving the results be detailed and survive comparison to the available literature. To solve these challenges, we propose an interpretable pipeline based on deep learning to predict postmenstrual age at scan, a key measure for assessing neonatal brain development. For this purpose, we train a state-of-the-art deep neural network to segment the brain into 87 different regions using normal preterm and term infants from the dHCP study. We then extract informative features for brain age estimation using the segmented MRIs and predict the brain age at scan with a regression model. The proposed framework achieves a mean absolute error of 0.46 weeks to predict postmenstrual age at scan. While our model is based solely on structural T2-weighted images, the results are superior to recent, arguably more complex approaches. Furthermore, based on the extracted knowledge from the trained models, we found that frontal and parietal lobes are among the most important structures for neonatal brain age estimation.


Subject(s)
Infant, Premature , Premature Birth , Female , Humans , Infant, Newborn , Infant , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Neural Networks, Computer
6.
Med Image Anal ; 87: 102826, 2023 07.
Article in English | MEDLINE | ID: mdl-37146441

ABSTRACT

Despite the undeniable progress in visual recognition tasks fueled by deep neural networks, there exists recent evidence showing that these models are poorly calibrated, resulting in over-confident predictions. The standard practices of minimizing the cross-entropy loss during training promote the predicted softmax probabilities to match the one-hot label assignments. Nevertheless, this yields a pre-softmax activation of the correct class that is significantly larger than the remaining activations, which exacerbates the miscalibration problem. Recent observations from the classification literature suggest that loss functions that embed implicit or explicit maximization of the entropy of predictions yield state-of-the-art calibration performances. Despite these findings, the impact of these losses in the relevant task of calibrating medical image segmentation networks remains unexplored. In this work, we provide a unifying constrained-optimization perspective of current state-of-the-art calibration losses. Specifically, these losses could be viewed as approximations of a linear penalty (or a Lagrangian term) imposing equality constraints on logit distances. This points to an important limitation of such underlying equality constraints, whose ensuing gradients constantly push towards a non-informative solution, which might prevent the model from reaching the best compromise between discriminative performance and calibration during gradient-based optimization. Following our observations, we propose a simple and flexible generalization based on inequality constraints, which imposes a controllable margin on logit distances. Comprehensive experiments on a variety of public medical image segmentation benchmarks demonstrate that our method sets novel state-of-the-art results on these tasks in terms of network calibration, whereas the discriminative performance is also improved. The code is available at https://github.com/Bala93/MarginLoss.
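
One way an inequality constraint with a margin on logit distances could be turned into a penalty is sketched below for a single pixel. This is a hedged illustration, not the implementation in the linked repository: the hinge form of the penalty, the function names, and the default values of `margin` and `lam` are assumptions.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one pixel from raw logits (log-sum-exp for stability)."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[label]


def margin_penalty(logits, margin):
    """Penalize only the logit distances to the largest logit that exceed the margin."""
    top = max(logits)
    return sum(max(0.0, top - l - margin) for l in logits)


def margin_loss(logits, label, margin=10.0, lam=0.1):
    """CE plus the weighted margin penalty; margin and lam are hypothetical defaults."""
    return softmax_cross_entropy(logits, label) + lam * margin_penalty(logits, margin)
```

With logits `[5.0, 1.0, 4.0]` and a margin of 2, only the distance of 4 between the first and second logits exceeds the margin, so the penalty is 2. With a margin of 10 the penalty is zero and the loss is plain CE, which is the sense in which the margin is controllable.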


Subject(s)
Image Processing, Computer-Assisted , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods , Calibration , Entropy
7.
Med Image Anal ; 83: 102670, 2023 01.
Article in English | MEDLINE | ID: mdl-36413905

ABSTRACT

Despite achieving promising results in a breadth of medical image segmentation tasks, deep neural networks (DNNs) require large training datasets with pixel-wise annotations. Obtaining these curated datasets is a cumbersome process which limits the applicability of DNNs in scenarios where annotated images are scarce. Mixed supervision is an appealing alternative for mitigating this obstacle. In this setting, only a small fraction of the data contains complete pixel-wise annotations and other images have a weaker form of supervision, e.g., only a handful of pixels are labeled. In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch. Combined with a standard cross-entropy loss over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions in the bottom branch; and (ii) a Kullback-Leibler (KL) divergence term, which transfers the knowledge (i.e., predictions) of the strongly supervised branch to the less-supervised branch and guides the entropy (student-confidence) term to avoid trivial solutions. We show that the synergy between the entropy and KL divergence yields substantial improvements in performance. We also discuss an interesting link between Shannon-entropy minimization and standard pseudo-mask generation, and argue that the former should be preferred over the latter for leveraging information from unlabeled pixels. We evaluate the effectiveness of the proposed formulation through a series of quantitative and qualitative experiments using two publicly available datasets. Results demonstrate that our method significantly outperforms other strategies for semantic segmentation within a mixed-supervision framework, as well as recent semi-supervised approaches. 
Moreover, in line with recent observations in classification, we show that the branch trained with reduced supervision and guided by the top branch largely outperforms the latter. Our code is publicly available: https://github.com/by-liu/ConfKD.
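
The two terms described in this abstract, a Shannon entropy on the student predictions and a KL divergence from the teacher predictions, can be sketched per pixel as follows. This is an assumption-laden illustration rather than the authors' code: the function names, the KL direction (teacher relative to student), and the weight `lam` are hypothetical, and the supervised cross-entropy term over labeled pixels is omitted.

```python
import math


def shannon_entropy(p, eps=1e-12):
    """Entropy of one pixel's class distribution; low when the prediction is confident."""
    return -sum(pi * math.log(pi + eps) for pi in p)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two class distributions of the same pixel."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def weak_branch_loss(student, teacher, lam=1.0):
    """Mean student entropy plus lam times mean KL(teacher || student) over pixels."""
    n = len(student)
    entropy = sum(shannon_entropy(s) for s in student) / n
    kl = sum(kl_divergence(t, s) for t, s in zip(teacher, student)) / n
    return entropy + lam * kl
```

Minimizing the entropy term alone could be satisfied by any confident prediction, including a wrong one; the KL term ties the student to the teacher's predictions, which is how the abstract describes trivial solutions being avoided.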


Subject(s)
Neural Networks, Computer , Semantics , Humans , Entropy
8.
Med Image Anal ; 82: 102617, 2022 11.
Article in English | MEDLINE | ID: mdl-36228364

ABSTRACT

Domain adaptation (DA) has drawn high interest for its capacity to adapt a model trained on labeled source data to perform well on unlabeled or weakly labeled target data from a different domain. Most common DA techniques require concurrent access to the input images of both the source and target domains. However, in practice, privacy concerns often impede the availability of source images in the adaptation phase. This is a very frequent DA scenario in medical imaging, where, for instance, the source and target images could come from different clinical sites. We introduce a source-free domain adaptation for image segmentation. Our formulation is based on minimizing a label-free entropy loss defined over target-domain data, which we further guide with weak labels of the target samples and a domain-invariant prior on the segmentation regions. Many priors can be derived from anatomical information. Here, a class-ratio prior is estimated from anatomical knowledge and integrated in the form of a Kullback-Leibler (KL) divergence in our overall loss function. Furthermore, we motivate our overall loss with an interesting link to maximizing the mutual information between the target images and their label predictions. We show the effectiveness of our prior-aware entropy minimization in a variety of domain-adaptation scenarios, with different modalities and applications, including spine, prostate and cardiac segmentation. Our method yields comparable results to several state-of-the-art adaptation techniques, despite having access to much less information, as the source images are entirely absent in our adaptation phase. Our straightforward adaptation strategy uses only one network, contrary to popular adversarial techniques, which are not applicable to a source-free DA setting. Our framework can be readily used in a breadth of segmentation problems, and our code is publicly available: https://github.com/mathilde-b/SFDA.


Subject(s)
Prostate , Spine , Humans , Male , Image Processing, Computer-Assisted/methods
9.
Med Image Anal ; 80: 102526, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35780592

ABSTRACT

Current unsupervised anomaly localization approaches rely on generative models to learn the distribution of normal images, which is later used to identify potential anomalous regions derived from errors on the reconstructed images. To address the limitations of residual-based anomaly localization, very recent literature has focused on attention maps, by integrating supervision on them in the form of homogenization constraints. In this work, we propose a novel formulation that addresses the problem in a more principled manner, leveraging well-known knowledge in constrained optimization. In particular, the equality constraint on the attention maps in prior work is replaced by an inequality constraint, which allows more flexibility. In addition, to address the limitations of penalty-based functions we employ an extension of the popular log-barrier methods to handle the constraint. Last, we propose an alternative regularization term that maximizes the Shannon entropy of the attention maps, reducing the number of hyperparameters of the proposed model. Comprehensive experiments on two publicly available datasets on brain lesion segmentation demonstrate that the proposed approach substantially outperforms relevant literature, establishing new state-of-the-art results for unsupervised lesion segmentation.

10.
IEEE J Biomed Health Inform ; 26(9): 4599-4610, 2022 09.
Article in English | MEDLINE | ID: mdl-35763468

ABSTRACT

Learning similarity is a key aspect in medical image analysis, particularly in recommendation systems or in uncovering the interpretation of anatomical data in images. Most existing methods learn such similarities in the embedding space over image sets using a single metric learner. Images, however, have a variety of object attributes such as color, shape, or artifacts. Encoding such attributes using a single metric learner is inadequate and may fail to generalize. Instead, multiple learners could focus on separate aspects of these attributes in subspaces of an overarching embedding. This, however, implies that the number of learners must be found empirically for each new dataset. This work, Dynamic Subspace Learners, proposes to dynamically exploit multiple learners by removing the need to know a priori the number of learners and aggregating new subspace learners during training. Furthermore, the visual interpretability of such subspace learning is enforced by integrating an attention module into our method. This integrated attention mechanism provides visual insight into discriminative image features that contribute to the clustering of image sets and a visual explanation of the embedding features. The benefits of our attention-based dynamic subspace learners are evaluated in the application of image clustering, image retrieval, and weakly supervised segmentation. Our method achieves results competitive with those of multiple-learner baselines and significantly outperforms the classification network in terms of clustering and retrieval scores on three different public benchmark datasets. Moreover, our method also provides an attention map generated directly during inference to illustrate the visual interpretability of the embedding features. These attention maps offer proxy labels, which improve the segmentation accuracy by up to 15% in Dice scores when compared to state-of-the-art interpretation techniques.


Subject(s)
Algorithms , Artificial Intelligence , Artifacts , Cluster Analysis , Humans
11.
Sci Rep ; 12(1): 6174, 2022 04 13.
Article in English | MEDLINE | ID: mdl-35418576

ABSTRACT

The segmentation of retinal vasculature from eye fundus images is a fundamental task in retinal image analysis. Over recent years, increasingly complex approaches based on sophisticated Convolutional Neural Network architectures have been pushing performance on well-established benchmark datasets. In this paper, we take a step back and analyze the real need of such complexity. We first compile and review the performance of 20 different techniques on some popular databases, and we demonstrate that a minimalistic version of a standard U-Net with several orders of magnitude fewer parameters, carefully trained and rigorously evaluated, closely approximates the performance of current best techniques. We then show that a cascaded extension (W-Net) reaches outstanding performance on several popular datasets, still using orders of magnitude fewer learnable weights than any previously published work. Furthermore, we provide the most comprehensive cross-dataset performance analysis to date, involving up to 10 different databases. Our analysis demonstrates that the retinal vessel segmentation is far from solved when considering test images that differ substantially from the training data, and that this task represents an ideal scenario for the exploration of domain adaptation techniques. In this context, we experiment with a simple self-labeling strategy that enables moderate enhancement of cross-dataset performance, indicating that there is still much room for improvement in this area. Finally, we test our approach on Artery/Vein and vessel segmentation from OCTA imaging problems, where we again achieve results well-aligned with the state-of-the-art, at a fraction of the model complexity available in recent literature. Code to reproduce the results in this paper is released.


Subject(s)
Neural Networks, Computer , Retinal Vessels , Fundus Oculi , Image Processing, Computer-Assisted/methods , Retina/diagnostic imaging , Retinal Vessels/diagnostic imaging
12.
Med Image Anal ; 77: 102374, 2022 04.
Article in English | MEDLINE | ID: mdl-35101728

ABSTRACT

Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation. Most current approaches exploit class activation maps (CAMs), which can be generated from image-level annotations. Nevertheless, resulting maps have been demonstrated to be highly discriminant, failing to serve as optimal proxy pixel-level labels. We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs. In particular, the proposed method is based on two observations. First, the learning of fully-supervised segmentation networks implicitly imposes equivariance by means of data augmentation, whereas this implicit constraint disappears on CAMs generated with image tags. And second, the commonalities between image modalities can be employed as an efficient self-supervisory signal, correcting the inconsistency shown by CAMs obtained across multiple modalities. To effectively train our model, we integrate a novel loss function that includes a within-modality and a cross-modality equivariant term to explicitly impose these constraints during training. In addition, we add a KL-divergence on the class prediction distributions to facilitate the information exchange between modalities, which, combined with the equivariant regularizers, further improves the performance of our model. Exhaustive experiments on the popular multi-modal BraTS and prostate DECATHLON segmentation challenge datasets demonstrate that our approach outperforms relevant recent literature under the same learning conditions.


Subject(s)
Image Processing, Computer-Assisted , Neural Networks, Computer , Humans , Image Processing, Computer-Assisted/methods , Male , Prostate , Semantics , Supervised Machine Learning
13.
IEEE Trans Med Imaging ; 41(3): 702-714, 2022 03.
Article in English | MEDLINE | ID: mdl-34705638

ABSTRACT

Weakly-supervised learning (WSL) has recently triggered substantial interest as it mitigates the lack of pixel-wise annotations. Given global image labels, WSL methods yield pixel-level predictions (segmentations), which enable the interpretation of class predictions. Despite their recent success, mostly with natural images, such methods can face important challenges when the foreground and background regions have similar visual cues, yielding high false-positive rates in segmentations, as is the case in challenging histology images. WSL training is commonly driven by standard classification losses, which implicitly maximize model confidence, and locate the discriminative regions linked to classification decisions. Therefore, they lack mechanisms for modeling explicitly non-discriminative regions and reducing false-positive rates. We propose novel regularization terms, which enable the model to seek both non-discriminative and discriminative regions, while discouraging unbalanced segmentations. We introduce high uncertainty as a criterion to localize non-discriminative regions that do not affect classifier decision, and describe it with original Kullback-Leibler (KL) divergence losses evaluating the deviation of posterior predictions from the uniform distribution. Our KL terms encourage high uncertainty of the model when the latter inputs the latent non-discriminative regions. Our loss integrates: (i) a cross-entropy seeking a foreground, where model confidence about class prediction is high; (ii) a KL regularizer seeking a background, where model uncertainty is high; and (iii) log-barrier terms discouraging unbalanced segmentations. 
Comprehensive experiments and ablation studies over the public GlaS colon cancer data and a Camelyon16 patch-based benchmark for breast cancer show substantial improvements over state-of-the-art WSL methods, and confirm the effect of our new regularizers (our code is publicly available at https://github.com/sbelharbi/deep-wsl-histo-min-max-uncertainty).


Subject(s)
Breast Neoplasms , Histological Techniques , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Entropy , Female , Humans , Uncertainty
14.
Comput Biol Med ; 134: 104472, 2021 07.
Article in English | MEDLINE | ID: mdl-34023696

ABSTRACT

Precise determination and assessment of bladder cancer (BC) extent of muscle invasion involvement guides proper risk stratification and personalized therapy selection. In this context, segmentation of both bladder walls and cancer is of pivotal importance, as it provides invaluable information to stage the primary tumor. Hence, multiregion segmentation on patients presenting with symptoms of bladder tumors using deep learning heralds a new level of staging accuracy and prediction of the biologic behavior of the tumor. Nevertheless, despite the success of these models in other medical problems, progress in multiregion bladder segmentation, particularly in MRI and CT modalities, is still at a nascent stage, with just a handful of works tackling a multiregion scenario. Furthermore, most existing approaches systematically follow prior literature in other clinical problems, without casting a doubt on the validity of these methods on bladder segmentation, which may present different challenges. Inspired by this, we provide an in-depth look at bladder cancer segmentation using deep learning models. The critical determinants for accurate differentiation of muscle invasive disease, current status of deep learning based bladder segmentation, lessons and limitations of prior work are highlighted.


Subject(s)
Deep Learning , Humans , Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Neural Networks, Computer , Tomography, X-Ray Computed , Urinary Bladder/diagnostic imaging
15.
IEEE Trans Med Imaging ; 40(7): 1737-1749, 2021 07.
Article in English | MEDLINE | ID: mdl-33710953

ABSTRACT

This paper presents a client/server privacy-preserving network in the context of multicentric medical image analysis. Our approach is based on adversarial learning which encodes images to obfuscate the patient identity while preserving enough information for a target task. Our novel architecture is composed of three components: 1) an encoder network which removes identity-specific features from input medical images, 2) a discriminator network that attempts to identify the subject from the encoded images, 3) a medical image analysis network which analyzes the content of the encoded images (segmentation in our case). By simultaneously fooling the discriminator and optimizing the medical analysis network, the encoder learns to remove privacy-specific features while keeping those essential for the target task. Our approach is illustrated on the problem of segmenting brain MRI from the large-scale Parkinson Progression Marker Initiative (PPMI) dataset. Using longitudinal data from PPMI, we show that the discriminator learns to heavily distort input images while allowing for highly accurate segmentation results. Our results also demonstrate that an encoder trained on the PPMI dataset can be used for segmenting other datasets, without the need for retraining. The code is made available at: https://github.com/bachkimn/Privacy-Net-An-Adversarial-Approach-forIdentity-Obfuscated-Segmentation-of-MedicalImages.


Subject(s)
Image Processing, Computer-Assisted , Privacy , Humans , Magnetic Resonance Imaging
16.
IEEE J Biomed Health Inform ; 25(8): 3094-3104, 2021 08.
Article in English | MEDLINE | ID: mdl-33621184

ABSTRACT

Prostate cancer is one of the main diseases affecting men worldwide. The gold standard for diagnosis and prognosis is the Gleason grading system. In this process, pathologists manually analyze prostate histology slides under the microscope, a highly time-consuming and subjective task. In recent years, computer-aided-diagnosis (CAD) systems have emerged as a promising tool that could support pathologists in the daily clinical practice. Nevertheless, these systems are usually trained using tedious and error-prone pixel-level annotations of Gleason grades in the tissue. To alleviate the need for manual pixel-wise labeling, just a handful of works have been presented in the literature. Furthermore, despite the promising results achieved on global scoring, the location of cancerous patterns in the tissue is only qualitatively addressed. These heatmaps of tumor regions, however, are crucial to the reliability of CAD systems as they provide explainability to the system's output and give confidence to pathologists that the model is focusing on medically relevant features. Motivated by this, we propose a novel weakly-supervised deep-learning model, based on self-learning CNNs, that leverages only the global Gleason score of gigapixel whole slide images during training to accurately perform both grading of patch-level patterns and biopsy-level scoring. To evaluate the performance of the proposed method, we perform extensive experiments on three different external datasets for the patch-level Gleason grading, and on two different test sets for global Grade Group prediction. We empirically demonstrate that our approach outperforms its supervised counterpart on patch-level Gleason grading by a large margin, as well as state-of-the-art methods on global biopsy-level scoring. Particularly, the proposed model brings an average improvement on the Cohen's quadratic kappa (κ) score of nearly 18% compared to full-supervision for the patch-level Gleason grading task. 
This suggests that the absence of the annotator's bias in our approach and the capability of using large weakly labeled datasets during training lead to higher-performing and more robust models. Furthermore, raw features obtained from the patch-level classifier were shown to generalize better than previous approaches in the literature to the subjective global biopsy-level scoring.
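
Cohen's quadratic kappa, the agreement score reported in this abstract, can be computed from two lists of integer grades as in the minimal sketch below. The function name and the convention of returning 1.0 when the expected disagreement is zero are choices made for the example, not details taken from the study.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Cohen's kappa with quadratic disagreement weights for integer grades 0..n_classes-1."""
    n = len(rater_a)
    # Confusion matrix of observed grade pairs.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # counts expected under chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    if weighted_expected == 0.0:
        return 1.0  # both raters used a single identical grade
    return 1.0 - weighted_observed / weighted_expected
```

Because the weights grow with the squared distance between grades, confusing adjacent grades is penalized far less than confusing distant ones. For grades `[0, 1, 2]` against `[0, 2, 2]` with three classes, the score is 0.8.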


Subject(s)
Image Interpretation, Computer-Assisted , Prostatic Neoplasms , Humans , Male , Neoplasm Grading , Reproducibility of Results
17.
IEEE J Biomed Health Inform ; 25(1): 121-130, 2021 01.
Article in English | MEDLINE | ID: mdl-32305947

ABSTRACT

Even though convolutional neural networks (CNNs) are driving progress in medical image segmentation, standard models still have some drawbacks. First, the use of multi-scale approaches, i.e., encoder-decoder architectures, leads to a redundant use of information, where similar low-level features are extracted multiple times at multiple scales. Second, long-range feature dependencies are not efficiently modeled, resulting in non-optimal discriminative feature representations associated with each semantic class. In this paper we attempt to overcome these limitations with the proposed architecture, by capturing richer contextual dependencies based on the use of guided self-attention mechanisms. This approach is able to integrate local features with their corresponding global dependencies, as well as highlight interdependent channel maps in an adaptive manner. Further, the additional loss between different modules guides the attention mechanisms to neglect irrelevant information and focus on more discriminant regions of the image by emphasizing relevant feature associations. We evaluate the proposed model in the context of semantic segmentation on three different datasets: abdominal organs, cardiovascular structures and brain tumors. A series of ablation experiments support the importance of these attention modules in the proposed architecture. In addition, compared to other state-of-the-art segmentation networks our model yields better segmentation performance, increasing the accuracy of the predictions while reducing the standard deviation. This demonstrates the efficiency of our approach to generate precise and reliable automatic segmentations of medical images. Our code is made publicly available at: https://github.com/sinAshish/Multi-Scale-Attention.


Subject(s)
Image Processing, Computer-Assisted , Neural Networks, Computer , Humans , Semantics
18.
Med Image Anal ; 67: 101851, 2021 01.
Article in English | MEDLINE | ID: mdl-33080507

ABSTRACT

Widely used loss functions for CNN segmentation, e.g., Dice or cross-entropy, are based on integrals over the segmentation regions. Unfortunately, for highly unbalanced segmentations, such regional summations have values that differ by several orders of magnitude across classes, which affects training performance and stability. We propose a boundary loss, which takes the form of a distance metric on the space of contours, not regions. This can mitigate the difficulties of highly unbalanced problems because it uses integrals over the interface between regions instead of unbalanced integrals over the regions. Furthermore, a boundary loss complements regional information. Inspired by graph-based optimization techniques for computing active-contour flows, we express a non-symmetric L2 distance on the space of contours as a regional integral, which avoids completely local differential computations involving contour points. This yields a boundary loss expressed with the regional softmax probability outputs of the network, which can be easily combined with standard regional losses and implemented with any existing deep network architecture for N-D segmentation. We report comprehensive evaluations and comparisons on different unbalanced problems, showing that our boundary loss can yield significant increases in performances while improving training stability. Our code is publicly available.
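The abstract states that the boundary loss can be written as a regional integral over the softmax outputs. A minimal sketch of that idea, under stated assumptions, weights each foreground probability by a signed distance to the ground-truth region (negative inside, positive outside) and averages over pixels. The brute-force distance computation, the pixel-to-nearest-opposite-class approximation of the contour distance, and the function names are illustrative choices, not the authors' released code.

```python
import math


def signed_distance_map(mask):
    """Signed distance for a 2-D binary mask (list of lists of 0/1).

    Assumption: distance to the contour is approximated by the distance to
    the nearest pixel of the opposite class; negative inside the foreground,
    positive outside. Brute force, so only suitable for tiny examples.
    """
    h, w = len(mask), len(mask[0])
    fg = [(i, j) for i in range(h) for j in range(w) if mask[i][j] == 1]
    bg = [(i, j) for i in range(h) for j in range(w) if mask[i][j] == 0]
    if not fg or not bg:
        # No contour exists; return an all-zero map.
        return [[0.0] * w for _ in range(h)]
    dist = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            inside = mask[i][j] == 1
            others = bg if inside else fg
            d = min(math.hypot(i - a, j - b) for a, b in others)
            dist[i][j] = -d if inside else d
    return dist


def boundary_loss(probs, mask):
    """Mean over pixels of signed distance times foreground probability."""
    phi = signed_distance_map(mask)
    h, w = len(mask), len(mask[0])
    total = sum(phi[i][j] * probs[i][j] for i in range(h) for j in range(w))
    return total / (h * w)
```

With this sign convention, probability mass placed far outside the ground-truth region is penalised in proportion to its distance, while mass inside the region lowers the loss. In practice such a term would be added to a regional loss such as Dice or cross-entropy, as the abstract suggests.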


Subject(s)
Image Processing, Computer-Assisted , Humans
19.
Transl Vis Sci Technol ; 9(2): 34, 2020 06.
Article in English | MEDLINE | ID: mdl-32832207

ABSTRACT

Purpose: Introducing a new technique to improve deep learning (DL) models designed for automatic grading of diabetic retinopathy (DR) from retinal fundus images by enhancing predictions' consistency. Methods: A convolutional neural network (CNN) was optimized in three different manners to predict DR grade from eye fundus images. The optimization criteria were (1) the standard cross-entropy (CE) loss; (2) CE supplemented with label smoothing (LS), a regularization approach widely employed in computer vision tasks; and (3) our proposed non-uniform label smoothing (N-ULS), a modification of LS that models the underlying structure of expert annotations. Results: Performance was measured in terms of quadratic-weighted κ score (quad-κ) and average area under the receiver operating characteristic curve (AUROC), as well as with suitable metrics for analyzing diagnostic consistency, like weighted precision, recall, and F1 score, or Matthews correlation coefficient. While LS generally harmed the performance of the CNN, N-ULS statistically significantly improved performance with respect to CE in terms of quad-κ score (73.17 vs. 77.69, P < 0.025), without any performance decrease in average AUROC. N-ULS achieved this while simultaneously increasing performance for all other analyzed metrics. Conclusions: For extending standard modeling approaches from DR detection to the more complex task of DR grading, it is essential to consider the underlying structure of expert annotations. The approach introduced in this article can be easily implemented in conjunction with deep neural networks to increase their consistency without sacrificing per-class performance. Translational Relevance: A straightforward modification of current standard training practices of CNNs can substantially improve consistency in DR grading, better modeling expert annotations and human variability.
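The three training targets compared in the abstract can be contrasted with a small sketch. Standard label smoothing is shown in its usual form. The non-uniform variant below is only an assumption about what "modeling the structure of expert annotations" could look like for ordinal grades: it spreads mass with a Gaussian kernel over neighbouring grades. The abstract does not specify the weighting, so the kernel, `sigma`, and all function names are hypothetical.

```python
import math


def one_hot(label, n_classes):
    """Hard target used by the standard cross-entropy loss."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]


def uniform_label_smoothing(label, n_classes, eps=0.1):
    """Standard label smoothing: mix the hard target with a uniform one."""
    return [(1.0 - eps) * t + eps / n_classes for t in one_hot(label, n_classes)]


def nonuniform_label_smoothing(label, n_classes, sigma=1.0):
    """Hypothetical non-uniform smoothing for ordinal grades.

    Assumption: neighbouring grades receive more mass than distant ones,
    here through a normalised Gaussian kernel on the grade distance.
    """
    weights = [
        math.exp(-((k - label) ** 2) / (2.0 * sigma ** 2))
        for k in range(n_classes)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def cross_entropy(target, probs):
    """Cross-entropy between a (soft) target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)
```

For a five-grade scale, `uniform_label_smoothing(2, 5)` gives every wrong grade the same small mass, whereas `nonuniform_label_smoothing(2, 5)` gives grades 1 and 3 more mass than grades 0 and 4, so confusing adjacent grades costs less than confusing distant ones.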


Subject(s)
Diabetes Mellitus , Diabetic Retinopathy , Diabetic Retinopathy/diagnostic imaging , Fundus Oculi , Humans , Neural Networks, Computer
20.
Neural Netw ; 130: 297-308, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32721843

ABSTRACT

An efficient strategy for weakly-supervised segmentation is to impose constraints or regularization priors on target regions. Recent efforts have focused on incorporating such constraints in the training of convolutional neural networks (CNN); however, this has so far been done within a continuous optimization framework. Yet, various segmentation constraints and regularization priors can be modeled and optimized more efficiently in a discrete formulation. This paper proposes a method, based on the alternating direction method of multipliers (ADMM) algorithm, to train a CNN with discrete constraints and regularization priors. This method is applied to the segmentation of medical images with weak annotations, where both size constraints and boundary length regularization are enforced. Experiments on two benchmark datasets for medical image segmentation show our method to provide significant improvements compared to existing approaches in terms of segmentation accuracy, constraint satisfaction and convergence speed.
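To make the idea of a discrete, size-constrained step concrete, the sketch below solves a simplified sub-problem: given per-pixel foreground probabilities, find the binary labeling closest to them (in squared error) whose foreground size lies within given bounds. This is an illustrative simplification, not the paper's algorithm; it omits the ADMM dual variables and the boundary-length regularization, and the function name and arguments are hypothetical.

```python
def project_to_size_constraint(probs, min_size, max_size):
    """Closest binary labeling to `probs` with min_size <= size <= max_size.

    Switching pixel i from 0 to 1 changes the squared error by 1 - 2 * p_i,
    so the optimum labels the highest-probability pixels as foreground:
    all pixels with p > 0.5, clipped to the allowed size range.
    """
    # Pixel indices sorted from most to least likely foreground.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    n_confident = sum(1 for p in probs if p > 0.5)
    size = max(min_size, min(max_size, n_confident))
    labels = [0] * len(probs)
    for i in order[:size]:
        labels[i] = 1
    return labels
```

In an ADMM-style scheme, a step of this kind would alternate with a network update that pulls the continuous outputs toward the discrete labeling; the bounds `min_size` and `max_size` stand in for the weak size annotations mentioned in the abstract.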


Subject(s)
Pattern Recognition, Automated/methods , Supervised Machine Learning , Algorithms , Humans , Image Processing, Computer-Assisted/methods , Neural Networks, Computer
...