Results 1 - 20 of 54
1.
Sci Rep ; 14(1): 23449, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39379560

ABSTRACT

We propose an improved superpixel segmentation algorithm based on visual saliency and color entropy for online color detection in printed products. The method addresses the low accuracy and slow speed of existing approaches to detecting color deviations in print quality control. The improved algorithm consists of three main steps. First, human visual perception is simulated to obtain the visually salient regions of the image, enabling region-based superpixel segmentation. Second, the superpixel size within the salient regions is determined adaptively using color information entropy. Finally, the segmentation is optimized using a chromaticity-based hue angle distance, yielding a region-based adaptive superpixel segmentation algorithm. For color detection, the color mean values of post-printing images are compared under the same superpixel labels, and labels with color deviations are output to identify areas of color difference. The experimental results show that the improved superpixel algorithm, by introducing the hue angle distance, achieves better segmentation accuracy and, combined with human visual perception, better reproduces the color information of printed materials. Using this method for print color quality inspection reduces data computation, quickly detects and marks color difference areas, and provides the degree of color deviation.
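A minimal sketch of two building blocks named in this abstract: the color information entropy used to adapt superpixel size, and a hue angle distance between CIELAB colors. The exact formulations in the paper are not given here, so the quantization scheme and distance wrap-around below are assumptions.

```python
import numpy as np

def color_entropy(region, bins=16):
    """Shannon entropy of quantized colors in a region (H, W, 3), values in [0, 1].
    High entropy suggests color-rich areas that may warrant smaller superpixels."""
    q = np.clip((np.asarray(region) * bins).astype(int), 0, bins - 1)
    # Joint color code per pixel so co-occurring channel values count together
    codes = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def hue_angle_distance(lab1, lab2):
    """Distance between two CIELAB colors based on hue angle h = atan2(b*, a*)."""
    h1 = np.arctan2(lab1[2], lab1[1])
    h2 = np.arctan2(lab2[2], lab2[1])
    d = abs(float(h1 - h2))
    return min(d, 2 * np.pi - d)  # wrap around the hue circle
```

A uniform patch has zero entropy, while the hue distance separates colors with equal lightness but rotated chroma.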

2.
Schizophr Res Cogn ; 38: 100324, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39238484

ABSTRACT

Background: Visual exploration is abnormal in schizophrenia; however, few studies have investigated the physiological responses that accompany target selection in more ecological scenarios. This study aimed to demonstrate that people with schizophrenia have difficulty observing the prominent elements of an image due to a deficient sensory modulation mechanism (active sensing) during natural vision. Methods: Electroencephalogram recordings with eye-tracking data were collected from 18 healthy individuals and 18 people affected by schizophrenia (SCZ) while they looked at natural images. The images contained a prominent color element and blinking produced by changes in image luminance. Results: In the SCZ group, we found fewer fixations when scanning all images, late focus on prominent image areas, decreased amplitude in the eye-fixation-related potential, and decreased intertrial coherence. Conclusions: The decrease in the visual attention response evoked by the prominence of visual stimuli in patients affected by schizophrenia is generated by a reduction in the endogenous attention mechanisms that initiate and maintain visual exploration. Further work is required to explain the relationship between this decrease and clinical indicators.

3.
Brain Sci ; 14(8)2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39199448

ABSTRACT

To ensure survival, the visual system must rapidly extract the most important elements from a large stream of information. This necessity clashes with the computational limitations of the human brain, so a strong early data reduction is required to efficiently process information in fast vision. A theoretical early vision model, recently developed to preserve maximum information using minimal computational resources, allows efficient image data reduction by extracting simplified sketches containing only optimally informative, salient features. Here, we investigate the neural substrates of this mechanism for optimal encoding of information, possibly located in early visual structures. We adopted a flicker adaptation paradigm, which has been demonstrated to specifically impair the contrast sensitivity of the magnocellular pathway. We compared flicker-induced contrast threshold changes in three different tasks. The results indicate that, after adapting to a uniform flickering field, thresholds for image discrimination using briefly presented sketches increase. Similar threshold elevations occur for motion discrimination, a task typically targeting the magnocellular system. Instead, contrast thresholds for orientation discrimination, a task typically targeting the parvocellular system, do not change with flicker adaptation. The computation performed by this early data reduction mechanism seems thus consistent with magnocellular processing.

4.
Acta Psychol (Amst) ; 243: 104124, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38232506

ABSTRACT

In the first years of life, infants progressively develop attention selection skills to gather information from visually cluttered environments. As young as newborns, infants are sensitive to differences in color, orientation, and luminance, the components of visual saliency. However, we know little about how saliency-driven attention emerges and develops socially through everyday free-viewing experiences. The present work assessed saliency changes in infants' egocentric scenes and investigated the impact of manual engagement on infant object looking in the interactive context of object play. Thirty parent-infant dyads, with infants in two age groups (younger: 3 to 6 months; older: 9 to 12 months), completed a brief session of object play. Infants' looking behaviors were recorded with head-mounted eye-tracking gear, and both parents' and infants' manual actions on objects were annotated separately for analysis. The findings revealed distinct attention mechanisms underlying hand-eye coordination between parents and infants, and within infants, during object play: younger infants were predominantly biased toward the visual saliency accompanying the parent's handling of the objects, whereas older infants gradually allocated more attention to the object itself, regardless of its saliency in view, as they gained more self-generated manual actions. Taken together, the present work highlights the tight coordination between visual experience and sensorimotor competence and proposes a novel dyadic pathway to sustained attention, in which social sensitivity to parents' hands emerges through saliency-driven attention, preparing infants to focus on, follow, and steadily track moving targets in free-flowing viewing activities.


Subject(s)
Attention , Child Development , Visual Perception , Humans , Infant
5.
Vision Res ; 211: 108281, 2023 10.
Article in English | MEDLINE | ID: mdl-37421829

ABSTRACT

Models of emotion processing suggest that threat-related stimuli such as fearful faces can be detected based on the rapid extraction of low spatial frequencies. However, this remains debated, as other models argue that the decoding of facial expressions relies on a more flexible use of spatial frequencies. The purpose of this study was to clarify the role of spatial frequencies, and of differences in luminance contrast between spatial frequencies, in the detection of facial emotions. We used a saccadic choice task in which emotional-neutral face pairs were presented and participants were asked to make a saccade toward the neutral or the emotional (happy or fearful) face. Faces were displayed in low, high, or broad spatial frequencies. Results showed that participants were better at making saccades toward the emotional face. They also performed better with high or broad than with low spatial frequencies, and accuracy was higher with a happy target. An analysis of the eye and mouth saliency of our stimuli revealed that the mouth saliency of the target correlates with participants' performance. Overall, this study underlines the importance of local over global information, and of the saliency of the mouth region, in the detection of emotional and neutral faces.


Subject(s)
Emotions , Saccades , Humans , Happiness , Facial Expression
6.
Brain Dev ; 45(8): 432-444, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37188548

ABSTRACT

Individuals with cerebral visual impairment (CVI) have difficulties identifying common objects, especially when presented as cartoons or abstract images. In this study, participants were shown a series of images of ten common objects, each from five possible categories ranging from abstract black & white line drawings to color photographs. Fifty individuals with CVI and 50 neurotypical controls verbally identified each object and success rates and reaction times were collected. Visual gaze behavior was recorded using an eye tracker to quantify the extent of visual search area explored and number of fixations. A receiver operating characteristic (ROC) analysis was also carried out to compare the degree of alignment between the distribution of individual eye gaze patterns and image saliency features computed by the graph-based visual saliency (GBVS) model. Compared to controls, CVI participants showed significantly lower success rates and longer reaction times when identifying objects. In the CVI group, success rate improved moving from abstract black & white images to color photographs, suggesting that object form (as defined by outlines and contours) and color are important cues for correct identification. Eye tracking data revealed that the CVI group showed significantly greater visual search areas and number of fixations per image, and the distribution of eye gaze patterns in the CVI group was less aligned with the high saliency features of the image compared to controls. These results have important implications in helping to understand the complex profile of visual perceptual difficulties associated with CVI.
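The ROC analysis described above scores how well a saliency map (here, GBVS-derived) separates fixated from non-fixated pixels. A minimal sketch of that alignment score, without reimplementing the GBVS model itself, is the standard AUC of classifying pixels by their saliency value:

```python
import numpy as np

def fixation_saliency_auc(saliency, fixation_mask):
    """ROC-style alignment between a saliency map and binary fixation locations:
    AUC of separating fixated vs. non-fixated pixels by saliency value.
    Equivalently, the probability that a random fixated pixel outranks a
    random non-fixated one (ties count half)."""
    s = saliency.ravel()
    f = fixation_mask.ravel().astype(bool)
    pos, neg = s[f], s[~f]
    if len(pos) == 0 or len(neg) == 0:
        return 0.5  # degenerate case: no discrimination possible
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return float(wins / (len(pos) * len(neg)))
```

An AUC near 1.0 means gaze landed on high-saliency regions; near 0.5 means gaze was unrelated to the model's saliency, as reported for the CVI group relative to controls.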


Subject(s)
Brain Diseases , Eye Movements , Humans , Attention , Fixation, Ocular , Visual Perception , Vision Disorders
7.
Front Neurosci ; 17: 926418, 2023.
Article in English | MEDLINE | ID: mdl-36998731

ABSTRACT

This paper conjectures and validates a framework that allows for action during inference in supervised neural networks. Supervised neural networks are constructed with the objective of maximizing their performance metric on a given task, which is done by reducing free energy and its associated surprisal during training. However, the bottom-up inference of supervised networks is a passive process that renders them fallible to noise. In this paper, we provide a thorough background on supervised neural networks, both generative and discriminative, and discuss their functionality from the perspective of the free energy principle. We then provide a framework for introducing action during inference. We introduce a new measurement, called stochastic surprisal, that is a function of the network, the input, and any possible action. This action can be any one of the outputs the neural network has learned, thereby lending stochasticity to the measurement. Stochastic surprisal is validated on two applications: image quality assessment and recognition under noisy conditions. We show that noise characteristics are ignored for robust recognition but analyzed to estimate image quality scores. We apply stochastic surprisal to two applications and three datasets, and as a plug-in on 12 networks. In all cases, it provides a statistically significant increase across all measures. We conclude by discussing the implications of the proposed stochastic surprisal in other areas of cognitive psychology, including expectancy mismatch and abductive reasoning.

8.
Sensors (Basel) ; 23(5)2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36904837

ABSTRACT

The just noticeable difference (JND) model reflects the visibility limitations of the human visual system (HVS); it plays an important role in perceptual image/video processing and is commonly applied to perceptual redundancy removal. However, existing JND models are usually constructed by treating the color components of the three channels equally, and their estimation of the masking effect is inadequate. In this paper, we introduce visual saliency and color sensitivity modulation to improve the JND model. First, we comprehensively combine contrast masking, pattern masking, and edge protection to estimate the masking effect. Then, the visual saliency of the HVS is taken into account to adaptively modulate the masking effect. Finally, we build color sensitivity modulation according to the perceptual sensitivities of the HVS to adjust the sub-JND thresholds of the Y, Cb, and Cr components, yielding the color-sensitivity-based JND (CSJND) model. Extensive experiments and subjective tests were conducted to verify the effectiveness of the CSJND model. We found that the consistency between the CSJND model and the HVS is better than that of existing state-of-the-art JND models.
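To make the saliency-modulation idea concrete, here is a sketch built on a classic background-luminance JND curve (Chou & Li style), with a multiplicative saliency modulation. The modulation form and the constant `k` are illustrative assumptions, not the CSJND formulation:

```python
import numpy as np

def luminance_jnd(bg):
    """Classic luminance-adaptation JND threshold as a function of background
    luminance bg in [0, 255] (Chou & Li-style piecewise curve)."""
    bg = np.asarray(bg, dtype=float)
    low = 17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0   # dark backgrounds mask more
    high = 3.0 / 128.0 * (bg - 127.0) + 3.0          # bright side rises linearly
    return np.where(bg <= 127, low, high)

def saliency_modulated_jnd(base_jnd, saliency, k=0.5):
    """Shrink JND thresholds in salient regions: the HVS tolerates less distortion
    where it attends. saliency in [0, 1]; k sets modulation strength (assumed)."""
    return base_jnd * (1.0 - k * np.clip(saliency, 0.0, 1.0))
```

At a mid-gray background the base threshold bottoms out near 3 gray levels, and a fully salient pixel (saliency = 1) halves that tolerance under this choice of `k`.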

9.
Front Hum Neurosci ; 17: 1049615, 2023.
Article in English | MEDLINE | ID: mdl-36845876

ABSTRACT

In naturalistic conditions, objects in a scene may be partly occluded and the visual system has to recognize the whole image based on the little information contained in the visible fragments. Previous studies demonstrated that humans can successfully recognize severely occluded images, but the underlying mechanisms occurring in the early stages of visual processing are still poorly understood. The main objective of this work is to investigate the contribution of local information contained in a few visible fragments to image discrimination in fast vision. It has already been shown that a specific set of features, predicted by a constrained maximum-entropy model to be optimal carriers of information (optimal features), are used to build simplified early visual representations (a primal sketch) that are sufficient for fast image discrimination. These features are also considered salient by the visual system and can guide visual attention when presented in isolation in artificial stimuli. Here, we explore whether these local features also play a significant role in more natural settings, where all existing features are kept but the overall available information is drastically reduced. Indeed, the task requires discrimination of naturalistic images based on a very brief presentation (25 ms) of a few small visible image fragments. In the main experiment, we reduced the possibility of performing the task based on global-luminance positional cues by presenting randomly contrast-inverted images, and we measured how much observers' performance relies on the local features contained in the fragments versus on global information. The size and number of fragments were determined in two preliminary experiments. Results show that observers are very skilled in fast image discrimination, even when drastic occlusion is applied.
When observers cannot rely on the position of global-luminance information, the probability of correct discrimination increases when the visible fragments contain a high number of optimal features. These results suggest that such optimal local information contributes to the successful reconstruction of naturalistic images even in challenging conditions.

10.
Insects ; 14(2)2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36835667

ABSTRACT

As insect infestation is the leading cause of nutritive and economic losses in stored grains, it is important to detect the presence and number of insects so that proper control measures can be taken. Inspired by the human visual attention mechanism, we propose a U-Net-like frequency-enhanced saliency detection model (FESNet), resulting in pixelwise segmentation of grain pests. Frequency clues, as well as spatial information, are leveraged to enhance the detection of small insects against the cluttered grain background. First, we collect a dedicated dataset, GrainPest, with pixel-level annotation, after analyzing the image attributes of existing salient object detection datasets. Second, we design FESNet with both the discrete wavelet transform (DWT) and the discrete cosine transform (DCT) integrated into the traditional convolution layers. As current salient object detection models reduce spatial information through pooling operations in the encoding stages, a special DWT branch is connected to the higher stages to capture accurate spatial information for saliency detection. We then introduce the DCT into the backbone bottlenecks to enhance channel attention with low-frequency information. Moreover, we propose a new receptive field block (NRFB) that enlarges the receptive fields by aggregating three atrous convolution features. Finally, in the decoding phase, we use the high-frequency information and the aggregated features together to restore the saliency map. Extensive experiments and ablation studies on our GrainPest dataset and the open Salient Objects in Clutter (SOC) dataset demonstrate that the proposed model performs favorably against state-of-the-art models.
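The DCT-based channel attention mentioned above summarizes each feature channel by its low-frequency DCT content rather than a plain average. A minimal numpy sketch of such a descriptor follows; the particular frequency set, and reducing by a simple sum, are assumptions for illustration (FESNet's actual attention head is a learned network):

```python
import numpy as np

def dct2_basis(h, w, u, v):
    """2-D DCT-II basis function B_{u,v} on an h x w grid (unnormalized)."""
    y = np.cos(np.pi * (2 * np.arange(h) + 1) * u / (2 * h))
    x = np.cos(np.pi * (2 * np.arange(w) + 1) * v / (2 * w))
    return np.outer(y, x)

def dct_channel_descriptor(feat, freqs=((0, 0), (0, 1), (1, 0))):
    """Per-channel low-frequency descriptor for a (C, H, W) feature map:
    project each channel onto a few low-frequency DCT bases and sum.
    With freqs=((0,0),) this reduces to the channel sum (global pooling)."""
    c, h, w = feat.shape
    desc = np.zeros(c)
    for u, v in freqs:
        desc += (feat * dct2_basis(h, w, u, v)).sum(axis=(1, 2))
    return desc
```

Using only the (0, 0) basis recovers ordinary global pooling, which shows what the extra low-frequency terms add on top of a standard squeeze step.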

11.
Behav Res Methods ; 55(6): 2940-2959, 2023 09.
Article in English | MEDLINE | ID: mdl-36002630

ABSTRACT

In the process of making a movie, directors constantly care about where the spectator will look on the screen. Shot composition, framing, camera movements, and editing are tools commonly used to direct attention. To enable a quantitative analysis of the relationship between these tools and gaze patterns, we propose a new eye-tracking database containing gaze-pattern information on movie sequences, together with editing annotations, and we show how state-of-the-art computational saliency techniques behave on this dataset. In this work, we expose strong links between movie editing and spectators' gaze distributions, and we open several leads on how knowledge of editing information could improve human visual attention modeling for cinematic content. The dataset generated and analyzed for this study is available at https://github.com/abruckert/eye_tracking_filmmaking.


Subject(s)
Eye Movements , Motion Pictures , Humans , Movement , Fixation, Ocular
12.
Sensors (Basel) ; 22(17)2022 Aug 25.
Article in English | MEDLINE | ID: mdl-36080849

ABSTRACT

The purpose of infrared and visible image fusion is to generate images with prominent targets and rich information, which provides a basis for target detection and recognition. Among existing image fusion methods, traditional methods readily produce artifacts and do not fully preserve visible targets and texture details, especially for image fusion under dark scenes and smoke conditions. We therefore propose an infrared and visible image fusion method based on a visual saliency image and image contrast enhancement. Because low image contrast makes fusion difficult, an improved gamma correction and local mean method is used to enhance the contrast of the input images. To suppress the artifacts that tend to occur during fusion, a differential rolling guidance filter (DRGF) is adopted to decompose the input image into a base layer and a detail layer; compared with traditional multi-scale decomposition, this retains specific edge information and reduces artifacts. To make the salient object of the fused image prominent and fully preserve texture details, a saliency map is extracted from the infrared image and used both to guide the target weight of the fused image and to control the fusion weight of the base layer, improving on the traditional 'average' fusion rule, which weakens contrast information. In addition, a method based on pixel intensity and gradient is proposed to fuse the detail layer and retain edge and detail information to the greatest extent. Experimental results show that the proposed method is superior to other fusion algorithms in both subjective and objective terms.
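The first step above, gamma correction steered by the local mean, can be sketched as follows. The exact mapping from local mean to gamma is not specified in the abstract, so the linear rule below (dark neighborhoods get gamma < 1 and are brightened, bright ones gamma > 1) is an assumption:

```python
import numpy as np

def adaptive_gamma_enhance(img, win=15):
    """Contrast enhancement via per-pixel gamma correction driven by the local
    mean, computed with an integral image (no external dependencies).
    img is a grayscale array with values in [0, 1]."""
    img = np.asarray(img, dtype=float)
    pad = win // 2
    p = np.pad(img, pad, mode='edge')
    # Integral image for fast box-filter local means
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    local_mean = (c[win:, win:] - c[:-win, win:]
                  - c[win:, :-win] + c[:-win, :-win]) / (win * win)
    # Assumed rule: mean 0.5 -> gamma 1 (identity); darker -> brighten, brighter -> darken
    gamma = 0.5 + local_mean
    return np.clip(img, 0.0, 1.0) ** gamma
```

A mid-gray image passes through unchanged, which makes the mapping easy to sanity-check before tuning it on real low-contrast inputs.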


Subject(s)
Algorithms , Image Enhancement , Artifacts , Image Enhancement/methods , Image Processing, Computer-Assisted/methods
13.
Entropy (Basel) ; 24(6)2022 Jun 19.
Article in English | MEDLINE | ID: mdl-35741563

ABSTRACT

Recently, the rapid development of the Internet of Things has contributed to the growth of telemedicine. However, online diagnosis requires doctors to analyze multiple multi-modal medical images, which is inconvenient and inefficient. Multi-modal medical image fusion has been proposed to solve this problem. Owing to their outstanding feature extraction and representation capabilities, convolutional neural networks (CNNs) have been widely used in medical image fusion. However, most existing CNN-based medical image fusion methods calculate their weight maps with a simple weighted-average strategy, which degrades the quality of the fused images through the influence of inessential information. In this paper, we propose a CNN-based CT and MRI image fusion method (MMAN) that adopts a visual saliency-based strategy to preserve more useful information. First, a multi-scale mixed attention block is designed to extract features; this block gathers more helpful information and refines the extracted features at both the channel and spatial levels. Then, a visual saliency-based fusion strategy is used to fuse the feature maps. Finally, the fused image is obtained via reconstruction blocks. Compared to other state-of-the-art methods, our method preserves more texture detail, clearer edge information, and higher contrast.
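The contrast with a "simple weighted average" can be illustrated with a pixel-level sketch of saliency-weighted fusion: weights come from a per-pixel saliency measure instead of being constant. Using Laplacian magnitude as the saliency measure is an assumption here; MMAN computes saliency on learned feature maps:

```python
import numpy as np

def saliency_weighted_fusion(ct, mri, eps=1e-8):
    """Fuse two registered grayscale images with per-pixel weights derived from
    a crude saliency measure (4-neighbour Laplacian magnitude), rather than a
    plain 0.5/0.5 average."""
    def laplacian_saliency(x):
        p = np.pad(x, 1, mode='edge')
        lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * x
        return np.abs(lap)
    s1, s2 = laplacian_saliency(ct), laplacian_saliency(mri)
    w = (s1 + eps) / (s1 + s2 + 2 * eps)   # normalized weight for the first image
    return w * ct + (1 - w) * mri
```

Where one modality carries more local structure, its weight dominates; identical inputs reduce to the identity, which the degenerate case below confirms.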

14.
Pattern Recognit ; 121, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34483373

ABSTRACT

Salient object detection (SOD) is viewed as a pixel-wise saliency modeling task by traditional deep learning-based methods. A limitation of current SOD models is insufficient utilization of inter-pixel information, which usually results in imperfect segmentation near edge regions and low spatial coherence. As we demonstrate, using a saliency mask as the only label is suboptimal. To address this limitation, we propose a connectivity-based approach called bilateral connectivity network (BiconNet), which uses connectivity masks together with saliency masks as labels for effective modeling of inter-pixel relationships and object saliency. Moreover, we propose a bilateral voting module to enhance the output connectivity map, and a novel edge feature enhancement method that efficiently utilizes edge-specific features. Through comprehensive experiments on five benchmark datasets, we demonstrate that our proposed method can be plugged into any existing state-of-the-art saliency-based SOD framework to improve its performance with negligible parameter increase.
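The connectivity masks used as auxiliary labels above encode, for each pixel, which of its 8 neighbours are also salient. A minimal sketch of building such a label from a binary saliency mask (the channel ordering is an assumption):

```python
import numpy as np

def connectivity_mask(mask):
    """Turn a binary saliency mask (H, W) into an 8-channel connectivity label:
    channel k is 1 where the pixel AND its k-th 8-neighbour are both salient,
    so the network is supervised on inter-pixel relations, not pixels alone."""
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    p = np.pad(mask, 1)  # zero padding: border pixels have no outside neighbours
    h, w = mask.shape
    out = np.zeros((8, h, w), dtype=mask.dtype)
    for k, (dy, dx) in enumerate(offsets):
        neigh = p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        out[k] = mask * neigh
    return out
```

An isolated salient pixel yields an all-zero connectivity label, while adjacent salient pixels activate the channels pointing at each other; this is what gives edge pixels a distinctive supervision signal.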

15.
Sensors (Basel) ; 21(22)2021 Nov 13.
Article in English | MEDLINE | ID: mdl-34833622

ABSTRACT

Geostationary optical remote sensing satellites, such as the GF-4, have a high temporal resolution and wide coverage, which enables the continuous tracking and observation of ship targets over a large range. However, the ship targets in the images are usually small and dim and the images are easily affected by clouds, islands and other factors, which make it difficult to detect the ship targets. This paper proposes a new method for detecting ships moving on the sea surface using GF-4 satellite images. First, the adaptive nonlinear gray stretch (ANGS) method was used to enhance the image and highlight small and dim ship targets. Second, a multi-scale dual-neighbor difference contrast measure (MDDCM) method was designed to enable detection of the position of the candidate ship target. The shape characteristics of each candidate area were analyzed to remove false ship targets. Finally, the joint probability data association (JPDA) method was used for multi-frame data association and tracking. Our results suggest that the proposed method can effectively detect and track moving ship targets in GF-4 satellite optical remote sensing images, with better detection performance than other classical methods.
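The MDDCM step above scores each pixel by how much its local cell stands out from surrounding cells. The single-scale sketch below captures that idea (centre cell mean minus the maximum mean of the eight surrounding cells, clipped at zero); the full MDDCM is multi-scale and uses a dual-neighbour arrangement, so treat this as a simplified illustration:

```python
import numpy as np

def dual_neighbor_contrast(img, cell=3):
    """One-scale contrast measure for small, dim targets: for each pixel, the
    mean of its cell x cell neighbourhood minus the largest mean among the
    eight surrounding cells. Background and clutter score near zero."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    pad = 2 * cell
    p = np.pad(img, pad, mode='edge')
    r = cell // 2

    def box_mean(dy, dx):
        # mean of the cell x cell block centred at (y + dy, x + dx)
        out = np.zeros((h, w))
        for i in range(-r, r + 1):
            for j in range(-r, r + 1):
                out += p[pad + dy + i: pad + dy + i + h,
                         pad + dx + j: pad + dx + j + w]
        return out / (cell * cell)

    center = box_mean(0, 0)
    neigh = [box_mean(dy * cell, dx * cell)
             for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    return np.clip(center - np.max(neigh, axis=0), 0.0, None)
```

A lone bright pixel on a dark sea surface produces a positive response only at the target, which is the property the candidate-detection stage relies on.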

16.
Vision (Basel) ; 5(4)2021 Oct 19.
Article in English | MEDLINE | ID: mdl-34698314

ABSTRACT

Visual saliency maps have been developed to estimate the bottom-up visual attention of humans. A conventional saliency map represents bottom-up visual attention using image features such as intensity, orientation, and color. However, it is difficult to estimate visual attention with a conventional saliency map in the case of top-down visual attention. In this study, we investigate the visual saliency of characters using still images containing both characters and symbols. The experimental results indicate that characters have a specific visual saliency that is independent of the type of language.
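The "conventional saliency map" referred to here is the Itti-style combination of centre-surround differences on intensity and colour-opponent channels. A minimal sketch, using box blurs at two sizes instead of Gaussian pyramids (a simplification), and omitting the orientation channel:

```python
import numpy as np

def simple_saliency(rgb):
    """Bottom-up saliency sketch: centre-surround differences on intensity and
    two colour-opponent channels (RG, BY), summed and normalized to [0, 1]."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    channels = [(r + g + b) / 3.0, r - g, b - (r + g) / 2.0]  # intensity, RG, BY

    def box_blur(x, win):
        pad = win // 2
        p = np.pad(x, pad, mode='edge')
        out = np.zeros_like(x)
        for i in range(win):
            for j in range(win):
                out += p[i:i + x.shape[0], j:j + x.shape[1]]
        return out / (win * win)

    sal = np.zeros(rgb.shape[:2])
    for c in channels:
        sal += np.abs(box_blur(c, 3) - box_blur(c, 9))  # centre minus surround
    return sal / sal.max() if sal.max() > 0 else sal
```

A lone red dot on a gray field wins on both the intensity and colour-opponent channels, so the map peaks at the dot, which is exactly the bottom-up behaviour the paper contrasts with character-specific (top-down) saliency.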

17.
Front Neurosci ; 15: 645743, 2021.
Article in English | MEDLINE | ID: mdl-33994923

ABSTRACT

Under fast viewing conditions, the visual system extracts salient and simplified representations of complex visual scenes. Saccadic eye movements optimize such visual analysis through the dynamic sampling of the most informative and salient regions in the scene. However, a general definition of saliency, as well as its role for natural active vision, is still a matter for discussion. Following the general idea that visual saliency may be based on the amount of local information, a recent constrained maximum-entropy model of early vision, applied to natural images, extracts a set of local optimal information-carriers, as candidate salient features. These optimal features proved to be more informative than others in fast vision, when embedded in simplified sketches of natural images. In the present study, for the first time, these features were presented in isolation, to investigate whether they can be visually more salient than other non-optimal features, even in the absence of any meaningful global arrangement (contour, line, etc.). In four psychophysics experiments, fast discriminability of a compound of optimal features (target) in comparison with a similar compound of non-optimal features (distractor) was measured as a function of their number and contrast. Results showed that the saliency predictions from the constrained maximum-entropy model are well verified in the data, even when the optimal features are presented in smaller numbers or at lower contrast. In the eye movements experiment, the target and the distractor compounds were presented in the periphery at different angles. Participants were asked to perform a simple choice-saccade task. Results showed that saccades can select informative optimal features spatially interleaved with non-optimal features even at the shortest latencies. Saccades' choice accuracy and landing position precision improved with SNR. 
In conclusion, the optimal features predicted by the reference model turn out to be more salient than others, despite the lack of any cues from a globally meaningful structure, suggesting that they receive preferential treatment during fast image analysis. Moreover, fast peripheral visual processing of these informative local features is able to guide gaze orientation. We speculate that active vision is efficiently adapted to maximize information in natural visual scenes.

18.
Psychon Bull Rev ; 28(5): 1601-1614, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34009623

ABSTRACT

Similarity-based semantic interference (SI) hinders memory recognition. Within long-term visual memory paradigms, the more scenes (or objects) from the same semantic category are viewed, the harder it is to recognize each individual instance. A growing body of evidence shows that overt attention is intimately linked to memory. However, it is yet to be understood whether SI mediates overt attention during scene encoding, which would explain its detrimental impact on recognition memory. In the current experiment, participants viewed 372 photographs belonging to different semantic categories (e.g., a kitchen) at different frequencies (4, 20, 40, or 60 images), while being eye-tracked. After 10 minutes, they were presented with the same 372 photographs plus 372 new photographs and asked whether they recognized each photo (i.e., an old/new paradigm). We found that the greater the SI, the poorer the recognition performance, especially for old scenes, for which memory representations existed. Scenes explored more widely were better recognized, but as SI increased, participants focused on more local regions of the scene in search of its potentially distinctive details. Attending to the centre of the display, or to scene regions rich in low-level saliency, was detrimental to recognition accuracy, and as SI increased participants were more likely to rely on visual saliency. The difficulty of maintaining faithful memory representations under increasing SI also manifested in longer fixation durations; indeed, more successful encoding was associated with shorter fixations. Our study highlights the interdependence between attention and memory during high-level processing of semantic information.


Subject(s)
Eye Movements , Semantics , Attention , Humans , Memory, Long-Term , Pattern Recognition, Visual , Photic Stimulation , Recognition, Psychology
19.
Front Neurorobot ; 15: 767299, 2021.
Article in English | MEDLINE | ID: mdl-35095455

ABSTRACT

This article proposes a bottom-up visual saliency model that uses the wavelet transform to conduct multiscale analysis and computation in the frequency domain. First, we compute the multiscale magnitude spectra by performing a wavelet transform to decompose the magnitude spectrum of the discrete cosine coefficients of an input image. Next, we obtain multiple saliency maps of different spatial scales through an inverse transformation from the frequency domain to the spatial domain, which utilizes the discrete cosine magnitude spectra after multiscale wavelet decomposition. Then, we employ an evaluation function to automatically select the two best multiscale saliency maps. A final saliency map is generated via an adaptive integration of the two selected multiscale saliency maps. The proposed model is fast, efficient, and can simultaneously detect salient regions or objects of different sizes. It outperforms state-of-the-art bottom-up saliency approaches in the experiments of psychophysical consistency, eye fixation prediction, and saliency detection for natural images. In addition, the proposed model is applied to automatic ship detection in optical satellite images. Ship detection tests on satellite data of visual optical spectrum not only demonstrate our saliency model's effectiveness in detecting small and large salient targets but also verify its robustness against various sea background disturbances.
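The pipeline above (decompose the magnitude spectrum at multiple scales, reconstruct saliency maps from the residual detail) can be illustrated at a single scale. The sketch below uses the FFT rather than the DCT and a one-level Haar-style block average in place of a full wavelet pyramid, so it is a heavily simplified stand-in for the proposed model:

```python
import numpy as np

def spectral_residual_saliency(img):
    """Single-scale sketch of frequency-domain saliency: smooth the
    log-magnitude spectrum with 2x2 block (Haar approximation) averaging,
    keep the residual, and transform back with the original phase."""
    f = np.fft.fft2(np.asarray(img, dtype=float))
    mag, phase = np.abs(f), np.angle(f)
    logmag = np.log1p(mag)
    h, w = logmag.shape
    # One Haar level: 2x2 block means, upsampled back to full resolution
    approx = logmag[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    smooth = np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)
    smooth = np.pad(smooth, ((0, h % 2), (0, w % 2)), mode='edge')
    residual = logmag - smooth          # what the coarse spectrum cannot explain
    sal = np.abs(np.fft.ifft2(np.expm1(residual) * np.exp(1j * phase))) ** 2
    return sal / sal.max() if sal.max() > 0 else sal
```

In the paper, several such maps at different wavelet scales are computed, the two best are selected by an evaluation function, and they are adaptively integrated into the final map.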

20.
Sensors (Basel) ; 22(1)2021 Dec 22.
Article in English | MEDLINE | ID: mdl-35009596

ABSTRACT

As a powerful technique for merging the complementary information of source images, infrared (IR) and visible image fusion is widely used in surveillance, target detection, tracking, and biometric recognition. In this paper, an efficient IR and visible image fusion method is proposed that simultaneously enhances the significant targets/regions in all source images and preserves the rich background details of the visible images. A multi-scale representation based on the fast global smoother is first used to decompose the source images into base and detail layers, in order to extract salient structure information and suppress halos around edges. Then, a target-enhanced parallel Gaussian fuzzy logic-based fusion rule is proposed to merge the base layers, which avoids brightness loss and highlights significant targets/regions. In addition, a visual saliency map-based fusion rule is designed to merge the detail layers so as to obtain rich details. Finally, the fused image is reconstructed. Extensive experiments are conducted on 21 image pairs and a Nato-camp sequence (32 image pairs) to verify the effectiveness and superiority of the proposed method. Compared with several state-of-the-art methods, the proposed method achieves more competitive or superior performance in both visual results and objective evaluation.
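The Gaussian fuzzy-logic rule for the base layers can be sketched as a per-pixel Gaussian membership function centred on bright, target-like IR intensities; pixels with high membership are drawn from the IR base layer, the rest from the visible one. The centre `mu` and width `sigma` below are illustrative choices, not the paper's parameters:

```python
import numpy as np

def fuzzy_base_fusion(ir_base, vis_base, mu=0.8, sigma=0.3):
    """Merge base layers with a Gaussian fuzzy membership on IR intensity:
    w -> 1 near 'hot target' brightness mu, w -> 0 elsewhere, so hot regions
    keep IR brightness while the background keeps visible-light detail.
    Inputs are grayscale layers with values in [0, 1]."""
    ir = np.asarray(ir_base, dtype=float)
    vis = np.asarray(vis_base, dtype=float)
    w = np.exp(-((ir - mu) ** 2) / (2.0 * sigma ** 2))  # membership of "target"
    return w * ir + (1.0 - w) * vis
```

A region at exactly the target brightness is taken entirely from the IR layer, while dark IR regions fall back almost completely to the visible base layer.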


Subject(s)
Algorithms , Fuzzy Logic , Normal Distribution