Pesquisa | Portal Regional da BVS

1.

Forensic species identification: practical guide for animal and plant DNA analysis.

Corradini, Beatrice; Gianfreda, Denise; Ferri, Gianmarco; Ferrari, Francesca; Borciani, Ilaria; Santunione, Anna Laura; Cecchi, Rossana.

Int J Legal Med ; 2024 Jul 10.

Artigo em Inglês | MEDLINE | ID: mdl-38985195

RESUMO

The importance of non-human DNA in the forensic field has increased greatly in recent years, together with the type of applications. The molecular species identification of animal and botanical material may be crucial both for wildlife trafficking and crime scene investigation. However, especially for forensic botany, several challenges slow down the implementation of the discipline in the routine.Although the importance of molecular analysis of animal origin samples is widely recognized and the same value is acknowledged to the botanical counterpart, the latter does not find the same degree of application.The availability of molecular methods, especially useful in cases where the material is fragmented, scarce or spoiled preventing the morphological identification, is not well known. This work is intended to reaffirm the relevance of non-human forensic genetics (NHFG), highlighting differences, benefits and pitfalls of the current most common molecular analysis workflow for animal and botanical samples, giving a practical guide. A flowchart describing the analysis paths, divided in three major working areas (inspection and sampling, molecular analysis, data processing and interpretation), is provided. More real casework examples of the utility of non-human evidence in forensic investigations should be shared by the scientific community, especially for plants. Moreover, concrete efforts to encourage initiatives in order to promote quality and standardization in the NHFG field are also needed.

2.

3D object reconstruction: A comprehensive view-dependent dataset.

Staszak, Rafal; Belter, Dominik.

Data Brief ; 55: 110569, 2024 Aug.

Artigo em Inglês | MEDLINE | ID: mdl-38966660

RESUMO

The dataset contains RGB, depth, segmentation images of the scenes and information about the camera poses that can be used to create a full 3D model of the scene and develop methods that reconstruct objects from a single RGB-D camera view. Data were collected in the custom simulator that loads random graspable objects and random tables from the ShapeNet dataset. The graspable object is placed above the table in a random position. Then, the scene is simulated using the PhysX engine to make sure that the scene is physically plausible. The simulator captures images of the scene from a random pose and then takes the second image from the camera pose that is on the opposite side of the scene. The second subset was created using Kinect Azure and a set of real objects located on the ArUco board that was used to estimate the camera pose.

3.

A Phone in a Basket Looks Like a Knife in a Cup: Role-Filler Independence in Visual Processing.

Hafri, Alon; Bonner, Michael F; Landau, Barbara; Firestone, Chaz.

Open Mind (Camb) ; 8: 766-794, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38957507

RESUMO

When a piece of fruit is in a bowl, and the bowl is on a table, we appreciate not only the individual objects and their features, but also the relations containment and support, which abstract away from the particular objects involved. Independent representation of roles (e.g., containers vs. supporters) and "fillers" of those roles (e.g., bowls vs. cups, tables vs. chairs) is a core principle of language and higher-level reasoning. But does such role-filler independence also arise in automatic visual processing? Here, we show that it does, by exploring a surprising error that such independence can produce. In four experiments, participants saw a stream of images containing different objects arranged in force-dynamic relations-e.g., a phone contained in a basket, a marker resting on a garbage can, or a knife sitting in a cup. Participants had to respond to a single target image (e.g., a phone in a basket) within a stream of distractors presented under time constraints. Surprisingly, even though participants completed this task quickly and accurately, they false-alarmed more often to images matching the target's relational category than to those that did not-even when those images involved completely different objects. In other words, participants searching for a phone in a basket were more likely to mistakenly respond to a knife in a cup than to a marker on a garbage can. Follow-up experiments ruled out strategic responses and also controlled for various confounding image features. We suggest that visual processing represents relations abstractly, in ways that separate roles from fillers.

4.

Scene semantics affects allocentric spatial coding for action in naturalistic (virtual) environments.

Baltaretu, Bianca R; Schuetz, Immo; Võ, Melissa L-H; Fiehler, Katja.

Sci Rep ; 14(1): 15549, 2024 Jul 05.

Artigo em Inglês | MEDLINE | ID: mdl-38969745

RESUMO

Interacting with objects in our environment requires determining their locations, often with respect to surrounding objects (i.e., allocentrically). According to the scene grammar framework, these usually small, local objects are movable within a scene and represent the lowest level of a scene's hierarchy. How do higher hierarchical levels of scene grammar influence allocentric coding for memory-guided actions? Here, we focused on the effect of large, immovable objects (anchors) on the encoding of local object positions. In a virtual reality study, participants (n = 30) viewed one of four possible scenes (two kitchens or two bathrooms), with two anchors connected by a shelf, onto which were presented three local objects (congruent with one anchor) (Encoding). The scene was re-presented (Test) with 1) local objects missing and 2) one of the anchors shifted (Shift) or not (No shift). Participants, then, saw a floating local object (target), which they grabbed and placed back on the shelf in its remembered position (Response). Eye-tracking data revealed that both local objects and anchors were fixated, with preference for local objects. Additionally, anchors guided allocentric coding of local objects, despite being task-irrelevant. Overall, anchors implicitly influence spatial coding of local object locations for memory-guided actions within naturalistic (virtual) environments.

Assuntos

Semântica , Realidade Virtual , Humanos , Feminino , Masculino , Adulto , Adulto Jovem , Percepção Espacial/fisiologia , Memória/fisiologia

5.

A comparison of differing organizational formats for teaching requesting skills to children with autism.

Agius, May M; Stansfield, Jois; Murray, Janice.

Augment Altern Commun ; : 1-12, 2024 Jul 08.

Artigo em Inglês | MEDLINE | ID: mdl-38975951

RESUMO

The selection of high-tech AAC for children diagnosed with autism spectrum disorder can be a challenging process due to the vast array of options available. One of the decisions that clinicians need to make involves how vocabulary will be organized on the display. This study aimed to compare a visual scene display (VSD) with a grid display using a multiple-probe design across participants with an embedded adapted alternating treatment design. Four young children with autism spectrum disorder who were beginning communicators were recruited and taught to request preferred items using two display formats: VSD and grid layout on a mainstream tablet with an AAC app. Two of the participants achieved criterion with both displays, the other two participants failed to achieve criterion in either display. For all participants, progress was similar in both displays. The results are discussed through the lens of each participant's characteristics with suggestions for clinical decision-making.

6.

Passive Polarized Vision for Autonomous Vehicles: A Review.

Serres, Julien R; Lapray, Pierre-Jean; Viollet, Stéphane; Kronland-Martinet, Thomas; Moutenet, Antoine; Morel, Olivier; Bigué, Laurent.

Sensors (Basel) ; 24(11)2024 May 22.

Artigo em Inglês | MEDLINE | ID: mdl-38894104

RESUMO

This review article aims to address common research questions in passive polarized vision for robotics. What kind of polarization sensing can we embed into robots? Can we find our geolocation and true north heading by detecting light scattering from the sky as animals do? How should polarization images be related to the physical properties of reflecting surfaces in the context of scene understanding? This review article is divided into three main sections to address these questions, as well as to assist roboticists in identifying future directions in passive polarized vision for robotics. After an introduction, three key interconnected areas will be covered in the following sections: embedded polarization imaging; polarized vision for robotics navigation; and polarized vision for scene understanding. We will then discuss how polarized vision, a type of vision commonly used in the animal kingdom, should be implemented in robotics; this type of vision has not yet been exploited in robotics service. Passive polarized vision could be a supplemental perceptive modality of localization techniques to complement and reinforce more conventional ones.

7.

HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group-Activity Scene Graph Generation in Videos.

Chappa, Naga Venkata Sai Raviteja; Nguyen, Pha; Le, Thi Hoang Ngan; Dobbs, Page Daniel; Luu, Khoa.

Sensors (Basel) ; 24(11)2024 May 24.

Artigo em Inglês | MEDLINE | ID: mdl-38894164

RESUMO

Group-activity scene graph (GASG) generation is a challenging task in computer vision, aiming to anticipate and describe relationships between subjects and objects in video sequences. Traditional video scene graph generation (VidSGG) methods focus on retrospective analysis, limiting their predictive capabilities. To enrich the scene-understanding capabilities, we introduced a GASG dataset extending the JRDB dataset with nuanced annotations involving appearance, interaction, position, relationship, and situation attributes. This work also introduces an innovative approach, a Hierarchical Attention-Flow (HAtt-Flow) mechanism, rooted in flow network theory to enhance GASG performance. Flow-attention incorporates flow conservation principles, fostering competition for sources and allocation for sinks, effectively preventing the generation of trivial attention. Our proposed approach offers a unique perspective on attention mechanisms, where conventional "values" and "keys" are transformed into sources and sinks, respectively, creating a novel framework for attention-based models. Through extensive experiments, we demonstrate the effectiveness of our Hatt-Flow model and the superiority of our proposed flow-attention mechanism. This work represents a significant advancement in predictive video scene understanding, providing valuable insights and techniques for applications that require real-time relationship prediction in video data.

8.

A review of the contributions of forensic archaeology and anthropology to the process of disaster victim identification.

Hanson, Ian; Fenn, James.

J Forensic Sci ; 2024 Jun 17.

Artigo em Inglês | MEDLINE | ID: mdl-38886927

RESUMO

Forensic archaeology and anthropology have developed significantly over past decades and now provide considerable assistance to the investigation process of disaster victim recovery and identification. In what are often chaotic death and crime scenes, the formal process of utilizing archaeological methods can bring control, order, and ensure systematic search. Procedures assist in defining scene extent, locating victims and evidence, rule out areas for consideration, and provide standardized recording and quality assurance through dedicated use of standardized forms (pro formas). Combined archaeological and anthropological search methods maximize opportunities to recovery the missing through identifying remains, mapping distributions, and providing accounting of victims at the scene. Anthropological assistance in examinations contributes to individual assessment, resolving commingling and fragmentation issues, and utilizing DNA sampling methods and matching data to reassociate and account for the missing. Utilization of archaeology, anthropology, and DNA matching data provides scope to review crime scene recovery and determine requirements and potential for further survey and retrieval. Adopting the most suitable methods for a particular context can maximize recovery, efficiency, and resource use. Case studies demonstrate the utility of archaeological methods in a range of scenarios. They exemplify the success of multidisciplinary analysis in providing evidence of the sequence of events, the timing of events, the impact of taphonomic processes, the location and accounting of victims, and the demonstration of systematic scene search. The considerations provided in this article, utilizing archaeology and anthropology processes, may assist investigators in planning and implementing responses to mass fatalities.

9.

Clutter resilience via auditory stream segregation in echolocating greater mouse-eared bats.

Pedersen, Michael B; Beedholm, Kristian; Hubancheva, Antoniya; Koseva, Kaloyana; Uebel, Astrid S; Hochradel, Klaus; Madsen, Peter T; Stidsholt, Laura.

J Exp Biol ; 227(12)2024 Jun 15.

Artigo em Inglês | MEDLINE | ID: mdl-38841890

RESUMO

Bats use echolocation to navigate and hunt in darkness, and must in that process segregate target echoes from unwanted clutter echoes. Bats may do this by approaching a target at steep angles relative to the plane of the background, utilizing their directional transmission and receiving systems to minimize clutter from background objects, but it remains unknown how bats negotiate clutter that cannot be spatially avoided. Here, we tested the hypothesis that when movement no longer offers spatial release, echolocating bats mitigate clutter by calling at lower source levels and longer call intervals to ease auditory streaming. We trained five greater mouse-eared bats (Myotis myotis) to land on a spherical loudspeaker with two microphones attached. We used a phantom-echo setup, where the loudspeaker/target transmitted phantom clutter echoes by playing back the bats' own calls at time delays of 1, 3 and 5âms with a virtual target strength 7âdB higher than the physical target. We show that the bats successfully landed on the target, irrespective of the clutter echo delays. Rather than decreasing their source levels, the bats used similar source level distributions in clutter and control trials. Similarly, the bats did not increase their call intervals, but instead used the same distribution of call intervals across control and clutter trials. These observations reject our hypothesis, leading us to conclude that bats display great resilience to clutter via short auditory integration times and acute auditory stream segregation rather than via biosonar adjustments.

Assuntos

Quirópteros , Ecolocação , Animais , Quirópteros/fisiologia , Ecolocação/fisiologia , Masculino , Feminino , Vocalização Animal/fisiologia

10.

Toward Intraoperative Visual Intelligence: Real-Time Surgical Instrument Segmentation for Enhanced Surgical Monitoring.

Daneshgar Rahbar, Mostafa; Pappas, George; Jaber, Nabih.

Healthcare (Basel) ; 12(11)2024 May 29.

Artigo em Inglês | MEDLINE | ID: mdl-38891187

RESUMO

BACKGROUND: Open surgery relies heavily on the surgeon's visual acuity and spatial awareness to track instruments within a dynamic and often cluttered surgical field. METHODS: This system utilizes a head-mounted depth camera to monitor surgical scenes, providing both image data and depth information. The video captured from this camera is scaled down, compressed using MPEG, and transmitted to a high-performance workstation via the RTSP (Real-Time Streaming Protocol), a reliable protocol designed for real-time media transmission. To segment surgical instruments, we utilize the enhanced U-Net with GridMask (EUGNet) for its proven effectiveness in surgical tool segmentation. RESULTS: For rigorous validation, the system's performance reliability and accuracy are evaluated using prerecorded RGB-D surgical videos. This work demonstrates the potential of this system to improve situational awareness, surgical efficiency, and generate data-driven insights within the operating room. In a simulated surgical environment, the system achieves a high accuracy of 85.5% in identifying and segmenting surgical instruments. Furthermore, the wireless video transmission proves reliable with a latency of 200 ms, suitable for real-time processing. CONCLUSIONS: These findings represent a promising step towards the development of assistive technologies with the potential to significantly enhance surgical practice.

11.

Subjective Affective Responses to Natural Scenes Require Understanding, Not Spatial Frequency Bands.

Mastria, Serena; Codispoti, Maurizio; Tronelli, Virginia; De Cesarei, Andrea.

Vision (Basel) ; 8(2)2024 Jun 04.

Artigo em Inglês | MEDLINE | ID: mdl-38922181

RESUMO

It is debated whether emotional processing and response depend on semantic identification or are preferentially tied to specific information in natural scenes, such as global features or local details. The present study aimed to further examine the relationship between scene understanding and affective response while manipulating visual content. To this end, we presented affective and neutral natural scenes which were progressively band-filtered to contain global features (low spatial frequencies) or local details (high spatial frequencies) and assessed both affective response and scene understanding. We observed that, if scene content was correctly reported, subjective ratings of arousal and valence were modulated by the affective content of the scene, and this modulation was similar across spatial frequency bands. On the other hand, no affective modulation of subjective ratings was observed if picture content was not correctly reported. The present results indicate that subjective affective response requires content understanding, and it is not tied to a specific spatial frequency range.

12.

Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance.

Urrea, Claudio; Garcia-Garcia, Yainet; Kern, John.

Biomedicines ; 12(6)2024 Jun 13.

Artigo em Inglês | MEDLINE | ID: mdl-38927516

RESUMO

This article addresses the semantic segmentation of laparoscopic surgery images, placing special emphasis on the segmentation of structures with a smaller number of observations. As a result of this study, adjustment parameters are proposed for deep neural network architectures, enabling a robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ employing different backbones are implemented. Three main experiments are conducted, working with Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The applied loss functions include Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. It is qualitatively and quantitatively confirmed that DeepLabv3+ and U-Net5ed architectures yield the best results. The DeepLabv3+ architecture with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and Mean Intersection over Union (MIoU) of 0.977. The semantic segmentation of structures with a smaller number of observations, such as the hepatic vein, cystic duct, Liver Ligament, and blood, verifies that the obtained results are very competitive and promising compared to the consulted literature. The proposed selected parameters were validated in the YOLOv9 architecture, which showed an improvement in semantic segmentation compared to the results obtained with the original architecture.

13.

Nighttime road scene image enhancement based on cycle-consistent generative adversarial network.

Jia, Yanfei; Yu, Wenshuo; Chen, Guangda; Zhao, Liquan.

Sci Rep ; 14(1): 14375, 2024 Jun 22.

Artigo em Inglês | MEDLINE | ID: mdl-38909068

RESUMO

During nighttime road scenes, images are often affected by contrast distortion, loss of detailed information, and a significant amount of noise. These factors can negatively impact the accuracy of segmentation and object detection in nighttime road scenes. A cycle-consistent generative adversarial network has been proposed to address this issue to improve the quality of nighttime road scene images. The network includes two generative networks with identical structures and two adversarial networks with identical structures. The generative network comprises an encoder network and a corresponding decoder network. A context feature extraction module is designed as the foundational element of the encoder-decoder network to capture more contextual semantic information with different receptive fields. A receptive field residual module is also designed to increase the receptive field in the encoder network.The illumination attention module is inserted between the encoder and decoder to transfer critical features extracted by the encoder to the decoder. The network also includes a multiscale discriminative network to discriminate better whether the image is a real high-quality or generated image. Additionally, an improved loss function is proposed to enhance the efficacy of image enhancement. Compared to state-of-the-art methods, the proposed approach achieves the highest performance in enhancing nighttime images, making them clearer and more natural.

14.

Virtual Experience Toolkit: An End-to-End Automated 3D Scene Virtualization Framework Implementing Computer Vision Techniques.

Mora, Pau; Garcia, Clara; Ivorra, Eugenio; Ortega, Mario; Alcañiz, Mariano L.

Sensors (Basel) ; 24(12)2024 Jun 13.

Artigo em Inglês | MEDLINE | ID: mdl-38931621

RESUMO

Virtualization plays a critical role in enriching the user experience in Virtual Reality (VR) by offering heightened realism, increased immersion, safer navigation, and newly achievable levels of interaction and personalization, specifically in indoor environments. Traditionally, the creation of virtual content has fallen under one of two broad categories: manual methods crafted by graphic designers, which are labor-intensive and sometimes lack precision; traditional Computer Vision (CV) and Deep Learning (DL) frameworks that frequently result in semi-automatic and complex solutions, lacking a unified framework for both 3D reconstruction and scene understanding, often missing a fully interactive representation of the objects and neglecting their appearance. To address these diverse challenges and limitations, we introduce the Virtual Experience Toolkit (VET), an automated and user-friendly framework that utilizes DL and advanced CV techniques to efficiently and accurately virtualize real-world indoor scenarios. The key features of VET are the use of ScanNotate, a retrieval and alignment tool that enhances the precision and efficiency of its precursor, supported by upgrades such as a preprocessing step to make it fully automatic and a preselection of a reduced list of CAD to speed up the process, and the implementation in a user-friendly and fully automatic Unity3D application that guides the users through the whole pipeline and concludes in a fully interactive and customizable 3D scene. The efficacy of VET is demonstrated using a diversified dataset of virtualized 3D indoor scenarios, supplementing the ScanNet dataset.

15.

Remote Sensing Image Classification Based on Canny Operator Enhanced Edge Features.

Zhou, Mo; Zhou, Yue; Yang, Dawei; Song, Kai.

Sensors (Basel) ; 24(12)2024 Jun 17.

Artigo em Inglês | MEDLINE | ID: mdl-38931695

RESUMO

Remote sensing image classification plays a crucial role in the field of remote sensing interpretation. With the exponential growth of multi-source remote sensing data, accurately extracting target features and comprehending target attributes from complex images significantly impacts classification accuracy. To address these challenges, we propose a Canny edge-enhanced multi-level attention feature fusion network (CAF) for remote sensing image classification. The original image is specifically inputted into a convolutional network for the extraction of global features, while increasing the depth of the convolutional layer facilitates feature extraction at various levels. Additionally, to emphasize detailed target features, we employ the Canny operator for edge information extraction and utilize a convolution layer to capture deep edge features. Finally, by leveraging the Attentional Feature Fusion (AFF) network, we fuse global and detailed features to obtain more discriminative representations for scene classification tasks. The performance of our proposed method (CAF) is evaluated through experiments conducted across three openly accessible datasets for classifying scenes in remote sensing images: NWPU-RESISC45, UCM, and MSTAR. The experimental findings indicate that our approach based on incorporating edge detail information outperforms methods relying solely on global feature-based classifications.

16.

Exploration of MPSO-Two-Stage Classification Optimization Model for Scene Images with Low Quality and Complex Semantics.

Liu, Kexin; Wang, Rong; Song, Xiaoou; Deng, Xiaobing; Zhu, Qingchao.

Sensors (Basel) ; 24(12)2024 Jun 19.

Artigo em Inglês | MEDLINE | ID: mdl-38931766

RESUMO

Currently, complex scene classification strategies are limited to high-definition image scene sets, and low-quality scene sets are overlooked. Although a few studies have focused on artificially noisy images or specific image sets, none have involved actual low-resolution scene images. Therefore, designing classification models around practicality is of paramount importance. To solve the above problems, this paper proposes a two-stage classification optimization algorithm model based on MPSO, thus achieving high-precision classification of low-quality scene images. Firstly, to verify the rationality of the proposed model, three groups of internationally recognized scene datasets were used to conduct comparative experiments with the proposed model and 21 existing methods. It was found that the proposed model performs better, especially in the 15-scene dataset, with 1.54% higher accuracy than the best existing method ResNet-ELM. Secondly, to prove the necessity of the pre-reconstruction stage of the proposed model, the same classification architecture was used to conduct comparative experiments between the proposed reconstruction method and six existing preprocessing methods on the seven self-built low-quality news scene frames. The results show that the proposed model has a higher improvement rate for outdoor scenes. Finally, to test the application potential of the proposed model in outdoor environments, an adaptive test experiment was conducted on the two self-built scene sets affected by lighting and weather. The results indicate that the proposed model is suitable for weather-affected scene classification, with an average accuracy improvement of 1.42%.

17.

[Review of joint attention deficit intervention based on virtual reality for children with autism].

Zhao, Xin; Wang, Heting; Guo, Xinmeng; Zhang, Ludan; Liu, Wei; Liu, Shuang; Ming, Dong.

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 41(3): 612-619, 2024 Jun 25.

Artigo em Chinês | MEDLINE | ID: mdl-38932549

RESUMO

Joint attention deficit is one of the core disorders in children with autism, which seriously affects the development of multiple basic skills such as language and communication. Virtual reality scene intervention has great potential in improving joint attention skills in children with autism due to its good interactivity and immersion. This article reviewed the application of virtual reality based social and nonsocial scenarios in training joint attention skills for children with autism in recent years, summarized the problems and challenges of this intervention method, and proposed a new joint paradigm for social scenario assessment and nonsocial scenario training. Finally, it looked forward to the future development and application prospects of virtual reality technology in joint attention skill training for children with autism.

Assuntos

Atenção , Transtorno Autístico , Realidade Virtual , Humanos , Transtorno Autístico/terapia , Criança

18.

MResTNet: A Multi-Resolution Transformer Framework with CNN Extensions for Semantic Segmentation.

Detsikas, Nikolaos; Mitianoudis, Nikolaos; Pratikakis, Ioannis.

J Imaging ; 10(6)2024 May 21.

Artigo em Inglês | MEDLINE | ID: mdl-38921602

RESUMO

A fundamental task in computer vision is the process of differentiation and identification of different objects or entities in a visual scene using semantic segmentation methods. The advancement of transformer networks has surpassed traditional convolutional neural network (CNN) architectures in terms of segmentation performance. The continuous pursuit of optimal performance, with respect to the popular evaluation metric results, has led to very large architectures that require a significant amount of computational power to operate, making them prohibitive for real-time applications, including autonomous driving. In this paper, we propose a model that leverages a visual transformer encoder with a parallel twin decoder, consisting of a visual transformer decoder and a CNN decoder with multi-resolution connections working in parallel. The two decoders are merged with the aid of two trainable CNN blocks, the fuser that combined the information from the two decoders and the scaler that scales the contribution of each decoder. The proposed model achieves state-of-the-art performance on the Cityscapes and ADE20K datasets, maintaining a low-complexity network that can be used in real-time applications.

19.

Combined representation of visual features in the scene-selective cortex.

Kang, Jisu; Park, Soojin.

Behav Brain Res ; 471: 115110, 2024 Jun 11.

Artigo em Inglês | MEDLINE | ID: mdl-38871131

RESUMO

Visual features of separable dimensions conjoin to represent an integrated entity. We investigated how visual features bind to form a complex visual scene. Specifically, we focused on features important for visually guided navigation: direction and distance. Previously, separate works have shown that directions and distances of navigable paths are coded in the occipital place area (OPA). Using functional magnetic resonance imaging (fMRI), we tested how separate features are concurrently represented in the OPA. Participants saw eight types of scenes, in which four of them had one path and the other four had two paths. In single-path scenes, path direction was either to the left or to the right. In double-path scenes, both directions were present. A glass wall was placed in some paths to restrict navigational distance. To test how the OPA represents path directions and distances, we took three approaches. First, the independent-features approach examined whether the OPA codes each direction and distance. Second, the integrated-features approach explored how directions and distances are integrated into path units, as compared to pooled features, using double-path scenes. Finally, the integrated-paths approach asked how separate paths are combined into a scene. Using multi-voxel pattern similarity analysis, we found that the OPA's representations of single-path scenes were similar to other single-path scenes of either the same direction or the same distance. Representations of double-path scenes were similar to the combination of two constituent single-paths, as a combined unit of direction and distance rather than as a pooled representation of all features. These results show that the OPA combines the two features to form path units, which are then used to build multiple-path scenes. Altogether, these results suggest that visually guided navigation may be supported by the OPA that automatically and efficiently combines multiple features relevant for navigation and represent a navigation file.

20.

Research on multi-object detection technology for road scenes based on SDG-YOLOv5.

Lv, Zhenyang; Wang, Rugang; Wang, Yuanyuan; Zhou, Feng; Guo, Naihong.

PeerJ Comput Sci ; 10: e2021, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38855227

RESUMO

To resolve the challenges of low detection accuracy and inadequate real-time performance in road scene detection, this article introduces the enhanced algorithm SDG-YOLOv5. The algorithm incorporates the SIoU Loss function to accurately predict the angle loss of bounding boxes, ensuring their directionality during regression and improving both regression accuracy and convergence speed. A novel lightweight decoupled heads (DHs) approach is employed to separate the classification and regression tasks, thereby avoiding conflicts between their focus areas. Moreover, the Global Attention Mechanism Group Convolution (GAMGC), a lightweight strategy, is utilized to enhance the network's capability to process additional contextual information, thereby improving the detection of small targets. Extensive experimental analysis on datasets from Udacity Self Driving Car, BDD100K, and KITTI demonstrates that the proposed algorithm achieves improvements in mAP@.5 of 2.2%, 3.4%, and 1.0% over the original YOLOv5, with a detection speed of 30.3 FPS. These results illustrate that the SDG-YOLOv5 algorithm effectively addresses both detection accuracy and real-time performance in road scene detection.

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA