Búsqueda | Portal Regional de la BVS

1.

Two distinct enriched housings differentially ameliorate object and place recognition deficits in a rat model of schizophrenia.

Toyoshima, Michimasa; Takahashi, Katsumasa; Sato, Eri; Shimoda, Shota; Yamada, Kazuo.

Behav Brain Res ; : 115276, 2024 Oct 02.

Artículo en Inglés | MEDLINE | ID: mdl-39366555

RESUMEN

Schizophrenia is a psychiatric disorder characterized by cognitive dysfunctions. These dysfunctions significantly impact the daily lives of schizophrenic patients, yet effective interventions remain scarce. In this study, we explored the effects of two enriched housing types-cognitive and physical-on cognitive dysfunctions in a rat model of schizophrenia. Male neonatal Wistar-Imamichi rats were administered MK-801, a noncompetitive NMDAR antagonist, twice daily from postnatal day (PND) 7 to PND 20. Physical enrichment ameliorated memory deficits in both object and place recognition tests, while cognitive enrichment primarily improved object recognition performance. Our findings suggest that exercise therapy could be a potential approach to address cognitive dysfunctions in schizophrenia patients.

2.

Visual place recognition from end-to-end semantic scene text features.

Raisi, Zobeir; Zelek, John.

Front Robot AI ; 11: 1424883, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-39350962

RESUMEN

We live in a visual world where text cues are abundant in urban environments. The premise for our work is for robots to capitalize on these text features for visual place recognition. A new technique is introduced that uses an end-to-end scene text detection and recognition technique to improve robot localization and mapping through Visual Place Recognition (VPR). This technique addresses several challenges such as arbitrary shaped text, illumination variation, and occlusion. The proposed model captures text strings and associated bounding boxes specifically designed for VPR tasks. The primary contribution of this work is the utilization of an end-to-end scene text spotting framework that can effectively capture irregular and occluded text in diverse environments. We conduct experimental evaluations on the Self-Collected TextPlace (SCTP) benchmark dataset, and our approach outperforms state-of-the-art methods in terms of precision and recall, which validates the effectiveness and potential of our proposed approach for VPR.

3.

DINO-Mix enhancing visual place recognition with foundational vision model and feature mixing.

Huang, Gaoshuang; Zhou, Yang; Hu, Xiaofei; Zhang, Chenglong; Zhao, Luying; Gan, Wenjian.

Sci Rep ; 14(1): 22100, 2024 Sep 27.

Artículo en Inglés | MEDLINE | ID: mdl-39333370

RESUMEN

Using visual place recognition (VPR) technology to ascertain the geographical location of publicly available images is a pressing issue. Although most current VPR methods achieve favorable results under ideal conditions, their performance in complex environments, characterized by lighting variations, seasonal changes, and occlusions, is generally unsatisfactory. Therefore, obtaining efficient and robust image feature descriptors in complex environments is a pressing issue. In this study, we utilized the DINOv2 model as the backbone for trimming and fine-tuning to extract robust image features and employed a feature mix module to aggregate image features, resulting in globally robust and generalizable descriptors that enable high-precision VPR. We experimentally demonstrated that the proposed DINO-Mix outperforms the current state-of-the-art (SOTA) methods. Using test sets having lighting variations, seasonal changes, and occlusions such as Tokyo24/7, Nordland, and SF-XL-Testv1, our proposed architecture achieved Top-1 accuracy rates of 91.75%, 80.18%, and 82%, respectively, and exhibited an average accuracy improvement of 5.14%. In addition, we compared it with other SOTA methods using representative image retrieval case studies, and our architecture outperformed its competitors in terms of VPR performance. Furthermore, we visualized the attention maps of DINO-Mix and other methods to provide a more intuitive understanding of their respective strengths. These visualizations serve as compelling evidence of the superiority of the DINO-Mix framework in this domain.

4.

An Object-Centric Hierarchical Pose Estimation Method Using Semantic High-Definition Maps for General Autonomous Driving.

Pyo, Jeong-Won; Choi, Jun-Hyeon; Kuc, Tae-Yong.

Sensors (Basel) ; 24(16)2024 Aug 11.

Artículo en Inglés | MEDLINE | ID: mdl-39204886

RESUMEN

To achieve Level 4 and above autonomous driving, a robust and stable autonomous driving system is essential to adapt to various environmental changes. This paper aims to perform vehicle pose estimation, a crucial element in forming autonomous driving systems, more universally and robustly. The prevalent method for vehicle pose estimation in autonomous driving systems relies on Real-Time Kinematic (RTK) sensor data, ensuring accurate location acquisition. However, due to the characteristics of RTK sensors, precise positioning is challenging or impossible in indoor spaces or areas with signal interference, leading to inaccurate pose estimation and hindering autonomous driving in such scenarios. This paper proposes a method to overcome these challenges by leveraging objects registered in a high-precision map. The proposed approach involves creating a semantic high-definition (HD) map with added objects, forming object-centric features, recognizing locations using these features, and accurately estimating the vehicle's pose from the recognized location. This proposed method enhances the precision of vehicle pose estimation in environments where acquiring RTK sensor data is challenging, enabling more robust and stable autonomous driving. The paper demonstrates the proposed method's effectiveness through simulation and real-world experiments, showcasing its capability for more precise pose estimation.

5.

2,5-Dimethoxy-4-iodoamphetamine and altanserin induce region-specific shifts in dopamine and serotonin metabolization pathways in the rat brain.

Nikolaus, Susanne; Fazari, Benedetta; Chao, Owen Y; Almeida, Filipe Rodrigues; Abdel-Hafiz, Laila; Beu, Markus; Henke, Jan; Antke, Christina; Hautzel, Hubertus; Mamlins, Eduards; Müller, Hans-Wilhelm; Huston, Joseph P; von Gall, Charlotte; Giesel, Frederik L.

Pharmacol Biochem Behav ; 242: 173823, 2024 Sep.

Artículo en Inglés | MEDLINE | ID: mdl-39002804

RESUMEN

PURPOSE: For understanding the neurochemical mechanism of neuropsychiatric conditions associated with cognitive deficits it is of major relevance to elucidate the influence of serotonin (5-HT) agonists and antagonists on memory function as well dopamine (DA) and 5-HT release and metabolism. In the present study, we assessed the effects of the 5-HT2A receptor agonist 2,5-dimethoxy-4-iodoamphetamine (DOI) and the 5-HT2A receptor altanserin (ALT) on object and place recognition memory and cerebral neurotransmitters and metabolites in the rat. METHODS: Rats underwent a 5-min exploration trial in an open field with two identical objects. After systemic injection of a single dose of either DOI (0.1 mg/kg), ALT (1 mg/kg) or the respectice vehicle (0.9 % NaCl, 50 % DMSO), rats underwent a 5-min test trial with one of the objects replaced by a novel one and the other object transferred to a novel place. Upon the assessment of object exploration and motor/exploratory behaviors, rats were sacrificed. DA, 5-HT and metabolite levels were analyzed in cingulate (CING), caudateputamen (CP), nucleus accumbens (NAC), thalamus (THAL), dorsal (dHIPP) and ventral hippocampus (vHIPP), brainstem and cerebellum with high performance liquid chromatography. RESULTS: DOI decreased rearing but increased head-shoulder motility relative to vehicle. Memory for object and place after both DOI and ALT was not different from vehicle. Network analyses indicated that DOI inhibited DA metabolization in CING, CP, NAC, and THAL, but facilitated it in dHIPP. Likewise, DOI inhibited 5-HT metabolization in CING, NAC, and THAL. ALT facilitated DA metabolization in the CING, NAC, dHIPP, vHIPP, and CER, but inhibited it in the THAL. Additionally, ALT facilitated 5-HT metabolization in NAC and dHIPP. CONCLUSIONS: DOI and ALT differentially altered the quantitative relations between the neurotransmitter/metabolite levels in the individual brain regions, by inducing region-specific shifts in the metabolization pathways. Findings are relevant for understanding the neurochemistry underlying DAergic and/or 5-HTergic dysfunction in neurological and psychiatric conditions.

Asunto(s)

Anfetaminas , Encéfalo , Dopamina , Serotonina , Animales , Ratas , Serotonina/metabolismo , Masculino , Dopamina/metabolismo , Anfetaminas/farmacología , Encéfalo/metabolismo , Encéfalo/efectos de los fármacos , Ketanserina/farmacología , Ketanserina/análogos & derivados , Agonistas del Receptor de Serotonina 5-HT2/farmacología , Ratas Wistar

6.

BinVPR: Binary Neural Networks towards Real-Valued for Visual Place Recognition.

Wang, Junshuai; Han, Junyu; Dong, Ruifang; Kan, Jiangming.

Sensors (Basel) ; 24(13)2024 Jun 25.

Artículo en Inglés | MEDLINE | ID: mdl-39000909

RESUMEN

Visual Place Recognition (VPR) aims to determine whether a robot or visual navigation system locates in a previously visited place using visual information. It is an essential technology and challenging problem in computer vision and robotic communities. Recently, numerous works have demonstrated that the performance of Convolutional Neural Network (CNN)-based VPR is superior to that of traditional methods. However, with a huge number of parameters, large memory storage is necessary for these CNN models. It is a great challenge for mobile robot platforms equipped with limited resources. Fortunately, Binary Neural Networks (BNNs) can reduce memory consumption by converting weights and activation values from 32-bit into 1-bit. But current BNNs always suffer from gradients vanishing and a marked drop in accuracy. Therefore, this work proposed a BinVPR model to handle this issue. The solution is twofold. Firstly, a feature restoration strategy was explored to add features into the latter convolutional layers to further solve the gradient-vanishing problem during the training process. Moreover, we identified two principles to address gradient vanishing: restoring basic features and restoring basic features from higher to lower layers. Secondly, considering the marked drop in accuracy results from gradient mismatch during backpropagation, this work optimized the combination of binarized activation and binarized weight functions in the Larq framework, and the best combination was obtained. The performance of BinVPR was validated on public datasets. The experimental results show that it outperforms state-of-the-art BNN-based approaches and full-precision networks of AlexNet and ResNet in terms of both recognition accuracy and model size. It is worth mentioning that BinVPR achieves the same accuracy with only 1% and 4.6% model sizes of AlexNet and ResNet.

7.

Place Recognition through Multiple LiDAR Scans Based on the Hidden Markov Model.

Gui, Linqiu; Zeng, Chunnian; Luo, Jie; Yang, Xu.

Sensors (Basel) ; 24(11)2024 Jun 03.

Artículo en Inglés | MEDLINE | ID: mdl-38894402

RESUMEN

Autonomous driving systems for unmanned ground vehicles (UGV) operating in enclosed environments strongly rely on LiDAR localization with a prior map. Precise initial pose estimation is critical during system startup or when tracking is lost, ensuring safe UGV operation. Existing LiDAR-based place recognition methods often suffer from reduced accuracy due to only matching descriptors from individual LiDAR keyframes. This paper proposes a multi-frame descriptor-matching approach based on the hidden Markov model (HMM) to address this issue. This method enhances the place recognition accuracy and robustness by leveraging information from multiple frames. Experimental results from the KITTI dataset demonstrate that the proposed method significantly enhances the place recognition performance compared with the scan context-based single-frame descriptor-matching approach, with an average performance improvement of 5.8% and with a maximum improvement of 15.3%.

8.

Distributed training of CosPlace for large-scale visual place recognition.

Zaccone, Riccardo; Berton, Gabriele; Masone, Carlo.

Front Robot AI ; 11: 1386464, 2024.

Artículo en Inglés | MEDLINE | ID: mdl-38832343

RESUMEN

Visual place recognition (VPR) is a popular computer vision task aimed at recognizing the geographic location of a visual query, usually within a tolerance of a few meters. Modern approaches address VPR from an image retrieval standpoint using a kNN on top of embeddings extracted by a deep neural network from both the query and images in a database. Although most of these approaches rely on contrastive learning, which limits their ability to be trained on large-scale datasets (due to mining), the recently reported CosPlace proposes an alternative training paradigm using a classification task as the proxy. This has been shown to be effective in expanding the potential of VPR models to learn from large-scale and fine-grained datasets. In this work, we experimentally analyze CosPlace from a continual learning perspective and show that its sequential training procedure leads to suboptimal results. As a solution, we propose a different formulation that not only solves the pitfalls of the original training strategy effectively but also enables faster and more efficient distributed training. Finally, we discuss the open challenges in further speeding up large-scale image retrieval for VPR.

9.

5-HT_1A and 5-HT_2A receptor effects on recognition memory, motor/exploratory behaviors, emotionality and regional dopamine transporter binding in the rat.

Nikolaus, Susanne; Chao, Owen Y; Henke, Jan; Beu, Markus; Fazari, Benedetta; Almeida, Filipe Rodrigues; Abdel-Hafiz, Laila; Antke, Christina; Hautzel, Hubertus; Mamlins, Eduards; Müller, Hans-Wilhelm; Huston, Joseph P; von Gall, Charlotte; Giesel, Frederik L.

Behav Brain Res ; 469: 115051, 2024 07 09.

Artículo en Inglés | MEDLINE | ID: mdl-38777263

RESUMEN

Both dopamine (DA) and serotonin (5-HT) play key roles in numerous functions including motor control, stress response and learning. So far, there is scarce or conflicting evidence about the effects of 5-HT1A and 5-HT2A receptor (R) agonists and antagonists on recognition memory in the rat. This also holds for their effect on cerebral DA as well as 5-HT release. In the present study, we assessed the effects of the 5-HT1AR agonist 8-OH-DPAT and antagonist WAY100,635 and the 5-HT2AR agonist DOI and antagonist altanserin (ALT) on rat behaviors. Moreover, we investigated their impact on monoamine efflux by measuring monoamine transporter binding in various regions of the rat brain. After injection of either 8-OH-DPAT (3â¯mg/kg), WAY100,635 (0.4â¯mg/kg), DOI (0.1â¯mg/kg), ALT (1â¯mg/kg) or the respective vehicle (saline, DMSO), rats underwent an object and place recognition memory test in the open field. Upon the assessment of object exploration, motor/exploratory parameters and feces excretion, rats were administered the monoamine transporter radioligand N-o-fluoropropyl-2b-carbomethoxy-3b-(4-[123I]iodophenyl)-nortropane ([123I]-FP-CIT; 8.9 ± 2.6 MBq) into the tail vein. Regional radioactivity accumulations in the rat brain were determined post mortem. Compared vehicle, administration of 8-OH-DPAT impaired memory for place, decreased rearing behavior, and increased ambulation as well as head-shoulder movements. DOI administration led to a reduction in rearing behavior but an increase in head-shoulder motility relative to vehicle. Feces excretion was diminished after ALT relative to vehicle. Dopamine transporter (DAT) binding was increased in the caudateputamen (CP), but decreased in the nucleus accumbens (NAC) after 8-OH-DPAT relative to vehicle. Moreover, DAT binding was decreased in the NAC after ALT relative to vehicle. Findings indicate that 5-HT1AR inhibition and 5-HT2AR activation may impair memory for place. Furthermore, results imply associations not only between recognition memory, motor/exploratory behavior and emotionality but also between the respective parameters and the levels of available DA in CP and NAC.

Asunto(s)

Proteínas de Transporte de Dopamina a través de la Membrana Plasmática , Conducta Exploratoria , Reconocimiento en Psicología , Animales , Proteínas de Transporte de Dopamina a través de la Membrana Plasmática/metabolismo , Masculino , Reconocimiento en Psicología/efectos de los fármacos , Reconocimiento en Psicología/fisiología , Conducta Exploratoria/efectos de los fármacos , Conducta Exploratoria/fisiología , Ratas , Receptor de Serotonina 5-HT1A/metabolismo , Receptor de Serotonina 5-HT1A/efectos de los fármacos , Receptor de Serotonina 5-HT2A/metabolismo , Receptor de Serotonina 5-HT2A/efectos de los fármacos , Actividad Motora/efectos de los fármacos , Actividad Motora/fisiología , Encéfalo/metabolismo , Encéfalo/efectos de los fármacos , Emociones/efectos de los fármacos , Emociones/fisiología , Agonistas del Receptor de Serotonina 5-HT1/farmacología , Agonistas del Receptor de Serotonina 5-HT2/farmacología , Ratas Wistar

10.

Convolutional MLP orthogonal fusion of multiscale features for visual place recognition.

Gan, Wenjian; Zhou, Yang; Hu, Xiaofei; Zhao, Luying; Huang, Gaoshuang; Zhang, Chenglong.

Sci Rep ; 14(1): 11756, 2024 May 23.

Artículo en Inglés | MEDLINE | ID: mdl-38783024

RESUMEN

Visual place recognition (VPR) involves obtaining robust image descriptors to cope with differences in camera viewpoints and drastic external environment changes. Utilizing multiscale features improves the robustness of image descriptors; however, existing methods neither exploit the multiscale features generated during feature extraction nor consider the feature redundancy problem when fusing multiscale information when image descriptors are enhanced. We propose a novel encoding strategy-convolutional multilayer perceptron orthogonal fusion of multiscale features (ConvMLP-OFMS)-for VPR. A ConvMLP is used to obtain robust and generalized global image descriptors and the multiscale features generated during feature extraction are used to enhance the global descriptors to cope with changes in the environment and viewpoints. Additionally, an attention mechanism is used to eliminate noise and redundant information. Compared to traditional methods that use tensor splicing for feature fusion, we introduced matrix orthogonal decomposition to eliminate redundant information. Experiments demonstrated that the proposed architecture outperformed NetVLAD, CosPlace, ConvAP, and other methods. On the Pittsburgh and MSLS datasets, which contained significant viewpoint and illumination variations, our method achieved 92.5% and 86.5% Recall@1, respectively. We also achieved good performances-80.6% and 43.2%-on the SPED and NordLand datasets, respectively, which have more extreme illumination and appearance variations.

11.

Changes in movement patterns in relation to sun conditions and spatial scales in wild western gorillas.

Robira, B; Benhamou, S; Obeki Bayanga, E; Breuer, T; Masi, S.

Anim Cogn ; 27(1): 37, 2024 Apr 29.

Artículo en Inglés | MEDLINE | ID: mdl-38684551

RESUMEN

For most primates living in tropical forests, food resources occur in patchworks of different habitats that vary seasonally in quality and quantity. Efficient navigation (i.e., spatial memory-based orientation) towards profitable food patches should enhance their foraging success. The mechanisms underpinning primate navigating ability remain nonetheless mostly unknown. Using GPS long-term tracking (596 days) of one group of wild western lowland gorillas (Gorilla gorilla gorilla), we investigated their ability to navigate at long distances, and tested for how the sun was used to navigate at any scale by improving landmark visibility and/or by acting as a compass. Long episodic movements ending at a distant swamp, a unique place in the home range where gorillas could find mineral-rich aquatic plants, were straighter and faster than their everyday foraging movements relying on spatial memory. This suggests intentional targeting of the swamp based on long-distance navigation skills, which can thus be efficient over a couple of kilometres. Interestingly, for both long-distance movements towards the swamp and everyday foraging movements, gorillas moved straighter under sunlight conditions even under a dense vegetation cover. By contrast, movement straightness was not markedly different when the sun elevation was low (the sun azimuth then being potentially usable as a compass) or high (so providing no directional information) and the sky was clear or overcast. This suggests that gorillas navigate their home range by relying on visual place recognition but do not use the sun azimuth as a compass. Like humans, who rely heavily on vision to navigate, gorillas should benefit from better lighting to help them identify landmarks as they move through shady forests. This study uncovers a neglected aspect of primate navigation. Spatial memory and vision might have played an important role in the evolutionary success of diurnal primate lineages.

Asunto(s)

Gorilla gorilla , Animales , Gorilla gorilla/fisiología , Masculino , Femenino , Navegación Espacial , Luz Solar , Memoria Espacial , Movimiento , Fenómenos de Retorno al Lugar Habitual

12.

An Appearance-Semantic Descriptor with Coarse-to-Fine Matching for Robust VPR.

Chen, Jie; Li, Wenbo; Hou, Pengshuai; Yang, Zipeng; Zhao, Haoyu.

Sensors (Basel) ; 24(7)2024 Mar 29.

Artículo en Inglés | MEDLINE | ID: mdl-38610414

RESUMEN

In recent years, semantic segmentation has made significant progress in visual place recognition (VPR) by using semantic information that is relatively invariant to appearance and viewpoint, demonstrating great potential. However, in some extreme scenarios, there may be semantic occlusion and semantic sparsity, which can lead to confusion when relying solely on semantic information for localization. Therefore, this paper proposes a novel VPR framework that employs a coarse-to-fine image matching strategy, combining semantic and appearance information to improve algorithm performance. First, we construct SemLook global descriptors using semantic contours, which can preliminarily screen images to enhance the accuracy and real-time performance of the algorithm. Based on this, we introduce SemLook local descriptors for fine screening, combining robust appearance information extracted by deep learning with semantic information. These local descriptors can address issues such as semantic overlap and sparsity in urban environments, further improving the accuracy of the algorithm. Through this refined screening process, we can effectively handle the challenges of complex image matching in urban environments and obtain more accurate results. The performance of SemLook descriptors is evaluated on three public datasets (Extended-CMU Season, Robot-Car Seasons v2, and SYNTHIA) and compared with six state-of-the-art VPR algorithms (HOG, CoHOG, AlexNet_VPR, Region VLAD, Patch-NetVLAD, Forest). In the experimental comparison, considering both real-time performance and evaluation metrics, the SemLook descriptors are found to outperform the other six algorithms. Evaluation metrics include the area under the curve (AUC) based on the precision-recall curve, Recall@100%Precision, and Precision@100%Recall. On the Extended-CMU Season dataset, SemLook descriptors achieve a 100% AUC value, and on the SYNTHIA dataset, they achieve a 99% AUC value, demonstrating outstanding performance. The experimental results indicate that introducing global descriptors for initial screening and utilizing local descriptors combining both semantic and appearance information for precise matching can effectively address the issue of location recognition in scenarios with semantic ambiguity or sparsity. This algorithm enhances descriptor performance, making it more accurate and robust in scenes with variations in appearance and viewpoint.

13.

Contextual Patch-NetVLAD: Context-Aware Patch Feature Descriptor and Patch Matching Mechanism for Visual Place Recognition.

Sun, Wenyuan; Chen, Wentang; Huang, Runxiang; Tian, Jing.

Sensors (Basel) ; 24(3)2024 Jan 28.

Artículo en Inglés | MEDLINE | ID: mdl-38339570

RESUMEN

The goal of visual place recognition (VPR) is to determine the location of a query image by identifying its place in a collection of image databases. Visual sensor technologies are crucial for visual place recognition as they allow for precise identification and location of query images within a database. Global descriptor-based VPR methods face the challenge of accurately capturing the local specific regions within a scene; consequently, it leads to an increasing probability of confusion during localization in such scenarios. To tackle feature extraction and feature matching challenges in VPR, we propose a modified patch-NetVLAD strategy that includes two new modules: a context-aware patch descriptor and a context-aware patch matching mechanism. Firstly, we propose a context-driven patch feature descriptor to overcome the limitations of global and local descriptors in visual place recognition. This descriptor aggregates features from each patch's surrounding neighborhood. Secondly, we introduce a context-driven feature matching mechanism that utilizes cluster and saliency context-driven weighting rules to assign higher weights to patches that are less similar to densely populated or locally similar regions for improved localization performance. We further incorporate both of these modules into the patch-NetVLAD framework, resulting in a new approach called contextual patch-NetVLAD. Experimental results are provided to show that our proposed approach outperforms other state-of-the-art methods to achieve a Recall@10 score of 99.82 on Pittsburgh30k, 99.82 on FMDataset, and 97.68 on our benchmark dataset.

14.

SVS-VPR: A Semantic Visual and Spatial Information-Based Hierarchical Visual Place Recognition for Autonomous Navigation in Challenging Environmental Conditions.

Arshad, Saba; Park, Tae-Hyoung.

Sensors (Basel) ; 24(3)2024 Jan 30.

Artículo en Inglés | MEDLINE | ID: mdl-38339624

RESUMEN

Robust visual place recognition (VPR) enables mobile robots to identify previously visited locations. For this purpose, the extracted visual information and place matching method plays a significant role. In this paper, we critically review the existing VPR methods and group them into three major categories based on visual information used, i.e., handcrafted features, deep features, and semantics. Focusing the benefits of convolutional neural networks (CNNs) and semantics, and limitations of existing research, we propose a robust appearance-based place recognition method, termed SVS-VPR, which is implemented as a hierarchical model consisting of two major components: global scene-based and local feature-based matching. The global scene semantics are extracted and compared with pre-visited images to filter the match candidates while reducing the search space and computational cost. The local feature-based matching involves the extraction of robust local features from CNN possessing invariant properties against environmental conditions and a place matching method utilizing semantic, visual, and spatial information. SVS-VPR is evaluated on publicly available benchmark datasets using true positive detection rate, recall at 100% precision, and area under the curve. Experimental findings demonstrate that SVS-VPR surpasses several state-of-the-art deep learning-based methods, boosting robustness against significant changes in viewpoint and appearance while maintaining efficient matching time performance.

15.

IS-CAT: Intensity-Spatial Cross-Attention Transformer for LiDAR-Based Place Recognition.

Joo, Hyeong-Jun; Kim, Jaeho.

Sensors (Basel) ; 24(2)2024 Jan 17.

Artículo en Inglés | MEDLINE | ID: mdl-38257678

RESUMEN

LiDAR place recognition is a crucial component of autonomous navigation, essential for loop closure in simultaneous localization and mapping (SLAM) systems. Notably, while camera-based methods struggle in fluctuating environments, such as weather or light, LiDAR demonstrates robustness against such challenges. This study introduces the intensity and spatial cross-attention transformer, which is a novel approach that utilizes LiDAR to generate global descriptors by fusing spatial and intensity data for enhanced place recognition. The proposed model leveraged a cross attention to a concatenation mechanism to process and integrate multi-layered LiDAR projections. Consequently, the previously unexplored synergy between spatial and intensity data was addressed. We demonstrated the performance of IS-CAT through extensive validation on the NCLT dataset. Additionally, we performed indoor evaluations on our Sejong indoor-5F dataset and demonstrated successful application to a 3D LiDAR SLAM system. Our findings highlight descriptors that demonstrate superior performance in various environments. This performance enhancement is evident in both indoor and outdoor settings, underscoring the practical effectiveness and advancements of our approach.

16.

Coarse Alignment Methodology of Point Cloud Based on Camera Position/Orientation Estimation Model.

Yoo, Suhong; Kim, Namhoon.

J Imaging ; 9(12)2023 Dec 14.

Artículo en Inglés | MEDLINE | ID: mdl-38132697

RESUMEN

This study presents a methodology for the coarse alignment of light detection and ranging (LiDAR) point clouds, which involves estimating the position and orientation of each station using the pinhole camera model and a position/orientation estimation algorithm. Ground control points are obtained using LiDAR camera images and the point clouds are obtained from the reference station. The estimated position and orientation vectors are used for point cloud registration. To evaluate the accuracy of the results, the positions of the LiDAR and the target were measured using a total station, and a comparison was carried out with the results of semi-automatic registration. The proposed methodology yielded an estimated mean LiDAR position error of 0.072 m, which was similar to the semi-automatic registration value of 0.070 m. When the point clouds of each station were registered using the estimated values, the mean registration accuracy was 0.124 m, while the semi-automatic registration accuracy was 0.072 m. The high accuracy of semi-automatic registration is due to its capability for performing both coarse alignment and refined registration. The comparison between the point cloud with refined alignment using the proposed methodology and the point-to-point distance analysis revealed that the average distance was measured at 0.0117 m. Moreover, 99% of the points exhibited distances within the range of 0.0696 m.

17.

Landmark Topology Descriptor-Based Place Recognition and Localization under Large View-Point Changes.

Gao, Guanhong; Xiong, Zhi; Zhao, Yao; Zhang, Ling.

Sensors (Basel) ; 23(24)2023 Dec 12.

Artículo en Inglés | MEDLINE | ID: mdl-38139621

RESUMEN

Accurate localization between cameras is a prerequisite for a vision-based heterogeneous robot systems task. The core issue is how to accurately perform place recognition from different view-points. Traditional appearance-based methods have a high probability of failure in place recognition and localization under large view-point changes. In recent years, semantic graph matching-based place recognition methods have been proposed to solve the above problem. However, these methods rely on high-precision semantic segmentation results and have a high time complexity in node extraction or graph matching. In addition, methods only utilize the semantic labels of the landmarks themselves to construct graphs and descriptors, making such approaches fail in some challenging scenarios (e.g., scene repetition). In this paper, we propose a graph-matching method based on a novel landmark topology descriptor, which is robust to view-point changes. According to the experiment on real-world data, our algorithm can run in real-time and is approximately four times and three times faster than state-of-the-art algorithms in the graph extraction and matching phases, respectively. In terms of place recognition performance, our algorithm achieves the best place recognition precision at a recall of 0-70% compared with classic appearance-based algorithms and an advanced graph-based algorithm in the scene of significant view-point changes. In terms of positioning accuracy, compared to the traditional appearance-based DBoW2 and NetVLAD algorithms, our method outperforms by 95%, on average, in terms of the mean translation error and 95% in terms of the mean RMSE. Compared to the state-of-the-art SHM algorithm, our method outperforms by 30%, on average, in terms of the mean translation error and 29% in terms of the mean RMSE. In addition, our method outperforms the current state-of-the-art algorithm, even in challenging scenarios where the benchmark algorithms fail.

18.

LWR-Net: Robust and Lightweight Place Recognition Network for Noisy and Low-Density Point Clouds.

Zhang, Zhenghua; Chen, Guoliang; Shu, Mingcong; Wang, Xuan.

Sensors (Basel) ; 23(21)2023 Oct 24.

Artículo en Inglés | MEDLINE | ID: mdl-37960364

RESUMEN

Point cloud-based retrieval for place recognition is essential in robotic applications like autonomous driving or simultaneous localization and mapping. However, this remains challenging in complex real-world scenes. Existing methods are sensitive to noisy, low-density point clouds and require extensive storage and computation, posing limitations for hardware-limited scenarios. To overcome these challenges, we propose LWR-Net, a lightweight place recognition network for efficient and robust point cloud retrieval in noisy, low-density conditions. Our approach incorporates a fast dilated sampling and grouping module with a residual MLP structure to learn geometric features from local neighborhoods. We also introduce a lightweight attentional weighting module to enhance global feature representation. By utilizing the Generalized Mean pooling structure, we aggregated the global descriptor for point cloud retrieval. We validated LWR-Net's efficiency and robustness on the Oxford robotcar dataset and three in-house datasets. The results demonstrate that our method efficiently and accurately retrieves matching scenes while being more robust to variations in point density and noise intensity. LWR-Net achieves state-of-the-art accuracy and robustness with a lightweight model size of 0.4M parameters. These efficiency, robustness, and lightweight advantages make our network highly suitable for robotic applications relying on point cloud-based place recognition.

19.

Efficient Underground Tunnel Place Recognition Algorithm Based on Farthest Point Subsampling and Dual-Attention Transformer.

Chai, Xinghua; Yang, Jianyong; Yan, Xiangming; Di, Chengliang; Ye, Tao.

Sensors (Basel) ; 23(22)2023 Nov 18.

Artículo en Inglés | MEDLINE | ID: mdl-38005647

RESUMEN

An autonomous place recognition system is essential for scenarios where GPS is useless, such as underground tunnels. However, it is difficult to use existing algorithms to fully utilize the small number of effective features in underground tunnel data, and recognition accuracy is difficult to guarantee. In order to solve this challenge, an efficient point cloud position recognition algorithm, named Dual-Attention Transformer Network (DAT-Net), is proposed in this paper. The algorithm firstly adopts the farthest point downsampling module to eliminate the invalid redundant points in the point cloud data and retain the basic shape of the point cloud, which reduces the size of the point cloud and, at the same time, reduces the influence of the invalid point cloud on the data analysis. After that, this paper proposes the dual-attention Transformer module to facilitate local information exchange by utilizing the multi-head self-attention mechanism. It extracts local descriptors and integrates highly discriminative global descriptors based on global context with the help of a feature fusion layer to obtain a more accurate and robust global feature representation. Experimental results show that the method proposed in this paper achieves an average F1 score of 0.841 on the SubT-Tunnel dataset and outperforms many existing state-of-the-art algorithms in recognition accuracy and robustness tests.

20.

Cross-Domain Indoor Visual Place Recognition for Mobile Robot via Generalization Using Style Augmentation.

Wozniak, Piotr; Ozog, Dominik.

Sensors (Basel) ; 23(13)2023 Jul 04.

Artículo en Inglés | MEDLINE | ID: mdl-37447982

RESUMEN

The article presents an algorithm for the multi-domain visual recognition of an indoor place. It is based on a convolutional neural network and style randomization. The authors proposed a scene classification mechanism and improved the performance of the models based on synthetic and real data from various domains. In the proposed dataset, a domain change was defined as a camera model change. A dataset of images collected from several rooms was used to show different scenarios, human actions, equipment changes, and lighting conditions. The proposed method was tested in a scene classification problem where multi-domain data were used. The basis was a transfer learning approach with an extension style applied to various combinations of source and target data. The focus was on improving the unknown domain score and multi-domain support. The results of the experiments were analyzed in the context of data collected on a humanoid robot. The article shows that the average score was the highest for the use of multi-domain data and data style enhancement. The method of obtaining average results for the proposed method reached the level of 92.08%. The result obtained by another research team was corrected.

Asunto(s)

Robótica , Algoritmos , Aprendizaje , Iluminación , Redes Neurales de la Computación

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA