ABSTRACT
Point-cloud semantic segmentation is a visual task essential for agricultural robots to comprehend natural agroforestry environments. However, because point clouds of agroforestry environments are extremely large, learning effective features for semantic segmentation from them is challenging. To address this issue and achieve accurate semantic segmentation of different types of road-surface point clouds in large-scale agroforestry environments, this study proposes a point-cloud semantic segmentation network framework based on double-distance self-attention. First, a point-cloud local feature enhancement module is proposed. This module extends the receptive field and enhances the generalizability of multidimensional features by incorporating reflection intensity information and a spatial feature-encoding block that is enhanced with contextual semantic information. Second, we introduce a dual-distance attention pooling (DDAPS) block based on the self-attention mechanism. This block first learns the feature representation of the local neighborhood of each point through the self-attention mechanism and then aggregates more discriminative local neighborhood point features. Finally, extensive experimental results on the large-scale point-cloud datasets SemanticKITTI and RELLIS-3D demonstrate that our algorithm outperforms similar algorithms in large-scale agroforestry environments.
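The abstract does not specify how the attention pooling block is built, so the following is only a toy illustration of the general idea of pooling neighbor features with weights derived from two distances. It rests on several assumptions that are not taken from the paper: the two distances are taken to be the Euclidean distance in coordinate space and the Euclidean distance in feature space, the learned self-attention scores are replaced by a fixed softmax over the negated sum of those distances, and the function name `dual_distance_attentive_pooling` and the parameters `k` and `lam` are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dual_distance_attentive_pooling(xyz, feats, k=4, lam=1.0):
    """Toy pooling of neighbor features (assumption, not the paper's block).

    xyz:   (N, 3) point coordinates
    feats: (N, C) per-point features
    k:     number of nearest neighbors (each point counts as its own neighbor)
    lam:   hypothetical weight balancing feature distance against geometric distance
    """
    # Pairwise Euclidean distances in coordinate space, shape (N, N).
    d_geo = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
    # Indices of the k nearest neighbors of every point, shape (N, k).
    idx = np.argsort(d_geo, axis=1)[:, :k]
    nbr_feats = feats[idx]                                   # (N, k, C)
    geo = np.take_along_axis(d_geo, idx, axis=1)             # (N, k)
    # Distance between each point's feature and its neighbors' features.
    feat = np.linalg.norm(nbr_feats - feats[:, None, :], axis=-1)  # (N, k)
    # Fixed (non-learned) scores: closer in both spaces means a larger weight.
    scores = softmax(-(geo + lam * feat), axis=1)            # rows sum to 1
    # Weighted sum over the neighborhood, shape (N, C).
    return (scores[:, :, None] * nbr_feats).sum(axis=1)
```

In a trained network the scores would come from learned layers rather than a fixed formula; the sketch only shows how two distance terms can jointly shape the pooling weights. The dense (N, N) distance matrix is also only practical for small inputs, not for large-scale point clouds.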
ABSTRACT
Road detection is a crucial part of the autonomous driving system, and semantic segmentation is the default method for this kind of task. However, the descriptive categories of agroforestry scenes are not directly definable, which constrains semantic segmentation-based road detection. To overcome this problem, this paper proposes ARDformer, a novel two-stage method for road detection in agroforestry environments. First, a transformer-based hierarchical feature aggregation network is used for semantic segmentation. After the segmentation network generates the scene mask, an edge extraction algorithm extracts the trail's edge and then calculates the periphery of the trail to enclose the area where the trail and grass are located. The proposed method is tested on a public agroforestry dataset, and experimental results show that the intersection over union is approximately 0.82, which significantly outperforms the baseline. Moreover, ARDformer is also effective in a real agroforestry environment.
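For readers unfamiliar with the reported metric, intersection over union (IoU) for a binary road mask can be computed as in the minimal sketch below. The function name `intersection_over_union` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
import numpy as np


def intersection_over_union(pred, target):
    """IoU between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as a perfect match is an assumption.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


# Two of the four pixels marked in either mask are marked in both.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(intersection_over_union(pred, target))  # 0.5
```

An IoU of about 0.82 therefore means that the overlap between the predicted and ground-truth road regions covers roughly 82% of their combined area.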