Abstract:
Irregularly shaped small fields are widely distributed across hilly agricultural regions, where curved and undulating headland boundaries pose persistent challenges for autonomous navigation. Achieving unmanned edging operations in such environments requires robust perception of complex field structures and the ability to generate smooth, safe obstacle avoidance paths while maintaining unilateral guidance along the inner headland boundary. Traditional machine vision techniques are often inadequate for this task because of low robustness under varying field conditions, limited spatial understanding, and difficulty distinguishing subtle texture differences between headland and farmland. Likewise, conventional local path planning methods are not well suited to unilateral guidance scenarios, as they typically fail to ensure both a smooth obstacle-bypassing process and a stable, consistent rejoining of the reference boundary. To address these challenges, this paper proposes a unilateral-guidance obstacle avoidance path planning method based on RGB-D multimodal data, consisting of two key components: high-fidelity scene perception map construction and intelligent real-time path planning. At the perception level, a depth camera simultaneously captured color and depth information, and an efficient RGB-D semantic segmentation model was adopted to exploit the complementary strengths of RGB texture cues and depth geometry. The network followed a dual-stream encoder–decoder design with multilevel cross-modality fusion, attention-based feature reweighting, and multi-scale supervision, enabling accurate pixel-level classification of headland and obstacle regions under diverse lighting, crop residue, and soil conditions. Depth information enhanced separability where RGB contrast was weak, such as at dry-field headland edges, thereby ensuring stable segmentation performance across different terrains.
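The attention-based feature reweighting above can be sketched in a simplified per-channel scalar form; the sigmoid gating shown here is an illustrative assumption, not the paper's exact fusion architecture:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fuse_features(rgb_feat, depth_feat):
    """Reweight and sum per-channel RGB and depth features.

    Each channel is gated by a sigmoid of its own response, so the
    modality carrying the stronger cue dominates the fused feature,
    e.g. depth compensating where RGB contrast is weak.
    """
    rgb_w = [sigmoid(r) for r in rgb_feat]
    depth_w = [sigmoid(d) for d in depth_feat]
    return [rw * r + dw * d
            for rw, r, dw, d in zip(rgb_w, rgb_feat, depth_w, depth_feat)]
```

In this toy form, a channel where the RGB response is weak but the depth response is strong yields a fused value dominated by depth, mirroring the complementarity described above.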
To transform segmentation results into spatially meaningful representations, a three-stage point cloud processing pipeline was developed. First, each RGB pixel was back-projected into 3D space using depth measurements and camera calibration parameters to form a dense point cloud. Second, semantic labels from the RGB-D segmentation were transferred to the corresponding 3D points through pixel–point indexing, yielding a semantically annotated point cloud. Third, a ground-plane estimation and projection procedure generated a 2D obstacle-avoidance coordinate map by projecting all relevant points onto a common field plane. The resulting coordinate map provided a compact yet informative representation that was well suited to efficient, real-time path planning. On this basis, an enhanced artificial potential field algorithm was designed to meet the unique demands of unilateral-guidance obstacle avoidance. Several key improvements were introduced: geometric generalization of repulsive fields to handle arbitrarily shaped obstacles, region-aware gain modulation to regulate repulsive forces near the headland boundary, a smoothing factor to eliminate curvature discontinuities, headland-specific repulsion modelling to maintain a stable lateral offset from the boundary, and sub-target generation to escape local minima caused by force equilibrium. The resulting planner produced obstacle avoidance trajectories with gentle obstacle approach, early exit, and overall path smoothness, while maintaining the desired distance from the headland. To facilitate direct use by vision-based path tracking controllers, the planned trajectory was re-projected onto the forward-looking RGB image through inverse coordinate transformation, so that the downstream control module received visually aligned path references. Extensive experiments were conducted to validate the effectiveness of the proposed framework.
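The first and third pipeline stages can be sketched as follows. The pinhole intrinsics (fx, fy, cx, cy) and the axis-aligned ground-plane assumption are illustrative simplifications of the paper's calibrated setup, in which the plane is estimated from the data:

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project one pixel (u, v) with metric depth into camera
    coordinates using the standard pinhole model."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def project_to_ground(points, labels):
    """Build a 2D semantic coordinate map by dropping the height axis,
    assuming the ground plane is roughly aligned with the camera's
    x-z plane (a simplification of full ground-plane estimation)."""
    return [(x, z, lab) for (x, y, z), lab in zip(points, labels)]
```

A pixel at the principal point maps to a point directly along the optical axis; off-center pixels spread laterally in proportion to their depth, and attaching the segmentation label to each projected point yields the semantically annotated 2D map used by the planner.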
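A minimal sketch of one planning step illustrates the interplay of the attractive, obstacle-repulsive, and headland-offset terms. The gains, influence radius, and linear boundary model here are assumptions for illustration; the paper's geometric generalization to arbitrary obstacle shapes, smoothing factor, and sub-target mechanism are omitted:

```python
import math

def apf_step(pos, goal, obstacles, boundary_y,
             k_att=1.0, k_rep=1.0, d0=1.5, k_bnd=0.8, d_off=1.0, step=0.05):
    """One gradient step of a simplified artificial potential field planner."""
    x, y = pos
    gx, gy = goal
    # Attractive force: linear pull toward the goal.
    fx, fy = k_att * (gx - x), k_att * (gy - y)
    # Repulsive forces from point obstacles inside the influence radius d0.
    for ox, oy in obstacles:
        dx, dy = x - ox, y - oy
        d = math.hypot(dx, dy)
        if 0 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    # Headland term: spring toward the desired lateral offset d_off
    # from the boundary line y = boundary_y.
    fy += k_bnd * ((boundary_y + d_off) - y)
    # Advance one small step along the net force direction.
    norm = math.hypot(fx, fy) or 1.0
    return (x + step * fx / norm, y + step * fy / norm)
```

With no obstacle in range, the step advances straight along the boundary at the desired offset; a nearby obstacle bends the step away from it, and the spring term pulls the path back to the offset line once the obstacle is passed.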
The RGB-D segmentation network achieved a mean Intersection over Union of up to 95.97% on the paddy-field and dry-field datasets, consistently producing accurate and stable segmentation results. The improved potential field planner generated obstacle avoidance trajectories with path deviations below 9 pixels and standard deviations under 1.2 pixels across various obstacle shapes and placements. Furthermore, real-world field experiments demonstrated that the system achieved an average lateral deviation below 0.076 m during autonomous operation. The overall processing efficiency reached 23.19 frames per second, satisfying the real-time requirements of deployment on agricultural machinery. Overall, the proposed framework provides an integrated solution for autonomous obstacle avoidance under unilateral headland guidance. By combining multimodal perception, 3D spatial reasoning, and behavior-aware planning, it enables reliable navigation in irregularly shaped small fields and offers a practical approach toward fully unmanned agricultural operations.