

Semantic information-assisted LiDAR-IMU fusion localization method

  • Abstract: Precise localization of agricultural robots is key to autonomous operation in smart agriculture. Conventional SLAM (simultaneous localization and mapping) methods rely mainly on geometric features to build constraints; when the geometric information available in a scene is insufficient, feature matching becomes unstable and localization accuracy degrades. To address this, this study proposes a multi-sensor fusion localization method that incorporates semantic information. The system takes LiDAR point clouds and IMU (inertial measurement unit) data as input; after preprocessing, it extracts geometric and semantic features and assigns differentiated semantic weights to point cloud matching according to the importance of each semantic category. Historical registration residuals are then statistically analyzed to adjust the semantic weights, improving the reliability of feature matching and the accuracy of pose estimation. In the back-end optimization, the pose estimates output by the front-end odometry and the IMU pre-integration constraints are jointly used as constraint factors in factor graph optimization, achieving joint optimization and higher localization accuracy. The proposed method was validated on both public and self-collected datasets. Results on the public dataset show that it outperforms other classical methods on most metrics: the absolute trajectory errors on Sequences 05, 07, and 10 are 2.88, 0.82, and 1.39 m, respectively. On the self-collected dataset, the absolute trajectory error is reduced by 17.65% and 59.02% compared with the other two methods. These results provide a reference for the precise localization of agricultural robots.

     

    Abstract: Agricultural environments are generally characterized by weak geometric saliency, repetitive vegetation structures, and frequent dynamic disturbances. Conventional simultaneous localization and mapping (SLAM) relies primarily on geometric feature matching, which struggles to provide stable data association and reliable residual estimation under such conditions, leading to accumulated localization drift. In this study, a tightly coupled LiDAR-IMU SLAM framework incorporating semantic information was introduced to improve the localization accuracy and robustness of agricultural robots, so that feature optimization was guided by category-dependent reliability rather than by geometry alone. LiDAR point clouds and inertial measurement unit (IMU) data were jointly utilized in the framework. Raw LiDAR scans were first processed by motion distortion correction and outlier removal to mitigate the effects of sensor motion and noise. Subsequently, a semantic segmentation network (RangeNet++) was employed to assign a semantic label to each point, generating annotated point clouds. A semantic-aware downsampling strategy was adopted to balance computational efficiency and semantic preservation, in which different semantic categories were sampled at different densities according to their spatial distribution and contribution to pose estimation. Curvature-based edge and planar features were then extracted and associated between consecutive frames during LiDAR odometry estimation. Semantic information was explicitly embedded into the optimization by assigning category-specific weights to the residual terms. Moreover, a dynamic semantic weighting mechanism based on historical residual statistics was introduced.
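The category-weighted residual idea described above can be sketched as follows. This is a minimal illustration only: the category names, weight values, and default weight are assumptions for the example, not the values used in the paper.

```python
# Hypothetical per-category weights; all names and values are illustrative.
# Static, geometrically stable categories get high weights, while
# repetitive or dynamic categories are down-weighted.
SEMANTIC_WEIGHTS = {
    "building": 1.0,
    "trunk": 0.9,
    "vegetation": 0.5,
    "person": 0.1,
}

def weighted_matching_cost(residuals, labels, weights=SEMANTIC_WEIGHTS):
    """Sum of squared point-to-feature residuals, each scaled by the
    semantic weight of the point's category (default 0.3 if unknown)."""
    cost = 0.0
    for r, lab in zip(residuals, labels):
        w = weights.get(lab, 0.3)
        cost += w * float(r) ** 2
    return cost
```

Under this scheme, a residual on a dynamic "person" point contributes far less to the matching cost than an identical residual on a "building" point, which is the intended suppression of unreliable categories.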
Residuals accumulated within a sliding window were statistically analyzed to adaptively update the semantic weights, so that categories with larger residuals were down-weighted while categories with stable matching behavior were reinforced. On the back end, LiDAR odometry and IMU pre-integration constraints were jointly formulated within a factor-graph framework. Keyframes were selected according to translational and rotational motion thresholds, and all constraints were optimized by nonlinear least squares. Extensive experiments were carried out on both public benchmark datasets and self-collected agricultural datasets to verify the effectiveness of the framework. The proposed method, referred to as LID-SLAM, was evaluated on KITTI Sequences 05, 07, and 10 and compared with LIS-SLAM, LIO-SAM, and LeGO-LOAM, with loop closure disabled for a fair comparison. On Sequence 05, which contained frequent turns, occlusions, and dynamic traffic participants, the influence of unreliable semantic categories was effectively suppressed: a mean absolute trajectory error of 2.484 m and a root mean square error (RMSE) of 2.884 m were achieved, outperforming LIO-SAM (mean 2.886 m, RMSE 3.410 m) and significantly surpassing LeGO-LOAM (mean 9.369 m, RMSE 10.091 m). On Sequence 07, which represented a relatively static and open environment, stable performance was maintained, with a mean error of 0.783 m and an RMSE of 0.821 m, comparable to LIO-SAM and markedly better than LIS-SLAM and LeGO-LOAM. On the more challenging Sequence 10, which involved complex dynamics and higher motion speeds, the peak and long-tail errors were substantially reduced: the maximum error was lowered to 3.766 m and the RMSE to 1.391 m, compared with 2.224 m for LIO-SAM and 7.923 m for LeGO-LOAM.
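The dynamic semantic weighting mechanism described above can be sketched as follows. The window size, step size, weight bounds, and the mean-comparison update rule are all illustrative assumptions; the abstract does not specify the exact statistics or update law used in the paper.

```python
from collections import defaultdict, deque

class DynamicSemanticWeights:
    """Sketch of residual-feedback weight adaptation: per-category residuals
    from recent scans are kept in a sliding window; categories whose mean
    residual exceeds the global mean are down-weighted, while categories
    with smaller (stable) residuals are reinforced."""

    def __init__(self, init_weights, window=10, step=0.05,
                 w_min=0.1, w_max=1.0):
        self.weights = dict(init_weights)
        self.history = defaultdict(lambda: deque(maxlen=window))
        self.step, self.w_min, self.w_max = step, w_min, w_max

    def record(self, category, residual):
        # Store the magnitude of a registration residual for this category.
        self.history[category].append(abs(residual))

    def update(self):
        # Mean residual per category over the sliding window.
        means = {c: sum(h) / len(h) for c, h in self.history.items() if h}
        if not means:
            return self.weights
        global_mean = sum(means.values()) / len(means)
        for c, m in means.items():
            delta = -self.step if m > global_mean else self.step
            w = self.weights.get(c, 0.5) + delta
            # Clamp so no category is ever fully trusted or fully discarded.
            self.weights[c] = min(self.w_max, max(self.w_min, w))
        return self.weights
```

A category that repeatedly produces large residuals (e.g. a dynamic object class) drifts toward the lower bound over successive updates, while a consistently well-matched category drifts toward the upper bound.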
Ablation experiments were further conducted to assess the contribution of the dynamic semantic weighting mechanism. When the dynamic weighting module was removed, larger local deviations and increased error fluctuations were observed, although the overall trajectory shape was preserved. In contrast, the dynamic semantic weights reduced the maximum error by approximately 20% and the RMSE by up to 14.9% on the representative sequences. On the agricultural dataset, which featured dense vegetation, mixed terrain, and narrow roads, the produced trajectories consistently aligned closely with the ground truth, with no obvious drift or trajectory break even after long straight motions followed by sharp turns. Quantitatively, the RMSE was reduced to 0.084 m, compared with 0.102 m for LIO-SAM and 0.205 m for A-LOAM, relative reductions of 17.65% and 59.02%, respectively. In conclusion, a semantic-assisted, tightly coupled LiDAR-IMU SLAM framework was presented that combines semantic-aware downsampling, category-weighted residual modeling, and residual-feedback-driven dynamic semantic weight adaptation within a factor-graph optimization back end. The improved reliability of data association effectively suppressed error accumulation in weak-geometry and dynamic environments. Experimental results on benchmark and real-world agricultural datasets demonstrated significant improvements in localization accuracy and robustness. These findings can provide a practical and effective solution for the high-precision autonomous navigation of agricultural robots.

     

