Pothole Detection in Farmland Roads with Multi-Scale Edge Enhancement and Feature Fusion

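  • Summary: Pothole detection on unstructured farmland roads is one of the key prerequisites for the autonomous navigation of agricultural machinery. To address the difficulties posed by complex farmland road environments and by pothole targets with blurred boundaries, irregular shapes, and large scale variations, this study proposes POT_YOLOv11n, a pothole detection model based on an improved YOLOv11n. A multi-scale edge information enhancement (MSE) module combines multi-branch convolution with channel attention to strengthen boundary discrimination. A pyramid pooling and large-kernel attention fusion module, SPPF_LSKA (spatial pyramid pooling fast_large separable kernel attention), merges large separable kernel attention with spatial pyramid pooling to enhance multi-scale feature extraction. Feature screening and a contextual anchor attention mechanism are introduced in the neck to optimize feature fusion and context modeling. Experimental results show that the improved model achieves a mean average precision of 83.20% on a self-built farmland road pothole dataset, 2.47 percentage points higher than the original model, with an F1 score of 80.30%, only 2.22 M parameters, and a detection speed of 248.83 frames/s. The results can provide reliable all-weather support for the autonomous navigation of agricultural machinery in complex field environments.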

     

    Abstract: Detecting potholes on unstructured farm roads is a critical prerequisite for the autonomous navigation of agricultural machinery. The task remains challenging because of complex field environments, blurred target boundaries, irregular pothole shapes, and large scale variations. To address these issues, this study proposed POT-YOLOv11n, a pothole detection model built on the baseline YOLOv11n architecture with three improvements. First, a Multi-scale Edge Information Enhancement module combined multi-branch convolutions with channel attention to strengthen edge feature extraction and improve boundary discrimination for potholes with vague contours. Second, a Pyramid Pooling and Large Kernel Attention Fusion module combined large separable kernel attention with spatial pyramid pooling to capture multi-scale contextual features. Third, a Feature Screening and Contextual Anchor Attention mechanism in the neck network optimized feature fusion and contextual modeling by focusing on salient features while suppressing irrelevant information from complex backgrounds. For static evaluation, a farm road pothole dataset was constructed with samples of varying scales, irregular shapes, and different lighting conditions. On this dataset, POT-YOLOv11n achieved a mean Average Precision of 83.20%, an improvement of 2.47 percentage points over the baseline YOLOv11n, and an F1-score of 80.30%, indicating balanced precision and recall. The model remained compact, with only 2.22 million parameters, and reached an inference speed of 248.83 frames per second, which satisfied the real-time requirements of autonomous navigation.
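The static figures above imply a few derived quantities that are easy to check by hand. The sketch below is illustrative only: the function names are hypothetical, and the weight-footprint estimate assumes 32-bit floating-point storage of the parameters alone, which the abstract does not state.

```python
# Figures quoted in the abstract.
MAP_IMPROVED = 83.20   # mean Average Precision of POT-YOLOv11n, percent
MAP_GAIN = 2.47        # gain over the baseline YOLOv11n, percentage points
FPS = 248.83           # reported inference speed, frames per second
N_PARAMS = 2.22e6      # reported parameter count


def baseline_map(improved, gain):
    """Baseline mAP implied by the improved mAP and the reported gain."""
    return improved - gain


def latency_ms(fps):
    """Average per-frame inference time in milliseconds."""
    return 1000.0 / fps


def weight_megabytes(n_params, bytes_per_param=4):
    """Weight footprint in MB (10**6 bytes).

    Assumption: weights only, 4 bytes each (FP32); not stated in the abstract.
    """
    return n_params * bytes_per_param / 1e6


print(f"implied baseline mAP: {baseline_map(MAP_IMPROVED, MAP_GAIN):.2f}%")  # 80.73%
print(f"per-frame latency:    {latency_ms(FPS):.2f} ms")                     # 4.02 ms
print(f"FP32 weight size:     {weight_megabytes(N_PARAMS):.2f} MB")          # 8.88 MB
```
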
Beyond static evaluation, dynamic field tests were conducted on a representative farm road segment containing typical potholes to assess practical applicability under realistic operating conditions. The agricultural vehicle was driven at three conventional working speeds: low (5 km/h), medium (10 km/h), and high (15 km/h). Images were acquired continuously and detection ran online during each run. As vehicle speed increased, motion blur intensified and detection accuracy declined. At 5 km/h, the model achieved a mean Average Precision (mAP) of 80.8%, slightly lower than the static result, with precision and recall of 82.1% and 79.6%, respectively; the F1-score remained consistent with static testing, confirming that the detection capability transferred effectively to real-world deployment. At 10 km/h, the mAP decreased to 79.8%, with precision and recall of 80.9% and 78.8%, respectively. At 15 km/h, the mAP declined further to 75.5% and recall fell to 74.2%, indicating more missed detections under severe motion blur. Nevertheless, retaining an mAP of 75.5% at this speed suggested that the proposed multi-scale edge enhancement module helped compensate for motion blur, preserving detection performance beyond what conventional architectures could achieve.
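The statement that the F1-score at 5 km/h is consistent with the static value (80.30%) can be checked from the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch using only the numbers quoted above; the dictionary layout and function names are illustrative, and precision at 15 km/h is omitted because the abstract does not report it.

```python
STATIC_MAP = 83.20  # static-test mAP, percent
STATIC_F1 = 80.30   # static-test F1-score, percent

# Field-test figures by vehicle speed in km/h (percent).
FIELD = {
    5: {"map": 80.8, "precision": 82.1, "recall": 79.6},
    10: {"map": 79.8, "precision": 80.9, "recall": 78.8},
    15: {"map": 75.5, "recall": 74.2},  # precision not reported
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def map_drop(speed_kmh):
    """Loss in mAP relative to the static test, in percentage points."""
    return STATIC_MAP - FIELD[speed_kmh]["map"]


for speed, row in FIELD.items():
    line = f"{speed:>2} km/h: mAP drop {map_drop(speed):.1f} points"
    if "precision" in row:
        line += f", F1 {f1_score(row['precision'], row['recall']):.2f}%"
    print(line)
```

Under these figures the F1-score works out to about 80.83% at 5 km/h and 79.84% at 10 km/h, both within roughly half a percentage point of the static value, while the mAP loss grows from 2.4 points at 5 km/h to 7.7 points at 15 km/h.
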
These findings demonstrate that POT-YOLOv11n detects potholes on unstructured farm roads effectively across different operating speeds. Its multi-scale edge enhancement, large kernel attention fusion, and feature screening mechanisms address boundary ambiguity, scale variation, and motion blur, while the architecture remains lightweight and fast enough for practical deployment. The consistency between static and dynamic tests, particularly the robust results at low and medium speeds and the compensated performance at high speed, underscores the practical reliability of the approach. The POT-YOLOv11n model can therefore provide reliable support for the autonomous navigation of agricultural machinery in complex field environments and contribute to smart agriculture and automated farming operations. Its combination of accuracy, efficiency, and robustness to motion blur makes it a promising solution for real-time obstacle detection in vision-based agricultural navigation systems.
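One way to put the reported inference speed in context is to ask how far the vehicle travels between consecutive detections at each tested speed. The sketch below assumes that the on-vehicle detector sustains the benchmark rate of 248.83 frames per second, which the abstract does not state; the function name is hypothetical.

```python
FPS = 248.83                # reported inference speed, frames per second
SPEEDS_KMH = (5, 10, 15)    # field-test vehicle speeds


def metres_per_frame(speed_kmh, fps):
    """Distance travelled during one inference cycle, in metres.

    Assumption: detection runs at `fps` on the vehicle (not stated in the source).
    """
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return speed_ms / fps


for v in SPEEDS_KMH:
    print(f"{v:>2} km/h: {metres_per_frame(v, FPS) * 1000:.2f} mm per frame")
```

Even at 15 km/h the vehicle would advance less than 2 cm per processed frame under this assumption, so a pothole stays in view for many consecutive detections; in practice the camera frame rate and onboard hardware would likely be the limiting factors.
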

     
