Three-dimensional positioning of picking points for daylilies based on multi-task segmentation and adaptive compression

张睿; 王骞; 张延军; 白峭峰

doi:10.11975/j.issn.1002-6819.202507222

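Abstract: Picking points for field-grown daylilies are hard to localize because inter-class differences are small, intra-class differences are large, and the flowers overlap densely. To address this, the study proposes a 3D picking-point localization method that combines multi-task segmentation with adaptive compression. First, a multi-task segmentation network, DaylilyPick-Net, is constructed; it integrates a dense information estimation module, a multi-dimensional information enhancement and fusion module, a task synergistic dynamic segmentation head, and a dynamic loss weighting strategy to improve segmentation performance in dense scenes. Second, an adaptive pruning strategy (adaptive collaborative self-optimizing pruning, ACSP) is proposed, which combines layer-wise dynamic adaptive pruning with a global-enhanced secretary bird optimization algorithm to compress the model while maintaining accuracy. Finally, a method for 3D picking-point localization and cutting-angle determination is proposed, which fuses segmentation masks with depth information to compute the 3D picking point and the cutting angle. The results show that, on a dense field daylily dataset, DaylilyPick-Net achieves a mean average precision (mAP@50) of 59.62%, a parameter count of 13.01 M, 71.7 G floating-point operations, and an average frame rate of 57 frames/s, outperforming mainstream models such as the YOLO series; ACSP reaches an mAP@50 of 60.31% at a pruning ratio of 40%; and the picking-point localization error stays within 0.30 cm. The study can provide visual perception support for agricultural harvesting robots working in dense scenes.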

Abstract: The automated harvesting of daylilies under field conditions remains a challenging task due to several inherent complexities: subtle differences between maturity stages, significant size variations within the same class (especially among unripe buds), and severe occlusions caused by densely overlapping flowers and stems. These factors collectively hinder the accurate and robust localization of picking points in real-world applications. To address these challenges, this paper proposes an integrated framework that combines multi-task semantic segmentation with adaptive model compression, aiming to achieve high-precision and efficient 3D picking point localization in dense daylily environments. The core of the framework is a novel multi-task segmentation network named DaylilyPick-Net, which simultaneously segments daylilies and their corresponding picking regions. This network incorporates several specialized architectural components designed to enhance performance under challenging conditions. These include an OIEM (occlusion information estimation module) equipped with Multi-path Weighted Coordinate Attention to improve feature discrimination in occluded regions, a DIEM (dense information estimation module) that facilitates multi-scale feature integration across different network layers, and a TSDS-Head (task synergistic dynamic segmentation head) that employs dynamic convolution to adapt to various morphological characteristics in dense clusters. Furthermore, a DLWS (dynamic loss weighting strategy) is introduced to automatically balance the learning process between the multi-class segmentation task (classifying daylilies into different maturity stages) and the single-class segmentation task (identifying picking regions), thereby mitigating the adverse effects of task imbalance. To ensure the practicality of deployment on resource-constrained devices, an ACSP (adaptive collaborative self-optimizing pruning) strategy is proposed.
This strategy consists of two main stages: first, an AD-Lamp (adaptive dynamic lamp) stage that reduces model complexity based on importance scores; and second, a GL-SBOA (global-enhanced secretary bird optimization algorithm) stage that fine-tunes the pruned model. GL-SBOA, an improved version of the Secretary Bird Optimization Algorithm, incorporates Cubic chaotic mapping for population initialization, adaptive convergence factors, and refined search strategies across different phases to effectively navigate the high-dimensional hyperparameter space. Finally, the Daylily-3DPAC (daylily 3D picking-point positioning and angle calculation) algorithm is developed to translate the segmentation results into actionable 3D picking information. By integrating segmentation masks obtained from DaylilyPick-Net with depth information captured by an RGB-D camera, this algorithm calculates the 3D coordinates of picking points through centroid estimation of the intersection between maturity-specific masks and picking region masks, followed by depth projection. Additionally, the optimal cutting angle is determined based on the orientation of the minor axis of the minimum bounding rectangle of the target region. Comprehensive evaluations on a dedicated dataset demonstrate the effectiveness of the proposed framework. The experimental results show that the DaylilyPick-Net model proposed in this study achieves a mean Average Precision at 50% IoU (mAP@50), parameter count, floating-point operations (FLOPs), and detection frame rate of 59.62%, 13.01 M, 71.7 G, and 57 frames per second (fps), respectively. Compared to Mask R-CNN, SOLOv2, and models from the YOLO series, DaylilyPick-Net improves mAP@50 by 1.98 to 16.38 percentage points, reduces the parameter count by 14.57 to 75.17 M and FLOPs by 42.3 to 252.4 G, and raises the detection frame rate by 16 to 45 fps.
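As an illustration of the Cubic chaotic mapping mentioned for population initialization, the following is a minimal sketch rather than the authors' GL-SBOA code. The abstract does not give the map's form, so the sketch assumes the commonly used variant x_{k+1} = rho * x_k * (1 - x_k^2) with rho = 2.595; the function name, the seed handling, and the example search bounds are all hypothetical.

```python
import random


def cubic_chaotic_population(pop_size, lower, upper, rho=2.595, seed=0):
    """Initialise a population inside [lower, upper] with a Cubic chaotic map.

    Assumed map form (not taken from the paper):
        x_{k+1} = rho * x_k * (1 - x_k**2), with x in (0, 1).
    For rho = 2.595 the map keeps values inside (0, 1).
    """
    rng = random.Random(seed)
    # One chaotic sequence per dimension, each started from a random point.
    state = [rng.uniform(0.1, 0.9) for _ in lower]
    population = []
    for _ in range(pop_size):
        # Advance every sequence by one step of the chaotic map.
        state = [rho * x * (1.0 - x * x) for x in state]
        # Scale the chaotic values from (0, 1) into the search bounds.
        individual = [lo + x * (hi - lo) for x, lo, hi in zip(state, lower, upper)]
        population.append(individual)
    return population


# Hypothetical two-dimensional search space (e.g. a learning rate and a momentum).
population = cubic_chaotic_population(20, lower=[1e-4, 0.0], upper=[1e-1, 0.99])
```

Compared with uniform random sampling, a chaotic sequence of this kind is often chosen because it tends to cover the search range without repeating values; whether GL-SBOA seeds or scales the sequence in this way is not stated in the abstract.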
The pruning strategy balances compression and accuracy: at a 40% pruning ratio the pruned model reaches an mAP@50 of 60.31%, exceeding the accuracy of the original model. Most importantly, the proposed 3D localization method achieves picking point positioning errors of less than 0.30 cm in simulated environments, meeting the precision requirements of practical applications. This research provides an accurate and efficient solution for 3D picking point localization in dense daylily harvesting scenarios. The proposed DaylilyPick-Net effectively handles segmentation under complex field conditions, while the model compression strategy facilitates deployment without sacrificing accuracy. The overall framework offers a robust technological foundation for the development of intelligent harvesting robots. Future work will focus on integrating 3D reconstruction technologies for enhanced spatial accuracy and implementing the system on real-world robotic platforms.
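The localization step described in the abstract (centroid of the intersection of the maturity mask and the picking-region mask, followed by depth projection) can be sketched as below. This is a simplified illustration under stated assumptions, not the Daylily-3DPAC implementation: it assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a depth map aligned to the colour image, and a depth lookup at the pixel nearest to the centroid. The cutting-angle computation is not covered.

```python
def mask_intersection_centroid(maturity_mask, picking_mask):
    """Return the (u, v) pixel centroid of the overlap of two binary masks, or None."""
    sum_u = sum_v = count = 0
    for v, (row_a, row_b) in enumerate(zip(maturity_mask, picking_mask)):
        for u, (a, b) in enumerate(zip(row_a, row_b)):
            if a and b:  # pixel belongs to both masks
                sum_u += u
                sum_v += v
                count += 1
    if count == 0:
        return None
    return sum_u / count, sum_v / count


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth to camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


def locate_picking_point(maturity_mask, picking_mask, depth_map, fx, fy, cx, cy):
    """Return the 3D picking point in the camera frame, or None if it cannot be computed."""
    centroid = mask_intersection_centroid(maturity_mask, picking_mask)
    if centroid is None:
        return None
    u, v = centroid
    # Assumption: read the depth at the pixel nearest to the centroid.
    z = depth_map[int(round(v))][int(round(u))]
    if z <= 0:  # RGB-D cameras commonly report 0 for missing depth
        return None
    return back_project(u, v, z, fx, fy, cx, cy)


# Toy 4x4 example with made-up intrinsics and a constant depth of 0.5 m.
maturity = [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]]
picking = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
depth = [[0.5] * 4 for _ in range(4)]
point = locate_picking_point(maturity, picking, depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

In practice a single-pixel depth reading is sensitive to sensor noise, so a median over the intersection region may be a more robust choice; the abstract does not say which is used.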
