
Three-dimensional positioning of the picking points for daylilies based on multi-task segmentation and adaptive compression

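  • Summary: Field harvesting of daylilies is hampered by small inter-class differences, large intra-class differences, and dense overlap, all of which make localization difficult. To address this, the study proposes a 3D picking-point localization method that combines multi-task segmentation with adaptive compression. First, a multi-task segmentation network, DaylilyPick-Net, is built; it integrates a dense information estimation module, a multi-dimensional information enhancement and fusion module, a task synergistic dynamic segmentation head, and a dynamic loss weighting strategy to improve segmentation in dense scenes. Second, an adaptive collaborative self-optimizing pruning (ACSP) strategy is proposed, which combines layer-wise dynamic adaptive pruning with a global-enhanced secretary bird optimization algorithm to compress the model while maintaining accuracy. Finally, a method for 3D picking-point localization and cutting-angle determination is proposed, which fuses the segmentation masks with depth information to compute the 3D picking point and the cutting angle. The results show that, on a dataset of dense field daylilies, DaylilyPick-Net achieved a mean average precision (mAP@50) of 59.62%, a parameter count of 13.01 M, 71.7 G floating-point operations, and an average frame rate of 57 frames/s, outperforming mainstream models such as the YOLO series; ACSP reached an mAP@50 of 60.31% at a pruning ratio of 40%; and the picking-point localization error stayed within 0.30 cm. The study can provide visual perception support for agricultural picking robots operating in dense scenes.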

     

    Abstract: Daylilies are among the most popular vegetables in Asia. However, mechanical harvesting of daylilies under field conditions remains challenging because of the subtle differences among maturity stages, the large size variations within the same class (especially among unripe buds), and the severe occlusions caused by densely overlapping flowers and stems. Real-world applications therefore require accurate and robust localization of the picking points. In this study, an integrated framework was proposed that combines multi-task semantic segmentation with adaptive model compression to achieve high-precision, efficient 3D picking-point localization in dense daylily environments. A multi-task segmentation network, named DaylilyPick-Net, simultaneously segmented the daylilies and their picking regions. Several components of the network were specially designed to enhance performance under challenging conditions. An OIEM (occlusion information estimation module) equipped with Multi-path Weighted Coordinate Attention improved feature discrimination in occluded regions. A DIEM (dense information estimation module) facilitated multi-scale feature integration across network layers. The TSDS-Head (task synergistic dynamic segmentation head) employed dynamic convolution to adapt to the varied morphological characteristics in dense clusters. Furthermore, a DLWS (dynamic loss weighting strategy) automatically balanced learning between the multi-class segmentation (classifying daylilies into maturity stages) and the single-class segmentation (identifying picking regions), thereby mitigating the adverse effects of task imbalance. An ACSP (adaptive collaborative self-optimizing pruning) strategy was proposed to make deployment on resource-constrained devices practical. 
ACSP consisted of two stages: AD-Lamp (adaptive dynamic lamp) reduced the model complexity using importance scores, and GL-SBOA (global-enhanced secretary bird optimization algorithm) fine-tuned the pruned model. GL-SBOA improved the Secretary Bird Optimization Algorithm by incorporating Cubic chaotic mapping for population initialization together with adaptive convergence factors. The search strategies of the different phases were also refined so that the high-dimensional hyperparameter space could be navigated effectively. Finally, Daylily-3DPAC (daylily 3D picking-point positioning and angle calculation) was developed to translate the segmentation results into actionable 3D picking information. Segmentation masks from DaylilyPick-Net were integrated with the depth information captured by an RGB-D camera. The 3D coordinates of each picking point were calculated by estimating the centroid of the intersection between the maturity-specific mask and the picking-region mask, followed by depth projection. Additionally, the best cutting angle was determined from the orientation of the minor axis of the minimum bounding rectangle of the target region. A systematic evaluation was performed to verify the effectiveness of the framework. The experimental results show that DaylilyPick-Net achieved a mean average precision at 50% IoU (mAP@50) of 59.62%, with 13.01 M parameters, 71.7 G floating-point operations (FLOPs), and a detection frame rate of 57 frames per second (fps). Compared with Mask R-CNN, SOLOv2, and the YOLO series, DaylilyPick-Net improved the mAP@50 by 1.98 to 16.38 percentage points and the detection frame rate by 16 to 45 fps, while reducing the parameter count by 14.57 to 75.26 M and the FLOPs by 42.3 to 278.2 G. 
The pruning strategy achieved a favorable balance between compression and performance: at a 40% pruning ratio, the pruned model reached an mAP@50 of 60.31%, surpassing the accuracy of the original model. The picking-point positioning errors were less than 0.30 cm in simulated environments, meeting the precision requirements of practical applications. These findings can provide an accurate and efficient solution for 3D picking-point localization in dense daylily harvesting scenarios. DaylilyPick-Net segmented daylilies effectively under complex field conditions, while the compression strategy facilitated deployment while retaining high accuracy. The framework can offer a robust technological foundation for intelligent harvesting robots. Future work may integrate 3D reconstruction to achieve higher spatial accuracy in real-world robotics.
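
The picking-point step described in the abstract (centroid of the intersection of the two masks, followed by depth projection) can be illustrated with a minimal sketch. The abstract does not specify the projection model, so this sketch assumes a standard pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` and a depth map in metres that is already aligned with the RGB image. All function and parameter names are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]        # binary mask: rows of 0/1 values
DepthMap = List[List[float]]  # per-pixel depth in metres (assumed aligned to RGB)


def intersection_centroid(mask_a: Mask, mask_b: Mask) -> Optional[Tuple[float, float]]:
    """Return the (u, v) = (column, row) centroid of pixels set in both masks."""
    sum_u = sum_v = count = 0
    for v, (row_a, row_b) in enumerate(zip(mask_a, mask_b)):
        for u, (a, b) in enumerate(zip(row_a, row_b)):
            if a and b:
                sum_u += u
                sum_v += v
                count += 1
    if count == 0:
        return None  # the masks do not overlap
    return sum_u / count, sum_v / count


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Pinhole back-projection of pixel (u, v) at the given depth (assumed model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


def picking_point_3d(maturity_mask: Mask, picking_mask: Mask, depth_map: DepthMap,
                     fx: float, fy: float, cx: float, cy: float
                     ) -> Optional[Tuple[float, float, float]]:
    """Hypothetical helper: 3D picking point in the camera frame, or None."""
    centroid = intersection_centroid(maturity_mask, picking_mask)
    if centroid is None:
        return None
    u, v = centroid
    # Read the depth at the pixel nearest to the centroid.
    depth = depth_map[round(v)][round(u)]
    if depth <= 0:
        return None  # treat non-positive depth as a missing measurement
    return back_project(u, v, depth, fx, fy, cx, cy)
```

Reading a single depth pixel keeps the sketch short; on real RGB-D data a median over a small neighbourhood of valid depth values would likely be more robust to sensor noise.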

     
