
Intelligent Litchi Stem Recognition via Dual-Camera Collaboration

  • Abstract: To address the technical challenge of accurately identifying and selecting fruit pedicels in automated litchi harvesting, this study designs a far-/near-view collaborative vision detection scheme based on D455 and D435i dual depth cameras. In the far-view detection stage, a D455 camera mounted on a mobile platform works with a lightweight detection model to rapidly identify fruits and obtain their 3D center coordinates; a density-based spatial clustering algorithm (Density-Based Spatial Clustering of Applications with Noise, DBSCAN) is then applied to generate the robotic arm's global motion path. In the near-view fine-positioning stage, a D435i camera above the end-effector performs morphological fitting of pedicels using oriented bounding boxes (Oriented Bounding Box, OBB), and an optimal matching and selection algorithm (Optimal Matching and Selection Algorithm, OMSA) is designed. By quantifying the spatial association strength between fruits and pedicels, the algorithm achieves accurate selection of target pedicels against complex backgrounds. Experimental results show that far-view detection achieves a mean average precision of 80.1% at a detection speed of 110 frames/s; in near-view detection, the recognition precision for fruits and pedicels reaches 99.2% and 87.7%, respectively; and the proposed optimal matching algorithm attains pedicel-selection success rates of 88% for clustered litchi and 98% for single fruits, outperforming the conventional nearest-neighbor matching method. The overall harvesting success rate of the system reaches 83% with an average time of 7.04 s, verifying its effectiveness and practicality in natural scenes.

     

    Abstract: Litchi, renowned as the "King of Tropical Fruits", is widely cultivated, with approximately 95% of total consumption eaten fresh. However, current harvesting practices predominantly rely on manual labor, which struggles to ensure operational efficiency. To address this, the present study developed a dual-camera collaborative vision detection system based on D455 and D435i depth cameras, integrating hardware and software to achieve automated harvesting. At the software level, the system operates in two distinct stages: far-view fruit detection and near-view pedicel detection. In the far-view stage, a dedicated dataset was constructed using LabelImg, and the decoupled detection head of YOLOv8n was optimized for the single-category litchi detection scenario by removing redundant structures, yielding a lightweight model. In the near-view stage, to minimize background interference, an oriented bounding box (OBB) detection approach was adopted: a near-view dataset was built using roLabelImg, comparative training was conducted on mainstream OBB-capable models, and YOLOv8n-OBB was ultimately selected as the pedicel detection model for its optimal balance between accuracy and real-time performance. At the hardware level, the system employs a dual-camera configuration with complementary perceptual ranges. The D455 camera, with its larger optimal depth-perception range, serves as the far-view camera mounted on the mobile platform; working with the lightweight detection model, it enables rapid fruit recognition and 3D centroid coordinate extraction, after which a density-based spatial clustering algorithm (DBSCAN) generates the global motion path for the robotic arm. The D435i camera, with its smaller optimal depth-perception range, is deployed as the near-view camera above the end-effector. It performs morphological fitting of pedicels via OBB, complemented by a novel optimal matching and selection algorithm (OMSA), which achieves precise target-pedicel screening in complex backgrounds by quantifying the spatial association strength between fruits and pedicels. Experimental results demonstrate that the system achieves a mean average precision (mAP) of 80.1% in far-view detection at a detection speed of 110 frames per second; relative to the unmodified YOLOv8n, the parameter count and computational load were reduced by 12.5% and 29.2%, respectively, while detection speed increased by 18.3%. In near-view detection, recognition precision for fruits and pedicels reached 99.2% and 87.7%, respectively. The proposed OMSA achieved pedicel-screening success rates of 88% for clustered litchi and 98% for single fruits, outperforming the conventional nearest-neighbor matching method. The overall harvesting success rate of the system reached 83%, with an average time of 7.04 seconds per cluster, confirming its effectiveness and practicality in natural environments.
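The far-view stage described above clusters the detected fruits' 3D center coordinates with DBSCAN and derives one arm waypoint per cluster. As a rough illustration only (the paper's eps/min_pts values and path-ordering strategy are not given in the abstract; the parameters and sample coordinates below are invented), a minimal pure-Python DBSCAN over fruit centers might look like:

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 3-D fruit-center coordinates (meters).
    Returns one label per point; -1 marks noise/outliers."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        # eps-neighbourhood of point i (includes i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:        # too sparse: provisional noise
            labels[i] = -1
            continue
        cluster += 1                   # i is a core point: start a cluster
        labels[i] = cluster
        seeds = [j for j in nbrs if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:        # noise reached from a core point
                labels[j] = cluster    # becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbours(j)
            if len(jn) >= min_pts:     # j is itself a core point: expand
                seeds.extend(jn)
    return labels

# Two tight groups of fruit centers plus one isolated outlier (meters).
centres = [(0.10, 0.00, 0.50), (0.12, 0.02, 0.52), (0.11, -0.01, 0.49),
           (0.80, 0.10, 0.60), (0.82, 0.12, 0.58), (0.81, 0.09, 0.61),
           (2.00, 2.00, 2.00)]
labels = dbscan(centres, eps=0.1, min_pts=3)   # [0, 0, 0, 1, 1, 1, -1]

# One waypoint per cluster (the cluster centroid); noise is skipped.
path = []
for lab in sorted(set(labels) - {-1}):
    pts = [p for p, l in zip(centres, labels) if l == lab]
    path.append(tuple(sum(c) / len(pts) for c in zip(*pts)))
```

A density-based method suits this setting because litchi grows in clusters of unknown count: unlike k-means, DBSCAN needs no preset number of clusters and discards stray detections as noise.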
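The abstract characterizes OMSA only as quantifying the "spatial association strength" between fruits and pedicel OBBs; its actual formulation is in the paper. The sketch below is a hypothetical stand-in, not the authors' method: it scores each candidate pedicel by a distance kernel plus an axis-alignment term (a pedicel's long axis should roughly line up with the fruit-to-pedicel direction) and selects the highest-scoring candidate. The exp() kernel, the alignment heuristic, and the weights are all assumptions.

```python
import math

def association_score(fruit_xyz, stem_centre, stem_axis,
                      w_dist=0.7, w_align=0.3):
    """Hypothetical 'spatial association strength' between one fruit and one
    candidate pedicel OBB. stem_axis is a unit vector along the OBB's long
    side. Kernel, alignment term, and weights are illustrative only."""
    d = math.dist(fruit_xyz, stem_centre)
    # Unit vector from the fruit center toward the pedicel center.
    v = tuple(s - f for s, f in zip(stem_centre, fruit_xyz))
    norm = math.hypot(*v) or 1.0
    v = tuple(c / norm for c in v)
    # A pedicel normally extends away from the fruit it bears, so its long
    # axis should roughly align with the fruit-to-pedicel direction.
    align = abs(sum(a * b for a, b in zip(stem_axis, v)))
    return w_dist * math.exp(-d) + w_align * align

def select_stem(fruit_xyz, candidates):
    """Index of the candidate (centre, axis) pair most strongly associated
    with the fruit; cf. nearest-neighbour matching, which uses distance
    alone."""
    return max(range(len(candidates)),
               key=lambda i: association_score(fruit_xyz, *candidates[i]))

fruit = (0.0, 0.0, 0.5)
candidates = [
    # Nearer pedicel, but its axis is orthogonal to the fruit direction
    # (e.g. a neighbouring fruit's pedicel crossing in front).
    ((0.10, 0.00, 0.50), (0.0, 1.0, 0.0)),
    # Slightly farther pedicel whose axis points away from the fruit.
    ((0.00, 0.15, 0.50), (0.0, 1.0, 0.0)),
]
best = select_stem(fruit, candidates)   # picks index 1: farther but aligned
```

Note that plain nearest-neighbour matching would pick index 0 here purely on distance; combining distance with geometric consistency is one way such an association measure can beat it on clustered fruit, which matches the direction of the reported 88% vs. nearest-neighbour result (though not its mechanism).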

     
