Abstract:
To address the challenges posed by small target scale, strongly nonlinear motion, and frequent short-term occlusions when monitoring the flight trajectories of peach borer (Carposina sasakii) adults in the absence of visible light, this study proposes a collaboratively optimized framework that integrates target detection and multi-object tracking to achieve robust, continuous, and identity-consistent trajectory acquisition. In nocturnal or infrared imaging environments, peach borer adults typically exhibit rapid, irregular, and highly nonlinear flight, accompanied by weak visual contrast, blurred contours, and intermittent target disappearance. These characteristics make reliable detection and tracking substantially harder, often leading to incomplete detections, frequent identity switches, and severe trajectory fragmentation, which in turn undermine the validity of long-term trajectory-based behavioral analysis.

This study argues that optimizing either the detection or the tracking module in isolation is insufficient; instead, a joint optimization strategy across both stages is required to make the overall trajectory acquisition pipeline more robust. Accordingly, a detection–tracking collaborative framework is developed in which the detection stage is optimized to provide more complete and stable observations, while the tracking stage is designed to better tolerate short-term detection failures and abrupt motion changes.

In the target detection phase, an improved detection architecture is constructed to enhance robustness against complex backgrounds, weak infrared contrast, and dense target distributions. Specifically, a novel C3k2_DEAB module is introduced into the backbone network to strengthen feature representation for small flying targets. By expanding the effective local receptive field and incorporating attention-driven feature discrimination, the proposed module improves the detector’s sensitivity to the weak, fragmented, and low-contrast target cues commonly observed in infrared imagery.
In addition, a BiFPN-based multi-scale feature fusion strategy is employed to adaptively integrate semantic and spatial information across feature levels. This design improves detection completeness for targets of varying scale, motion state, and imaging condition, thereby reducing missed detections caused by scale variation and rapid motion.

To further address target overlap in dense flight scenarios, Distance-IoU-based Non-Maximum Suppression (DIoU-NMS) is incorporated into the inference stage. By jointly considering the overlap and the center distance between bounding boxes, DIoU-NMS reduces false suppression in crowded scenes, improving both localization accuracy and detection recall for closely spaced flying insects.

In the multi-object tracking phase, a Kalman filter–based prediction optimization strategy incorporating iterative motion smoothing is proposed to better model the nonlinear and abrupt motion patterns of peach borer adults. Introducing iterative smoothing into the state prediction process makes the tracker more resilient to the sudden velocity changes and short-term observation noise that are common in insect flight trajectories. Furthermore, a trajectory consistency constraint based on a predefined prior on the number of targets is designed to mitigate the adverse effects of short-term missed detections. This constraint prevents premature trajectory termination and erroneous identity reassignment caused by transient detection failures or occlusions.

To further enhance tracking robustness, a trajectory optimization and association repair mechanism is developed for short-term trajectory fragmentation. Within a limited temporal window, temporarily unmatched trajectories are maintained and continuously evaluated for motion consistency and spatial feasibility. This mechanism enables identity recovery and continuity maintenance under short-term occlusion, rapid directional changes, and dense interactions, while avoiding erroneous long-term trajectory associations.

Extensive experiments conducted under laboratory-controlled infrared imaging conditions demonstrate that the proposed method significantly outperforms the baseline YOLOv12 + Deep OC-SORT framework. Quantitatively, AssA, IDF1, HOTA, MOTP, and MOTA improve by 3.42%, 3.74%, 1.85%, 2.22%, and 3.85%, respectively, while the number of trajectory fragmentations (Frag) is reduced by 305. Visualization results further confirm that the proposed framework suppresses the trajectory fragmentation and identity reassignment caused by high-speed motion and short-term occlusions, producing smoother, longer, and more coherent flight trajectories.

Through systematic and collaborative optimization of both the detection and tracking stages, this study enables accurate and reliable characterization of peach borer adult flight trajectories under no-visible-light conditions. The proposed framework provides a solid data foundation and methodological reference for subsequent quantitative analysis of flight behavior, phototactic response modeling, and the development of intelligent pest monitoring technologies in complex agricultural environments.
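
As an illustration of the DIoU-NMS step described in the abstract, the following is a minimal sketch in pure Python, not the implementation used in this study. It applies the standard DIoU definition (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box) as the suppression criterion. The function names `iou_and_diou` and `diou_nms`, the `(x1, y1, x2, y2)` box format, the example boxes, and the 0.4 threshold are all assumptions made for the example.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2)."""
    # Intersection area
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union if union > 0 else 0.0

    # Squared distance between box centers
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2.0) ** 2 \
       + (((a[1] + a[3]) - (b[1] + b[3])) / 2.0) ** 2
    # Squared diagonal of the smallest box enclosing both
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    penalty = d2 / c2 if c2 > 0 else 0.0
    return iou, iou - penalty


def diou_nms(boxes, scores, threshold=0.4):
    """Greedy NMS that suppresses a box only if its DIoU with a
    higher-scoring kept box exceeds `threshold` (assumed value)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou_and_diou(boxes[best], boxes[i])[1] <= threshold]
    return keep


# Hypothetical detections: two neighbouring insects and one near-duplicate.
boxes = [(0, 0, 10, 10), (4, 0, 14, 10), (0.5, 0, 10.5, 10)]
scores = [0.9, 0.8, 0.7]
keep = diou_nms(boxes, scores, threshold=0.4)
print(keep)  # [0, 1]
```

In this toy case the first two boxes have an IoU of about 0.43, so IoU-based NMS with the same threshold would discard the second one; the center-distance penalty lowers the score to about 0.37 and the box is retained, while the near-duplicate third box is still suppressed.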