
Takifugu rubripes abnormal behavior recognition based on dual correlation feature fusion

  • Abstract: Fish behavior recognition is of great significance for assessing fish health, helping to accurately detect diseased fish and prevent disease outbreaks. However, the abnormal behaviors of Takifugu rubripes are highly similar to its normal behaviors, which easily leads to misidentification. This paper therefore proposes DCFF-EYSFNet (DCFF-based enhanced YOLOv10 and SlowFast network), an abnormal behavior recognition method for Takifugu rubripes based on dual correlation feature fusion. First, the recognition process is decoupled into two stages, target localization and behavior classification, to reduce training difficulty. Second, the video data are preprocessed to fuse fish edge contours, and an ECAM (efficient coordination attention module) is designed to focus on the key features of each individual, enabling more precise localization. Finally, to address the small differences between behaviors, a dual correlation feature fusion method, DCFF, is proposed: it strengthens the correlation between the temporal and channel dimensions and fuses spatiotemporal features across network levels, so that abnormal behaviors are recognized more accurately. Ablation and comparison experiments on a self-built dataset show that DCFF-EYSFNet improves accuracy and recall by 7.8 and 7.6 percentage points, respectively, over the baseline model. The results demonstrate that DCFF-EYSFNet can accurately recognize the abnormal behaviors of Takifugu rubripes and provides technical support for disease prevention and control.

    Abstract: Fish behavior recognition plays a critical role in assessing the health status of aquaculture species, as it enables early detection of diseased individuals and provides an opportunity to prevent large-scale disease outbreaks. Among various farmed fish species, Takifugu rubripes holds significant economic value. However, its abnormal behaviors, such as body tilting, rolling, or vertical floating near the water surface, are often highly similar to normal behaviors when observed underwater. This high behavioral similarity poses a substantial challenge for automated vision-based systems, frequently leading to misjudgment or false classification. To address this issue, this paper proposes a novel abnormal behavior recognition method for Takifugu rubripes based on dual correlation feature fusion, termed DCFF-EYSFNet. The proposed framework is built upon two complementary stages: target localization and behavior classification. This decoupled design reduces the overall training difficulty by allowing each sub-network to focus on a specific task, thereby avoiding the optimization conflicts commonly observed in end-to-end models that attempt to perform both detection and classification simultaneously. In the target localization stage, the raw video frames are first preprocessed to extract and fuse the edge contours of individual fish. This preprocessing step enhances the posture information of each fish, making individuals more distinguishable from complex underwater backgrounds. Subsequently, an efficient coordination attention module (ECAM) is designed and integrated into the detection network. ECAM consists of two complementary components: an efficient channel attention block (ECAB) and a coordinated spatial attention block (CSAB). ECAB adaptively recalibrates channel-wise feature responses, while CSAB captures long-range spatial dependencies along both horizontal and vertical directions without introducing excessive computational overhead. 
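The channel-recalibration idea behind ECAB can be illustrated in a few lines. The following NumPy sketch shows ECA-style channel attention on a (C, H, W) feature map; the kernel size and the fixed averaging weights (standing in for a learned 1-D convolution) are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def eca_block(x, k=3):
    """ECA-style channel attention sketch for a feature map x of shape (C, H, W).

    Assumed kernel size k; real blocks learn the 1-D convolution weights,
    here replaced by simple neighbourhood averaging for illustration.
    """
    c, h, w = x.shape
    # Global average pooling collapses each channel to a single descriptor.
    desc = x.mean(axis=(1, 2))                          # shape (C,)
    # Mix each channel descriptor with its k-neighbourhood across channels.
    pad = k // 2
    padded = np.pad(desc, pad, mode="edge")
    mixed = np.array([padded[i:i + k].mean() for i in range(c)])
    # Sigmoid gate rescales every channel of the input feature map.
    gate = 1.0 / (1.0 + np.exp(-mixed))                 # values in (0, 1)
    return x * gate[:, None, None]

feat = np.random.rand(8, 4, 4)
out = eca_block(feat)
print(out.shape)  # (8, 4, 4)
```

Because the gate is a sigmoid, each channel is attenuated rather than amplified, which is the recalibration effect the abstract attributes to ECAB.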
By focusing on the most discriminative key features of each individual, ECAM significantly improves the accuracy of fish localization, even in densely populated or partially occluded scenarios. This precise localization provides reliable spatial input for the subsequent behavior classification stage. In the behavior classification stage, the challenge lies in the subtle differences between normal and abnormal movements. To tackle this, the proposed method introduces a dual correlation feature fusion (DCFF) mechanism, which enhances the correlation between temporal and channel dimensions and effectively fuses spatiotemporal features across multiple hierarchical levels of the network. The DCFF mechanism is composed of two core modules: the Hjorth module and the path feature reconstruction (PFR) module. The Hjorth module captures high-order temporal statistics, specifically the F_act and F_mob parameters, which reflect the stability and instantaneous intensity of fish motion over time. The PFR module, on the other hand, separates strong and weak features from the slow and fast pathways of the SlowFast backbone, suppresses redundant background noise, and reconstructs a purified feature representation through adaptive weighting. By integrating these two modules, DCFF effectively enlarges the feature discrepancy between normal and abnormal behaviors, thereby enabling more accurate and reliable recognition. Extensive experiments are conducted on a self-constructed underwater dataset collected from real aquaculture environments, which includes both clear and turbid water conditions under varying illumination. Both ablation studies and comparative evaluations are performed to validate the effectiveness of each proposed component. The experimental results demonstrate that DCFF-EYSFNet achieves substantial improvements over the baseline model. 
Specifically, compared with the baseline SlowFast network, the proposed method increases accuracy by 7.8 percentage points and recall by 7.6 percentage points. Furthermore, when compared with state-of-the-art spatiotemporal action detection models such as YOWOv2 and ST-GCN, DCFF-EYSFNet achieves superior performance, with F1-score improvements of 4.7 and 5.8 percentage points, respectively. The normalized confusion matrix further reveals that the proposed model correctly identifies 97% of normal behaviors and 85% of abnormal behaviors that successfully enter the classification stage, with only 1% of abnormal samples being misclassified as normal. The remaining 14% of missed detections are mainly attributed to severe occlusion or blurred edge features during the localization phase, which is explicitly acknowledged as a current limitation. These findings confirm that DCFF-EYSFNet can accurately recognize the abnormal behaviors of Takifugu rubripes in complex underwater environments. The proposed method provides effective technical support for early disease warning, targeted intervention, and precision management in aquaculture, thereby contributing to reduced economic losses, minimized antibiotic usage, and more sustainable fish farming practices. Despite the promising results, the current model still suffers from relatively high inference latency due to its two-stage serial processing pipeline, which limits its real-time deployment on resource-constrained edge devices. Future work will focus on network lightweighting, knowledge distillation, and optimized hardware acceleration to improve inference efficiency while maintaining high detection accuracy.

     
