Abstract: Anomaly detection in data streams is one of the most critical tasks in monitoring water quality in aquaculture. Water quality data streams collected by wireless sensor networks are difficult to classify accurately because they are highly complex, unstable, and nonlinear. Under imbalanced data, the traditional support vector data description (SVDD) has a relatively low recognition rate for the small number of abnormal samples. Noise samples also interfere strongly with anomaly detection, so that specific features cannot be captured completely. In this study, an improved support vector data description (improved SVDD, ISVDD) was proposed to strengthen anomaly detection in sensor data streams. First, the Mahalanobis distance was applied to enhance the Gaussian function of the Parzen window, thus avoiding data interference during classification. Then, the improved Parzen-window function was used to estimate the density of the training data. In this way, the training data were classified and their distribution was extracted. The ISVDD model was then constructed by incorporating a fuzzy membership function, which significantly reduced the interference of noise samples with the model and improved the classification accuracy. Finally, the anomaly detection performance of SVDD with different kernel functions was compared to determine the optimal kernel function. The density-weighted support vector data description (D-SVDD), the traditional support vector data description (SVDD), and FastFood were selected as baselines to verify the performance on different testing datasets from three ponds. D-SVDD was used to verify the benefit of the fuzzy membership function introduced in the improvement. The traditional SVDD was used to verify the detection precision of the improved SVDD.
FastFood was used to verify the running efficiency. Each detection was repeated several times, and the average values were taken as the final results. The true positive rate (TPR), false positive rate (FPR), accuracy, and running time were used to evaluate the detection performance of all models. The experimental results showed that the improved SVDD achieved higher detection performance: its maximum TPR was 99.83%, its minimum FPR reached zero, its maximum anomaly detection accuracy was 99.83%, and its minimum running time was 1.34 s. This indicated that the improved SVDD outperformed D-SVDD and the traditional SVDD. The detection performance of the different SVDD kernel functions and of the different detection methods was characterized in all testing ponds during the aquaculture period. In addition, combining density weighting with the fuzzy membership function widened the boundary between normal and abnormal data and greatly improved anomaly detection performance. The findings can provide a new approach to improving the accuracy of anomaly detection over the whole aquaculture cycle. Meanwhile, the experiments and the improved SVDD can be expected to serve as a theoretical reference for improving the supervision level of anomaly detection.
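
The density estimation step described in the abstract can be illustrated with a minimal sketch. It assumes that the Mahalanobis-enhanced Parzen window is a Gaussian kernel whose Euclidean distance is replaced by the Mahalanobis distance under the sample covariance of the training data. The function name `mahalanobis_parzen_density`, the `bandwidth` and `ridge` parameters, and the max-normalized weights are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np


def mahalanobis_parzen_density(train, query, bandwidth=1.0, ridge=1e-6):
    """Parzen-window density at `query` points using a Gaussian kernel
    with Mahalanobis distance (hypothetical helper, not from the paper)."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = train.shape

    # Sample covariance of the training data; a small ridge keeps it invertible.
    cov = np.cov(train, rowvar=False).reshape(d, d) + ridge * np.eye(d)
    cov_inv = np.linalg.inv(cov)

    # Squared Mahalanobis distance between every query and training point.
    diff = query[:, None, :] - train[None, :, :]            # shape (m, n, d)
    sq_dist = np.einsum("mnd,de,mne->mn", diff, cov_inv, diff)

    # Normalizing constant of a Gaussian with covariance bandwidth^2 * cov.
    norm = (2.0 * np.pi) ** (d / 2) * bandwidth ** d * np.sqrt(np.linalg.det(cov))
    kernel = np.exp(-0.5 * sq_dist / bandwidth ** 2) / norm

    # Average kernel response over the training points.
    return kernel.mean(axis=1)


# Toy data: the second feature has a much larger spread than the first.
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 2)) * np.array([1.0, 5.0])

density = mahalanobis_parzen_density(samples, samples)
# Assumed weighting: scale densities to (0, 1] so sparse points get small weights.
weights = density / density.max()
```

Because the distance is scaled by the covariance, a point three units away along the low-variance feature receives a lower density than a point three units away along the high-variance feature. A plain Euclidean kernel would treat both points the same. Weights of this kind could then be passed to a density-weighted SVDD so that sparse, possibly noisy samples contribute less to the boundary.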