Abstract:
A geo-parcel is one of the most basic geographical units for physical parameter inversion under space-time attributes, and its automatic recognition is a crucial step in object-based interpretation of high-resolution remote sensing imagery. Intelligent interpretation of such images supports the construction of spatial databases and the extraction of thematic information, and has emerged as a powerful tool for target recognition in complex geographic scenarios, with promising applications in precision agriculture, resource investigation, disaster assessment, and urban planning. Nevertheless, technical challenges remain in the intelligent recognition of geo-parcels from high-resolution remote sensing images, particularly in semantic segmentation. This article presents a top-down "Target-Primitive-Object" framework for recognizing farmland geo-parcels in high-resolution remote sensing imagery using a local focusing algorithm. Intelligent recognition is introduced prior to segmentation in a method referred to as LFAS (Local Focus Aided Segmentation), in which a local focus algorithm assists the labeling. Specifically, a YOLO backbone with a CNN architecture extracts and fuses features from high-resolution remote-sensing image blocks, so that both semantic and positional information is captured in the feature maps and multiple bounding boxes containing block targets are predicted. Non-maximum suppression then eliminates redundant prediction boxes, and the center coordinates of the final prediction boxes serve as auxiliary markers, which are mapped into the SAM feature space as position encodings. Finally, the original image is fed into the mask decoder to produce the final block segmentation, from which the target parcels are extracted (a minimal sketch of this detection-to-prompt pipeline is given after the abstract). Experimental results demonstrate that LFAS effectively learns global semantic features from remote sensing images against complex backgrounds and captures long-range dependencies of global features, handling fine edges and local details. The recognition accuracy of LFAS reached 91.15% pixel accuracy (PA), with completeness (Cp) of 87.42%, intersection over union (IoU) of 89.62%, and boundary-line extraction quality (Q) of 80.39%. The extracted farmland geo-parcels exhibited distinctly clear edge features, the objects in each plot were well defined and independent, and the extracted boundary lines aligned closely with the actual spatial forms of the parcels. A comparative experiment against several semantic segmentation algorithms, including U-Net, DeepLabV3, Swin Transformer, and SegFormer, showed that LFAS delivered superior parcel recognition performance: relative to SegFormer, the recognition accuracy (PA) for farmland geo-parcels improved by 2.73 percentage points, IoU by 3.01 percentage points, and boundary-line extraction quality (Q) by 3.55 percentage points, with notable gains in both recognition accuracy and edge quality. LFAS therefore first identifies the farmland geo-parcels and then segments them.
Two learning architectures (CNN and Transformer) are integrated to achieve feature-level fusion of spatial position and spectral attributes, improving the regional adaptability of the segmentation and the quality of farmland parcel extraction from high-resolution remote sensing imagery. In summary, LFAS shows strong generality for stable and reliable recognition, with minimal manual intervention over the entire process, and offers a viable approach for the accurate segmentation of high-spatial-resolution remote sensing images with broad application prospects. More complex agricultural scenarios and diverse plot structures are expected to further validate the algorithm's adaptability and robustness in future work.
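To make the detection-to-prompt step concrete, the Python sketch below shows how predicted box centers could be passed to SAM as foreground point prompts. It is an illustrative approximation only, not the authors' implementation: the detector weights (yolo_parcels.pt), the SAM checkpoint and variant (sam_vit_b.pth, vit_b), and the confidence threshold are assumed placeholders, and the public Ultralytics and segment-anything APIs stand in for the paper's fused CNN-Transformer pipeline.

    # Hypothetical sketch: detection-box centers as SAM point prompts.
    # Weights, checkpoint paths, and thresholds are placeholders, not the paper's settings.
    import numpy as np
    import cv2
    from ultralytics import YOLO                               # detector backbone (assumed weights)
    from segment_anything import sam_model_registry, SamPredictor

    def parcel_masks(image_path, det_weights="yolo_parcels.pt",
                     sam_ckpt="sam_vit_b.pth", conf=0.25):
        image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)

        # 1) Detect candidate parcel blocks; the Ultralytics predictor applies
        #    non-maximum suppression internally, so the returned boxes are already pruned.
        detector = YOLO(det_weights)
        boxes = detector(image, conf=conf)[0].boxes.xyxy.cpu().numpy()   # (N, 4) x1,y1,x2,y2

        # 2) Use each box center as a foreground point prompt for SAM.
        centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                            (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)

        sam = sam_model_registry["vit_b"](checkpoint=sam_ckpt)
        predictor = SamPredictor(sam)
        predictor.set_image(image)                             # encode the image once

        masks = []
        for x, y in centers:
            m, _, _ = predictor.predict(point_coords=np.array([[x, y]]),
                                        point_labels=np.array([1]),      # 1 = foreground point
                                        multimask_output=False)
            masks.append(m[0])                                 # boolean mask for one parcel
        return masks

In this sketch the detector's built-in non-maximum suppression plays the role of the pruning step described in the abstract, and each surviving box contributes a single positive point prompt; the additional feature-level fusion of image and prompt encodings inside the mask decoder described in the paper is not reproduced here.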