Accurate identification of building footprints from high-resolution satellite imagery is crucial for urban planning and disaster response. This paper investigates building detection using the Mask R-CNN framework and its variants, aiming to address two challenges: accurate classification of boundary pixels and reduction of false positives. Two WorldView-3 datasets are used for analysis: the SpaceNet Building Detection Dataset and a dataset covering Prato, Italy. The input imagery is augmented with NDVI and Sobel edge detection features, and model performance is assessed with F1-score and Average Precision. Findings show that the PointRend Mask R-CNN is superior in detecting medium and large buildings in densely populated urban environments. Notably, PointRend and the use of NDVI and Sobel features yield substantial improvements over the other methods evaluated for building detection. This investigation provides insights into the efficacy of the Mask R-CNN framework and its variants for advancing building footprint delineation across various applications.
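As an illustration of the NDVI and Sobel features mentioned above, the following is a minimal sketch and not the pipeline used in this paper. It assumes that NDVI is computed per pixel as (NIR - Red) / (NIR + Red), that the Sobel feature is the gradient magnitude of a grayscale image, and that both are stacked as extra input channels. The function names, the band ordering (red as the first RGB channel), and the stacking scheme are hypothetical choices made for the example.

```python
import numpy as np


def ndvi(nir, red, eps=1e-6):
    """Normalized Difference Vegetation Index, (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    # eps avoids division by zero on pixels where both bands are 0.
    return (nir - red) / (nir + red + eps)


def sobel_magnitude(gray):
    """Sobel gradient magnitude of a 2-D image, using edge-replicated padding."""
    g = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]
    )
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:]
    )
    return np.hypot(gx, gy)


def stack_features(rgb, nir):
    """Append NDVI and Sobel channels to an H x W x 3 RGB array.

    Assumption: channel 0 of `rgb` is the red band, and the grayscale
    image for Sobel is the plain mean of the three RGB channels.
    """
    rgb = rgb.astype(np.float64)
    gray = rgb.mean(axis=-1)
    return np.dstack([rgb, ndvi(nir, rgb[..., 0]), sobel_magnitude(gray)])
```

For an H x W x 3 RGB array and an H x W near-infrared band, `stack_features` returns an H x W x 5 array. Whether the extra channels replace or extend the original bands in the actual experiments is not stated in the abstract, so the five-channel layout here is only one plausible arrangement.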