Training Samples Dataset for Building
Identification in the Urban Village
Liu, Y. F.1,2 Lv, B. R.1,3 Peng, L.1 Wu, T.1,3 Liu, S.4
1 Aerospace Information Research Institute,
Chinese Academy of Sciences, Beijing 100101, China;
2 Ucastech (Beijing) Smart Co. Ltd., Beijing
100080, China;
3 University of Chinese Academy of Sciences,
Beijing 100049, China;
4 Beijing Qingruanhaixin Technology
Co. Ltd., Beijing 100085, China
Abstract: Identifying buildings from remote sensing imagery is a basic method
used in urban management. The distribution pattern of building clusters,
especially high building density and narrow streets, is of particular concern
to urban managers. Based on remotely sensed images obtained from Google Maps,
2328 samples of building clusters in an urban village were drawn using LabelMe
software. Building information was extracted using Mask R-CNN, an instance
segmentation algorithm used in deep learning. The data set includes: (1) original
sample images (Buildingsample_pic); (2) sample segmentation results
(Buildingsample_mask); and (3) sample segmentation annotation
(Buildingsample_info). The data set consists of 6984 data files in three data
folders, in .png and .yaml formats. The data set’s size is 499 MB (compressed
into one file: 498 MB). The research paper related to the data set will be
published in the Proceedings of the first China Digital Earth Conference.
Keywords: urban village; building cluster; deep learning; Mask R-CNN;
Proceedings of the first China Digital Earth Conference
1 Introduction
With the continuous development of urban construction and urban governance, the
problem of the urban village is now of widespread concern [1,2]. An urban
village is a residential area built on land formerly owned by a rural collective
and on farmers’ homesteads during phases of urban expansion, and buildings are
an important part of it. Urban village buildings form disordered, heterogeneous,
and pathological settlement patterns, perhaps best described as “city is not
like city, village is not like village” [3]. Because of their high building
density, narrow streets and lanes, illegal construction, and other
characteristics, urban villages are diverse in shape and structurally complex,
which has long made them a contentious and difficult topic in academic research.
Because of its unique distribution and patterning, the urban village building
community is a target with discernible structural characteristics in the
analysis of remotely sensed images of urban areas. In recent years, with
advances in artificial intelligence and deep learning techniques, many scholars
have begun to research how to apply deep learning to extract buildings from such
imagery. Compared with data-driven and model-driven methods, building extraction
based on machine learning requires less prior knowledge and can achieve high
extraction accuracy when suitable samples are used [4-7]. In this paper, a
medium-to-large city in northern China was selected as the basis for sample
drawing. Using Google Maps remote sensing imagery with a spatial resolution of
0.11 m, a total of 2328 urban village building samples were drawn with LabelMe
software. This study provides basic data for remote sensing image analysis based
on deep learning, specifically the instance segmentation algorithm Mask R-CNN,
and includes an application case of the instance segmentation samples. This work
has practical significance for applying artificial intelligence to information
extraction in urban governance.
2 Metadata of the Dataset
The data set name, short name, author information, geographical region, data
age, spatial resolution, data format, data volume, data set composition, data
computing environment, data publishing and sharing service platform, data
sharing policy, and other information for the Training Samples Dataset for
Building Identification in the Urban Village [8] (Samples_BuiUrbanVill) are
shown in Table 1.
Table 1 Metadata summary of the Training Samples Dataset for Building Identification in the Urban Village
Item | Description
Dataset Name | Training Samples Dataset for Building Identification in the Urban Village
Short Name of Dataset | Samples_BuiUrbanVill
Author Information | Liu Yufei, Aerospace Information Research Institute, Chinese Academy of Sciences, Ucastech (Beijing) Smart Co. Ltd., 18811519832@163.com; Lv Beiru, Aerospace Information Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, 1121222861@qq.com; Peng Ling, Aerospace Information Research Institute, Chinese Academy of Sciences, pengling@aircas.ac.cn; Wu Tong, Aerospace Information Research Institute, Chinese Academy of Sciences, University of Chinese Academy of Sciences, tongw_indus@126.com; Liu Sai, Beijing Qingruanhaixin Technology Co. Ltd., liusai@hesion3d.com
Data Age | 2018–2019
Spatial Resolution | 0.11 m
Data Format | .png, .txt, and .yaml
Data Volume | 498 MB (after compression)
Dataset Composition | (1) Sample segmentation results (Buildingsample_mask); (2) original sample images (Buildingsample_pic); (3) sample segmentation annotation (Buildingsample_info)
Fund Projects | The Beijing Municipal Science and Technology Project, No. Z191100001419002
Data Computing Environment | GPU: NVIDIA GP102 [TITAN Xp]; Python: 3.6; TensorFlow-gpu: 1.3.0; Keras: 2.0.8
Publishing and Sharing Service Platform | Global Change Research Data Publishing & Repository, http://www.geodoi.ac.cn
Address | Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, No. A11 Datun Road, Chaoyang District, Beijing 100101
Data Sharing Policy | The “data” of the Global Change Research Data Publishing & Repository include metadata (in Chinese and English), entity data (in Chinese and English), and data papers published through the Journal of Global Change Data (Chinese and English). The sharing policies are as follows: (1) The “data” are open to the whole society, free of charge, through the Internet in the most convenient way, and users can browse and download them free of charge; (2) End users must cite the data source in the references or at an appropriate position according to the citation format; (3) Users who provide value-added services, or who distribute and disseminate the “data” in any form (including through computer services), must sign a written agreement with the editorial department of the Journal of Global Change Data (Chinese and English) to obtain permission; (4) Authors who extract some records from the “data” to create new data must follow the 10% quotation principle, that is, the data records extracted from this dataset must constitute less than 10% of the total records of the new dataset, and the extracted data records must be marked as “Data sources” [9].
Data and Paper Retrieval System | DOI, DCI, CSCD, WDS/ISC, GEOSS, China GEOSS
3 Data Development Methods
The remote sensing image samples in this project are divided into target
detection samples, semantic segmentation samples, and instance segmentation
samples according to their specific uses [6]. Samples for target detection must
have the location and type of the target feature labeled, that is, the bounding
rectangle of the target feature is drawn and its category labeled. Samples for
semantic segmentation must have the outline and type of the target feature
labeled, that is, the outline of the target feature is drawn and its category
labeled. Samples for instance segmentation must have the outline and category of
each individual object labeled, that is, the outline of every single object is
drawn and its category labeled. Currently, the most commonly used software tools
for drawing on images are LabelMe, ArcGIS, and LabelImg.

Figure 1 Flow chart of image sample drawing
Based on remote sensing imagery and ground-level photographs, this paper used
LabelMe software to draw building samples from an urban village, which were then
used for deep learning with the instance segmentation algorithm. The operational
flow chart is shown in Figure 1.
Drawing steps:
(1) Remote sensing imagery selection
Given the unique distribution pattern of the urban village building community,
with its high building density and narrow streets and lanes, Google Maps remote
sensing imagery with a resolution of 0.11 m was selected as the image data for
this data set.
(2) Image segmentation
The remote sensing image is divided into tiles of the target size, generally
squares whose side length is a power of 2. For this sample set, the original
image data and their labels were cut into 512 × 512 tiles for subsequent model
training. Segmenting the original image yields the pic file, which is the
sample’s original image set, as shown in Figure 2-a.
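The tiling in step (2) can be illustrated with a minimal sketch. It only computes the pixel boxes of the 512 × 512 tiles; the function name tile_boxes and the choice to discard incomplete tiles at the right and bottom edges are assumptions for illustration, not details given in this paper.

```python
def tile_boxes(width, height, tile=512):
    """Yield (left, upper, right, lower) boxes of full tile x tile windows.

    Assumption: partial tiles at the right and bottom edges are dropped.
    """
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (left, top, left + tile, top + tile)


# Example: a hypothetical 2048 x 1536 image gives a 4 x 3 grid of tiles.
boxes = list(tile_boxes(2048, 1536))
print(len(boxes))   # 12
print(boxes[0])     # (0, 0, 512, 512)
```

Each box uses the (left, upper, right, lower) order that image libraries such as Pillow accept for cropping, so the same boxes can be applied to an image and to its label raster to keep the two aligned.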
(3) Drawing the urban village buildings in LabelMe
The outline of each building is drawn in LabelMe and labeled in the form
vbuilding *.
(4) Format conversion
The JSON file generated by LabelMe for each drawn sample is converted to a
dataset format that can be used for training. The generated mask images are then
sorted to derive the mask file of the instance segmentation result set, as shown
in Figure 2-b.
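As a rough illustration of step (4), the sketch below rasterizes the polygons of a LabelMe-style annotation into an instance mask in which each building receives its own integer id. It is not the conversion program used for this data set: the function names, the brute-force pixel test, and the tiny inline annotation are assumptions, and only the JSON keys shapes, label, points, imageHeight, and imageWidth follow the LabelMe file layout.

```python
import json


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test of point (x, y) against a list of [px, py] vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def labelme_to_instance_mask(annotation, prefix="vbuilding"):
    """Return (mask, labels); mask[row][col] is 0 for background or a 1-based instance id."""
    height, width = annotation["imageHeight"], annotation["imageWidth"]
    mask = [[0] * width for _ in range(height)]
    labels = []
    for shape in annotation["shapes"]:
        if not shape["label"].startswith(prefix):
            continue  # skip shapes that are not building outlines
        labels.append(shape["label"])
        instance_id = len(labels)
        for row in range(height):
            for col in range(width):
                # Test the pixel centre; later shapes overwrite earlier ones.
                if point_in_polygon(col + 0.5, row + 0.5, shape["points"]):
                    mask[row][col] = instance_id
    return mask, labels


# Hypothetical 6 x 4 annotation with two rectangular buildings.
demo = json.loads("""{
  "imageHeight": 4, "imageWidth": 6,
  "shapes": [
    {"label": "vbuilding1", "points": [[0, 0], [3, 0], [3, 2], [0, 2]]},
    {"label": "vbuilding2", "points": [[4, 1], [6, 1], [6, 4], [4, 4]]}
  ]
}""")
mask, labels = labelme_to_instance_mask(demo)
for row in mask:
    print(row)
```

The per-pixel loop is only practical for small examples; for 512 × 512 tiles a library polygon rasterizer would normally be used instead.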

Fig 2-a. Original image of the sample
Figure 2 Original image and mask of the sample
(5) Data enhancement
The generated mask images (mask) and original images (pic) are flipped
horizontally and vertically, and rotated by 90°, 180°, and 270° to increase the
number of samples, as shown in Figure 3.
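The five transformations in step (5) can be sketched on a small raster stored as a list of rows. The helper names and the counter-clockwise rotation direction are assumptions; the paper does not state the direction or the tool that was used.

```python
def flip_horizontal(img):
    """Mirror left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90(img):
    """Rotate by 90 degrees counter-clockwise (direction is an assumption)."""
    return [list(col) for col in zip(*img)][::-1]


def augment(img):
    """Return the original raster plus its five transformed variants."""
    r90 = rotate_90(img)
    r180 = rotate_90(r90)
    r270 = rotate_90(r180)
    return {
        "original": img,
        "flip_h": flip_horizontal(img),
        "flip_v": flip_vertical(img),
        "rot90": r90,
        "rot180": r180,
        "rot270": r270,
    }


sample = [[1, 2],
          [3, 4]]
for name, variant in augment(sample).items():
    print(name, variant)
```

Applying the same function to a pic tile and to its mask tile keeps the image and its annotation aligned after every transformation.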

Fig 3-a. Original image
Fig 3-b. Horizontal flip
Fig 3-c. Vertical flip
Fig 3-d. 90° rotation
Fig 3-e. 180° rotation
Fig 3-f. 270° rotation
Figure 3 Schematic diagram of data enhancement
4 Data Results and Validation
4.1 Dataset composition
The sample data set of urban village buildings includes: (1) the instance
segmentation result set (mask), in .png format; (2) the original sample image
set (pic), in .png format; and (3) the instance segmentation annotation
information (info.yaml). A total of 2328 urban village building samples were
drawn. After the data set was compressed into a *.rar file, the data volume was
498 MB.
Table 2 Description of the data set file composition
Serial number | File name | Document description | Data volume (MB)
1 | Buildingsample_mask | Sample segmentation results (.png) | 10.6
2 | Buildingsample_pic | Original sample images (.png) | 488
3 | Buildingsample_info | Sample segmentation annotation (.yaml) | 0.96
4.2 Validation of data results
The instance segmentation algorithm Mask R-CNN was used to extract building
information [10-13], and 678 urban village building samples were used for
testing and verification. The algorithm for extracting urban village buildings
with Mask R-CNN is shown in Figure 4.
To quantitatively evaluate the performance of the algorithm, average precision
(AP) was used as the evaluation standard. The AP of the model on the test set
was 0.66, and the highest AP for a single urban village building sample image
reached 0.995. These results indicate that Mask R-CNN can detect buildings
effectively on this sample data set of urban village buildings.
AP is the area enclosed by the precision-recall curve and the X and Y axes,
calculated by formula (1). The higher the AP, the better the performance of the
model. Calculating AP therefore involves calculating both precision and recall.
Precision is the ratio of TP (True Positive) to the number of all detected
targets, as shown in formula (2). Recall is the ratio of TP to the number of all
actual targets, as shown in formula (3).
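$$AP = \int_{0}^{1} P(R)\,dR \tag{1}$$
$$Precision = \frac{TP}{TP + FP} \tag{2}$$
$$Recall = \frac{TP}{TP + FN} \tag{3}$$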
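The relationship between precision, recall, and AP can be sketched as follows. This is a simple rectangle-sum approximation of the area under the precision-recall curve; the paper does not say which AP variant (for example, interpolated or not) was used, and the function names and the example detections are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(detections, num_ground_truth):
    """Approximate the area under the precision-recall curve.

    detections: list of (confidence score, is_true_positive) pairs.
    num_ground_truth: number of labeled buildings.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        # FN is the number of labeled buildings minus TP.
        precision, recall = precision_recall(tp, fp, num_ground_truth - tp)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


# Hypothetical image with 4 labeled buildings and 4 detections.
dets = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
print(round(average_precision(dets, 4), 4))  # 0.6042
```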
Table 3 Evaluation indexes of target detection
Name | Abbreviation | Concept
True Positive | TP | Number of positive samples correctly detected
True Negative | TN | Number of negative samples correctly detected
False Positive | FP | Number of negative samples detected as positive by error
False Negative | FN | Number of positive samples detected as negative by error

Fig 5-a. Original image
Fig 5-b. Test result map
Figure 5 Analysis and comparison of test results
FN is derived as the difference between the number of labeled buildings and TP.
To calculate TP and FP, the IoU (Intersection over Union) was used to judge the
correctness of each detection, with the threshold set to 0.5. When IoU > 0.5,
the detection is considered reliable, that is, a positive sample was correctly
detected (TP); otherwise, it is counted as a false positive (FP). The formula
for calculating IoU is shown in formula (4).
$$IoU = \frac{\text{area of intersection}}{\text{area of union}} \tag{4}$$
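A minimal sketch of the IoU test for two binary masks of the same size is given below. The masks are small hypothetical examples stored as lists of rows, and the function names are assumptions; only the 0.5 threshold comes from the text above.

```python
def mask_iou(pred, truth):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A detection counts as TP when its IoU with the labeled building exceeds the threshold."""
    return mask_iou(pred, truth) > threshold


truth = [[1, 1, 1],
         [1, 1, 1]]
good = [[1, 1, 1],
        [1, 1, 0]]   # 5 shared pixels out of 6 in the union
poor = [[1, 0, 0],
        [1, 0, 0]]   # 2 shared pixels out of 6 in the union
print(is_true_positive(good, truth))  # True
print(is_true_positive(poor, truth))  # False
```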
The results show that the average building area in the experimental area is
75.08 m², and the average nearest-neighbor distance is 0.90 m. According to the
kernel density estimation results, the building density of the studied area is
43.75%, and the green space rate is 5.12% [14]. Under the planning and design
standards for urban residential areas of the People’s Republic of China, this
makes it a high-density residential area [15].
5 Conclusion
The sample set is based on Google Maps remote sensing imagery with a spatial
resolution of 0.11 m, on which the location, outline, and type of each single
building in an urban village were marked. Using the samples, we provide an
application case of instance segmentation of single buildings in an urban
village. Our experimental results show the following:
(1) The network structure of Mask R-CNN has advantages in building target
detection. The sample set is highly practical for extracting information by deep
learning with the instance segmentation algorithm Mask R-CNN. The AP reached
0.66, and the highest detection accuracy for a single urban village building
sample image reached 0.995. When the sample quality is good and the features of
the sample set are similar to those of the verification set, Mask R-CNN can
achieve high precision and recall;
(2) Spatial analysis of the information extraction results can effectively
convey the distribution characteristics of small average building area, narrow
streets, high building density, and complex building types.
The sample set provides basic data for extracting buildings in urban villages
from remotely sensed images with deep learning algorithms. It has practical
significance for studying the spatial distribution characteristics of urban
villages and for intelligent analysis applications in urban village governance.
Author contributions
Wu Tong was responsible for the technical route of the data set development; Liu
Yufei and Lv Beiru collected and processed the sample data of urban villages; Lv
Beiru was responsible for the design of models and algorithms; Lv Beiru and Liu
Sai were responsible for data validation; Liu Yufei and Lv Beiru were
responsible for writing the data paper. Peng Ling was responsible for data
organization, sample types, and production process, as well as value judgment
and evaluation.
References
[1] Li, Z. Y., Yang, Y. C. Research progress of urban village in China [J]. Gansu Science and Technology, 2008 (7): 7–11.
[2] Zhou, X. H. Urban village problem: an economic analysis of its formation, existence and transformation [D]. Shanghai: Fudan University, 2007.
[3] Deng, C. Y., Wang, Y. R. A review of the research on urban villages in China [J]. Journal of Guangdong University of Administration (1): 93–97.
[4] Zhao, Y. H., Chen, G. Q., Chen, G. L., et al. Extraction of urban village buildings from multi-source big data: a case study of Tianhe District, Guangzhou City [J]. Geography and Geographic Information Science, 2018, 34 (5): 3, 13–19.
[5] Liang Yd. “Research on the application of UAV system in urban village reconstruction,” Beijing Surveying and Mapping, 2018, 32 (10): 70–73.
[6] S. D. Mayunga. “Semi-automatic building extraction in informal settlements from high-resolution satellite imagery,” 2006.
[7] Cheng, T. Construction and application method of big data of remote sensing image sample [J]. Application of Computer System, 2017, 026(005): 43–48.
[8] Liu Yf, Lv Br, Peng L, Wu T, Liu S. “Training Samples Dataset of Building Identification in Urban Village,” Global Change Data Repository, 2020. DOI: 10.3974/geodb.2020.02.16.V1.
[9] GCdataPR Editorial Office. GCdataPR Data Sharing Policy [OL]. DOI: 10.3974/dp.policy.2014.05 (Updated 2017).
[10] Ji Sp, Wei Sq. “Convolutional neural network and open source dataset method for building extraction from remote sensing images,” Acta Sinica Sinica, 2019, 48 (04): 50–61.
[11] T. Y. Lin, P. Dollár, R. Girshick, K. He, and S. Belongie. “Feature Pyramid Networks for Object Detection,” 2016.
[12] Hirata T, Kuremoto T, Obayashi M, et al. “Deep Belief Network Using Reinforcement Learning and Its Applications to Time Series Forecasting,” International Conference on Neural Information Processing. Springer International Publishing, 2016.
[13] Fu F, Wei Jy, Zhang Ln. “Research on building extraction from remote sensing image based on convolution network,” Software Engineering, v21; 228 (6): 8–11.
[14] Lv Br, Peng L, Wu T, et al. “Research on urban building extraction method based on deep learning convolutional neural network,” IOP Conference Series Earth and Environmental Science, 2020, 502: 012022.
[15] “The regulations of the People’s Republic of China on the planning and design standards of urban residential areas,” China Architecture & Building Press, 2002.