Development of Monthly-Seasonal-Annual Dataset of Sea Surface Chlorophyll-a Concentration for 21 Years (1998-2018)
Li, L. W.¹; Fu, Y. X.¹,²; Xue, C. J.²,³*; Cui, J. Y.¹; Zhang, Y. Y.¹,²; Xu, Y. F.¹,²
1. College of Oceanography and Space Informatics, China University of Petroleum, Qingdao 266580, China;
2. Key Laboratory of Digital Earth Science, Chinese Academy of Sciences, Beijing 100094, China;
3. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
Abstract: Based on the chlorophyll-a concentration products retrieved from SeaWiFS, Terra-MODIS, Aqua-MODIS, MERIS and VIIRS from January 1998 to December 2018, this paper combines the wavelet transform and Kalman filtering to develop a multi-source remote sensing data fusion technique, and integrates the look-up table and maximum value composite methods to design the data fusion method. The algorithm produces a global sea surface chlorophyll-a concentration fusion dataset (1998-2018) at three temporal scales: monthly, seasonal and annual. The spatial resolution of the dataset is 4 km × 4 km and the data format is TIFF. The dataset includes 357 data files with a total size of 50.1 GB (compressed into 23 files with a total size of 19.4 GB). Comparisons with the measured data and the GSM (Garver-Siegel-Maritorena) products from the European Space Agency show that, for 2008, the correlation between the fusion product and the measured data is 79%, while the correlation between the GSM product and the measured data is only 35%. These results demonstrate the advantage of the fused product.
Keywords: sea surface chlorophyll-a; data fusion; global dataset; 1998-2018
DOI: https://doi.org/10.3974/geodp.2021.02.15
CSTR: https://cstr.escience.org.cn/CSTR:20146.14.2021.02.15
Dataset Availability Statement:
The dataset supporting this paper was published and is accessible through the Digital Journal of Global Change Data Repository at: https://doi.org/10.3974/geodb.2021.05.05.V1 or https://cstr.escience.org.cn/CSTR:20146.11.2021.05.05.V1.
1 Introduction
Chlorophyll-a is a key water constituent. Its concentration reflects the primary productivity of a water body and serves as an index for evaluating the degree of eutrophication[1]. Therefore, knowing the status and changes of chlorophyll-a is of great significance for maintaining ecological quality. Many countries have launched marine satellites that can observe the marine environment over large areas and with high precision. These satellites provide massive data for researchers to study the marine environment, and a solid data foundation for detecting marine chlorophyll-a concentration[2-4].
Because sensors differ in spatial and temporal resolution, images from a single satellite have shortcomings in coverage, resolution and utilization. To overcome these defects and make full use of the advantages of multiple satellite images, researchers have proposed multi-source data fusion technology[5]. Multi-source data fusion can increase the data quantity, improve the spatial-temporal resolution, and improve the spatial-temporal continuity, consistency and reliability of the data[4].
Although multi-source data fusion technology has been widely used[6-8], most existing fusion products still have problems in terms of accuracy, coverage rate or time span. This paper uses the chlorophyll-a concentration products retrieved from the SeaWiFS, Terra, Aqua, MERIS and VIIRS sensors, and designs a fusion algorithm based on the wavelet transform and Kalman filtering. The fused data are provided at monthly, seasonal and annual scales and span January 1998 to December 2018.
2 Metadata of the Dataset
The dataset name, authors, geographical region, years of the dataset, temporal resolution, spatial resolution, data files, data publishing and sharing service platform, data sharing policy and other information of the Global ocean surface chlorophyll-a concentration fusion 4-km grid dataset (1998-2018)[9] are shown in Table 1.
3 Methods
3.1 Data Collection
The global oceanic chlorophyll-a concentration data come from the NASA Ocean Color website[11]. According to the life cycle of each sensor, the global sea surface chlorophyll-a concentration data from January 1998 to December 2018 were downloaded, for a total of 2,921 scenes. The input datasets are listed in Table 2, and the life cycle of each sensor is shown in Figure 1.
The measured data of chlorophyll-a concentration on the global sea surface come from SeaBASS[12]. A total of 5,711 measured points were collected; each observation file records the site, number, date, time, latitude, longitude, water depth, chlorophyll concentration and other information. The statistics of the measured data are shown in Table 3.
3.2 Data Processing
The fusion dataset of global sea surface chlorophyll-a concentration is generated with an adaptive weighted fusion algorithm. The core ideas of the algorithm are: (1) taking the regional optimum as the criterion for weight selection, so that the data collected by each sensor adaptively find their optimal weighting factor; and (2) dynamic weighted fusion based on the wavelet transform. The design of the adaptive weighted fusion algorithm is shown in Figure 2.
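To make the idea of adaptively chosen weights concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the "regional optimum" is approximated by inverse-variance weighting: within one region, each sensor is weighted by the inverse of the variance of its differences from the measured values. The paper does not give its weight formula, so the function names `adaptive_weights` and `fuse_pixel` and the weighting rule itself are hypothetical.

```python
from statistics import pvariance


def adaptive_weights(sensor_errors):
    """Return one weight per sensor, inversely proportional to error variance.

    sensor_errors maps a sensor name to a list of (satellite - measured)
    differences inside one region. The weights sum to 1.
    Assumption: inverse-variance weighting stands in for the paper's rule.
    """
    inv_var = {}
    for name, errors in sensor_errors.items():
        var = pvariance(errors)
        inv_var[name] = 1.0 / max(var, 1e-12)  # guard against zero variance
    total = sum(inv_var.values())
    return {name: v / total for name, v in inv_var.items()}


def fuse_pixel(values, weights):
    """Weighted mean of the sensors that observed this pixel (None = no data)."""
    valid = {k: v for k, v in values.items() if v is not None}
    if not valid:
        return None
    # Renormalize so that missing sensors do not bias the result.
    norm = sum(weights[k] for k in valid)
    return sum(weights[k] * v for k, v in valid.items()) / norm


weights = adaptive_weights({
    "Aqua": [0.1, -0.1, 0.1, -0.1],
    "Terra": [0.2, -0.2, 0.2, -0.2],
})
fused = fuse_pixel({"Aqua": 1.0, "Terra": 2.0}, weights)
```

In this toy case the sensor with the smaller error variance receives the larger weight, and a pixel observed by only one sensor simply keeps that sensor's value.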
The specific steps of the fusion algorithm include:
(1) First, the threshold iterative method is used to segment the image according to whether measured values are available;
(2) The regions with no measured values are divided into similar (homogeneous) regions and variable (heterogeneous) regions according to their variances;
Table 1 Metadata summary of the Global ocean surface chlorophyll-a concentration fusion 4-km grid dataset (1998-2018)
Items | Description
Dataset full name | Global ocean surface chlorophyll-a concentration fusion 4-km grid dataset (1998-2018)
Dataset short name | Global_Chla_1998-2018
Authors | Li, L. W., College of Oceanography and Space Informatics, China University of Petroleum, lilianwei78@163.com; Fu, Y. X., College of Oceanography and Space Informatics, China University of Petroleum, 624002974@qq.com; Xue, C. J. 0000-0003-3605-6578, Aerospace Information Research Institute, Chinese Academy of Sciences, Key Laboratory of Digital Earth Science, Chinese Academy of Sciences, xuecj@aircas.ac.cn; Cui, J. Y., College of Oceanography and Space Informatics, China University of Petroleum, xjuzhxcjy@163.com; Zhang, Y. Y., College of Oceanography and Space Informatics, China University of Petroleum, 1529142841@qq.com; Xu, Y. F., College of Oceanography and Space Informatics, China University of Petroleum, xuyf187627@163.com
Geographical region | Global sea surface
Year | From January 1998 to December 2018
Temporal resolution | Month / season / year
Spatial resolution | 4 km × 4 km
Data format | Geo-TIFF
Data size | 50.1 GB (19.4 GB after compression)
Data files | Chlorophyll-a concentration fusion monthly scale dataset of global ocean surface; Chlorophyll-a concentration fusion seasonal scale dataset of global ocean surface; Chlorophyll-a concentration fusion annual scale dataset of global ocean surface
Foundations | Chinese Academy of Sciences (XDA19060103)
Data publisher | Global Change Research Data Publishing & Repository, http://www.geodoi.ac.cn
Address | No. 11A, Datun Road, Chaoyang District, Beijing 100101, China
Data sharing policy | Data from the Global Change Research Data Publishing & Repository includes metadata, datasets (in the Digital Journal of Global Change Data Repository), and publications (in the Journal of Global Change Data & Discovery). Data sharing policy includes: (1) Data are openly available and can be free downloaded via the Internet; (2) End users are encouraged to use Data subject to citation; (3) Users, who are by definition also value-added service providers, are welcome to redistribute Data subject to written permission from the GCdataPR Editorial Office and the issuance of a Data redistribution license; and (4) If Data are used to compile new datasets, the "ten per cent principal" should be followed such that Data records utilized should not surpass 10% of the new dataset contents, while sources should be clearly noted in suitable places in the new dataset[10]
Communication and searchable system | DOI, CSTR, Crossref, DCI, CSCD, CNKI, SciEngine, WDS/ISC, GEOSS
Table 2 Information of input remote sensing datasets of global sea surface chlorophyll-a concentration
Sensor | Algorithm | Start time | End time | Quantity | Temporal resolution | Spatial resolution
SeaWiFS | OCI | Sep. 18, 1997 | Nov. 17, 2010 | 644 | 8 days | 9 km
Terra | OCI | Jul. 4, 2002 | Apr. 13, 2018 | 736 | 8 days | 4 km
Aqua | OCI | Jul. 4, 2002 | Apr. 13, 2018 | 736 | 8 days | 4 km
MERIS | OCI | Apr. 29, 2002 | Mar. 15, 2012 | 506 | 8 days | 4 km
VIIRS | OCI | Jan. 2, 2012 | Sep. 21, 2018 | 299 | 8 days | 4 km
(3) The data in similar regions are fused with fixed weights;
(4) In variable regions, the weighted coefficient optimization method is used to fuse the low-frequency data, while the edge-preserving smoothing method is used to fuse the high-frequency data;
(5) Finally, the fused global sea surface chlorophyll-a concentration is obtained by the inverse wavelet transform.
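Steps (4) and (5) follow the usual pattern of wavelet-domain fusion: decompose, fuse the low- and high-frequency coefficients separately, then invert the transform. The sketch below illustrates that pattern on a single image row with a one-level Haar transform written in plain Python. It is an illustration under stated assumptions and not the authors' implementation: the wavelet family, the fixed low-frequency weight `weight_a`, and the high-frequency rule (keep the coefficient with the larger magnitude, used here as a simple stand-in for the edge-preserving smoothing named in the text) are all hypothetical choices.

```python
import math


def haar_forward(signal):
    """One-level Haar transform of an even-length list -> (approx, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed signal."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def fuse_rows(row_a, row_b, weight_a=0.5):
    """Fuse two co-registered rows in the Haar domain."""
    a_lo, a_hi = haar_forward(row_a)
    b_lo, b_hi = haar_forward(row_b)
    # Low frequency: weighted average (weight_a is an illustrative constant).
    low = [weight_a * x + (1.0 - weight_a) * y for x, y in zip(a_lo, b_lo)]
    # High frequency: keep the larger-magnitude coefficient (stand-in rule).
    high = [x if abs(x) >= abs(y) else y for x, y in zip(a_hi, b_hi)]
    return haar_inverse(low, high)


fused_row = fuse_rows([1.0, 1.0, 2.0, 4.0], [3.0, 3.0, 2.0, 2.0])
```

Because the Haar transform is orthonormal, fusing a row with itself returns the row unchanged, which is a convenient sanity check when experimenting with other fusion rules.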
Figure 1 Time range of multi-sensor remote sensing dataset for chlorophyll-a concentration
Table 3 Statistics for measured data of global sea surface chlorophyll-a concentration
Sensor | Measured points
Terra | 1,576
Aqua | 925
SeaWiFS | 2,280
MERIS | 781
VIIRS | 149
Total | 5,711
The maximum value composite method and the look-up table method are combined to develop the fusion dataset of global sea surface chlorophyll-a concentration. The specific process includes:
(1) First, for each sensor, the data of every eight days are fused, giving 46 images per year; thus, five global sea surface chlorophyll-a concentration datasets are generated, i.e., Terra, Aqua, MERIS, SeaWiFS and VIIRS;
(2) For each pixel, the maximum value within each month, season and year is selected as the value of the final fusion dataset;
(3) For periods with only one data source, e.g., 1998-2002 with only the SeaWiFS sensor, the data are calculated using the look-up table method to make them consistent with the fused dataset.
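The maximum value composite in step (2) is a per-pixel reduction over all images that fall within a period. A minimal Python sketch follows; it assumes the images are equally sized 2-D grids already co-registered, with `None` marking missing data. The function name `max_value_composite` and the use of `None` as the no-data marker are illustrative choices, not taken from the paper.

```python
def max_value_composite(images):
    """Per-pixel maximum over a list of equally sized 2-D grids.

    A pixel stays None only if every input image has no data there.
    """
    rows, cols = len(images[0]), len(images[0][0])
    out = [[None] * cols for _ in range(rows)]
    for img in images:
        for r in range(rows):
            for c in range(cols):
                v = img[r][c]
                if v is not None and (out[r][c] is None or v > out[r][c]):
                    out[r][c] = v
    return out


# Two toy 8-day composites that fall within the same month.
week_a = [[0.1, None], [0.5, 0.2]]
week_b = [[0.3, None], [None, 0.1]]
monthly = max_value_composite([week_a, week_b])
```

The same function can be applied to the images of a season or a year to obtain the seasonal and annual composites.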
Figure 2 Workflow of the data fusing algorithm
4 Data Results and Validation
4.1 Data Composition
The global fusion dataset of sea surface chlorophyll-a concentration (1998-2018) includes a monthly scale dataset, a seasonal scale dataset and an annual scale dataset. The number and size of the products contained in the dataset are shown in Table 4.
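As a quick consistency check, the product counts follow directly from the 21-year coverage. The short Python sketch below assumes four seasons per year (the paper does not state the season boundaries); the function name `product_counts` is illustrative.

```python
def product_counts(first_year, last_year, seasons_per_year=4):
    """Number of monthly, seasonal and annual products for a year range."""
    n_years = last_year - first_year + 1
    return {
        "monthly": n_years * 12,
        "seasonal": n_years * seasons_per_year,
        "annual": n_years,
    }


counts = product_counts(1998, 2018)
total_files = sum(counts.values())
print(counts)       # {'monthly': 252, 'seasonal': 84, 'annual': 21}
print(total_files)  # 357
```

The total of 357 agrees with the number of data files stated in the abstract.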
4.2 Data Pre-processing
Data pre-processing establishes a unified spatio-temporal dataset, which provides the basis for designing the data fusion algorithm and the product generation algorithm. The pre-processing includes data format transformation, spatiotemporal resampling and coordinate system projection.
Table 4 Composition of global fusion dataset of sea surface chlorophyll-a concentration
Classification | Number of Products | Data Size
Monthly Global fusion dataset of sea surface Chlorophyll-a Concentration | 252 | 35 GB
Seasonal Global fusion dataset of sea surface Chlorophyll-a Concentration | 84 | 11.8 GB
Annual Global fusion dataset of sea surface Chlorophyll-a Concentration | 21 | 2.9 GB
(1) Data format transformation
The original datasets of Terra-MODIS, Aqua-MODIS, SeaWiFS and VIIRS are files in NetCDF format (suffix .nc), and the MERIS dataset is in HDF4 format. A data format conversion program written in ENVI-IDL and Python converts the multi-sensor remote sensing data of global sea surface chlorophyll-a concentration to TIFF format.
(2) Spatiotemporal resampling
To ensure the accuracy and utilization of the dataset, the bilinear interpolation method is used to interpolate the low-resolution data. After interpolation, the remote sensing data of global sea surface chlorophyll-a concentration have a spatial resolution of 4 km.
(3) Coordinate system projection
The multi-sensor remote sensing datasets of global sea surface chlorophyll-a concentration were projected into the WGS-84 geographic coordinate system.
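The bilinear interpolation used in step (2) can be sketched in plain Python as follows. This is a simplified illustration and not the authors' code: it assumes the grid has no missing values and that the corner pixels of the input and output grids coincide. A production workflow would also need to handle no-data pixels and the exact georeferencing. The function name `bilinear_resample` is hypothetical.

```python
def bilinear_resample(grid, new_rows, new_cols):
    """Resample a 2-D list of floats to (new_rows, new_cols) bilinearly."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for i in range(new_rows):
        # Map the output row index to a fractional source row.
        y = i * (rows - 1) / (new_rows - 1) if new_rows > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        fy = y - y0
        row = []
        for j in range(new_cols):
            # Map the output column index to a fractional source column.
            x = j * (cols - 1) / (new_cols - 1) if new_cols > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            fx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


coarse = [[0.0, 2.0], [4.0, 6.0]]
fine = bilinear_resample(coarse, 3, 3)
```

In this toy example the corner values are preserved and the new centre pixel is the average of its four neighbours.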
4.3 Data Results
The dataset covers three temporal scales: monthly, seasonal and annual global datasets of sea surface chlorophyll-a concentration. The annual dataset is shown in Figure 3, taking 2008 as an example.
Figure 3 Annual fusion datasets of sea surface chlorophyll-a concentration (2008)
4.4 Data Validation
(1) Comparison between the fusion dataset and the measured data
The fusion dataset was compared with the measured data. First, for each eight-day period, the maximum value in the measured dataset and its coordinates were selected, and the value at the same location was extracted from the fusion dataset image. Then the base-10 logarithm of both values was calculated, so that chlorophyll-a concentrations less than 1 become negative. Finally, the correlation between the two sets of data points was calculated. The two sets of data points match well, with a fitting degree of nearly 87%.
(2) Comparison between the fusion dataset and the original dataset
To compare the data values of the fusion dataset with those of the original datasets, two tests were carried out. The first test used the fusion dataset together with the Aqua, Terra, MERIS and SeaWiFS data from 2005 to 2010, and the second used the fusion dataset together with the Aqua, Terra and VIIRS data from 2012 to 2016. First, the remote sensing values corresponding to the measured maximum values were extracted; then the values of the fusion dataset and of the multi-source sensor data on the same day were selected, and a correlation analysis was carried out. The results are shown in Figure 4 and Figure 5.
Figure 4 Comparison between the fusion dataset and the real dataset
Figure 5 Comparison between the fusion dataset and the original dataset
Figure 6 Comparisons between the fusion datasets, GSM and the real dataset
The first test shows that the MERIS data have the best correlation with the fusion dataset, with a fitting degree of nearly 87%, while the Terra data have the worst, with a fitting degree of only 66%. The second test shows that the three kinds of data (Aqua, Terra and VIIRS) fit the fusion dataset better than the data in the first test: the minimum fitting degree is nearly 83% and the maximum is nearly 93%, indicating that the Aqua, Terra and VIIRS original datasets agree well with the fusion dataset.
(3) Comparison between the fusion dataset and prevailing datasets
The fusion dataset, the GSM dataset and the real (original) dataset in 2008 are used for comparison. There are a total of 24 matched pairs of points. The comparison in Figure 6 shows that the fitting degree between the fusion dataset and the real dataset is 79%, while that between GSM and the real dataset is only 35%. The fusion dataset therefore fits the real dataset much better than the GSM product does, indicating that the fusion dataset is of high quality and matches the real dataset well.
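The comparisons above rest on correlating log-transformed matched pairs, as described in comparison (1). The Python sketch below shows one way to compute such a correlation. It assumes the Pearson correlation coefficient on base-10 logarithms; the paper does not state whether its "fitting degree" is r, R² or another statistic, so this choice and the function name `log10_correlation` are assumptions.

```python
import math


def log10_correlation(measured, fused):
    """Pearson correlation between log10-transformed matched pairs.

    Pairs with a missing (None) or non-positive value are skipped, because
    the logarithm is undefined for them.
    """
    pairs = [
        (math.log10(m), math.log10(f))
        for m, f in zip(measured, fused)
        if m is not None and f is not None and m > 0 and f > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two valid matched pairs")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy values only; these are not data from the paper.
r = log10_correlation([0.1, 1.0, 10.0], [0.2, 2.0, 20.0])
```

Working in log space keeps the few very high coastal concentrations from dominating the statistic, which is a common reason for log-transforming chlorophyll-a before comparison.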
5 Discussion and Conclusion
Based on the chlorophyll-a data of five sensors, i.e., SeaWiFS, Terra, Aqua, MERIS and VIIRS, a data fusion algorithm is designed and a fusion dataset is generated. The fusion dataset of global sea surface chlorophyll-a concentration covers January 1998 to December 2018, with a spatial resolution of 4 km × 4 km and monthly, seasonal and annual temporal resolutions. The data size is 50.1 GB (19.4 GB after compression). Based on the real (original) dataset and the GSM dataset of ESA, three validations of the fusion dataset are carried out. The results show that the fitting degree of the fusion dataset is higher than that of the GSM dataset. Because a large amount of observation data is needed to optimize the fusion algorithm and observation data are lacking in the open ocean, the fusion dataset still has much room for improvement.
Author Contributions
Li, L. W. is responsible for data validation. Xue, C. J. is
responsible for the overall planning and design of datasets. Cui, J. Y. is
responsible for dataset fusion algorithm design and algorithm implementation.
Fu, Y. X., and Zhang, Y. Y. participated in data download and preprocessing. Xu,
Y. F. is responsible for dataset fusion. Li, L. W. and Fu, Y. X. are
responsible for data paper writing.
Conflicts of Interest
The authors declare no conflicts of interest.
References
[1] Wang, J. L., Zhang, Y. J., Yang, F., et al. The seasonal chlorophyll-a dataset of Poyang Lake, China (2009-2012) [J]. Journal of Global Change Data & Discovery, 2017, 1(2): 208-215.
[2] Li, X. X., Zhang, T. L., Tian, L., et al. Merging chlorophyll-a data from multiple ocean color sensors in South China Sea [J]. Journal of Remote Sensing, 2015, 19(4): 680-689.
[3] Shi, Y. N., Zhang, T. L., Shi, L. J., et al. Objective analysis for merging multisensory chlorophyll-a data [J]. Haiyang Xuebao, 2016(3): 82-87.
[4] Cui, J. Y., Liu, X. D., Yue, Z. Y., et al. Multi-source ocean remote sensing chlorophyll data fusion [J]. Remote Sensing Information, 2020, 35(3): 31-36.
[5] Chen, Y. Z., Wang, X. Q., Wu, B., et al. Ocean color data merging based on adaptive weighted averaging [J]. Remote Sensing Technology and Application, 2012, 27(3): 333-338.
[6] Gregg, W. W., Conkright, M. E. Global seasonal climatologies of ocean chlorophyll: blending in situ and satellite data for the coastal zone color scanner era [J]. Journal of Geophysical Research, 2001, 106(C2): 2499-2516.
[7] Qu, L. Q., Guan, L., He, M. X. The global availabilities of SeaWiFS, MODIS and merged chlorophyll-a data [J]. Periodical of Ocean University of China, 2006, 36(2): 321-326.
[8] Chen, Z. Y., Zheng, G. Q., Wang, X. Q., et al. Retrieval of chlorophyll a concentration with multi-sensor data by GSM01 merging algorithm [J]. Journal of Geo-information Science, 2013, 15(6): 911-917.
[9] Li, L. W., Fu, Y. X., Xue, C. J., et al. Global ocean surface chlorophyll-a concentration fusion 4-km grid dataset (1998-2018) [J/DB/OL]. Digital Journal of Global Change Data Repository, 2021. https://doi.org/10.3974/geodb.2021.05.05.V1. https://cstr.escience.org.cn/CSTR:20146.11.2021.05.05.V1.
[10] GCdataPR Editorial Office. GCdataPR data sharing policy [OL]. https://doi.org/10.3974/dp.policy.2014.05 (Updated 2017).
[11] MERIS Reprocessing Information. https://oceancolor.gsfc.nasa.gov/cgi/browse.pl.
[12] SeaBASS. https://seabass.gsfc.nasa.gov/.