Journal of Global Change Data & Discovery, 2025, 9(4): 469–474


Citation: Tian, J., Ma, H. L. Dataset Development of China Root-zone Soil Moisture Based on the TCH Method (2018–2021) [J]. Journal of Global Change Data & Discovery, 2025, 9(4): 469–474. DOI: 10.3974/geodp.2025.04.04.

Dataset Development of China Root-zone Soil Moisture Based on the TCH Method (2018–2021)

Tian, J.1*  Ma, H. L.2*

1. Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

2. River and Lake Protection Center, Ordos Water Conservancy Bureau, Ordos 017000, China

 

Abstract: Root zone soil moisture (RZSM) is a key variable linking surface water cycling with vegetation ecological processes, and it serves as an important indicator for medium- to long-term drought monitoring, agricultural water management, and ecohydrological assessment. However, spatiotemporally continuous RZSM data remain difficult to obtain because of limitations in direct observation and uncertainties in model simulations. In this study, RZSM data from 2 land surface models and 3 reanalysis datasets were integrated using the Three-Cornered Hat (TCH) method to produce a daily, 0.25° root zone (0–100 cm) soil moisture dataset for China's mainland covering 2018–2021. The dataset is archived in .tif format. Validation using observations from 2,061 soil moisture monitoring stations across China indicates that the fused dataset achieves a median RMSE of 0.077 m³/m³, a median correlation coefficient (r) of 0.5, a bias peak close to 0, and a median unbiased RMSE (ubRMSE) of 0.04 m³/m³. These results demonstrate that the dataset is robust and reliable, providing valuable support for regional-scale drought monitoring, eco-hydrological analyses, and agricultural applications.

Keywords: root zone soil moisture; three-cornered hat method; data fusion

DOI: https://doi.org/10.3974/geodp.2025.04.04

Dataset Availability Statement:

The dataset supporting this paper was published and is accessible through the Digital Journal of Global Change Data Repository at: https://doi.org/10.3974/geodb.2025.08.08.V1.

1 Introduction

Root zone soil moisture (RZSM) refers to the soil water content within the main rooting depth of vegetation and is a key variable that links surface water cycling with vegetation ecological processes. RZSM not only directly affects plant water availability, evapotranspiration, photosynthetic rate, and crop yield, but also plays a crucial role in regulating land-atmosphere energy and water exchanges within the climate system. Compared with surface soil moisture, RZSM exhibits stronger buffering and memory capacities, making it a more reliable indicator for medium- to long-term drought monitoring, agricultural water management, and ecohydrological assessment[1,2].

With droughts becoming more frequent and severe and extreme hydrological events intensifying under climate change, accurate RZSM information is crucial for improving drought monitoring, guiding agricultural irrigation management, and assessing ecosystem resilience. However, due to the scarcity of in situ observations, the limited penetration depth of remote sensing, and the high uncertainty associated with model simulations, current RZSM data still face significant limitations[3,4]. Therefore, developing fused RZSM datasets that integrate multiple data sources and techniques (e.g., remote sensing, meteorological forcing, in situ observations, and machine learning) is a critical foundation for advancing integrated hydrological, ecological, and agricultural studies.

In this study, RZSM data from 5 land surface model and reanalysis products were fused using the Three-Cornered Hat (TCH) method to produce a root-zone (0–100 cm) soil moisture dataset for China's mainland. This dataset provides an important data resource for regional drought monitoring, eco-hydrological process analysis, and agricultural water management applications.

2 Metadata of the Dataset

The metadata of the Root zone (0–100 cm) soil moisture 0.25°/daily dataset over China (2018–2021)[5], including the title, author, geographical region, spatial and temporal resolution, data size, data file, etc., are summarized in Table 1.

3 Methods

This study fused the root-zone soil moisture (RZSM) data from two land surface models and three reanalysis products (Table 2) using the TCH method to produce a new RZSM product (0–100 cm) for China. The dataset was systematically evaluated against observations from more than 2,000 soil moisture monitoring stations across the country. The methodological framework consists of two main components:

(1) Computation of 0–100 cm root-zone soil moisture: Multi-layer soil moisture data from each model and observation site were aggregated using a depth-weighted averaging method. The weighting coefficients were determined from the distance between the centers of 2 adjacent soil layers as a proportion of the total 100 cm depth, so that each weight represents the layer's relative contribution to the 0–100 cm column. This weighted averaging process was applied independently to each dataset, resulting in comparable 0–100 cm RZSM estimates across all models and observation sites.

(2) Generation of the fused product: After obtaining the 0–100 cm RZSM estimates from the 5 data sources, the TCH method was applied to quantify error variances and derive optimal weights for each dataset, thereby producing a fused product without the need for ground truth data. This approach objectively evaluates the relative error levels of multiple data sources and adjusts the weighting scheme accordingly, thereby improving the accuracy, consistency, and robustness of the fused product. The resulting dataset combines the complementary strengths of different models while reducing the uncertainties associated with any single source.

All data were averaged to daily temporal resolution using arithmetic means and resampled to a 0.25° spatial resolution via bilinear interpolation, ensuring consistent spatiotemporal representation across the entire domain.

Table 1  Metadata summary of the Root zone (0–100 cm) soil moisture 0.25°/daily dataset over China (2018–2021)

| Item | Description |
| --- | --- |
| Dataset full name | Root zone (0–100 cm) soil moisture 0.25°/daily dataset over China (2018–2021) |
| Dataset short name | RZSM_China_2018-2021 |
| Author | Tian, J., Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, tianj.04b@igsnrr.ac.cn |
| Geographical region | China's mainland |
| Year | 2018–2021 |
| Temporal resolution | Day |
| Spatial resolution | 0.25° |
| Data format | .tif |
| Data size | 99.4 MB (compressed) |
| Data file | Mean root zone (0–100 cm) soil moisture |
| Foundations | Department of Science and Technology of Inner Mongolia Autonomous Region, Ordos Science and Technology Bureau (ZD20232303); National Natural Science Foundation of China (42071327) |
| Computing environment | Python |
| Data publisher | Global Change Research Data Publishing & Repository, http://www.geodoi.ac.cn |
| Address | No. 11A, Datun Road, Chaoyang District, Beijing 100101, China |
| Data sharing policy | (1) Data are openly available and can be freely downloaded via the Internet; (2) End users are encouraged to use Data subject to citation; (3) Users, who are by definition also value-added service providers, are welcome to redistribute Data subject to written permission from the GCdataPR Editorial Office and the issuance of a Data redistribution license; and (4) If Data are used to compile new datasets, the "ten percent principle" should be followed such that Data records utilized should not surpass 10% of the new dataset contents, while sources should be clearly noted in suitable places in the new dataset[6] |
| Communication and searchable system | DOI, CSTR, Crossref, DCI, CSCD, CNKI, SciEngine, WDS, GEOSS, PubScholar, CKRSC |

Table 2  Multi-layer soil moisture products from the 5 land surface model and reanalysis datasets

| Data name | Spatial resolution | Temporal resolution (h) | Depth of soil moisture (cm) |
| --- | --- | --- | --- |
| ERA5[7] | 0.1° | 3 | 0–7, 7–28, 28–100, 100–289 |
| MERRA-2[8] | 0.5° × 0.625° | 3 | 0–5, 0–100 |
| CFSR[9] | 0.205° × 0.204° | 1 | 0–10, 10–40, 40–100, 100–200 |
| GLDAS-NOAH2.1[10] | 0.25° | 3 | 0–10, 10–40, 40–100, 100–200 |
| SMAP Level 4[11] | 9 km | 3 | 0–5, 0–100 |
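The daily averaging and bilinear resampling applied to all inputs can be sketched as follows. This is a minimal illustration rather than the authors' processing code: the function names, the synthetic 0.5° source grid, and the use of NumPy are assumptions made for the example. On a regular latitude/longitude grid, bilinear interpolation can be carried out as two successive one-dimensional linear interpolations, which is what the sketch does.

```python
import numpy as np

def daily_mean(sub_daily, steps_per_day):
    """Arithmetic daily mean of a (time, lat, lon) stack of sub-daily fields."""
    sub_daily = np.asarray(sub_daily, dtype=float)
    n_steps, n_lat, n_lon = sub_daily.shape
    if n_steps % steps_per_day:
        raise ValueError("time axis must contain whole days")
    n_days = n_steps // steps_per_day
    return sub_daily.reshape(n_days, steps_per_day, n_lat, n_lon).mean(axis=1)

def bilinear_resample(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinear interpolation on a regular grid as two 1-D linear passes.

    Coordinates must be increasing; np.interp holds the edge value for
    targets outside the source range instead of extrapolating.
    """
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

# Hypothetical 0.5-degree source grid and 0.25-degree target grid.
src_lat = np.linspace(30.0, 32.0, 5)
src_lon = np.linspace(100.0, 102.0, 5)
dst_lat = np.linspace(30.0, 32.0, 9)
dst_lon = np.linspace(100.0, 102.0, 9)

# Synthetic soil moisture field (m3/m3), linear in latitude and longitude.
field = 0.2 + 0.01 * (src_lat[:, None] - 30.0) + 0.02 * (src_lon[None, :] - 100.0)

# Two days of 3-hourly data (8 steps per day) with a small drift in time.
stack = field[None, :, :] + 0.001 * np.arange(16)[:, None, None]

daily = daily_mean(stack, steps_per_day=8)
resampled = bilinear_resample(daily[0], src_lat, src_lon, dst_lat, dst_lon)
print(daily.shape, resampled.shape)
```

Because the synthetic field is linear in both coordinates, the bilinear result reproduces it exactly on the finer grid, which makes the example easy to check by hand.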

3.1 Algorithm

(1) Computation of 0?C100 cm root-zone soil moisture

As described above, 0–100 cm RZSM was calculated as the weighted average of multi-layer soil moisture data, where the weights correspond to the proportional thickness of each soil layer within the 0–100 cm depth. For example, in the GLDAS-NOAH2.1 product, 3 layers are available within 0–100 cm: 0–10 cm, 10–40 cm, and 40–100 cm, with respective weights of 0.1, 0.3, and 0.6. The RZSM for 0–100 cm is computed as:

$$SM_{0-100} = 0.1 \, SM_{0-10} + 0.3 \, SM_{10-40} + 0.6 \, SM_{40-100} \tag{1}$$

where $SM_{0-100}$ is the averaged 0–100 cm soil moisture, and $SM_{0-10}$, $SM_{10-40}$, and $SM_{40-100}$ are the soil moisture values of the respective layers. The same method was applied to other datasets in Table 2.
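A minimal sketch of the thickness-weighted average, using the layer boundaries listed in Table 2, is given below. The function names and the sample soil moisture values are hypothetical and serve only to illustrate the arithmetic; layers below 100 cm receive zero weight.

```python
def layer_weights(layers, max_depth=100.0):
    """Fraction of the 0..max_depth column occupied by each (top, bottom) layer in cm."""
    weights = []
    for top, bottom in layers:
        overlap = max(0.0, min(bottom, max_depth) - min(top, max_depth))
        weights.append(overlap / max_depth)
    return weights

def root_zone_mean(values, layers, max_depth=100.0):
    """Thickness-weighted mean soil moisture over 0..max_depth cm."""
    weights = layer_weights(layers, max_depth)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no layer overlaps the requested depth range")
    return sum(w * v for w, v in zip(weights, values)) / total

# Layer boundaries (cm) as listed in Table 2.
GLDAS_LAYERS = [(0, 10), (10, 40), (40, 100), (100, 200)]
ERA5_LAYERS = [(0, 7), (7, 28), (28, 100), (100, 289)]

# Hypothetical layer values in m3/m3, for illustration only.
sample = [0.20, 0.25, 0.30, 0.35]

print(layer_weights(GLDAS_LAYERS))         # weights 0.1, 0.3, 0.6 and 0 for the layer below 100 cm
print(root_zone_mean(sample, GLDAS_LAYERS))
print(layer_weights(ERA5_LAYERS))
```

The sketch assumes non-overlapping layers. MERRA-2 and SMAP Level 4 already list a 0–100 cm layer in Table 2, so presumably that layer can be used directly without weighting.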

(2) TCH data fusion method

The TCH method, proposed by Tavella and Premoli[12], evaluates the relative errors among multiple data sources and performs weighted fusion without requiring a true reference value. Let $X_i$ ($i = 1, 2, \ldots, N$) represent the time series of the ith RZSM product, where N is the number of products (here, N = 5). Each $X_i$ consists of the true value $X_t$ and an error term $\varepsilon_i$:

$$X_i = X_t + \varepsilon_i \tag{2}$$

where $i = 1, 2, \ldots, N$. To estimate $\varepsilon_i$, differences between N–1 products and a randomly chosen reference product $X_R$ are computed as:

$$Y_i = X_i - X_R = \varepsilon_i - \varepsilon_R \tag{3}$$

where $i = 1, 2, \ldots, N-1$. The covariance between the errors $\varepsilon_i$ and $\varepsilon_j$ is:

$$r_{ij} = \mathrm{cov}(\varepsilon_i, \varepsilon_j) = \frac{1}{M-1} \left(\varepsilon_i - \bar{\varepsilon}_i\right)^T \left(\varepsilon_j - \bar{\varepsilon}_j\right) \tag{4}$$

where $i, j = 1, 2, \ldots, N-1$. M is the number of temporal samples. $\bar{\varepsilon}_i$ and $\bar{\varepsilon}_j$ are the mean errors of the ith and jth RZSM products, respectively. The superscript T denotes the transpose. Accordingly, the covariance between $Y_i$ and $Y_j$, denoted $s_{ij}$, can be expressed as:

$$s_{ij} = \mathrm{cov}(Y_i, Y_j) = r_{ij} - r_{iR} - r_{jR} + r_{RR} \tag{5}$$

where $i, j = 1, 2, \ldots, N-1$. $r_{ij}$, $r_{iR}$, $r_{jR}$, and $r_{RR}$ represent the covariance between $\varepsilon_i$ and $\varepsilon_j$, the covariance between $\varepsilon_i$ and $\varepsilon_R$, the covariance between $\varepsilon_j$ and $\varepsilon_R$, and the variance of $\varepsilon_R$, respectively, and are defined as in Equation 4. However, Equation 5 cannot be solved directly for these terms because the number of unknowns exceeds the number of equations. Galindo and Palacio (1999) formulated this as a constrained minimization problem based on the Kuhn-Tucker theorem and solved it[13]. The objective function F and the constraint condition H are:

[Equations (6)–(10), defining the objective function F and the constraint condition H, are not preserved in this copy.]

where R represents the error covariance matrix; S represents the covariance matrix of the difference series $Y_i$; K represents the identity matrix. $r_{ij}$ ($i, j = 1, 2, \ldots, N$) represents the error covariance between the ith and jth products. All pairwise interactions among the products yield an error covariance matrix. The weights used in fusing soil moisture products are determined by the inverse of this error covariance matrix. This matrix not only estimates the uncertainty of the TCH method but also accounts for error correlation. According to the Gauss-Markov theorem, this method yields a weighted average with the minimum variance.

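$$W = \left(J^T C^{-1} J\right)^{-1} J^T C^{-1} \tag{11}$$

$$X_{weighted} = W X \tag{12}$$

where C denotes the error covariance matrix obtained from the calculations above; W represents the weight matrix; X is the matrix formed by stacking the N products $X_1, \ldots, X_N$; $X_{weighted}$ is the fused estimate obtained by applying the weight matrix to the products; and J is the design matrix, a vector consisting entirely of ones, i.e., $[1, \ldots, 1]^T$. The weight of each product in the data fusion is derived from these parameters.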
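The two ends of the TCH procedure, forming the covariance of the differences against a reference product and converting an error covariance matrix into minimum-variance weights, can be illustrated with a short NumPy sketch. It is not the authors' implementation. The constrained minimization of Galindo and Palacio that estimates the error covariance matrix is not reproduced here; instead, the matrix C is assumed known (a diagonal matrix built from the synthetic error standard deviations). All function names and the synthetic series are hypothetical.

```python
import numpy as np

def difference_covariance(products, ref=-1):
    """Covariance matrix S of the differences Y_i = X_i - X_R.

    products has shape (N, M): N products, M time steps.
    """
    products = np.asarray(products, dtype=float)
    diffs = np.delete(products, ref, axis=0) - products[ref]
    return np.cov(diffs)  # shape (N-1, N-1), normalised by M-1

def fusion_weights(C):
    """Minimum-variance weights W = (J^T C^-1 J)^-1 J^T C^-1, with J a vector of ones."""
    C = np.asarray(C, dtype=float)
    J = np.ones((C.shape[0], 1))
    Cinv_J = np.linalg.solve(C, J)  # C^-1 J; C is symmetric, so J^T C^-1 is its transpose
    return (Cinv_J / (J.T @ Cinv_J)).ravel()

def fuse(products, C):
    """Weighted combination of the products, one fused value per time step."""
    return fusion_weights(C) @ np.asarray(products, dtype=float)

# Synthetic example: three products with independent errors of known size.
rng = np.random.default_rng(0)
M = 500
truth = 0.25 + 0.05 * np.sin(np.linspace(0.0, 8.0 * np.pi, M))
sigmas = np.array([0.02, 0.03, 0.04])
errors = rng.normal(0.0, sigmas[:, None], size=(3, M))
products = truth + errors

S = difference_covariance(products)  # uses the last product as reference
R = np.cov(errors)                   # known here only because the data are synthetic

C = np.diag(sigmas ** 2)             # ASSUMED known error covariance matrix
weights = fusion_weights(C)
fused = fuse(products, C)
print(weights, weights.sum())
```

In this synthetic setting the sample covariances satisfy the relation between the difference covariances and the error covariances exactly, and the weights sum to one, with the lowest-error product receiving the largest weight. With real products the errors are not observable, which is why the error covariance matrix has to be estimated through the constrained minimization described in the text.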

3.2 Technical Workflow

 

Figure 1  Flowchart of the dataset development

 

The development of this dataset involved 6 steps (Figure 1):

(1) Preprocessing of soil moisture data products: For each dataset, daily mean values were computed, and all data were resampled to a 0.25° spatial resolution to ensure consistency. Then, the 0–100 cm root-zone soil moisture was derived using a depth-weighted averaging method.

(2) Preprocessing of in situ soil moisture observations: Daily mean soil moisture values were calculated for each observation site, and the 0–100 cm soil moisture was derived using the same depth-weighted approach as applied to the other soil moisture products.

(3) Estimation of error variance using the TCH method: Error variances for the preprocessed soil moisture products were calculated based on the TCH method.

(4) Weight determination: The relative weights of each product were calculated according to their estimated error variances, reflecting their reliability in the subsequent fusion process.

(5) Data fusion: The individual soil moisture products were combined using the derived weights to produce the fused RZSM dataset.

(6) Validation: The fused dataset was validated against in situ soil moisture observations from monitoring stations to evaluate its accuracy and robustness.

4 Data Results and Validation

4.1 Dataset Composition

The Root zone (0–100 cm) soil moisture 0.25°/daily dataset over China (2018–2021) provides daily root zone (0–100 cm) soil moisture data for China's mainland from 2018 to 2021. It has a spatial resolution of 0.25° and is archived in .tif format. A total of 1,461 files are included in the dataset.
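The file count is consistent with one file per calendar day from 1 January 2018 to 31 December 2021, including the leap day in 2020. The short check below makes that arithmetic explicit. The one-file-per-day layout is an assumption inferred from the daily resolution, and the helper name is hypothetical.

```python
from datetime import date, timedelta

def daily_dates(start, end):
    """Return every calendar date from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=k) for k in range(n_days)]

dates = daily_dates(date(2018, 1, 1), date(2021, 12, 31))
print(len(dates))  # 1461 = 365 + 365 + 366 + 365
```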

4.2 Data Results

Figure 2 illustrates the multi-year monthly mean distribution of RZSM across China's mainland (January–December). The spatial patterns exhibit distinct regional differences, with soil moisture generally decreasing from the humid southeast to the arid northwest. Southeastern China maintains higher soil moisture levels due to abundant precipitation, humid climate, and dense vegetation cover, all of which enhance soil moisture retention. In contrast, northwestern China experiences much lower soil moisture because of its arid climate, limited rainfall, and high evaporation rates, which constrain effective soil moisture recharge. In high-altitude regions such as the Qinghai-Xizang Plateau, soil moisture dynamics are further shaped by cold temperatures, permafrost conditions, and complex hydrothermal processes.

Temporally, RZSM exhibits clear seasonal variations: (1) Spring (March–May): As temperatures rise and precipitation increases, soil moisture gradually replenishes following the winter dry period. (2) Summer (June–August): Concentrated rainfall induces the annual peak in soil moisture, representing the main recharge season. (3) Autumn (September–November): With declining temperatures and reduced precipitation, soil moisture begins to decrease. (4) Winter (December–February): Under low temperatures and snow-dominated precipitation, combined with low evaporation, soil moisture remains relatively stable at a low level.
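The multi-year monthly means shown in Figure 2 amount to grouping the daily values by calendar month across all four years. A minimal sketch for a single grid cell follows. The function name and the synthetic series are hypothetical, and the authors' own aggregation code may differ.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_climatology(dates, values):
    """Multi-year mean for each calendar month (1..12) of a daily series."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in zip(dates, values):
        sums[day.month] += value
        counts[day.month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}

# Synthetic daily series for one grid cell, 2018-01-01 to 2021-12-31 (m3/m3).
start = date(2018, 1, 1)
days = [start + timedelta(days=k) for k in range(1461)]
series = [0.20 + 0.01 * d.month for d in days]

climatology = monthly_climatology(days, series)
print(climatology[1], climatology[7])
```

Applying the same grouping to every grid cell would give one map per calendar month.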

 

 

Figure 2   Maps of multi-year monthly average values of RZSM of China

4.3 Data Validation

Validation of the TCH-fused dataset was performed using observations from 2,061 soil moisture stations across China's mainland (Figures 3, 4). The observation network is denser in eastern and central China, while coverage is sparser in the west. Figure 3 shows no pronounced spatial trend, although higher correlation coefficients (r) are observed in northern and southern China. RMSE values peak between 0.05 and 0.10 m³/m³ with a median of 0.077 m³/m³, indicating moderate errors at most stations, though a high-value tail reflects a few larger deviations. The correlation coefficient (r) peaks between 0.5 and 0.8, with a median of 0.5, suggesting generally good linear consistency, though a small portion of low correlations persists, particularly in the northwestern region. Bias values cluster around 0, mostly within –0.05 m³/m³ to 0.05 m³/m³, implying no significant systematic bias at the national scale, though slight underestimation is evident in parts of North China. ubRMSE values peak between 0.03 and 0.05 m³/m³, with a median of 0.04 m³/m³, indicating small random errors for most stations, although a few outliers exhibit higher uncertainty. Overall, these validation results demonstrate that the TCH-based fusion method provides reliable and robust performance at the national scale. The remaining discrepancies are mainly associated with the inherent accuracy of the input datasets. Additionally, scale mismatches may contribute to uncertainty, as station observations represent point-scale conditions, whereas the fused dataset corresponds to a 0.25° grid, encompassing a substantially larger spatial extent.
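For reference, the four station-level metrics can be computed as in the sketch below. It assumes the common conventions that bias is the mean of estimate minus observation and that ubRMSE is the RMSE after removing the mean bias, so that ubRMSE² = RMSE² − bias². The paper does not state its sign convention, and the sample values are invented for illustration.

```python
import math

def validation_metrics(estimates, observations):
    """Return bias, RMSE, ubRMSE and Pearson r for paired series."""
    n = len(estimates)
    if n != len(observations) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    diffs = [e - o for e, o in zip(estimates, observations)]
    bias = sum(diffs) / n
    mse = sum(d * d for d in diffs) / n
    rmse = math.sqrt(mse)
    ubrmse = math.sqrt(max(mse - bias * bias, 0.0))  # guard against tiny negative rounding
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    r = cov / math.sqrt(var_e * var_o)
    return {"bias": bias, "rmse": rmse, "ubrmse": ubrmse, "r": r}

# Invented station series in m3/m3, for illustration only.
observed = [0.20, 0.25, 0.30, 0.35]
fused = [0.22, 0.24, 0.33, 0.37]

metrics = validation_metrics(fused, observed)
print(metrics)
```

Computing these metrics per station and then taking the median across stations gives summary values of the kind reported above.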

 

 

Figure 3  Maps of station-based validation metrics for the TCH-fused soil moisture data over China

 

 

Figure 4  Histogram statistics of site verification results

5 Discussion and Conclusion

Using the TCH fusion method, multiple RZSM datasets were integrated and validated against observations from 2,061 soil moisture stations across China. The median values of the key validation metrics for the fused dataset were 0.077 m³/m³ for RMSE, 0.5 for the correlation coefficient, 0.008 m³/m³ for bias, and 0.04 m³/m³ for ubRMSE. The distributions peaked within the ranges of 0.05–0.10 m³/m³ for RMSE, 0.5–0.8 for r, near 0 for bias, and 0.03–0.05 m³/m³ for ubRMSE. These results demonstrate that the fused dataset exhibits good reliability and that the TCH method performs robustly, making it well-suited for large-scale applications across China. Considering the scarcity and observational challenges of in situ RZSM measurements, the resulting dataset provides valuable support for hydrological, agricultural, and ecological research. Future improvements could further enhance fusion accuracy by incorporating additional data sources and optimizing the selection of input variables to better capture spatiotemporal variability in root zone soil moisture.

 

Author Contributions

Ma, H. L. contributed to the overall design of the dataset development; Tian, J. processed and analyzed the data and wrote the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

 

References

[1] Tobin, K. J., Torres, R., Crow, W. T. Multi-decadal analysis of root-zone soil moisture applying the exponential filter across CONUS [J]. Hydrology and Earth System Sciences, 2017, 21(9): 4403–4417.

[2] Zohaib, M., Kim, H., Choi, M. Evaluating the patterns of spatiotemporal trends of root zone soil moisture in major climate regions in East Asia [J]. Journal of Geophysical Research-Atmospheres, 2017, 122(15): 7705–7722.

[3] Xu, L., Chen, N. C., Zhang, X., et al. In-situ and triple-collocation based evaluations of eight global root zone soil moisture products [J]. Remote Sensing of Environment, 2021, 254: 112248.

[4] Tian, J., Zhang, Y. Q. Comprehensive validation of seven root zone soil moisture products at 1153 ground sites across China [J]. International Journal of Digital Earth, 2023, 16(2): 4008–4022.

[5] Tian, J. Root zone (0–100 cm) soil moisture 0.25°/daily dataset over China (2018–2021) [J/DB/OL]. Digital Journal of Global Change Data Repository, 2025. https://doi.org/10.3974/geodb.2025.08.08.V1.

[6] GCdataPR Editorial Office. GCdataPR data sharing policy [OL]. https://doi.org/10.3974/dp.policy.2014.05 (Updated 2017).

[7] Bell, B., Hersbach, H., Simmons, A., et al. The ERA5 global reanalysis: preliminary extension to 1950 [J]. Quarterly Journal of the Royal Meteorological Society, 2021, 147(741): 4186–4227.

[8] Reichle, R. H., Draper, C. S., Liu, Q., et al. Assessment of MERRA-2 land surface hydrology estimates [J]. Journal of Climate, 2017, 30(8): 2937–2960.

[9] Saha, S., Moorthi, S., Wu, X. R., et al. The NCEP climate forecast system version 2 [J]. Journal of Climate, 2014, 27(6): 2185–2208.

[10] Rodell, M., Houser, P. R., Jambor, U., et al. The global land data assimilation system [J]. Bulletin of the American Meteorological Society, 2004, 85(3): 381–394.

[11] Reichle, R. H., Liu, Q., Koster, R. D., et al. Version 4 of the SMAP Level-4 soil moisture algorithm and data product [J]. Journal of Advances in Modeling Earth Systems, 2019, 11(10): 3106–3130.

[12] Tavella, P., Premoli, A. Estimating the instabilities of N clocks by measuring differences of their readings [J]. Metrologia, 1994, 30(5): 479–486.

[13] Galindo, F. J., Palacio, J. Estimating the instabilities of N correlated clocks [C]. In Proceedings of the 31st Annual Precise Time and Time Interval Systems and Applications Meeting, 1999: 285–296.
