Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/139464
Type: Journal article
Title: On how data are partitioned in model development and evaluation: Confronting the elephant in the room to enhance model generalization
Author: Maier, H.R.
Zheng, F.
Gupta, H.
Chen, J.
Mai, J.
Savic, D.
Loritz, R.
Wu, W.
Guo, D.
Bennett, A.
Jakeman, A.
Razavi, S.
Zhao, J.
Citation: Environmental Modelling and Software, 2023; 167:105779-1-105779-8
Publisher: Elsevier BV
Issue Date: 2023
ISSN: 1364-8152
1873-6726
Statement of Responsibility: Holger R. Maier, Feifei Zheng, Hoshin Gupta, Junyi Chen, Juliane Mai, Dragan Savic, Ralf Loritz, Wenyan Wu, Danlu Guo, Andrew Bennett, Anthony Jakeman, Saman Razavi, Jianshi Zhao
Abstract: Models play a pivotal role in advancing our understanding of Earth’s physical nature and environmental systems, aiding in their efficient planning and management. The accuracy and reliability of these models heavily rely on data, which are generally partitioned into subsets for model development and evaluation. Surprisingly, how this partitioning is done is often not justified, even though it determines what model we end up with, how we assess its performance and what decisions we make based on the resulting model outputs. In this study, we shed light on the paramount importance of meticulously considering data partitioning in the model development and evaluation process, and its significant impact on model generalization. We identify flaws in existing data-splitting approaches and propose a forward-looking strategy to effectively confront the “elephant in the room”, leading to improved model generalization capabilities.
Keywords: Model development
Model evaluation
Data partitioning
Data splitting
Calibration
Validation
Uncertainty
Earth systems
Rights: © 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
DOI: 10.1016/j.envsoft.2023.105779
Grant ID: http://purl.org/au-research/grants/arc/DE210100117
Published version: http://dx.doi.org/10.1016/j.envsoft.2023.105779
Appears in Collections: Civil and Environmental Engineering publications

Files in This Item:
File: hdl_139464.pdf
Description: Published version
Size: 2 MB
Format: Adobe PDF

