04 Jul Towards automated ensemble discharge forecasts for any river in the world on a federated compute and data infrastructure
The use-case was developed by Stichting Deltares in close collaboration with SURF and runs on the SURF infrastructure. In the near future, it will also be deployed at EODC to test the ease of redeployment, scalability and performance of the workflow. For questions, please contact: Björn.Backeberg@deltares.nl, Frederiek.email@example.com or Joost.Buitink@deltares.nl.
With increasing droughts and increased water use worldwide, there is a great need to understand and accurately predict river conditions and water availability on a seasonal time scale. Within C-SCALE, we are developing a workflow solution that will allow users to produce seasonal ensemble forecasts of river discharges for any river basin in the world.
With the increasing availability of global datasets at high spatial and temporal resolutions, it is now possible to provide hydrological simulations for river basins anywhere in the world, even in basins where river discharges are not measured using in situ observations.
Within this use-case, we aim to develop a workflow solution on the European Open Science Cloud compute and data infrastructure that will ultimately enable seasonal river discharge forecasts for any river basin in the world. Our workflow tests the interoperability, scalability and performance of cloud, HTC and hybrid (cloud/HTC) compute and data resources.
The use case builds on locally developed tools, combining these in a workflow that is operationalised in the cloud. We use the global ERA5 reanalysis product and SEAS5 seasonal meteorological forecasts from the Copernicus Climate Data Store as input for the wflow_sbm hydrological model. The workflow automatically downloads the required input data for the model domain, resamples the data to the required model grids, and runs a 50-member ensemble forecast simulation. The workflow is triggered every month when new SEAS5 forecasts become available and results in forecasts of drought conditions for several months ahead.
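To illustrate how a 50-member ensemble might be turned into a statement about drought conditions, here is a minimal sketch. It is not the workflow's actual post-processing: the function name `summarise_ensemble`, the low-flow threshold and the synthetic discharge values are all assumptions made for the example.

```python
import statistics


def summarise_ensemble(discharges, low_flow_threshold):
    """Summarise one forecast time step across ensemble members.

    discharges: discharge values in m3/s, one per ensemble member.
    low_flow_threshold: hypothetical drought threshold in m3/s.
    """
    if not discharges:
        raise ValueError("need at least one ensemble member")
    ordered = sorted(discharges)
    # Count the members that fall below the low-flow threshold.
    below = sum(1 for q in ordered if q < low_flow_threshold)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        # Fraction of members below the threshold, read as a probability.
        "p_low_flow": below / len(ordered),
    }


# Synthetic example: 50 members with discharges 1000, 1010, ..., 1490 m3/s.
members = [1000.0 + 10.0 * i for i in range(50)]
summary = summarise_ensemble(members, low_flow_threshold=1100.0)
print(summary)
```

With these synthetic values, 10 of the 50 members fall below the threshold, so the sketch reports a low-flow probability of 0.2 for that time step.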
The workflow has been deployed for the Rhine river basin, and in the near future it will be generalised to allow for easy adaptation to other river basins of interest and to allow for deployment on other cloud infrastructures.
The original data and modelling chain ran on standalone Windows infrastructure. A first step was the selection of the preferred cloud infrastructure, as several European Open Science Cloud providers exist that could host this use-case. The HTC compute solution offered by SURF was identified as an appropriate solution for these types of (embarrassingly parallel) ensemble forecast simulations. For the implementation on SURF’s HTC cluster, the WFLOW model was containerised using Singularity. The individual workflow components were connected in an orchestration tool and deployed and tested on the HTC infrastructure. Currently, the workflow and its individual components are being made more generic. This is required for easy redeployment to other European cloud infrastructures, so that users can choose their preferred provider and run the ensemble forecasts for their own river basin of interest.
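Because the ensemble members do not depend on each other, they can be dispatched independently. The sketch below shows that idea with a local worker pool that builds one `singularity exec` command per member. It is an illustration only: the image name `wflow.sif`, the bind path, and the `run_member.sh` entry point are hypothetical placeholders, and on an HTC cluster the members would more likely be submitted through the cluster's batch scheduler than through a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

N_MEMBERS = 50  # ensemble size stated in the text


def build_command(member, image="wflow.sif", workdir="/data/forecast"):
    """Build a `singularity exec` command for one ensemble member.

    The image name, bind path and `run_member.sh` are hypothetical.
    """
    return [
        "singularity", "exec",
        "--bind", f"{workdir}:/work",
        image,
        "run_member.sh", f"/work/member_{member:02d}",
    ]


def run_member(member, dry_run=True):
    """Run one member; in dry-run mode only return the command."""
    cmd = build_command(member)
    if not dry_run:
        # Each member runs as a separate container process.
        subprocess.run(cmd, check=True)
    return cmd


def run_ensemble(n_members=N_MEMBERS, max_workers=8, dry_run=True):
    """Members are independent, so they can be mapped over a worker pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda m: run_member(m, dry_run), range(n_members)))


commands = run_ensemble()
print(len(commands), " ".join(commands[0]))
```

The default dry-run mode only assembles the commands, which makes it possible to inspect them before anything is executed.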
Support from C-SCALE
The Land Surface Drought Analysis (LSDA) use-case is part of the C-SCALE project, and activities started at the beginning of the project. The use case has been implemented in close collaboration between Deltares and SURF and will in the near future be deployed on EODC’s cloud infrastructure. The user and provider teams have had constructive interaction, progressively working towards improved solutions addressing the use case requirements. Whilst some of the technologies needed for the use case were clearly new to the providers, they should be commended for their willingness to explore new technologies and find solutions to support the use case deployment.
C-SCALE services used
Deltares is now using the C-SCALE services provided by SURF, i.e. the HTC cluster. Access is facilitated through the SURF Research Access Manager (SRAM), and the provider ensured that crontab was available to schedule the workflow orchestration.
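One way a cron-scheduled job could make sure the monthly forecast runs exactly once is sketched below. This is an assumed design, not the project's actual orchestration: the crontab entry in the comment, the script path, the marker-file naming and the `release_day` value are all hypothetical.

```python
from datetime import date
from pathlib import Path
import tempfile

# Hypothetical crontab entry (daily at 06:00) calling this script:
#   0 6 * * * python3 /path/to/check_and_run.py


def marker_for(state_dir, today):
    """Marker file recording that this month's forecast was processed."""
    return Path(state_dir) / f"forecast_{today:%Y%m}.done"


def should_run(state_dir, today, release_day):
    """True if the monthly forecast is due and not yet processed.

    release_day is an assumed day of the month on which new forecasts appear.
    """
    return today.day >= release_day and not marker_for(state_dir, today).exists()


def mark_done(state_dir, today):
    """Record that the forecast for this month has been processed."""
    marker_for(state_dir, today).touch()


with tempfile.TemporaryDirectory() as tmp:
    day = date(2022, 7, 15)
    first = should_run(tmp, day, release_day=13)   # due, not yet processed
    mark_done(tmp, day)
    second = should_run(tmp, day, release_day=13)  # already processed
    early = should_run(tmp, date(2022, 8, 1), release_day=13)  # not yet due
print(first, second, early)
```

Running the check daily and guarding it with a marker file keeps the job idempotent: a missed or repeated cron invocation does not skip or duplicate a monthly forecast.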
More information and relevant publications
Backeberg, B., Z. Šustr, E. Fernández, G. Donchyts, A. Haag, J. B. R. Oonk, G. Venekamp, B. Schumacher, S. Reimond and C. Chatzikyriakou (accepted). An open compute and data federation as an alternative to monolithic infrastructures for big Earth data analytics. Big Earth Data, https://doi.org/10.1080/20964471.2022.2094953
Buitink, J., J. Langemeijer, R. Oonk, F. Sperna Weiland and B. Backeberg (accepted). Automated monthly river forecasts: A Copernicus eoSC Analytics Engine use case. In EGI Conference 2022, September 2022, Prague, Czech Republic.
Testimony by Deltares researchers
Why did you decide to work with C-SCALE? What challenges does the collaboration with C-SCALE solve for Deltares?
- Easily deployable, interoperable workflows that facilitate globally applicable, locally relevant ensemble forecasts of seasonal river discharges are difficult to achieve.
- Working with the IT and data experts of the EOSC providers allows us to implement state-of-the-art solutions towards achieving this, thereby enabling stakeholders to replicate the developments made here for any river basin of interest.
Which C-SCALE services are you using and what are you using them for?
- C-SCALE HTC compute, provided by SURF, to run the ensemble forecasts
- C-SCALE CephFS filesystem, provided by SURF and directly connected to the C-SCALE HTC
- C-SCALE AAI (SRAM), provided by SURF, to enable easy access
- Support from IT and data experts from SURF