Building a Drought Impact Database for the Netherlands and Cross-Border Catchments

Overview: Help us build a structured, high-quality historical database of drought impacts in the Netherlands by applying NLP to newspaper archives. You'll extract detailed information from Dutch and selected Belgian and German news sources covering areas that influence the Rhine, Meuse, and Vecht river systems. The goal is to support long-term drought planning and to fill a critical gap in local-scale impact records that existing manual reporting systems do not cover.

Key challenges:

Curate and pre-process archived articles from Dutch and regional BE/DE newspapers
Use topic modelling and classification to filter drought-relevant content
Extract structured information:
- Date, location, sector, impact type, and response
- Duration, spatial extent, severity, and monetized impacts (if available)
Normalize results and build a searchable, geotagged database
Optionally, support spatial/temporal analysis of historical drought patterns
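
To make the target of these steps concrete, here is a minimal sketch of what one normalized database record and a simple search over records could look like. The class name, field names, and the `search` helper are illustrative assumptions, not a fixed project schema, and the example rows are made-up placeholders rather than real observations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DroughtImpact:
    """One extracted impact record (hypothetical schema, for illustration only)."""
    article_date: date
    location: str
    sector: str
    impact_type: str
    response: Optional[str] = None
    duration_days: Optional[int] = None
    severity: Optional[str] = None
    damage_eur: Optional[float] = None   # monetized impact, if reported
    lat: Optional[float] = None          # geotag, if the location was resolved
    lon: Optional[float] = None


def search(records, sector=None, year=None):
    """Return the records that match the given sector and/or publication year."""
    hits = []
    for r in records:
        if sector is not None and r.sector != sector:
            continue
        if year is not None and r.article_date.year != year:
            continue
        hits.append(r)
    return hits


# Made-up placeholder rows, not real observations.
records = [
    DroughtImpact(date(2018, 8, 3), "Enschede", "agriculture", "irrigation ban"),
    DroughtImpact(date(2018, 7, 20), "Nijmegen", "shipping", "reduced load capacity"),
    DroughtImpact(date(1976, 7, 1), "Maastricht", "agriculture", "crop loss"),
]

agri_2018 = search(records, sector="agriculture", year=2018)
print([r.location for r in agri_2018])
```

Keeping optional attributes as `None` when an article does not report them avoids guessing values and leaves room to extend the schema later.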

You'll learn about:

NLP pipelines for processing, filtering, and structuring large volumes of news text
Topic modelling (e.g., BERTopic) to explore evolving themes and filter relevant articles
Information extraction techniques (e.g., named entity recognition, pattern matching, date/number normalization) to identify key impact attributes
Drought impact typologies and socio-hydrological indicators
How to build a usable, extensible dataset from unstructured archives
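
As a small taste of the pattern matching and date/number normalization mentioned above, the sketch below pulls Dutch-language dates and euro amounts out of a sentence using only the Python standard library. The regular expressions, the function names, and the example sentence are assumptions made for illustration; real newspaper text will need far more robust handling.

```python
import re
from datetime import date

DUTCH_MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4, "mei": 5, "juni": 6,
    "juli": 7, "augustus": 8, "september": 9, "oktober": 10, "november": 11,
    "december": 12,
}

# Matches e.g. "3 augustus 2018".
DATE_RE = re.compile(
    r"\b(\d{1,2})\s+(" + "|".join(DUTCH_MONTHS) + r")\s+(\d{4})\b",
    re.IGNORECASE,
)

# Matches e.g. "1,5 miljoen euro", "250.000 euro", "12000 euro"
# (Dutch convention: '.' groups thousands, ',' is the decimal separator).
AMOUNT_RE = re.compile(
    r"(\d{1,3}(?:\.\d{3})*(?:,\d+)?|\d+(?:,\d+)?)\s*(miljoen|miljard)?\s*euro",
    re.IGNORECASE,
)
MULTIPLIERS = {"miljoen": 1e6, "miljard": 1e9}


def find_dates(text):
    """Return all full Dutch dates in the text as datetime.date objects."""
    found = []
    for day, month, year in DATE_RE.findall(text):
        try:
            found.append(date(int(year), DUTCH_MONTHS[month.lower()], int(day)))
        except ValueError:
            continue  # skip impossible dates such as "31 februari 2018"
    return found


def find_amounts_eur(text):
    """Return all euro amounts in the text, normalized to floats."""
    found = []
    for raw, unit in AMOUNT_RE.findall(text):
        value = float(raw.replace(".", "").replace(",", "."))
        found.append(value * MULTIPLIERS.get(unit.lower(), 1.0))
    return found


# Invented example sentence, not taken from a real article.
sentence = ("Op 3 augustus 2018 meldden boeren een schade van "
            "1,5 miljoen euro door het sproeiverbod.")
print(find_dates(sentence))
print(find_amounts_eur(sentence))
```

Rule-based patterns like these are easy to inspect and can serve as a baseline next to learned approaches such as named entity recognition.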

Ideal for: A student with an interest in natural language processing, climate or environmental data, and real-world applications of AI in water management. You’re excited by the challenge of turning messy historical data sources into a structured resource that can support national and regional drought policy and planning.

Contact: Hans Korving and Tom Heskes

Department of Data Science
