
Building a Drought Impact Database for the Netherlands and Cross-Border Catchments


Overview: Help us build a structured, high-quality historical database of drought impacts in the Netherlands by applying natural language processing (NLP) to newspaper archives. You'll extract detailed impact information from Dutch and selected Belgian and German news sources covering areas that influence the Rhine, Meuse, and Vecht river systems. The goal is to support long-term drought planning and to fill a critical gap in local-scale impact records, going beyond what existing manual reporting systems capture.

Key challenges:

  • Curate and pre-process archived articles from Dutch and regional Belgian and German newspapers
  • Use topic modelling and classification to filter drought-relevant content
  • Extract structured information:
    • Date, location, sector, impact type, and response
    • Duration, spatial extent, severity, and monetized impacts (if available)
  • Normalize results and build a searchable, geotagged database
  • Optionally, support spatial/temporal analysis of historical drought patterns
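
To make the target of the extraction step concrete, here is a minimal sketch of what a single database record could look like. The class name, field names, and example values are assumptions for illustration only; the actual schema is part of the project and may look quite different.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass
class DroughtImpactRecord:
    """One extracted drought impact (hypothetical schema, not a project standard)."""
    article_id: str                      # identifier of the source article
    event_date: date                     # date the impact was reported for
    location: str                        # place name as found in the text
    sector: str                          # e.g. agriculture, shipping, nature
    impact_type: str                     # category from an impact typology
    response: Optional[str] = None       # measure taken, if mentioned
    duration_days: Optional[int] = None
    spatial_extent: Optional[str] = None
    severity: Optional[str] = None
    monetized_impact_eur: Optional[float] = None
    latitude: Optional[float] = None     # filled in by a later geotagging step
    longitude: Optional[float] = None

    def to_row(self) -> dict:
        """Return a flat dict with the date as an ISO string, ready for CSV or JSON."""
        row = asdict(self)
        row["event_date"] = self.event_date.isoformat()
        return row


# Invented example values, not taken from any real article.
record = DroughtImpactRecord(
    article_id="example-001",
    event_date=date(2018, 7, 24),
    location="Twente",
    sector="agriculture",
    impact_type="irrigation restriction",
)
print(record.to_row())
```

Keeping optional attributes as `None` rather than guessing values makes it explicit which details an article did not report.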

You'll learn about:

  • NLP pipelines for processing, filtering, and structuring large volumes of news text
  • Topic modelling (e.g., BERTopic) to explore evolving themes and filter relevant articles
  • Information extraction techniques (e.g., named entity recognition, pattern matching, date/number normalization) to identify key impact attributes
  • Drought impact typologies and socio-hydrological indicators
  • How to build a usable, extensible dataset from unstructured archives
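
As a small illustration of pattern matching with date and number normalization, the sketch below pulls Dutch-language dates and euro amounts out of a sentence using only Python's standard library. The function names, the regular expressions, and the example sentence are assumptions made for this illustration; real newspaper text will need far more robust handling (abbreviated months, the € sign, OCR errors, relative dates).

```python
import re
from datetime import date

# Dutch month names mapped to month numbers.
DUTCH_MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4, "mei": 5, "juni": 6,
    "juli": 7, "augustus": 8, "september": 9, "oktober": 10,
    "november": 11, "december": 12,
}

DATE_RE = re.compile(
    r"\b(\d{1,2})\s+(" + "|".join(DUTCH_MONTHS) + r")\s+(\d{4})\b",
    re.IGNORECASE,
)

# Dutch number format: '.' separates thousands, ',' is the decimal mark.
AMOUNT_RE = re.compile(
    r"(\d+(?:\.\d{3})*(?:,\d+)?)\s*(miljoen|miljard|duizend)?\s*euro",
    re.IGNORECASE,
)

MULTIPLIERS = {"duizend": 1e3, "miljoen": 1e6, "miljard": 1e9}


def extract_dates(text: str) -> list[date]:
    """Find dates written like '24 juli 2018' and return them as date objects."""
    found = []
    for day, month, year in DATE_RE.findall(text):
        try:
            found.append(date(int(year), DUTCH_MONTHS[month.lower()], int(day)))
        except ValueError:
            continue  # skip impossible dates such as '31 februari 2018'
    return found


def extract_euro_amounts(text: str) -> list[float]:
    """Find amounts like '2,5 miljoen euro' or '1.500.000 euro' and return euros."""
    amounts = []
    for number, multiplier in AMOUNT_RE.findall(text):
        value = float(number.replace(".", "").replace(",", "."))
        if multiplier:
            value *= MULTIPLIERS[multiplier.lower()]
        amounts.append(value)
    return amounts


# Invented example sentence, not a quote from a real article.
sentence = "Op 24 juli 2018 meldden boeren een schade van 2,5 miljoen euro."
print(extract_dates(sentence))
print(extract_euro_amounts(sentence))
```

Rule-based patterns like these are usually combined with named entity recognition for locations and organisations, since place names cannot be enumerated in a regular expression as easily as month names.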

Ideal for: A student with an interest in natural language processing, climate or environmental data, and real-world applications of AI in water management. You’re excited by the challenge of turning messy historical data sources into a structured resource that can support national and regional drought policy and planning.

Contact: Hans Korving and Tom Heskes