
A Murder Mystery: How does Adam kill neurons with his slingshots?

The Adam optimizer is widely used to train neural networks. However, it is not flawless.

One known issue is that Adam can cause large loss spikes late in training, once the loss has become very low; these spikes are called 'slingshots'.

In recent work (currently under review), we were, to our knowledge, the first to observe that these slingshots appear to introduce dead neurons. This project aims to investigate that observation further.

This project is suitable as either a master's research internship or a master's thesis.

The research questions are:

  • Do Adam's loss slingshots indeed always kill neurons? By what mechanism?
  • Do specific neurons die? If so, do they have certain properties?
  • (Optional) Investigate the dead neurons and their potentially regularizing effect on tasks that exhibit grokking behavior (see references below).
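To make the notion of a 'dead neuron' concrete, below is a minimal NumPy sketch of the setting: a tiny ReLU network trained with a hand-rolled Adam update, after which we count hidden units whose activation is zero for every input. This is a hypothetical toy illustration (network size, data, and hyperparameters are made up), not code from the paper under review.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical).
X = rng.normal(size=(256, 4))
y = (X @ rng.normal(size=(4,))).reshape(-1, 1)

H = 32  # hidden width
params = {
    "W1": rng.normal(scale=0.5, size=(4, H)),
    "b1": np.zeros(H),
    "W2": rng.normal(scale=0.5, size=(H, 1)),
    "b2": np.zeros(1),
}

# Adam state: first and second moment estimates per parameter.
m = {k: np.zeros_like(p) for k, p in params.items()}
v = {k: np.zeros_like(p) for k, p in params.items()}
beta1, beta2, lr, eps = 0.9, 0.999, 1e-2, 1e-8

def forward(p, X):
    z1 = X @ p["W1"] + p["b1"]
    h = np.maximum(z1, 0.0)  # ReLU
    return z1, h, h @ p["W2"] + p["b2"]

for t in range(1, 501):
    z1, h, out = forward(params, X)
    err = out - y  # gradient of 0.5 * squared error w.r.t. the output
    grads = {"W2": h.T @ err / len(X), "b2": err.mean(axis=0)}
    dh = (err @ params["W2"].T) * (z1 > 0)  # backprop through ReLU
    grads["W1"] = X.T @ dh / len(X)
    grads["b1"] = dh.mean(axis=0)

    # Standard Adam update with bias correction.
    for k in params:
        m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
        v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
        m_hat = m[k] / (1 - beta1 ** t)
        v_hat = v[k] / (1 - beta2 ** t)
        params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)

# A hidden unit is "dead" if its pre-activation is non-positive for every
# input: its ReLU output, and hence its gradient, is identically zero.
z1, _, _ = forward(params, X)
dead = int(np.sum(np.all(z1 <= 0, axis=0)))
print(f"dead hidden neurons: {dead} / {H}")
```

Dead units in this sense can no longer receive gradient signal, which is why the research questions above ask whether slingshots create them and which units are affected.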

Familiarity with deep learning and the Adam optimizer is required, e.g. from the course Deep Learning Part 1 (or an equivalent).

Supervision: Stijn van den Beemt (daily), Twan van Laarhoven

Contact: Stijn van den Beemt.

Timeframe: dependent on thesis or internship, but generally flexible.

References:

  • Slingshots: Thilak et al. (2022): The slingshot mechanism: An empirical study of adaptive optimizers and the grokking phenomenon.
  • Dead neurons: Van den Beemt et al. (2026): Currently under review. Pre-print available on request.
  • Grokking: Power et al. (2022): Grokking: Generalization beyond overfitting on small algorithmic datasets.