The Role of the Learning Rate in Layered Neural Networks with ReLU Activation Function

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Using the statistical physics framework, we study the online learning dynamics of a particular class of shallow feed-forward neural networks with ReLU activation. By expanding the activation function in terms of Hermite polynomials, we derive analytical results for the evolution of the order parameters at any learning rate. We compare our results with online gradient descent simulations and show that our method captures the typical learning curves. We also present results on how the learning rate affects the overall behavior of the network and its equilibria, identifying different learning regimes and critical values of the learning rate.
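To make the two ingredients of the abstract concrete, the sketch below first estimates the Hermite-polynomial coefficients of the ReLU activation by Gauss-Hermite quadrature, then runs a plain online-gradient-descent simulation in a soft-committee teacher-student setup on i.i.d. Gaussian inputs and reads off the order parameters (student-teacher overlaps R and student-student overlaps Q). This is only an illustrative sketch: the dimensions N, K, M, the learning rate eta, the 1/sqrt(N) normalization, and the teacher-student setting are assumptions, not details taken from the paper.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)
relu_prime = lambda z: (z > 0).astype(float)

# --- Hermite expansion of the ReLU activation -------------------------------
# Coefficients c_k in ReLU(z) = sum_k c_k He_k(z), with He_k the probabilists'
# Hermite polynomials and z ~ N(0, 1), estimated by Gauss-Hermite quadrature.
x, w = He.hermegauss(80)                  # nodes/weights for weight exp(-z^2 / 2)
w = w / np.sqrt(2.0 * np.pi)              # normalize to the standard Gaussian measure
for k in range(5):
    basis = np.zeros(k + 1); basis[k] = 1.0
    c_k = np.sum(w * relu(x) * He.hermeval(x, basis)) / math.factorial(k)
    print(f"c_{k} ≈ {c_k: .4f}")

# --- Online gradient descent in a teacher-student setup ---------------------
# Illustrative sizes and learning rate (not taken from the paper).
N, K, M, eta = 500, 2, 2, 0.5
steps = 200 * N

B = rng.standard_normal((M, N))                   # fixed teacher weights
J = rng.standard_normal((K, N)) / np.sqrt(N)      # student weights, small initial overlap

for t in range(steps):
    xi = rng.standard_normal(N)                   # one fresh i.i.d. example per step
    y = relu(B @ xi / np.sqrt(N)).sum()           # teacher output
    h = J @ xi / np.sqrt(N)                       # student pre-activations
    s = relu(h).sum()                             # student output
    # plain SGD step on the squared error 0.5 * (y - s)^2
    J += eta * (y - s) * relu_prime(h)[:, None] * xi[None, :] / np.sqrt(N)

# Order parameters: student-teacher overlaps R and student-student overlaps Q.
R = J @ B.T / N
Q = J @ J.T / N
print("R =\n", R)
print("Q =\n", Q)
```

Tracking R and Q along the trajectory (rather than only at the end) is what allows such simulations to be compared with the analytical order-parameter dynamics described in the paper.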
Original language: English
Title of host publication: European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Subtitle of host publication: 33rd ESANN 2025
Editors: Michel Verleysen
Publisher: Ciaco - i6doc.com
Pages: 437-442
Number of pages: 6
DOIs
Publication status: Published - Apr-2025
Event: European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning - Brugge, Belgium
Duration: 23-Apr-2025 to 25-Apr-2025
Conference number: 35
https://www.esann.org

Conference

Conference: European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Abbreviated title: ESANN 2025
Country/Territory: Belgium
City: Brugge
Period: 23/04/2025 to 25/04/2025
Internet address: https://www.esann.org
