An optimizer derived from Halpern’s method for enhanced neural network convergence and reduced carbon emissions

Ritacco E.;
2025-01-01

Abstract

This work examines Halpern’s iterative method as a means to create new neural network optimizers that could surpass many existing approaches. We introduce HalpernSGD, a novel network optimizer that leverages Halpern’s technique to improve the convergence rate of Stochastic Gradient Descent (SGD), leading to reduced carbon emissions in neural network training. Combining Halpern’s iterative method with Gradient Descent (GD) yields an algorithm with a quadratic rate of convergence relative to plain GD. Experimental comparisons between their stochastic versions show that HalpernSGD is more efficient than SGD, requiring fewer training epochs and thus reducing energy consumption and carbon footprint while maintaining model accuracy. We also compare HalpernSGD with ADAM, identifying potential improvements to ADAM’s approach in terms of stability and convergence, and suggesting a future direction for the development of combined optimizers. The implementation of HalpernSGD is available at: https://github.com/EttoreRitacco/HalpernSGD.git
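To make the construction described in the abstract concrete, the following is a minimal sketch of how the classical Halpern anchoring rule can be wrapped around a gradient step. The anchor schedule beta = 1/(k+2), the function name halpern_sgd_sketch, and the toy quadratic objective are illustrative assumptions, not the authors' reference implementation; the actual HalpernSGD details are in the repository linked above.

import numpy as np

def halpern_sgd_sketch(grad, x0, lr=0.1, n_steps=100):
    # Hypothetical Halpern-style gradient descent:
    #   x_{k+1} = beta_k * x_0 + (1 - beta_k) * T(x_k),
    # where T(x) = x - lr * grad(x) is the plain gradient-step operator
    # and beta_k = 1 / (k + 2) is the standard Halpern anchor schedule.
    x = x0.copy()
    for k in range(n_steps):
        beta = 1.0 / (k + 2)                 # anchor weight, vanishing over time
        t_x = x - lr * grad(x)               # plain (or stochastic) gradient step
        x = beta * x0 + (1.0 - beta) * t_x   # Halpern anchoring toward the start point x_0
    return x

# Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
if __name__ == "__main__":
    x_final = halpern_sgd_sketch(lambda x: x, np.array([3.0, -2.0]))
    print(x_final)  # the iterates shrink toward the minimizer [0, 0]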
Files in this record:
s10844-025-00969-x.pdf (open access)
Type: Publisher's version (PDF)
License: Creative Commons
Size: 5.78 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1312530