Nick Sudarsky will present the MSE talk "Dual Exponential Momentum Filtering for First Order Gradient Based Optimization" on Thursday, April 24, 2025 at 3:45 pm in Friend Center 005.

Thesis adviser: Szymon Rusinkiewicz; Reader: Adam Finkelstein

Abstract:
Recent years have seen machine learning applied to an increasingly diverse range of optimization problems, and a number of generally applicable optimizer algorithms have been created to solve these problems more effectively and efficiently. Among the most broadly applicable and widely used of these algorithms are those combining first-order gradient-based optimization with momentum, which typically filters gradient estimates (or a function thereof) through a single exponentially weighted running mean. This work proposes a series of novel first-order gradient descent optimizers that employ momentum filtering based on a linear combination of two exponentially weighted running gradient means. Through a combination of theoretical analysis and practical evaluation, we find that these dual exponential optimizers have the potential to outperform their single exponential counterparts in both the number of iterations needed to reduce training cost and the stability with which training cost decreases.
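
To make the core idea concrete, the sketch below shows what a dual exponential momentum update might look like: two exponential moving averages of the gradient with different decay rates, combined linearly before the parameter step. The specific hyperparameters (beta_a, beta_b, alpha) and update rule here are illustrative assumptions, not the talk's actual formulation.

```python
# Minimal sketch of dual exponential momentum filtering.
# beta_a, beta_b, and alpha are hypothetical hyperparameters for
# illustration; the thesis's actual update rule may differ.
import numpy as np

def dual_exponential_momentum_step(params, grad, state,
                                   lr=0.01, beta_a=0.9, beta_b=0.99, alpha=0.5):
    """One gradient-descent step using a linear combination of two
    exponentially weighted running means of the gradient."""
    m_a, m_b = state
    # Two exponential moving averages of the gradient, with different decays.
    m_a = beta_a * m_a + (1.0 - beta_a) * grad
    m_b = beta_b * m_b + (1.0 - beta_b) * grad
    # Linear combination of the two filtered gradients drives the update.
    momentum = alpha * m_a + (1.0 - alpha) * m_b
    params = params - lr * momentum
    return params, (m_a, m_b)

# Usage: minimize f(x) = x^2 starting from x = 5.
x = np.array(5.0)
state = (np.zeros_like(x), np.zeros_like(x))
for _ in range(200):
    grad = 2.0 * x  # gradient of x^2
    x, state = dual_exponential_momentum_step(x, grad, state)
print(x)  # approaches 0
```

With alpha = 1 or alpha = 0 this reduces to ordinary single exponential momentum, so the single-filter approach can be viewed as a special case of the dual-filter one under these assumptions.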