MARTHE: Scheduling the Learning Rate Via Online Hypergradients / Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi. - Electronic. - (2020), pp. 2119-2125. (Paper presented at the International Joint Conference on Artificial Intelligence).

MARTHE: Scheduling the Learning Rate Via Online Hypergradients

Paolo Frasconi
2020

Abstract

We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rate schedule – the hypergradient. Based on this, we introduce MARTHE, a novel online algorithm guided by cheap approximations of the hypergradient that uses past information from the optimization trajectory to simulate future behaviour. It interpolates between two recent techniques, RTHO [Franceschi et al., 2017] and HD [Baydin et al., 2018], and is able to produce learning rate schedules that are more stable, leading to models that generalize better.
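
To make the idea of adapting a learning rate by gradient information concrete, here is a minimal sketch of an HD-style update on a toy quadratic objective. It is not MARTHE and is not taken from the paper's code: the objective, the constants, and the names `hd_sgd`, `alpha0`, and `beta` are illustrative assumptions. The sketch uses the fact that, for plain gradient descent, the derivative of the current loss w.r.t. the previous step's learning rate is minus the dot product of two consecutive gradients. For simplicity a single objective plays the role of both training and validation error here.

```python
import numpy as np

def grad(theta, A):
    # Gradient of the toy objective f(theta) = 0.5 * theta^T A theta.
    return A @ theta

def hd_sgd(theta0, A, alpha0=0.01, beta=1e-3, steps=100):
    """Gradient descent whose learning rate is itself adapted online.

    alpha0 is the initial learning rate and beta the step size used
    to update it (both are hypothetical values chosen for this toy).
    """
    theta = np.asarray(theta0, dtype=float).copy()
    alpha = alpha0
    prev_g = np.zeros_like(theta)  # no previous gradient at step 0
    alphas = []
    for _ in range(steps):
        g = grad(theta, A)
        # d f(theta_t) / d alpha_{t-1} = -(g . prev_g), so a descent
        # step on alpha adds beta * (g . prev_g).
        alpha = alpha + beta * float(g @ prev_g)
        theta = theta - alpha * g
        prev_g = g
        alphas.append(alpha)
    return theta, alphas

A = np.diag([1.0, 10.0])
theta_final, alphas = hd_sgd([1.0, 1.0], A)
final_loss = 0.5 * theta_final @ A @ theta_final
```

The list `alphas` is the learning rate schedule produced along the way. In this toy run it rises quickly from its small initial value while consecutive gradients agree and then settles, which is the short-horizon behaviour that, according to the abstract, MARTHE extends by also using past information from the optimization trajectory.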
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
International Joint Conference on Artificial Intelligence
Goal 17: Partnerships for the goals
Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi
Files in this record:
File: 0293.pdf (open access)
Description: Main article
Type: Publisher's PDF (version of record)
License: All rights reserved
Size: 500.43 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1214703
Citations
  • PMC: n/a
  • Scopus: 6
  • ISI: n/a