- Adam adapts the learning rate for each parameter in your model (using running estimates of the gradient's first and second moments)
- Karpathy likes to use Adam with a learning rate of 3e-4 to set baselines (see the PyTorch sketch after this list)
- “In my experience Adam is much more forgiving to hyperparameters, including a bad learning rate.”
- Q: Doesn’t Adam set the learning rate for me? Why do I have to set one?
- A: The value you provide is the base (global) learning rate. Adam scales each parameter’s step around that value using its moment estimates, but it never chooses the base rate for you (see the update-rule sketch below)
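
A minimal PyTorch sketch of that kind of baseline setup. The `nn.Linear` model and random batch are placeholders (not from the notes); the point is just where the 3e-4 base learning rate gets passed to `torch.optim.Adam`.

```python
import torch
import torch.nn as nn

# Placeholder model and batch, just to have something to optimize.
model = nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)

# 3e-4 is the base learning rate; betas/eps are left at Adam's defaults.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

loss = nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()  # Adam adapts each parameter's step, starting from the base lr
```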
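To make the Q&A concrete, here is a sketch of a single Adam update for one parameter tensor, written from the standard Adam formulas rather than any library's internals; `adam_step` and its argument names are illustrative.

```python
import torch

def adam_step(param, grad, m, v, t, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter tensor (illustrative, not library code)."""
    # Running estimates of the gradient's first and second moments.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized estimates (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # lr is the base step size you chose; the per-parameter scaling
    # m_hat / (sqrt(v_hat) + eps) is what Adam works out on its own.
    new_param = param - lr * m_hat / (v_hat.sqrt() + eps)
    return new_param, m, v

# Example: one step on a single weight tensor.
w = torch.randn(3)
g = torch.randn(3)            # pretend gradient
m0 = torch.zeros_like(w)
v0 = torch.zeros_like(w)
w, m0, v0 = adam_step(w, g, m0, v0, t=1)
```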