Gradient descent
Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function.
Description
To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point. If instead one takes steps proportional to the positive of the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent.
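Written as an update rule (standard notation, not taken from this page): starting from an initial guess x_0, one iterates

    x_{n+1} = x_n - γ ∇f(x_n)

where γ > 0 is the step size (also called the learning rate); for a sufficiently small γ, f(x_{n+1}) ≤ f(x_n), so the iterates move downhill toward a local minimum.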
Gradient descent is also known as steepest descent, or the method of steepest descent.
Gradient descent should not be confused with the method of steepest descent for approximating integrals.
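The following Python sketch illustrates the idea in one dimension. The quadratic test function, step size, stopping tolerance, and iteration cap are all illustrative assumptions, not part of the original article.

```python
# Minimal one-dimensional gradient descent sketch (illustrative only).

def gradient_descent(grad, x0, step_size=0.1, tol=1e-6, max_iter=1000):
    """Minimize a function by stepping against its gradient."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:        # gradient near zero: (local) minimum reached
            break
        x -= step_size * g      # step proportional to the NEGATIVE gradient
    return x

# Example: f(x) = (x - 3)^2 has gradient f'(x) = 2(x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)  # approximately 3.0

# Flipping the sign of the update (x += step_size * g) gives gradient ascent,
# which climbs toward a local maximum instead.
```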
See also
- BFGS method
- Conjugate gradient method
- Delta rule
- First-order method
- Gradient
- Iterative method
- Mathematical optimization
- Nelder–Mead method
- Preconditioning
- Rprop
- Stochastic gradient descent
- Wolfe conditions
External links
- Gradient descent @ Wikipedia: https://en.wikipedia.org/wiki/Gradient_descent