Gradient descent (GD) is a convex optimization scheme. Its general form is the update rule

w := w - lr * dJ(w)/dw

where lr is the learning rate and J(w) is the cost function. It attempts to find a point of minima: as w approaches a minimum, dJ/dw → 0, so the update steps shrink and the solution converges. In practice, by noticing the change in the gr...
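The update rule above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the cost J(w) = (w - 3)^2, its gradient 2(w - 3), the learning rate, and the step count are all assumed example choices.

```python
# Minimal sketch of gradient descent on a simple convex cost,
# J(w) = (w - 3)**2, whose gradient is dJ/dw = 2 * (w - 3).
# All specific values here (cost, lr, steps) are illustrative assumptions.

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w := w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

grad_J = lambda w: 2 * (w - 3)          # gradient of J(w) = (w - 3)**2
w_min = gradient_descent(grad_J, w0=0.0)
print(w_min)                            # approaches the minimizer w = 3
```

Because J is convex, each step moves w toward the unique minimum, and the shrinking gradient near w = 3 makes the updates smaller and smaller, matching the convergence argument above.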