
Mathematics behind the parameters update rule:

8 min read · Jan 2, 2020

This article covers the content discussed in the Sigmoid Neuron module of the Deep Learning course and all the images are taken from the same module.

In this article, we discuss the mathematics behind the parameters update rule.

Our goal is to find an algorithm that, at any time step, tells us how to change the value of w so that the loss computed at the new value is less than the loss at the current value.

If we keep doing this at every step, the loss keeps decreasing no matter where we start, until it eventually reaches its minimum value.

The Taylor series tells us that if we know the value of a function at a certain point x, then its value at a new point x + δx that is very close to x can be written in terms of f(x), the derivatives of f at x, and powers of δx.
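Written out, the standard Taylor expansion referred to above takes the following form. Here f', f'' and f''' denote the first, second and third derivatives of f, and f is assumed to be smooth enough near x for the expansion to hold:

$$
f(x + \delta x) = f(x) + \left[ f'(x)\,\delta x + \frac{f''(x)}{2!}\,\delta x^{2} + \frac{f'''(x)}{3!}\,\delta x^{3} + \cdots \right]
$$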

We can see that the Taylor series relates the function value at the new point (x + δx) to the function value at the current point (x).

In fact, the value at the new point is equal to the value at the current point plus some additional terms, all of which depend on δx.

Now, if δx is such that the quantity added to f(x) (the sum of all the terms that depend on δx) is negative, then we can be sure that the function value at the new point is less than the function value at the current point.
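To make this concrete, here is a minimal Python sketch. It is only an illustration: the toy function f(x) = x², the starting point x = 1.0 and the two step sizes are assumptions chosen for the example and are not taken from the course. Because this f is quadratic, its Taylor series stops after the second-order term, so the added terms can be computed exactly.

```python
def f(x):
    """Toy function standing in for the loss (an assumption for illustration)."""
    return x ** 2


def f_prime(x):
    """First derivative of the toy function."""
    return 2 * x


def f_double_prime(x):
    """Second derivative of the toy function (constant for a quadratic)."""
    return 2.0


def added_terms(x, dx):
    """Terms the Taylor series adds to f(x); exact here because f is quadratic."""
    return f_prime(x) * dx + 0.5 * f_double_prime(x) * dx ** 2


x = 1.0
for dx in (-0.1, 0.1):
    extra = added_terms(x, dx)
    # When `extra` is negative, f(x + dx) ends up below f(x).
    print(f"dx={dx:+.1f}  added terms={extra:+.4f}  "
          f"f(x)={f(x):.4f}  f(x+dx)={f(x + dx):.4f}")
```

With the step δx = -0.1 the added terms are negative and the function value drops, while with δx = +0.1 they are positive and the function value rises. The task of the update rule is therefore to pick a δx of the first kind.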


Written by Parveen Khurana
