
Partial Derivative - Chain Rule

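What is the chain rule?

The chain rule gives the derivative of a function that is built from other functions: differentiate the outer function, then multiply by the derivative of the inner one.

Consider the function y = m * x + b

I am going to work with a function that we can use in our machine learning problems.

Consider a loss function h = [y - y_hat]^2, which is the squared error. Averaged over many samples, it is called the mean squared error.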


y - depends on x (through m and b)

y_hat - predicted value, held fixed when we differentiate

h - called the loss

 

So why two functions again? OK.

h = [m * x + b - y_hat]^2 looks complicated, right? This is where we will use the chain rule. How, then?

Say the partial derivatives of y with respect to m and b are

[1] dy/dm = x , dy/db = 1

Do a substitution: h = u^2 , u = [y - y_hat]

[2] 

dh/du = 2u = 2 * (y - y_hat)

du/dy = 1

so dh/dy = dh/du * du/dy    --------[chain rule]

= 2 * (y - y_hat) *1

dh/dy = 2 * (y - y_hat)
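
To see step [2] with numbers, here is a minimal sketch. The values of m, b, x and y_hat are made up for illustration only; they are not from any dataset.

```python
# Hypothetical sample values chosen only for illustration.
m, b, x = 3.0, 1.0, 2.0
y_hat = 5.0

y = m * x + b          # inner function: y = 7.0
u = y - y_hat          # substitution:   u = 2.0
h = u ** 2             # outer function: h = 4.0

dh_du = 2 * u          # derivative of u^2 with respect to u
du_dy = 1.0            # derivative of (y - y_hat) with respect to y
dh_dy = dh_du * du_dy  # chain rule: multiply the two local derivatives

print(h, dh_dy)        # 4.0 4.0
```

Each line mirrors one line of the derivation, so the product dh_du * du_dy is the same 2 * (y - y_hat) found above.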


So what are the partial derivatives dh/dm and dh/db, then?

Chain rule again, from [1] and [2]:

dh/dm = dh/dy * dy/dm

[3] dh/dm = 2 * (y - y_hat) * x

dh/db = dh/dy * dy/db

[4] dh/db = 2 * (y - y_hat) * 1
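
One way to check results [3] and [4] is to compare them with a numerical slope. The sketch below assumes made-up sample values, and the helper names loss and gradients are hypothetical, chosen just for this example.

```python
def loss(m, b, x, y_hat):
    """h = (m * x + b - y_hat)^2 for a single sample."""
    y = m * x + b
    return (y - y_hat) ** 2


def gradients(m, b, x, y_hat):
    """Return (dh/dm, dh/db) from the chain rule results [3] and [4]."""
    y = m * x + b
    dh_dy = 2 * (y - y_hat)
    return dh_dy * x, dh_dy * 1.0


# Hypothetical sample values chosen only for illustration.
m, b, x, y_hat = 3.0, 1.0, 2.0, 5.0
dh_dm, dh_db = gradients(m, b, x, y_hat)

# Central differences: nudge one parameter at a time and measure the slope.
eps = 1e-6
num_dm = (loss(m + eps, b, x, y_hat) - loss(m - eps, b, x, y_hat)) / (2 * eps)
num_db = (loss(m, b + eps, x, y_hat) - loss(m, b - eps, x, y_hat)) / (2 * eps)

print(dh_dm, num_dm)  # chain rule value next to the numerical slope
print(dh_db, num_db)
```

If the two numbers on each line agree to several decimal places, the chain rule derivation is consistent with the function itself.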


The values from [3] and [4] are used to correct m and b:

new value of m = m - dh/dm

new value of b = b - dh/db
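
The correction step can be sketched as a small loop. This is only a sketch under assumptions: the sample values, the starting point m = b = 0, the step count and the step size lr are all made up for illustration and are not from this post. The update as written above corresponds to lr = 1. With this particular sample that step overshoots, so the sketch scales the derivative by a smaller lr.

```python
# Hypothetical single training sample and starting values (illustration only).
x, y_hat = 2.0, 5.0
m, b = 0.0, 0.0
lr = 0.05    # assumed step size; the update in the post corresponds to lr = 1
steps = 50   # assumed number of correction steps

for _ in range(steps):
    y = m * x + b
    dh_dy = 2 * (y - y_hat)
    dh_dm = dh_dy * x      # result [3]
    dh_db = dh_dy * 1.0    # result [4]
    m = m - lr * dh_dm     # new value of m
    b = b - lr * dh_db     # new value of b

final_loss = (m * x + b - y_hat) ** 2
print(m, b, final_loss)
```

With these values m and b settle near 2 and 1, and the loss shrinks toward zero. A single sample does not pin down a unique line, so a different starting point would settle on a different pair of m and b.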

 

OK, I think this is too much for now. I will do an implementation in our machine learning sample for better understanding.


 

 

 
