Lecture Note
University: Stanford University
Course: CS229 | Machine Learning
Academic year: 2023
Vectorization-Based Gradient Descent for Multiple Linear Regression

In machine learning, multiple linear regression is a powerful technique for modeling the relationship between a dependent variable and several independent variables. This post looks closely at how gradient descent is applied to multiple linear regression with vectorization, combining the two ideas into a single, efficient method.

Multiple Linear Regression

Multiple linear regression is a statistical technique that models a dependent variable as a function of several independent variables. The goal is to predict a continuous dependent variable from the values of the independent variables. To write the model concisely in vector notation, we collect the coefficients of the independent variables into a vector w and the intercept into a scalar b. The model is

f_{w,b}(x) = w_1 x_1 + \dots + w_n x_n + b

Cost Function

The cost function J(w,b) measures the discrepancy between the actual value of the dependent variable and the value predicted by the model. Gradient descent seeks to minimize this cost by updating the parameters w and b until the cost function reaches a minimum.

Gradient Descent

Gradient descent is the optimization procedure used to minimize the cost function. Each parameter w_j and the bias b are updated repeatedly until the cost function is minimized. The update rules are

w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(w,b)
b = b - \alpha \frac{\partial}{\partial b} J(w,b)

where \alpha is the learning rate, \frac{\partial}{\partial w_j} J(w,b) is the derivative of the cost function with respect to w_j, and \frac{\partial}{\partial b} J(w,b) is the derivative of the cost function with respect to b.

Gradient Descent for Multiple Regression

For multiple linear regression, the update rule is applied for every j = 1, 2, ..., n, where n is the number of independent variables:

w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(w,b)

The derivative of J with respect to w_j is

\frac{\partial}{\partial w_j} J(w,b) = (f_{w,b}(x) - y) \cdot x_j

where f_{w,b}(x) is the predicted value, y is the actual value, and x_j is the j-th independent variable; for the usual squared-error cost this term is averaged over all training examples.
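Since the note emphasizes vectorization, here is a minimal NumPy sketch of these updates. The function name gradient_descent, the hyperparameters alpha and num_iters, and the synthetic data in the usage example are illustrative assumptions, not part of the original note; the sketch assumes the standard mean-squared-error cost, so all the partial derivatives with respect to w are computed in one matrix-vector product instead of a loop over j.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Vectorized gradient descent for multiple linear regression.

    X : (m, n) matrix of independent variables
    y : (m,) vector of actual values
    Returns the learned weight vector w and bias b.
    """
    m, n = X.shape
    w = np.zeros(n)   # weights w_1 ... w_n
    b = 0.0           # intercept b

    for _ in range(num_iters):
        # Predicted values f_{w,b}(x) = X w + b for all m examples at once
        f = X @ w + b
        err = f - y                   # prediction errors

        # Gradients of the mean-squared-error cost J(w, b)
        dj_dw = (X.T @ err) / m       # shape (n,): averaged (f - y) * x_j
        dj_db = err.mean()            # derivative with respect to b

        # Simultaneous parameter updates
        w -= alpha * dj_dw
        b -= alpha * dj_db

    return w, b


if __name__ == "__main__":
    # Tiny synthetic example: y = 2*x1 + 3*x2 + 1 (assumed data for illustration)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = 2 * X[:, 0] + 3 * X[:, 1] + 1
    w, b = gradient_descent(X, y, alpha=0.1, num_iters=2000)
    print(w, b)   # should be close to [2, 3] and 1
```

Computing the gradient as X.T @ err / m is what the vectorization refers to here: all m predictions and all n partial derivatives come from a couple of matrix operations, and w and b are updated simultaneously rather than one coordinate at a time.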