Lecture Note
Stanford University, CS229: Machine Learning
Academic year 2023
Gradient Descent in Machine Learning: An Overview

Gradient descent is a fundamental machine learning technique used to minimize a model's cost function. It can be used to train a broad variety of models, from linear regression to deep learning, and it applies to many different cost functions. In this article, we go into detail on gradient descent and explain how it optimizes a model's parameters.

Gradient Descent: What Is It?

Gradient descent is an optimization algorithm for finding the minimum of a cost function. The algorithm starts from an initial estimate of the parameters and then iteratively updates them to lower the cost until it is as low as possible. By changing the parameters in the direction that reduces the cost, gradient descent seeks to minimize the cost function.

The Operation of Gradient Descent

Gradient descent works by computing the gradient of the cost function with respect to the parameters. The gradient points in the direction of the cost function's steepest increase, so to reduce the cost the algorithm adjusts the parameters in the opposite direction. The size of each update is controlled by the learning rate, a hyperparameter that must be set before training the model.
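As a concrete illustration of the loop described above, here is a minimal sketch of gradient descent fitting a one-variable linear model by minimizing mean squared error. The function name, the toy data, and the hyperparameter values are illustrative assumptions, not part of the original note.

```python
# Minimal gradient descent for a linear model y = w*x + b,
# minimizing the mean squared error cost.
# Toy data and all names here are illustrative assumptions.

def gradient_descent(xs, ys, learning_rate=0.05, steps=500):
    w, b = 0.0, 0.0  # initial parameter estimates
    n = len(xs)
    for _ in range(steps):
        # Gradients of the cost J = (1/n) * sum((w*x + b - y)^2)
        dJ_dw = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        dJ_db = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Move against the gradient to reduce the cost
        w -= learning_rate * dJ_dw
        b -= learning_rate * dJ_db
    return w, b

# Data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(round(w, 2), round(b, 2))
```

Note that if the learning rate is set too large, the updates overshoot the minimum and the cost diverges; too small, and convergence is needlessly slow.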
At each iteration of gradient descent, the parameters are updated as follows:

w = w − learning_rate · ∂J/∂w
b = b − learning_rate · ∂J/∂b

where w and b are the parameters, and ∂J/∂w and ∂J/∂b are the gradients of the cost function J with respect to w and b, respectively.

What Is the Cost Function?

The cost function is a gauge of how well the model fits the data. It is defined as the discrepancy between the predicted output and the actual output. In linear regression, the cost function is generally the mean squared error between the predicted and actual outputs. In deep learning models the cost function may be more sophisticated, but minimizing it remains the key objective.
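To make the update rule and the mean-squared-error cost concrete, here is a sketch of a single gradient-descent step worked out on a tiny dataset; the numbers and variable names are illustrative assumptions.

```python
# One gradient-descent step on the MSE cost for a linear model y = w*x + b.
# The tiny dataset and starting values are illustrative assumptions.

xs = [1.0, 2.0]
ys = [2.0, 4.0]          # generated from y = 2x
w, b = 0.0, 0.0          # initial guess
learning_rate = 0.1

n = len(xs)
errors = [w * x + b - y for x, y in zip(xs, ys)]  # predicted minus actual
cost = sum(e * e for e in errors) / n             # mean squared error: 10.0
dJ_dw = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))  # -10.0
dJ_db = (2.0 / n) * sum(errors)                             # -6.0

# Both gradients are negative, so stepping against them increases w and b,
# moving the model toward the data.
w -= learning_rate * dJ_dw   # 0.0 - 0.1 * (-10.0) = 1.0
b -= learning_rate * dJ_db   # 0.0 - 0.1 * (-6.0)  = 0.6
```

Repeating this step drives the cost down until the parameters settle near the minimum.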