Lecture Note
University
Stanford UniversityCourse
CS229 | Machine LearningPages
2
Academic year
2023
anon
Views
41
A Complete Guide to Understanding Logistic Regression forData Scientists The classification process known as logistic regression is quite popular among datascientists and machine learning experts. This algorithm is easy to comprehend, use, andapply, and it works well to resolve problems in the real world. We'll delve deeply into the areaof logistic regression in this post and comprehend how it may be applied to categorize data. Logistic Regression: What Is It? A dependent variable can be classified using the statistical technique of logistic regressionbased on one or more independent variables. This algorithm uses supervised learning tocategorize binary data, or data that can only have two possible outcomes. For instance,determining if a tumor is cancerous or benign, or whether an email is spam or not. Consequences of Linear Regression A common regression approach for simulating the relationship between a dependentvariable and one or more independent variables is linear regression. It is assumed that thereis a linear, or straight line, relationship between the dependent and independent variables.However, linear regression does not produce reliable findings when categorizing binary data. Because the linear regression model generates a continuous value between 0 and 1, whichmight be any decimal value, this is the case. On the other hand, for classification issues, theoutput must either be 0 or 1. Logistic regression was developed to get around this constraintof linear regression. Logistic regression is the answer. By converting the continuous output of linear regression into a binary output, logisticregression resolves the issue with linear regression. The Sigmoid function, a mathematicalfunction, is used in this process. Any real number can be converted by the sigmoid functionto a value between 0 and 1, which can be thought of as a probability. It is said that the Sigmoid function is: 𝑔(𝑧) = 1 1+𝑒 −𝑧 Where z is the weighted sum of the independent variables and e is a mathematical constantwith a value of roughly 2.7. The Sigmoid function's output, g(z), can be understood as the likelihood that the dependentvariable belongs to a particular class. For instance, if the result of the Sigmoid function is0.7, it can be understood that there is a 70% likelihood that the tumor is malignant whenclassifying it as benign or malignant. The construction of the log-regression model In two steps, the logistic regression model is constructed:
1. Calculating the weighted sum of the independent variables by linear regression is the first step. 2. The second stage involves using the sigmoid function to convert the linear regression's continuous output into a binary output. We must train the logistic regression model on a training dataset in order to create themodel. The model will discover during training the ideal independent variable weights thatreduce the error between the anticipated and actual values. We can use the model to generate predictions on new data after it has been trained. Themodel will compute the weighted total and run it through the Sigmoid function to provide aprobability value between 0 and 1 given a fresh set of independent variables. The fresh datacan then be classified using this value.
Logistic Regression in Data Science: Study Guide
Please or to post comments