Formalizing the problem: identifying the unknowns

One issue to consider is: what are the unknowns in this situation? The natural answer would be to say that the unknowns are x and y. This is not actually the case, however, because the data values are given to us, and we are trying to find the line that best fits those points on a graph. We do not really care about any particular value of x; what we care about is the coefficients a and b, which tell us what kind of line this is going to be. So in fact we are solving for the a and b that give us the nicest possible line for these points.

The question, then, is how to find the best-fitting line and its parameters a and b. And of course, we have to decide what we mean by "best." We want to minimize some function of a and b that measures the total error we make when we choose this line compared to the experimental data. There are various ways to do it, and some valid choices give different answers. You must decide what you prefer: for example, measuring the distance from a point to the line by projecting perpendicularly, or measuring the difference between the experimental and predicted values of y for a given value of x.
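As a small illustration of the two error measures just mentioned, the following sketch computes both for a single data point and a candidate line y = a*x + b (the function names and example values are illustrative, not from the text):

```python
import math

def vertical_error(a, b, x, y):
    """Difference between the observed y and the line's prediction a*x + b."""
    return y - (a * x + b)

def perpendicular_distance(a, b, x, y):
    """Orthogonal distance from the point (x, y) to the line y = a*x + b."""
    # Rewriting the line as a*x - y + b = 0, its normal vector (a, -1)
    # has length sqrt(a^2 + 1), which normalizes the point-line formula.
    return abs(a * x - y + b) / math.sqrt(a * a + 1)

# The same point scores differently under the two measures.
# For the line y = x and the point (0, 1):
print(vertical_error(1.0, 0.0, 0.0, 1.0))          # prints 1.0
print(perpendicular_distance(1.0, 0.0, 0.0, 1.0))  # prints 1/sqrt(2) ≈ 0.7071
```

Both measures are zero exactly when the point lies on the line; they differ in how they penalize points off it, which is the choice the text asks you to make.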
In the conventional way of measuring distance, you minimize the distance between a data point and the line. But if you have one bad data point (such as a measurement taken while someone fell asleep during the experiment), that point may dominate your results and give you misleading information. In statistics, it is often important to deal with outliers: a point that is very distant from the others may not be a valid data point, so it may be more appropriate to measure the average distance over all points, or to adjust how much weight distant outliers receive. There are several possible answers, but one of them gives us a particularly nice formula for a and b, and that is why it is the universally used one: the least-squares method, which measures the sum of the squares of the errors.

We often choose to model data with a line even when the underlying relationship is not linear, partly because linear plots are easy to read; linear models are also easier to solve than nonlinear ones, which makes them appealing in that regard as well. If you have a method that minimizes the total squared deviation between observed and predicted values, then that is probably the best one. The sum of the squares of the deviations between a model's predictions and the observed data is known as the sum of squared residuals. Minimizing this value is important when trying to find a model that accurately predicts observable data.
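The "particularly nice formula" that least squares yields can be sketched directly: minimizing the sum of squared vertical residuals has a closed-form solution for a and b. A minimal sketch, with illustrative function names and data (not taken from the text):

```python
def least_squares_line(xs, ys):
    """Fit y = a*x + b by minimizing the sum of squared vertical residuals.

    Standard closed-form solution:
        a = (n*sum(x*y) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
        b = (sum(y) - a*sum(x)) / n
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Total squared deviation between observed ys and the predictions a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Points lying exactly on y = 2x + 1 are recovered with zero residual:
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = least_squares_line(xs, ys)
print(a, b)  # close to 2.0 and 1.0
```

Note how the squaring in `sum_squared_residuals` gives larger deviations disproportionately more influence, which is exactly why a single bad data point can dominate the fit, as discussed above.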