# Introduction to residuals and least squares regression

Suppose I want to understand the relationship between people's heights and their weights. Here height will be measured in cm and weight in kg. First, I choose a random group of people to measure. Then, for each person, I plot a point that represents the combination of their height and weight. For example, let's say I measure one person who is 60 cm tall and weighs 100 kg. I go across to 60 cm, and then up to 100 kg. That point is (60, 100). One way to think about it is that height is along the x-axis and weight is along the y-axis. So that point describes the person who is 60 cm tall and weighs 100 kg. I do this one, two, three, four, five, six, seven, eight, nine times. I did it for nine people, and I could keep going.

Even so, we can say that there is roughly a linear relationship. It looks positive: in general, as height increases, so does weight. Perhaps I could fit a line to describe this trend. Let me try. This is my line tool. I can think of many possible lines. With this one, it seems that most of the points are below the line, so it is not a good fit. I could do something like this instead, but that does not look like a good fit either.

Here most of the dots are above the line. Let me try again. In the future you will learn better methods for finding a good fit. Something like this, and again I'm just eyeballing it. That looks about right. You can view this line as a regression line. We can express it as y = mx + b. We need to find the slope, m, and the y-intercept, b. We could read them off from what I drew, or we could think of it in terms of our variables: weight equals the slope times height plus the y-intercept. If you think of the vertical axis as the weight axis, you could call it the weight intercept. Either way, this model is my regression line. It is what I am trying to fit to these points. But clearly one line cannot pass through all of the points.
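To make the model concrete, here is a minimal Python sketch of the line y = mx + b used as a weight predictor, together with the above/below check I was doing by eye. The data points and the slope and intercept values are made up for illustration; only the (60, 100) point comes from the example.

```python
# Hypothetical (height_cm, weight_kg) measurements; only (60, 100) is from the example.
points = [(60, 100), (62, 160), (65, 150), (68, 175), (70, 165)]

def predict_weight(height, m, b):
    """Estimated weight from the line y = m*x + b."""
    return m * height + b

def count_above_below(points, m, b):
    """Count how many points sit above and below a candidate line."""
    above = sum(1 for x, y in points if y > predict_weight(x, m, b))
    below = sum(1 for x, y in points if y < predict_weight(x, m, b))
    return above, below

# Two candidate lines with assumed slopes and intercepts.
too_high = count_above_below(points, m=2.0, b=60.0)  # every point falls below
better = count_above_below(points, m=2.0, b=30.0)    # points on both sides
print(too_high, better)
```

A line with every point on one side is a sign that it sits too high or too low, which is the kind of judgment the eyeballing above is making.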

For most of the points, though not all, there will be some distance between the actual value and the estimate. This idea, the difference between the actual value and the estimated value for a given height, is called the residual. Let me write that down. There is a residual for each of the points. For example, let's work it out here and call it q1, the residual for this point, where our height variable is 60 cm. The actual value is 100 kg. From that we subtract the estimate, which is whatever the line gives at this height. I can substitute 60 into the equation, which gives m times 60 plus b, or 60m + b. So once again, I take 60 cm, plug it into the equation, and that tells me the estimated weight. Or, to read the figure off the graph, I can take my line tool and try to draw a straight line across from this point.
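As a rough sketch of the definition, the residual is the actual value minus the value the line estimates at the same height. The slope and intercept below are assumed values for illustration, not numbers derived from the data.

```python
def residual(height, actual_weight, m, b):
    """Residual = actual weight minus the weight estimated by y = m*x + b."""
    estimated_weight = m * height + b
    return actual_weight - estimated_weight

# Assumed line parameters (hypothetical).
m, b = 2.0, 30.0

# The person from the example: 60 cm tall, 100 kg.
q1 = residual(60, 100, m, b)
print(q1)  # -50.0
```

With these assumed parameters the line estimates 150 kg at 60 cm, so the point sits below the line and the residual is negative.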

Let me draw a straight line across from this point. It isn't perfectly straight, so let me adjust it a bit. There. So it looks like about 150 kg. That is, my estimate is 150 kg. So the residual here is 100 minus 150, which is negative 50. A negative residual occurs when the actual value is below the estimate. So this one right here, q1, is a negative residual. If you found the residual for this point here, q2, it would be positive, because the actual value is larger than the estimate. The residuals indicate how well your regression line fits the given points. So you might think about combining the residuals in some way and trying to minimize that. You might ask: why don't I just add up all the residuals and try to minimize the sum? That gets tricky, because some are positive and some are negative. A large negative residual can cancel a large positive residual. The sum would be 0, and it would look as if there were no residual at all. Instead, you could add up the absolute values. I would take the sum of the absolute values of all the residuals, and then change m and b for my line to minimize it. That would be one way of constructing a regression line. But there is another way, which you will come across often in statistics: minimizing the sum of the squares of the residuals.
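The cancellation problem is easy to see with a few numbers. In this small Python sketch the residuals are invented for illustration: the line misses badly in both directions, yet the plain sum suggests a perfect fit.

```python
# Hypothetical residuals for a poor fit: large misses in both directions.
residuals = [-50.0, 50.0, -20.0, 20.0]

plain_sum = sum(residuals)                    # positives and negatives cancel
abs_sum = sum(abs(r) for r in residuals)      # sum of absolute values
squared_sum = sum(r ** 2 for r in residuals)  # sum of squares

print(plain_sum, abs_sum, squared_sum)  # 0.0 140.0 5800.0
```

Both the absolute-value sum and the sum of squares stay well above zero, so either one can serve as a measure of misfit where the plain sum cannot.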

Whether a residual is negative or positive, when you square it, it becomes positive, which solves the problem of negatives and positives canceling out. Also, when you square residuals that are already large, you get even larger numbers. For example, take some regular numbers: one, two, three, four. They are one unit apart from each other. But if I square them, I get 1, 4, 9, 16, and they get further apart. So when you square the residuals, the large ones make up a disproportionately large share of the sum of squares. In future videos we will see the method of least squares regression. Least squares regression is where you find the m and b that minimize the sum of the squares of the residuals.
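A quick Python sketch of the spreading effect, using the same numbers one through four: the largest value accounts for a bigger share of the total once everything is squared.

```python
values = [1, 2, 3, 4]
squares = [v ** 2 for v in values]          # [1, 4, 9, 16]

share_plain = values[-1] / sum(values)      # 4 out of 10
share_squared = squares[-1] / sum(squares)  # 16 out of 30

print(squares, share_plain, round(share_squared, 3))  # [1, 4, 9, 16] 0.4 0.533
```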

This is valuable, and the reason it is used so often is that it takes significant outliers into account: points that sit far away from the model. Something like this one. With least squares regression we will try to minimize that, or rather, that point will be weighted a little more heavily, because when you square it, it becomes an even bigger factor. But this is just a conceptual introduction. In future videos, we will look at calculating residuals, and we will come up with a formula for finding the m and b of the line that minimizes the sum of the squared residuals.
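In the meantime, here is a minimal Python sketch of what finding the m and b that minimize the sum of squared residuals could look like. It uses the standard closed-form expressions for a one-variable least squares line, which this video does not derive, and the data points and the hand-drawn line parameters are invented for illustration.

```python
# Hypothetical (height_cm, weight_kg) data; only (60, 100) comes from the example.
points = [(60, 100), (62, 160), (65, 150), (68, 175), (70, 165)]

def sum_squared_residuals(points, m, b):
    """Sum of squared residuals for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in points)

def least_squares_line(points):
    """Slope and intercept that minimize the sum of squared residuals."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    m = s_xy / s_xx
    b = mean_y - m * mean_x
    return m, b

m, b = least_squares_line(points)
best = sum_squared_residuals(points, m, b)
eyeballed = sum_squared_residuals(points, 2.0, 30.0)  # assumed hand-drawn line
print(m, b, best, eyeballed)
```

The fitted line should never score worse than a hand-drawn one on this measure, and nudging its slope or intercept in either direction should only increase the sum of squares.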