🏹 Regression
Direct and Indirect Method, Distinguish with Correlation
- The term ‘regression’ literally means ‘stepping back towards the average’.
- It was first used by the British biometrician Sir Francis Galton.
- The relationship between the independent and dependent variables may be expressed as a function. Such a functional relationship between two variables is termed regression.
- In regression analysis, the independent variable is also known as the regressor, predictor or explanatory variable, while the dependent variable is also known as the regressed or explained variable.
- When only two variables are involved, the functional relationship is known as simple regression.
- If the relationship between the two variables is a straight line, it is known as simple linear regression; otherwise it is called simple non-linear regression.
Direct Method
The regression equation of Y on X is given as
Y = a + bX
- Where,
- Y = dependent variable
- X = independent variable
- a = intercept
- b = the regression coefficient (or slope) of the line
- a and b are also called constants; they can be estimated by applying the “least squares method”.
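- The regression coefficient can take any value from -∞ to +∞.
- This gives,
b = [ΣXY - (ΣX)(ΣY)/n] / [ΣX² - (ΣX)²/n]
- And
a = Ȳ - bX̄
- Where n is the number of pairs of observations, X̄ and Ȳ are the means of X and Y, and b is the estimate of the regression coefficient of Y on X; it measures the change in Y for a unit change in X.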
- Similarly, the regression equation of X on Y is given as
X = a1 + b1Y
- Where X = dependent variable and Y = independent variable
- And
b1 = [ΣXY - (ΣX)(ΣY)/n] / [ΣY² - (ΣY)²/n], a1 = X̄ - b1Ȳ
- Where b1 is the estimate of the regression coefficient of X on Y and a1 is the intercept.
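The direct method can be sketched in a few lines of Python. This is only an illustration: it assumes the usual least-squares formulas b = [ΣXY - (ΣX)(ΣY)/n] / [ΣX² - (ΣX)²/n] and a = Ȳ - bX̄, and the function name `fit_direct` and the five data points are made up for the example.

```python
def fit_direct(xs, ys):
    """Least-squares intercept and slope for the regression of ys on xs.

    Uses raw sums (the "direct" form), not deviations from the means.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope: change in ys for a unit change in xs.
    b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    # Intercept: forces the line through the point of means.
    a = sum_y / n - b * sum_x / n
    return a, b


# Hypothetical data, chosen only so the arithmetic is easy to follow.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a, b = fit_direct(X, Y)    # regression of Y on X
a1, b1 = fit_direct(Y, X)  # regression of X on Y (roles swapped)

print(f"Y = {a:.2f} + {b:.2f} X")    # Y = 2.20 + 0.60 X
print(f"X = {a1:.2f} + {b1:.2f} Y")  # X = -1.00 + 1.00 Y
```

Swapping the arguments is all that is needed to obtain the second line, which shows that the two regressions differ only in which variable is treated as dependent.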
Deviation Method
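- The regression equation of Y on X is
Y - Ȳ = b(X - X̄)
- The regression equation of X on Y is
X - X̄ = b1(Y - Ȳ)
- Where X̄ and Ȳ are the means of X and Y, and b and b1 are the regression coefficients defined above.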
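As a rough sketch of the deviation method, the snippet below assumes the usual deviation form Y - Ȳ = b(X - X̄) with b = Σxy / Σx² and b1 = Σxy / Σy², where x = X - X̄ and y = Y - Ȳ. The names `fit_deviation` and `predict_y` and the data are hypothetical.

```python
def fit_deviation(xs, ys):
    """Return (mean_x, mean_y, b, b1) using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of X from its mean
    dy = [y - mean_y for y in ys]  # deviations of Y from its mean
    sum_dxdy = sum(u * v for u, v in zip(dx, dy))
    b = sum_dxdy / sum(u * u for u in dx)   # coefficient of Y on X
    b1 = sum_dxdy / sum(v * v for v in dy)  # coefficient of X on Y
    return mean_x, mean_y, b, b1


def predict_y(x, mean_x, mean_y, b):
    """Evaluate the Y-on-X line written as Y - mean_y = b (x - mean_x)."""
    return mean_y + b * (x - mean_x)


# Hypothetical data for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

mean_x, mean_y, b, b1 = fit_deviation(X, Y)
y_at_6 = predict_y(6, mean_x, mean_y, b)
print(f"b = {b:.2f}, b1 = {b1:.2f}, predicted Y at X = 6: {y_at_6:.2f}")
```

Working with deviations avoids the large raw sums of the direct method and makes it visible that each line is anchored at the means.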
Properties of Regression Coefficient
- The correlation coefficient is the geometric mean of the two regression coefficients, i.e.
r = ±√(b × b1)
- If one of the regression coefficients is greater than unity, the other must be less than unity.
- The arithmetic mean of the regression coefficients is greater than or equal to the correlation coefficient “r”.
- Regression coefficients are independent of the change of origin but not of scale.
- “b” is expressed in units of the dependent variable per unit of the independent variable.
- Regression is only a one-way relationship between Y (dependent variable) and X (independent variable).
- b can take any value from -∞ to +∞: it is negative when Y decreases as X increases and positive when Y increases with X.
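Several of these properties can be checked numerically. The sketch below uses made-up data and a hypothetical helper `slope`; it illustrates the properties on one example and does not prove them.

```python
import math


def slope(xs, ys):
    """Regression coefficient of ys on xs, computed from deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

b = slope(X, Y)   # Y on X
b1 = slope(Y, X)  # X on Y

# r is the geometric mean of b and b1 and takes their common sign.
r = math.copysign(math.sqrt(b * b1), b)

# Change of origin: adding constants to X and Y leaves b unchanged.
b_shifted = slope([x + 10 for x in X], [y - 3 for y in Y])

# Change of scale: multiplying X by 2 and Y by 10 multiplies b by 10 / 2.
b_scaled = slope([x * 2 for x in X], [y * 10 for y in Y])

print(f"b = {b:.2f}, b1 = {b1:.2f}, r = {r:.4f}")
print(f"arithmetic mean of b and b1 = {(b + b1) / 2:.2f}")
print(f"b after shift = {b_shifted:.2f}, b after rescale = {b_scaled:.2f}")
```

In this example the arithmetic mean of the coefficients (0.80) exceeds r, the shifted data give the same b, and the rescaled data give a different b, in line with the properties listed above.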
👀 Note:
- Both lines of regression pass through the point (X̄, Ȳ). In other words, the mean values (X̄, Ȳ) can be obtained as the point of intersection of the two regression lines.
- If r = 0, the two variables are uncorrelated and the lines of regression become perpendicular to each other.
- If r = ±1, the two lines of regression coincide; they cannot be parallel, because both pass through the point of means.
- If the regression coefficients are positive, r is positive, and if the regression coefficients are negative, r is negative.
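The statement that the means can be recovered as the intersection of the two regression lines can be illustrated with a short sketch. The coefficients below are assumed values obtained by fitting the made-up data X = 1, 2, 3, 4, 5 and Y = 2, 4, 5, 4, 5 (means 3 and 4); the function name `intersection` is hypothetical.

```python
def intersection(a, b, a1, b1):
    """Solve Y = a + b X and X = a1 + b1 Y simultaneously.

    Substituting the first line into the second gives
    X (1 - b * b1) = a1 + b1 * a, and b * b1 equals r squared.
    """
    denom = 1 - b * b1
    if abs(denom) < 1e-12:
        # r = +1 or -1: the lines coincide, so no single intersection point.
        raise ValueError("the two regression lines coincide")
    x = (a1 + b1 * a) / denom
    y = a + b * x
    return x, y


# Assumed coefficients for the made-up data described in the lead-in.
a, b = 2.2, 0.6     # Y on X: Y = 2.2 + 0.6 X
a1, b1 = -1.0, 1.0  # X on Y: X = -1.0 + 1.0 Y

x_bar, y_bar = intersection(a, b, a1, b1)
print(f"intersection: ({x_bar:.2f}, {y_bar:.2f})")  # intersection: (3.00, 4.00)
```

The guard on `denom` corresponds to the r = ±1 case in the note: when b × b1 = 1 the two equations describe the same line.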