🍡 Correlation

Types, Test for significance

When two continuous variables are observed together on the same units (concomitant variables), their joint distribution is known as a bivariate distribution; if the two variables are jointly normal, it is a bivariate normal distribution.
If there are more than two such variables, their joint distribution is known as a multivariate distribution (multivariate normal in the jointly normal case).
In the case of bivariate or multivariate distributions, we may be interested in discovering and measuring the magnitude and direction of the relationship between two or more variables.
For this purpose we use the statistical tool known as correlation.
Definition:

If a change in one variable is accompanied by a change in the other variable, the two variables are said to be correlated, and the degree of association (or extent of the relationship) is known as correlation.
It studies the relation or association between two variables.
Two independent variables are uncorrelated.
The measure of correlation is called the correlation coefficient (r) or correlation index, which summarizes in one figure the direction and degree of correlation.
The correlation coefficient ranges from -1 to +1 (i.e. –1 ≤ r ≤ 1); its absolute value never exceeds unity.
If r = +1, there is a perfect positive correlation between x and y.
If r = -1, there is a perfect negative correlation between x and y.
If r = 0, the two variables x and y are called uncorrelated variables.
The correlation coefficient is a pure number; it has no unit of measurement.
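
The properties listed above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper `pearson_r` and all data values are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2 * v + 1 for v in x])   # points on an increasing line
r_neg = pearson_r(x, [10 - 3 * v for v in x])  # points on a decreasing line

# Changing the unit of x (say, cm to mm) leaves r unchanged.
y = [2, 1, 4, 3, 6]
r_cm = pearson_r(x, y)
r_mm = pearson_r([10 * v for v in x], y)
```

Here `r_pos` and `r_neg` come out as +1 and -1 (up to floating-point rounding), and `r_cm` equals `r_mm`, which is what "no unit of measurement" means in practice.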

Types of Correlation

Positive

If the two variables deviate in the same direction, i.e., if the increase (or decrease) in one variable results in a corresponding increase (or decrease) in the other variable, correlation is said to be direct or positive.
Ex:
- Heights and weights
- Household income and expenditure
- Amount of rainfall and yield of crops
- Prices and supply of commodities
- Feed and milk yield of an animal
- Soluble nitrogen and total chlorophyll in the leaves of paddy.

Negative correlation

If the two variables constantly deviate in the opposite direction i.e., if increase (or decrease) in one variable results in corresponding decrease (or increase) in the other variable, correlation is said to be inverse or negative.
Ex:
- Price and demand of a good
- Volume and pressure of a perfect gas (at constant temperature)
- Sales of woolen garments and the day temperature
- Yield of crop and plant infestation

No or Zero Correlation

If there is no relationship between the two variables, so that a change in the value of one variable is not accompanied by any systematic change in the other, the variables are said to have no or zero correlation.

Simple, Partial and Multiple Correlations

Simple correlation: Only two variables are studied.
Partial correlation: More than two variables are studied, but the relationship between only two of them is considered, the effect of the other influencing variables being kept constant.
Multiple correlation: Three or more variables are studied simultaneously.
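
For the three-variable case, partial and multiple correlation can be computed from the simple correlations. The sketch below uses the standard first-order formulas, which are not given in these notes; the function names and the input values are hypothetical.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of x and y with z held constant (first order)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of y with x1 and x2 taken together."""
    num = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return math.sqrt(num / (1 - r_12 ** 2))

# Hypothetical simple correlations, for illustration only.
r_xy_given_z = partial_r(r_xy=0.8, r_xz=0.6, r_yz=0.6)   # (0.8 - 0.36) / 0.64
r_y_on_x1_x2 = multiple_r(r_y1=0.3, r_y2=0.4, r_12=0.0)  # sqrt(0.09 + 0.16)
```

With these inputs the partial correlation (0.6875) is smaller than the simple correlation 0.8, because part of the association between x and y is shared with z.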

Linear and Nonlinear Correlation

If the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable, the correlation is said to be linear.
If the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable, the correlation is said to be nonlinear.
In most practical situations we find a nonlinear relationship between variables.
In the absence of any relationship between the variables, the value of the correlation coefficient will be zero.
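
The converse of the last statement does not hold: a zero coefficient rules out a linear relationship, not every relationship. A minimal sketch with hypothetical data, in which y is completely determined by x and yet r is zero:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y = x^2 on values of x placed symmetrically about zero.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)  # the positive and negative products cancel exactly
```

This is why a scatter diagram is worth drawing before relying on the coefficient alone.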

Methods of studying Correlation

Scatter Diagram
Karl Pearson’s Coefficient of Correlation
Spearman’s Rank Correlation
Regression Lines

Scatter diagram

It is the simplest diagrammatic representation of bivariate data. For the bivariate distribution (x_i, y_i), i = 1, 2, …, n, if the values of the variables X and Y are plotted along the X-axis and Y-axis respectively in the xy-plane, the diagram of dots so obtained is known as a scatter diagram.
In the scatter diagram, if the points lie close together in a narrow band, we should expect a fairly good amount of correlation between the variables; if the points are widely scattered, a poor correlation is expected. This method, however, is not suitable if the number of observations is fairly large.

Positive Correlation

If the plotted points show an upward trend along a straight line, the two variables are positively correlated.

Negative Correlation

If the plotted points show a downward trend along a straight line, the two variables are negatively correlated.

No Correlation

If the plotted points are spread over the whole of the graph sheet without any trend, the two variables are not correlated.

Karl Pearson’s Coefficient of Correlation

Prof. Karl Pearson, a British biometrician, suggested a measure of correlation between two variables. It is known as Karl Pearson’s coefficient of correlation. It measures the degree of linear relationship between the two variables X and Y.
It is usually denoted by r_xy or ‘r’.

i) Direct Method:
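
A standard form of the direct formula is sketched here, on the assumption that it is the one intended by the symbols σ_x, σ_y and n defined below:

```latex
r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y}
  = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}
         {\sqrt{\left[\sum x^2 - \dfrac{(\sum x)^2}{n}\right]
                \left[\sum y^2 - \dfrac{(\sum y)^2}{n}\right]}}
```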

ii) Deviation method
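
The usual form of the deviation (assumed mean) formula is the same expression written in terms of the deviations d_x and d_y defined below. This is a sketch under that assumption:

```latex
r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}
         {\sqrt{\left[\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}\right]
                \left[\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}\right]}}
```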

Where
- σ_x = S.D. of x and σ_y = S.D. of y
- n = number of items (pairs of observations)
- d_x = x - A, d_y = y - B
- A = assumed value of x and B = assumed value of y
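
The two methods give the same value because r is unaffected by a change of origin. This can be checked with a short sketch, assuming the usual computational form of Karl Pearson's formula (written out in the comments). The function names and data are hypothetical.

```python
import math

def r_direct(xs, ys):
    """Direct method:
    r = [Sxy - Sx*Sy/n] / sqrt([Sxx - Sx^2/n] * [Syy - Sy^2/n]),
    where Sx = sum of x, Sxy = sum of x*y, Sxx = sum of x^2, and so on.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    num = sum_xy - sum_x * sum_y / n
    den = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return num / den

def r_deviation(xs, ys, a, b):
    """Deviation method: the same formula applied to d_x = x - A, d_y = y - B."""
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    return r_direct(dx, dy)

# Hypothetical data and assumed values A and B, for illustration only.
x = [20, 22, 24, 26, 28]
y = [100, 115, 120, 140, 150]
r1 = r_direct(x, y)
r2 = r_deviation(x, y, a=24, b=120)
```

Choosing A and B near the middle of the data keeps the deviations small, which is the practical reason for the deviation method when working by hand.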

Test for significance of correlation coefficient

The variables x and y are assumed to follow a bivariate normal distribution. If the population correlation coefficient of x and y is denoted by ρ, it is often of interest to test, on the basis of the observed correlation coefficient “r”, whether ρ is zero or different from zero.

If “r” is the observed correlation coefficient in a sample of “n” pairs of observations from a bivariate normal population, then Prof. Fisher proved that under the null hypothesis

H₀: ρ = 0

tested against the alternative hypothesis H₁: ρ ≠ 0, the test statistic

t = r √(n - 2) / √(1 - r²)

follows Student’s t-distribution with (n - 2) d.f.
If the absolute calculated value of t is greater than the table value of t with (n - 2) d.f. at the specified level of significance, the null hypothesis is rejected; that is, there is a significant correlation between the two variables. Otherwise, the null hypothesis is not rejected.
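
The decision rule can be sketched in a few lines, assuming the usual statistic t = r √(n - 2) / √(1 - r²). The function names and the sample values are hypothetical, and the table value has to be looked up for (n - 2) d.f. and supplied by the user.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def is_significant(r, n, t_table):
    """Two-sided test of H0: rho = 0 against H1: rho != 0.

    t_table is the tabulated t value for n - 2 d.f. at the chosen level.
    """
    return abs(t_statistic(r, n)) > t_table

# Hypothetical sample: r = 0.6 observed in n = 18 pairs.
t = t_statistic(0.6, 18)  # 0.6 * 4 / 0.8, about 3.0

# 2.12 is the 5% two-sided table value for 16 d.f.
reject_h0 = is_significant(0.6, 18, t_table=2.12)
```

Taking the absolute value of t makes the same rule work for negative correlations.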

Example

From a paddy field, 12 plants were selected at random. The length of panicles in cm (x) and the number of grains per panicle (y) of the selected plants were recorded. The results are given in the following table. Calculate the correlation coefficient and test its significance.

Solution:

a) Direct Method:

Where, n = number of observations
Testing the correlation coefficient:
Null hypothesis H₀: Population correlation coefficient “ρ” = 0
Under H₀, the test statistic becomes
The critical (table) value of t for 10 d.f. at the 5% level of significance is 2.23.
Since the calculated value, 9.6, is greater than the table value, 2.23, the null hypothesis is rejected, and it can be inferred that there is a significant positive correlation between x and y.
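
The value of r behind the quoted t is not shown above. As a consistency check, and assuming the statistic has the usual form t = r √(n - 2) / √(1 - r²), the reported t = 9.6 with n = 12 implies:

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
\;\Longrightarrow\;
r = \frac{t}{\sqrt{t^{2} + n - 2}}
  = \frac{9.6}{\sqrt{9.6^{2} + 10}}
  = \frac{9.6}{\sqrt{102.16}}
  \approx 0.95
```

A coefficient this close to +1 agrees with the conclusion of a strong positive correlation.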

b) Indirect (Deviation) Method:

Here A = 127 and B = 24
