🍑 Correlation

Types, Test for significance

  • When two continuous variables are observed together (concomitant variables) and their joint distribution is normal, it is known as a bivariate normal distribution.
  • If there are more than two such variables, their joint distribution is known as a multivariate normal distribution.
  • With bivariate or multivariate normal distributions, we are often interested in discovering and measuring the magnitude and direction of the relationship between two or more variables.
  • For this purpose we use the statistical tool known as correlation.
  • Definition:

    If a change in one variable is accompanied by a change in the other variable, the two variables are said to be correlated, and the degree of association (or extent of the relationship) is known as correlation.

  • It studies the relation or association between two variables.
  • Two independent variables are uncorrelated (r = 0).
  • The measure of correlation is called the correlation coefficient (r) or correlation index, which summarizes in one figure the direction and degree of correlation.
  • The correlation coefficient ranges from -1 to +1 (i.e. -1 ≤ r ≤ 1); its absolute value never exceeds unity.
  • If r = +1, there is a perfect positive correlation between x and y.
  • If r = -1, there is a perfect negative correlation between x and y.
  • If r = 0, the two variables x and y are called uncorrelated variables.
  • The correlation coefficient is a pure number; it has no unit of measurement.
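
The properties listed above can be checked numerically. The following is a minimal sketch in plain Python; the function name `pearson_r` and all sample values are illustrative assumptions, not taken from these notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                              # hypothetical values
r_up = pearson_r(x, [2 * v + 1 for v in x])      # points on a rising line
r_down = pearson_r(x, [10 - 3 * v for v in x])   # points on a falling line

# Changing the unit of x (say, cm to m) should leave r unchanged.
y = [2, 1, 4, 3, 5]                              # hypothetical values
r_cm = pearson_r(x, y)
r_m = pearson_r([v / 100 for v in x], y)

print(r_up, r_down, r_cm, r_m)
```

Under these assumptions `r_up` and `r_down` come out at the two ends of the range, and `r_cm` equals `r_m` up to rounding, which is what "no unit of measurement" means in practice.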

Types of Correlation

Positive correlation

  • If the two variables deviate in the same direction, i.e., if the increase (or decrease) in one variable results in a corresponding increase (or decrease) in the other variable, correlation is said to be direct or positive.
  • Ex:
    • Heights and weights
    • Household income and expenditure
    • Amount of rainfall and yield of crops
    • Prices and supply of commodities
    • Feed and milk yield of an animal
    • Soluble nitrogen and total chlorophyll in the leaves of paddy.

Negative correlation

  • If the two variables consistently deviate in opposite directions, i.e., if an increase (or decrease) in one variable results in a corresponding decrease (or increase) in the other variable, correlation is said to be inverse or negative.
  • Ex:
    • Price and demand of a good
    • Volume and pressure of a perfect gas
    • Sales of woolen garments and the day temperature
    • Yield of crop and plant infestation

No or Zero Correlation

  • If there is no relationship between the two variables, so that a change in the value of one variable is not accompanied by any systematic change in the other, it is called no correlation or zero correlation.

Simple, Partial and Multiple Correlations

  • Simple correlation: Only two variables are studied.
  • Partial correlation: More than two variables are studied, but only two are considered to be influencing each other, the effect of the other influencing variables being kept constant.
  • Multiple correlation: Three or more variables are studied simultaneously.
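
For three variables, the partial correlation between variables 1 and 2 with variable 3 held constant is commonly computed from the three simple correlations as r12.3 = (r12 - r13 r23) / √[(1 - r13²)(1 - r23²)]. This first-order formula is a standard textbook result rather than something stated in these notes; the function name and the input values below are hypothetical.

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """First-order partial correlation of variables 1 and 2, holding 3 constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical simple correlations among three variables
r12_3 = partial_r(0.5, 0.5, 0.5)
print(round(r12_3, 4))
```

In this made-up case the partial correlation is smaller than the simple correlation r12, because part of the association between variables 1 and 2 is accounted for by variable 3.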

Linear and Nonlinear Correlation

  • If the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable, the correlation is said to be linear.
  • If the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable, the correlation is said to be nonlinear.
  • In most practical situations we find a nonlinear relationship between variables.
  • In the absence of any linear relationship between the variables, the value of the correlation coefficient will be zero.
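
A zero correlation coefficient rules out a linear relationship, not every relationship. The sketch below, with a hypothetical helper `pearson_r` and made-up data, shows a perfectly nonlinear case (y = x²) where r is zero even though y is completely determined by x.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]          # hypothetical values, symmetric about zero
y = [v ** 2 for v in x]        # exact nonlinear dependence: y = x squared
r = pearson_r(x, y)
print(r)
```

The positive and negative halves of the curve cancel in the cross-product sum, so r measures only the linear part of the association.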

Methods of studying Correlation

  • Scatter Diagram
  • Karl Pearson’s Coefficient of Correlation
  • Spearman’s Rank Correlation
  • Regression Lines

Scatter diagram

  • It is the simplest diagrammatic representation of bivariate data. For the bivariate distribution (xi, yi), i = 1, 2, …, n, if the values of the variables X and Y are plotted along the X-axis and Y-axis respectively in the xy-plane, the resulting diagram of dots is known as a scatter diagram.
  • If the points in the scatter diagram lie very close to each other, we should expect a fairly good amount of correlation between the variables; if the points are widely scattered, a poor correlation is expected. This method, however, is not suitable if the number of observations is fairly large.

Positive Correlation

  • If the plotted points show an upward trend along a straight line, we say that the two variables are positively correlated.

Negative Correlation

  • If the plotted points show a downward trend along a straight line, we say that the two variables are negatively correlated.

No Correlation

  • If the plotted points are spread over the whole of the graph sheet with no apparent trend, we say that the two variables are not correlated.
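
A scatter diagram can be sketched even without a plotting library. The following is a rough text-only sketch; the function `ascii_scatter`, its default grid size, and the data are illustrative assumptions, and it assumes x and y each take at least two distinct values.

```python
def ascii_scatter(x, y, width=20, height=8):
    """Return a text scatter diagram with one '*' per (x, y) pair."""
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for a, b in zip(x, y):
        # Scale each value to a column/row index inside the grid
        col = round((a - x_min) / (x_max - x_min) * (width - 1))
        row = round((b - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"   # row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)

# Hypothetical data with an upward trend
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 3, 5, 6, 7]
print(ascii_scatter(x, y))
```

With data like these the stars climb from the lower left to the upper right, the pattern described above for positive correlation.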

Karl Pearson’s Coefficient of Correlation

  • Prof. Karl Pearson, a British biometrician, suggested a measure of correlation between two variables. It is known as Karl Pearson's coefficient of correlation. It is useful for measuring the degree of linear relationship between the two variables X and Y.
  • It is usually denoted by rxy or 'r'.

i) Direct Method:

ii) Deviation method

  • Where
    • σx = S.D. of x and σy = S.D. of y
    • n = number of items
    • dx = x - A, dy = y - B
    • A = assumed value of x and B = assumed value of y
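
For reference, the two methods are usually written as follows in textbooks. This is a sketch in standard notation using the symbols defined above; the exact layout used in the original notes may differ.

```latex
% Direct method (raw values of x and y)
r = \frac{\sum xy - \dfrac{\left(\sum x\right)\left(\sum y\right)}{n}}
         {\sqrt{\left[\sum x^{2} - \dfrac{\left(\sum x\right)^{2}}{n}\right]
                \left[\sum y^{2} - \dfrac{\left(\sum y\right)^{2}}{n}\right]}}
  = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y}

% Deviation method (d_x = x - A, d_y = y - B)
r = \frac{\sum d_x d_y - \dfrac{\left(\sum d_x\right)\left(\sum d_y\right)}{n}}
         {\sqrt{\left[\sum d_x^{2} - \dfrac{\left(\sum d_x\right)^{2}}{n}\right]
                \left[\sum d_y^{2} - \dfrac{\left(\sum d_y\right)^{2}}{n}\right]}}
```

Both expressions give the same value of r, because subtracting the constants A and B shifts the origin without changing the spread of the data.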

Test for significance of correlation coefficient

  • The variables x and y are assumed to follow a bivariate normal distribution. If the population correlation coefficient of x and y is denoted by ρ, it is often of interest to test, on the basis of the observed correlation coefficient r, whether ρ is zero or different from zero.
  • If r is the observed correlation coefficient in a sample of n pairs of observations from a bivariate normal population, then Prof. Fisher proved that under the null hypothesis

H0: ρ = 0

    tested against the alternative hypothesis H1: ρ ≠ 0, the test statistic

t = r √(n - 2) / √(1 - r²)

    follows Student's t-distribution with (n - 2) d.f.
  • If the calculated value of |t| is greater than the table value of t with (n - 2) d.f. at the specified level of significance, the null hypothesis is rejected; that is, there is a significant correlation between the two variables. Otherwise, the null hypothesis is not rejected.
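
The test can be sketched in a few lines of Python using the usual statistic t = r √(n - 2) / √(1 - r²). The function name `correlation_t` and the value r = 0.95 are hypothetical, chosen only for illustration; 2.228 is the two-sided 5% table value of t for 10 d.f.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic for H0: rho = 0, referred to t with n - 2 d.f."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

r, n = 0.95, 12            # hypothetical sample correlation and sample size
t_calc = correlation_t(r, n)
t_table = 2.228            # two-sided 5% critical value of t for 10 d.f.
significant = abs(t_calc) > t_table

print(round(t_calc, 2), significant)
```

The comparison uses the absolute value of t so that a strong negative correlation is also detected under the two-sided alternative.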

Example

  • From a paddy field, 12 plants were selected at random. The length of panicles in cm (x) and the number of grains per panicle (y) of the selected plants were recorded. The results are given in the following table. Calculate the correlation coefficient and test its significance.

Solution:

a) Direct Method:

  • Where n = number of observations
  • Testing the correlation coefficient:
  • Null hypothesis H0: Population correlation coefficient ρ = 0
  • Under H0, the test statistic t = r √(n - 2) / √(1 - r²) is computed with n = 12, giving n - 2 = 10 d.f.
  • The critical (table) value of t for 10 d.f. at the 5% level of significance is 2.23.
  • Since the calculated value of t (9.6) is greater than the table value (2.23), it can be inferred that there is a significant positive correlation between x and y.

b) Indirect (Deviation) Method:

  • Here A = 127 and B = 24
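
The point of the indirect method is that r does not depend on which assumed values A and B are chosen. The sketch below illustrates this with made-up data (not the paddy measurements); the function names, the data, and the values of A and B are all hypothetical.

```python
from math import sqrt

def r_direct(x, y):
    """Pearson r from raw sums (direct method)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (sxy - sx * sy / n) / sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))

def r_deviation(x, y, A, B):
    """Pearson r from deviations dx = x - A, dy = y - B (deviation method)."""
    dx = [a - A for a in x]
    dy = [b - B for b in y]
    # The same sums-based formula, applied to the deviations
    return r_direct(dx, dy)

x = [20, 22, 24, 26, 28]         # hypothetical values
y = [100, 110, 125, 130, 150]    # hypothetical values

r1 = r_direct(x, y)
r2 = r_deviation(x, y, A=24, B=125)
r3 = r_deviation(x, y, A=0, B=1000)   # a deliberately poor choice of A and B

print(r1, r2, r3)
```

All three values agree up to rounding; choosing A and B near the centre of the data only makes the hand arithmetic lighter.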
