Dispersion
Absolute Measures and Relative Measures
- Dispersion means the scattering of the observations among themselves or about a central value (mean, median or mode) of the data. We study dispersion to get an idea of the variation in the data.
- Suppose that we have the distribution of the yields (kg per plot) of two groundnut varieties from 5 plots each. The distribution may be as follows:
- It can be seen that the mean yield for both varieties is 50 kg, but we cannot say that the performances of the two varieties are the same. There is greater uniformity of yields in the first variety, whereas there is more variability in the yields of the second variety.
- The first variety may be preferred since it is more consistent in yield performance.
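The table of plot yields is not reproduced in these notes, so the sketch below uses made-up yields, chosen only so that both varieties average 50 kg. The names `variety_1` and `variety_2` and all the numbers are hypothetical; the point is simply that equal means can hide very different scatter.

```python
from statistics import mean

# Hypothetical yields (kg per plot); NOT the values from the original table.
variety_1 = [45, 50, 55, 50, 50]
variety_2 = [10, 90, 30, 70, 50]

mean_1 = mean(variety_1)
mean_2 = mean(variety_2)

# Scatter measured here by the range (largest minus smallest value).
spread_1 = max(variety_1) - min(variety_1)
spread_2 = max(variety_2) - min(variety_2)

print(mean_1, mean_2)      # both means are 50
print(spread_1, spread_2)  # the second variety is far more scattered
```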
Measures of Dispersion
- Absolute measures give us an idea about the amount of dispersion in a set of observations.
- They are expressed in the same units as the original observations.
- When the observations are in kilograms, the absolute measure is also in kilograms.
- If two sets of observations are in different units or have widely different means, absolute measures cannot be used to compare their dispersion.
Absolute Measures
- Range
- Quartile Deviation
- Mean Deviation
- Standard Deviation and Variance
Range
- It is the simplest measure of dispersion.
Range = Largest value − Smallest value = L − S
Coefficient of range = (L − S)/(L + S)
- In industrial quality control, the range is the most important measure of dispersion.
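As a quick check of the two formulas, here is a minimal sketch that applies them to the seven cotton yields used in the standard deviation example of these notes. The function names are illustrative only.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


yields = [5, 6, 7, 7, 9, 4, 5]  # kg per plot

print(value_range(yields))           # 9 - 4 = 5 kg
print(coefficient_of_range(yields))  # 5 / 13, a unit-free ratio
```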
Quartile Deviation
- Quartile Deviation = (Q3 − Q1)/2
- Quartile Range (interquartile range) = Q3 − Q1
- Coefficient of quartile deviation = (Q3 − Q1)/(Q3 + Q1)
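The sketch below evaluates these three quantities for the seven cotton yields used later in these notes. Textbooks locate quartiles in slightly different ways; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, so other conventions may give slightly different Q1 and Q3.

```python
from statistics import quantiles

yields = [5, 6, 7, 7, 9, 4, 5]  # kg per plot

# quantiles() sorts the data itself; n=4 returns the cut points Q1, Q2, Q3.
q1, _median, q3 = quantiles(yields, n=4)

quartile_range = q3 - q1
quartile_deviation = (q3 - q1) / 2
coefficient_of_qd = (q3 - q1) / (q3 + q1)

print(q1, q3)  # 5.0 7.0
print(quartile_range, quartile_deviation, coefficient_of_qd)
```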
Mean Deviation
- M.D. for ungrouped data = Σ|Xi − X̄|/N
- M.D. for grouped data = Σf|Xi − X̄|/N, where N = Σf
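A minimal sketch of both formulas, taking deviations about the arithmetic mean. The function names are illustrative; the grouped call simply restates the same seven cotton yields as distinct values with their frequencies, so both calls should agree.

```python
def mean_deviation(values):
    """M.D. = sum(|Xi - mean|) / N for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


def mean_deviation_grouped(mid_values, frequencies):
    """M.D. = sum(f * |Xi - mean|) / N for grouped data, with N = sum(f)."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(mid_values, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(mid_values, frequencies)) / n


yields = [5, 6, 7, 7, 9, 4, 5]  # kg per plot

md_ungrouped = mean_deviation(yields)
md_grouped = mean_deviation_grouped([4, 5, 6, 7, 9], [1, 2, 1, 2, 1])

print(round(md_ungrouped, 3), round(md_grouped, 3))
```

The absolute value matters: without it the positive and negative deviations from the mean cancel and the sum is always zero.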
Standard Deviation/Root Mean Square Deviation
- It is defined as the positive square root of the arithmetic mean of the squares of the deviations of the given values from arithmetic mean.
- The square of the standard deviation is called the variance.
S.D. = √Variance
- The concept was given by Karl Pearson in 1893, and it is denoted by the Greek letter sigma (σ).
- The standard deviation is considered the best measure of dispersion.
- The value of the standard deviation may vary from 0 to ∞.
- Even if all the values of the variate are negative, the standard deviation is still positive, because the deviations are squared.
- The standard deviation is the measure of dispersion least affected by fluctuations of sampling.
Ungrouped data
- Let x1, x2, …, xn be n observations with arithmetic mean x̄. Then the standard deviation is given by the formula
S.D. = √[Σ(xi − x̄)²/n]
- Simplifying the above formula, we have
S.D. = √{(1/n)[Σxi² − (Σxi)²/n]}
- By the linear transformation method, we have
S.D. = √{(1/n)[Σdi² − (Σdi)²/n]}
- Where,
- di = xi − A
- A = Assumed value
- xi = Given values
Continuous frequency distribution (grouped data)
- Let f1, f2, …, fn be the 'n' frequencies corresponding to the mid values of the classes x1, x2, …, xn respectively, and let N = Σfi. Then the standard deviation is given by
S.D. = √[Σ fi (xi − x̄)²/N]
- Simplifying the above formula, we have
S.D. = √{(1/N)[Σ fi xi² − (Σ fi xi)²/N]}
- By the linear transformation method, with di = xi − A, we have
S.D. = √{(1/N)[Σ fi di² − (Σ fi di)²/N]}
- S.D. for population data is represented by the symbol 'σ'.
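The three routes to the grouped standard deviation (direct, simplified, and deviations from an assumed value) should give the same answer. The sketch below checks this on a small made-up frequency table. The mid values, frequencies and assumed value `A` are hypothetical, and the divisor is N = Σf, following the definition of the standard deviation as the root of the arithmetic mean of squared deviations.

```python
from math import sqrt

# Hypothetical class mid values and frequencies, for illustration only.
mid_values = [10, 20, 30, 40, 50]
frequencies = [3, 7, 12, 6, 2]
pairs = list(zip(mid_values, frequencies))
N = sum(frequencies)

# Direct method: deviations from the actual mean.
mean = sum(f * x for x, f in pairs) / N
sd_direct = sqrt(sum(f * (x - mean) ** 2 for x, f in pairs) / N)

# Simplified formula: only sum(f*x) and sum(f*x^2) are needed.
sum_fx = sum(f * x for x, f in pairs)
sum_fx2 = sum(f * x * x for x, f in pairs)
sd_simplified = sqrt((sum_fx2 - sum_fx ** 2 / N) / N)

# Deviation method: d = x - A for an assumed value A.
A = 30
sum_fd = sum(f * (x - A) for x, f in pairs)
sum_fd2 = sum(f * (x - A) ** 2 for x, f in pairs)
sd_deviation = sqrt((sum_fd2 - sum_fd ** 2 / N) / N)

print(sd_direct, sd_simplified, sd_deviation)
```

Choosing A near the centre of the table keeps the deviations small, which is the main attraction of the method for hand calculation.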
Example
Ungrouped data:
- Calculate the S.D. of the kapas yields (in kg per plot) of a cotton variety recorded from seven plots: 5, 6, 7, 7, 9, 4, 5.
- Direct method:
- Deviation Method:
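The worked solution is not reproduced in these notes, so the following is a sketch of how both methods could be carried out on the seven yields, not the original working. It assumes the divisor n (the "arithmetic mean of the squares of the deviations") and an assumed value A = 6, which is an arbitrary choice.

```python
from math import sqrt

yields = [5, 6, 7, 7, 9, 4, 5]  # kg per plot
n = len(yields)

# Direct method: deviations from the actual mean.
mean = sum(yields) / n
sd_direct = sqrt(sum((x - mean) ** 2 for x in yields) / n)

# Simplified formula: uses sum(x) = 43 and sum(x^2) = 281.
sum_x = sum(yields)
sum_x2 = sum(x * x for x in yields)
sd_simplified = sqrt((sum_x2 - sum_x ** 2 / n) / n)

# Deviation method: d = x - A with an assumed value A (arbitrary choice).
A = 6
d = [x - A for x in yields]
sd_deviation = sqrt((sum(di * di for di in d) - sum(d) ** 2 / n) / n)

print(round(sd_direct, 2))  # 1.55
```

Texts that treat the seven plots as a sample divide by n − 1 instead, which gives about 1.68 kg; `statistics.pstdev` and `statistics.stdev` correspond to the two conventions.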
Grouped Data
- The following are the heights (in cm) of 381 soybean plants collected from a particular plot. Find the standard deviation of the plant heights by the direct and deviation methods:
Solution: i) Direct method:
ii) Deviation Method:
Variance
- The term variance was proposed by R.A. Fisher.
- The square of the standard deviation is known as the variance.
Relative measure of dispersion
- These measures are calculated for the comparison of dispersion in two or more than two sets of observations.
- These measures are free of the units in which the original data are measured. If the original data are in dollars or kilometres, we do not use these units with a relative measure of dispersion. These measures are a sort of ratio and are called coefficients.
- Suppose that the two distributions to be compared are expressed in the same units and their means are equal or nearly equal. Then their variability can be compared directly by using their standard deviations. However, if their means are widely different or if they are expressed in different units of measurement, we cannot use the standard deviations as such for comparing their variability.
- We have to use the relative measures of dispersion in such situations.
- Coefficient of Variation (CV)
- Standard Error of Mean (SEM)
Coefficient of Variation (C.V.)
- Given by Karl Pearson.
- It is the most commonly used measure of relative variation.
- The standard deviation expressed as a percentage of the mean is known as the C.V.: C.V. = (S.D./Mean) × 100
- A greater value of C.V. means more variability or less homogeneity; a smaller value of C.V. means the reverse.
- The C.V. is a unitless measure of dispersion.
- The coefficient of variation is the measure often used to describe the amount of variation in a population.
- Note: Standard deviation is an absolute measure of dispersion, whereas the coefficient of variation is a relative measure of dispersion.
Example
- Consider the distribution of the yields (per plot) of two groundnut varieties. For the first variety, the mean and standard deviation are 82 kg and 16 kg respectively. For the second variety, the mean and standard deviation are 55 kg and 8 kg respectively.
- Then we have, for the first variety, C.V. = (16/82) × 100 = 19.5%
- For the second variety, C.V. = (8/55) × 100 = 14.5%
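- It is apparent that the variability in the second variety is less than that in the first variety. Here the standard deviations (16 kg and 8 kg) lead to the same conclusion, but when the means differ widely a comparison based on standard deviations alone can give the reverse interpretation.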
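The comparison in this example can be reproduced with a few lines. The function name is illustrative; the means and standard deviations are the ones quoted above.

```python
def coefficient_of_variation(mean, sd):
    """C.V. = standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


cv_first = coefficient_of_variation(82, 16)
cv_second = coefficient_of_variation(55, 8)

print(round(cv_first, 1), round(cv_second, 1))  # 19.5 14.5

# The variety with the smaller C.V. is the more consistent one.
more_consistent = "second" if cv_second < cv_first else "first"
print(more_consistent)  # second
```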
Standard Error of Mean
- The SEM is defined as the S.D. of the sampling distribution of the mean: SEM = S.D./√n, where n is the sample size.
- It measures how far the sample mean of the data is likely to be from the true population mean.
- The SEM is always smaller than the S.D. when the sample has more than one observation.
- It gives an idea about the variability of the sample mean.
- Large samples are required to obtain higher precision, because the SEM decreases as the sample size increases.
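A minimal sketch of the usual estimate SEM = S.D./√n, applied to the seven cotton yields from the standard deviation example. Using the sample standard deviation (divisor n − 1, as returned by `statistics.stdev`) is an assumption made here, not something these notes specify.

```python
from math import sqrt
from statistics import stdev

yields = [5, 6, 7, 7, 9, 4, 5]  # kg per plot
n = len(yields)

sample_sd = stdev(yields)   # sample S.D., divisor n - 1 (assumed convention)
sem = sample_sd / sqrt(n)   # standard error of the mean

print(round(sample_sd, 3), round(sem, 3))

# With the same S.D., a sample four times as large halves the SEM.
sem_if_4n = sample_sd / sqrt(4 * n)
```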