Central tendency
Mean, Mode, Median
- One of the most important aspects of describing a distribution is the central value around which the observations are distributed.
- Any mathematical measure intended to represent the center or central value of a set of observations is known as a measure of central tendency.
- Equivalently, the single value that represents a group of values is termed a "measure of central tendency", a measure of location, or an average.
Characteristics of a Satisfactory Average
- It should be rigidly defined.
- It should be easy to understand and easy to calculate.
- It should be based on all the observations.
- It should be least affected by fluctuations in sampling.
- It should be capable of further algebraic treatment.
- It should not be affected much by the extreme values.
- It should be located easily.
Measures of Central Tendency
- Mean
- Arithmetic Mean
- Geometric Mean
- Harmonic Mean
- Median
- Mode
Arithmetic Mean (A.M.)
- It is defined as the sum of the given observations divided by the number of observations.
- A.M. is measured with the same units as that of the observations.
Ungrouped data:
Direct Method
- Let x1, x2, ..., xn be n observations; then the A.M. is computed from the formula:
A.M. = (x1 + x2 + ... + xn)/n = ∑xi/n
- Linear Transformation Method or Deviation Method: When the observations take large values, computing the arithmetic mean directly involves heavy calculation. To overcome this difficulty, the Linear Transformation Method is used: each value xi is transformed to di, and
A.M. = A + (∑di)/n
- Where,
- A = assumed mean, generally taken as the mid-point of the middle class or of the class with the largest frequency.
- di = xi - A = deviation of the ith value of the variable from the assumed mean, and n = number of observations.
Grouped Data
- Let f1, f2, ..., fn be the n frequencies corresponding to the mid values x1, x2, ..., xn of the class intervals; then
A.M. = ∑fi xi / N (direct method), where N = ∑fi
and
A.M. = A + (∑fi di / N) × C (deviation method)
- Where,
- di = Deviation = (xi - A)/C
- f = frequency
- C = class interval
- x = mid values of classes
- The arithmetic mean computed from the data of the entire population is denoted by μ. When it is computed from sample data, it is denoted by x̄, which is an estimate of μ.
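The two grouped-data formulas (direct: ∑fi xi / N; deviation: A + (∑fi di / N) × C) can be checked against each other with a short Python sketch. The mid-values and frequencies below are hypothetical and are not taken from these notes; the function names are illustrative only.

```python
# Hypothetical grouped data: class mid-values and their frequencies.
mid_values = [10, 15, 20, 25, 30]   # x_i (class width C = 5)
frequencies = [2, 5, 8, 4, 1]       # f_i


def grouped_mean_direct(x, f):
    """A.M. = sum(f_i * x_i) / N, where N = sum(f_i)."""
    n = sum(f)
    return sum(fi * xi for fi, xi in zip(f, x)) / n


def grouped_mean_deviation(x, f, assumed_mean, class_width):
    """A.M. = A + (sum(f_i * d_i) / N) * C, with d_i = (x_i - A) / C."""
    n = sum(f)
    d = [(xi - assumed_mean) / class_width for xi in x]
    return assumed_mean + sum(fi * di for fi, di in zip(f, d)) / n * class_width


direct = grouped_mean_direct(mid_values, frequencies)
shortcut = grouped_mean_deviation(mid_values, frequencies,
                                  assumed_mean=20, class_width=5)
print(direct, shortcut)  # both methods give the same mean
```

Any class mid-value can serve as the assumed mean A; choosing one near the middle simply keeps the deviations small.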
Properties of A.M.
- The algebraic sum of the deviations taken from the arithmetic mean is zero, i.e.
∑(x - A.M.) = 0
- Let x̄1 be the mean of n1 observations, x̄2 the mean of n2 observations, ..., x̄k the mean of nk observations; then the mean x̄ of all n = (n1 + n2 + ... + nk) observations is given by
x̄ = (n1 x̄1 + n2 x̄2 + ... + nk x̄k)/(n1 + n2 + ... + nk)
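The combined mean is a size-weighted average of the group means. A minimal sketch, assuming hypothetical group sizes and means (the function name `combined_mean` is illustrative):

```python
def combined_mean(sizes, means):
    """Mean of the pooled observations: sum(n_i * mean_i) / sum(n_i)."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# Hypothetical example: two groups of 10 and 30 observations.
pooled = combined_mean([10, 30], [50.0, 60.0])
print(pooled)  # 57.5
```

The pooled mean lies closer to the mean of the larger group, which a plain average of the two group means (55.0) would miss.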
Merits
- It is rigidly defined by formula.
- It is the most commonly used measure of central tendency in practice, and it is amenable to further algebraic treatment, provided the sample is randomly obtained.
- It is regarded as the best of all the averages: of all averages, the arithmetic mean is least affected by fluctuations of sampling.
- It is based on all the observations.
- To find the average height of plants we should use arithmetic mean.
Sampling Fluctuation
- It refers to the fluctuation in the value of a sample statistic from sample to sample.
- For example, consider a class of 30 students with mean age 15 from which you draw samples of size 5. The sample means will naturally differ somewhat from one sample to another. This is sampling fluctuation.
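Sampling fluctuation can be seen in a small simulation. This is only a sketch: the population of 30 ages below is hypothetical (built so that its mean is exactly 15), and the seed and number of samples are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 30 students with mean exactly 15.
population = [14] * 10 + [15] * 10 + [16] * 10

# Draw several samples of size 5 (without replacement) and record each mean.
sample_means = [statistics.mean(rng.sample(population, 5)) for _ in range(8)]

print(statistics.mean(population))  # 15
print(sample_means)                 # sample means scatter around 15
```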
Demerits
- It cannot be determined by inspection, nor can it be located graphically.
- Arithmetic mean cannot be obtained if a single observation is missing or lost.
- Arithmetic mean is affected very much by extreme values.
- Arithmetic mean may lead to wrong conclusions if the details of the data from which it is computed are not given.
- In an extremely asymmetrical (skewed) distribution, the arithmetic mean is usually not a suitable measure of location.
Examples
i) Ungrouped data
- The weights of 7 ear-heads of sorghum are 89, 94, 102, 107, 108, 115 and 126 g. Find the arithmetic mean by the direct and deviation methods.
Solution:
- Direct method: ∑xi = 741, so A.M. = 741/7 = 105.86 g
- Deviation method: take the assumed value A = 102, so that ∑di = ∑(xi - 102) = 27 and A.M. = 102 + (27/7) = 105.86 g
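The arithmetic in this example can be reproduced with a few lines of Python. The variable names are illustrative; the data and the assumed mean of 102 come from the example itself.

```python
weights = [89, 94, 102, 107, 108, 115, 126]  # ear-head weights in g

# Direct method: sum of observations divided by their number.
direct = sum(weights) / len(weights)

# Deviation method: A + sum(d_i) / n, with d_i = x_i - A.
assumed_mean = 102
deviations = [x - assumed_mean for x in weights]
shortcut = assumed_mean + sum(deviations) / len(weights)

print(sum(weights), sum(deviations))         # 741 27
print(round(direct, 2), round(shortcut, 2))  # 105.86 105.86

# Deviations taken from the mean itself sum to zero (up to rounding error).
print(abs(sum(x - direct for x in weights)) < 1e-9)  # True
```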
ii) Grouped Data
- The following are the heights of 405 soybean plants collected from a particular plot. Find the arithmetic mean of the plant heights by the direct and deviation methods:
Solution:
a) Direct Method:
A.M. = ∑fi xi / N
- Where
- xi = mid values of the corresponding classes
- N = total frequency
- fi = frequency
A.M. = 12270/405 = 30.30 cm
b) Deviation Method:
A.M. = A + (∑fi di / N) × C
- Where di = deviation, i.e. di = (xi - A)/C
- Length of class interval (C) = 5; assumed value (A) = 30
A.M. = 30 + (24/405) × 5 = 30.30 cm
Median
- The median is the value of the middle-most item when the items are arranged in either ascending or descending order of magnitude.
- Median is used for qualitative data such as intelligence, ability, honesty etc.
Ungrouped data
- If the number of observations is odd then median is the middle value after the values have been arranged in ascending or descending order of magnitude.
- In case of even number of observations, there are two middle terms and median is obtained by taking the arithmetic mean of the middle terms.
- In case of discrete frequency distribution median is obtained by considering the cumulative frequencies. The steps for calculating median are given below:
- Arrange the data in ascending or descending order of magnitude
- Find out cumulative frequencies
- Apply the formula: Median = size of the ((N+1)/2)th item, where N = ∑f
- Now look at the cumulative frequency column, find the total that is either equal to (N+1)/2 or the next higher value, and read off the corresponding value of the variable; this is the median.
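The four steps for a discrete frequency distribution can be sketched as follows. The data are hypothetical and the function name `discrete_median` is illustrative; the sketch follows the (N+1)/2 rule used in these notes.

```python
from itertools import accumulate


def discrete_median(values, frequencies):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(values, frequencies))            # step 1: ascending order
    cumulative = list(accumulate(f for _, f in pairs))  # step 2: cumulative f
    position = (cumulative[-1] + 1) / 2                 # step 3: (N + 1) / 2
    for (value, _), cf in zip(pairs, cumulative):
        if cf >= position:                              # step 4: equal or next higher
            return value


# Hypothetical data: number of tillers per plant and the count of plants.
print(discrete_median([1, 2, 3, 4, 5], [3, 7, 10, 6, 4]))  # 3
```

Here N = 30, so (N+1)/2 = 15.5; the cumulative frequencies are 3, 10, 20, 26, 30, and 20 is the first one at or above 15.5.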
Continuous frequency distribution
- If the data are given with class intervals then the following procedure is adopted for the calculation of median.
- Find (N+1)/2, where N = ∑f
- Find the (less than) cumulative frequency just greater than (N+1)/2
- The class corresponding to this cumulative frequency is called the median class
- The value of the median is then obtained from the following formula:
Median = l + (((N+1)/2 - m)/f) × C
- Where,
- l is the lower limit of median class
- f is the frequency of the median class
- m is the cumulative frequency of the class preceding the median class
- C is the class length of the median class
- N = total frequency
Examples
- Case-i) when the number of observations (n) is odd:
- The number of runs scored by 11 players of a cricket team of a school are
5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27
- To compute the median for the given data, we proceed as follows:
- In case of ungrouped data, if the number of observations is odd then median is the middle value after the values have been arranged in ascending or descending order of magnitude. Let us arrange the values in ascending order:
0, 5, 11, 19, 21, 27, 30, 36, 42, 50, 52
- Median = ((n+1)/2)th value = ((11 + 1)/2)th value = 6th value
- Now the 6th value in the data is 27.
- Median = 27 runs
Case-ii) when the number of observations (n) is even:
- Find the median of the following heights of plants in cms:
6, 10, 4, 3, 9, 11, 22, 18
- In case of even number of observations, there are two middle terms and median is obtained by taking the arithmetic mean of the middle terms.
- Let us arrange the given items in ascending order 3, 4, 6, 9, 10, 11, 18, 22.
- In this data the number of items n = 8, which is even.
- Median = average of the (n/2)th and (n/2 + 1)th values = average of the 4th and 5th values
- = average of 9 and 10
- Median = 9.5 cm
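Both cases can be checked in Python. The helper `median_by_position` is an illustrative name that mirrors the rule above; `statistics.median` from the standard library is shown alongside for comparison.

```python
import statistics

runs = [5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27]   # n = 11 (odd)
heights = [6, 10, 4, 3, 9, 11, 22, 18]              # n = 8 (even)


def median_by_position(data):
    """Middle value for odd n; mean of the two middle values for even n."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_position(runs), statistics.median(runs))        # 27 27
print(median_by_position(heights), statistics.median(heights))  # 9.5 9.5
```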
Grouped Data
- Find out the median for the following frequency distribution of 180 sorghum ear-heads.
Solution:
- Here, (n+1)/2 = 181/2 = 90.5
- Cumulative frequency just greater than 90.5 is 114 and the corresponding class is 100-120. The median class is 100-120.
- N = 180; L = 100; f = 45; m = 69 and C = 20
- Substituting the above values in formula, we get
- Median = 100 + ((90.5 - 69)/45) × 20 = 109.56 g
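The substitution can be reproduced with a small function. This is a sketch that follows the (N+1)/2 convention of these notes; many textbooks use N/2 instead for continuous distributions, which gives a slightly different value. The function and parameter names are illustrative.

```python
def grouped_median(lower_limit, class_freq, cum_freq_before, class_width, total):
    """Median = l + (((N + 1) / 2 - m) / f) * C, following these notes."""
    position = (total + 1) / 2
    return lower_limit + (position - cum_freq_before) / class_freq * class_width


median = grouped_median(lower_limit=100, class_freq=45, cum_freq_before=69,
                        class_width=20, total=180)
print(round(median, 2))  # 109.56
```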
Merits
- It is rigidly defined.
- It is easily understood and is easy to calculate. In some cases it can be located merely by inspection.
- It is not at all affected by extreme values.
- It can be calculated for distributions with open-end classes (e.g. height < 50).
Demerits
- In case of even number of observations median cannot be determined exactly. We merely estimate it by taking the mean of two middle terms.
- It is not amenable to algebraic treatment.
- As compared with mean, it is affected much by fluctuations of sampling.
Mode
- Mode is the value which occurs most frequently in a set of observations or mode is the value of the variable which is predominant in the series.
- E.g. in the series 4, 7, 6, 5, 4, 6, 4 the mode is 4.
- It is used for the modal size of shoes, the size of readymade garments, and in business/meteorological forecasting.
- In case of discrete frequency distribution mode is the value of x corresponding to maximum frequency.
- In case of continuous frequency distribution, mode is obtained from the formula:
Mode = l + ((f - f1)/(2f - f1 - f2)) × C
- Where,
- l is the lower limit of the modal class
- C is the class interval of the modal class
- f is the frequency of the modal class
- f1 and f2 are the frequencies of the classes preceding and succeeding the modal class respectively.
- Example: Find the mode value for the following data: 27, 28, 30, 33, 31, 35, 34, 33, 40, 41, 55, 46, 31, 33, 36, 33, 41, 33.
- Solution: In the above data, the value 33 occurs the maximum number of times, i.e. 5 times. Hence 33 is the modal value of the given data.
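Counting the occurrences by hand is error-prone, so the tally can be verified with `collections.Counter`. The sketch returns every value that shares the highest frequency, so it also copes with bi-modal data.

```python
from collections import Counter

data = [27, 28, 30, 33, 31, 35, 34, 33, 40, 41, 55, 46, 31, 33, 36, 33, 41, 33]

counts = Counter(data)                 # value -> number of occurrences
highest = max(counts.values())         # largest frequency
modes = sorted(v for v, c in counts.items() if c == highest)

print(modes, highest)  # [33] 5
```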
Grouped Data:
Example: The following table gives the marks obtained by 89 students in Statistics. Find the mode.
Solution:
- From the above table it is clear that the maximum frequency is 21 and it lies in the class 30-34.
- Thus, the modal class, expressed in class boundaries, is 29.5-34.5
- Here L = 29.5, C = 5, f = 21, f1 = 16, f2 = 18
- Mode = 29.5 + [(21 - 16)/(2 × 21 - 16 - 18)] × 5
- = 29.5 + 3.13 = 32.63 marks
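As a check, the values listed in the solution (lower boundary 29.5, class interval 5, f = 21, f1 = 16, f2 = 18) can be substituted into the interpolation formula Mode = l + ((f - f1)/(2f - f1 - f2)) × C. The function name `grouped_mode` is illustrative.

```python
def grouped_mode(lower_boundary, f, f1, f2, class_width):
    """Mode = l + ((f - f1) / (2f - f1 - f2)) * C."""
    return lower_boundary + (f - f1) / (2 * f - f1 - f2) * class_width


mode = grouped_mode(lower_boundary=29.5, f=21, f1=16, f2=18, class_width=5)
print(mode)  # 32.625
```

The correction term is (5/8) × 5 = 3.125, so with the lower boundary 29.5 the mode is about 32.63 marks.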
Merits
- Mode is readily comprehensible and easy to calculate.
- Mode is not at all affected by extreme values.
- Mode can be conveniently located even if the frequency distribution has class intervals of unequal magnitude provided the modal class and the classes preceding and succeeding are of the same magnitude.
- Open-end classes also do not pose any problem in the location of mode
Demerits
- Mode is ill defined. It is not always possible to find a clearly defined mode. In some cases, we may come across distributions with two modes. Such distributions are called bi-modal. If a distribution has more than two modes, it is said to be multimodal.
- It is not based upon all the observations.
- It is not capable of further mathematical treatment.
- As compared with mean, mode is affected to a greater extent by fluctuations of sampling.
Harmonic Mean
- Harmonic mean of a set of observations is defined as the reciprocal of the arithmetic mean of the reciprocals of the given values.
- If x1, x2, ..., xn are n observations:
H.M. = n / (1/x1 + 1/x2 + ... + 1/xn) = n / ∑(1/xi)
- Used to find average speed, distance and rate.
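A minimal sketch of the harmonic mean applied to average speed. The trip (equal distances covered at 40 km/h and 60 km/h) is a hypothetical example; `statistics.harmonic_mean` from the Python standard library is shown for comparison.

```python
import statistics


def harmonic_mean(values):
    """n divided by the sum of reciprocals; values must be non-zero."""
    return len(values) / sum(1 / x for x in values)


# Hypothetical trip: the same distance covered at 40 km/h and then at 60 km/h.
speeds = [40, 60]
print(round(harmonic_mean(speeds), 2))  # 48.0
print(statistics.harmonic_mean(speeds))  # standard-library equivalent
```

The arithmetic mean of the two speeds (50 km/h) would overstate the average, because more time is spent travelling at the slower speed.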
Geometric Mean
- The geometric mean of a series containing n observations is the nth root of the product of the values.
- If x1, x2, ..., xn are the observations, then
G.M. = (x1 × x2 × ... × xn)^(1/n)
- If any one observation is zero, then the G.M. will be zero.
- GM is used in studies like bacterial growth, cell division, etc.
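A short sketch of the geometric mean as the nth root of the product. The growth factors are hypothetical; `math.prod` and `statistics.geometric_mean` require Python 3.8 or later.

```python
import math
import statistics


def geometric_mean(values):
    """n-th root of the product of n non-negative values."""
    return math.prod(values) ** (1 / len(values))


# Hypothetical growth factors of a bacterial culture over three periods.
growth_factors = [2, 4, 8]
print(round(geometric_mean(growth_factors), 6))             # 4.0
print(round(statistics.geometric_mean(growth_factors), 6))  # 4.0

# A single zero observation makes the product, and hence the G.M., zero.
print(geometric_mean([2, 0, 8]))  # 0.0
```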
Relationship between mean, mode and median
- In a symmetrical distribution:
Mean = Mode = Median
- In a moderately skewed (asymmetrical) distribution, the empirical relationship is:
Mode = 3 Median - 2 Mean
- Relationship between A.M., G.M. and H.M. (exact for two positive observations):
G.M. = √(A.M. × H.M.)
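The relationship between the three means can be verified numerically for a pair of positive values. The pair (4, 9) is a hypothetical choice; the identity G.M. = √(A.M. × H.M.) is exact for two observations, while the ordering A.M. ≥ G.M. ≥ H.M. holds for any set of positive values.

```python
import math

a, b = 4, 9  # hypothetical pair of positive observations

am = (a + b) / 2          # arithmetic mean
gm = math.sqrt(a * b)     # geometric mean
hm = 2 / (1 / a + 1 / b)  # harmonic mean

print(am, gm)                                # 6.5 6.0
print(math.isclose(gm, math.sqrt(am * hm)))  # True
print(am >= gm >= hm)                        # True
```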
Explore More
https://youtu.be/sxYrzzy3cq8