💃 Frequency Distribution

Binomial, Poisson and Normal distribution

Probability


  • The range of probability lies from 0 to 1.
  • The probability of an impossible event is always zero.

Distribution

  • Theoretical distributions are of two types:

a. Discrete distribution

  • X can take only integer values like 0, 1, 2, 3, ...
  • E.g. Binomial distribution, Poisson distribution

b. Continuous distribution

  • X can take all possible values in a given range.
  • E.g. Normal distribution

Binomial Distribution NABARD 2020 (Mains)

  • Given by James Bernoulli in 1700 A.D.
  • A binomial distribution can be thought of as simply the probability of a SUCCESS or FAILURE outcome in an experiment or survey that is repeated multiple times. The binomial is a type of distribution that has two possible outcomes (the prefix “bi” means two, or twice).
  • For example, a coin toss has only two possible outcomes: heads or tails and taking a test could have two possible outcomes: pass or fail.
  • Bernoulli trial: a trial is known as a Bernoulli trial if it has only two outcomes, namely success and failure, with probabilities p and q respectively.
  • For example, if a new pesticide is introduced to cure a disease, it either cures the disease (it’s successful) or it doesn’t cure the disease (it’s a failure).

p + q = 1

Definition

  • A random variable “x” is said to have a binomial distribution if it assumes only non-negative integer values and its probability mass function is given by:

$P(X = x) = \binom{n}{x} p^{x} q^{n-x}$

  • x = 0, 1, 2, ..., n
  • n = number of trials
  • x = number of successes in n trials
  • p = probability of success
  • q = probability of failure or (1 - p)
  • p + q = 1 (the probabilities of all values of x in a B.D. add up to 1)
  • n & p are the parameters of B.D.
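
The definition above can be checked numerically. The sketch below assumes the standard binomial probability mass function P(X = x) = C(n, x) p^x q^(n − x); the function name `binomial_pmf` and the coin-toss numbers are illustrative choices, not part of the original notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial variable with parameters n and p."""
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Hypothetical example: 10 tosses of a fair coin, exactly 3 heads.
n, p = 10, 0.5
prob_3 = binomial_pmf(3, n, p)  # equals 120 / 1024

# The probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

print(prob_3)
print(total)
```

Only n and p are needed to evaluate every probability, which is why they are called the parameters of the distribution.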

Criteria

  • Binomial distributions must also meet the following three criteria.
  • The number of observations or trials is fixed. In other words, you can only figure out the probability of something happening if you do it a certain number of times. This is common sense: if you toss a coin once, your probability of getting a tails is 50%. If you toss a coin 20 times, your probability of getting at least one tails is very, very close to 100%.
  • Each observation or trial is independent. In other words, none of your trials have an effect on the probability of the next trial.
  • The probability of success (tails, heads, fail or pass) is exactly the same from one trial to another.
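
The coin-toss remark in the first criterion can be made concrete. This is a minimal sketch assuming independent tosses of a fair coin, so the chance of at least one tails in n tosses is 1 − (1 − p)^n; the helper name is hypothetical.

```python
def prob_at_least_one_success(n: int, p: float) -> float:
    """Probability of one or more successes in n independent trials."""
    return 1.0 - (1.0 - p) ** n


one_toss = prob_at_least_one_success(1, 0.5)       # a single toss
twenty_tosses = prob_at_least_one_success(20, 0.5)  # 1 - 0.5**20

print(one_toss)
print(twenty_tosses)
```

The calculation relies on all three criteria at once: n is fixed, the tosses are independent, and p is the same on every toss.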

Properties of B.D.

  • Arithmetic mean (μ′1) of B.D. = np
  • Variance (μ2) of B.D. = npq
  • Skewness (μ3) = npq(q − p)
  • Kurtosis (μ4) = npq(1 + 3(n − 2)pq)
  • Standard Deviation = √(npq)
  • In case of B.D., Mean > Variance (since q is less than 1, npq < np)
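
As a sanity check, the properties above can be compared with moments computed directly from the probability mass function. The sketch assumes the standard binomial pmf; n = 12 and p = 0.3 are arbitrary illustrative values.

```python
from math import comb, sqrt


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def central_moment(r: int, n: int, p: float) -> float:
    """r-th moment about the mean np, summed over x = 0..n."""
    mean = n * p
    return sum((x - mean) ** r * binomial_pmf(x, n, p) for x in range(n + 1))


n, p = 12, 0.3  # hypothetical parameters
q = 1 - p

mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
mu2 = central_moment(2, n, p)
mu3 = central_moment(3, n, p)
mu4 = central_moment(4, n, p)
sd = sqrt(mu2)

print(mean, n * p)
print(mu2, n * p * q)
print(mu3, n * p * q * (q - p))
print(mu4, n * p * q * (1 + 3 * (n - 2) * p * q))
print(sd, sqrt(n * p * q))
```

Each printed pair should agree up to floating-point rounding, and the mean is larger than the variance as stated in the last property.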

Poisson Distribution

  • Discovered by S.D. Poisson
  • It is a limiting case of B.D. such that
    • n tends to ∞
    • p tends to zero
    • and np = m, a finite constant (the mean, written γ in the properties below)
  • In statistics, a Poisson distribution is a statistical distribution that shows how many times an event is likely to occur within a specified period of time.
  • It is used for independent events which occur at a constant rate within a given interval of time.
  • Example: The number of diners in a certain restaurant every day. If the average number of diners for seven days is 500, you can predict the probability of a certain day having more customers.
  • Therefore, a variable ‘x’ is said to follow a Poisson distribution if it assumes only non-negative integer values and its probability mass function is given as:

$P(X = x) = \dfrac{e^{-m} m^{x}}{x!}, \quad x = 0, 1, 2, \ldots$
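
The limiting relationship with the binomial distribution can be illustrated numerically. The sketch assumes the standard Poisson probability mass function P(X = x) = e^(−m) m^x / x! and holds np = m fixed while n grows; m = 2 and x = 3 are arbitrary illustrative values.

```python
from math import comb, exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    """P(X = x) for a Poisson variable with mean m."""
    return exp(-m) * m**x / factorial(x)


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


m, x = 2.0, 3  # hypothetical mean and count
target = poisson_pmf(x, m)

# Keep np = m while n grows and p shrinks.
errors = []
for n in (10, 100, 10_000):
    p = m / n
    approx = binomial_pmf(x, n, p)
    errors.append(abs(approx - target))
    print(n, approx, target)
```

The gap between the two probabilities shrinks as n increases, which is what "limiting case" means here.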

Examples of P.D.

  • Number of deaths from a disease
  • Number of defective materials
  • Number of wrong calls received in a month

Properties of P.D.

  • Mean (μ′1) = γ
  • Variance (μ2) = γ
  • μ3 = γ
  • μ4 = 3γ² + γ
  • In case of P.D. Mean = Variance
  • P.D. is useful in theory of games, waiting time, problems of business.
  • P.D. tends to normal distribution when γ tends to ∞.
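
The moment properties above can be verified by summing over the probability mass function. This sketch assumes the standard Poisson pmf with mean γ and truncates the infinite sum at x = 80, which is far into the negligible tail for the illustrative value γ = 4; the names `g` and `upper` are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(x: int, g: float) -> float:
    return exp(-g) * g**x / factorial(x)


def poisson_moment(r: int, g: float, about: float = 0.0, upper: int = 80) -> float:
    """r-th moment about `about`, using a truncated sum over x = 0..upper."""
    return sum((x - about) ** r * poisson_pmf(x, g) for x in range(upper + 1))


g = 4.0  # hypothetical value of the parameter
mean = poisson_moment(1, g)          # first raw moment
mu2 = poisson_moment(2, g, about=g)  # variance
mu3 = poisson_moment(3, g, about=g)
mu4 = poisson_moment(4, g, about=g)

print(mean, mu2, mu3, mu4)
```

With γ = 4 the values should come out close to 4, 4, 4 and 3γ² + γ = 52.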

Normal Distribution

  • The Normal Distribution (N.D.) was first discovered by De Moivre in 1733 as the limiting form of the binomial model; it was later worked on independently by Laplace and Gauss.
  • The Normal distribution is “probably” the most important distribution in statistics. It is a probability distribution of a continuous random variable and is often used to model the distribution of discrete random variable as well as the distribution of other continuous random variables.
  • The basic form of the normal distribution is that of a bell; it has a single mode and is symmetric about its central value.
  • The flexibility of using normal distribution is due to the fact that the curve may be centered over any number on the real line and it may be flat or peaked to correspond to the amount of dispersion in the values of random variable.
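  • Definition: A random variable x is said to follow a Normal Distribution with parameters μ and σ² if its density function is given by the probability law:

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$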

Properties of normal distribution

  • The curve of normal distribution is bell shaped and it is symmetric about the mean (μ).
  • The height of the normal curve is at its maximum at the mean. Hence the mean and mode of the normal distribution coincide. Also, the number of observations below the mean in a normal distribution is equal to the number of observations above the mean. Hence the mean and median of N.D. coincide.
  • Thus, N.D. has Mean = Mode = Median (are equal).
  • As “x” moves away from μ in either direction, f(x) decreases rapidly; the maximum probability occurs at the point x = μ and is given by f(μ) = 1/(σ√(2π)).
  • Coefficient of skewness (β1 = μ3²/μ2³) for normal distribution is zero.
  • Coefficient of kurtosis (β2 = μ4/μ2²) for normal distribution is 3 (mesokurtic).
  • All odd central moments are zero, i.e. μ1 = μ3 = μ5 = ... = 0
  • The first and third quartiles are equidistant from the median: Q3 - Q2 = Q2 – Q1
  • Range of normal distribution is from -∞ to +∞.
  • In practice, however, the range is taken as 6σ (from μ − 3σ to μ + 3σ).
  • Mean deviation about mean: approximately 4/5 σ.
  • Quartile deviation: approximately 2/3 σ.
  • The area under the normal curve is distributed as follows:
    • μ - 3σ < x < μ + 3σ covers 99.73 % of area
    • μ - 2σ < x < μ + 2σ covers 95.44 % of area
    • μ - σ < x < μ + σ covers 68.26 % of area
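
These areas can be reproduced with the error function from Python's standard library. The sketch relies on the identity P(μ − kσ < x < μ + kσ) = erf(k/√2), which holds for any normal distribution; the function name `area_within` is illustrative.

```python
from math import erf, sqrt


def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * area_within(k):.2f} %")
```

The computed values are about 68.27 %, 95.45 % and 99.73 %. The figures 68.26 % and 95.44 % quoted above are the commonly tabulated values obtained from four-decimal normal tables.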

The Normal Curve

  • The graph of the normal distribution depends on two factors - the mean and the standard deviation.
  • The mean of the distribution determines the location of the center of the graph, and the standard deviation determines the height and width of the graph.
  • When the standard deviation is large, the curve is short and wide
  • When the standard deviation is small, the curve is tall and narrow
  • All normal distributions look like a symmetric, bell-shaped curve, as shown below.
  • The curve on the left is shorter and wider than the curve on the right, because the curve on the left has a bigger standard deviation.

Standard Normal Distribution

  • If “x” is a normal random variable with mean μ and standard deviation σ, then Z = (x − μ)/σ is a standard normal variate with zero mean and standard deviation = 1.
  • The probability density function of standard normal variate “z” is

$\varphi(z) = \dfrac{1}{\sqrt{2\pi}} \, e^{-z^2/2}, \quad -\infty < z < \infty$
  • A graph representing the density function of the Normal probability distribution is also known as a Normal Curve or a Bell Curve (see Figure below). To draw such a curve, one needs to specify two parameters, the mean and the standard deviation.
  • The graph below has a mean of zero and a standard deviation of 1, i.e., (μ = 0, σ = 1).
  • A Normal distribution with a mean of zero and a standard deviation of 1 is also known as the Standard Normal Distribution (SND).
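
Standardisation is what allows a single table (or a single function) to serve every normal distribution. The sketch below converts a value to a z-score and then evaluates the standard normal cumulative probability through the error function; the mean 50, standard deviation 10 and value 65 are hypothetical.

```python
from math import erf, sqrt


def z_score(x: float, mu: float, sigma: float) -> float:
    """Standard normal variate Z = (x - mu) / sigma."""
    return (x - mu) / sigma


def standard_normal_cdf(z: float) -> float:
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2)))


mu, sigma = 50.0, 10.0  # hypothetical population mean and SD
x = 65.0

z = z_score(x, mu, sigma)         # 1.5 standard deviations above the mean
p_below = standard_normal_cdf(z)  # proportion of values below x

print(z, p_below)
```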

Importance of Normal Distribution

  • It has wide application in statistical quality control, an important area of statistics used in industry for setting control limits.

Transformation of Data

  • When the data do not follow the normal distribution, the variate xi is transformed to some new variate, say f(xi), in such a way that f(xi) is normally and independently distributed with mean (μ) and variance (σ²).

Types of Transformation

Square root transformation

  • Used when countable data have small values between 0 - 10 & when data follow Poisson distribution.
  • It is also used for percentage data when all values are > 80 %.
  • E.g.

Transformation = √x

Logarithmic Transformation

  • Used when variation in the countable data is large and values are also large.
  • E.g.

Transformation = log x or log (x + 1)

Angular or Arcsine Transformation

  • Used for data following Binomial Distribution and when data are in percentage based on count values.
  • The percentages are transformed to degrees (angles).
  • E.g.

Transformation: sin²θ = p/100, i.e. θ = sin⁻¹ √(p/100)
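
The three transformations can be written as small functions. This is a sketch under stated assumptions: the logarithm is taken to base 10 (the notes do not specify a base), the log transform uses the log (x + 1) variant so that zero counts are allowed, and the angular transformation is taken as θ = sin⁻¹ √(p/100) expressed in degrees. All function names and sample values are illustrative.

```python
from math import asin, degrees, log10, sqrt


def sqrt_transform(x: float) -> float:
    """Square root transformation for small counts."""
    return sqrt(x)


def log_transform(x: float) -> float:
    """log10(x + 1); assumes base 10 and the '+ 1' variant."""
    return log10(x + 1)


def angular_transform(percentage: float) -> float:
    """Arcsine transformation of a percentage, returned in degrees."""
    return degrees(asin(sqrt(percentage / 100)))


print(sqrt_transform(9))      # small count
print(log_transform(99))      # large count
print(angular_transform(25))  # percentage to angle
print(angular_transform(50))
```

A percentage of 25 maps to an angle of about 30 degrees and 50 maps to about 45 degrees, so the transformed values run from 0 to 90.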

Moment

Moment = Force x Distance

  • Moments in statistics are used to describe the various characteristics of a frequency distribution, like central tendency (μ′1, the first raw moment), variance (μ2), skewness (μ3) and kurtosis (μ4).
  • In a symmetrical distribution all odd central moments, i.e. μ1, μ3 and μ5, are always zero because the positive and negative deviations from the mean exactly balance each other.

Difference between raw moment and central moment

  • Direct calculation of central moments usually takes more time because the mean may not be a whole number. Therefore, we first calculate the raw moments and then obtain the central moments from them with the help of conversion formulae.

Raw Moment

  • The rth moment about an arbitrary mean “A” is defined as:

$\mu'_r = \dfrac{1}{N} \sum_i f_i (x_i - A)^r$, where $N = \sum_i f_i$

Central Moment

  • The rth moment about the mean (x̄) is defined as:

$\mu_r = \dfrac{1}{N} \sum_i f_i (x_i - \bar{x})^r$, where $N = \sum_i f_i$

  • Value of the 0th central moment:
    • The 0th central moment is always one.
  • Value of the 1st central moment:
    • The first central moment is always zero.
    • The first raw moment about the origin, however, is not zero; it is equal to the mean.
  • Value of the 2nd central moment:
    • μ2 is the variance, so it gives an idea about dispersion.
  • Value of the 3rd central moment:
    • μ3 indicates whether the curve is symmetrical or not.
  • Value of the 4th central moment:
    • μ4 gives an idea about the kurtosis (flatness or peakedness) of the curve.
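
The route from raw moments to central moments can be sketched as follows. The code assumes the standard conversion formulae μ2 = μ′2 − μ′1², μ3 = μ′3 − 3μ′2μ′1 + 2μ′1³ and μ4 = μ′4 − 4μ′3μ′1 + 6μ′2μ′1² − 3μ′1⁴, where the raw moments are taken about an arbitrary value A. The data set and A = 4 are hypothetical, and each observation has frequency 1.

```python
def moment_about(data: list[float], point: float, r: int) -> float:
    """r-th moment of ungrouped data about the given point."""
    return sum((x - point) ** r for x in data) / len(data)


def central_from_raw(m1: float, m2: float, m3: float, m4: float):
    """Convert raw moments about A into central moments mu2, mu3, mu4."""
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return mu2, mu3, mu4


data = [2, 4, 4, 5, 7, 8]  # hypothetical observations
A = 4                      # arbitrary (assumed) mean

raw = [moment_about(data, A, r) for r in (1, 2, 3, 4)]
converted = central_from_raw(*raw)

# Direct calculation about the true mean, for comparison.
mean = sum(data) / len(data)
direct = tuple(moment_about(data, mean, r) for r in (2, 3, 4))

print(raw)
print(converted)
print(direct)
```

Working about a convenient whole number A keeps the arithmetic simple, and the conversion then recovers the same central moments as the direct calculation.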

Skewness

👉🏻 There are two types of frequency distribution:

  • Symmetric frequency distribution:

Mean = Median = Mode

  • Asymmetrical frequency distribution:

Mean ≠ Median ≠ Mode

  • Quartiles are not equi-distant from median.
  • Q3 – Md ≠ Md – Q1
  • Curve is more inclined in one side than other.
  • The sum of positive deviation from median is not equal to the sum of negative deviation from the median.

Definition

  • Skewness may be defined as the departure from symmetry, or lack of symmetry, of a frequency distribution.
  • It is denoted as β1.

Types

  • Positive skewed frequency distribution:
    • A distribution is said to be positively skewed if the frequency curve has a longer tail on the right-hand side as compared to the left-hand side.
    • Positive skewness = Mean > Median > Mode
  • Negative skewness:
    • The frequency curve has a longer tail on the left-hand side as compared to the right-hand side.
    • Negative skewness = Mode > Median > Mean

Measures of Skewness

  • Prof. Karl Pearson's coefficient of skewness.
  • Based on moments:
    • Skewness (β1) = μ3²/μ2³
    • Kurtosis (β2) = μ4/μ2²
  • For a symmetric distribution:
    • Skewness or β1 = 0
  • For the normal curve, in addition:
    • β2 = 3

Kurtosis

  • Refers to the degree of flatness or peakedness of the frequency curve.
  • It is denoted as β2.

Types

  • Leptokurtic (Narrow peak/base): β2 > 3
  • Mesokurtic (Normal curve): β2 = 3
  • Platykurtic (Flat peak/broad base): β2 < 3
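
The moment-based measures and the classification above can be combined in a short sketch. It assumes β1 = μ3²/μ2³ and β2 = μ4/μ2² computed from ungrouped data; the data set, the function names and the tolerance used to decide "equal to 3" are all illustrative.

```python
def central_moment(data: list[float], r: int) -> float:
    mean = sum(data) / len(data)
    return sum((x - mean) ** r for x in data) / len(data)


def beta_coefficients(data: list[float]) -> tuple[float, float]:
    """Return (beta1, beta2) based on the central moments of the data."""
    mu2, mu3, mu4 = (central_moment(data, r) for r in (2, 3, 4))
    return mu3**2 / mu2**3, mu4 / mu2**2


def kurtosis_type(beta2: float, tol: float = 1e-9) -> str:
    """Classify a curve by comparing beta2 with 3."""
    if abs(beta2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if beta2 > 3 else "platykurtic"


data = [2, 4, 4, 5, 7, 8]  # hypothetical observations
beta1, beta2 = beta_coefficients(data)

print(beta1, beta2, kurtosis_type(beta2))
```

For this small data set β1 is close to zero (nearly symmetric) and β2 is below 3, so the curve is classed as platykurtic. Since β1 is a squared quantity, the direction of skewness has to be read from the sign of μ3.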
