🥸 Testing Hypothesis

Hypothesis, Types, Important Terms

An estimate based on sample values generally does not equal the true value in the population, because of the inherent variation in the population.
Different samples yield different estimates, each deviating from the true value. It must therefore be verified whether the difference between the sample estimate and the population value is due to sampling fluctuation or reflects a real difference.
If the difference is due to sampling fluctuation alone, it can safely be said that the sample belongs to the population in question; if the difference is real, we have reason to believe that the sample may not belong to that population.
The following are a few technical terms used in this context.

Hypothesis

An assumption made about an unknown characteristic of a population is called a hypothesis.
It may or may not be true.
Ex:
- μ = 2.3, where μ is the population mean
- σ = 2.1, where σ is the population standard deviation
- The population follows a normal distribution.

There are two types of hypothesis, namely the null hypothesis and the alternative hypothesis.

Null Hypothesis

A null hypothesis is a statement about the population parameters, usually a hypothesis of no difference. It is denoted by H₀.
More generally, any statistical hypothesis under test is called a null hypothesis.

Ex.

H₀: μ = μ₀
H₀: μ₁ = μ₂

Alternative Hypothesis

Any hypothesis, which is complementary to the null hypothesis, is called an alternative hypothesis, usually denoted by H₁.

Ex:

H₁: μ ≠ μ₀
H₁: μ₁ ≠ μ₂

Parameter

A characteristic of the population values is known as a parameter. For example, the population mean (μ) and the population variance (σ²).
In practice, parameter values are usually not known, and estimates based on the sample values are used instead.

Statistic

A characteristic of the sample values is called a statistic. For example, the sample mean (x̄) and the sample variance (s²), where

x̄ = (x₁ + x₂ + … + xₙ)/n
and s² = Σ(xᵢ − x̄)²/(n − 1)
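
As a minimal sketch of these two statistics, the snippet below computes the sample mean and the sample variance with Python's standard `statistics` module. The observations are made-up illustrative values, and `statistics.variance` is assumed to be the intended form of s² (divisor n − 1).

```python
from statistics import mean, variance

# Hypothetical sample values, used only for illustration.
observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Sample mean: sum of the observations divided by the sample size n.
x_bar = mean(observations)

# Sample variance: sum of squared deviations from x_bar divided by n - 1.
s_squared = variance(observations)

print("sample mean:", x_bar)
print("sample variance:", s_squared)
```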

Sampling Distributions

The distribution of a statistic computed from all possible samples is known as sampling distribution of that statistic.

Standard Error

The standard deviation of the sampling distribution of a statistic is known as its standard error, abbreviated as S.E.

S.E. (x̄) = σ/√n

Where, σ = population standard deviation and n = sample size
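
The formula above can be sketched in a few lines of Python. The function name `standard_error` and the input values are assumptions chosen for illustration; σ = 2.1 simply reuses the example value from the Hypothesis section.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / sqrt(n)

# Illustrative values: sigma = 2.1 and a sample of 49 units.
se_small_sample = standard_error(2.1, 49)
# A larger sample gives a smaller standard error.
se_large_sample = standard_error(2.1, 196)

print(se_small_sample, se_large_sample)
```

Quadrupling the sample size halves the standard error, which is why larger samples give more stable estimates of the mean.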

Sample

A finite subset of statistical objects in a population is called a sample and the number of objects in a sample is called the sample size.

Population

In a statistical investigation the interest usually lies in the assessment of the general magnitude and the study of variation with respect to one or more characteristics relating to objects belonging to a group. This group of objects under study is called population or universe.

Random sampling

If the sampling units in a population are drawn independently, each with an equal chance of being included in the sample, the sampling is called random sampling. It is also referred to as simple random sampling, abbreviated SRS.
Thus, if the population consists of “N” units, the chance of selecting any one unit in a single draw is 1/N.
A theoretical definition of SRS is as follows:

Suppose we draw a sample of size “n” from a population of size N; then there are C(N, n) = N!/(n!(N − n)!) possible samples of size “n”. If all possible samples have an equal chance, 1/C(N, n), of being drawn, then the sampling is said to be simple random sampling.
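
The definition above can be illustrated with a short sketch using only the Python standard library. The population of ten numbered units, the helper names, and the seed are assumptions made for the example; `random.sample` draws without replacement, which matches counting samples as combinations.

```python
import math
import random

def count_possible_samples(population_size, sample_size):
    """Number of distinct samples of the given size: C(N, n)."""
    return math.comb(population_size, sample_size)

def draw_srs(units, sample_size, seed=None):
    """Draw one simple random sample without replacement."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    return rng.sample(units, sample_size)

# Hypothetical population of N = 10 units labelled 1..10.
units = list(range(1, 11))
n = 3

total = count_possible_samples(len(units), n)
chance_per_sample = 1 / total
sample = draw_srs(units, n, seed=42)

print("possible samples:", total)
print("chance of any one sample:", chance_per_sample)
print("one simple random sample:", sample)
```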

Simple Hypothesis

A hypothesis is said to be simple if it completely specifies the distribution of the population.
For instance, for a normal population with mean μ and standard deviation σ, a simple null hypothesis has the form H₀: μ = μ₀ with σ known; since σ is known, fixing μ is enough to determine the entire distribution.
For such a test, the probability of committing a Type-I error is exactly α.

Composite Hypothesis

If the hypothesis does not specify the distribution of the population completely, it is said to be a composite hypothesis.
Following are some examples:
- H₀ : μ ≤ μ₀ and σ is known
- H₀ : μ ≥ μ₀ and σ is known
All of these are composite because none of them specifies the distribution completely.
Hence, for such a test, the level of significance (LOS) is specified not as α but as ‘at most α’.

Types of Errors

In testing a statistical hypothesis, there are four possible decisions:
- Rejecting H₀ when H₀ is true
- Rejecting H₀ when H₀ is false
- Accepting H₀ when H₀ is true
- Accepting H₀ when H₀ is false

The first and fourth possibilities lead to erroneous decisions.
Statisticians give these specific names: Type-I error and Type-II error, respectively.
The above decisions can be arranged in the following table:

| Decision | H₀ is true | H₀ is false |
| --- | --- | --- |
| Reject H₀ | Type-I error | Correct decision |
| Accept H₀ | Correct decision | Type-II error |

Type-I Error

The probability of a Type-I error is denoted by alpha (α).
It consists of rejecting H₀ when it is true, that is, accepting H₁ when it is false.

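Type-II Error

The probability of a Type-II error is denoted by beta (β).
It consists of accepting H₀ when it is false, that is, rejecting H₁ when it is true.
It is often regarded as more severe than a Type-I error, although which error matters more depends on the application.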
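
To make α and β concrete, the sketch below computes both error probabilities for one particular setting: a right-tailed z-test of H₀: μ = μ₀ against a single specific alternative μ = μ₁ > μ₀, with σ known. The test setting, the function name, and all numeric values are assumptions chosen for illustration, not part of the notes above.

```python
from math import sqrt
from statistics import NormalDist

def error_probabilities(mu0, mu1, sigma, n, alpha):
    """Type-I and Type-II error probabilities for a right-tailed z-test.

    Assumes H0: mu = mu0, a specific alternative mu = mu1 (mu1 > mu0),
    a normal population and a known sigma.
    """
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # H0 is rejected when the sample mean exceeds this cut-off.
    cutoff = mu0 + z_crit * se
    # Type-I: reject H0 although the mean really is mu0.
    type_1 = 1 - NormalDist(mu0, se).cdf(cutoff)
    # Type-II: accept H0 although the mean really is mu1.
    type_2 = NormalDist(mu1, se).cdf(cutoff)
    return type_1, type_2

# Hypothetical values for illustration only.
type_1, type_2 = error_probabilities(mu0=50, mu1=52, sigma=5, n=25, alpha=0.05)
print("P(Type-I error): ", round(type_1, 4))
print("P(Type-II error):", round(type_2, 4))
```

In this setting, lowering α moves the cut-off further from μ₀ and so raises β, which illustrates the trade-off between the two errors for a fixed sample size.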

Degrees of Freedom

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.
More generally, the number of independent ways in which a dynamic system can move without violating any constraint imposed on it is called the number of degrees of freedom.
It is defined as the difference between the total number of items and the total number of constraints.
If ‘n’ is the total number of items and ‘k’ the total number of constraints, then the degrees of freedom (d.f.) are given by

d.f. = n - k

For example, the deviations of n observations from their sample mean must sum to zero, which is one constraint, so only n − 1 of them are free to vary.

Level of Significance (LOS)

The maximum probability of committing a Type-I error is known as the level of significance, denoted by alpha (α).
Generally, a 5% (e.g. for field experiments) or 1% level of significance is used.
The level of significance is always fixed in advance, before the sample information is collected. A 5% LOS means that, when H₀ is true, the test will wrongly reject it in about 5 out of 100 repeated samples and correctly retain it in about 95.

Critical Value

While testing for the difference between the means of two populations, our concern is whether the observed difference is too large to have occurred just by chance.
But how large a difference should be treated as too large? Based on the sampling distribution of the means, it is possible to define a cut-off or threshold value such that, if the difference exceeds this value, we say that it did not occur by chance and hence there is sufficient evidence to claim that the means are different. Such a value is called the critical value, and it depends on the level of significance.
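
As a rough illustration of how the critical value depends on the level of significance, the sketch below looks up critical values of the standard normal distribution instead of reading them from printed tables. It assumes a z-test; other tests (t, χ², F) use their own distributions. The helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha, two_sided=True):
    """Critical value of the standard normal distribution for a given LOS."""
    # A two-sided test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

crit_5_two_sided = z_critical(0.05)
crit_1_two_sided = z_critical(0.01)
crit_5_one_sided = z_critical(0.05, two_sided=False)

print("5% two-sided:", round(crit_5_two_sided, 3))
print("1% two-sided:", round(crit_1_two_sided, 3))
print("5% one-sided:", round(crit_5_one_sided, 3))
```

A smaller level of significance gives a larger critical value, so a bigger observed difference is needed before it is declared significant.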

Steps involved in test of hypothesis

1. Formulate the null and alternative hypotheses.
2. Construct the test statistic.
3. Fix the level of significance.
4. Find the table (critical) value from the tables for the given level of significance.
5. Reject the null hypothesis at the given level of significance if the value of the test statistic (its absolute value, for a two-sided test) is greater than or equal to the critical value; otherwise, accept the null hypothesis.

In the case of rejection, the variation in the estimates is called “significant”. In the case of acceptance, it is called “not significant”.
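
The steps above can be sketched for one simple case: a two-sided, one-sample z-test of H₀: μ = μ₀ against H₁: μ ≠ μ₀ with σ known. The choice of test, the function name, and the sample values are assumptions made for illustration; μ₀ = 2.3 and σ = 2.1 reuse the example values from the Hypothesis section.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 against H1: mu != mu0 (sigma known)."""
    n = len(sample)
    # Step 2: test statistic z = (x_bar - mu0) / S.E.(x_bar).
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Steps 3 and 4: critical value for the chosen level of significance.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: reject H0 when |z| reaches or exceeds the critical value.
    reject = abs(z) >= z_crit
    return z, z_crit, reject

# Hypothetical sample of n = 9 observations.
sample = [2.9, 3.1, 2.6, 3.4, 3.0, 2.8, 3.3, 2.7, 3.2]
z, z_crit, reject = one_sample_z_test(sample, mu0=2.3, sigma=2.1, alpha=0.05)

verdict = "significant" if reject else "not significant"
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, difference is {verdict}")
```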

Confidence limit

The limits of the range within which the true population mean is expected to lie, at a stated level of confidence, are called confidence limits or fiducial limits.
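
One common way to obtain such limits, assuming a normal population with known σ, is x̄ ± z·σ/√n. The sketch below uses that form; the function name and the input values are hypothetical, and other situations (for example, unknown σ) call for a different multiplier.

```python
from math import sqrt
from statistics import NormalDist

def confidence_limits(sample_mean, sigma, n, confidence=0.95):
    """Lower and upper confidence limits for the population mean (sigma known)."""
    # z multiplier leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Illustrative values: sample mean 3.0 from n = 9 units, sigma = 2.1.
lower, upper = confidence_limits(3.0, 2.1, 9, confidence=0.95)
print(f"95% confidence limits: {lower:.3f} to {upper:.3f}")
```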
