🥷 Chi Test

Non-parametric Test

The various tests of significance studied earlier, such as the Z-test, t-test and F-test, were based on the assumption that the samples were drawn from a normal population. Under this assumption the various statistics were normally distributed.
Since the procedure of testing the significance requires the knowledge about the type of population or parameters of population from which random samples have been drawn, these tests are known as parametric tests.
But there are many practical situations in which no assumption can be made about the distribution of the population or its parameters. The alternative techniques, in which no assumption is made about the distribution or the parameters of the population, are known as non-parametric tests.
The chi-square test is an example of a non-parametric (distribution-free) test.
Chi-square distribution was first discovered by Helmert in 1876 and later independently by Karl Pearson in 1900.
The range of chi-square distribution is 0 to ∞.
If the observed frequencies are equal to the expected ones, then the value of the χ² statistic is zero.
Measurement data: the data obtained by actual measurement are called measurement data. For example, height, weight, age, income, area, etc.
Enumeration data: the data obtained by enumeration or counting are called enumeration data. For example, number of blue flowers, number of intelligent boys, number of curled leaves, etc.
The χ²-test is used for enumeration data, which generally relate to a discrete variable, whereas the t-test and standard normal deviate tests are used for measurement data, which generally relate to a continuous variable.
χ² – test can be used to know whether the given objects are segregating in a theoretical ratio or whether the two attributes are independent in a contingency table.
The expression for the χ²-test for goodness of fit is:

χ² = ∑ (O_i − E_i)² / E_i, summed over i = 1, 2, ..., n

Where
- O_i = observed frequencies
- E_i = expected frequencies
- n = number of cells (or classes)
This statistic follows a chi-square distribution with (n-1) degrees of freedom.
The null hypothesis H₀: the observed frequencies are in agreement with the expected frequencies.
If the calculated value of χ² < table value of χ² with (n-1) d.f. at the specified level of significance (α), we accept H₀; otherwise we do not accept H₀.
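The goodness-of-fit procedure above can be sketched in a few lines of Python. This is only an illustration: the four class counts and the 9:3:3:1 ratio are assumed for the example and do not come from these notes, and `chi_square_gof` is a hypothetical helper name. The table value 7.815 is the χ² value for 3 d.f. at the 5% level.

```python
# Goodness-of-fit sketch: chi-square = sum((O_i - E_i)^2 / E_i), d.f. = n - 1.
# The counts and the 9:3:3:1 ratio are illustrative assumptions, not data from the notes.

def chi_square_gof(observed, ratio):
    """Return (chi-square, degrees of freedom) for counts tested against a ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected frequencies share the observed total, so sum(E) == sum(O).
    expected = [total * r / ratio_sum for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

observed = [315, 108, 101, 32]  # hypothetical counts in four classes
ratio = [9, 3, 3, 1]            # theoretical ratio assumed under H0

chi2, df = chi_square_gof(observed, ratio)
TABLE_VALUE = 7.815             # chi-square table value for 3 d.f. at the 5% level

print(f"chi-square = {chi2:.3f} with {df} d.f.")
print("H0 accepted" if chi2 < TABLE_VALUE else "H0 not accepted")
```

With these assumed counts the calculated value is about 0.47, well below the table value, so H₀ would be accepted.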

Conditions for the validity of χ² – test

For the χ²-test of goodness of fit between theoretical and observed frequencies to be valid, the following conditions must be satisfied.
- The sample observations should be independent
- Constraints on the cell frequencies, if any, should be linear, e.g. ∑O_i = ∑E_i
- N, the total frequency, should be reasonably large, say greater than 50
- If any theoretical (expected) cell frequency is < 5, then for the application of the chi-square test it is pooled with the preceding or succeeding frequency so that the pooled frequency is more than 5, and the d.f. are then adjusted for those lost in pooling.
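The pooling rule in the last condition can be sketched as follows. This is one possible reading of the rule rather than a prescribed algorithm: classes are merged with their neighbours from left to right until each pooled expected frequency reaches the minimum, and a small leftover at the end is merged into the preceding group. The function name `pool_small_cells` and the frequencies are hypothetical.

```python
# Pooling sketch: merge adjacent classes until each pooled expected frequency >= 5.
# The frequencies below are made-up illustrative values (N = 60).

def pool_small_cells(observed, expected, minimum=5.0):
    """Return pooled (observed, expected) lists with every expected >= minimum."""
    pooled_o, pooled_e = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:       # group is large enough, so close it
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:                  # leftover tail is too small on its own
        if pooled_e:               # merge it into the preceding group
            pooled_o[-1] += acc_o
            pooled_e[-1] += acc_e
        else:                      # nothing to merge with: keep it as is
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
    return pooled_o, pooled_e

observed = [2, 8, 15, 20, 11, 4]
expected = [3.1, 9.4, 14.0, 18.5, 10.8, 4.2]

pooled_o, pooled_e = pool_small_cells(observed, expected)
df = len(pooled_e) - 1             # d.f. after pooling: (number of pooled classes) - 1
chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_o, pooled_e))

print(pooled_o, [round(e, 1) for e in pooled_e])
print(f"chi-square = {chi2:.3f} with {df} d.f.")
```

Here six classes become four after pooling, so the test uses 3 d.f. instead of 5.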

Applications of Chi-square Test

- Testing the independence of attributes
- Testing the goodness of fit (it tells you whether your sample data represent the data you would expect to find in the actual population)
- Testing of linkage in genetic problems
- Comparison of sample variance with population variance
- Testing the homogeneity of variances (UPPSC 2021)
- Testing the homogeneity of correlation coefficients
- Judging whether a theory fits well in practice

Test for independence of two Attributes of (2x2) Contingency Table

A characteristic which cannot be measured but can only be classified to one of the different levels of the character under consideration is called an attribute.
2x2 contingency table: When the individuals (objects) are classified into two categories with respect to each of the two attributes then the table showing frequencies distributed over 2x2 classes is called 2x2 contingency table.
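Suppose the individuals are classified according to two attributes, say intelligence (A) and colour (B). The distribution of frequencies over the cells is shown in the following table, where a, b, c and d denote the four cell frequencies.

| | B₁ | B₂ | Row total |
| --- | --- | --- | --- |
| A₁ | a | b | R₁ |
| A₂ | c | d | R₂ |
| Column total | C₁ | C₂ | N |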

Where
- R₁ and R₂ are the marginal totals of 1st row and 2nd row
- C₁ and C₂ are the marginal totals of 1st column and 2nd column
- N = grand total
The null hypothesis H₀: the two attributes are independent (i.e. colour does not depend on intelligence).
Under H₀, the expected frequency of each cell is (row total × column total) / grand total, so the four expected frequencies are R₁C₁/N, R₁C₂/N, R₂C₁/N and R₂C₂/N.

This method is applied to all r x c contingency tables to get the expected frequencies.
The degrees of freedom for an r x c contingency table are (r - 1) x (c - 1); for a 2 x 2 contingency table this is (2 - 1) x (2 - 1) = 1.
If the calculated value of χ² < table value of χ² at the chosen level of significance, then H₀ is accepted; otherwise we do not accept H₀.
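As a rough illustration of the r x c procedure, the sketch below computes the expected frequencies from the marginal totals and then the χ² statistic with (r - 1) x (c - 1) d.f. The 2 x 3 table of counts is invented for the example, and the function names are hypothetical. The table value 5.991 is the χ² value for 2 d.f. at the 5% level.

```python
# Independence test sketch for an r x c contingency table.
# Expected frequency of each cell = (row total * column total) / grand total.
# The counts below are illustrative assumptions, not data from the notes.

def expected_frequencies(table):
    """Return the table of expected frequencies under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for row_o, row_e in zip(table, expected)
        for o, e in zip(row_o, row_e)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

table = [[30, 20, 10],
         [20, 30, 40]]           # hypothetical 2 x 3 table of counts

chi2, df = chi_square_independence(table)
TABLE_VALUE = 5.991              # chi-square table value for 2 d.f. at the 5% level

print(expected_frequencies(table))
print(f"chi-square = {chi2:.3f} with {df} d.f.")
print("H0 accepted" if chi2 < TABLE_VALUE else "H0 not accepted")
```

For these assumed counts every expected frequency is 20 in the first row and 30 in the second, and the calculated value (about 16.67) exceeds the table value, so H₀ would not be accepted.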
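The alternative formula for calculating χ² in a 2 x 2 contingency table is:

χ² = N(ad − bc)² / (R₁R₂C₁C₂)

where a and b are the cell frequencies of the 1st row, c and d are the cell frequencies of the 2nd row, R₁ and R₂ are the row totals, C₁ and C₂ are the column totals and N is the grand total.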
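A quick numerical check, under assumed counts, that the shortcut N(ad − bc)² / (R₁R₂C₁C₂) gives the same value as summing (O − E)² / E over the four cells. The cell frequencies a, b, c, d below are hypothetical.

```python
# 2 x 2 sketch: the shortcut formula and the direct method should agree.
# a, b are the first-row cells and c, d the second-row cells (made-up values).

a, b, c, d = 30, 20, 10, 40

r1, r2 = a + b, c + d            # row totals R1, R2
col1, col2 = a + c, b + d        # column totals C1, C2
n = r1 + r2                      # grand total N

# Shortcut formula
chi2_shortcut = n * (a * d - b * c) ** 2 / (r1 * r2 * col1 * col2)

# Direct method: expected = (row total * column total) / grand total
observed = [a, b, c, d]
expected = [r1 * col1 / n, r1 * col2 / n, r2 * col1 / n, r2 * col2 / n]
chi2_direct = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"shortcut = {chi2_shortcut:.4f}, direct = {chi2_direct:.4f}")
```

Both routes give about 16.67 for these counts; the shortcut simply avoids computing the expected frequencies.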

Example

The following table shows the number of plants having certain characters. Test the hypothesis that flower colour is independent of the shape of leaf.

Solution:

Null hypothesis H₀: attributes “flower colour” and “shape of leaf” are independent of each other.
Under H₀ the statistic is χ² = ∑ (O_i − E_i)² / E_i

Expected frequencies are calculated as follows.

Direct Method:

Since the calculated value of χ² < table value of χ² at the 5% level of significance for 1 d.f., the null hypothesis is accepted, and hence we conclude that the two characters, flower colour and shape of leaf, are independent of each other.

Yates correction for continuity in a 2 x 2 contingency table

In a 2 x 2 contingency table, the number of d.f. is (2 - 1) x (2 - 1) = 1. If any one of the expected cell frequencies is less than 5, using the pooling method leaves the χ²-test with 0 d.f. (since 1 d.f. is lost in pooling), which is meaningless. In this case we apply a correction due to Yates, which is usually known as Yates' correction for continuity.
Yates correction consists of the following steps:
- Add 0.5 to the cell frequency which is the least.
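- Adjust the remaining cell frequencies in such a way that the row and column totals are not changed.
It can be shown that this correction results in the formula

χ² = N(|ad − bc| − N/2)² / (R₁R₂C₁C₂)

where a and b are the cell frequencies of the 1st row, c and d are the cell frequencies of the 2nd row, and R₁, R₂, C₁, C₂ and N are the row totals, column totals and grand total.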

Example

The following data are observed for hybrids of Datura.
- Flowers violet, fruits prickly = 47
- Flowers violet, fruits smooth = 12
- Flowers white, fruits prickly = 21
- Flowers white, fruits smooth = 3
Using chi-square test, find the association between colour of flowers and character of fruits.

Solution:

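H₀: The two attributes, colour of flowers and character of fruits, are independent.
The decision to use Yates' correction for continuity is not based on the observed values; we use it only if an expected frequency is less than 5.
Here the expected frequency for white flowers with smooth fruits is 24 × 15 / 83 = 4.34, which is less than 5, so the correction applies.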
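The test statistic is

χ² = N(|ad − bc| − N/2)² / (R₁R₂C₁C₂) = 83 × (|47 × 3 − 12 × 21| − 41.5)² / (59 × 24 × 68 × 15)

The figures in the brackets are the expected frequencies.

| | Fruits prickly | Fruits smooth | Total |
| --- | --- | --- | --- |
| Flowers violet | 47 (48.34) | 12 (10.66) | 59 |
| Flowers white | 21 (19.66) | 3 (4.34) | 24 |
| Total | 68 | 15 | 83 |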

Calculated value of χ² = 0.28
Table value of χ² for (2-1) x (2-1) = 1 d.f. at the 5% level of significance is 3.84
Since the calculated value of χ² < table value of χ², H₀ is accepted, and hence we conclude that colour of flowers and character of fruits are not associated.
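The Datura calculation can be checked with a short Python sketch that uses the four observed counts from the example. It computes the Yates-corrected value in two ways: with the formula N(|ad − bc| − N/2)² / (R₁R₂C₁C₂), and by following the two steps (add 0.5 to the smallest cell, adjust the others so the totals are unchanged) before applying the uncorrected 2 x 2 shortcut. The variable names are chosen for illustration only.

```python
# Yates' correction sketch using the Datura counts from the example.
a, b = 47, 12   # flowers violet: fruits prickly, fruits smooth
c, d = 21, 3    # flowers white:  fruits prickly, fruits smooth

r1, r2 = a + b, c + d            # row totals: 59, 24
col1, col2 = a + c, b + d        # column totals: 68, 15
n = r1 + r2                      # grand total: 83

# Smallest expected frequency (white flowers, smooth fruits)
expected_d = r2 * col2 / n

# Route 1: corrected formula
chi2_formula = n * (abs(a * d - b * c) - n / 2) ** 2 / (r1 * r2 * col1 * col2)

# Route 2: add 0.5 to the smallest cell (d) and adjust the other cells so that
# the row and column totals stay the same, then use the uncorrected shortcut.
a_adj, b_adj, c_adj, d_adj = a + 0.5, b - 0.5, c - 0.5, d + 0.5
chi2_steps = n * (a_adj * d_adj - b_adj * c_adj) ** 2 / (r1 * r2 * col1 * col2)

TABLE_VALUE = 3.84               # chi-square table value for 1 d.f. at the 5% level

print(f"smallest expected frequency = {expected_d:.2f}")
print(f"chi-square (formula) = {chi2_formula:.2f}, chi-square (steps) = {chi2_steps:.2f}")
print("H0 accepted" if chi2_formula < TABLE_VALUE else "H0 not accepted")
```

Both routes give about 0.28, in agreement with the calculated value quoted in the solution.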
