The Kolmogorov – Smirnov Test (K – S Test)
Lecture 57
The Kolmogorov-Smirnov test is a non-parametric alternative to the chi-square goodness-of-fit test. It is used to test whether the distribution underlying a sample differs from a hypothesised distribution (one-sample test), or whether two samples come from significantly different distributions (two-sample test). The one-sample test was developed by the Russian mathematician Andrey Nikolaevich Kolmogorov, and the two-sample test by Nikolai Vasilyevich Smirnov. The two tests were later combined under one name because of their similarities.
The Kolmogorov–Smirnov one-sample test
It is a non-parametric alternative to the chi-square goodness-of-fit test. The test compares the cumulative distribution function of a sample with a specified theoretical distribution from which the random sample is assumed to have been drawn.
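Let F0(X) denote the hypothesised cumulative distribution function and Fn(X) the cumulative relative frequency distribution of a random sample of size n. The test statistic is the maximum absolute difference between the two:
Dn = max |Fn(X) − F0(X)|
Alternatively, a convenient test statistic is the maximum absolute difference between the observed and theoretical cumulative frequencies.
Reject the null hypothesis if Dn exceeds the tabulated critical value for the given n and α.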
The advantage of this test over the chi-square test is that it is applicable to small samples.
The Kolmogorov–Smirnov two-sample test
This test is used to test the hypothesis that two independent samples came from the same distribution. Let Sn1(X) and Sn2(X) denote the cumulative relative frequency distributions of two independent samples of sizes n1 and n2. The Kolmogorov-Smirnov two-sample test is based on the maximum difference D, defined by
D = max |Sn1(X) − Sn2(X)|
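If n1 and n2 are large (more than 40), the test statistic to use for a one-tailed test is
χ² = 4D² n1 n2 / (n1 + n2),
which is approximately chi-square distributed with 2 degrees of freedom.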
Stream | B.Sc | B.A | B.Com | M.A | M.Com
No. of students | 5 | 9 | 11 | 16 | 19
Twelve students from each stream were expected to join the drama club. Use the K-S test to determine whether the streams differ in their intention to join the drama club.
Solution: An equal expected number of students from each stream means the hypothesised distribution is uniform.
i. State the null and alternative hypotheses:
H0: The population distribution is uniform (i.e., Fn(X) = F0(X))
vs.
H1: The population distribution is not uniform (i.e., Fn(X) ≠ F0(X)).
ii. The significance level; α = 0.04
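iii. The test statistic: D, the maximum absolute difference between the observed and theoretical cumulative frequencies.
iv. Reject H0 when D exceeds the tabulated critical value for n = 60 at the chosen significance level.
v. Computation: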
Class | Observed frequency | Cmf | Theoretical frequency | Cmf
B.Sc | 5 | 5 | 12 | 12
B.A | 9 | 14 | 12 | 24
B.Com | 11 | 25 | 12 | 36
M.A | 16 | 41 | 12 | 48
M.Com | 19 | 60 | 12 | 60
Total | 60 | | 60 |
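The computation in this table can be checked with a short Python sketch. It is only an illustration, not part of the original lecture: the variable names are made up, and the expected count of 12 per stream comes from the uniform hypothesis stated in the problem.

```python
from itertools import accumulate

# Observed counts per stream (B.Sc, B.A, B.Com, M.A, M.Com), from the table.
observed = [5, 9, 11, 16, 19]
# Uniform hypothesis: 12 students expected from each of the five streams.
expected = [12] * len(observed)

n = sum(observed)                      # total sample size, 60
cum_obs = list(accumulate(observed))   # observed Cmf column
cum_exp = list(accumulate(expected))   # theoretical Cmf column

# Largest absolute gap between the two cumulative columns, in counts.
d_count = max(abs(o - e) for o, e in zip(cum_obs, cum_exp))
# The same gap expressed on the cumulative relative frequency scale.
d_rel = d_count / n

print(d_count, round(d_rel, 4))  # 11 0.1833
```

The largest gap occurs at B.Com (25 observed against 36 expected). Whether it is significant depends on the tabulated critical value for n = 60 at the chosen α.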
Use the K-S test to test at the 5% level of significance that the sample is drawn from a uniform distribution of integer values from 1 to 6.
Solution:
i. State the null and alternative hypotheses:
H0: The population distribution is uniform. (i.e., Fn(X) = F0(X))
vs.
H1: The population distribution is not uniform (i.e., Fn(X) ≠ F0(X)).
ii. The significance level; α = 0.05
iii. The test statistic: D, the maximum absolute difference between the observed and theoretical cumulative frequencies.
iv. Reject H0 when D > 4.10
X | Observed frequency | Cmf | Theoretical frequency | Cmf
2 | 2 | 2 | 2 | 2
3 | 2 | 4 | 2 | 4
4 | 3 | 7 | 2 | 6
5 | 1 | 8 | 2 | 8
6 | 2 | 10 | 2 | 10
Total | 10 | | 10 |
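As a rough check of the decision for this example, the sketch below compares the largest gap between the two Cmf columns with the threshold of 4.10 quoted in the rejection rule above. The helper name `max_cumulative_gap` is hypothetical, and the sketch assumes the threshold is expressed in cumulative-frequency (count) units, as the table is.

```python
from itertools import accumulate

def max_cumulative_gap(observed, expected):
    """Largest absolute gap between two cumulative frequency columns (in counts)."""
    cum_obs = accumulate(observed)
    cum_exp = accumulate(expected)
    return max(abs(o - e) for o, e in zip(cum_obs, cum_exp))

observed = [2, 2, 3, 1, 2]   # observed frequencies for X = 2, 3, 4, 5, 6
expected = [2, 2, 2, 2, 2]   # theoretical frequencies from the table

d = max_cumulative_gap(observed, expected)
critical = 4.10              # threshold quoted in the rejection rule (assumed to be in counts)
reject_h0 = d > critical

print(d, reject_h0)  # 1 False
```

Under that assumption, D = 1 is well below 4.10, so H0 would not be rejected at the 5% level.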
Measurement | Frequency 1 | Frequency 2
A | 4 | 5
B | 11 | 3
C | 5 | 9
D | 7 | 6
E | 2 | 2
i. State the null and alternative hypotheses:
H0: The two samples are drawn from identical distributions (i.e., F1(X) = F2(X))
vs.
H1: The two samples are not drawn from identical distributions (i.e., F1(X) ≠ F2(X)).
ii. The significance level; α = 0.05
iii. The test statistic: D = max |Sn1(X) − Sn2(X)|
iv. Reject H0 when D > 0.183
Measurement | Frequency 1 | Cmf | Frequency 2 | Cmf
A | 4 | 4 | 5 | 5
B | 11 | 15 | 3 | 8
C | 5 | 20 | 9 | 17
D | 7 | 27 | 6 | 23
E | 2 | 29 | 2 | 25
Total | 29 | | 25 |
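For the two-sample case, the cumulative frequencies have to be divided by each sample's own size before they are compared. The following is a minimal sketch of that step for the table above. The variable names are illustrative, and the sample sizes are taken as the sums of the two frequency columns (29 and 25).

```python
from itertools import accumulate

labels = ["A", "B", "C", "D", "E"]
freq1 = [4, 11, 5, 7, 2]   # Frequency 1 column
freq2 = [5, 3, 9, 6, 2]    # Frequency 2 column

n1, n2 = sum(freq1), sum(freq2)   # sample sizes: 29 and 25

# Cumulative relative frequency distributions Sn1(X) and Sn2(X).
s1 = [c / n1 for c in accumulate(freq1)]
s2 = [c / n2 for c in accumulate(freq2)]

# Absolute gap at each measurement level; D is the largest one.
gaps = [abs(a - b) for a, b in zip(s1, s2)]
d = max(gaps)
where = labels[gaps.index(d)]

print(where, round(d, 4))  # B 0.1972
```

The largest gap, 15/29 − 8/25, occurs at measurement B. This value of D is then compared with the critical value from the rejection rule.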
X | 1.2 | 1.4 | 1.9 | 3.7 | 4.4 | 4.8 | 9.7 | 17.3 | 21.2 | 28.4
Y | 5.6 | 6.5 | 6.6 | 6.9 | 9.2 | 10.4 | 10.6 | 19.3
Use the K-S test to test the hypothesis that the two sampled populations have identical distributions at the 5% significance level.
i. State the null and alternative hypotheses:
H0: The two samples are drawn from identical distributions (i.e., F1(X) = F2(X))
vs.
H1: The two samples are not drawn from identical distributions (i.e., F1(X) ≠ F2(X)).
ii. The significance level; α = 0.05
iii. The test statistic: K-S test
vi. Remarks: The calculated K-S value falls in the acceptance region, so the sample data do not provide sufficient evidence to reject the null hypothesis. Thus, the two samples can be regarded as drawn from identically distributed populations.
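The statistic D for the X and Y samples can be computed directly from the raw values. The sketch below is an illustration rather than the lecture's own working: the helper `ecdf` is a hypothetical name, and the code evaluates both empirical distribution functions at every pooled data point, which is where the step functions change.

```python
from bisect import bisect_right

x = [1.2, 1.4, 1.9, 3.7, 4.4, 4.8, 9.7, 17.3, 21.2, 28.4]   # n1 = 10
y = [5.6, 6.5, 6.6, 6.9, 9.2, 10.4, 10.6, 19.3]             # n2 = 8

def ecdf(sorted_sample, t):
    """Fraction of sample values less than or equal to t."""
    return bisect_right(sorted_sample, t) / len(sorted_sample)

xs, ys = sorted(x), sorted(y)

# D = max |Sn1(t) - Sn2(t)| over all pooled observations t.
d = max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in xs + ys)

print(round(d, 3))  # 0.6
```

The maximum is reached at 4.8, where six of the ten X values but none of the Y values have been observed. If SciPy is available, `scipy.stats.ks_2samp(x, y)` reports the same statistic together with a p-value.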