Introduction to Non-Parametric Tests Lecture 49


Introduction

The parametric statistical tests (Z, t, F, etc.) are a fundamental approach for analysing data, but they rest on certain restrictive assumptions. The most common assumptions are:

i. The sample observations should be independent.

ii. The sample should be selected from a normally distributed population.

iii. The variables involved must have been measured on an interval scale.

However, there are many situations, particularly in the social sciences, where the assumptions of interval-scale measurement and normality are not met and the parametric tests cannot be validly applied. In such situations, the data can be analysed with non-parametric tests.

Non-parametric tests are those carried out without making any restrictive assumptions about the form of the underlying distribution. For this reason, they are also called distribution-free tests. Whereas parametric tests focus on the mean and variance, non-parametric tests typically focus on the median.

Following are the pros and cons of non-parametric tests.

Pros of Non-Parametric Tests

i. The non-parametric tests are valid when the shape of the sampled population is not known or non-normal.

ii. The non-parametric tests can be applied to qualitative (nominal or ordinal) data.

iii. The non-parametric tests can be used to test the hypotheses that do not involve parameters.

iv. The application of non-parametric tests is easier in some cases.

v. The non-parametric tests are easy to understand and interpret.

Cons of Non-Parametric Tests

i. The non-parametric tests are less efficient than parametric tests.

ii. The non-parametric tests are geared towards hypothesis testing rather than estimation of effects.

iii. The non-parametric tests are tedious and time-consuming when the sample is large.

THE SIGN TEST

The sign test is the oldest non-parametric test. With a single sample, it is an alternative to the one-sample t-test; with two samples, it is an alternative to the matched-pair two-sample t-test.

In the case of one sample:

In the case of one sample, we test the hypothesis that P(+ sign) = P(- sign), which is equivalent to testing the hypothesis that the population median assumes a specified value m0. Compare each observation X with the hypothesised value m0: replace the observation with "+" if X > m0 and with "-" if X < m0, and ignore it if X = m0.

In the case of two samples:

In the case of two samples, we test the hypothesis that the medians of two populations are identical. That is,

H0: median 1 = median 2

Each pair of observations is replaced by a "+" sign if X (sample 1) > Y (sample 2) and by a "-" sign if X < Y. Pairs with X = Y are dropped from the analysis.

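The test statistic, denoted by x, is the number of less frequent signs. Under H0, X follows a binomial distribution with parameters n and ½, where n is the number of observations left after the ties are dropped. The required probability is calculated as:

P(X ≤ x) = [C(n, 0) + C(n, 1) + ... + C(n, x)] × (½)^n

Reject H0 if P(X ≤ x) < α (in the case of a one-tailed test).
Reject H0 if P(X ≤ x) < α/2 (in the case of a two-tailed test).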
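
The calculation of P(X ≤ x) can be sketched in Python with the standard library only. This is a minimal illustration, not part of the lecture: the function name sign_test_tail_probability and the encoding of signs as the strings "+", "-" and "0" are assumptions made for the example.

```python
from math import comb


def sign_test_tail_probability(signs):
    """Return (n, x, P(X <= x)) for a sequence of '+', '-' and '0' signs."""
    plus = sum(1 for s in signs if s == "+")
    minus = sum(1 for s in signs if s == "-")
    n = plus + minus        # ties ('0') are dropped from the analysis
    x = min(plus, minus)    # number of less frequent signs
    # X ~ Binomial(n, 1/2), so P(X <= x) = sum_{k=0}^{x} C(n, k) / 2^n
    p = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return n, x, p


# Hypothetical input: five '+', two '-' and one tie
n, x, p = sign_test_tail_probability(["+", "+", "+", "-", "+", "+", "0", "-"])
print(n, x, round(p, 4))
```

The returned probability is then compared with α for a one-tailed test or with α/2 for a two-tailed test.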
When to use normal approximation
If the sample size n is sufficiently large, then X (the number of less frequent signs) approximately follows a normal distribution with mean n/2 and variance n/4.

With the continuity correction, the test statistic is:

z = ((x + 0.5) - n/2) / (√n / 2)
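
A minimal Python sketch of the normal approximation, assuming the usual continuity-corrected statistic z = ((x + 0.5) - n/2) / (√n / 2), where x is the number of less frequent signs and n is the number of non-tied observations. The function names are illustrative assumptions; NormalDist comes from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(n, x):
    """Continuity-corrected z for x less frequent signs among n non-tied observations."""
    mean = n / 2          # mean of Binomial(n, 1/2)
    sd = sqrt(n) / 2      # square root of the variance n/4
    return (x + 0.5 - mean) / sd


def two_tailed_critical_z(alpha):
    """Critical value z such that P(|Z| >= z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical figures: 16 non-tied observations, 6 less frequent signs
print(sign_test_z(16, 6))                     # -0.75
print(round(two_tailed_critical_z(0.05), 2))  # 1.96
```

H0 is rejected in a two-tailed test when |z| is at least the critical value.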

Example 13.1: Ten pupils received the following scores on a quiz: 6, 10, 8, 9, 10, 7, 8, 8, 7, 6. In last year's comparable test, the average score was 7. The distribution is non-normal and its shape is unknown. Use the sign test at the 5% significance level to test the hypothesis that the population median is 7 against the alternative that it is not.
Solution:
i. State the null and alternative hypotheses:
H0: median = 7 vs. H1: median ≠ 7
ii. The significance level; α = 0.05
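iii. The test statistic: x = the number of less frequent signs, where X follows a binomial distribution with parameters n and ½.
iv. Critical Region:
Reject H0 if P(X ≤ x) < 0.025.
v. Computation:

X = 6, 10, 8, 9, 10, 7, 8, 8, 7, 6
X - median = -, +, +, +, +, 0, +, +, 0, -

The two zeros are dropped, so n = 8. The number of less frequent signs is x = 2.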


vi. Remarks: P(X ≤ 2) = 0.1445, which is more than 0.025, so we do not have sufficient evidence to reject H0. Thus, we conclude that this year's pupils performed similarly to last year's pupils.
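
The figures in Example 13.1 can be checked with a short Python sketch. The variable names are illustrative assumptions; the scores and the hypothesised median come from the example.

```python
from math import comb

scores = [6, 10, 8, 9, 10, 7, 8, 8, 7, 6]
m0 = 7  # hypothesised median

plus = sum(1 for s in scores if s > m0)    # scores above the median
minus = sum(1 for s in scores if s < m0)   # scores below the median
n = plus + minus                           # the two scores equal to 7 are dropped
x = min(plus, minus)                       # number of less frequent signs

# Exact binomial tail probability P(X <= x) with X ~ Binomial(n, 1/2)
p = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
print(n, x, round(p, 4))   # 8 2 0.1445
print(p < 0.025)           # False, so H0 is not rejected
```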

Example 13.2: According to a college lecturer, the median score on his students' most recent exam was 58. Based on the scores of 16 randomly selected tests, listed below, can you deny the lecturer's assertion at the 1% significance level? Apply the sign test.
58, 62, 55, 53, 52, 59, 55, 55, 60, 56, 57, 61, 58, 63, 63, 55
Solution: 
i. State the null and alternative hypotheses:
H0: median = 58 vs. H1: median ≠ 58
ii. The significance level; α = 0.01
iii. The test statistic: The sample size is large, so we can use the normal approximation, z = ((x + 0.5) - n/2) / (√n / 2).
iv. Critical Region:
Reject H0 when |z| ≥ 2.576
v. Computation:
X = 58, 62, 55, 53, 52, 59, 55, 55, 60, 56, 57, 61, 58, 63, 63, 55
X - median = 0, +, -, -, -, +, -, -, +, -, -, +, 0, +, +, -
After dropping the two zeros, n = 14 and x = 6, so z = (6.5 - 7) / (√14 / 2) ≈ -0.27.
vi. Remarks: The calculated z value falls in the acceptance region; the sample data do not provide sufficient evidence to reject the null hypothesis. Thus, it is concluded that the lecturer's assertion cannot be denied.
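
Example 13.2 can be reproduced with the sketch below. It follows the rule that scores equal to the hypothesised median are dropped, which leaves n = 14 non-tied scores, and it assumes the continuity-corrected statistic z = ((x + 0.5) - n/2) / (√n / 2). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

scores = [58, 62, 55, 53, 52, 59, 55, 55, 60, 56, 57, 61, 58, 63, 63, 55]
m0 = 58       # hypothesised median
alpha = 0.01  # significance level of the two-tailed test

plus = sum(1 for s in scores if s > m0)
minus = sum(1 for s in scores if s < m0)
n = plus + minus       # the two scores equal to 58 are dropped
x = min(plus, minus)   # number of less frequent signs

z = (x + 0.5 - n / 2) / (sqrt(n) / 2)         # continuity-corrected z
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

print(n, x, round(z, 3), round(z_crit, 3))    # 14 6 -0.267 2.576
print(abs(z) >= z_crit)                       # False, so H0 is not rejected
```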

Example 13.3: A manufacturer claims that routine maintenance increases the production of good products and decreases the number of defective products. Eight similar machines were selected, and the number of defective products produced in a shift was counted before and after maintenance. The counts are given below:

Machine   1   2   3   4   5   6   7   8
Before   15   3  10   4   2   5   6   2
After     7   2   6   5   0   4   6   4

Test the hypothesis that maintenance improves the machines. Apply the sign test at the 0.05 level of significance.

Solution: 

i. State the null and alternative hypotheses:

H0: Median (before) = Median (after) vs. H1: Median (before) ≠ Median (after)

ii. The significance level: α = 0.05

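iii. The test statistic: x = the number of less frequent signs, where X follows a binomial distribution with parameters n and ½.

iv. Critical Region:
Reject H0 if P(X ≤ x) < 0.025
v. Computation:

Before   15   3  10   4   2   5   6   2
After     7   2   6   5   0   4   6   4
Sign      +   +   +   -   +   +   0   -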


n = 7, x = 2

vi. Remarks: The computed probability is P(X ≤ 2) = 29/128 ≈ 0.2266, which is more than 0.025. The sample data do not provide sufficient evidence to reject the null hypothesis. Thus, it is concluded that there is no evidence that maintenance changes the number of defective products.
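
The paired version of the test in Example 13.3 can be sketched in the same way. The variable names are illustrative assumptions; the before and after counts come from the example.

```python
from math import comb

before = [15, 3, 10, 4, 2, 5, 6, 2]
after = [7, 2, 6, 5, 0, 4, 6, 4]

# Sign of (before - after) for each machine; equal pairs give neither sign
plus = sum(1 for b, a in zip(before, after) if b > a)
minus = sum(1 for b, a in zip(before, after) if b < a)
n = plus + minus       # machine 7 (6 before, 6 after) is dropped
x = min(plus, minus)   # number of less frequent signs

# Exact binomial tail probability P(X <= x) with X ~ Binomial(n, 1/2)
p = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
print(n, x, round(p, 4))   # 7 2 0.2266
print(p < 0.025)           # False, so H0 is not rejected
```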
