Provide Information Regarding Statistics & Econometrics : Point Biserial Correlation Lecture 59

Point Biserial Correlation

Lecture 59

The point biserial correlation is a statistical measure of the association between a natural dichotomous variable and a continuous variable. A natural dichotomous variable has two natural categories, such as 'male / female' or 'yes / no'. The point biserial correlation is a special case of the Pearson correlation and is based on the following assumptions.

i. There are no outliers in the continuous variable.

ii. The continuous variable follows a normal distribution, at least approximately.

iii. The variance of the continuous variable is homogeneous for both categories of the natural dichotomous variable.

For example, the association between study hours (continuous variable) and gender (natural dichotomous variable) can be measured by the point biserial correlation.
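As a rough illustration of assumption iii, the sample variances of the two categories can be compared directly. The sketch below is only an informal check: the helper name `variance_ratio` and the study-hours values are hypothetical, made up for illustration. A ratio close to 1 is consistent with homogeneous variances.

```python
from statistics import variance


def variance_ratio(x, groups):
    """Return the larger group sample variance divided by the smaller one."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("groups must contain exactly two categories")
    # Sample variance (divisor n - 1) of the continuous variable in each category.
    group_vars = [
        variance([xi for xi, g in zip(x, groups) if g == label])
        for label in labels
    ]
    return max(group_vars) / min(group_vars)


# Hypothetical study hours for eight students (illustrative values only).
hours = [2.0, 3.5, 4.0, 5.0, 3.0, 4.5, 6.0, 5.5]
gender = ["M", "M", "M", "M", "F", "F", "F", "F"]

ratio = variance_ratio(hours, gender)
print(f"variance ratio = {ratio:.2f}")
```

This is a descriptive check only and does not replace a formal test of equal variances.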

Coefficient of Point Biserial Correlation

The point biserial correlation coefficient is a numerical quantity that measures the strength of the linear association between a natural dichotomous variable and a continuous variable. It is denoted by ρpb for the population and by rpb for the sample.

The point biserial correlation between a dichotomous variable, with natural categories “p” and “q”, and a continuous variable is denoted by “rpb” and given by:

rpb = ((X̄p − X̄q) / s) × √(Pp × Pq)

Where:

X̄p is the mean of the interval variable’s values associated with the dichotomous variable’s first category.

X̄q is the mean of the interval variable’s values associated with the dichotomous variable’s second category.

s is the standard deviation of the variable on the interval scale.

Pp is the proportion of the interval variable’s values associated with the dichotomous variable’s first category.

Pq is the proportion of the interval variable’s values associated with the dichotomous variable’s second category.

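The mean and proportion of the dichotomous variable's first “p” category:

X̄p = ΣXp / np, Pp = np / n

The mean and proportion of the dichotomous variable's second “q” category:

X̄q = ΣXq / nq, Pq = nq / n

Here np and nq are the numbers of observations in categories “p” and “q”, and n = np + nq.

The standard deviation “s” can be obtained as:

s = √( ΣX² / n − (ΣX / n)² )

The divisor here is n, not n − 1. With this choice, the coefficient equals the Pearson correlation between the continuous variable and the dichotomous variable coded as 0 and 1.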
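The quantities defined above (category means, proportions, and the standard deviation s) can be combined in a few lines of Python. This is a minimal sketch, assuming the usual form rpb = ((X̄p − X̄q) / s) × √(Pp × Pq) with s computed using the divisor n. The function names and the toy data are hypothetical. The `pearson` helper is included only to check that the result matches the ordinary Pearson correlation with the dichotomous variable coded 1/0.

```python
import math


def point_biserial(x, groups, first):
    """Point biserial correlation; `first` names the category treated as p."""
    n = len(x)
    xp = [xi for xi, g in zip(x, groups) if g == first]
    xq = [xi for xi, g in zip(x, groups) if g != first]
    if not xp or not xq:
        raise ValueError("both categories must contain observations")
    mean_p = sum(xp) / len(xp)
    mean_q = sum(xq) / len(xq)
    p_p, p_q = len(xp) / n, len(xq) / n
    # Standard deviation of all values, divisor n (assumed convention).
    s = math.sqrt(sum(xi * xi for xi in x) / n - (sum(x) / n) ** 2)
    return (mean_p - mean_q) / s * math.sqrt(p_p * p_q)


def pearson(x, y):
    """Ordinary Pearson correlation, used here only as a cross-check."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (illustrative only): three values in each category.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = ["p", "p", "p", "q", "q", "q"]

r = point_biserial(x, groups, first="p")
coded = [1 if g == "p" else 0 for g in groups]
print(round(r, 4), round(pearson(x, coded), 4))
```

The sign of the coefficient depends only on which category is labelled “p”; swapping the labels flips the sign and leaves the magnitude unchanged.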

To test H0: ρpb = 0, the following test statistic is used if the sample size is small:

t = rpb × √(n − 2) / √(1 − rpb²)

Under H0, this statistic follows the t distribution with n − 2 degrees of freedom.

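If the sample size is large, the t distribution with n − 2 degrees of freedom is close to the standard normal distribution, so the test statistic can be compared with standard normal critical values.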
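The small-sample test statistic can be sketched as a short function. This assumes the usual t statistic for a correlation coefficient, t = r × √(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The function name `t_statistic` and the input values are hypothetical. The critical value is still read from a t table, since the Python standard library does not provide t quantiles.

```python
import math


def t_statistic(r_pb, n):
    """t statistic for H0: rho_pb = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r_pb) >= 1:
        raise ValueError("r_pb must lie strictly between -1 and 1")
    return r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)


# Illustrative values only: r_pb = -0.5 from a sample of n = 27.
t = t_statistic(-0.5, 27)
print(f"t = {t:.3f}, df = {27 - 2}")
```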

Example 13.22: A researcher examining gender differences wanted to evaluate how well men and women could identify and remember visual features. The researcher recruited 17 individuals, 9 women and 8 men, who were initially unaware of the experiment. The participants were asked to wait in a room filled with different items. Afterwards, the researcher invited each participant to complete a 30-question post-test about various features of the room. The post-test results and the participants' genders are displayed in the following table:

Participants	Gender	Score
1	M	7
2	M	19
3	M	8
4	M	10
5	M	7
6	M	15
7	M	6
8	M	13
9	F	14
10	F	11
11	F	18
12	F	23
13	F	17
14	F	20
15	F	14
16	F	24
17	F	22

The researcher wants to know the association between gender and score. Test the null hypothesis that there is no association between gender and score.

Solution: First calculate the point biserial correlation and then test the hypothesis.

Participants	Gender	X	X^2
1	M	7	49
2	M	19	361
3	M	8	64
4	M	10	100
5	M	7	49
6	M	15	225
7	M	6	36
8	M	13	169
9	F	14	196
10	F	11	121
11	F	18	324
12	F	23	529
13	F	17	289
14	F	20	400
15	F	14	196
16	F	24	576
17	F	22	484
Total		248	4168

Let p represent the male category and q the female category.

np = 8, nq = 9, n = 17

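The mean and proportion of category "p" (male):

X̄p = 85 / 8 = 10.625, Pp = 8 / 17 ≈ 0.471

The mean and proportion of category "q" (female):

X̄q = 163 / 9 ≈ 18.111, Pq = 9 / 17 ≈ 0.529

The standard deviation of the score (divisor n):

s = √( 4168 / 17 − (248 / 17)² ) = √(245.176 − 212.817) ≈ 5.689

The coefficient of the point biserial correlation is given by:

rpb = ((10.625 − 18.111) / 5.689) × √(0.471 × 0.529) ≈ −0.657

The negative sign reflects that the mean score of the male category is lower than that of the female category.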

Now test the hypothesis:

i. State the null and alternative hypotheses:

H0: ρpb = 0 vs. H1: ρpb ≠ 0

ii. The significance level: α = 0.05

iii. The test statistic: The sample size is small, so the following test statistic is used:

t = rpb × √(n − 2) / √(1 − rpb²), with n − 2 degrees of freedom.

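iv. Critical region: Reject H0 when |t| > 2.131, the critical value of the t distribution with n − 2 = 15 degrees of freedom at α/2 = 0.025.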

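v. Computation: Using rpb ≈ −0.657 computed from the table and n = 17:

t = −0.657 × √15 / √(1 − (−0.657)²) ≈ −3.37, so |t| ≈ 3.37 > 2.131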

vi. Remarks: The calculated t value falls in the rejection region, so the sample data provide sufficient evidence to reject the null hypothesis. Thus, it is concluded that there exists a relationship between gender and score.
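The worked example can be reproduced with the Python standard library. This is a sketch under two assumptions: the standard deviation s uses the divisor n, and the test statistic is t = r × √(n − 2) / √(1 − r²). The variable names are illustrative, and the critical value 2.131 is taken from the text rather than computed.

```python
import math

# Post-test scores from the table, split by gender (p = male, q = female).
scores_m = [7, 19, 8, 10, 7, 15, 6, 13]
scores_f = [14, 11, 18, 23, 17, 20, 14, 24, 22]
all_scores = scores_m + scores_f

n_p, n_q = len(scores_m), len(scores_f)
n = n_p + n_q

# Column totals of the working table.
sum_x = sum(all_scores)
sum_x2 = sum(v * v for v in all_scores)

# Category means and proportions.
mean_p = sum(scores_m) / n_p
mean_q = sum(scores_f) / n_q
p_p, p_q = n_p / n, n_q / n

# Standard deviation of all scores, divisor n (assumed convention).
s = math.sqrt(sum_x2 / n - (sum_x / n) ** 2)

# Point biserial correlation and its t statistic with n - 2 degrees of freedom.
r_pb = (mean_p - mean_q) / s * math.sqrt(p_p * p_q)
t = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)

t_crit = 2.131  # two-sided critical value at alpha = 0.05, df = 15 (from the text)
reject = abs(t) > t_crit

print(f"sum X = {sum_x}, sum X^2 = {sum_x2}")
print(f"mean_p = {mean_p:.3f}, mean_q = {mean_q:.3f}, s = {s:.3f}")
print(f"r_pb = {r_pb:.3f}, t = {t:.3f}, reject H0: {reject}")
```

If SciPy is available, `scipy.stats.pointbiserialr` applied to a 0/1 coding of gender and the scores offers an independent cross-check of the coefficient; the sign depends on which category is coded 1.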
