Probability & Non Probability Sampling Methods Lecture 24


Probability Sampling

In probability sampling, every unit of the population has a known, non-zero chance of being selected.

The following are the main methods of probability sampling.

Simple Random Sampling

Simple random sampling is the most basic technique, where each unit of the population has an equal probability of being selected and each selection is independent.

Let a population consist of N units, and let a simple random sample of size n be selected with or without replacement. The total number of possible samples is then:

With replacement: N^n

Without replacement: C(N, n) = N! / (n! (N - n)!)

Example 7.1: How many possible samples of size 2 can be selected from a population of size 5 i) with replacement and ii) without replacement?

i. Total possible samples with replacement: N^n = 5^2 = 25

ii. Total possible samples without replacement: C(N, n) = C(5, 2) = 5! / (2! 3!) = 10
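
The counting rules for simple random samples (N^n with replacement, C(N, n) without replacement) can be checked with a few lines of Python. This is only a minimal sketch; the function name count_samples is an illustrative choice, not part of the lecture.

```python
from math import comb

def count_samples(N: int, n: int, with_replacement: bool) -> int:
    """Number of possible samples of size n from a population of N units."""
    if with_replacement:
        # Every draw has N choices and units may repeat: N^n ordered samples.
        return N ** n
    # Units cannot repeat and order is ignored: N choose n.
    return comb(N, n)

# Example 7.1: population size 5, sample size 2
print(count_samples(5, 2, with_replacement=True))   # 25
print(count_samples(5, 2, with_replacement=False))  # 10
```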

Advantages

i. It is free from errors of classification, since units are not grouped before selection.

ii. It tends to be representative of the population.

iii. It is simple to use.

iv. It is free from personal bias.

Disadvantages


i. Truly random selection is difficult to achieve in practice.

ii. It performs poorly when the population is heterogeneous.

iii. It makes no use of prior knowledge about the population.

iv. It is practical only on a small scale.

Stratified Random Sampling 

Stratified random sampling is a probability sampling method in which a heterogeneous population of size N is divided into k homogeneous groups of sizes N1, N2, ..., Nk. The groups are known as strata, and each group is called a stratum. Within every stratum, simple random sampling is used to choose a sample: n1 units from N1, n2 units from N2, and so forth.

Advantages
i. Stratified sampling is an effective method of random sampling for collecting information from a large population.
ii. It allows the investigator to manage a large population.
iii. It yields more precise estimates when the strata are internally homogeneous.
iv. It allows the sample size to vary from stratum to stratum.
v. It helps to collect information from a more diverse range of data.

Disadvantages
i. It is difficult to use in non-statistical studies.
ii. Complete information about the structure of the population is required, which is extremely difficult to obtain in real-life studies.

Example 7.2: There are 2000 students studying in a college. The students are split into four groups: male intermediate students make up 34%, female intermediate students make up 21%, male graduate students make up 28%, and female graduate students make up 17%.
Calculate the sizes of the strata.

Solution:

N = 2000
N1: Male intermediate students = 34%
N2: Female intermediate students = 21%
N3: Male graduate students = 28%
N4: Female graduate students = 17%

N1 = N × 0.34 = 2000 × 0.34 = 680
N2 = N × 0.21 = 2000 × 0.21 = 420
N3 = N × 0.28 = 2000 × 0.28 = 560
N4 = N × 0.17 = 2000 × 0.17 = 340
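
The stratum sizes of Example 7.2 can be reproduced with a short Python sketch. The second function goes one step beyond the example: it splits an assumed total sample of 200 students across the strata in proportion to their sizes. Proportional allocation and the value 200 are illustrative assumptions, not something the lecture specifies.

```python
def stratum_sizes(population_size, shares):
    """Return the size of each stratum given its share of the population."""
    # round() guards against floating-point artifacts in products like 2000 * 0.34
    return {name: round(population_size * share) for name, share in shares.items()}

def proportional_allocation(sizes, sample_size):
    """Split a total sample across strata in proportion to stratum size."""
    total = sum(sizes.values())
    return {name: round(sample_size * size / total) for name, size in sizes.items()}

shares = {
    "male intermediate": 0.34,
    "female intermediate": 0.21,
    "male graduate": 0.28,
    "female graduate": 0.17,
}

sizes = stratum_sizes(2000, shares)
allocation = proportional_allocation(sizes, sample_size=200)  # 200 is an assumed sample size

print(sizes)       # stratum sizes N1..N4
print(allocation)  # sample sizes n1..n4 under proportional allocation
```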

Systematic Random Sampling

When the population is homogeneous and a sampling frame is available, simple random sampling is used. If the population is homogeneous but a sampling frame is not available, another method of probability sampling, known as systematic random sampling, is used.

In the systematic random sampling method, a fixed interval k = N / n, known as the sampling interval, is calculated first. The first unit is chosen at random from the first k units, and the remaining units are then chosen according to a predesigned pattern: every kth unit after the first.

Assume that k = N / n is an integer and that the N population units are serially numbered from 1 to N. Let the ith unit be selected at random from the first k units. The following units will then be included in the sample:

i, i + k, i + 2k, ..., i + (n - 1)k

Example 7.3: A population consists of 100 units. How is a sample of size 4 selected by systematic random sampling? Demonstrate it with a diagram.

Here N = 100 and n = 4, so the sampling interval is k = 100 / 4 = 25. If, for instance, unit i = 7 is drawn from the first 25 units, the sample consists of units 7, 32, 57, and 82.
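
The selection rule for systematic random sampling can be sketched in Python as follows. This is a minimal illustration that assumes N / n is an integer, as in the text; the function name and the fixed seed are illustrative choices only.

```python
import random

def systematic_sample(N, n, rng=None):
    """Select n of the units numbered 1..N by systematic random sampling."""
    if N % n != 0:
        raise ValueError("this sketch assumes k = N / n is an integer")
    rng = rng or random.Random()
    k = N // n                 # sampling interval
    start = rng.randint(1, k)  # random start among the first k units (inclusive)
    # The sample is start, start + k, start + 2k, ..., start + (n - 1)k
    return [start + j * k for j in range(n)]

# Example 7.3: N = 100, n = 4, so k = 25
sample = systematic_sample(100, 4, rng=random.Random(0))
print(sample)
```

Whatever the random start, consecutive selected units are always 25 apart and the last one never exceeds 100.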

Advantages

i. The sample can be selected quickly and easily.
ii. There is a low risk of data manipulation.
iii. The sample is spread evenly over the population and can cover a large population.
iv. Systematic random sampling is a specialized technique used in marketing assessments.
v. If the process is properly organized and controlled, the chance of sampling error is minimal.

Disadvantages

i. Assigning numbers to the observations or objects can be a difficult task.
ii. The fixed interval can introduce bias if the population has a periodic pattern that coincides with the interval.

Cluster Random Sampling
Sometimes the population is naturally separated into diverse groups, which are referred to as clusters. In such situations, the groups (clusters) are identified and one or a few clusters are selected at random, either by simple random sampling or by systematic random sampling. Each selected cluster is then either subsampled or included in the sample in full.

The basic requirement of cluster random sampling is that each cluster should be internally heterogeneous, while different clusters should be similar to one another.

Step-by-Step Guide to Cluster Sampling

This is a condensed guide to performing cluster sampling:

Step 1: Define the Population and Clusters:

Define the target population precisely first. Identify the natural clustering of the population.

Step 2: Choose Clusters at Random:

From the specified population, choose clusters at random, for example by simple random sampling or systematic random sampling.

Step 3: Determine Cluster Size:

Choose how many elements (households, persons, etc.) will be included in the study for each chosen cluster.

Step 4: Sample Within Clusters:

Once clusters have been chosen, sample the elements within each cluster according to the cluster size that has already been specified.

Step 5: Gather Information:

Gather information from each chosen cluster's sampled elements.
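
The steps above can be sketched in Python. The population below (8 villages of 20 households each) is entirely hypothetical, and the function and parameter names are illustrative assumptions; clusters are drawn by simple random sampling, and each chosen cluster is either subsampled or taken in full.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, rng=None):
    """Pick n_clusters at random, then take all units or a subsample of each."""
    rng = rng or random.Random()
    # Step 2: choose clusters by simple random sampling
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        units = clusters[name]
        if per_cluster is None or per_cluster >= len(units):
            # Include every unit of the selected cluster
            sample[name] = list(units)
        else:
            # Steps 3-4: subsample a fixed number of units within the cluster
            sample[name] = rng.sample(units, per_cluster)
    return sample

# Step 1: hypothetical population, naturally grouped into villages
clusters = {
    f"village_{v}": [f"household_{v}_{h}" for h in range(1, 21)]
    for v in range(1, 9)
}

sample = cluster_sample(clusters, n_clusters=3, per_cluster=5, rng=random.Random(1))
full = cluster_sample(clusters, n_clusters=2, rng=random.Random(1))
for village, households in sample.items():
    print(village, households)
```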

Advantages

i. Cluster random sampling saves money and time.
ii. It is a convenient method of sampling for geographically dispersed populations.
iii. Cluster sampling is more reliable if the population is properly clustered.
iv. Cluster random sampling allows for more manageable and focused studies.

Disadvantages

i. Compared to other probability sampling techniques, cluster sampling carries a larger risk of bias.
ii. Cluster sampling is considered more difficult to design and analyze than other sampling techniques.

Multistage Sampling
Multistage sampling is a flexible probability sampling technique in which the sample is selected in several stages. At each stage the sample progressively narrows from the general population to smaller, more specific units. Simple random sampling is expensive and impractical when large samples must be drawn from widely dispersed populations; in these cases, multistage sampling is helpful.

In multistage sampling, the population is divided into a number of units, called first-stage units, from which a sample is selected. Each selected first-stage unit is divided into second-stage units, which are subsampled. Each of the selected second-stage units is further divided into third-stage units, from which a subsample is again selected, and so on. In a multistage sample, the sample size is the number of units included in the sample at the final stage of sampling.

Multistage sampling differs from cluster sampling in that cluster sampling uses all the observations within a selected cluster, whereas multistage sampling selects a sample within each selected cluster.

How Can Multistage Sampling Be Put Into Practice?

1. Define the Population

2. Divide It into Clusters

3. Randomly Select Clusters

4. Select Sampling Units from Every Selected Cluster
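
A three-stage version of these steps might look like the following Python sketch. The frame (districts, schools, students) and the stage sizes are hypothetical, chosen only to show how the sample narrows at each stage; simple random sampling is assumed at every stage.

```python
import random

def multistage_sample(frame, sizes, rng):
    """Draw sizes[0] units at this stage, then recurse into each selected unit.

    frame is a nested dict whose innermost values are lists of final-stage units.
    """
    if isinstance(frame, dict):
        chosen = rng.sample(sorted(frame), sizes[0])
        return {key: multistage_sample(frame[key], sizes[1:], rng) for key in chosen}
    # Final stage: frame is a plain list of units
    return rng.sample(frame, sizes[0])

def count_units(sample):
    """Sample size = number of units selected at the final stage."""
    if isinstance(sample, dict):
        return sum(count_units(part) for part in sample.values())
    return len(sample)

# Hypothetical frame: 4 districts, 3 schools per district, 10 students per school
frame = {
    f"district_{d}": {
        f"school_{d}_{s}": [f"student_{d}_{s}_{i}" for i in range(1, 11)]
        for s in range(1, 4)
    }
    for d in range(1, 5)
}

# Stage sizes (assumed): 2 districts, then 2 schools each, then 3 students each
sample = multistage_sample(frame, sizes=[2, 2, 3], rng=random.Random(42))
print(count_units(sample))  # 2 * 2 * 3 = 12
```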

Advantages 

i. It is less costly.

ii. It requires less effort.

iii. It helps to analyze large populations.

iv. Deep intuition about the population is developed.

Disadvantages

i. There is a risk of major bias.

ii. There is a risk of sampling error.

Non-Probability Sampling

Non-probability sampling is a sampling technique in which not every member of the population has a known chance of being included in the sample. The selection of sampling units is based on the investigator's judgment or expertise. Non-probability sampling is most useful for exploratory studies, such as pilot surveys.

 Types of Non-Probability Sampling

1. Purposive Sampling

2. Quota Sampling

3. Snowball Sampling

Purposive Sampling

Purposive sampling is a non-probability sampling method in which the selection of sampling units is based on a researcher’s expertise about the population.

A purposive sample is liable to bias introduced by the deliberate subjective choice of the researcher who selects the sample.

Advantages

i. It is the most straightforward sampling technique.

ii. It is less time-consuming and inexpensive.

iii. It is effective for conducting subjective studies.

iv. It produces few non-responding units.

Disadvantages

i. A purposive sample is not suitable when the study has multiple objectives.

ii. There is a risk of bias.

iii. It is applicable only on a small scale.

Quota Sampling

Quota sampling is a non-probability sampling technique in which the population is divided into groups on the basis of defined characteristics, and a fixed number of units, called a quota, is selected from each group, e.g., a quota of men and women, or of urban and rural residents. These characteristics are termed quota controls.
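
As a rough illustration, quota sampling can be sketched in Python as accepting respondents in whatever order they are encountered until each group's quota is filled. The respondent list, the group labels, and the quotas below are all hypothetical, and no random selection is involved, which is exactly what makes the method non-probabilistic.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in encounter order until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        group = person["group"]
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return selected

# Hypothetical respondents in the order the interviewer meets them
respondents = [
    {"name": "r1", "group": "urban"},
    {"name": "r2", "group": "urban"},
    {"name": "r3", "group": "rural"},
    {"name": "r4", "group": "urban"},
    {"name": "r5", "group": "rural"},
    {"name": "r6", "group": "urban"},
    {"name": "r7", "group": "rural"},
]

selected = quota_sample(respondents, quotas={"urban": 2, "rural": 2})
print([person["name"] for person in selected])  # ['r1', 'r2', 'r3', 'r5']
```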


Advantages

i. A quota sample is easy to administer.

ii. It is less time-consuming and less expensive.

iii. Quota samples are extensively used in government organizations.

iv. It does not need a sampling frame.

Disadvantages

i. Selection is non-random, so there is a risk of bias.

ii. It reflects only the quota characteristics and may ignore other segments of the population.

Snowball Sampling

Snowball sampling is a non-probability sampling technique used where the units of interest (participants) are difficult to locate in the target population. In the snowball method, the researcher locates a unit of interest in the target population and then collects information from that unit about other units it knows directly or indirectly.

The researcher recruits new units through the referrals of the previously selected units, and this referral process continues, increasing the size of the sample like a snowball rolling down a hill, until the researcher has sufficient data to analyze. Snowball sampling is also called chain referral sampling.

Snowball sampling consists of two steps:

1. Initially identify one or two units in the population.

2. Use chain referral technique and increase the sample size.
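
The two steps can be sketched in Python as a breadth-first walk over a referral network. The network below is hypothetical, and stopping at a target sample size is an assumed stand-in for "sufficient data to analyze".

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Grow a sample by following referrals from the initial units (seeds)."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(selected) < target_size:
        person = queue.popleft()
        selected.append(person)
        # Ask the selected unit for the other units it knows
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return selected

# Hypothetical referral network: who can refer the researcher to whom
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}

sample = snowball_sample(referrals, seeds=["A"], target_size=5)
print(sample)  # ['A', 'B', 'C', 'D', 'E']
```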

Advantages

i. It is very helpful in surveys on sensitive or confidential topics.

ii. It makes it possible to conduct studies that could not otherwise be carried out for lack of participants.

iii. It is helpful for studies of medical conditions such as HIV or of social events such as divorce.

iv. Many hidden problems come to the surface.

Disadvantages

i. It is time-consuming and costly.

ii. Selecting the initial units is very difficult.

Classical Definition of Probability Lecture 18


If a random experiment produces n equally likely and mutually exclusive outcomes (referred to as the total possible outcomes), and m of these outcomes are favorable to an event A, then the probability of the event is the ratio of the number of favorable outcomes to the total number of outcomes.

That is, S = {1, 2, 3,..., n} and A = {1, 2,..., m}

where 0 ≤ m ≤ n.

The number of outcomes in the sample space is denoted by n(S) = n, and the number of outcomes in event A is denoted by n(A) = m.

The probability of event A can be obtained as:

P(A) = n(A) / n(S) = m / n

where P(A) lies between 0 and 1.

Examples Based on Coin Experiments

Example 5.3: Two coins are tossed once (OR a coin is tossed two times). Find the probability of

i. At least one head.

ii. Exactly one head.

iii. No head.

Solution: The sample space for two coins tossed once (or one coin tossed twice) is given by:

S = {HH, HT, TH, TT}

n(S) = 4

i. Let A represent at least one head.

A = {HH, HT, TH}

n(A) = 3

The probability of at least one head is given by:

P(A) = n(A) / n(S) = 3/4

ii. Let B represent exactly one head.

B = {HT, TH}

n(B) = 2

The probability of exactly one head is given by:

P(B) = n(B) / n(S) = 2/4 = 1/2

iii. Let C represent no head.

C = {TT}

n(C) = 1

The probability of no head is given by:

P(C) = n(C) / n(S) = 1/4

Example 5.4: Three coins are tossed once (OR a coin is tossed three times). Find the probability of

i. At least one head.

ii. Exactly one head.

iii. Exactly two heads.

iv. No head.

Solution: The sample space consists of 8 sample points listed below:

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

n(S) = 8

i. Let A represent at least one head.

A = {HHH, HHT, HTH, THH, HTT, THT, TTH}

n(A) = 7

P(A) = n(A) / n(S) = 7/8

ii. Let B represent exactly one head.

B = {HTT, THT, TTH}

n(B) = 3

P(B) = n(B) / n(S) = 3/8

iii. Let C represent exactly two heads.

C = {HHT, HTH, THH}

n(C) = 3

P(C) = n(C) / n(S) = 3/8

iv. Let D represent no head.

D = {TTT}

n(D) = 1

P(D) = n(D) / n(S) = 1/8
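
The coin examples can be checked by listing the sample space and counting favorable outcomes, which is exactly the classical definition. The Python sketch below does this for both the four-point sample space of Example 5.3 and the eight-point sample space of Example 5.4 (three tosses); the helper name is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def head_count_probability(n_coins, condition):
    """P(event) = favorable outcomes / total outcomes for n_coins fair coins.

    condition receives the number of heads in an outcome and returns True
    when the outcome is favorable.
    """
    space = list(product("HT", repeat=n_coins))  # all equally likely outcomes
    favorable = [outcome for outcome in space if condition(outcome.count("H"))]
    return Fraction(len(favorable), len(space))

# Example 5.3: two coins
print(head_count_probability(2, lambda h: h >= 1))  # 3/4
print(head_count_probability(2, lambda h: h == 1))  # 1/2
print(head_count_probability(2, lambda h: h == 0))  # 1/4

# Example 5.4: three coins
print(head_count_probability(3, lambda h: h >= 1))  # 7/8
print(head_count_probability(3, lambda h: h == 1))  # 3/8
print(head_count_probability(3, lambda h: h == 2))  # 3/8
print(head_count_probability(3, lambda h: h == 0))  # 1/8
```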

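Examples Based on Dice Experiments

Example 5.5: Two dice are rolled once. Find the probability of

i. The same outcome on both dice.

ii. The sum of dots being 6.

iii. At least one 5 on either die.

iv. The sum being less than 4.

Solution: The sample space consists of 36 outcomes, tabulated below:

(1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
(2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
(3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
(4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
(5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
(6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)

n(S) = 36

i. Let A represent the same outcome on both dice.

A = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}

n(A) = 6

P(A) = n(A) / n(S) = 6/36 = 1/6

ii. Let B represent the sum of 6.

B = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}

n(B) = 5

P(B) = n(B) / n(S) = 5/36

iii. Let C represent at least one 5 on either die.

C = {(1, 5), (2, 5), (3, 5), (4, 5), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 5)}

n(C) = 11

P(C) = n(C) / n(S) = 11/36

iv. Let D represent that the sum is less than 4.

D = {(1, 1), (1, 2), (2, 1)}

n(D) = 3

P(D) = n(D) / n(S) = 3/36 = 1/12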
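
The four events of Example 5.5 can be verified by enumerating the 36 outcomes in Python. This is a minimal sketch; the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die)
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))

p_same = prob(lambda o: o[0] == o[1])
p_sum_six = prob(lambda o: o[0] + o[1] == 6)
p_some_five = prob(lambda o: 5 in o)
p_sum_below_four = prob(lambda o: o[0] + o[1] < 4)

print(len(space))        # 36
print(p_same)            # 1/6
print(p_sum_six)         # 5/36
print(p_some_five)       # 11/36
print(p_sum_below_four)  # 1/12
```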

Examples Based on Card Experiments

Illustration of Standard Playing Cards Suit

A standard deck consists of 52 playing cards, divided into 4 suits: diamonds, hearts, spades, and clubs. Each suit contains 13 cards, i.e., Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King. Diamonds and hearts are red, spades and clubs are black, and the Jack, Queen, and King are the face cards.

Example 5.6: A card is selected from a standard deck of playing cards. Find the probability that the selected card is

i. Red

ii. A diamond

iii. An ace

iv. A face card

v. The king of diamonds

Solution: A standard deck consists of 52 playing cards, so n(S) = 52.

i. Let A represent that the selected card is red.

n(A) = 26

P(A) = n(A) / n(S) = 26/52 = 1/2

ii. Let B represent that the selected card is a diamond.

n(B) = 13

P(B) = n(B) / n(S) = 13/52 = 1/4

iii. Let C represent that the selected card is an ace.

n(C) = 4

P(C) = n(C) / n(S) = 4/52 = 1/13

iv. Let D represent that the selected card is a face card.

n(D) = 12

P(D) = n(D) / n(S) = 12/52 = 3/13

v. Let E represent that the selected card is the king of diamonds.

n(E) = 1

P(E) = n(E) / n(S) = 1/52
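
The card probabilities follow from counting cards in a 52-card deck, which the Python sketch below does explicitly. The rank and suit labels are illustrative names; face cards are taken to be Jack, Queen, and King, and the red suits to be diamonds and hearts.

```python
from fractions import Fraction
from itertools import product

RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10", "Jack", "Queen", "King"]
SUITS = ["diamonds", "hearts", "spades", "clubs"]
deck = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def prob(event):
    """Classical probability of drawing a card that satisfies event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_red = prob(lambda c: c[1] in ("diamonds", "hearts"))
p_diamond = prob(lambda c: c[1] == "diamonds")
p_ace = prob(lambda c: c[0] == "Ace")
p_face = prob(lambda c: c[0] in ("Jack", "Queen", "King"))
p_king_of_diamonds = prob(lambda c: c == ("King", "diamonds"))

print(p_red, p_diamond, p_ace, p_face, p_king_of_diamonds)
# 1/2 1/4 1/13 3/13 1/52
```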


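Example 5.7: A bag contains 4 red and 3 white balls. Two balls are selected at random. Find the probability that

i. Both balls are red

ii. The balls are of different colors

iii. At least one is red

iv. Both balls are white

Solution: Summary of information:

Red balls = 4, White balls = 3, Total balls = 7, Selected balls = 2

The number of possible samples in this kind of experiment is obtained by the combination technique:

n(S) = C(7, 2) = 21

i. Let A represent that both balls are red.

n(A) = C(4, 2) = 6

P(A) = n(A) / n(S) = 6/21 = 2/7

ii. Let B represent that the balls are of different colors (one red and one white).

n(B) = C(4, 1) × C(3, 1) = 12

P(B) = n(B) / n(S) = 12/21 = 4/7

iii. Let C represent that at least one is red (one or more than one is red).

n(C) = C(4, 1) × C(3, 1) + C(4, 2) = 12 + 6 = 18

P(C) = n(C) / n(S) = 18/21 = 6/7

iv. Let D represent that both balls are white.

n(D) = C(3, 2) = 3

P(D) = n(D) / n(S) = 3/21 = 1/7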
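
Example 5.7 can be checked by brute force: list every pair of balls that could be drawn from the bag and count the pairs that satisfy each event. The Python sketch below assumes the two balls are drawn together without replacement; the labels "R" and "W" are illustrative.

```python
from fractions import Fraction
from itertools import combinations

balls = ["R"] * 4 + ["W"] * 3  # 4 red, 3 white

# Identify balls by position so that balls of the same color stay distinct
samples = list(combinations(range(len(balls)), 2))

def prob(event):
    """Share of equally likely pairs whose list of colors satisfies event."""
    favorable = sum(1 for pair in samples if event([balls[i] for i in pair]))
    return Fraction(favorable, len(samples))

p_both_red = prob(lambda colors: colors.count("R") == 2)
p_different = prob(lambda colors: colors.count("R") == 1)
p_at_least_one_red = prob(lambda colors: colors.count("R") >= 1)
p_both_white = prob(lambda colors: colors.count("W") == 2)

print(len(samples))                                               # 21
print(p_both_red, p_different, p_at_least_one_red, p_both_white)  # 2/7 4/7 6/7 1/7
```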


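Example 5.8: An urn contains 4 red, 6 black, and 3 white balls. Three balls are selected at random. Find the probability that the selected balls include

i. Exactly 1 red ball

ii. Exactly 2 black balls

iii. At least one white ball

Solution: Summary of information:

Red balls = 4, Black balls = 6, White balls = 3, Total balls = 13, Selected balls = 3

The number of possible samples in this kind of experiment is obtained by the combination technique:

n(S) = C(13, 3) = 286

i. Let A represent that exactly one selected ball is red (the other two are chosen from the 9 non-red balls).

n(A) = C(4, 1) × C(9, 2) = 4 × 36 = 144

P(A) = n(A) / n(S) = 144/286 = 72/143

ii. Let B represent that exactly 2 selected balls are black (the third is chosen from the 7 non-black balls).

n(B) = C(6, 2) × C(7, 1) = 15 × 7 = 105

P(B) = n(B) / n(S) = 105/286

iii. Let C represent that at least one white ball is selected. The complement of C is that no white ball is selected, i.e., all three balls come from the 10 non-white balls.

P(C) = 1 - C(10, 3) / C(13, 3) = 1 - 120/286 = 166/286 = 83/143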
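
For Example 5.8 the combination technique can be applied directly with math.comb. The sketch below reads "1 red ball" and "2 black balls" as exactly one red and exactly two black among the three selected balls; that reading is an assumption, since the example does not say "exactly".

```python
from fractions import Fraction
from math import comb

RED, BLACK, WHITE = 4, 6, 3
TOTAL = RED + BLACK + WHITE  # 13 balls in the urn
DRAWN = 3

n_S = comb(TOTAL, DRAWN)  # number of possible samples

# Exactly one red: 1 of the 4 red balls and 2 of the 9 non-red balls
p_one_red = Fraction(comb(RED, 1) * comb(TOTAL - RED, 2), n_S)

# Exactly two black: 2 of the 6 black balls and 1 of the 7 non-black balls
p_two_black = Fraction(comb(BLACK, 2) * comb(TOTAL - BLACK, 1), n_S)

# At least one white: complement of "all three from the 10 non-white balls"
p_at_least_one_white = 1 - Fraction(comb(TOTAL - WHITE, DRAWN), n_S)

print(n_S)                   # 286
print(p_one_red)             # 72/143
print(p_two_black)           # 105/286
print(p_at_least_one_white)  # 83/143
```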





Introduction to Sampling Lecture 23


Population

The aggregate of all individuals or objects having some characteristics of interest is called the population. The units or members of a population are represented by X1, X2,..., XN. The numerical value assigned to the units of interest is treated as a value of a random variable X, and the distribution of X is called population distribution.

A population can be classified into two types:

1. Sampled Population

2. Target Population

Sampled population & Target population

A sampled population is the population from which a sample is selected, whereas the population about which we wish to draw inferences is called the target population. The following example illustrates the difference between a sampled and a target population.

Suppose we desire to know the opinion of college students in the province of KP with regard to the present examination system. Then our population will consist of all the students in all the colleges in the province. Suppose, on account of a shortage of resources and time, we are able to conduct such a survey in only six colleges scattered throughout the province. In such a case, the target population consists of the students of all the colleges in the province, while the sampled population consists of the students of the six colleges from which the sample of students will be selected.


Sample

A sample is a small representative part of the population that is selected for analysis.

Sampling Frame

A sampling frame is a complete list or a map that contains all the N sampling units in a population from which a sample is drawn.

Examples include a complete list of the names of all students in a college at a particular point in time, a list of households in a city, and a map of a village showing all fields.

Sampling Plan

A sampling plan is the set of steps followed in selecting the sample from the population.

The following steps are involved in developing a sampling plan:

i. Define the target population.

ii. Identify the sampled population.

iii. Develop a sampling frame.

iv. Define the sampling method.

v. Select the sample size.

 

ERRORS IN SAMPLING

The following two main types of errors are involved in sampling:

1. Sampling Error

2. Non-Sampling Error

Sampling Error

Sampling error is associated with the selection of the sample from the population. The difference between an estimate and its corresponding parameter is called sampling error.

Let θ^ be the estimate of θ; then

Sampling error = θ^ - θ

The sampling error can be reduced by increasing the sample size.

Non-Sampling Errors

The errors that occur at the stage of gathering or processing the data are called non-sampling errors. All kinds of human errors, faulty sampling frames, etc. are included in non-sampling errors.

There are two main types of non-sampling errors.

i. Error in Response

ii. Non-response error

Bias

Bias is the difference between the expected value of the estimator and the true value of the parameter.

Let θ^ be an estimator of θ; then

Bias(θ^) = E(θ^) - θ

If θ^ is an unbiased estimator of θ, then E(θ^) = θ and

Bias(θ^) = 0
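
The difference between sampling error and bias can be made concrete with a small enumeration in Python. The five-unit population below is hypothetical, the parameter θ is taken to be the population mean, and the estimator θ^ is the mean of a simple random sample of size 2 drawn without replacement. Individual samples have non-zero sampling error, yet the errors average out to zero, so this estimator is unbiased.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]  # hypothetical population values
theta = Fraction(sum(population), len(population))  # parameter: population mean

# Every possible sample of size 2 without replacement, each equally likely
samples = list(combinations(population, 2))
estimates = [Fraction(sum(s), len(s)) for s in samples]  # sample means

# Sampling error of each sample: estimate minus parameter
sampling_errors = [estimate - theta for estimate in estimates]

# Expected value of the estimator over all equally likely samples
expected_value = sum(estimates) / len(estimates)
bias = expected_value - theta

for s, err in zip(samples, sampling_errors):
    print(s, err)
print(expected_value, bias)  # 6 0
```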


Sampling

Sampling is a statistical technique that is used in order to select a small part of a population.

There are two basic purposes of sampling:

i. It provides information about the population without examining all units of the population.

ii. It allows the reliability of the estimates derived from the sample to be assessed.

 

Types of Sampling

Sampling may be divided into two main branches:

1. Probability Sampling

2. Non-Probability Sampling

Probability Sampling

Probability sampling is a process in which each unit in the population has a known, non-zero probability of being included in the sample. Probability sampling is also called random sampling.

The major types of probability sampling are:

i. Simple random sampling

ii. Stratified random sampling

iii. Systematic random sampling

iv. Cluster random sampling

Non-probability sampling

Non-probability sampling is a process in which personal judgment determines which units of the population are selected for the sample. Non-probability sampling is also called non-random sampling.

The common non-probability sampling techniques are:

i. Purposive sampling

ii. Quota sampling

iii. Snowball sampling


Moving Average Models (MA Models) Lecture 17

The autoregressive model in which the current value 'yt' of the dependent variable ...