In this statistics class we learned about the t-test and sampling.
One-Sample T-Test
It is perhaps easiest to demonstrate the ideas and methods of the
one-sample t-test by working through an example. To reiterate, the one-sample
t-test compares the mean score of a sample to a known value, usually the
population mean (the average for the outcome of some population of interest).
The basic idea of the test is a comparison of the average of the sample
(observed average) and the population (expected average), with an adjustment
for the number of cases in the sample and the standard deviation of the
average. Working through an example can help to highlight the issues involved
and demonstrate how to conduct a t-test using actual data.
Example - Prenatal Care and Birthweight
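1) Establish Hypotheses
The question in this example is whether babies born to mothers who participated in a prenatal care program differ in birthweight from babies born to other poor mothers. The first step in examining this question is to establish the specific hypotheses we wish to test. In this case: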
· Null hypothesis - the difference between the birthweights of babies born to mothers who participated in the program and those born to other poor mothers is 0. Another way of stating the null hypothesis is that the difference between the observed mean of birthweight for program babies and the expected mean of birthweight for poor women is zero.
· Alternative hypothesis - the difference between the observed mean of birthweight for program babies and the expected mean of birthweight for poor women is not zero.
2) Calculate Test Statistic
Calculation of the test statistic requires four components:
1. The average of the sample (observed average)
2. The population average or other known value (expected average)
3. The standard deviation (SD) of the sample
4. The number of observations (n)
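With these four pieces of information, we calculate the following statistic:
$$t = \frac{\text{observed average} - \text{expected average}}{\text{SD} / \sqrt{n}}$$
Here SD is the standard deviation of the sample and n is the number of observations, so the denominator is the standard error of the sample average.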
3) Use This Value To Determine P-Value
Having calculated the t-statistic, compare the t-value with
a standard table of t-values to determine whether the t-statistic reaches the
threshold of statistical significance.
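Steps 2 and 3 can be sketched in Python using only the standard library. This is a minimal illustration, not the analysis from the class: the birthweights, the expected mean of 2800 g, and the function name `one_sample_t` are all hypothetical. The critical value 2.262 is the usual two-sided table entry for a 5% significance level with 9 degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, expected_mean):
    """Return (t, degrees_of_freedom) for a one-sample t-test."""
    n = len(sample)
    observed_mean = statistics.mean(sample)
    sd = statistics.stdev(sample)       # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)  # SD of the sample average
    t = (observed_mean - expected_mean) / standard_error
    return t, n - 1


# Hypothetical birthweights (grams) of ten program babies.
program_babies = [3100, 2950, 3300, 3200, 3050, 3400, 2900, 3250, 3150, 3350]
expected_mean = 2800  # hypothetical mean birthweight for poor mothers

t, df = one_sample_t(program_babies, expected_mean)

# Two-sided critical value from a standard t-table: alpha = 0.05, df = 9.
CRITICAL_T_DF9 = 2.262
significant = abs(t) > CRITICAL_T_DF9

print(f"t = {t:.2f}, df = {df}, significant at 5%: {significant}")
```

With a different sample size, the critical value has to be looked up again for n - 1 degrees of freedom.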
Sampling Frame:
In statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions.
Sampling:
Sampling refers to the selection of a subset of individuals from a population to form the sample for your survey. There are two types of sampling methods: Probability Sampling and Non-Probability Sampling.
Probability methods require a sample frame (a comprehensive list of the population of interest). They rely on random selection, in a variety of ways, from the sample frame of the population. They permit the use of higher-level statistical techniques which require random selection, and allow you to estimate the sampling error (the difference between your sample results and the equivalent population values), so that you can state how precisely your sample reflects the population values. Non-probability methods do not.
However, non-probability samples should not be dismissed because of this apparent lack of rigour. They are available even when you have no sample frame. They are generally less complicated to undertake. They may minimise the preparation costs of a survey, and can be employed when you are unsure of the population of interest.
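As a small illustration of random selection from a sample frame, the sketch below draws a simple random sample with Python's standard `random` module. The frame of 200 households, the sample size of 20, and the seed are hypothetical choices made only for this example.

```python
import random

# Hypothetical sampling frame: a complete list of 200 households.
sampling_frame = [f"household_{i:03d}" for i in range(1, 201)]

# A seeded generator makes the draw reproducible.
rng = random.Random(42)

# Simple random sample without replacement: every household in the
# frame has the same chance of being selected.
sample = rng.sample(sampling_frame, k=20)

print(sample[:5])
```

A non-probability sample, by contrast, would pick units by convenience or judgement, so no such list and no random draw would be needed.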
Benford's Law:
A phenomenological law, also called the first-digit law, first-digit phenomenon, or leading-digit phenomenon. Benford's law states that in listings, tables of statistics, etc., the digit 1 tends to occur as the leading digit with probability ∼30%, much greater than the 11.1% that would be expected if all nine possible leading digits were equally likely (i.e., one digit out of 9).
Benford's law applies to data that are not dimensionless, so the numerical values of the data depend on the units. If there exists a universal probability distribution $P(x)$ over such numbers, then it must be invariant under a change of scale, so
$$P(kx) = f(k)\,P(x). \qquad (1)$$
If $\int P(x)\,dx = 1$, then $\int P(kx)\,dx = 1/k$, and normalization implies $f(k) = 1/k$. Differentiating with respect to $k$ and setting $k = 1$ gives
$$x\,P'(x) = -P(x), \qquad (2)$$
having solution $P(x) = 1/x$. Although this is not a proper probability distribution (since it diverges), both the laws of physics and human convention impose cutoffs. For example, randomly selected street addresses obey something close to Benford's law.
Figure: The distribution of first digits, according to Benford's law. Each bar represents a digit, and the height of the bar is the percentage of numbers that start with that digit.
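The first-digit percentages can be reproduced with a few lines of Python. The notes above do not state the formula explicitly. The sketch assumes the standard form of Benford's law, P(d) = log10(1 + 1/d), which follows from applying the 1/x density to the interval from d to d + 1. The function name `benford_probability` is chosen only for this example.

```python
import math


def benford_probability(digit):
    """Probability that the leading digit equals `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


# Print a rough text version of the bar chart: one row per leading digit.
for d in range(1, 10):
    p = benford_probability(d)
    print(f"{d}: {p:6.1%} {'#' * round(p * 100)}")
```

The nine probabilities sum to 1, and the value for the digit 1 is about 30%, matching the figure quoted above.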
Cluster vs Stratified Sampling
Cluster Sampling
- This technique is used when natural groupings (clusters) are evident in a statistical population.
- It is a suitable option when the clusters are similar to one another, even if the members within each cluster vary.
- Its main advantage is that it is cheaper than the other methods.
- The main disadvantage is that it introduces higher sampling error.
Stratified Sampling
- In this method, the members of the population are grouped into relatively homogeneous groups (strata).
- It is a good option for a heterogeneous population.
- The advantages are that this method ignores irrelevant subpopulations and focuses on the crucial ones. Different sampling techniques can be used for different subpopulations. This helps to improve the efficiency and accuracy of the estimation, and allows greater balancing of the statistical power of tests.
- The disadvantages are that it requires a choice of relevant stratification variables, which can be difficult. It is not very useful when the population has no homogeneous subgroups. It is expensive to implement. If accurate information about the population is not available, an error may be introduced.
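The difference between the two methods can be sketched in Python. This is only an illustration under assumed data: the population of 100 people, the village and income-band labels, and the helper names `stratified_sample` and `cluster_sample` are all hypothetical. Stratified sampling draws some members from every group, while cluster sampling picks a few whole groups and keeps everyone in them.

```python
import random
from collections import defaultdict


def group_by(population, key):
    """Group units into lists keyed by key(unit)."""
    groups = defaultdict(list)
    for unit in population:
        groups[key(unit)].append(unit)
    return groups


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction of members at random from every stratum."""
    sample = []
    for members in group_by(population, stratum_of).values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(population, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    clusters = group_by(population, cluster_of)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


# Hypothetical population: (person_id, village, income_band).
population = [
    (i, f"village_{i % 5}", "high" if i % 4 == 0 else "low")
    for i in range(100)
]

rng = random.Random(0)
strat = stratified_sample(population, lambda u: u[2], 0.2, rng)
clus = cluster_sample(population, lambda u: u[1], 2, rng)

print(len(strat), "people from every income band")
print(len(clus), "people from", len({u[1] for u in clus}), "villages")
```

In this sketch the stratified sample needs the income band of every person in advance, which reflects the cost noted above, while the cluster sample only needs the list of villages.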
Prepared By:
Nidhi
Group Members:
Nitin Boratwar
Palak Jain
Pallavi bizoara
Nitesh Singh Patel