Monday, 26 August 2013

In this statistics class we learned about the t-test and sampling.
One-Sample T-Test
It is perhaps easiest to demonstrate the ideas and methods of the one-sample t-test by working through an example using actual data. The one-sample t-test compares the mean score of a sample to a known value, usually the population mean (the average for the outcome of some population of interest). The basic idea of the test is a comparison of the average of the sample (observed average) with the average of the population (expected average), scaled by the standard deviation of the sample average, which depends on the spread of the data and the number of cases in the sample.

Example - Prenatal Care and Birthweight

1) Establish Hypotheses

The first step to examining this question is to establish the specific hypotheses we wish to examine.
In this case:
· Null hypothesis - the difference between the birthweights of babies born to mothers who participated in the program and those born to other poor mothers is zero. Another way of stating the null hypothesis is that the difference between the observed mean of birthweight for program babies and the expected mean of birthweight for poor women is zero.
· Alternative hypothesis - the difference between the observed mean of birthweight for program babies and the expected mean of birthweight for poor women is not zero.

2) Calculate Test Statistic

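Calculation of the test statistic requires four components:
1.      The average of the sample (observed average)
2.      The population average or other known value (expected average)
3.      The standard deviation (SD) of the sample
4.      The number of observations
With these four pieces of information, we calculate the following statistic:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

where $\bar{x}$ is the observed average, $\mu$ is the expected average, $s$ is the sample SD, and $n$ is the number of observations. The denominator $s / \sqrt{n}$ is the standard deviation of the sample average (the standard error).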

3) Use This Value To Determine P-Value

Having calculated the t-statistic, compare it with a standard table of t-values, using n - 1 degrees of freedom (where n is the number of observations), to determine whether it reaches the threshold of statistical significance.
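The three steps above can be sketched in a few lines of Python. This is only an illustration: the birthweights and the expected mean are invented numbers (the notes do not give the actual data), and the function name `one_sample_t` is hypothetical. The critical value 2.262 is the usual two-tailed table entry for a 5% significance level with 9 degrees of freedom.

```python
import math
import statistics

# HYPOTHETICAL data: birthweights in grams for 10 program babies.
# These numbers are invented for illustration only.
sample = [3000, 3100, 2900, 3200, 3050, 2950, 3150, 3000, 3100, 3050]

# HYPOTHETICAL expected (population) mean birthweight in grams.
expected_mean = 2800.0


def one_sample_t(values, mu):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    observed_mean = statistics.mean(values)
    sample_sd = statistics.stdev(values)      # sample SD (n - 1 denominator)
    standard_error = sample_sd / math.sqrt(n)  # SD of the sample average
    return (observed_mean - mu) / standard_error, n - 1


t_stat, df = one_sample_t(sample, expected_mean)

# Two-tailed critical value for alpha = 0.05 and 9 degrees of freedom,
# read from a standard t-table.
critical_value = 2.262
reject_null = abs(t_stat) > critical_value

print(f"t = {t_stat:.3f}, df = {df}, reject null: {reject_null}")
```

With these made-up numbers the t-statistic is far above the critical value, so the null hypothesis of zero difference would be rejected at the 5% level.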
Sampling Frame:
In statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions.
Sampling: This refers to the selection of a subset of individuals from a population to form the sample for your survey. There are two types of sampling methods: Probability Sampling and Non-Probability Sampling.
Probability methods require a sample frame (a comprehensive list of the population of interest) and rely on random selection from that frame in a variety of ways. They permit the use of higher-level statistical techniques that require random selection, and they allow you to estimate the likely difference between your sample results and the equivalent population values (the sampling error), so that you can make statements about the population with a known level of confidence. Non-probability methods do not.

However, non-probability samples should not be dismissed because of this apparent lack of rigour. They are available even when you have no sample frame. They are generally less complicated to undertake. They may minimise the preparation costs of a survey, and they can be employed when you are unsure of the population of interest.
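As a rough sketch of the simplest probability method, simple random sampling, the snippet below draws a sample from a sample frame. The frame of 200 household IDs and the helper name `simple_random_sample` are assumptions made for illustration; a real frame would come from an actual list of the population.

```python
import random

# HYPOTHETICAL sample frame: a complete list of 200 household IDs.
frame = [f"household-{i:03d}" for i in range(1, 201)]


def simple_random_sample(sample_frame, k, seed=None):
    """Draw k distinct units from the frame, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(sample_frame, k)


sample = simple_random_sample(frame, k=20, seed=2013)
print(sample[:5])
```

Because every unit in the frame has the same chance of selection, the usual formulas for sampling error apply to a sample drawn this way.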

Benford’s Law:
Benford's law is a phenomenological law, also called the first digit law, first digit phenomenon, or leading digit phenomenon. It states that in listings, tables of statistics, etc., the digit 1 tends to occur as the leading digit with a probability of about 30%, much greater than the 11.1% that would be expected if all nine possible leading digits were equally likely (i.e., one digit out of 9).
Benford's law applies to data that are not dimensionless, so the numerical values of the data depend on the units. If there exists a universal probability distribution $P(x)$ over such numbers, then it must be invariant under a change of scale, so

$P(kx) = f(k)\,P(x)$.          (1)

If $\int P(x)\,dx = 1$, then $\int P(kx)\,dx = 1/k$, and normalization implies $f(k) = 1/k$. Differentiating with respect to $k$ and setting $k = 1$ gives

$x P'(x) = -P(x)$,            (2)

having solution $P(x) = 1/x$. Although this is not a proper probability distribution (since it diverges), both the laws of physics and human convention impose cutoffs. For example, randomly selected street addresses obey something close to Benford's law.
Figure: The distribution of first digits, according to Benford's law. Each bar represents a digit, and the height of the bar is the percentage of numbers that start with that digit.
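The first-digit percentages can be computed directly. The sketch below uses the standard closed form of Benford's law, P(d) = log10(1 + 1/d), which is not written out in these notes but follows from integrating the 1/x solution over one decade. As a check, it compares the expected values with the leading digits of the first 1000 powers of 2, a sequence commonly used to illustrate the law. The function names are chosen here for illustration.

```python
import math
from collections import Counter


def benford_probability(d):
    """Expected probability that the leading digit equals d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


expected = {d: benford_probability(d) for d in range(1, 10)}

# Leading digits of 2**0, 2**1, ..., 2**999.
powers = [2 ** k for k in range(1000)]
counts = Counter(leading_digit(p) for p in powers)
observed = {d: counts[d] / len(powers) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

The expected probability for the digit 1 comes out at about 0.301, which matches the 30% figure quoted above.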

Cluster Vs Stratified Sampling

Cluster Sampling
  • This technique is used when natural groupings (clusters) are evident in a statistical population.
  • It works best when the clusters are similar to one another, so that each cluster is roughly representative of the whole population.
  • Its main advantage is that it is cheaper than the other methods.
  • Its main disadvantage is that it introduces a higher sampling error.
Stratified Sampling
  • In this method, the members of the population are grouped into relatively homogeneous subgroups (strata) before sampling.
  • It is a good option when the population as a whole is heterogeneous.
  • The advantages are that this method ignores irrelevant subpopulations and focuses on the crucial ones, and that different sampling techniques can be used for different subpopulations. This improves the efficiency and accuracy of the estimation and allows better balancing of the statistical power of tests.
  • The disadvantages are that it requires a choice of relevant stratification variables, which can be difficult. It is of little use when the subgroups do not differ from one another. It is expensive to implement. If accurate information about the population is not available, an error may be introduced.
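The difference between the two methods can be sketched in code. Everything in this example is hypothetical: the population of four villages with 25 people each, the "low"/"high" income bands used as strata, and the function names. Stratified sampling draws some members from every stratum, while cluster sampling picks whole clusters at random and keeps all of their members.

```python
import random


def build_population():
    """HYPOTHETICAL population: 4 villages of 25 people, 15 'low' and 10 'high' income each."""
    people = []
    for village in range(4):
        for i in range(25):
            band = "low" if i < 15 else "high"
            people.append({"id": f"v{village}-p{i}", "village": village, "band": band})
    return people


def stratified_sample(population, key, fraction, rng):
    """Sample the same fraction of members from every stratum defined by `key`."""
    strata = {}
    for person in population:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(population, key, n_clusters, rng):
    """Pick whole clusters at random and keep every member of the chosen clusters."""
    clusters = sorted({person[key] for person in population})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in population if person[key] in chosen]


rng = random.Random(42)  # fixed seed so the example is repeatable
population = build_population()

by_band = stratified_sample(population, key="band", fraction=0.2, rng=rng)
by_village = cluster_sample(population, key="village", n_clusters=2, rng=rng)

print(len(by_band), "people in the stratified sample")
print(len(by_village), "people in the cluster sample")
```

In the stratified sample both income bands are represented in proportion to their size, whereas the cluster sample only requires visiting two villages, which is where its cost advantage comes from.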
Prepared By:
Nidhi
 Group Members:
Nitin Boratwar
Palak Jain
Pallavi bizoara
Nitesh Singh Patel
