Wednesday 14 August 2013

13th-14th Session

The session began with the chi-square test and the t-test:

Chi-square:

Chi-square is a statistical test commonly used to compare observed data with the data we would expect to obtain according to a specific hypothesis. For example, if, according to Mendel's laws, you expected 10 of 20 offspring from a cross to be male and the actual observed number was 8 males, then you might want to know about the "goodness of fit" between the observed and the expected values. Were the deviations (differences between observed and expected) the result of chance, or were they due to other factors? How much deviation can occur before you, the investigator, must conclude that something other than chance is at work, causing the observed to differ from the expected?

The chi-square test always tests the null hypothesis, which states that there is no significant difference between the expected and observed results.

Calculating Chi-Square

                      Green    Yellow
Observed (o)          639      241
Expected (e)          660      220
Deviation (o - e)     -21      21
Deviation² (d²)       441      441
d²/e                  0.668    2

χ² = Σ d²/e = 2.668
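The calculation in the table above can be sketched in a few lines of Python. This is only an illustration: the function name chi_square and the dictionary layout are my own choices, and the critical value 3.841 (1 degree of freedom, 0.05 significance level) is taken from a standard chi-square table rather than from the session notes.

```python
# Observed and expected counts from the green/yellow table.
observed = {"green": 639, "yellow": 241}
expected = {"green": 660, "yellow": 220}


def chi_square(observed, expected):
    """Return the sum over categories of (o - e)^2 / e."""
    return sum(
        (observed[key] - expected[key]) ** 2 / expected[key]
        for key in observed
    )


chi2 = chi_square(observed, expected)
dof = len(observed) - 1  # two categories give 1 degree of freedom

# Assumption: standard table value for 1 degree of freedom at the 0.05 level.
CRITICAL_0_05_DF1 = 3.841
reject_null = chi2 > CRITICAL_0_05_DF1

print(f"chi-square = {chi2:.3f}, dof = {dof}, reject null: {reject_null}")
```

Without rounding, the statistic comes out at about 2.673; the 2.668 in the table arises because 441/220 was rounded to 2. Either way the value is below 3.841, so the null hypothesis is not rejected at the 0.05 level.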


T-Test :

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution if the null hypothesis is true. It can be used to determine whether two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t-distribution.

The t-test comes in three common forms:
  • Single-sample t-test
  • Independent-samples t-test
  • Paired-samples t-test

Assumptions
  • each group is considered to be a sample from a distinct population
  • the responses in each group are independent of those in the other group
  • the distributions of the variable of interest are normal

Figure 3. Formula for the t-test, showing how the numerator and denominator are related to the distributions.

  1. State the null hypothesis that the two population means are equal to each other. To test it, you need the following values: x̄1 and x̄2 (the means of the two samples), s1² and s2² (the variances of the two samples), n1 and n2 (the sample sizes of the two samples), and k (the degrees of freedom).
     [Image: T-test formula]
  2. Compute the t-statistic.
     [Image: T-test statistic]
  3. Compare the calculated t-value, with k degrees of freedom, to the critical t value from the t-distribution table at the chosen confidence level, and decide whether to reject or fail to reject the null hypothesis.
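
As a rough sketch of the first two steps, the code below computes a two-sample t-statistic from the means, variances and sample sizes. The formula images from the session are not reproduced in these notes, so the exact form is an assumption on my part: I use the unequal-variance version, t = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2), with the Welch-Satterthwaite approximation for k. The function name two_sample_t and the data are hypothetical.

```python
import math
import statistics


def two_sample_t(sample1, sample2):
    """Return (t, k) for two independent samples.

    Assumption: unequal-variance form of the t-statistic, with k from the
    Welch-Satterthwaite approximation. The session's own formula may differ.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance is the sample variance (divides by n - 1).
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)

    v1, v2 = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    k = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, k


# Hypothetical measurements, made up purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.4, 5.0, 4.6]

t_stat, k = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, k = {k:.2f}")
```

The last step, looking up the critical value for k degrees of freedom in a t-distribution table and comparing it with t, is left to the reader.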

Probability sampling:
The probability sampling techniques are, briefly:
  • Simple random sampling
  • Systematic random sampling
  • Stratified sampling
  • Cluster sampling
  • Multi-stage sampling: a mixture of stratified and cluster sampling.

Probability sampling methods require a sample frame. Each member of the population has a known probability of selection. For example, from a list of 100 people on a sampling frame we know that each person has a 1 in 100 chance of selection. This allows us to calculate a quantity known as the sampling error, which is the expected difference between the sample values for your questions and the population values. Using the results you obtained together with the calculated sampling error, you can then estimate what the population values are likely to be.

Simple random sampling:

Probability samples rely on random selection to pick individuals from the sample frame. The simplest form of this would be to put all the names in a hat and draw them. In practice, especially where samples are large, each individual on the list is given a number, and random numbers corresponding to these numbers are generated until the required sample size is attained. This is the simplest form of sampling.
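
A minimal sketch of simple random sampling, assuming a made-up frame of 100 numbered people and a sample size of 10 (both are illustrative values, not from the session). Python's random.sample draws without replacement, which matches the "names in a hat" picture.

```python
import random

# Fixed seed only so that the sketch gives the same draw every time it runs.
rng = random.Random(42)

# Hypothetical sampling frame: 100 people, each identified by a number.
sampling_frame = [f"person_{i:03d}" for i in range(1, 101)]
sample_size = 10

# Draw without replacement; every member is equally likely to be chosen.
simple_random_sample = rng.sample(sampling_frame, sample_size)
print(simple_random_sample)
```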

Stratified sampling :

Simple random sampling can lead to samples which are not representative of the groups in the population. It is entirely random and theoretically could choose a sample of all females. In other situations small groups in the population may be missed. Stratified sampling is a technique which allows you to ensure that you control the groups that are of interest to you. Simply define the groups, e.g. males and females, and randomly select a proportion of your sample from each using simple random sampling.
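
One way to sketch stratified sampling is to split the frame into groups and then run a simple random sample inside each group. The helper stratified_sample, the 60/40 female/male frame and the 10% sampling fraction are all assumptions made for illustration.

```python
import random


def stratified_sample(frame, stratum_of, fraction, rng):
    """Take the same fraction from every stratum by simple random sampling."""
    strata = {}
    for member in frame:
        strata.setdefault(stratum_of(member), []).append(member)

    sample = []
    for members in strata.values():
        # Take at least one member so that small groups are never missed.
        count = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, count))
    return sample


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 60 females and 40 males, stored as (group, id) pairs.
frame = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]

stratified = stratified_sample(frame, lambda member: member[0], 0.10, rng)
print(stratified)
```

With these numbers the sample always contains 6 females and 4 males, so neither group can be left out by chance.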

Cluster sampling:

Sometimes you may wish to undertake a survey of a population spread over a large area, and find that the resources required to travel to these areas would be too great. An example might be where you wish to undertake a survey of schools in your region and your survey requires you to use interviewers to explain complex issues. You can’t afford to resource travel across the entire region, but you know that groups (clusters) of schools have the same or similar characteristics, e.g. urban, rural, size etc. You would simply select at random from these clusters and undertake a census of those selected.
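
The school example could be sketched as follows. The cluster names, the school names and the choice of two clusters are hypothetical; the point is only that clusters are picked at random and then every school in a picked cluster is surveyed.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical clusters of schools with similar characteristics.
clusters = {
    "urban_large": ["school_A", "school_B", "school_C"],
    "urban_small": ["school_D", "school_E"],
    "rural_large": ["school_F", "school_G", "school_H"],
    "rural_small": ["school_I", "school_J"],
}

# Step 1: select clusters at random.
chosen_clusters = rng.sample(sorted(clusters), 2)

# Step 2: take a census of the chosen clusters (every school in them).
surveyed_schools = [
    school for name in chosen_clusters for school in clusters[name]
]
print(chosen_clusters, surveyed_schools)
```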

Systematic sampling:

Here the idea is to sample every 1 in n from the sample frame. You simply decide what n should be, based on sample size requirements, for example 1 in 10. You randomly select a starting point on your sample frame and then select every 10th person from the list.
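
A small sketch of the 1-in-10 example, assuming a frame of 100 numbered people. One assumption to flag: the random starting point is drawn from the first n positions, which is a common convention, so that the sample size is the same whatever the start. The function name systematic_sample is my own.

```python
import random


def systematic_sample(frame, n, rng):
    """Pick a random start among the first n entries, then take every n-th."""
    start = rng.randrange(n)  # assumption: start lies in positions 0 .. n - 1
    return frame[start::n]


rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame of 100 numbered people.
frame = list(range(1, 101))

every_tenth = systematic_sample(frame, 10, rng)
print(every_tenth)
```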


Non-probability sampling:

Types of non-probability sampling:
1> Quota Sampling
2> Snowball Sampling
3> Judgement Sampling
4> Convenience Sampling

Written by: Neha Gupta

Group members:

Prachi Kasera
Raghav Bhattar
Nihal
Nitesh Beriwal
Parthajit Sar
