Wednesday 14 August 2013

Day well-spent with Chi-Square and t-Test (13th August)

In today’s class we learnt how to calculate the chi-square value by hand and discussed how to interpret it.

Cross Tab or chi-square:

Chi-square is one of the most useful non-parametric statistics. We use it when our data consist of people distributed across categories, and we want to know whether that distribution is different from what we would expect by chance (or from another set of expectations).
Every time we calculate chi-square we start from an assumption, called the null hypothesis. If the calculated chi-square value is higher than the critical value given in the table, we reject the null hypothesis; otherwise we do not reject it.

Let us understand this concept better by using an example.

Example:


Find the relation between a person’s background and the marks scored.

Marks scored out of 20    Commerce    Engineering
12.5                      0           1
14                        0           1
17                        0           1
19.5                      0           1
20                        6           2
Total                     6           6

Answer:
Step 1: Decide the null hypothesis
There is no relation between marks scored and the background of the students.

Step 2: Calculate the expected frequency

Redraw the table to find the expected frequency. Each cell is (row total / 12) x (column total / 12) x 12, where 12 is the total number of students.

Marks scored out of 20    Commerce                   Engineering
12.5                      1/12 x 6/12 x 12 = 0.5     1/12 x 6/12 x 12 = 0.5
14                        1/12 x 6/12 x 12 = 0.5     1/12 x 6/12 x 12 = 0.5
17                        1/12 x 6/12 x 12 = 0.5     1/12 x 6/12 x 12 = 0.5
19.5                      1/12 x 6/12 x 12 = 0.5     1/12 x 6/12 x 12 = 0.5
20                        8/12 x 6/12 x 12 = 4       8/12 x 6/12 x 12 = 4
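
The same expected frequencies can be reproduced in a few lines of code. This is a minimal sketch in plain Python; the names `observed` and `expected_frequencies` are illustrative and not from the class.

```python
# Observed counts from the example: one row per mark,
# columns are (Commerce, Engineering).
observed = {
    12.5: (0, 1),
    14: (0, 1),
    17: (0, 1),
    19.5: (0, 1),
    20: (6, 2),
}

def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    return {
        mark: tuple(sum(row) * col_total / grand_total for col_total in col_totals)
        for mark, row in table.items()
    }

expected = expected_frequencies(observed)
for mark, row in expected.items():
    print(mark, row)
```

Writing the cell as row total x column total / grand total is the same arithmetic as 1/12 x 6/12 x 12, just with one division instead of two.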

Step 3: Calculate the chi-square values

The formula for chi-square is as follows:

Chi-square = ∑ (observed - expected)^2 / expected

Marks scored out of 20    Chi-square value, Commerce    Chi-square value, Engineering
12.5                      0.5                           0.5
14                        0.5                           0.5
17                        0.5                           0.5
19.5                      0.5                           0.5
20                        1                             1

Total of all the chi-square values = 6
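
As a check on the hand calculation, the statistic can be computed directly from the observed counts. This is a minimal sketch in plain Python; the function name `chi_square` is illustrative.

```python
# Observed counts per mark, as (Commerce, Engineering).
observed = [(0, 1), (0, 1), (0, 1), (0, 1), (6, 2)]

def chi_square(rows):
    """Sum of (observed - expected)^2 / expected over every cell."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    total = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            exp = row_total * col_total / grand_total
            total += (obs - exp) ** 2 / exp
    return total

statistic = chi_square(observed)
print(statistic)
```

Eight cells contribute 0.5 each and the two cells in the last row contribute 1 each, which gives the total of 6.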

Step 4: Find the degrees of freedom

The degrees of freedom are essential when deciding whether to reject the null hypothesis.
For example, if you have to take ten different courses to graduate, and only ten different courses are offered, then you have nine degrees of freedom. For nine semesters you will be able to choose which class to take; in the tenth semester there will be only one class left, so there is no choice if you want to graduate.

Degrees of freedom are commonly discussed in relation to chi-square and other forms of hypothesis testing. They must be calculated in order to determine the significance of a chi-square statistic and hence the validity of the null hypothesis.

Degrees of freedom = (rows - 1) x (columns - 1)

Rows = 5
Columns = 2
DOF = (5 - 1) x (2 - 1) = 4
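
One way to see why a table with 5 rows and 2 columns has only 4 free values: once the row and column totals are fixed, choosing four cells determines all the others. The sketch below (plain Python, hypothetical names) rebuilds the whole observed table from the totals and four Commerce counts.

```python
# Totals taken from the observed table.
row_totals = [1, 1, 1, 1, 8]   # marks 12.5, 14, 17, 19.5, 20
col_totals = [6, 6]            # Commerce, Engineering

# The (5 - 1) x (2 - 1) = 4 free cells: Commerce counts for the first four marks.
free_cells = [0, 0, 0, 0]

def complete_table(free, row_totals, col_totals):
    """Fill in every remaining cell of a two-column table from its totals."""
    # The last Commerce cell is forced by the Commerce column total.
    first_col = list(free) + [col_totals[0] - sum(free)]
    # Every Engineering cell is forced by its row total.
    return [(a, r - a) for a, r in zip(first_col, row_totals)]

table = complete_table(free_cells, row_totals, col_totals)
print(table)
```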


Step 5: Compare the chi-square value with the standard value


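For 4 degrees of freedom, look up the critical value at the 0.05 significance level.
The value 0.711 sits in the 0.95 column of a right-tail table: it is the value the statistic exceeds 95% of the time when the null hypothesis is true, so it is not the rejection threshold. The critical value in the 0.05 column is 9.488.
The calculated value is 6, which is lower than 9.488. So, we cannot reject the null hypothesis.
This implies that the data do not show a significant relationship between a person’s background and the marks scored.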
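
The table lookup can be cross-checked numerically. This is a minimal sketch in plain Python, assuming a right-tail chi-square table; it uses a closed form that only holds for an even number of degrees of freedom, and the function names are illustrative. It prints the table values for the 0.95 and 0.05 columns at 4 degrees of freedom, and the right-tail probability of the calculated value 6.

```python
import math

def chi2_right_tail(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

def chi2_table_value(right_tail_prob, df, lo=0.0, hi=100.0):
    """Bisection search for the x whose right-tail probability is right_tail_prob."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > right_tail_prob:
            lo = mid   # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_table_value(0.95, 4), 3))   # value in the 0.95 column
print(round(chi2_table_value(0.05, 4), 3))   # value in the 0.05 column
print(round(chi2_right_tail(6.0, 4), 3))     # right-tail probability of 6
```

The 0.95 column gives 0.711 and the 0.05 column gives 9.488. The right-tail probability of 6 is about 0.199, which is above 0.05.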

t-Test:

Types of t-Test:

The t-test is used to determine whether the mean of a group differs significantly from a known value or from the mean of another group. There are 3 types of t-tests:
1) One sample t-test
2) Independent sample t-test
3) Paired sample t-test


One Sample t-test:

A one sample t-test means that you have ONE GROUP (e.g., your class of 8th grade students) that you are comparing to A KNOWN MEAN SCORE (say, the national mean on a normed test).


Independent Sample t-test:

An independent sample (two sample) t-test means that you have TWO SEPARATE GROUPS that you are comparing against one another (e.g., your class of 8th grade students compared to LAST YEAR'S group of students).


Paired Sample t-test:

A paired sample t-test means that you have TWO GROUPS that you are comparing against one another, but each member of one group is related in some way to a specific member of the other group (e.g., study partners, siblings, married couples, etc.).
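
The three tests differ mainly in how the t value is built. The sketch below uses only the Python standard library and made-up scores. The function names are illustrative, and the independent sample version assumes both groups have equal variances (the pooled form); this is an assumption of the sketch, not something stated in class.

```python
import math
import statistics

def one_sample_t(sample, known_mean):
    """One group against a known mean: (mean - known_mean) / standard error."""
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - known_mean) / std_error

def independent_t(group_a, group_b):
    """Two unrelated groups, assuming equal variances (pooled estimate)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_error

def paired_t(first, second):
    """Matched pairs: a one sample test of the pairwise differences against 0."""
    differences = [b - a for a, b in zip(first, second)]
    return one_sample_t(differences, 0)

# Made-up scores, for illustration only.
this_year = [1, 2, 3, 4, 5]
last_year = [3, 4, 5, 6, 7]
partner = [2, 4, 3, 6, 5]

print(one_sample_t(this_year, 2))           # this class against a known mean of 2
print(independent_t(this_year, last_year))  # this class against last year's class
print(paired_t(this_year, partner))         # each student against a study partner
```

The paired version reuses the one sample function because, once each pair is reduced to a single difference, there is only one list of numbers left to test.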


The following example was taken in the class:
McDonald's

A quality index was created based on various parameters. Each parameter (for example dress, mosquitoes) carries points.
Report: Quality Index

Location of Franchise    Mean          N      Std. Deviation
Delhi                    321.998514    16     .0111568
Mumbai                   322.014263    16     .0106913
Pune                     321.998283    16     .0104812
Bangalore                321.995435    16     .0069883
Jaipur                   322.004249    16     .0092022
NOIDA                    322.002452    16     .0086440
Calcutta                 322.006181    16     .0093303
Chandigarh               321.996699    16     .0077085
Total                    322.002009    128    .0108224


Benchmark Quality index – 322

One-Sample Statistics (Quality Index)

Location of Franchise    N     Mean          Std. Deviation    Std. Error Mean
Delhi                    16    321.998514    .0111568          .0027892
Mumbai                   16    322.014263    .0106913          .0026728
Pune                     16    321.998283    .0104812          .0026203
Bangalore                16    321.995435    .0069883          .0017471
Jaipur                   16    322.004249    .0092022          .0023005
NOIDA                    16    322.002452    .0086440          .0021610
Calcutta                 16    322.006181    .0093303          .0023326
Chandigarh               16    321.996699    .0077085          .0019271
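
The Std. Error Mean column is the standard deviation divided by the square root of N. Here is a short plain-Python check against the reported figures; the dictionary and function names are illustrative.

```python
import math

# (N, Std. Deviation) per franchise, copied from the One-Sample Statistics output.
franchise_stats = {
    "Delhi": (16, 0.0111568),
    "Mumbai": (16, 0.0106913),
    "Pune": (16, 0.0104812),
    "Bangalore": (16, 0.0069883),
    "Jaipur": (16, 0.0092022),
    "NOIDA": (16, 0.0086440),
    "Calcutta": (16, 0.0093303),
    "Chandigarh": (16, 0.0077085),
}

def std_error_mean(n, std_dev):
    """Standard error of the mean = standard deviation / sqrt(N)."""
    return std_dev / math.sqrt(n)

for city, (n, sd) in franchise_stats.items():
    print(f"{city:<12}{std_error_mean(n, sd):.7f}")
```

With N = 16 the divisor is exactly 4. The last digit can differ by one from the reported column because the reported standard deviations are themselves rounded.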





One-Sample Test (Quality Index, Test Value = 322)
95% CI = 95% Confidence Interval of the Difference

Location of Franchise    t         Df    Sig. (2-tailed)    Mean Difference    95% CI Lower    95% CI Upper
Delhi                    -.533     15    .602               -.0014858          -.007431        .004459
Mumbai                   5.336     15    .000               .0142629           .008566         .019960
Pune                     -.655     15    .522               -.0017174          -.007302        .003868
Bangalore                -2.613    15    .020               -.0045649          -.008289        -.000841
Jaipur                   1.847     15    .085               .0042486           -.000655        .009152
NOIDA                    1.134     15    .274               .0024516           -.002154        .007058
Calcutta                 2.650     15    .018               .0061813           .001210         .011153
Chandigarh               -1.713    15    .107               -.0033014          -.007409        .000806
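
The t and confidence interval columns follow from the mean difference and the standard error. This is a minimal sketch in plain Python with illustrative names. It assumes the two-tailed 5% critical value for 15 degrees of freedom is taken from a printed t-table as 2.131, so the interval ends can differ from the reported output in the last digit.

```python
import math

T_CRITICAL_DF15 = 2.131  # two-tailed 5% value for 15 df, read from a t-table

# (Mean Difference, Std. Deviation) per franchise, copied from the output; N = 16 for all.
rows = {
    "Delhi": (-0.0014858, 0.0111568),
    "Mumbai": (0.0142629, 0.0106913),
    "Pune": (-0.0017174, 0.0104812),
    "Bangalore": (-0.0045649, 0.0069883),
    "Jaipur": (0.0042486, 0.0092022),
    "NOIDA": (0.0024516, 0.0086440),
    "Calcutta": (0.0061813, 0.0093303),
    "Chandigarh": (-0.0033014, 0.0077085),
}

def one_sample_summary(mean_diff, std_dev, n=16, t_crit=T_CRITICAL_DF15):
    """Return (t, lower, upper) for a one sample t-test from summary figures."""
    std_error = std_dev / math.sqrt(n)
    t = mean_diff / std_error
    return t, mean_diff - t_crit * std_error, mean_diff + t_crit * std_error

for city, (diff, sd) in rows.items():
    t, lower, upper = one_sample_summary(diff, sd)
    print(f"{city:<12}t = {t:7.3f}   CI = ({lower:.6f}, {upper:.6f})")
```

A franchise whose interval does not contain 0 is one whose mean differs from the test value of 322 at the 5% level; this is the same information as Sig. (2-tailed) falling below .05.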



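Df (degrees of freedom) is the minimum number of values we need in order to determine the remaining values; here Df = N - 1 = 15.
If Sig. (2-tailed) is less than .05, reject the null hypothesis that the franchise's mean quality index equals the benchmark of 322.

Hence, we reject the null hypothesis for Mumbai (.000), Bangalore (.020) and Calcutta (.018).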
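
The decision rule can be applied to the Sig. (2-tailed) column mechanically. A small plain-Python sketch, with illustrative names:

```python
ALPHA = 0.05  # significance level used in class

# Sig. (2-tailed) per franchise, copied from the One-Sample Test output.
# Mumbai's .000 is a rounded display value, not an exact zero.
sig_two_tailed = {
    "Delhi": 0.602,
    "Mumbai": 0.000,
    "Pune": 0.522,
    "Bangalore": 0.020,
    "Jaipur": 0.085,
    "NOIDA": 0.274,
    "Calcutta": 0.018,
    "Chandigarh": 0.107,
}

# Franchises whose mean quality index differs significantly from 322.
rejected = [city for city, sig in sig_two_tailed.items() if sig < ALPHA]
print(rejected)
```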

Blog written by: Priya Jain (2013210)

Group Members

Neeraj Garg
Prerna Bansal
Pallavi Gupta 
Piyush Mittal
