Tuesday, 13 August 2013

                        SESSION 11 AND SESSION 12
Types of Variables
Variables may be of two types, continuous and categorical:
Continuous variables -- A continuous variable has numeric values such as 1, 2, 3.14, -5, etc. The relative magnitude of the values is significant
(e.g., a value of 2 indicates twice the magnitude of 1). Examples of continuous variables are blood pressure, height, weight, income, age, and
probability of illness. Some programs call continuous variables “ordered” or “monotonic” variables.

Categorical variables -- A categorical variable has values that function as labels rather than as numbers. Some programs call categorical variables
 “nominal” variables. For example, a categorical variable for gender might use the value 1 for male and 2 for female. The actual magnitude of the
value is not significant; coding male as 7 and female as 3 would work just as well. As another example, marital status might be coded as 1 for single,
 2 for married, 3 for divorced and 4 for widowed. DTREG allows you to use non-numeric (character string) values for categorical variables. So your
 dataset could have the strings “Male” and “Female” or “M” and “F” for a categorical gender variable. Because categorical values are stored and
compared as string values, a categorical value of 001 is different than a value of 1. In contrast, values of 001 and 1 would be equal for continuous variables.
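
The difference between string comparison and numeric comparison can be seen in a small sketch. This is plain Python, not DTREG itself; the variable names are made up for illustration only.

```python
# Categorical values are compared as text, continuous values as numbers.
raw_a, raw_b = "001", "1"

# Treated as categorical labels: the two strings differ.
same_as_categorical = (raw_a == raw_b)

# Treated as continuous values: both convert to the number 1.0.
same_as_continuous = (float(raw_a) == float(raw_b))

print(same_as_categorical)  # False
print(same_as_continuous)   # True
```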

NULL HYPOTHESIS: There is no relation between the two variables.

A chi-squared test, also referred to as chi-square test or χ² test, is any statistical hypothesis test in which the sampling distribution of the test statistic
 is a chi-squared distribution when the null hypothesis is true. Also considered a chi-squared test is a test in which this is asymptotically true, meaning that
 the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the sample size
 large enough.

Chi Square: χ² = sum over all cells of ((O-E)^2)/E, where O is the observed value and E is the expected value of a cell.

By comparing this value with the chi square table you either reject the null hypothesis or do not reject it.

Degrees of freedom: How many values are needed to determine the remaining values once the totals are fixed.
Formula for degrees of freedom = (rows-1)(columns-1)
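
The formulas above can be written out in full as follows. The symbols are assumptions introduced here for illustration: O_ij and E_ij are the observed and expected values in row i and column j, R_i is the total of row i, C_j is the total of column j, N is the grand total, and r and c are the number of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\mathrm{df} = (r - 1)(c - 1)
```

For example, a table with 3 rows and 2 columns has df = (3-1)(2-1) = 2.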

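A confidence level of 95% is used for routine business decisions.
Critical decisions call for a higher confidence level, approaching 100% (for example 99% or 99.9%), since no finite table value gives exactly 100%.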

We can see this in the chi square table.
Chi square table:
With the help of the chi square table and the degrees of freedom we can find the table value of χ² at 95% confidence.
If the value we calculate is greater than the table value we reject the null hypothesis; otherwise we do not reject it.

Observed values (O):
Gender   Females   Males   Total
3        2         3       5
5        2         1       3
6        1         1       2
Total    5         5       10


Expected values (E = row total x column total / grand total):
Gender   Females   Males   Total
3        2.5       2.5     5
5        1.5       1.5     3
6        1         1       2
Total    5         5       10


Values of ((O-E)^2)/E for each cell:
Gender   Females   Males   Total
3        .1        .1      .2
5        .17       .17     .33
6        0         0       0
Total                      .533


Degrees of freedom = (3-1)(2-1) = 2

So we can check it against the chi square table using the degrees of freedom and the required confidence of 95%. For 2 degrees of freedom
the table value is 5.991. The calculated value (about 0.53) is smaller than the table value, therefore we do not reject the null hypothesis.
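
The worked example above can be reproduced with a short sketch in plain Python. The function name `chi_square` and the dictionary layout are made up for illustration; the table value 5.991 is typed in by hand from a standard chi square table (2 degrees of freedom, 95% confidence) rather than computed.

```python
# Observed counts from the worked example: rows are the categories 3, 5, 6;
# columns are Females, Males.
observed = {
    "3": [2, 3],
    "5": [2, 1],
    "6": [1, 1],
}

def chi_square(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count under the null hypothesis of no relation.
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof

statistic, dof = chi_square(observed)

# Assumption: value read from a standard chi square table
# for 2 degrees of freedom at 95% confidence.
CRITICAL_95_DF2 = 5.991

reject_null = statistic > CRITICAL_95_DF2
print(round(statistic, 3), dof, reject_null)
```

Under these assumptions the statistic comes out near 0.53 with 2 degrees of freedom, well below 5.991, so the sketch does not reject the null hypothesis.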








The t table is used for continuous variables.

1 Single sample T-test:
  Example: To check the tolerance of a bolt, we first set a benchmark and compare the sample mean with it, using the significance value of the test.
  Accordingly we decide whether to accept the lot or not.
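
A minimal sketch of the single sample t-test, using only the Python standard library. The bolt measurements, the benchmark of 10.0 and the function name `one_sample_t` are hypothetical and not from the session; the value 2.131 is typed in from a standard t table (15 degrees of freedom, 95% confidence, two-tailed).

```python
import math
import statistics

def one_sample_t(sample, benchmark):
    """Return (t statistic, degrees of freedom) for a single sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean - benchmark) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements of 16 bolts from one lot.
bolts = [10.02, 9.98, 10.01, 9.99, 10.03, 9.97, 10.00, 10.01,
         9.99, 10.02, 9.98, 10.00, 10.01, 9.99, 10.00, 10.02]
BENCHMARK = 10.0  # hypothetical benchmark value

t_stat, dof = one_sample_t(bolts, BENCHMARK)

# Assumption: value read from a standard t table for 15 degrees of freedom.
CRITICAL_95_DF15 = 2.131

# The lot is accepted when the mean does not differ significantly from the benchmark.
lot_accepted = abs(t_stat) <= CRITICAL_95_DF15
print(round(t_stat, 3), dof, lot_accepted)
```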
2 Independent sample T-test:
  Used when the data come from separate, independent groups and we want to know whether the group means are the same even though the individual data are not.
  Example: If we take different cities, calculate their respective means and then do the independent T-test to find out which city is performing well.


  CITIES       MEAN       DEGREES OF FREEDOM   SIGNIFICANCE VALUE
  DELHI        321.998    15                   .602
  MUMBAI       322.0145   15                   .080
  PUNE         321.9983   15                   .522
  BANGALORE    321.9954   15                   .020
  JAIPUR       322.0042   15                   .085
  NOIDA        322.0025   15                   .274
  CALCUTTA     322.0062   15                   .018
  CHANDIGARH   321.9967   15                   .101

  If we set the benchmark as 322 then 4 cities have a mean above the benchmark and 4 cities have a mean below it.
  With the help of the significance values we find that the means of Calcutta, Mumbai and Bangalore are not the same as the benchmark
  (Calcutta and Bangalore at 95% confidence; Mumbai's value of .080, as tabulated, passes only at 90% confidence).
  So we reach the conclusion that Mumbai and Calcutta are performing well, as their means are greater
  than the benchmark set, i.e. 322, but Bangalore is not performing well, as its mean is less than 322.
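
The reading of the table can be sketched in Python as below. The means and significance values are copied from the table; the threshold of 0.05 (95% confidence) and the function name `classify` are assumptions made for illustration. With this threshold Mumbai's .080 does not count as significant, so a looser threshold (for example 0.10) would be needed to include it.

```python
BENCHMARK = 322.0
ALPHA = 0.05  # assumed threshold, corresponding to 95% confidence

# city: (mean, significance value), copied from the table
cities = {
    "DELHI": (321.998, 0.602),
    "MUMBAI": (322.0145, 0.080),
    "PUNE": (321.9983, 0.522),
    "BANGALORE": (321.9954, 0.020),
    "JAIPUR": (322.0042, 0.085),
    "NOIDA": (322.0025, 0.274),
    "CALCUTTA": (322.0062, 0.018),
    "CHANDIGARH": (321.9967, 0.101),
}

def classify(mean, significance, benchmark=BENCHMARK, alpha=ALPHA):
    """Label a city by comparing its significance value with alpha."""
    if significance >= alpha:
        return "no significant difference"
    return "significantly above" if mean > benchmark else "significantly below"

results = {city: classify(m, s) for city, (m, s) in cities.items()}

for city, label in results.items():
    print(city, label)
```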


 
3 Paired sample T-test:
  Example: In medicine, the same parameters are measured on each patient before the medicine is given and again after it is given, and the two sets of measurements are compared.
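
A minimal sketch of the paired sample t-test, which is a single sample t-test on the before/after differences. The readings and the function name `paired_t` are hypothetical; the value 2.571 is typed in from a standard t table (5 degrees of freedom, 95% confidence, two-tailed).

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired sample t-test."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical readings for 6 patients, before and after the medicine.
before = [150, 142, 160, 155, 148, 152]
after = [144, 140, 151, 150, 147, 145]

t_stat, dof = paired_t(before, after)

# Assumption: value read from a standard t table for 5 degrees of freedom.
CRITICAL_95_DF5 = 2.571

significant_change = abs(t_stat) > CRITICAL_95_DF5
print(round(t_stat, 2), dof, significant_change)
```

With these made-up readings the mean difference is -5, and the change is significant at 95% confidence.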


BY: RAGHAV KABRA (2013217)
GROUP MEMBERS: RAGHAV KABRA (2013217)
               ABHISHEK PANWALA (2013190)
               PARITA MANDHANA (2013192)
               POORVA SABOO (2013200)
               PAREENA NEEMA (2013191)





