Wednesday, 14 August 2013

SESSION 11 AND SESSION 12

What is a variable?

A variable is any characteristic, number, or quantity that can be measured or counted. A variable may also be called a data item. Age, sex, business income and expenses, country of birth, capital expenditure, class grades, eye colour and vehicle type are examples of variables. It is called a variable because its value may vary between data units in a population and may change over time.

For example, 'income' is a variable that can vary between data units in a population (i.e. the people or businesses being studied may not have the same incomes) and can also vary over time for each data unit (i.e. income can go up or down).


What are the types of variables?


There are different ways variables can be described according to the ways they can be studied, measured, and presented.

Numeric variables have values that describe a measurable quantity as a number, like 'how many' or 'how much'. Therefore, numeric variables are quantitative variables.

Numeric variables may be further described as either continuous or discrete:
  • A continuous variable is a numeric variable whose observations can take any value within a certain range of real numbers. The value given to an observation for a continuous variable can include values as small as the instrument of measurement allows. Examples of continuous variables include height, time, age, and temperature.
  • A discrete variable is a numeric variable whose observations can take a value based on a count from a set of distinct whole values. A discrete variable cannot take the value of a fraction between one value and the next closest value. Examples of discrete variables include the number of registered cars, number of business locations, and number of children in a family, all of which are measured as whole units (i.e. 1, 2, 3 cars).

The data collected for a numeric variable are quantitative data.
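
The distinction between discrete and continuous variables can be illustrated with a short Python sketch. The function name `classify_numeric` and the sample data are hypothetical, and the rule used (whole-number observations are treated as counts) is only a rough heuristic: a continuous quantity recorded to the nearest whole unit would be mislabelled by it.

```python
from numbers import Real


def classify_numeric(values):
    """Label numeric observations as 'discrete' or 'continuous'.

    Heuristic assumption: if every observation is a whole number,
    the variable is treated as a count (discrete).
    """
    for v in values:
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(v, bool) or not isinstance(v, Real):
            raise ValueError("numeric variables need numeric observations")
    if all(float(v).is_integer() for v in values):
        return "discrete"
    return "continuous"


# Hypothetical observations, for illustration only
cars_per_household = [1, 2, 3, 0, 2]
heights_cm = [162.5, 170.2, 181.0, 158.75]

print(classify_numeric(cars_per_household))
print(classify_numeric(heights_cm))
```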



NULL HYPOTHESIS :

The null hypothesis states that there is no relation between two variables.

Chi-Square

A chi-squared test, also referred to as a chi-square test or χ² test, is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. A test in which this is asymptotically true is also considered a chi-squared test, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the sample size large enough.

Chi-square = Σ ((O-E)^2)/E

O = Observed frequency
E = Expected frequency
Σ = Sum of the above across all cells
To find the probability value (p) associated with the obtained Chi-square statistic:
a. Calculate the degrees of freedom (df):
df = (# rows - 1)(# columns - 1)
b. Use the abbreviated table of critical values for the Chi-square test to find the p value.
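
As a rough illustration of the calculation described above, here is a minimal Python sketch for a table of observed counts. The function name `chi_square` and the observed counts are hypothetical. The notes do not say how the expected frequencies are obtained, so the sketch assumes a test of independence, where E for each cell is (row total × column total) / grand total. The critical value 3.841 is the usual table entry for df = 1 at the 0.05 level.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    Assumption: expected frequencies come from the independence model,
    E = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e  # ((O-E)^2)/E summed over all cells

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Hypothetical 2 x 2 table of observed frequencies
observed = [[10, 20],
            [30, 40]]
statistic, df = chi_square(observed)

CRITICAL_0_05_DF1 = 3.841  # table value for df = 1 at the 0.05 level
reject_null = statistic > CRITICAL_0_05_DF1
print(round(statistic, 4), df, reject_null)
```

With these made-up counts the statistic is well below the critical value, so the null hypothesis of no relation would not be rejected.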

T-test

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is true. It can be used to determine if two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t distribution.

Independent samples

The independent samples t-test is used when two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared. For example, suppose we are evaluating the effect of a medical treatment, and we enroll 100 subjects into our study, then randomize 50 subjects to the treatment group and 50 subjects to the control group. In this case, we have two independent samples and would use the unpaired form of the t-test. The randomization is not essential here—if we contacted 100 people by phone and obtained each person's age and gender, and then used a two-sample t-test to see whether the mean ages differ by gender, this would also be an independent samples t-test, even though the data are observational.

Paired samples

Paired samples t-tests typically consist of a sample of matched pairs of similar units, or one group of units that has been tested twice (a "repeated measures" t-test).
A typical example of the repeated measures t-test would be where subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure lowering medication. By comparing the same patient's numbers before and after treatment, we are effectively using each patient as their own control. That way the correct rejection of the null hypothesis (here: of no difference made by the treatment) can become much more likely, with statistical power increasing simply because the random between-patient variation has now been eliminated. Note, however, that an increase of statistical power comes at a price: more tests are required, each subject having to be tested twice. Because half of the sample now depends on the other half, the paired version of Student's t-test has only 'n/2 - 1' degrees of freedom (with 'n' being the total number of observations). Pairs become individual test units, and the sample has to be doubled to achieve the same number of degrees of freedom.
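
The repeated measures idea can be sketched in Python as follows. The function name `paired_t_test` and the blood-pressure readings are hypothetical, made up only for illustration. The sketch uses the standard paired form: the t-statistic is the mean of the within-pair differences divided by its standard error, with (number of pairs - 1) degrees of freedom, which matches the 'n/2 - 1' described above.

```python
import math
from statistics import mean, stdev


def paired_t_test(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("each subject needs one 'before' and one 'after' value")
    # Each pair is reduced to a single difference, so pairs are the test units
    diffs = [b - a for b, a in zip(before, after)]
    n_pairs = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n_pairs)
    t = mean(diffs) / standard_error
    df = n_pairs - 1  # equals n/2 - 1 when n counts all observations
    return t, df


# Hypothetical readings for five patients (10 observations in total)
before = [150, 160, 145, 155, 165]
after = [140, 150, 140, 150, 155]
t, df = paired_t_test(before, after)
print(round(t, 3), df)
```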
How It Works
  1. The null hypothesis is that the two population means are equal to each other. To test the null hypothesis, you need to calculate the following values: x̄1 and x̄2 (the means of the two samples), s1² and s2² (the variances of the two samples), n1 and n2 (the sample sizes of the two samples), and k (the degrees of freedom).
T-test formula
  2. Compute the t-statistic.
T-test statistic
  3. Compare the calculated t-value, with k degrees of freedom, to the critical t value from the t distribution table at the chosen confidence level and decide whether to reject the null hypothesis.
Reject the null hypothesis when: calculated t-value > critical t-value (for a two-sided test, compare the absolute value of the calculated t-value).
  Note: This procedure can be used when the variances of the two populations are not equal, and the sample sizes are not equal.
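
The formulas themselves did not survive in the text above, so the following Python sketch is an assumption rather than a reproduction of them. Since the procedure is said to work with unequal variances and unequal sample sizes, the sketch uses the standard unequal-variance (Welch) form: t = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2), with k given by the Welch-Satterthwaite approximation. The function name `welch_t_test` and the sample data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t_test(sample1, sample2):
    """Return (t statistic, degrees of freedom k) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    x1, x2 = mean(sample1), mean(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances

    a, b = s1_sq / n1, s2_sq / n2
    t = (x1 - x2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    k = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, k


# Hypothetical samples with different sizes and spreads
sample1 = [1, 2, 3, 4, 5]
sample2 = [2, 4, 6, 8, 10, 12]
t, k = welch_t_test(sample1, sample2)
print(round(t, 3), round(k, 2))
```

Because k is usually not a whole number, it is commonly rounded down before looking up the critical value in a t distribution table.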



BY :  Priyadarshi Tandon(2013211)

GROUP MEMBERS : 

1) PRIYADARSHI TANDON
2) P.PRIYATHAM KIREETI 
3) NISHIDH LAD
4) P.S.V.P.S.G. KARTHEEKI
5) P. KALYANI
