Monday 2 September 2013

T-Test and Z-Test

INTRODUCTION

T-TEST
A statistical test comparing the means of two populations. A two-sample t-test examines whether two samples differ and is commonly used when the variances of the two normal distributions are unknown and when an experiment uses a small sample size. For example, a t-test could be used to compare the average score obtained by Class A on a Maths test to the average score obtained by Class B on the same test.

Z-TEST 
A statistical test used to determine whether two population means are different when the variances are known and the sample size is large. The test statistic is assumed to have a normal distribution, and nuisance parameters such as the standard deviation must be known for an accurate z-test to be performed.

A single sample experiment compares one sample to a population. There are two types of statistics you can use to compare a single sample to a population:

A. Single Sample z-test
If your sample size is large (conventionally, above about 30) then this is the appropriate statistic to use.
The single sample z-test formula is shown below:

z = (x̄ − μ) / SE

The formula reads: the z statistic equals the sample mean minus the population mean, divided by the standard error.
As mentioned in an earlier lesson, if we do not know the population standard deviation we can use the sample standard deviation to estimate the standard error.

SE = s / √n

The formula reads: the standard error equals the sample standard deviation divided by the square root of the sample size.
When we use an alpha level of 0.05, any z score that results in a probability of less than 0.05 allows us to reject the null hypothesis and accept the research hypothesis.  All you need to know is the minimal z score necessary for significance.  Rather than constantly going to the Z table you can just memorize the one-tailed and two-tailed z scores that equate to a 0.05 level of significance. If you go back to the end of Lesson 11 you will see that a two-tailed hypothesis needs a z score of 1.96 to be significant while a one-tailed test needs a z score of 1.64 to be significant.
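As a quick illustration, the single sample z-test above can be computed in a few lines of Python. The sample values here are made up for the example:

```python
import math

# Hypothetical example data (made up for illustration)
sample_mean = 102.0    # x-bar: the mean of our sample
pop_mean = 100.0       # mu: the known population mean
sample_sd = 15.0       # s: sample standard deviation (estimates the population SD)
n = 225                # sample size

# Standard error: s divided by the square root of n
se = sample_sd / math.sqrt(n)          # 15 / 15 = 1.0

# z-test: (sample mean - population mean) / standard error
z = (sample_mean - pop_mean) / se      # 2.0

# Two-tailed test at alpha = 0.05: |z| must exceed 1.96
significant = abs(z) > 1.96
print(z, significant)                  # 2.0 True
```

Because 2.0 exceeds the two-tailed critical value of 1.96, this hypothetical result would let us reject the null hypothesis.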
B. Single Sample t-test
If your sample size is small (conventionally, about 30 or fewer) then this is the appropriate statistic to use.
The single sample t-test formula is shown below:

t = (x̄ − μ) / SE

The formula reads: the t statistic equals the sample mean minus the population mean, divided by the standard error.
As with the z-test above, if we do not know the population standard deviation we can use the sample standard deviation to estimate the standard error.

SE = s / √n

The formula reads: the standard error equals the sample standard deviation divided by the square root of the sample size.
The t distribution is similar to the z distribution in that both are symmetrical, bell-shaped sampling distributions. The overall shape of the t distribution, however, is influenced by the sample size used to generate it: with small samples it has heavier tails, and as the sample grows it approaches the z distribution. This is why you should use the z-test when the sample is large (conventionally, n > 30) and the t-test when the sample is small, and it is also why we need to use degrees of freedom to determine our significance threshold. For a single sample t-test the degrees of freedom calculation is as follows:
df = n - 1

Now we can go to the T Table to see if our statistic is significant.


One-Tail:   .4     .25    .1     .05     .025    .01     .005    .0025    .001     .0005
Two-Tail:   .8     .5     .2     .1      .05     .02     .01     .005     .002     .001
df
 1        0.325  1.000  3.078  6.314  12.706  31.821  63.657  127.32   318.31   636.62
 2        0.289  0.816  1.886  2.920   4.303   6.965   9.925   14.089   22.327   31.598
 3        0.277  0.765  1.638  2.353   3.182   4.541   5.841    7.453   10.214   12.924
 4        0.271  0.741  1.533  2.132   2.776   3.747   4.604    5.598    7.173    8.610
 5        0.267  0.727  1.476  2.015   2.571   3.365   4.032    4.773    5.893    6.869
 6        0.265  0.718  1.440  1.943   2.447   3.143   3.707    4.317    5.208    5.959
 7        0.263  0.711  1.415  1.895   2.365   2.998   3.499    4.029    4.785    5.408
 8        0.262  0.706  1.397  1.860   2.306   2.896   3.355    3.833    4.501    5.041
 9        0.261  0.703  1.383  1.833   2.262   2.821   3.250    3.690    4.297    4.781

The T Table continues
The T Table is similar to the R Table we used in lesson 7. The degrees of freedom are in the far left column and the levels of significance for each type of tailed test are in the above column headings. As with the R Table critical R values, the T Table gives you the critical T values. Your calculated T value must surpass the critical T value for your statistic to be considered significant.
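The whole single sample t-test procedure can be sketched in Python. The scores below are made up for the example:

```python
import math

# Hypothetical sample (made up for illustration)
scores = [12, 15, 11, 14, 13]
pop_mean = 10.0                # mu: the population mean we are testing against

n = len(scores)                                  # 5
sample_mean = sum(scores) / n                    # 13.0

# Sample standard deviation (n - 1 in the denominator)
ss = sum((x - sample_mean) ** 2 for x in scores)
sample_sd = math.sqrt(ss / (n - 1))

se = sample_sd / math.sqrt(n)                    # standard error: s / sqrt(n)
t = (sample_mean - pop_mean) / se                # t statistic
df = n - 1                                       # degrees of freedom = 4

# Critical T value from the T Table: two-tailed, alpha = .05, df = 4
critical_t = 2.776
print(round(t, 3), abs(t) > critical_t)          # 4.243 True
```

The calculated t of about 4.24 exceeds the critical value of 2.776 at df = 4, so this hypothetical result would be significant.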

III. Two Sample Experimental Statistics
For these experiments we are comparing two samples. This is the very common control group vs. experimental group research design. There are two ways to conduct the analysis based on your sample groups.
A. t-test for Independent Groups
If your two sample groups are independent of each other then you can conduct a t-test for independent groups. The formula for this specific type of t-test is as follows:

t = (x̄1 − x̄2) / SEdiff

The formula reads: t (for independent groups) equals sample mean 1 minus sample mean 2, divided by the standard error of the difference.
The standard error of the difference is similar to the standard error calculated earlier; it is simply the appropriate estimate for comparing two independent samples. The standard error of the difference between independent sample means can be calculated with the formula below:

SEdiff = √(SE1² + SE2²)

The formula reads: the standard error of the difference equals the square root of the standard error of sample 1 squared plus the standard error of sample 2 squared.
The calculation for the degrees of freedom is as follows:

df independent groups = (n1 - 1) + (n2 - 1)

Once your calculations are complete you go to the T Table to see if your statistic is significant as above.
B. t-test for Correlated Groups
If the two samples are not independent of each other but instead are positively correlated with each other, we conduct a t-test for correlated groups. There are two ways of calculating this statistic: one uses the correlation coefficient (r) of the two samples and one does not.

 1. t-test for Correlated Groups: using the r value
The t-test formula is the same as was used for independent groups:

t = (x̄1 − x̄2) / SEdiff

The formula reads: t (for correlated groups) equals sample mean 1 minus sample mean 2, divided by the standard error of the difference.
The new standard error formula is as follows:

SEdiff = √(SE1² + SE2² − 2 · r · SE1 · SE2)

The formula reads: the standard error of the difference equals the square root of the following: the squared standard error of the first sample mean plus the squared standard error of the second sample mean, minus the product of 2 times the r value times the standard error of the first sample times the standard error of the second sample.
The calculation for the degrees of freedom is as follows:

df correlated groups = number of pairs - 1

Once your calculations are complete you go to the T Table to see if your statistic is significant as above.
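A small Python sketch of the correlated-groups standard error, with made-up standard errors and r value, shows how a positive correlation shrinks the standard error of the difference (and so makes the test more sensitive):

```python
import math

# Hypothetical values (made up for illustration):
# standard errors of each sample mean and the correlation between the samples
se1 = 1.2
se2 = 1.0
r = 0.5

mean1 = 14.0      # hypothetical sample means
mean2 = 11.5

# Correlated-groups SE of the difference:
# sqrt(SE1^2 + SE2^2 - 2 * r * SE1 * SE2)
se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)

t = (mean1 - mean2) / se_diff
print(round(se_diff, 3), round(t, 3))    # 1.114 2.245
```

Note that with r = 0 the subtracted term vanishes and the formula reduces to the independent-groups version.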
2. t-test for Correlated Groups: using raw data
The t-test for correlated groups using the raw data is as follows:

t = D̄ / SED

The formula reads: t (for correlated groups) equals D bar divided by the standard error of the difference scores.

D bar is the mean of all the difference scores. Difference scores are calculated by subtracting each Y value from its X pair value. You then sum these difference scores and divide by the number of pairs to get D bar. An example is shown in the table below:

X     Y     D
15    5     10
7     1     6
12    8     4
18    12    6
8     9     -1
      sum = 25

n = 5
D bar = 25/5 = 5
The new standard error formula is as follows:

SED = √( ((ΣD² / n) − D̄²) / (n − 1) )

The formula reads: the standard error of the difference equals the square root of the following: the sum of D squared divided by n, minus D bar squared, with this entire quantity divided by the number of pairs minus one.

In order to get the sum of D squared you need to generate a new column of data as is shown below:

X     Y     D     D²
15    5     10    100
7     1     6     36
12    8     4     16
18    12    6     36
8     9     -1    1
      sum = 25    sum = 189

n = 5
D bar = 25/5 = 5

The calculation for the degrees of freedom is the same:

df correlated groups = number of pairs - 1

Once your calculations are complete you go to the T Table to see if your statistic is significant as above.
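Using the X and Y pairs from the table above, the whole raw-data calculation can be sketched in Python:

```python
import math

# The X/Y pairs from the table above
X = [15, 7, 12, 18, 8]
Y = [5, 1, 8, 12, 9]

# Difference scores: D = X - Y for each pair
D = [x - y for x, y in zip(X, Y)]         # [10, 6, 4, 6, -1]

n = len(D)                                # number of pairs = 5
d_bar = sum(D) / n                        # 25 / 5 = 5.0
sum_d_squared = sum(d ** 2 for d in D)    # 189

# SE = sqrt( (sum(D^2)/n - D bar^2) / (n - 1) )
se = math.sqrt((sum_d_squared / n - d_bar ** 2) / (n - 1))

t = d_bar / se
df = n - 1                                # 4

# Critical T value from the T Table: two-tailed, alpha = .05, df = 4
print(round(t, 3), abs(t) > 2.776)        # 2.795 True
```

The calculated t of about 2.795 just exceeds the critical value of 2.776 at df = 4, so the difference between the paired samples in this example is significant at the .05 level.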



 Name : Nilay Kohaley (2013172)

Members : Pawan Agarwal
                    Priyanka Doshi
                    Pragya Singh
                    Poulami Sarkar

References : Investopedia, Wikipedia
