We were required to carry out a small project based on the statistical analysis of data. Our topic was ‘Month-wise production of cement in India’. The data was collected from www.indiastat.com. The following is a snapshot of the data.
On this data, we were asked to perform the following functions.
1. Mean
The mean is what most people commonly refer to as an average. It is the number you obtain when you sum a given set of numbers and then divide this sum by how many numbers are in the set. It is more precisely called the arithmetic mean.
mean = (sum of elements in set) / (number of elements in set)
Example.
To find the mean of the set of numbers below
3, 4, -1, 22, 14, 0, 9, 18, 7, 0, 1
The first step is to count how many numbers there are in the set, which we shall call n,
n = 11
The next step is to add up all the numbers in the set
sum = 77
The last step is to find the actual mean by dividing the sum by n,
mean = 77 / 11 = 7
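The three steps of this example can be sketched in a few lines of Python. This is only an illustration using the example's numbers; the variable names are our own.

```python
# Values from the worked example above
values = [3, 4, -1, 22, 14, 0, 9, 18, 7, 0, 1]

n = len(values)      # step 1: count how many numbers are in the set
total = sum(values)  # step 2: add up all the numbers
mean = total / n     # step 3: divide the sum by n

print(n, total, mean)  # 11 77 7.0
```

Counting with len() rather than by eye avoids miscounting the set, which is the easiest mistake to make in this calculation.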
2. Median
The median is the number positioned in the exact middle of a given set of numbers once they are arranged in order of increasing magnitude, from the lowest to the highest. When the set has an even number of values, the median is the mean of the two middle values. Like the mean, the median is a measure of average (central tendency), not of dispersion. It is useful because, unlike the mean, it is not pulled far off by a few extreme values.
To find the median in the set of numbers given below
15, 16, 15, 7, 11, 18, 19, 20, 21
From the definition of median, the first step is to rearrange the given set of numbers in order of increasing magnitude, i.e. from the lowest to the highest
7, 11, 15, 15, 16, 18, 19, 20, 21
Then we inspect the set to find the number which lies in the exact middle, here the fifth of the nine values.
median = 16
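The same sort-then-pick-the-middle procedure can be sketched in Python. The values are chosen to match the sorted list in the example. The even-length branch, which averages the two middle values, is the usual convention and is not needed for this nine-value set.

```python
import statistics

# Values matching the sorted list in the example above
values = [15, 16, 15, 7, 11, 18, 19, 20, 21]

ordered = sorted(values)    # arrange from lowest to highest
middle = len(ordered) // 2  # index of the middle position

if len(ordered) % 2 == 1:
    median = ordered[middle]  # odd count: the single middle value
else:
    # even count: mean of the two middle values
    median = (ordered[middle - 1] + ordered[middle]) / 2

print(ordered)  # [7, 11, 15, 15, 16, 18, 19, 20, 21]
print(median)   # 16

# Cross-check against the standard library
assert median == statistics.median(values)
```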
3. Mode
The mode is the element that appears most frequently in a given set of elements. If the frequency of an element is the number of times it occurs, the mode is the element with the largest frequency in the data set. A data set can have more than one mode: if several elements share the same highest frequency, they are all modal elements of the data set.
Example.
To find the mode of the following data set
3, 12, 15, 3, 15, 8, 20, 19, 3, 15, 12, 19, 9
Mode = 3 and 15 (each occurs three times, more often than any other element)
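A short Python sketch of this frequency count, using the example's data set. It returns every element that shares the highest frequency, so it also handles data sets with more than one mode.

```python
from collections import Counter

# Data set from the example above
values = [3, 12, 15, 3, 15, 8, 20, 19, 3, 15, 12, 19, 9]

counts = Counter(values)        # frequency of each element
highest = max(counts.values())  # the largest frequency

# All elements whose frequency equals the largest one
modes = sorted(v for v, c in counts.items() if c == highest)

print(highest)  # 3
print(modes)    # [3, 15]
```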
4. T-test
We use this test for comparing the means of two samples (or treatments), even if they have different numbers of replicates. In simple terms, the t-test compares the actual difference between two means in relation to the variation in the data (expressed as the standard deviation of the difference between the means).
The formula given below is used to compute the t-test value
t = (x1 - x2) / sqrt(S1²/N1 + S2²/N2)
Where,
x1 is the mean of the first data set, x2 is the mean of the second data set
S1² is the variance (the square of the standard deviation) of the first data set, S2² is the variance of the second data set
N1 is the number of elements in the first data set, N2 is the number of elements in the second data set
Example.
Calculate the t-test value for the two data sets 10, 20, 30, 40, 50 and 1, 29, 46, 78, 99.
First calculate the mean and standard deviation of each data set,
For 10, 20, 30, 40, 50
Total inputs (N) = 5
Mean (xm) = 30
SD = 15.8114
For 1, 29, 46, 78, 99
Total inputs (N) = 5
Mean (xm) = 50.6
SD = 38.8626
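The SD values above are sample standard deviations, meaning the sum of squared deviations is divided by N - 1 rather than N. A minimal Python sketch of that calculation follows; the function name sample_sd is our own.

```python
import math

def sample_sd(values):
    """Sample standard deviation, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

print(round(sample_sd([10, 20, 30, 40, 50]), 4))  # 15.8114
print(round(sample_sd([1, 29, 46, 78, 99]), 4))   # 38.8626
```

Squaring these results gives the variances 250 and 1510.3 used in the t-test calculation.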
To perform the t-test
From above we know that,
x1 = 30, x2 = 50.6, S1² = 250, S2² = 1510.3, N1 = 5, N2 = 5
Substitute these values in the above formula,
t = (30 - 50.6) / sqrt(250/5 + 1510.3/5) = -20.6 / sqrt(352.06) = -1.0979
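The whole calculation can be checked with a short Python sketch. It assumes the unequal-variance form of the statistic, t = (x1 - x2) / sqrt(S1²/N1 + S2²/N2), which reproduces the value above. The function name t_statistic is our own.

```python
import math
import statistics

def t_statistic(sample1, sample2):
    """t = (x1 - x2) / sqrt(S1^2/N1 + S2^2/N2), using sample variances."""
    x1, x2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    n1, n2 = len(sample1), len(sample2)
    # Standard deviation of the difference between the two means
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (x1 - x2) / standard_error

t = t_statistic([10, 20, 30, 40, 50], [1, 29, 46, 78, 99])
print(round(t, 4))  # -1.0979
```

The negative sign only reflects that the first mean is smaller than the second. Swapping the two data sets gives +1.0979.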
5. Cross Tabulation
Cross tabulation is a statistical process that summarises categorical data in a contingency table. Such tables are heavily used in survey research, business intelligence, engineering and scientific research. They provide a basic picture of the interrelation between two variables and can help find interactions between them.
Example.
We examine the 3 rows for the unit J1. This unit needs t-shirts in both adult and child sizes.
A total of 8 adult (A) t-shirts (Total Of ID): 2 medium (M), 3 small (S), 3 extra large (X)
A total of 8 child (C) t-shirts (Total Of ID): 5 large (L), 2 medium (M), 1 extra large (X)
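To show how such a contingency table is built, here is a small Python sketch. The individual order rows are hypothetical: they are reconstructed so that the counts match the totals quoted above, since the original rows are not shown here.

```python
from collections import Counter

# Hypothetical order rows for unit J1, one (type, size) pair per t-shirt,
# reconstructed so the counts match the totals quoted above.
orders = (
    [("A", "M")] * 2 + [("A", "S")] * 3 + [("A", "X")] * 3
    + [("C", "L")] * 5 + [("C", "M")] * 2 + [("C", "X")] * 1
)

table = Counter(orders)                       # count each (type, size) pair
sizes = sorted({size for _, size in orders})  # column headings: L, M, S, X

# Print one row per shirt type, with a row total at the end
print("Type", *sizes, "Total", sep="\t")
for shirt_type in ("A", "C"):
    row = [table[(shirt_type, size)] for size in sizes]
    print(shirt_type, *row, sum(row), sep="\t")
```

Each cell is the number of t-shirts with that combination of type and size. Combinations that never occur, such as adult large, show as 0.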
By Pallavi Gupta (2013187)
Group Members:
Piyush (2013197)
Prerna Bansal (2013209)
Priya Jain (2013210)
Neeraj Garg (2013318)