The 17th and 18th sessions began with us incorporating some important statistical tools into our projects, such as measures of central tendency (mean, median and mode), the t-test, regression, correlation and cross-tabulation (Xtab).
Measures of Central Tendency
A measure of
central tendency is a single value that attempts to describe a set of data by
identifying the central position within that set of data. As such, measures of
central tendency are sometimes called measures of central location. They are
also classed as summary statistics. The mean, median and mode are all valid
measures of central tendency, but under different conditions, some measures of
central tendency become more appropriate to use than others.
· Mean - The mean (or average) is the most popular and well-known measure of central tendency. It can be used with both discrete and continuous data, although it is most often used with continuous data. The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.
· Median - The median is the middle value in a set of data that has been arranged in order of magnitude. It is less affected by outliers and skewed data than the mean. To calculate the median, we first arrange the data in order of magnitude (smallest first); the median is then the middle value, or the average of the two middle values when the number of values is even.
· Mode - The mode is the most frequent value in our data set. On a histogram or bar chart it corresponds to the highest bar. The mode is therefore sometimes thought of as the most popular option.
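As a rough illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The `marks` list is made-up data (not from the session), and its last value is a deliberate outlier so that the effect on the mean and the median can be compared.

```python
from statistics import mean, median, multimode

# Hypothetical exam marks; the last value is a deliberate outlier.
marks = [55, 60, 60, 65, 70, 75, 140]

avg = mean(marks)         # sum of the values divided by the number of values
mid = median(marks)       # middle value once the data is sorted
modes = multimode(marks)  # list of the most frequent value(s)

print("mean:", avg, "median:", mid, "mode(s):", modes)

# Drop the outlier and recompute to see which measure moves more.
trimmed = marks[:-1]
print("without outlier -> mean:", mean(trimmed), "median:", median(trimmed))
```

In this made-up data set, removing the outlier shifts the mean by roughly eleven marks but the median by only two and a half, which is why the median is usually preferred for skewed data.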
Summary of when to use the mean, median and mode
Type of Variable | Best measure of central tendency
Nominal | Mode
Ordinal | Median
Interval/Ratio (not skewed) | Mean
Interval/Ratio (skewed) | Median
T-Test
A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution when the null hypothesis is true. It can be used to determine whether two sets of data are significantly different from each other, and is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t distribution.
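To make the role of the scaling term concrete, consider the one-sample case as an illustration. The symbols are introduced here for the example and are not part of the session notes: n independent observations x_1, ..., x_n from a normal population, sample mean \bar{x}, hypothesised mean \mu_0, population standard deviation \sigma and sample standard deviation s. Under the null hypothesis:

```latex
Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \sim N(0, 1)
\qquad \text{(scaling term } \sigma \text{ known)}

T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1},
\qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
\qquad \text{(} \sigma \text{ replaced by its estimate } s \text{)}
```

Here the scaling term is \sigma; replacing it with the estimate s is what changes the reference distribution from the normal to Student's t with n - 1 degrees of freedom.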
Among the most frequently used t-tests are:
- A one-sample location test of whether the mean of a population has a value specified in a null hypothesis.
- A two-sample location test of the null hypothesis that the means of two populations are equal.
- A test of the null hypothesis that the difference between two responses measured on the same statistical unit has a mean value of zero.
- A test of whether the slope of a regression line differs significantly from 0.
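The first three tests in the list can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the method used in the session: the function names and the sample data are hypothetical, the two-sample version assumes equal population variances (the pooled form), and only the t statistic and degrees of freedom are computed. Turning the statistic into a p-value needs the t distribution's CDF, which a statistics package such as SciPy provides. The regression-slope test is left out.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error, n - 1


def two_sample_t(a, b):
    """Pooled-variance t statistic for H0: the two population means are equal.

    Assumes both populations share the same variance.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2


def paired_t(first, second):
    """t statistic for H0: the mean within-unit difference is zero."""
    diffs = [y - x for x, y in zip(first, second)]
    return one_sample_t(diffs, 0)


# Hypothetical scores for the same five students before and after a course.
before = [72, 68, 75, 80, 65]
after = [75, 70, 78, 84, 68]

t_paired, df_paired = paired_t(before, after)
print("paired t:", round(t_paired, 3), "df:", df_paired)

t_two, df_two = two_sample_t(before, after)
print("two-sample t:", round(t_two, 3), "df:", df_two)
```

On the same made-up numbers the paired statistic is much larger in magnitude than the two-sample one, because pairing removes the variation between students and leaves only the within-student differences.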
Submitted By:- Priyanka Doshi - 2013212
Group members:-
Nilay Kohaley – 2013172
Pawan Agarwal – 2013195
Poulami Sarkar – 2013201
Pragya Singh – 2013203