Statistics can be boring, since it requires us to work plainly with numbers, but as our professor pointed out, even statistics can be interesting if we build a story around it. This seemed quite an absorbing thought, and on that note the class began with the introduction of a new concept: cross tabulation.
CROSS TABULATION
Cross-tabulation is one of the most useful analytical tools and is a mainstay of the market research industry. Cross-tabulation analysis, also known as contingency table analysis, is most often used to analyze categorical (nominal measurement scale) data. A cross-tabulation is a table of two (or more) dimensions that records the number (frequency) of respondents that have the specific characteristics described in the cells of the table. Cross-tabulation tables provide a wealth of information about the relationship between the variables. In simple terms, cross tabulation is a presentation of data about categorical variables in tabular form to aid in identifying a relationship between the variables.
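As a minimal sketch of the idea in Python (standard library only), the snippet below tallies a handful of made-up survey records into a two-way table. The records and the `crosstab` helper are hypothetical and only for illustration; they are not taken from the class data set.

```python
from collections import Counter

# Hypothetical respondents as (store, service satisfaction) pairs.
# These are invented for illustration and are NOT the class data.
records = [
    ("Store 1", "Positive"), ("Store 1", "Negative"), ("Store 1", "Positive"),
    ("Store 2", "Negative"), ("Store 2", "Negative"), ("Store 2", "Positive"),
]

def crosstab(pairs):
    """Count how many records fall into each (row, column) cell."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for cells that never occur, so every cell is filled.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = crosstab(records)
for store, cells in table.items():
    print(store, cells)
```

Each cell of `table` holds the number of respondents sharing that combination of store and satisfaction level, which is exactly what a cross-tabulation records.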
Cross-tabulation analysis has its
own unique language, using terms such as “banners”, “stubs”, “Chi-Square
Statistic” and “Expected Values.”
In order to gain a better understanding of the use of cross tabulation, we were provided with an SPSS data sheet covering a setup of 4 stores, along with other parameters such as gender, age category of shoppers, shopping frequency and service satisfaction. Our task was to evaluate the levels of service satisfaction across the stores.
The first and foremost thing we had to deal with was to determine the direction of the relationship. To do this, we were told to determine which variable is the "dependent" variable and which is the "independent" variable (in other words, what influences what). In the instant case, the store was the independent variable and service satisfaction was the dependent variable.
Another point we observed was that the percentages in the cross tabulation held more relevance across the row: each row is a store, so the row percentages add up to 100% within that store, which helped us identify the particular store that had a problem with service satisfaction. The results showed that Store 2 had the greatest percentage of unsatisfied people, while Store 3 had the highest levels of service satisfaction.
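The row percentages can be reproduced with a few lines of Python. The counts below are an assumption: the plain two-way table is not shown in this post, so they were obtained by adding together the "No" and "Yes" contact layers of the crosstabulation reproduced further down. Grouping the first two levels as "negative" and the last two as "positive" is also just one illustrative way of summarising the rows.

```python
# Columns: Strongly Negative, Somewhat Negative, Neutral,
#          Somewhat Positive, Strongly Positive.
# Assumed two-way counts (sum of the two contact-with-employee layers).
counts = {
    "Store 1": [25, 20, 38, 30, 33],
    "Store 2": [26, 30, 34, 27, 19],
    "Store 3": [15, 20, 41, 33, 29],
    "Store 4": [27, 35, 44, 22, 34],
}

def row_percentages(row):
    """Express each cell as a percentage of its row (store) total."""
    total = sum(row)
    return [100.0 * n / total for n in row]

pct = {store: row_percentages(row) for store, row in counts.items()}

# Share of negative (first two levels) and positive (last two levels) answers.
negative = {s: p[0] + p[1] for s, p in pct.items()}
positive = {s: p[3] + p[4] for s, p in pct.items()}

for s in counts:
    print(f"{s}: negative {negative[s]:.1f}%, positive {positive[s]:.1f}%")
```

On these counts Store 2 comes out with the largest negative share and Store 3 with the largest positive share, in line with what we saw in class.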
This led us to the conclusion that Store 2 was the problem area. However, it was still to be established whether we could trust these results. Could the results have happened by chance? This left scope for further analysis. It was then that we were introduced to the Pearson chi-square test.
CROSS-TABULATION WITH CHI-SQUARE ANALYSIS
The chi-square statistic is the primary statistic used for testing the statistical significance of the cross-tabulation table. The chi-square test checks whether or not the two variables are independent. If the variables are independent (have no relationship), then the results of the statistical test will usually be “non-significant” and we “are not able to reject the null hypothesis”, meaning that the data do not give us enough evidence to claim a relationship between the variables.
Null Hypothesis
In statistical inference, the null hypothesis refers to a general or default position: that there is no relationship between the 2 variables tested. The null hypothesis is a precise claim that is capable of being proven false; rejecting it leads us to conclude that there are grounds for believing that there is a relationship between the 2 variables.
If the variables are related, then the results of the statistical test will tend to be “statistically significant” and we “are able to reject the null hypothesis”, meaning that we can state that there is some relationship between the variables. The chi-square statistic, along with the associated probability of chance observation (the p-value), may be computed for any table. If the observed table relationships would occur with very low probability when the variables are in fact independent (say 5% or less), then we say that the results are “statistically significant” at the “.05 or 5% level”. This means that a pattern this strong would rarely arise by chance alone if the variables were independent.
The probability values (.05 or .01) reflect the researcher’s willingness to accept a type I error, that is, the probability of rejecting a true null hypothesis (meaning that we thought there was a relationship between the variables when there really wasn’t). Furthermore, these error risks accumulate: if 20 tables with no real relationship are tested at the .05 level, the researcher should expect about one of them to be incorrectly found to have a relationship (20 x .05 = 1 expected false positive), and, if the tests are independent, the chance of at least one such error is 1 - 0.95^20, or about 64%. Depending on the cost of making mistakes, the researcher may apply more stringent criteria for declaring “significance”, such as .01 or .005.
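To put numbers on how the type I error risk builds up over many tables, here is a small sketch. It assumes that none of the 20 tables has a real relationship and that the 20 tests are independent of one another; both are simplifying assumptions, and the variable names are only illustrative.

```python
alpha = 0.05      # significance level used for each table
n_tables = 20     # number of tables tested, none with a real relationship

# On average, this many tables are wrongly declared significant.
expected_false_positives = n_tables * alpha

# Chance that at least one table is wrongly declared significant,
# assuming the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tables

print(f"Expected false positives: {expected_false_positives:.2f}")
print(f"Chance of at least one:   {p_at_least_one:.3f}")
```

So 20 x .05 gives the expected number of false alarms (one), while the chance of at least one false alarm is roughly 64%, high but not a certainty.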
Upon applying the chi-square analysis in the instant case, the result for store versus service satisfaction was still inconclusive. We therefore brought in another variable, contact with the employee, to see whether it had a bearing on the outcome. The output is shown below: for shoppers who had no contact with an employee, the Pearson chi-square result is not significant at the 5% level (p = .052), so the null hypothesis cannot be rejected, whereas for shoppers who did have contact it is significant (p = .012), so the null hypothesis is rejected.
Store * Service satisfaction * Contact with employee Crosstabulation
(The columns from Strongly Negative to Strongly Positive are the levels of Service satisfaction.)

| Contact with employee | Store | | Strongly Negative | Somewhat Negative | Neutral | Somewhat Positive | Strongly Positive | Total |
|---|---|---|---|---|---|---|---|---|
| No | Store 1 | Count | 16 | 9 | 18 | 17 | 19 | 79 |
| | | % within Store | 20.3% | 11.4% | 22.8% | 21.5% | 24.1% | 100.0% |
| | Store 2 | Count | 2 | 15 | 16 | 13 | 12 | 58 |
| | | % within Store | 3.4% | 25.9% | 27.6% | 22.4% | 20.7% | 100.0% |
| | Store 3 | Count | 9 | 14 | 23 | 22 | 14 | 82 |
| | | % within Store | 11.0% | 17.1% | 28.0% | 26.8% | 17.1% | 100.0% |
| | Store 4 | Count | 17 | 14 | 19 | 10 | 10 | 70 |
| | | % within Store | 24.3% | 20.0% | 27.1% | 14.3% | 14.3% | 100.0% |
| | Total | Count | 44 | 52 | 76 | 62 | 55 | 289 |
| | | % within Store | 15.2% | 18.0% | 26.3% | 21.5% | 19.0% | 100.0% |
| Yes | Store 1 | Count | 9 | 11 | 20 | 13 | 14 | 67 |
| | | % within Store | 13.4% | 16.4% | 29.9% | 19.4% | 20.9% | 100.0% |
| | Store 2 | Count | 24 | 15 | 18 | 14 | 7 | 78 |
| | | % within Store | 30.8% | 19.2% | 23.1% | 17.9% | 9.0% | 100.0% |
| | Store 3 | Count | 6 | 6 | 18 | 11 | 15 | 56 |
| | | % within Store | 10.7% | 10.7% | 32.1% | 19.6% | 26.8% | 100.0% |
| | Store 4 | Count | 10 | 21 | 25 | 12 | 24 | 92 |
| | | % within Store | 10.9% | 22.8% | 27.2% | 13.0% | 26.1% | 100.0% |
| | Total | Count | 49 | 53 | 81 | 50 | 60 | 293 |
| | | % within Store | 16.7% | 18.1% | 27.6% | 17.1% | 20.5% | 100.0% |
Chi-Square Tests

| Contact with employee | | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|---|
| No | Pearson Chi-Square | 20.898(a) | 12 | .052 |
| | Likelihood Ratio | 22.937 | 12 | .028 |
| | Linear-by-Linear Association | 3.514 | 1 | .061 |
| | N of Valid Cases | 289 | | |
| Yes | Pearson Chi-Square | 25.726(b) | 12 | .012 |
| | Likelihood Ratio | 25.777 | 12 | .012 |
| | Linear-by-Linear Association | 1.993 | 1 | .158 |
| | N of Valid Cases | 293 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 8.83.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.37.
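As a rough cross-check of the "Asymp. Sig." column, the sketch below turns the two Pearson chi-square values into p-values using only the Python standard library. It relies on a closed-form shortcut for the chi-square upper tail that holds only when the degrees of freedom are even (here df = 12); the function name `chi2_sf_even_df` is my own and this is not how SPSS itself is implemented.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square value x for an even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is valid only when df is a positive even number.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

# 4 stores and 5 satisfaction levels give (4 - 1) * (5 - 1) degrees of freedom.
df = (4 - 1) * (5 - 1)

p_no = chi2_sf_even_df(20.898, df)    # shoppers with no employee contact
p_yes = chi2_sf_even_df(25.726, df)   # shoppers with employee contact

print(df, round(p_no, 3), round(p_yes, 3))
```

The two p-values should land close to the .052 and .012 reported above, i.e. just above the 5% cut-off for the "No" group and below it for the "Yes" group.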
The second part of the day began with a study of correlation across different aspects of satisfaction (as illustrated below). However, before we look at the table, it would be appropriate to have some insight into correlation.
CORRELATION
Correlation refers to any of a broad class of statistical relationships involving dependence. Formally, dependence refers to any situation in which random variables do not satisfy a mathematical condition of probabilistic independence. Informally, correlation can refer to any departure of two or more random variables from independence, but technically it refers to any of several more specialized types of relationships between mean values. There are several correlation coefficients, often denoted ρ or r, measuring the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (which may exist even if one is a nonlinear function of the other).
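To make the "sensitive only to a linear relationship" point concrete, here is a small sketch that computes the Pearson coefficient by hand on made-up numbers. The data and the `pearson_r` helper are hypothetical and are not part of the class exercise.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]              # invented values
linear = [2 * v + 1 for v in x]    # exact straight-line relationship
curved = [v * v for v in x]        # fully determined by x, but not linear

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)
print(r_linear, r_curved)
```

The straight-line data give a coefficient of 1, while the symmetric curve gives 0 even though the second variable depends entirely on the first. A low Pearson value therefore only rules out a linear relationship, not dependence as such.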
Correlations

| | | Price satisfaction | Variety satisfaction | Organization satisfaction | Service satisfaction | Item quality satisfaction | Overall satisfaction |
|---|---|---|---|---|---|---|---|
| Price satisfaction | Pearson Correlation | 1 | .694(**) | .306(**) | .585(**) | .505(**) | .585(**) |
| | Sig. (2-tailed) | | .000 | .000 | .000 | .000 | .000 |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
| Variety satisfaction | Pearson Correlation | .694(**) | 1 | .182(**) | .604(**) | .529(**) | .572(**) |
| | Sig. (2-tailed) | .000 | | .000 | .000 | .000 | .000 |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
| Organization satisfaction | Pearson Correlation | .306(**) | .182(**) | 1 | .279(**) | .210(**) | .233(**) |
| | Sig. (2-tailed) | .000 | .000 | | .000 | .000 | .000 |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
| Service satisfaction | Pearson Correlation | .585(**) | .604(**) | .279(**) | 1 | .424(**) | .602(**) |
| | Sig. (2-tailed) | .000 | .000 | .000 | | .000 | .000 |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
| Item quality satisfaction | Pearson Correlation | .505(**) | .529(**) | .210(**) | .424(**) | 1 | .457(**) |
| | Sig. (2-tailed) | .000 | .000 | .000 | .000 | | .000 |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
| Overall satisfaction | Pearson Correlation | .585(**) | .572(**) | .233(**) | .602(**) | .457(**) | 1 |
| | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .000 | |
| | N | 582 | 582 | 582 | 582 | 582 | 582 |
Subsequently, we were taken through the actual process by which the chi-square statistic is derived in SPSS.
COMPUTATION OF THE CHI-SQUARE STATISTIC FOR CROSS-TABULATION TABLES
The
chi-square statistic is computed by first computing a chi-square value for each
individual cell of the table and then summing them up to form a total
Chi-square value for the table. The chi-square value for the cell is computed
as:
$$\frac{(\text{Observed Value} - \text{Expected Value})^2}{\text{Expected Value}}$$
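The cell-by-cell computation can be sketched in plain Python using the counts from the crosstabulation earlier in this post. One assumption is made explicit here because the post does not spell it out: the expected value of a cell is taken as (row total x column total) / grand total, which is the usual expected count under independence. The function name `chi_square` is my own.

```python
def chi_square(observed):
    """Return (statistic, degrees of freedom, minimum expected count)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Assumed expected count if store and satisfaction were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
            min_expected = min(min_expected, expected)

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof, min_expected

# Rows are Stores 1 to 4; columns run from Strongly Negative to Strongly Positive.
no_contact = [[16, 9, 18, 17, 19], [2, 15, 16, 13, 12],
              [9, 14, 23, 22, 14], [17, 14, 19, 10, 10]]
with_contact = [[9, 11, 20, 13, 14], [24, 15, 18, 14, 7],
                [6, 6, 18, 11, 15], [10, 21, 25, 12, 24]]

stat_no, dof_no, min_no = chi_square(no_contact)
stat_yes, dof_yes, min_yes = chi_square(with_contact)
print(round(stat_no, 3), dof_no, round(min_no, 2))
print(round(stat_yes, 3), dof_yes, round(min_yes, 2))
```

Summing the cell values this way should reproduce the Pearson chi-square figures and the minimum expected counts in the SPSS output above (20.898 and 8.83 for "No", 25.726 and 9.37 for "Yes", each with 12 degrees of freedom).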
This concluded the day, with something more to learn, reflect on and work upon in the future as a budding student manager.
Written by : Priyanka Doshi
Other members : Pragya Singh
Nilay Kohaley
Pawan Agarwal
Poulami Sarkar