Sunday, 1 September 2013

Correlation - Degree of Relatedness


CORRELATION

                Correlation is a measure of the degree of relatedness of variables. It can help a business researcher determine, for example, whether the stocks of two airlines rise and fall in any related manner. For a sample of pairs of data, correlation analysis can yield a numerical value that represents the degree of relatedness of the two stock prices over time.
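                Correlation is determined using the sample coefficient of correlation, r, which measures the linear correlation of two variables.
                The correlation between two variables can be computed using the Pearson Product-Moment Correlation Coefficient, which for n pairs of observations (x, y) is given by

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$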

Inferences of the value ‘r’ – The ‘r’ value ranges from -1 to +1. An r value of +1 denotes a perfect positive linear relationship between the two sets of variables. An r value of -1 denotes a perfect negative linear relationship, and an r value of 0 means that there is no linear relationship between the two sets of variables. Values in between indicate a weaker relationship in the direction of the sign.
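To make these reference values concrete, here is a minimal Python sketch of Pearson's r computed from the sums ∑x, ∑y, ∑xy, ∑x² and ∑y². The function name pearson_r and the toy data are illustrative assumptions and are not taken from the example that follows.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator  # ZeroDivisionError if either sample is constant

xs = [1, 2, 3, 4, 5]                      # toy data, for illustration only
print(pearson_r(xs, [2, 4, 6, 8, 10]))    # y rises exactly with x   -> 1.0
print(pearson_r(xs, [10, 8, 6, 4, 2]))    # y falls exactly with x   -> -1.0
print(pearson_r(xs, [2, 1, 3, 1, 2]))     # no linear trend          -> 0.0
```

Real data almost never lands exactly on these three values; the sign gives the direction of the relationship and the magnitude gives its strength.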
                Let us consider the example of cement production in India, which is expected to vary with the construction-sector GDP of the country over a period of years.

YEAR       CONST (x)    CEMENT (y)
2012-13      4302.76        251947
2011-12      4124.12        230490
2010-11      3906.93        196960
2009-10      3557.18        206630
2008-09      2425.75        186940
2007-08      2263.25        174310
2006-07      2055.44        161310
2005-06      1838.68        147808
2004-05      1582.12        131559
2003-04      1362.24        123440
2002-03      1216.5         116348
2001-02      1126.92        106900
2000-01      1083.62         99520
1999-00      1020.07        100450

            The correlation for the above data can be determined either manually or by using a software tool.

MANUAL METHOD:
            First compute the quantities needed for ‘r’: the sums ∑x, ∑y, ∑xy, ∑x² and ∑y² over the n = 14 pairs of observations, as follows.
YEAR       CONST (x)    CEMENT (y)            xy         x^2             y^2
2012-13      4302.76        251947    1084067474    18513744     63477290809
2011-12      4124.12        230490     950568419    17008366     53125640100
2010-11      3906.93        196960     769508933    15264102     38793241600
2009-10      3557.18        206630     735020103    12653530     42695956900
2008-09      2425.75        186940     453469705     5884263     34946563600
2007-08      2263.25        174310     394507108     5122301     30383976100
2006-07      2055.44        161310     331563026     4224834     26020916100
2005-06      1838.68        147808     271771613     3380744     21847204864
2004-05      1582.12        131559     208142125     2503104     17307770481
2003-04      1362.24        123440     168154906     1855698     15237433600
2002-03      1216.5         116348     141537342     1479872     13536857104
2001-02      1126.92        106900     120467748     1269949     11427610000
2000-01      1083.62         99520     107841862     1174232      9904230400
1999-00      1020.07        100450     102466032     1040543     10090202500
SUM         31865.58       2234612    5839086396    91375280    388794894158
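
The working table can also be rebuilt programmatically. The following Python sketch recomputes the xy, x^2 and y^2 columns and the SUM row from the fourteen (x, y) pairs of the example; the variable names are illustrative assumptions, and the printed figures may differ from the table in the last digit because the table rounds each row to whole numbers.

```python
# (year, x, y) rows copied from the example: x = CONST, y = CEMENT
rows = [
    ("2012-13", 4302.76, 251947), ("2011-12", 4124.12, 230490),
    ("2010-11", 3906.93, 196960), ("2009-10", 3557.18, 206630),
    ("2008-09", 2425.75, 186940), ("2007-08", 2263.25, 174310),
    ("2006-07", 2055.44, 161310), ("2005-06", 1838.68, 147808),
    ("2004-05", 1582.12, 131559), ("2003-04", 1362.24, 123440),
    ("2002-03", 1216.5, 116348), ("2001-02", 1126.92, 106900),
    ("2000-01", 1083.62, 99520), ("1999-00", 1020.07, 100450),
]

sum_x = sum_y = sum_xy = sum_x2 = sum_y2 = 0
for year, x, y in rows:
    xy, x2, y2 = x * y, x * x, y * y          # the three derived columns
    print(f"{year}  {x:>8.2f}  {y:>7}  {xy:>11.0f}  {x2:>9.0f}  {y2:>12}")
    sum_x += x
    sum_y += y
    sum_xy += xy
    sum_x2 += x2
    sum_y2 += y2

# SUM row: these five totals are the inputs to the correlation formula
print(f"SUM      {sum_x:>8.2f}  {sum_y:>7}  {sum_xy:>11.0f}  {sum_x2:>9.0f}  {sum_y2:>12}")
```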

            Substituting the above sums into the correlation formula, with n = 14, gives the coefficient.

$$ r = \frac{14(5839086396) - (31865.58)(2234612)}{\sqrt{\left[14(91375280) - (31865.58)^2\right]\left[14(388794894158) - (2234612)^2\right]}} $$

r = 0.967698462
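
The substitution can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the totals from the SUM row of the table; the variable names are assumptions made for illustration.

```python
from math import sqrt

# Totals taken from the SUM row of the working table
n = 14
sum_x, sum_y = 31865.58, 2234612
sum_xy, sum_x2, sum_y2 = 5839086396, 91375280, 388794894158

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 4))  # approximately 0.9677
```

Because the table totals are rounded, the result can differ from 0.967698462 in the last few decimal places.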



CORRELATION USING EXCEL
1.      Enter the data set containing two or more sets of variables in the worksheet.
2.      Click on the Data tab and select the Data Analysis option (this requires the Analysis ToolPak add-in to be enabled).
3.      Select Correlation as the choice and enter the Input Range of values in the Data Analysis dialog box.
4.      The correlation between the two variables is computed and the ‘r’ values are presented in a table; for this data set the value for the two variables is 0.967698462.





Inferences from the Example:
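·        The correlation between cement production and the GDP of the construction sector is positive, and the coefficient is determined to be 0.967698462.

·         This signifies a very strong linear association: years with higher construction GDP are consistently years with higher cement production. The coefficient does not state how large a change in cement production accompanies a given change in GDP. Squaring it (r² ≈ 0.936) indicates that about 94% of the variation in cement production is linearly associated with the variation in construction GDP.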

  Blogged By : Piyush (2013197)
Group No. 1 Members:
Neeraj Garg (2013166)
Pallavi Gupta (2013187)
Prerna Bansal (2013209)
Priya Jain (2013210)
