Saturday 20 July 2013

Session 7 and 8 - An introduction to Permap, Z-score and Bubble Graph



With interest building in the use of statistics as a management tool, the 7th and 8th sessions began with a brief introduction to a new application program named Permap.

Permap is a program that uses multidimensional scaling (MDS) to reduce multiple pairwise relationships to 2-D pictures, commonly called perceptual maps. The fundamental purpose of Permap is to uncover hidden structure that might be residing in a complex data set. A unique feature of Permap is that it embeds the mapping techniques in an interactive, graphical system that minimizes several difficulties associated with multidimensional scaling practices. It is particularly effective at exposing artefacts due to local minima, incomplete convergence, and the effects of outliers.
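
To make the idea of multidimensional scaling more concrete, here is a minimal sketch in Python. It is not Permap's own algorithm, which these notes do not describe; it uses one standard MDS technique, the SMACOF update (Guttman transform), which repeatedly moves the points so that the map distances get closer to the given dissimilarities. The dissimilarity values, the starting positions and all function names are assumptions made for illustration.

```python
import math

def pairwise_distances(points):
    """Euclidean distance between every pair of 2-D points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def stress(target, points):
    """Raw stress: sum of squared gaps between target and map distances."""
    current = pairwise_distances(points)
    n = len(points)
    return sum((target[i][j] - current[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def smacof_step(target, points):
    """One Guttman-transform update; it never increases the raw stress."""
    n = len(points)
    current = pairwise_distances(points)
    new_points = []
    for i in range(n):
        x = y = 0.0
        for j in range(n):
            if i == j or current[i][j] == 0.0:
                continue  # coincident points contribute nothing
            ratio = target[i][j] / current[i][j]
            x += ratio * (points[i][0] - points[j][0])
            y += ratio * (points[i][1] - points[j][1])
        new_points.append((x / n, y / n))
    return new_points

def mds(target, start, iterations=200):
    """Run a fixed number of SMACOF updates from a starting layout."""
    points = list(start)
    for _ in range(iterations):
        points = smacof_step(target, points)
    return points

# Hypothetical dissimilarities between four items A, B, C and D.
target = [
    [0.0, 1.0, 1.4, 1.0],
    [1.0, 0.0, 1.0, 1.4],
    [1.4, 1.0, 0.0, 1.0],
    [1.0, 1.4, 1.0, 0.0],
]
# Arbitrary starting positions (also an assumption).
start = [(0.0, 0.0), (1.0, 0.3), (0.2, 1.0), (-0.5, 0.4)]

layout = mds(target, start)
print("stress before:", round(stress(target, start), 4))
print("stress after: ", round(stress(target, layout), 4))
```

The stress value measures how badly the 2-D map distorts the original relationships. A different starting layout can end in a different map, which is the local-minimum problem mentioned above.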
                                                                         




Thereafter, we moved on to an Excel sheet containing data about the engine capacity, horse power and mileage of two-wheelers. The case required us to compare the observations and select the most suitable option. However, comparing them on three different parameters was difficult because each parameter is measured in its own units and spans a very different range. The three fields therefore had to be normalized before they could be compared. It was then that we were introduced to the concept of the Z-score.

A Z-SCORE (also known as a Z-value, standard score, or normal score) is a measure of the divergence of an individual experimental result from the most probable result, the mean. Z is expressed in terms of the number of standard deviations from the mean value. Because it has no units, it allows values measured on different scales to be compared directly.
                                                                
$$ z = \frac{X - \mu}{\sigma} $$

where $X$ is the experimental value, $\mu$ is the mean, and $\sigma$ is the standard deviation.
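
The formula can be sketched in Python as follows. The figures for the two-wheelers are made up for illustration and are not the data used in class; the function name `z_scores` and the column names are likewise assumptions. The sketch uses the population standard deviation, which may differ from the choice made in the Excel sheet.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (x - mean) / standard deviation for every value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(x - mu) / sigma for x in values]

# Made-up figures for five two-wheelers, for illustration only.
bikes = {
    "engine_cc": [100, 125, 150, 180, 220],
    "horse_power": [7.5, 9.0, 13.0, 17.0, 20.0],
    "mileage_kmpl": [80, 70, 60, 45, 40],
}

# Standardize each column separately so all three share one unit-free scale.
standardized = {name: z_scores(column) for name, column in bikes.items()}
for name, column in standardized.items():
    print(name, [round(z, 2) for z in column])
```

After this step every column has mean 0 and standard deviation 1, so a value of 1.2 means the same thing (1.2 standard deviations above average) whether it comes from engine capacity or mileage.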

When the sampling distribution of the test statistic (the mean, in most cases) is normal, converting to Z-scores transforms it into a standard normal distribution. However, when using Z-scores it is important to remember a few things:
  • Z-scores normalize the sampling distribution for meaningful comparison.
  • Z-scores require a large amount of data.
  • Z-scores require independent, random data.

Keeping these points in mind, we calculated the mean, variance and standard deviation for the given data set and derived the z-scores from them. We then plotted the original figures and the z-scores on graphs.

[Figure: Original and Z-score plotting for engine size]

[Figure: Original and Z-score plotting for horse power]

[Figure: Original and Z-score plotting for mileage]

One question remains: how should the z-scores be interpreted? They are read in the following manner.

  • A z-score less than 0 represents an element less than the mean.
  • A z-score greater than 0 represents an element greater than the mean.
  • A z-score equal to 0 represents an element equal to the mean.
  • A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.
  • A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.
  • If the number of elements in the set is large and the values are roughly normally distributed, about 68% of the elements have a z-score between -1 and 1; about 95% have a z-score between -2 and 2; and about 99.7% have a z-score between -3 and 3.
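
The percentages in the last point can be checked with a short Python sketch. It assumes the values follow a normal distribution and uses the standard library's `NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Fraction of a normal population with a z-score between -k and k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"z between -{k} and {k}: {share_within(k):.1%}")
# z between -1 and 1: 68.3%
# z between -2 and 2: 95.4%
# z between -3 and 3: 99.7%
```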

The second part of the session introduced us to the ‘Bubble Chart’, which is used when more than two variables need to be plotted and compared.

BUBBLE CHART

A bubble chart is a chart that displays three dimensions of data. It can facilitate the understanding of social, economic, medical and other scientific relationships. The term can also refer to a data-flow, data-structure or other diagram in which entities are depicted as circles or bubbles and relationships are represented by links drawn between the circles. Bubble charts are often used to present financial data. Use a Bubble chart when you want specific values to be represented more visually by different bubble sizes. Bubble charts are useful when your worksheet has any of the following types of data:

  • Three values per data point: Three values are required for each bubble. These values can be in rows or columns on the worksheet, but they must be in the following order: x value, y value, and then size value.
  • Negative values: Bubble sizes can represent negative values, although negative bubbles do not display in the chart by default. You can choose to display them by formatting that data series. When they are displayed, bubbles with negative values are colored white (which cannot be modified) and the size is based on their absolute value. Even though the size of negative bubbles is based on a positive value, their data labels will show the true negative value.
  • Multiple data series: Plotting multiple data series in a Bubble chart (multiple bubble series) is similar to plotting multiple data series in a Scatter chart (multiple scatter series). While Scatter charts use a single set of x values and multiple sets of y values, Bubble charts use a single set of x values and multiple sets of both y values and size values.

Create a Bubble chart

  1. Select the data you want to display in the Bubble chart.
     Note: It's best not to include row or column headings in the selection. If you select the headings with your data, the chart may produce incorrect results.
  2. On the Insert menu, click Chart.
  3. In the Chart type box, click Bubble.
  4. Under Chart sub-type, click the chart sub-type you want to use.
     For a quick preview of the chart you are creating, click Press and Hold to View Sample.
  5. Click Next, and continue with steps 2 through 4 of the Chart Wizard.

  • Smaller bubbles may be hidden by larger bubbles, making it seem that Excel has not drawn all of the data markers.
  • When an entire data series contains negative bubble sizes, the series is not displayed by default. If you want to see the negative bubbles, select the series you want in the Chart Objects list on the Chart toolbar, and then click Format Data Series on the same toolbar. On the Options tab, select the Show Negative Bubbles check box.

Formatting Bubble charts

There are several ways to change the format of a Bubble chart:

  • Display bubbles with a 3-D visual effect: By selecting the 3-D Bubble chart sub-type, bubbles are formatted with a 3-D visual appearance.
    Note: A 3-D Bubble chart is 3-D in appearance only; it actually remains a 2-D chart type. Unlike other 2-D chart types, however, this chart type cannot be used in a combination chart.

  • Adjust the size of bubbles: The size of the bubbles can represent the area of the bubbles or the width of the bubble, which affects the relative size of one bubble to another. For example, you can use the Area of bubbles option if you are charting rental costs of apartments for which you know the specific square footage (such as 1,700 square feet). The Width of bubbles option can be used for representations such as market share between products.

We can also scale the bubble size for a data series by specifying a percentage between 0 and 300; the larger the percentage, the larger the bubbles. The advantage of this chart type is that it lets us compare three variables at once: one on the x-axis, one on the y-axis, and the third represented by the size of the bubbles.
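
The difference between the Area of bubbles and Width of bubbles options can be illustrated with a small Python sketch. This is not Excel's internal calculation; it is one plausible reading in which the percentage simply multiplies the radius. The function `bubble_radii` and its parameters are hypothetical names.

```python
import math

def bubble_radii(sizes, max_radius=40.0, mode="area", scale_percent=100):
    """Turn size values into bubble radii.

    mode="area":  the bubble's area is proportional to the value.
    mode="width": the bubble's diameter is proportional to the value.
    """
    if not 0 <= scale_percent <= 300:
        raise ValueError("scale_percent must be between 0 and 300")
    largest = max(abs(s) for s in sizes)
    if largest == 0:
        return [0.0 for _ in sizes]
    radii = []
    for s in sizes:
        share = abs(s) / largest  # negative values are sized by absolute value
        if mode == "area":
            share = math.sqrt(share)  # area grows with the square of the radius
        radii.append(max_radius * share * scale_percent / 100)
    return radii

print(bubble_radii([25, 100], mode="area"))   # [20.0, 40.0]
print(bubble_radii([25, 100], mode="width"))  # [10.0, 40.0]
```

With the area option, a value four times larger gets a bubble twice as wide. With the width option it gets a bubble four times as wide, so large values look far more dominant.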

To gain a better understanding of the Bubble chart, we were given a data set containing the birth rate, death rate and number of people per square kilometre for different parts of the world, and plotted it as a Bubble chart.
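
Outside Excel, a chart of this kind can be sketched with plain Python that writes an SVG image. The rows below are placeholders and not the class data, and the function `bubble_chart_svg` is a hypothetical helper. Birth rate goes on the x-axis, death rate on the y-axis, and population density sets the bubble area.

```python
import math

def bubble_chart_svg(rows, width=400, height=300, max_radius=30.0, margin=40):
    """Return SVG markup with one circle per (label, x, y, size) row."""
    xs = [r[1] for r in rows]
    ys = [r[2] for r in rows]
    biggest = max(r[3] for r in rows)  # sizes are assumed to be positive

    def scale(value, low, high, out_low, out_high):
        """Map value from [low, high] onto [out_low, out_high]."""
        if high == low:
            return (out_low + out_high) / 2
        return out_low + (value - low) / (high - low) * (out_high - out_low)

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{width}" height="{height}">']
    # Draw the largest bubbles first so smaller ones are not hidden behind them.
    for label, x, y, size in sorted(rows, key=lambda r: r[3], reverse=True):
        cx = scale(x, min(xs), max(xs), margin, width - margin)
        cy = scale(y, min(ys), max(ys), height - margin, margin)  # SVG y grows downward
        radius = max_radius * math.sqrt(size / biggest)  # area proportional to size
        parts.append(
            f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="{radius:.1f}" '
            f'fill="steelblue" fill-opacity="0.5"><title>{label}</title></circle>'
        )
    parts.append("</svg>")
    return "\n".join(parts)

# Placeholder rows: (region, birth rate, death rate, people per square km).
regions = [
    ("Region A", 12.0, 9.0, 120.0),
    ("Region B", 25.0, 7.0, 45.0),
    ("Region C", 35.0, 11.0, 300.0),
]
svg = bubble_chart_svg(regions)
print(svg)
```

Saving the printed text to a file with the .svg extension and opening it in a browser shows the chart.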

 





This concluded the day and left us with some more meaningful ways of representing data to facilitate decision making.


Written by: Pragya Singh

Other Members: Priyanka Doshi
                           Nilay Kohale
                           Pawan Agarwal
                           Poulami Sarkar


