Using Transparency on Scatterplots to Display Point Density
When a scatterplot contains a high density of points, the data can be hard to read because some points are obscured by others. In many cases the density of the points is itself what needs to be understood, and a standard scatterplot cannot always show it. To address both problems, STATISTICA 10 lets you control point transparency in a scatterplot. This example illustrates creating a scatterplot with transparent points in STATISTICA.
The data I am using are one day’s worth of recorded observations from a power plant (more than 25,000 pairs of data). The variables are recorded so often that a scatterplot of the data is usually not useful: any pattern is lost in the high density of plot points. If we produce the default scatterplot for this data we get this:
Here there seems to be no obvious relationship between our X and Y variables. The problem is that we can’t see the density of the points at each spot, so visually a single outlier has as much impact as a spot that represents 100 observations. To display our data more accurately, let’s use the new transparency feature.
The transparency slider that controls the plot point transparency is located in the lower-right corner of the scatterplot. I changed the transparency of the points to about 80%.
Now our plot displays the data in a way that reveals more information. We see that the majority of the points (where the plot is the darkest) fall inside a “bowtie” shape bounded by ±250 on the Y axis. This pattern gives us a much deeper understanding of the relationship between X and Y. We could not see this very distinctive pattern until we applied point transparency in our scatterplot.
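To see why darker regions correspond to more observations, here is a minimal Python sketch of how overlapping transparent markers build up. It assumes standard "over" alpha compositing, in which each marker lets a fixed fraction of whatever lies beneath it show through. That is an assumption for illustration, not a description of how STATISTICA renders points internally, and the function name `stacked_opacity` is hypothetical.

```python
def stacked_opacity(n_points, transparency=0.80):
    """Combined opacity of n_points markers drawn on the same spot.

    Assumes standard "over" alpha compositing: each marker lets the
    fraction `transparency` of what is beneath it show through, so
    n stacked markers let transparency ** n of the background through.
    """
    if n_points < 0:
        raise ValueError("n_points must be non-negative")
    return 1.0 - transparency ** n_points


# Compare a lone point with increasingly crowded spots at 80% transparency.
for n in (1, 2, 5, 10, 100):
    print(f"{n:>3} overlapping points -> {stacked_opacity(n):.0%} opaque")
```

Under this assumption, a lone outlier at 80% transparency is only about 20% opaque, ten stacked points are about 89% opaque, and a spot holding 100 observations is effectively solid. That is consistent with the dense “bowtie” region standing out while isolated points fade.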
Point transparency is just one of many improvements in graphical displays in STATISTICA 10. We can see from this example that these improvements don’t just produce better-looking graphics; they also present new ways to use graphics to glean information from complicated data sets.
-Shannon L. Dick