How to Show Grouping in Scatterplots
A scatterplot shows the relationship between two continuous variables. Adding a grouping factor introduces another dimension that can greatly enhance the plot’s usefulness.
This article explores two ways of showing a grouping variable in a scatterplot. The difference between the two methods is the fit line. The first method fits a single line to all levels of the grouping factor and distinguishes the levels by point marker color and pattern. The second method fits a separate line for each group.
The data set used in this example, Irisdat.sta, contains measurements for various parts of the flower for three different varieties of iris. To open the data set, select the Home tab and in the File group, click the Open arrow. From the menu, select Open Examples to display the Open a STATISTICA Data File dialog box. Double-click the Datasets folder, and then open Irisdat.sta.
One Fit Line for All Groups
Select the Graphs tab. In the Common group, click Scatterplot to display the 2D Scatterplots Startup Panel. Click the Variables button to display the Select Variables for Scatterplots dialog box, and select SEPALLEN as X and SEPALWID as Y.
Click the OK button.
On the Advanced tab of the 2D Scatterplots Startup Panel, click the Mark Selected Subsets button. The Specify Multiple Subsets dialog box will be displayed. Create the three subsets for the grouping factor, IRISTYPE, as shown in the next image.
Click OK in the Specify Multiple Subsets dialog box, and click OK in the 2D Scatterplots Startup Panel.
The resulting graph is a scatterplot that contains one fit line for all points, but distinguishes points by the grouping variable IRISTYPE with colors and point markers.
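To make the idea concrete outside of the STATISTICA dialogs, the following is a minimal Python sketch of what "one fit line for all groups" means: the line is estimated from every point regardless of group, and the group label only controls how each point is drawn. The data rows, the group names A and B, the helper fit_line, and the marker styles are all hypothetical and chosen for illustration; they are not values from Irisdat.sta, and this is not how STATISTICA computes its fit internally.

```python
# Hypothetical (x, y, group) rows; made-up values, not taken from Irisdat.sta.
rows = [
    (1.0, 2.0, "A"), (2.0, 3.0, "A"), (3.0, 4.0, "A"),
    (1.0, 5.0, "B"), (2.0, 5.5, "B"), (3.0, 6.0, "B"),
]


def fit_line(points):
    """Ordinary least-squares line through (x, y) pairs; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# One fit over all points: the group label is ignored when estimating the line.
pooled_slope, pooled_intercept = fit_line([(x, y) for x, y, _ in rows])

# The group label only decides the color and marker used to draw each point.
marker_styles = {"A": ("blue", "circle"), "B": ("red", "square")}
styled_points = [(x, y) + marker_styles[group] for x, y, group in rows]

print(pooled_slope, pooled_intercept)
```

Because the pooled line ignores the grouping, it can differ noticeably from the trend within any single group, which is one reason to consider the separate-fit approach described next.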
Separate Fit Lines for Groups
Alternatively, it may be appropriate to use separate fit lines for the three groups. To do this, create a categorized graph.
Start a new 2D Scatterplots analysis, and select variables as before.
Now, in the 2D Scatterplots Startup Panel, select the Categorized tab. In the X-Categories group box, select the On check box. The options will become active. Click the Change Variable button to display the Select Categorization Variable dialog box. Select IRISTYPE.
Click OK to close this dialog box and return to the 2D Scatterplots Startup Panel.
The options Integer mode, Unique values, Categories, etc., give you flexibility with the grouping variable. A categorization variable does not have to be categorical in nature.
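As an illustration of how a variable that is not categorical can still define groups, here is a small Python sketch that divides a continuous variable into equal-width intervals and uses the interval index as the category. This is one generic binning approach shown under stated assumptions; the function bin_index and the sample values are hypothetical, and the sketch does not claim to reproduce the exact behavior of any of the STATISTICA options named above.

```python
def bin_index(value, low, high, n_bins):
    """Return the 0-based index of the equal-width interval that contains value."""
    width = (high - low) / n_bins
    index = int((value - low) / width)
    # Clamp so that the maximum value falls in the last interval.
    return min(max(index, 0), n_bins - 1)


# Hypothetical continuous measurements (made-up values).
values = [4.3, 5.0, 5.8, 6.4, 7.9]
low, high = min(values), max(values)

# Each observation gets a category label derived from its interval.
categories = [bin_index(v, low, high, 3) for v in values]
print(categories)
```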
In the Layout group box, select the Overlaid option button.
Click OK to create the graph. It shows a separate fit line for each of the three categories, and the groups are also distinguished by colors and point markers.
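The separate-fit approach can be sketched in Python as fitting one least-squares line per group instead of one line for all points. As before, the data rows, the group names A and B, and the helper fit_line are hypothetical and for illustration only; they are not values from Irisdat.sta and do not describe STATISTICA's internal computation.

```python
from collections import defaultdict

# Hypothetical (x, y, group) rows; made-up values, not taken from Irisdat.sta.
rows = [
    (1.0, 2.0, "A"), (2.0, 3.0, "A"), (3.0, 4.0, "A"),
    (1.0, 5.0, "B"), (2.0, 5.5, "B"), (3.0, 6.0, "B"),
]


def fit_line(points):
    """Ordinary least-squares line through (x, y) pairs; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Split the points by group label, then fit each group on its own.
points_by_group = defaultdict(list)
for x, y, group in rows:
    points_by_group[group].append((x, y))

group_fits = {group: fit_line(points) for group, points in points_by_group.items()}

for group, (slope, intercept) in sorted(group_fits.items()):
    print(group, slope, intercept)
```

With this made-up data, each group has its own slope and intercept, so the two lines differ from each other and from a single pooled line fitted to all six points.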
STATISTICA graphs offer extensive flexibility, which enables you to create the representation of the data that you need.