How to Show Grouping in Scatterplots
A scatterplot shows the relationship between two continuous variables. Showing a grouping factor in the plot adds another dimension and can greatly enhance the plot’s usefulness. This article explores two ways of showing a grouping variable in a scatterplot, which differ in how the fit line is drawn. The first method uses one fit line for all levels of the grouping factor but distinguishes the levels with point marker colors and patterns. The second method fits a separate line for each group.
The data set used in this example, Irisdat.sta, contains measurements for various parts of the flower for three different varieties of iris. To open the data set, select the Home tab and in the File group, click the Open arrow. From the menu, select Open Examples to display the Open a STATISTICA Data File dialog. Double-click the Datasets folder, and then open Irisdat.sta.
One Fit Line for All Groups
Select the Graphs tab. In the Common group, click Scatterplot to display the 2D Scatterplots Startup Panel. Click the Variables button to display the Select Variables for Scatterplots dialog, and select SEPALLEN as X and SEPALWID as Y.
Click the OK button.
On the Advanced tab of the 2D Scatterplots Startup Panel, click the Mark Selected Subsets button. The Specify Multiple Subsets dialog will be displayed. Create three subsets, one for each level of the grouping factor, IRISTYPE, as shown in the next image.
Click OK in the Specify Multiple Subsets dialog, and click OK in the 2D Scatterplots Startup Panel. The resulting graph will be a scatterplot that contains one fit line for all points but distinguishes points by the grouping variable, IRISTYPE, with colors and point markers.
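Outside STATISTICA, the idea behind this graph can be sketched in a few lines of Python. The sketch below assumes an ordinary least-squares straight line as the fit, and it uses made-up data with hypothetical group labels "A" and "B" rather than the Irisdat.sta values. The names `fit_line`, `rows`, and `markers` are illustrative only. The point is that the group label affects only how each point is drawn, while a single line is fitted to all points together.

```python
# Illustrative sketch only: made-up (x, y, group) rows, not the Irisdat.sta values.
rows = [
    (1.0, 2.0, "A"),
    (2.0, 4.0, "A"),
    (3.0, 6.0, "A"),
    (4.0, 6.0, "B"),
    (5.0, 5.0, "B"),
    (6.0, 4.0, "B"),
]


def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# One fit for all points: the group label is ignored when fitting.
pooled_slope, pooled_intercept = fit_line([(x, y) for x, y, _ in rows])

# The group label is used only to choose how each point is drawn.
markers = {"A": "circle", "B": "square"}  # hypothetical marker names
styled_points = [(x, y, markers[group]) for x, y, group in rows]
```

A plotting library would then draw `styled_points` with their markers and overlay the single line defined by `pooled_slope` and `pooled_intercept`.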
Separate Fit Lines for Groups
Alternatively, it may be appropriate to use separate fit lines for the three groups. To do this, create a categorized graph. Start a new 2D Scatterplots analysis, and select variables as before.
Now, in the 2D Scatterplots Startup Panel, select the Categorized tab. Select the On check box in the X-Categories group box. The options will become active. Click the Change Variable button to display the Select Categorization Variable dialog. Select IRISTYPE.
Click OK to close this dialog and return to the 2D Scatterplots Startup Panel.
Options such as Integer mode, Unique values, and Categories control how the levels of the grouping variable are defined. The categorization variable does not have to be categorical in nature.
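To illustrate how a variable that is not categorical can still define groups, the following Python sketch cuts a continuous variable into a fixed number of intervals. It assumes equal-width intervals and uses made-up values. The function name `equal_width_category` is hypothetical, and the sketch is not meant to reproduce how STATISTICA computes its categories.

```python
def equal_width_category(value, low, high, n_categories):
    """Return the 0-based index of the equal-width interval containing value."""
    width = (high - low) / n_categories
    index = int((value - low) / width)
    # Clamp so that the maximum value falls in the last interval.
    return min(max(index, 0), n_categories - 1)


values = [4.3, 5.0, 5.8, 6.4, 7.9]  # made-up continuous measurements
low, high = min(values), max(values)

# Assign each value to one of three categories based on its interval.
categories = [equal_width_category(v, low, high, 3) for v in values]
```

Each entry of `categories` can then play the role of a group label, exactly as a text label would.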
In the Layout group box, select the Overlaid option button.
Click OK to create the graph. Three separate fit lines are shown, one for each category, and the groups are also distinguished by colors and point markers.
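The same idea can be sketched in Python for readers who want to see what a separate fit per group amounts to. The sketch assumes an ordinary least-squares straight line for each group and uses made-up data with hypothetical group labels "A" and "B", not the Irisdat.sta values. The names `fit_line`, `rows`, and `fits` are illustrative only.

```python
# Illustrative sketch only: made-up (x, y, group) rows, not the Irisdat.sta values.
rows = [
    (1.0, 2.0, "A"),
    (2.0, 4.0, "A"),
    (3.0, 6.0, "A"),
    (4.0, 6.0, "B"),
    (5.0, 5.0, "B"),
    (6.0, 4.0, "B"),
]


def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Collect the (x, y) pairs belonging to each group.
by_group = {}
for x, y, group in rows:
    by_group.setdefault(group, []).append((x, y))

# Fit one line per group instead of one line for all points.
fits = {group: fit_line(points) for group, points in by_group.items()}
```

In this made-up data the two groups have slopes of opposite sign, which is the kind of difference that separate fit lines reveal and a single line fitted to all points cannot.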
STATISTICA graphs offer extensive flexibility, which enables you to create the representation of the data that you need.