How to Summarize Data in STATISTICA Similar to Pivot Tables
Click here to upload the data file so you’ll be able to work through the example.
To gain understanding of our data, it is helpful to summarize it. Pivot tables, as found in Microsoft Excel and other programs, are used to summarize data and highlight important information. These tables can help us to extract meaning from data. Common tasks for pivot tables are to count, sum, or average. This is typically performed for classes of a grouping factor. For example, we could find the total sales in dollars and average sales in dollars grouped by region. These sales figures could further be grouped by fiscal quarter. We can produce at-a-glance information from a large database with these summary tables.
In this example, we are interested in exploring a database of daily rain totals. The data come from the Australian Bureau of Meteorology. http://www.bom.gov.au/climate/data/.
To start out, we want to summarize the data with yearly rain totals. To do this, select the Statistics tab. In the Base group, click Basic Statistics to display the Basic Statistics and Tables Startup Panel. Select Breakdown; non-factorial tables.
Click OK to display the Statistics BreakDown (non-factorial) dialog box.
Click the Variables button. In the Select the dependent variables and grouping variables dialog box, select the continuous variable Rainfall amount (millimeters) in the Dependent variables list and Year in the Grouping variables list.
Click the OK button.
In the Statistics BreakDown (non-factorial) dialog box, click the Summary button to create the output. The result is a table with the yearly average rainfall, count per year and standard deviation.
Return to the Statistics BreakDown (non-factorial) dialog box, and select the Descriptives tab to view statistics that can be computed. The mean is computed by default, and other statistics can be added or removed.
Next, we want to find the average rainfall broken down by year and month. Clear the Standard Deviation and Valid N check boxes.
Now, select the Quick tab. Click the Variables button, and add Month to the Grouping Variables. Create the Summary output.
This output lists Year and Month in columns. Most Pivot tables would arrange the output such that one variable was listed across and one was listed down. With a simple data management step, this can be achieved with this output.
Notice at the bottom of the output is an entry for All groups. This row should be removed. In STATISTICA, select the Data tab. In the Cases group, click the Cases arrow, and select Delete to display the Delete Cases dialog box. Select the last case, case number 302.
Click OK to delete.
Now, on the Data tab, in the Transformations group, click Stack to display the Unstacking/Stacking dialog box. Click Variables to display the Select Unstacking Variables dialog box. Select Month in the Code (column) variables list, Rainfall amount (millimeter) in the Unstack (value) variables list, and Year in the Case ID (row) variables list.
Click the OK button.
Accept the default settings in the Unstacking/Stacking dialog box, and click OK to create the new table of output.
This output shows average rainfall amounts by year and month in a compact, easy to read table.