Monthly Archives: November 2012

Predicting Sales Using STATISTICA Uplift Modeling: Webinar/Meeting

Uplift modeling comes into play when predictive analytics is implemented within your organization’s marketing/sales strategy. So, how do you measure effectiveness? Do you really derive any value from your models? How do the models perform over time?

With appropriate organization of the deployment flow and the STATISTICA Enterprise platform, the answers to these questions can be delivered automatically and in a timely manner.

Join us as StatSoft Senior Data Mining Consultant, Dr. Vladimir Rastunkov, showcases STATISTICA’s very useful and flexible uplift modeling module, covering topics such as:

  • The value of predictive analytics in targeting the right customers
  • Building multiple analytic models in a single flow
  • Comparing different models; the concept of lift; profit charts
  • Maximizing ROI: from predictive analytics to uplift modeling
  • Reporting the uplift
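
As a rough illustration of the lift and uplift concepts listed above (and not of how STATISTICA computes them), the following Python sketch assumes a campaign with a treated group that received the offer and a control group that did not. The function names and the sample counts are hypothetical.

```python
def response_rate(responders, total):
    """Fraction of customers in a group who responded (e.g., purchased)."""
    if total <= 0:
        raise ValueError("group must contain at least one customer")
    return responders / total


def lift(segment_rate, baseline_rate):
    """Ratio of a targeted segment's response rate to the overall rate."""
    return segment_rate / baseline_rate


def uplift(treated_responders, treated_total, control_responders, control_total):
    """Incremental response rate: treated rate minus control rate."""
    treated = response_rate(treated_responders, treated_total)
    control = response_rate(control_responders, control_total)
    return treated - control


# Hypothetical counts: 120 of 1,000 treated and 80 of 1,000 control
# customers responded.
campaign_uplift = uplift(120, 1000, 80, 1000)
print(f"uplift: {campaign_uplift:.3f}")
```

A positive uplift suggests the campaign added responses beyond what would have happened anyway. A model with high lift can still show little uplift if it mostly targets customers who would have responded regardless.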

To register for the meeting, Click Here.


Knowledge Base – Statistica Graph Customization

Customization Overview

 

What types of graph customizations are available in STATISTICA?

Various graph customization facilities are available when a graph window is active (i.e., after a graph has been created). There are two major types of graph customizations:

  • Interactive graph customization. The customization options in STATISTICA graphics include hundreds of features and tools that can be used to adjust every detail of the display and associated data processing. Options are arranged in a hierarchical manner. Those used most often are accessible directly via shortcut menus by right-clicking on the respective element of the graph. The complete set of features for any given graph is available in the Graph Options dialog, accessible by double-clicking in the graph.
  • Permanent settings and automation options. The initial settings of graph features can be easily adjusted so that the default appearance and behavior of STATISTICA graphs will match your specific needs and/or will require very little intervention on your part. Four different ways to make these adjustments are via (1) the Options dialog, (2) Graph Styles, (3) User-defined graphs, and (4) STATISTICA Visual Basic.

What is the Options dialog?

Perhaps the most straightforward way to adjust the default appearance of graphs is by accessing the Graphs Display and Graphs Settings options panes of the Options dialog (accessible by selecting Options from the Tools menu).

  • The Graphs Display options pane contains options to change the default color and size of various graph objects (e.g., markers, lines, areas, surface colors):

[Figure: Graphs Display options pane]

  • The Graphs Settings options pane contains options to set defaults for the general appearance of the graph (e.g., fitted functions, axis proportions, document styles):

[Figure: Graphs Settings options pane]

The most commonly used settings can be adjusted here. The changes are reflected in the default styles used by the system and, as such, are saved automatically in the STATISTICA configuration file (so that, e.g., different settings can be used for different projects).

 

What are User-Defined Graphs?

New types of graphs can be defined in a variety of ways and can be added to the menus, dialogs, or toolbars. If a custom graph that you intend to use repeatedly is not built from scratch, but is based on one of the Graphs menu graphs and is produced by some combination of the existing graph customization options, you can save it as a user-defined graph. It can be added to the Graphs menu as a new type of graph by clicking the Add As User-defined Graph to Menu button on the Options 2 tab of the graph specification dialog. All user-defined graph specifications will be saved automatically in the STATISTICA configuration file (so that, e.g., different sets of custom graphs can be used for different projects). For further details, see the documentation for the Configurations options pane of the Options dialog.

 

How can I use STATISTICA Visual Basic to customize graphs?

There are no limits to how customized your STATISTICA custom graphs can be, because STATISTICA Visual Basic (with all its custom drawing tools and library of graphics procedures) can be used to produce virtually any type of graphics. The custom developed displays or multimedia output can be assigned to STATISTICA toolbars, menus, or dialogs and become a permanent part of “your” STATISTICA application.

 

Can I record STATISTICA Visual Basic macros for graphs?

Yes, macro recording enables you to access programmatically almost every aspect and virtually every detail of STATISTICA’s functionality. Even the most complex graphs can be recorded into STATISTICA Visual Basic (SVB) macro programs.

For example, suppose you create a line plot and want to change the line color and thickness. Open the Graph Options dialog, and then modify the line options in the Plot General options pane. Select the Record Macro check box at the bottom of the Graph Options dialog, and click the OK button to generate a macro containing the specific changes.

 

What is the Graph Options dialog?

The Graph Options dialog contains nodes that address all of the relevant customizable features for a particular graph. The nodes are grouped in clusters containing logically related items. The options in the Graph Options dialog are an all-inclusive “superset” of options accessed by double-clicking specific graph features.

Access the Graph Options dialog directly by selecting Graph Options from the graph Format menu, or by double-clicking the background (e.g., the area outside the axes) of a graph, or by selecting Graph Properties (Graph Options) from any graph shortcut menu.

[Figure: Graph Options dialog]

 

How do I add a new plot to an existing graph?

Click the Add new plot button in the Plot: General options pane of the Graph Options dialog (accessible from the Format menu). The New plot(s) dialog is displayed, in which you can specify the plot to be added. You can also add a new plot directly in the Graph Data Editor.

 

How can I adjust the margins of a graph?

STATISTICA has two types of graph margins:

  • Margins within the graph area. Use the Set Graph Area button on the Graph Tools toolbar to adjust the space between the edge of the plotting area (i.e., the borders of the graph window) and any graph components or custom graphic objects. After clicking the button, either (1) drag the resizing squares that appear around the edges of the graph window or (2) draw a rectangle in the place where you want the graph to be.
  • Printout margins. The printout margins (the distance between the edge of the paper and the beginning of the graph area) can be adjusted in the Print Preview dialog (accessed from the File menu).

How can I change the proportions of the Graph window?

Use the Document Size and Scaling dialog (accessible by clicking the Graph Actual Size/Scaling button on the toolbar) to change the current graph’s aspect ratio (i.e., the ratio of its vertical to horizontal dimensions). Note that the global default graph aspect ratio can be modified in the Graphs: Settings options pane of the Options dialog (accessible via the Tools menu).

 

How do I produce sequences of graphs from lists of variables?

Most of the graph specification dialogs accessible from the Graphs menu allow you to select lists of variables in instances where a single variable is sufficient to define a graph. When such a list of variables is specified, STATISTICA cycles through the list and produces one graph for each variable (e.g., a histogram or a line plot).

 

“Cascades” of graphs requested from output dialogs.

Most of the output (results) dialogs in those statistical procedures that process lists of variables allow you to generate “cascades” of graphs for each (or each combination) of the variables in the current list. For example, such graphs can be produced from descriptive statistics, correlations, frequencies, cross-tabulations, breakdowns, and other procedures.

 

How do I specify properties for point markers?

Controls for modifying point markers for the various plots are located on the General dialog, accessible by right-clicking on a point marker and selecting General Plot Options from the shortcut menu. You can also access these controls in the Plot: General options pane of the Graph Options dialog (accessible via the Format menu). Note that the size of point markers (and fonts) can be increased or decreased using the Increase Font or Decrease Font toolbar buttons, respectively.

 

How do I specify area properties?

The quickest way to modify area properties (e.g., patterns, colors) is to right-click on the area you want to modify, and select Pattern from the shortcut menu to display the Area Properties dialog. Use the options in this dialog to change the area color, pattern, and style in the graph. The default patterns, colors, and modes of display of consecutive plots and other components of the graphs are determined by the current selections in the Graphs: Display options pane of the Options dialog (accessible from the Tools menu). Note that the Area Properties dialog can also be accessed from the Plot: General and Plot: Bars options panes of the Graph Options dialog.

[Figure: Area Properties dialog]

After specifying a pattern in the Area pattern options, you can select a different color for the area Foreground and Background with these options.

Instead of specifying area patterns and color using the options described above, you can select the style you want to use for the area from the Area style box on the toolbar.

 

How do I specify line properties?

The quickest way to modify line properties (e.g., size, colors) is to right-click on the line you want to modify, and select Pattern from the shortcut menu to display the Line Properties dialog. Use the options in this dialog to change the line width, pattern, and color in the graph. The default patterns, colors, and modes of display of consecutive plots and other components of the graphs are determined by the current selections in the Graphs: Display options pane of the Options dialog (accessible from the Tools menu). Note that the Line Properties dialog can also be accessed by clicking the Line button in the Plot: General options pane of the Graph Options dialog.

[Figure: Line Properties dialog]

Instead of specifying line patterns and color using the options described above, you can select the style you want to use for the line from the Line style box on the toolbar.

 

Can I control the resolution of fit lines?

Yes. Use the Resolution option in the Fitting dialog (accessible by right-clicking on the fitted line and selecting Fitting from the shortcut menu). Once a fitted function has been determined, the fit is approximated with line segments along the x-axis. The Normal fit line is composed of 200 segments, and the number of segments increases in exponential fashion as you select Medium, High, Very High, or Perfect. Note that this option is only beneficial for fits with high curvature (i.e., a straight line fit will not be improved by this option, but a high-order polynomial fit will be). Selecting a higher number of points will result in a smoother appearance of the fitted function in the graph; however, it will also slightly slow down the graphing procedure.
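
To see why extra segments help only curved fits, the following Python sketch measures the largest gap between a function and its piecewise-linear approximation. It is an illustration only, not STATISTICA's drawing code: the function names, the example polynomial, and the doubling of the segment count (200, 400, 800) are assumptions, since the exact counts behind the higher resolution settings are not stated here.

```python
def max_interpolation_error(f, x_min, x_max, segments, probes_per_segment=10):
    """Largest gap between f and straight segments joining points on f."""
    width = (x_max - x_min) / segments
    worst = 0.0
    for i in range(segments):
        x0 = x_min + i * width
        y0, y1 = f(x0), f(x0 + width)
        # Probe interior points of the segment and compare with the chord.
        for j in range(1, probes_per_segment):
            t = j / probes_per_segment
            chord_value = y0 + t * (y1 - y0)
            worst = max(worst, abs(f(x0 + t * width) - chord_value))
    return worst


def line(x):
    """A straight-line fit: segments reproduce it exactly."""
    return 2.0 * x + 1.0


def poly(x):
    """A fifth-order polynomial fit with noticeable curvature."""
    return x**5 - 3 * x**3 + x


# Assumed segment counts; only the 200-segment Normal setting is documented.
for segments in (200, 400, 800):
    print(segments,
          max_interpolation_error(line, -2.0, 2.0, segments),
          max_interpolation_error(poly, -2.0, 2.0, segments))
```

For the straight line the error stays at rounding level regardless of the segment count, while for the polynomial it shrinks as segments are added.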

 

Styles

 

What are Graph Styles?

All of the numerous features that affect the appearance of the graph (from the color of the font in the footnote to the global features of the graph document) can be saved as individual “styles.” These styles can be given custom names and later be reapplied using simple shortcuts (such as pressing a specific key combination or clicking a button on a custom toolbar). An intelligent system internally manages these thousands of styles and their combinations in STATISTICA and helps you achieve your customization objectives with a minimum amount of effort. All user-defined or modified styles will be saved automatically in the STATISTICA configuration file (e.g., different sets or systems of styles can be used for different projects).

 

How can I create a Style from a custom graph title format?

First, create a graph with custom title text. For example, open the Adstudy.sta data file, and select Scatterplots from the Graphs menu to display the 2D Scatterplots creation dialog. Select variables, and type a title (e.g., Hotel Guest Survey) in the Custom title box on the Options 1 tab of the 2D Scatterplots dialog. Click the OK button to produce the graph.

Next, double-click the custom title at the top of the graph to display the Titles/Text dialog. Change the font type, size, color, etc. as desired. Note that the specified changes are applied immediately to the title displayed in the text box in the dialog.

Finally, click the Styles button in the Titles/Text dialog to display the Graphics Styles dialog, and either right-click in the Styles for Title box or click the ellipsis button. From the resulting menu, select Save As to display the Save As dialog, and enter a name for the style you have just created (e.g., surveytitle). Then click the Save button to store your custom title style for later use. Close the Graphics Styles dialog and click the OK button in the Titles/Text dialog to apply the formatting features to the custom title on the current graph.

 

How can I apply a saved Style to a graph title in a new graph?

To apply the formatting captured in a saved style, click the title to which you want to apply the style. The current style name (most likely Default Top Title) will appear in the Graphics Styles box on the left side of the Graph Tools toolbar. Click in the Graphics Styles box and select the desired style (e.g., surveytitle created in the example in the previous section). The highlighted title will instantly switch to the selected style.

 

Does altering the graph defaults on the Options dialog affect Graph Styles?

Only the default graph styles will be affected by changes made on the Graphs: Display and Graphs: Settings options panes of the Options dialog (accessible via the Tools menu). User-defined styles will not be affected by these changes. Default system graph styles change to mirror the current system graph settings (as specified in the Graphs: Display and Graphs: Settings options panes) without any need for you to update them manually. Conversely, user-defined graph styles will retain all of their internal settings, despite any changes made to the Graphs: Display and Graphs: Settings options panes.

 

If I save a graph with customized Styles, how will this graph appear on a colleague’s computer?

The graph will appear exactly as it did on your computer. Although every aspect of your graph’s customization is encapsulated within graph styles, all customizations applied to your graphs are fully portable. In fact, your colleague can then save the styles within your graph into his/her system by doing the following: click on the background of the graph, and then right-click on the Graphics Styles box on the toolbar. In the shortcut menu that is displayed, select the Save As command.

The Save As dialog is then displayed. Use this dialog to specify the name for the new graph style and click the OK button. In this manner, the graph style that was originally installed on your system has now been ported to your colleague’s system in a few simple steps, using a graph as a “carrier” (see the next topic).

 

Can I transfer a Graph Style from one system to another?

Yes, you can use the graph itself as a “carrier” of the graph style from one system to another.

Apply the desired graph style to an appropriate graph. Next, save the graph and then open it on the other system. Click the background of the graph, and then right-click on the Graphics Styles box on the toolbar. In the shortcut menu that is displayed, select the Save As command to display the Save As dialog. Specify the name for the new graph style and click the OK button. Now the graph style is part of the other system’s graphics library. This is a quick and recommended means of porting a graph style.

 

What do the letter icons represent in the Graph Styles Manager?

Graph styles address the properties of graph objects (e.g., size, color, thickness, and pattern of lines; size, shape, and color of point markers; colors and patterns used for definition of areas; size, color, and fonts for labels, titles, and scales), which are applicable to graphs.

Letter icons represent these properties of objects specified for the graph.

  • Attributes and properties. There are two general classes of items for which a style can be created: attributes and properties. An attribute (designated with an A) is an object that affects the simple appearance of the graph, such as colors, line patterns, font sizes, font names, etc. A property (designated with a P) of a graph is an aspect of the graph that is not directly visible, such as what kind of plot to make, or what scale type to use.
  • Style collections. There are two additional classes of styles, collections of attributes (AA) and collections of properties and/or attributes (S). All the elements of a collection of attributes are simple attributes, A. All the elements of a collection of properties and/or attributes are either properties (P) or collections of attributes (AA).
  • User-defined styles. Any type of style that you create and save into your system will be denoted with a pencil on the icon, whether it is an attribute or property.
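
The four style classes and their containment rules can be pictured as a small type hierarchy. The Python sketch below is only a way to visualize the rules described above; the class names, fields, and sample values are hypothetical and do not reflect how STATISTICA stores styles internally.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Attribute:
    """'A': affects the visible appearance (color, line pattern, font size)."""
    name: str
    value: object
    code = "A"


@dataclass
class Property:
    """'P': an aspect that is not directly visible (plot kind, scale type)."""
    name: str
    value: object
    code = "P"


@dataclass
class AttributeCollection:
    """'AA': every element must be a simple attribute (A)."""
    name: str
    items: List[Attribute] = field(default_factory=list)
    user_defined: bool = False  # user-defined styles show a pencil on the icon
    code = "AA"

    def __post_init__(self):
        if not all(isinstance(item, Attribute) for item in self.items):
            raise TypeError("an AA collection may contain only attributes (A)")


@dataclass
class StyleCollection:
    """'S': every element is a property (P) or an attribute collection (AA)."""
    name: str
    items: List[Union[Property, AttributeCollection]] = field(default_factory=list)
    user_defined: bool = False
    code = "S"

    def __post_init__(self):
        allowed = (Property, AttributeCollection)
        if not all(isinstance(item, allowed) for item in self.items):
            raise TypeError("an S collection may contain only P or AA items")


# Hypothetical example: a saved title style bundled into a larger style.
title_font = AttributeCollection(
    "surveytitle",
    [Attribute("font name", "Arial"), Attribute("font size", 14)],
    user_defined=True,
)
graph_style = StyleCollection("survey graph", [Property("scale type", "linear"), title_font])
print(title_font.code, graph_style.code)
```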

Scales

 

How do I customize the layout and format of an axis?

Right-click on the respective axis and select Scaling from the shortcut menu to display the Scaling dialog, which contains customization facilities for all features of the current axis.

[Figure: Scaling dialog]

Note that the applicable features of the axis can be copied to other axes by clicking the Copy axis specs to button in the Scaling dialog. The Copy axis specs to dialog will be displayed, where you can copy the features to either the corresponding (i.e., the opposite) axis or all other axes. The main scaling features of the axes can also be adjusted here, as well as in the Graph Options dialog.

 

How do I replace numeric scale values with text labels?

In the Axis: Scale Values options pane of the Graph Options dialog, select the Use text labels from data set check box. Note that if the variable plotted on this particular axis does not have text labels, you can create custom labels instead in the Axis: Custom Units options pane of this dialog. Here you can create custom labels using an editable custom labels spreadsheet in which you can enter the appropriate numeric values (determining where the text labels are to be placed on the axis) and the corresponding text value labels.

[Figure: Axis: Custom Units options pane of the Graph Options dialog, showing the custom labels spreadsheet]

For example, if the values were entered as in the dialog shown above, then the label Low would be placed in the location of 1 on the axis, label Medium in the location of 2, etc.

 

Can I insert a scale break?

Yes. You can place one or more true scale “breaks” in a graph axis in order to “cut out” (i.e., “compress”) certain areas of the graph space:

[Figure: graph with a scale break]

To do this, right-click on the axis in which you want the break to appear and select Scaling from the shortcut menu. In the resulting Scaling dialog, click the More button and add a new break by clicking the Add new scale break button.

[Figure: Scaling dialog with scale break options]

Use the From and To boxes to adjust the break location. STATISTICA will place the break in the specified location of the scale after you click the OK button in the Scaling dialog. These options are also available in the Axis: Scaling options pane of the Graph Options dialog. Note that you can add more than one scale break to an axis.

 

Can I adjust the number of minor units?

By default, STATISTICA will adjust the number of minor units to the current step size value. However, you can adjust the number of minor units (as well as the default style and size of minor tickmarks) for each of the axes in the Axis: Minor Units options pane of the Graph Options dialog (accessible from the Format menu). You can specify the number of tickmarks to use or have STATISTICA select the optimum number of minor tickmarks. The length and orientation of the tickmarks are also specified here. The thickness and color of the tickmarks are set in the Axis: General options pane.

 

Can I specify custom locations for tickmarks?

You can specify custom units (including their text labels, display format, and size of tickmarks) in the Axis: Custom Units options pane of the Graph Options dialog.

 

What is the difference between manual and auto scaling?

When the Mode option for axis scaling is set to Manual, the minimum, maximum, and step size for the axis are determined by the current values of the Minimum, Maximum, and Step size options (in the Range group of the Axis: Scaling options pane of the Graph Options dialog). If the Mode is set to Auto (i.e., automatic), STATISTICA will automatically determine the scaling based on the range of values to be plotted.
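
The difference between the two modes can be sketched in Python as follows. The automatic rule used here (rounding the step to 1, 2, 5, or 10 times a power of ten and widening the range to whole steps) is a common heuristic chosen for illustration; it is an assumption, not STATISTICA's documented algorithm, and the function names are hypothetical.

```python
import math


def nice_step(span, target_ticks=5):
    """Round span/target_ticks up to 1, 2, 5, or 10 times a power of ten."""
    raw = span / target_ticks
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiple in (1, 2, 5, 10):
        if raw <= multiple * magnitude:
            return multiple * magnitude
    return 10 * magnitude


def axis_scale(values, mode="auto", minimum=None, maximum=None, step=None):
    """Return (minimum, maximum, step) for an axis."""
    if mode == "manual":
        # Manual: use exactly what the user entered.
        if minimum is None or maximum is None or step is None:
            raise ValueError("manual mode needs minimum, maximum, and step")
        return minimum, maximum, step
    # Auto: derive the scale from the range of the plotted values.
    low, high = min(values), max(values)
    if low == high:
        high = low + 1  # avoid a zero-width axis
    step = nice_step(high - low)
    return math.floor(low / step) * step, math.ceil(high / step) * step, step


print(axis_scale([3.2, 47.8]))                   # auto
print(axis_scale([3.2, 47.8], "manual", 0, 60, 15))  # manual
```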

 

Titles, Legends, and Custom Text

Is all graph text editable?

Yes. There are two different types of text in graphs. The first is normal editable text that you can change in the Titles/Text dialog (by double-clicking on a title). The second is text that is automatically created and updated by STATISTICA (e.g., graph legends, functions, statistics). This second type of text (and/or symbols) consists of separate “active objects” (e.g., the point marker symbol in a legend) that are automatically updated by STATISTICA. You can always insert new text in between active objects. Note that you can also selectively disconnect any active object from auto-updating, and therefore be able to edit it (but, of course, lose the auto-updating feature), by right-clicking on it and selecting Disconnect Object(s) from Graph from the resulting shortcut menu.

 

Can I customize the location and format of the legend?

By default, when a graph is created, the legend is fixed (unmovable), which means that its position is automatically determined and the graph is moved to the left in the window to leave space for the legend. If you want to edit the text or reposition the legend in the graph, right-click on the legend and select Title Properties from the shortcut menu.

 

What other types of legends are automatically created in graphs?

In addition to the standard legend (which identifies patterns and colors used to mark individual plots in the graph), there are also other more specialized types of fixed legends.

  • Contour legends identify the levels in surface or contour plots.
  • Icon legends identify the assignment of icon features to specific variables.
  • Selection legends identify the case selection conditions used to classify cases into multiple subsets shown on the graph.

How can I add a title to a graph?

In every graph, there are five standard graph title positions: Title and Subtitle (both at the top of graph), Left, Right, and Footnote.

[Figure: histogram showing the standard title positions]

They can be edited in the Titles/Text dialog, accessible by double-clicking on a specific title. For example, the following dialog [accessed by double-clicking on the line TITLE: Histogram (Hurrdata.sta 7v*209c)] shows the top title from the graph displayed above.

[Figure: Titles/Text dialog showing the top title of the histogram]

The titles can also be edited in the Graph: Titles/Text options pane of the Graph Options dialog.

 

Can I enter a symbol into a graph title?

Yes, you can easily enter symbols and special characters into a graph title. First, double-click on the title of the graph to display the Titles/Text dialog. Change the font to Symbol, and position the cursor at the point where you want the special character(s) to be inserted. You can then enter the symbol(s) into the title.

Another way to retrieve these characters is to use the Character Map program that comes installed with Microsoft Windows. This application enables you to copy these characters to the Clipboard and then paste them into your title.

 

Can I convert the standard titles into movable text?

You can convert a standard title into a movable (floating) title via the Graph: Titles/Text options pane of the Graph Options dialog. In the Advanced options group, change the Status to Floating.

 

How do I place a graph title or a footnote in a fixed position?

You can convert the standard title or any custom text into one of the standard graph text positions (e.g., a Footnote). You can also convert it into movable text (see above) and then fix it in the desired location. After the floating title is created, click the Text object properties button in the Graph: Titles/Text options pane to display the Text Object Properties dialog. Here, you can position the title or footnote in the desired location.

[Figure: Text Object Properties dialog]

If you intend for the text to stay in a particular place in the graph area regardless of future changes to the graph scales or graph location (within the graph area), clear the Dynamic check box in the Coordinates (left-upper) group box. This will keep the text at absolute window coordinates regardless of changes to the graph (e.g., at 5% of the window width and height from the upper-left corner). See also the previous topic.

 

How do I rotate text?

You can select the orientation (Horizontal, Vertical, Reversed horizontal, or Reversed vertical) of floating text objects (custom text and movable legends) in the graph in the Orientation group box of the Text Object Properties dialog (accessible from the Graph: Titles/Text options pane of the Graph Options dialog). You can also rotate the text by specifying the rotation angle (from 0 to 359° or 0 to -359°) in the Angle box.

[Figure: Text Object Properties dialog, Angle box]

Alternatively, you can interactively rotate the text by selecting it in the graph and then dragging one of the handles (small black squares) in the desired direction, or by pressing the PAGE DOWN and PAGE UP keys to rotate text objects selected in the graph clockwise or counterclockwise, respectively, in 5° increments. To rotate in 1° increments, hold down the CTRL key while pressing PAGE DOWN or PAGE UP. The rotation of text objects takes place around the object’s anchor point.

[Figure: rotated text]

[Figure: rotated text with its anchor point]

The position of the anchor point can be adjusted in the Text Object Properties dialog.
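
The keyboard increments and the rotation about the anchor point can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the sign convention (positive angles are counterclockwise, with the y-axis pointing up) is an assumption rather than something stated in the STATISTICA documentation.

```python
import math


def step_angle(angle, key, ctrl=False):
    """Angle after one key press: 5 degrees, or 1 degree with CTRL held."""
    step = 1 if ctrl else 5
    if key == "PAGE_UP":        # counterclockwise
        angle += step
    elif key == "PAGE_DOWN":    # clockwise
        angle -= step
    else:
        raise ValueError("key must be 'PAGE_UP' or 'PAGE_DOWN'")
    # Keep the result strictly between -360 and 360, like the Angle box range.
    return math.fmod(angle, 360)


def rotate_about_anchor(point, anchor, angle_degrees):
    """Rotate a point of the text box around the anchor point."""
    theta = math.radians(angle_degrees)
    dx, dy = point[0] - anchor[0], point[1] - anchor[1]
    return (
        anchor[0] + dx * math.cos(theta) - dy * math.sin(theta),
        anchor[1] + dx * math.sin(theta) + dy * math.cos(theta),
    )


angle = step_angle(0, "PAGE_UP")             # one press: 5 degrees
angle = step_angle(angle, "PAGE_UP", True)   # CTRL + press: 1 more degree
print(angle, rotate_about_anchor((2.0, 1.0), (1.0, 1.0), angle))
```

Because every point is rotated relative to the anchor, moving the anchor changes where the text ends up after the same rotation.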

STATISTICA Enterprise in the Mining Industry

  • Executive Summary
  • Leveraging STATISTICA Advanced Data Analysis and Data Mining Technology for Advanced Process Monitoring
  • Some Technical and Application Details
  • Typical Use Cases
  • Multivariate Process Monitoring
  • Predictive Data Mining and Process Optimization
  • Application/Deployment of Data Mining Solutions and Brief Demonstration
1. With commodity prices skyrocketing, precious metal mining companies are financially very successful
2. Converting ore (e.g., 1 metric ton) into precious metal (e.g., 6 grams of rare/precious minerals or metal) is a very complex process
3. Standard control charting methods are only a small part of the specific analyses and charts that go into successful monitoring of the entire process
4. When yield deteriorates (less precious metal is extracted from the ore), the cost is enormous
5. There are usually very few resources (Process Control Engineers) with the skills necessary to troubleshoot faulty processes effectively
6. By creating a system for advanced multivariate process monitoring, fewer engineers can monitor more processes more effectively
Types of Analyses
  • Commonality Analysis: Find common patterns of parameters that are associated with important process outcomes (e.g., find common combinations of parameter settings that minimize emissions)
  • Multivariate Process Monitoring: Use a data mining model to build multivariate process monitoring applications, to simultaneously track and monitor hundreds or thousands of parameters; detect process shifts and drifts early
  • Predictive Data Mining and Process Optimization: Build predictive models of important process outcomes
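
To make the contrast between univariate and multivariate monitoring concrete, the sketch below computes Hotelling's T² statistic, one common basis for multivariate control charts, for two process parameters in plain Python. It is offered only as an illustration under stated assumptions: the parameter readings are made up, the helper names are hypothetical, and nothing here is claimed to be the method StatSoft deploys.

```python
def _mean(values):
    return sum(values) / len(values)


def hotelling_t2(baseline, point):
    """T^2 distance of a 2-parameter reading from in-control baseline data."""
    xs = [row[0] for row in baseline]
    ys = [row[1] for row in baseline]
    mx, my = _mean(xs), _mean(ys)
    n = len(baseline)
    # Sample variances and covariance of the two parameters.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in baseline) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d' S^-1 d, using the closed-form inverse of a 2x2 covariance matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Made-up in-control readings of two strongly correlated parameters.
baseline = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]

# Both readings lie within each parameter's individual range, but the
# second one breaks the usual relationship between the two parameters.
print(hotelling_t2(baseline, (5, 5)))
print(hotelling_t2(baseline, (5, 1)))
```

The second reading would pass two separate univariate charts, yet its T² value is far larger, which is the kind of shift that multivariate monitoring is meant to catch early.
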
Key Steps
  • First understand the domain and processes as best as you can
  • Next understand the data, what is and is not “actionable”
  • Identify key performance indicators (KPIs), and identify the important predictors of KPIs through root-cause analysis
  • Track those predictors using multivariate control charting methods

  • Main value proposition: StatSoft has unique expertise, capabilities, and know-how for creating advanced process monitoring solutions for industries where there are no standard solutions, workflows, etc., and where simple univariate analyses will not solve problems

 

Summary:
STATISTICA Focus on: Refining of Precious Metals

1. StatSoft’s implementation experts can design and create an efficient dedicated data warehouse, with automated roll-ups and alignment of data from different sources (e.g., Assay, Milling, etc.).
2. Using STATISTICA algorithms for automated root cause analysis, StatSoft’s applied statisticians and consultants can identify effective analytic workflows that will locate problems in a fraction of the time, as compared to traditional troubleshooting techniques.
3. The enterprise-wide deployment of automated customized analyses, summary reports (on process KPIs), and effective ad-hoc analytic tools allows process control engineers to monitor more processes (sites) more effectively, and to address problems before they affect the bottom line.

STATISTICA Focus on: Refining of Precious Metals

  • Fast root cause analysis
  • Biplots and bag plots for root cause analysis
  • Soft-sensing (model-based SPC)

Applying advanced predictive analytics and data mining methods to multivariate process monitoring.

 

STATISTICA Web Analytics – Advantages

Advantages

Functionality and Applications

A powerful enterprise-wide collaborative intelligence system. STATISTICA Enterprise Server can act as the core of an enterprise-wide network system that enables the participants to work collaboratively and quickly share results (reports), as well as scripts of analyses or queries. User or group permissions can be used by the administrators to manage access of specific groups of users to specific data or reports. The accessibility of its tools makes STATISTICA Enterprise Server a perfect system to facilitate collaborative projects of employees who are telecommuting or traveling.

Advantages of distributed processing and multi-tier Client-Server architecture. Users benefit not only from the collaborative work tools, but also from the option to offload computationally intensive or time-consuming tasks to the server computers. Because the most powerful multiprocessor CPUs (and/or clusters of computers) are usually used as servers, users can, for example, run queries “in the background” that scan terabytes of data on remote servers and perform long, time-consuming sequences of analyses or reports, while keeping the end users’ computers completely free to do other tasks.

Because of its distributed processing architecture, STATISTICA Enterprise Server scales in a highly efficient manner to take advantage of multi-processor CPUs and/or multiple computers and, therefore, users can take full advantage of multi-tier Client-Server architecture, where:

  • Tier 1 is the user interface on the client computer (a plain browser or STATISTICA thick client, see STATISTICA Client, below),
  • Tier 2 is the STATISTICA Enterprise Server software and the implementation of the “business intelligence” that it may contain (specific queries, scripts of custom/proprietary analyses, etc.), and
  • Tier 3 is STATISTICA databases (e.g., STATISTICA Data Warehouse) or other corporate repositories of data.

In the desktop version of STATISTICA, all computations are performed on the local computer, and resources of other computers are used only in the case when the In-Place Database Processing (IDP) interface to external databases is established. IDP is a technology that reads data asynchronously directly from remote database servers (using distributed processing if supported by the server), and bypasses the need to “import” data and create a local copy of the data set. Records of data are retrieved and sent to the STATISTICA computer asynchronously by the CPU of the database server, while STATISTICA simultaneously processes them using the CPU of the local computer.
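
The overlap described above, where one side retrieves records while the other processes them, can be illustrated with a generic producer-consumer sketch. This is not the IDP interface itself: the function names are hypothetical, and an in-memory range stands in for the remote database server.

```python
import queue
import threading

def fetch_records(source, out_queue):
    """Producer: stands in for the database server sending records."""
    # Assumption: `source` is any iterable; a real setup would read from a database.
    for record in source:
        out_queue.put(record)
    out_queue.put(None)  # sentinel: no more records

def running_mean(in_queue):
    """Consumer: processes each record as it arrives, without keeping a local copy."""
    count, total = 0, 0.0
    while True:
        record = in_queue.get()
        if record is None:
            break
        count += 1
        total += record
    return total / count if count else float("nan")

def process_in_place(source, buffer_size=100):
    """Run retrieval and processing concurrently through a bounded buffer."""
    q = queue.Queue(maxsize=buffer_size)
    producer = threading.Thread(target=fetch_records, args=(source, q))
    producer.start()
    result = running_mean(q)
    producer.join()
    return result

mean_value = process_in_place(range(1, 1001))
```

Because the buffer is bounded, only a small window of records exists on the processing side at any time, which is the point of avoiding an imported local copy.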

When a Client-Server version of STATISTICA is used, the local computer drives only the user interface of STATISTICA, and all calculations are performed on the server. The Client-Server architecture offers obvious advantages when your projects are large (e.g., computationally intensive or involving processing of extremely large data sets) and, thus, when they can be offloaded to the servers, freeing your local computer to perform other jobs.

STATISTICA Client. While no components of the STATISTICA system are necessary on the client computer (only a browser), having a copy of STATISTICA installed on the client side adds new possibilities. One could ask, “Why would I want to use STATISTICA Enterprise Server if I have a copy of STATISTICA installed on my laptop?” The answer is that having STATISTICA installed on the client computer enables you to take additional advantage of the multi-tier Client-Server architecture (see above): you can work interactively with STATISTICA installed locally while offloading certain time-consuming tasks to the server machine(s) and/or exchanging data and output among all three tiers. You can run STATISTICA Enterprise Server from within desktop STATISTICA and flexibly control the interaction between the two. A variety of options are available to share tasks between the desktop and server computer.

Also, when you review your STATISTICA Enterprise Server output in the browser, you have options to bring any or all output objects to your desktop computer for further processing. For example, a click on a small button placed optionally (depending on the user configuration) next to every output object (table or graph) sent to your browser by the STATISTICA Enterprise Server system will offer you the option to download that object (a STATISTICA table or a graph) to the client computer in its native STATISTICA format (in .sta or .stg file format) so you can work with it offline using the locally installed STATISTICA tools.

Advantages of Multithreading Technology

The STATISTICA Enterprise Server platform is built on advanced distributed processing and multithreading technology to support optimal management of large computational loads. This technology enables rapid processing of even very large and computationally intensive projects, taking full advantage of the multiple CPUs on the server, or even multiple servers working in parallel.

In addition, the STATISTICA Enterprise Server architecture delivers a platform-independent, Web browser-based user interface, and provides an ultimate, large enterprise-level ability to manage projects or groups of users.

Ultimate scalability (parallel processing technology). One of the unique features of the STATISTICA distributed processing technology is that it flexibly scales not only to take advantage of all CPUs on the current server computer (to support both multiple jobs/users and also individual, computationally intensive projects), but it also scales to multiple server computers (clusters). This unique feature is important, since it delivers significant performance gains. STATISTICA uses the parallel processing technology across separate hardware units (as some super-computers do) and, therefore, if you have, for example, three servers with four processors each, STATISTICA can run an individual project on all 12 processors (if the scale of that project warrants that mode of processing).

STATISTICA Enterprise Server supports multiprocessor environments and works with load balanced environments, making STATISTICA Enterprise Server suitable for internal cloud computing environments.

Statistica Knowledge Base – All about Graphics (Graphs)

Graphics Introduction

What are the different ways in which I can create graphs in STATISTICA?

Graphs from the Graphs menu contain the most flexible graphing capabilities available in STATISTICA, offering literally thousands of different combinations of options to create the precise graphics that lead to accurate interpretation of data. These commands are also available from the STATISTICA Start button menu (the button in the lower-left corner of the STATISTICA window).

Following are the general categories of STATISTICA graphs available from the Graphs menu:

  • Menu graphs use data from the current input spreadsheet, taking into account the current case selection and weighting conditions.

  • User-defined graphs are templates of previously saved menu graphs. To create these, click the Add As User-defined Graph to Menu button on the Options 2 tab of any graph creation dialog.
  • Block data graphs use the currently selected (continuous) block of data in the active spreadsheet to specify input data for the graph.

  • Input data graphs process data directly from the current input spreadsheet and take their cues as to which variables to use from the current cursor position.

Other specialized graphs related to specific analyses (e.g., ANOVA plots of means, Nonlinear Estimation plots of fitted functions, Cluster Analysis tree diagrams) are accessible directly from analysis results dialogs.

Note that all STATISTICA Graph types offer the same customization options. Also, any type of graph can be created with STATISTICA Visual Basic.

Are there different customization options for each type of graph?

No. Once a graph is displayed on the screen, regardless of how it was requested or defined, all graph customization options available in STATISTICA can be used to customize it. The customization options available for all graphs include appending new plots to existing graphs and linking and embedding graphs, as well as drawing, fitting, and graph restructuring options. Also, all these options can be used to customize graphs that were saved and later opened for additional editing.

What happens to graphs when the data file changes?

All Graphs menu graphs can maintain automatic links to the data from which they were created as long as the graph specification dialog is active. Options for auto-updating graphs are available on the Options 1 tab of all graph specification dialogs. Note that if you want a graph to be dynamically updated when the data file changes, it must be placed in a stand-alone window (instead of in a workbook or report).

In what formats can I save STATISTICA Graphs?

STATISTICA Graphs can be saved in the following formats:

  • STATISTICA Graph files (*.stg)
  • Bitmap files (*.bmp)
  • JPEG files (*.jpeg, *.jpg)
  • Portable Network Graphics files (*.png)
  • Windows Metafiles (*.wmf)
  • Enhanced Metafiles (*.emf)
  • PDF files (*.pdf)
  • GIF files (*.gif)
  • TIFF files (*.tif)

How do I export a STATISTICA Graph to another application?

Export via Copy and Paste operations (e.g., the Clipboard). The quickest way to export a graph is to copy it to the Clipboard and then paste it into another application. STATISTICA native, Windows metafile, and bitmap formats are created in the Clipboard and can be used in other applications.

STATISTICA Graphs can be pasted into other application documents (e.g., word processor documents or spreadsheets) as embedded objects or objects linked to graph files. If STATISTICA Graphs are pasted to other applications via Windows OLE, the graphs are tied to STATISTICA and can be interactively edited from within the other application.

  • Linking STATISTICA Graph files via OLE. STATISTICA Graph files can also be inserted and linked via OLE to other applications.
  • Export to another file format. If the graph to be saved is to be used by an application that does not support OLE or ActiveX, you can choose to save the file as a different file type by selecting the appropriate option from the Save as type option in the Save As dialog.
  • Limitations of Windows Metafile format. Very large (in terms of the number of data points represented) or very complex graphs that can be produced by STATISTICA can exceed the capacity of the Windows metafile graphics format used in the Windows 95 and 98 systems. In those circumstances, use the JPG, PNG, or bitmap representation instead.

How is the mouse used in graph applications?

In addition to the standard Windows mouse conventions for selecting objects, the mouse can be used in many other specialized applications in the graphics window in STATISTICA. The following is a list of representative examples:

  • OLE. Links or embeds foreign document files to STATISTICA documents by dragging them directly from the desktop or Windows Explorer (across application windows) and dropping them onto STATISTICA Graphs.
  • Brushing. Highlights data points in the graph by clicking on them with the brushing tool or selecting them with a Box, Lasso, Cube, or a 2D or 3D Slice.
  • Zoom in and zoom out tools. Zooms in (“magnifies”) or zooms out (“shrinks”), respectively, the selected area of the graph.
  • Drawing tools. Adds rectangles, ovals (or circles), polylines, and freehand drawings, arrows, etc. to a graph.
  • Resizing and moving. Resizes (drag on a “black selection square”) or moves (drag the entire object) selected graph objects.
  • Editing polyline objects. Reshapes individual segments of the polyline drawing by dragging on either the object area black selection squares or any of the black selection squares that mark the line segments.
  • Rotating text. You can interactively rotate custom text by selecting it in the graph and then dragging one of the object handles (small black squares) in the desired direction.
  • Controlling the mouse with the keyboard in graphs. You can also emulate the mouse with the keyboard in order to move or resize an object by selecting the object, placing the mouse pointer over the object, and then using the keyboard cursor keys to move or resize the object.

Note that the mouse pointer will change to match the application in use. Press the ESC key to return the mouse pointer to the default mode. You can also use the mouse pointer to customize the graph.

How do I select an object in a graph?

To select an object in a graph, click on the object. Once an object has been selected, press the TAB key to navigate from object to object within your graph.

Copy and Paste Operations (Clipboard)

How can I copy an entire STATISTICA Graph?

Ensure that the window containing the graph to be copied is active, and then press CTRL+C or click the Copy toolbar button.

STATISTICA Graphs can be pasted and linked or embedded in other application documents (e.g., word processor documents or spreadsheets) following standard OLE conventions. If STATISTICA Graphs are pasted to OLE-compatible applications, the graphs maintain their relation to STATISTICA and thus can be interactively edited from within the other application or updated when the STATISTICA Graphs change.

If the STATISTICA Graph copied to the Clipboard has been saved as an *.stg file, you can link it in other application documents (or STATISTICA’s own) by selecting Paste Special from the Edit menu.

How can I copy a selected part of a STATISTICA Graph?

There are several copy options:

  • Copying an object. Select a graphic object to be copied by clicking on it (ensure that you are in default pointing mode, i.e., the Selection Tool button on the toolbar has been clicked). Graphic objects are all objects you have created on the screen, such as custom text, a segment of a drawing, or an embedded graph or artwork. When the object is selected, press CTRL+C. Alternatively, you can click the Copy toolbar button.
  • Copying a rectangular section of the graph. Enable the Screen Catcher by pressing ALT+F3, or select Capture Rectangle from the Edit – Screen Catcher submenu. Hold down the left mouse button, and use the mouse pointer to select the area of the graph you want to copy. When you release the mouse button, the selected area will be automatically copied to the Clipboard in the bitmap format (there is no need to click the Copy button). Note that the Screen Catcher can be used to copy any rectangular part of the screen, not only in the graph window from which it was called but any part of the screen (even including parts that belong to other applications).
  • Copying a specific window. The Screen Catcher can also capture a specific window from the screen. To copy a specific window, select Capture Window from the Edit – Screen Catcher submenu, and use the mouse pointer to select the desired window.

How do I place text in a STATISTICA Graph?

Even large portions of text (e.g., a report several pages long) can be pasted into STATISTICA Graphs using the Clipboard operations mentioned in the previous two topics. Additionally, you can paste a portion of a document into the graph window using the Paste Special command. To edit and customize the text within STATISTICA Graphs, double-click the text to display the Titles/Text dialog (for custom text) or the respective OLE server application (for pasting in text via the Paste Special command).

Both the Clipboard-based operations and the inserting operations listed in the previous topic apply to all Windows-compatible graphs and artwork (linking and embedding operations support any OLE-compliant objects).

How do I place artwork or other graphs into a STATISTICA Graph?

The Clipboard-based operations (cut, copy, paste, link, embed) apply to all Windows-compatible artwork and graphs. Linking and embedding operations save graphs and artwork into bitmaps, Windows graphics metafiles, STATISTICA format graph files, and any OLE-compatible objects.

How can I undo operations on graph objects?

A multi-level undo option (available from the Edit menu, or by clicking the Undo toolbar button, or by pressing CTRL+Z) maintains up to 32 buffers (steps), which also include operations on objects.
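
A bounded undo history of this kind can be sketched generically as follows. This is an illustration of the idea of a fixed number of undo steps, not STATISTICA's implementation; the class and its names are hypothetical.

```python
from collections import deque

class UndoHistory:
    """Keeps at most `max_steps` previous states; older ones are discarded."""

    def __init__(self, initial_state, max_steps=32):
        self.state = initial_state
        # Assumption: a deque with maxlen models the fixed number of undo buffers.
        self._history = deque(maxlen=max_steps)

    def apply(self, new_state):
        """Record the current state, then switch to the new one."""
        self._history.append(self.state)
        self.state = new_state

    def undo(self):
        """Restore the most recent saved state; return False if none is left."""
        if not self._history:
            return False
        self.state = self._history.pop()
        return True

history = UndoHistory(initial_state=0)
for step in range(1, 41):      # 40 edits, more than the 32 available steps
    history.apply(step)

undone = 0
while history.undo():
    undone += 1
```

After 40 edits only the last 32 can be undone, so the loop stops at the state that existed after the eighth edit.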

Multiple and Compound Graphs

How can I place one STATISTICA Graph into another?

The easiest way to place one graph into another is to copy a graph displayed in one window (press CTRL+C or click the Copy toolbar button), and then move to the target graph window and paste it there (press CTRL+V or click the Paste toolbar button). The pasted graph will be displayed on the target graph. Now you can move or resize it like any other custom graphic object.

You can also change the properties of the pasted object by selecting Object Properties from its respective shortcut menu (right-click on an object). You can also edit the embedded object by double-clicking on it (following the standard OLE conventions).

Graphs and artwork saved as files can also be dynamically linked or statically embedded in the current graph by using the standard OLE facility, accessible by clicking the Graph Tools toolbar Insert Object button or selecting OLE Object from the Insert menu.

What are compound graphs?

Compound graphs are those that contain other graphs. STATISTICA can automatically create compound graphs (e.g., in the Quality Control module, where one display contains four different types of graphs, or when you use the Multiple Graph AutoLayout Wizard).

Can I represent objects in graphs as expandable icons?

Icons representing documents in Windows Explorer can be dragged across applications and dropped into STATISTICA Graphs. If the source application is OLE-compliant, the document will be displayed in the STATISTICA Graph.

If the source application is not OLE-compliant, the document will be represented as an icon, either of the source application (if an association exists in Windows for the document’s file extension), or of the Windows Object Packager (if no association exists). These icons function as buttons; double-clicking on an icon will launch the application with which it is associated and open the file represented by the icon.

How do I create a blank graph for a compound graph?

The quickest way to create a blank graph is to select Blank Graph from the Graphs – Multiple Graph Layouts submenu. You can also select Wizard from the Graphs – Multiple Graph Layouts submenu to display the AutoLayout Wizard – Step 1 dialog. In this dialog, click the Blank button in the Add Graphs group box, and then click the OK button to produce a “compound” graph containing one blank graph. You can then add new or existing graph objects (e.g., added text, embedded or linked objects, arrows, freehand drawings, previously saved graphs, etc.) to that blank graph.

The Wizard (see below) and the Templates commands (on the Graphs – Multiple Graph Layouts submenu) can also be used to design and produce a custom layout. Alternatively, the Snap to Grid facility can be used. The Alignment Grid (accessible from the View menu) and/or the dynamically updated cursor coordinates can be used to aid in the visual placement and alignment of the graph objects in the blank graph.

Can I place multiple graphs on one page?

Several graphs can be printed on one page by linking or embedding them within a blank graph (see above). Although this can be done manually using cut-and-paste (and Snap to Grid), the easiest method is to use either the Multiple Graph Layouts – Wizard or the Multiple Graph Layouts – Templates, both of which automate the placement of multiple graphs on one page.

What is the Multiple Graph AutoLayout Wizard?

The Multiple Graph AutoLayout Wizard can be accessed from the Graphs – Multiple Graph Layouts submenu. The Multiple Graph AutoLayout Wizard assists you in selecting and arranging graphs to be placed on the same page.

Graphs can be selected from all currently open STATISTICA Graph windows (in all currently open STATISTICA modules) or from graph files previously saved to disk; blank graphs (to be filled or replaced later) can also be used.

Categorized Graphs

What are categorized graphs?

Categorized graphs are created by categorizing data into subsets and then displaying each of these subsets in a separate small component graph arranged in one display. For example, one graph can represent male subjects and another one female subjects, or high blood pressure females, low blood pressure females, high blood pressure males, etc.

In STATISTICA, categorized graphs are:

  • Available in many output dialogs (they are automatically generated as part of output from all procedures that analyze groups or subsets of data, e.g., breakdowns, t-tests, ANOVA, discriminant function analysis, nonparametrics, and many others)
  • Accessible as part of the Graphs of Input Data options in the shortcut menus in all spreadsheets
  • Accessible from the Graphs menu where a wide variety of user-defined methods to categorize data are available

How do I define categories for categorized graphs?

When categorized graphs are requested from output dialogs of specific analytic procedures that involve subsets of data, the graphs will automatically display the subsets that are already defined as part of the current analysis.

Alternatively, the categorized graphs requested from the Graphs menu offer a variety of methods to specify subsets using one or two grouping variables.

Specifically, categories can be defined by:

  • Integer Mode: Integer values of grouping variables
  • Categories: Dividing grouping variables into a requested number of equal-length intervals
  • Boundaries: Custom intervals (ranges) of grouping variables, defined by specific interval boundaries
  • Codes: Specific values (i.e., codes) of grouping variables
  • Multiple Subsets: User-defined “multiple-subset” definitions that can be entered as logical case selection conditions of values of all variables in the current data file
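
The five methods listed above can be sketched as small functions that map a value (or a whole case) to a category. This is only an illustration of the logic: the function names are hypothetical, integer mode is assumed here to truncate, and multiple subsets are assumed to be allowed to overlap. The boundary 104.624 is the Home_2 cut point from this section's example.

```python
def by_integer(value):
    """Integer mode: the integer part of the value is the category (assumed truncation)."""
    return int(value)

def by_equal_intervals(value, low, high, n_categories):
    """Categories: split [low, high] into n equal-length intervals (0-based index)."""
    width = (high - low) / n_categories
    index = int((value - low) / width)
    return min(max(index, 0), n_categories - 1)  # clamp values at or beyond the ends

def by_boundaries(value, boundaries):
    """Boundaries: index of the first (ascending) boundary the value does not exceed."""
    for i, b in enumerate(boundaries):
        if value <= b:
            return i
    return len(boundaries)

def by_codes(value, codes):
    """Codes: only the listed values form categories; other values give None."""
    return value if value in codes else None

def by_multiple_subsets(case, conditions):
    """Multiple subsets: names of all logical conditions that the case satisfies."""
    return [name for name, test in conditions.items() if test(case)]

# Hypothetical case and subset definitions
case = {"case_number": 12, "Home_2": 110.0}
subsets = {
    "first 20 cases": lambda c: c["case_number"] <= 20,
    "high Home_2": lambda c: c["Home_2"] > 104.624,
}

row_category = by_boundaries(case["Home_2"], [104.624])   # 1: greater than 104.624
memberships = by_multiple_subsets(case, subsets)
```

Unlike the other four methods, the multiple-subset approach can place one case in several subsets, because each subset is an independent logical condition.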

The following graph is a relatively complex example of a two-way categorized graph based on a mixed method of defining the subset graphs. The two-way categorization arranges small graphs like a two-way table (crosstabulation) based on two different criteria of categorization.

For example, the two rows of graphs represent categories defined based on values of variable Home_2 (cases where Home_2 is less than or equal to 104.624 and cases where it is greater than 104.624). The three columns of graphs represent subsets of cases defined using specific “multiple subset” definitions based on values of variable number 0 (i.e., case numbers) and variable Home_2.

Following is the 2D Categorized Scatterplots dialog from which the above graph was defined (select Scatterplots from the Graphs – Categorized Graphs menu).

Specifically, variables Work_1 and Work_2 are plotted in each small graph (as variables X and Y, respectively). The first of the two categorizations (X Categories, or “columns” of graphs) was defined as Multiple Subsets in the Specify Multiple Subsets dialog that is displayed after the Specify Subsets button is clicked.

How do I create a graph that plots my data according to all distinct values of a categorical variable? What is the limit on the number of categories?

You can make such a graph with a categorical variable having up to 1,000 categories. On the graph specification dialog, select Unique Values as the type of interval. Note that you can also choose to sort these categories in ascending or descending order on the graph.

Fitting, Plotting Functions

How do I fit a function to data?

Access the Plot: Fitting options pane of the Graph Options dialog, select the appropriate plot, and click the Add new fit button; then, select the desired type of function or smoothing procedure in the Fit type box. You can adjust the fitting options (e.g., stiffness or optimization settings) and the pattern for the graphical representation of the fit here as well. The pattern can also be adjusted by double-clicking on the fit line or surface in the graph.

How do I display a specific equation for the fitted function?

In Graphs menu graphs, the display of the text of the fitted function equations can be requested by selecting either In title or As custom text in the Display fit expression box on the Options 2 tab of the graph specification dialog. Select Off in the Display fit expression box to suppress the display of fit equations.

Note that these options can be controlled globally (i.e., for all graphs) in the Analyses/Graphs: Display options pane of the Options dialog accessible from the Tools menu.

In all single plot and non-categorized graphs where only one function is fitted, the text of the equation is displayed in the first available line of the fixed title. Depending on the number of equations to be displayed, also in categorized graphs, the equations can be displayed in the fixed titles of the graph.

However, if more equations need to be displayed than the number of lines available in the fixed title, STATISTICA will create a custom text object on the graph and place the equations there. Potentially, such lists of equations can be very long (e.g., include 256 equations), and thus the custom text object can be large and partially cover the graph. However, the location of the listing of functions can be adjusted (the list can be moved around and edited like any other custom text object, the font size reduced, etc.).

When the listing of functions is very long, it is recommended to add some space around the graph and place the text object there. You can add space around the graph using either the Graph actual size/scaling toolbar button or the Set graph area toolbar button.

How do I plot a custom-defined function?

Select Custom Function Plots from either the Graphs – 2D Graphs or the Graphs – 3D XYZ Graphs submenu and specify the function in the respective dialog. Also, you can add a custom function plot to any existing graph: Access the Custom Function options pane of the Graph Options dialog and click the Add new function button. Then, use the options in the Custom Function options pane to specify the equation to be plotted in the 2D or 3D graph.

In addition to the standard math functions, a variety of functions representing distributions as well as their integrals and inverses are supported and can be plotted (including Beta, binomial, Cauchy, Chi-square, exponential, F, Gamma, geometric, Laplace, logistic, normal, log-normal, Pareto, Poisson, Student’s t, and Weibull distributions).
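
As a generic illustration of what a distribution function, its integral, and its inverse mean, the following sketch uses the normal distribution from Python's standard library. It does not use STATISTICA's function syntax, which is not shown here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

density = standard_normal.pdf(0.0)          # height of the density curve at x = 0
cumulative = standard_normal.cdf(1.96)      # integral of the density up to x = 1.96
quantile = standard_normal.inv_cdf(0.975)   # inverse: x whose cumulative probability is 0.975

# (x, y) points for a simple plot of the density over [-3, 3]
curve = [(x / 10, standard_normal.pdf(x / 10)) for x in range(-30, 31)]
```

The integral and the inverse undo each other: applying the inverse to a cumulative probability returns the original x value.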

How do I fit a custom-defined function to data?

The custom-function plotting facility (see the previous topic) accessible in the Custom Function options pane of the Graph Options dialog plots the requested (custom-defined) functions and overlays them on the existing graph. It does not fit these functions to the data. The most commonly used predefined functions and smoothing procedures that can be fitted to the data are available in the Plot: Fitting options pane of the Graph Options dialog (e.g., Linear, Logarithmic, Exponential, Polynomial, Distance-Weighted least squares, Spline, and others).

Comprehensive facilities to fit to data (and interactively plot in two or three dimensions) user-defined functions of practically unlimited complexity are provided in the Nonlinear Estimation module.
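
The difference between plotting a function and fitting one can be sketched in a few lines. In this hypothetical example, the custom function has parameters fixed by the user and is simply evaluated, whereas the linear fit estimates its intercept and slope from the data by ordinary least squares. The data values are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates of intercept a and slope b in y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def custom_function(x):
    """Plotting only: the parameters 1.0 and 2.0 are given, not estimated."""
    return 1.0 + 2.0 * x

# Hypothetical data
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

intercept, slope = fit_line(xs, ys)            # estimated from the data
overlay = [custom_function(x) for x in xs]     # evaluated exactly as specified
```

The overlay stays the same whatever the data look like, while the fitted intercept and slope change whenever the data change.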

Printing Graphs

Do all printer drivers support rotated fonts?

Most properly configured printers supported by Windows can handle rotated fonts; however, some printer drivers support some of the advanced printer control features used by STATISTICA only when they are set to a higher resolution (e.g., higher than 300 DPI) and/or when they are set to print fonts as graphics. If you encounter problems (e.g., rotated text is printed as unrotated or “uncovered” text is revealed that should be covered), consult the documentation included with your printer for direction on printing TrueType fonts as graphics or setting your printer to a higher resolution.

Do all printers support the non-transparent overlaying of graphic objects?

Most properly configured printers supported by Windows can properly handle printing of non-transparent overlays used in STATISTICA Graphs; see the previous topic for advice on how to configure the printer driver.

Can I quickly adjust sizes of all fonts in a graph?

In STATISTICA, all graph displays and printouts can be continuously scaled. STATISTICA will also automatically adjust the sizes of all fonts, markers, spacing, etc., such that manual adjustments of individual font sizes are rarely necessary.

You can interactively decrease or increase the size of the selected text or point marker by clicking the Decrease Font or Increase Font buttons (respectively) on the Graph Tools toolbar. Each click of the toolbar button changes the font size (or point marker) by one point (i.e., one click of the Increase Font button will increase the font size or point marker by one point). Note that if you have not selected any text or point markers, clicking these buttons will increase or decrease all text and point markers by one point.

Knowledge Base – Analyses

In Basic Statistics, what is the difference between the Breakdown; non-factorial tables analysis and the Breakdown & one-way ANOVA analysis?

These two analyses both produce Breakdown tables. If the data are well balanced, there is no difference between the two. However, when the combinations of your categories are not well balanced, the Breakdown; non-factorial tables analysis will produce a much denser table by eliminating the empty cells.
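
The effect of eliminating empty cells can be shown with a small, hypothetical data set in which one combination of categories never occurs. This sketch only illustrates the two table layouts; it is not how either analysis is implemented.

```python
from itertools import product

# Hypothetical cases: (gender, blood pressure); no "female"/"low" case exists
cases = [
    ("male", "low"), ("male", "low"), ("male", "high"),
    ("female", "high"), ("female", "high"),
]

def cell_counts(rows):
    """Count how many cases fall into each observed combination."""
    counts = {}
    for key in rows:
        counts[key] = counts.get(key, 0) + 1
    return counts

observed = cell_counts(cases)

genders = sorted({g for g, _ in cases})
levels = sorted({lvl for _, lvl in cases})

# Factorial layout: one cell per possible combination, including empty ones
factorial_table = {cell: observed.get(cell, 0) for cell in product(genders, levels)}

# Non-factorial layout: only the combinations that actually occur in the data
non_factorial_table = dict(observed)
```

With two grouping variables the saving is one cell, but with many sparsely populated grouping variables the non-factorial layout can be far smaller than the full crossing.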

 

Can I perform analyses for all combinations of a set of categorical or grouping variables?

Yes. Select Batch (ByGroup) Analysis from the Statistics menu or from the Graphs menu to display the ByGroup Statistics Browser or ByGroup Graph Browser, respectively, which contains all of the available analyses and graphs.

 

What is the purpose of a Gage Linearity and Bias Study?

A Gage Linearity and Bias Study answers the questions “How biased is my gage when compared to a master value?” and “Does the accuracy of my gage change when the size of the parts being measured changes?” These are called Bias and Linearity, respectively.
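
The two questions can be made concrete with a small calculation. In this sketch, bias is taken as the average reading minus the master value, and linearity is summarized by the least-squares slope of bias against the master value. The study data are hypothetical, and the report produced by STATISTICA may include further statistics not shown here.

```python
def gage_bias(measurements, reference):
    """Average deviation of repeated measurements from the master value."""
    return sum(measurements) / len(measurements) - reference

def gage_linearity_slope(references, biases):
    """Least-squares slope of bias against the size of the master value."""
    n = len(references)
    mean_r = sum(references) / n
    mean_b = sum(biases) / n
    num = sum((r - mean_r) * (b - mean_b) for r, b in zip(references, biases))
    den = sum((r - mean_r) ** 2 for r in references)
    return num / den

# Hypothetical study: three parts with known master values, three readings each
study = {
    2.0: [2.1, 2.0, 2.2],
    4.0: [4.0, 4.1, 3.9],
    6.0: [5.8, 5.9, 5.7],
}

references = sorted(study)
biases = [gage_bias(study[r], r) for r in references]
slope = gage_linearity_slope(references, biases)
```

A slope near zero would indicate that the bias does not depend on part size; in this invented data the gage reads high on small parts and low on large ones, so the slope is negative.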

 

I’m new to data mining. Is there a “wizard-like” feature in STATISTICA Data Miner?

Yes. STATISTICA Data Miner Recipes (DMR) is an easy step-by-step data mining guide with a wizard-like user interface. Novice data miners can quickly clean and analyze data, while advanced users can work more efficiently and have one more option to automate routine tasks. DMR explores the data and makes default decisions for you. You can easily modify these defaults as needed and save them for repeated use.

Can I save my Data Mining models for later deployment?

Yes. STATISTICA Data Miner provides convenient and effective ways of saving your existing models for later deployment and use. Predictive Model Markup Language (PMML) is particularly fast to execute.

 

What are the applications of STATISTICA Sequence, Association, and Link Analysis?

STATISTICA Sequence, Association, and Link Analysis (SAL) can be applied to any data set that contains market-basket type data. The market-basket problem assumes there are many products that can be purchased by the customer. Such products can be, for example, supermarket items, different insurance plans, etc. Customers fill their basket with only a fraction of the available items. STATISTICA SAL can use this information to predict what customers will purchase and, hence, help you to boost your sales and meet the supply and demand in your business.
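
To make market-basket data concrete, the sketch below computes support and confidence, two standard measures in association analysis, on a few hypothetical baskets. Whether SAL reports results in exactly this form is not stated here; the functions are illustrative only.

```python
# Hypothetical baskets: each set holds the items one customer bought
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Among baskets containing the antecedent, the fraction that also contain the consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)

bread_milk_support = support({"bread", "milk"}, baskets)
bread_to_milk = confidence({"bread"}, {"milk"}, baskets)
```

Here half of all baskets contain both bread and milk, and two of the three baskets with bread also contain milk.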

 

I want to fit a large number of variable distributions to lists of variables. How do I do this?

Try the STATISTICA Distributions & Simulation module. It has standard (normal, half-normal, log-normal, Weibull) and specialized (Johnson, Gaussian Mixture, Generalized Pareto, Generalized Extreme Value) distributions. STATISTICA automatically ranks the quality of the fit for each selected distribution and variable.

 

In addition, the distributions fit to the list of selected variables and the covariance between the selected variables can be saved for deployment. The Distributions & Simulation module uses this deployment information to generate simulated data sets that faithfully reproduce not only the respective distributions, but also the covariances between variables. In short, in addition to facilitating efficient distribution fitting to large numbers of variables, this module enables users to fit general multivariate distributions and simulate from those distributions using simulation techniques such as Latin-Hypercube.
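
The idea behind Latin hypercube sampling can be sketched with the standard library alone. Each dimension of the unit cube is divided into as many equal-width strata as there are samples, and every stratum is used exactly once per dimension. This is a generic sketch, not the module's algorithm, and it does not reproduce covariances between variables.

```python
import random

def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in the unit cube, one per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each of the n equal-width strata
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

points = latin_hypercube(n_samples=5, n_dims=2, seed=0)
```

Each coordinate lies in [0, 1) and can then be passed through the inverse cumulative distribution function of a fitted distribution to obtain simulated values on the original scale.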

 

Is there a way to optimize functions in STATISTICA?

Yes. The STATISTICA General Optimization module enables you to optimize arbitrary functions of virtually any complexity, using Simplex, Genetic Algorithm, or Grid-Search methods. This module finds the best parameters that control specific processes to achieve optimal results according to user-specified criteria. The function to be optimized can be specified in a simple STATISTICA Visual Basic (SVB) function or a set of formulas. This module can repeatedly invoke other STATISTICA (or R-language) functions in an efficient manner.
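
Of the three methods named above, grid search is the simplest to illustrate: the function is evaluated at every combination of candidate parameter values and the best one is kept. The sketch below is generic Python, not an SVB function or the module's interface, and the objective function is a hypothetical example.

```python
from itertools import product

def grid_search(objective, grids):
    """Evaluate the objective at every grid combination; return the minimizer and its value."""
    best_point, best_value = None, float("inf")
    for point in product(*grids):
        value = objective(*point)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value

def objective(x, y):
    """Hypothetical function to minimize; its minimum is at x = 1, y = -2."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

axis = [i * 0.5 for i in range(-10, 11)]   # -5.0, -4.5, ..., 5.0
best_point, best_value = grid_search(objective, [axis, axis])
```

The cost grows with the product of the grid sizes, which is why methods such as Simplex or a genetic algorithm are usually preferred when there are many parameters.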

The Big Data Revolution And How to Extract Value from Big Data

“Big data” is the buzzword that is currently dominating professional conferences around data science, predictive modeling, data mining, and CRM, to name only a few of the domains that have become electrified by the prospect of incorporating qualitatively larger data sizes and more voluminous high-velocity data streams into business or other organizational processes.

As is usually the case when new technologies begin to transform industries, the technologies also introduce new terminology and, indeed, new ways of “thinking about” or conceptualizing reality and approaches to solve problems or improve processes.

For example, while only a few years ago it was only possible and conceivable to “segment” customers into groups most likely to purchase specific items or services, it is now possible and common to build models for each customer in real time as s/he peruses the internet searching for a specific household item or electronic gadget: instantly, that interest can be analyzed and translated into relevant display advertisements and offers to specific prospects, providing a degree of customization that was inconceivable only a few years ago. As technologies to record the physical location of cell phones and their owners have matured, it seems that it won’t be long now until the vision depicted in the 2002 sci-fi thriller Minority Report, where display advertisements in malls are tailored to the specific individuals passing by, will become reality. At the same time, there are domains and situations where, inevitably, the excitement about new technologies around big data will give way to great disappointment. Sometimes sparse data describing precisely a critical piece of reality (critical for a business’ success) is much more valuable than big data describing non-critical pieces of that reality. The purpose of this paper is to clarify and reflect on some of the exciting new opportunities around big data, and illustrate how StatSoft’s STATISTICA analytic platform(s) can help leverage big data to optimize a process, solve problems, or gain “big insights.”

Contents:
The Big Data Revolution
And How to Extract Value from Big Data
Overview
How Big is Big Data
Large Data, Huge Data
From Huge Data to Big Data
Technical Challenges with Big Data
Storage of Big Data
Unstructured Information
Analyzing Big Data
Map-Reduce
Simple Statistics, Business Intelligence (BI)
Predictive Modeling, Advanced Statistics
Building Models
Other Issues and Considerations for Implementation
Taking Advantage of Big Data by Building Large Numbers of Models
Deployment of Models for Real Time Scoring
Criticism of Big Data Strategies, Implementation Strategies
Big Data Does Not Necessarily Equate to Big Insights, Improvements
Velocity of Data and Actionable Time Intervals
Summary
References
Glossary
Big Data
Distributed File System
Exabyte
Hadoop
Map-Reduce
Petabyte
Terabyte

To read the full business white paper, click here.

Vote for Professor Adrian Saville!

Adrian Saville has been nominated for the Economist Intelligence Unit’s Business Professor Award as Professor in Economics and Finance at GIBS. There are 206 nominees from some of the world’s largest and best-known business schools. He is the only nominee from Africa. Getting him onto the shortlist would send an incredible message to the world about our continent, South Africa, our universities and our people.

Adrian Saville is currently ranked fifth of 206. Let’s get him up to first! It takes 30 seconds to vote.

  • Copy this into your browser: businessprofessoraward.com
  • Find his name under “S”
  • Click on “Vote”
  • Fill in your name and email address in the pop up box
  • You will receive an email (check your junk mail) to confirm your vote. This step is important: if you don’t confirm the vote, it isn’t registered

Voting closes in six days on 23 November at 5.00pm – so please vote as soon as you can.

 

Danske Bank Implemented STATISTICA to Maximize Efficiency and Minimize Operational Risk

Danske Bank, the largest bank in Denmark and one of the largest financial institutions in northern Europe, has implemented STATISTICA data analysis and predictive modeling software for risk management and scoring…

Read More >> as seen on Wall Street Journal’s Marketwatch.com. (PDF, 1.2 MB)

STATISTICA Connectivity and Data Integration Solutions

StatSoft’s customers depend on STATISTICA Enterprise for business intelligence and data mining applications that use a wide variety of data from many different sources.

STATISTICA Enterprise allows you to maintain the connections to this data, and the analyses of the data, all in one centralized application.

The data sources supported by STATISTICA Enterprise include the following:

  • Any OLE DB or ODBC relational database, such as Oracle, SQL Server, or Access
  • Flat files like Excel spreadsheets and STATISTICA spreadsheets
  • Data historian repositories such as the PI Data Historian from OSIsoft, Inc.
  • Multidimensional databases containing terabytes of data, queried through In-Place Database Processing (IDP), which processes the data without importing it to local storage

STATISTICA Enterprise also provides convenient tools for filtering your data, and for viewing the metadata associated with those data sources.

 

The STATISTICA Enterprise administration application (called the Enterprise Manager) allows you to specify multiple users, with passwords, and assign different roles to those different users. For example, your database administrators can be given permission to create and modify database connections and queries, and your engineers can be given permission to run those queries and analyze the resulting data. Furthermore, each engineer can be assigned permissions for only the analyses that pertain to his or her work.
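The role-based scheme described above can be pictured with a small, generic sketch. This is not the Enterprise Manager's data model or API; the role names, permission names, and the `User` class are hypothetical and chosen only to mirror the example of database administrators and engineers.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from role to the actions that role may perform.
ROLE_PERMISSIONS = {
    "database_admin": {"create_connection", "modify_connection", "create_query", "modify_query"},
    "engineer": {"run_query", "analyze_data"},
}


@dataclass
class User:
    name: str
    roles: set
    allowed_analyses: set = field(default_factory=set)  # analyses tied to this user's work


def can(user, action):
    """True if any of the user's roles grants the action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user.roles)


def can_run_analysis(user, analysis):
    """Engineers may analyze data, but only for analyses assigned to them."""
    return can(user, "analyze_data") and analysis in user.allowed_analyses


dba = User("dana", {"database_admin"})
engineer = User("erik", {"engineer"}, {"line_3_spc"})

print(can(dba, "modify_query"))                  # DBA manages connections and queries
print(can(engineer, "modify_query"))             # engineers cannot change queries
print(can_run_analysis(engineer, "line_3_spc"))  # assigned analysis
print(can_run_analysis(engineer, "line_7_spc"))  # analysis outside this engineer's work
```

Separating what a role may do from which analyses an individual may touch keeps the two kinds of permission independent, as in the paragraph above.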