Monthly Archives: February 2012

Visualizing Medical Data

I love my friends. When they see a graph or a fun data set, sometimes they’ll share it with me as blog-spiration.

This is how I became the proud owner of some real-life blood pressure data: my friend’s doctor asked them to record this information several times a day. As with any new dataset, I tried out several types of graphs to see which provided the most useful information. I looked for outliers in the data so that we could investigate and explain them. I also looked for ways to categorize the information in the data set that might help me tease out patterns hidden within overall trends.

In this multiple line plot, I have both Diastolic and Systolic blood pressure readings going back as far as October 2011. One point spikes above 160 Systolic. To investigate, I right-clicked on the graph and opened the Brushing tool, which allowed me to click on any point in the graph and label it. Brushing the point quickly showed that the high reading was on February 11. If I wanted to be nosy, I could ask my friend whether something stressful was going on that day or whether the reading was due to error or equipment malfunction.
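For readers who want to try the same check outside STATISTICA, here is a minimal Python sketch of flagging high readings by threshold. The readings are made-up illustrative values, not my friend’s actual data, and the `readings_above` helper is a hypothetical name of my own.

```python
from datetime import datetime

# Hypothetical (timestamp, systolic, diastolic) readings, for illustration only.
readings = [
    (datetime(2012, 2, 10, 8, 15), 128, 82),
    (datetime(2012, 2, 11, 19, 40), 164, 95),
    (datetime(2012, 2, 12, 7, 55), 131, 84),
]

def readings_above(rows, systolic_limit):
    """Return the rows whose systolic value exceeds systolic_limit."""
    return [row for row in rows if row[1] > systolic_limit]

# Label every point above 160 Systolic with its date and time.
spikes = readings_above(readings, 160)
for when, systolic, diastolic in spikes:
    print(f"{when:%Y-%m-%d %H:%M}  {systolic}/{diastolic}")
```

This only surfaces the date of the spike; deciding whether it reflects stress, error, or equipment malfunction still takes a conversation.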


We could also see whether time of day had an impact on blood pressure readings. I took the raw data and broke it down by hour of the day using STATISTICA Extract, Transform, and Load, which is typically used to aggregate and combine much larger and more complicated data sources but is well suited to handling time-stamped data of any size. This time, I added heart rate data on the right Y axis of the multiple line plot.

Each point represents an average of all readings from each one-hour time block in the day. For example, the first point is an average of all readings ever taken between midnight and 1am since October. We can see the slight upward trend through the morning hours, the respite around noon – perhaps during or in anticipation of a lunch break – and the continued upward trend through the evening hours until around 8pm.
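The hour-of-day breakdown can be approximated in a few lines of Python. This is only a sketch of the aggregation idea, not what STATISTICA Extract, Transform, and Load does internally; the readings and the `average_by_hour` helper are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical (timestamp, systolic) readings, for illustration only.
readings = [
    (datetime(2011, 10, 3, 0, 20), 118),
    (datetime(2011, 11, 9, 0, 45), 122),
    (datetime(2011, 12, 1, 12, 10), 126),
    (datetime(2012, 1, 15, 20, 5), 138),
    (datetime(2012, 2, 2, 20, 50), 142),
]

def average_by_hour(rows):
    """Group readings by hour of day (0-23) and average each group."""
    buckets = defaultdict(list)
    for when, value in rows:
        buckets[when.hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}

hourly = average_by_hour(readings)
for hour, avg in hourly.items():
    print(f"{hour:02d}:00  {avg:.1f}")
```

Readings from different days fall into the same bucket as long as they share an hour, which is what lets a single point stand for every reading taken between midnight and 1am.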

Finally, my friend was asked to start tracking the blood pressure data taken from each arm. We can look at the distribution of data from each arm using a Categorized Histogram.



The additional statistics added to the top of each graph allow us to confirm what our eyes gather from the histograms – that the means are higher from the right side readings and the variability is higher on the left side readings. Once again, I used the Brushing tool on the graphs to turn “off” the one outlier point from the left side. The graph and corresponding statistics updated immediately to remove the effect of the outlier point.
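To see how much a single outlier can move the summary statistics, here is a small Python sketch that computes the mean and sample standard deviation per arm, then recomputes the left side with one value excluded. The numbers are invented for illustration (190 plays the role of the outlier), and `summarize` is a hypothetical helper.

```python
from statistics import mean, stdev

# Hypothetical systolic readings per arm; 190 stands in for the outlier.
by_arm = {
    "left": [120, 124, 118, 190, 122],
    "right": [130, 132, 128, 131, 129],
}

def summarize(values, exclude=()):
    """Return (mean, sample standard deviation), skipping excluded values."""
    kept = [v for v in values if v not in exclude]
    return mean(kept), stdev(kept)

for arm, values in by_arm.items():
    m, s = summarize(values)
    print(f"{arm:>5}: mean={m:.1f} sd={s:.1f}")

# Turn "off" the outlier on the left side and recompute.
left_mean, left_sd = summarize(by_arm["left"], exclude={190})
print(f"left without outlier: mean={left_mean:.1f} sd={left_sd:.1f}")
```

Excluding by value is a shortcut that works for this toy data; with real readings it would be safer to exclude a specific row, since a legitimate reading could share the same value.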

The ability to easily track such important health information, coupled with the growing ability to analyze it, monitor it, and suggest interventions when necessary, even from afar… Living in the future is pretty exciting, isn’t it?


STATISTICA Decisioning Platform™ Integrates Predictive Analytics into Organizations’ Business Processes and Decision Making

StatSoft announces the STATISTICA Decisioning Platform™ – a new decision support technology that provides an enhanced platform for StatSoft decision support solutions for Banking, Insurance, and Manufacturing. It is an extension of the STATISTICA Enterprise analytics software platform, which integrates predictive analytics, text mining, and a rules engine to deliver recommendations, results, and guidance to front-line workers and decision makers within the respective departments and business units of an organization.

The proactive and on-demand recommendations achieve improved performance, product quality, customer satisfaction, market share, and other organizational objectives. The STATISTICA Decisioning Platform™ includes user personalization, a “workbench” for predictive analytics and text mining for quantitative analysts, management and versioning of predictive models, a Rules Builder™ for composing, testing, and deploying decision rules, and high performance STATISTICA Enterprise Server software for off-loading predictive model-building and scoring.

Predictive modeling is omnipresent across all aspects of commercial and government organizations, including the marketing, sales, manufacturing, and service functions. The rapid growth of data from various sources, describing in detail the critical aspects of the environment in which an organization operates and the processes that make it successful, has elevated predictive modeling to a critical business activity. The STATISTICA Decisioning Platform™ guides decision making throughout an organization by “pushing out” recommendations and integrating them directly in core line-of-business systems – relevant insights delivered in a timely fashion to the people who need them in a format that empowers them to act upon those insights.

The STATISTICA Decisioning Platform™ provides business analysts with a flexible Rules Builder™ for defining and reviewing business rules. The Rules Builder packages a powerful set of logical primitives for defining the business rules. These Rules become reusable templates in a workflow workspace that are stored and managed under access control permissions and versioning/history in the STATISTICA system so that questions such as “Why was this rule changed?,” “Who changed it?,” and “What were the applicable rules in place on January 1, 2011?” are all possible to answer (and also to meet regulatory requirements). These Rules nodes can be re-used and applied across different processes, product lines, etc., when that is relevant. STATISTICA Rules Builder also contains powerful tools to accelerate the sometimes challenging work of debugging and validating rules by enabling business analysts to simulate scenarios to check and confirm, for example, that their rules definitions match their organization’s business requirements.

The leading-edge capabilities of the STATISTICA Decisioning Platform™ provide organizations with a wide variety of business benefits.  For example, the ability to:

Financial Services:

  • Reject fewer credit applications and reduce default rates
  • Increase profit with high-risk products such as consumer sales financing on a mobile platform at the point of sale

Insurance:

  • Manage both subrogation and fraud propensity within a single workflow
  • Estimate claim complexity and reserve requirements based on capturing a few key variables

StatSoft, Inc., founded in 1984, is one of the largest global providers of analytic solutions with offices in 30 countries worldwide. StatSoft’s flagship product, STATISTICA, is the enterprise predictive analytics platform for deploying analytic applications across departments, sites, and organizations.

Google Analytics API via STATISTICA Visual Basic

This example macro will query the Google Analytics database and return various metrics/dimensions associated with external user visits to the user’s organizational website. To use this macro, the user’s organization must have an established Google Analytics account with a valid e-mail address and password available to the user. This example macro consists of one main procedure and seven (7) public and private sub-procedural functions.

This macro uses Public Function GetGAProfiles which was written by Mikael Thuneberg. Additional code was provided by Iliyan Gochev, StatSoft Bulgaria and RCoon, StatSoft.

Correct Spelling and Translate – Visual Basic

Download and open the STATISTICA workbook. It contains an example dataset and a STATISTICA Visual Basic macro.

The macro will modify the dataset. It will:

  1. Use the Microsoft Word spell checker to correct spelling, and write the results into Column 2
  2. Translate the corrected text in Column 2 into Spanish using Google Translate