Monthly Archives: March 2012

Text Mining Insurance Losses – Video

To watch the video, please click here


StatSoft’s STATISTICA Achieves SAP-Certified Integration With SAP NetWeaver


StatSoft announced that its STATISTICA release has achieved certified integration with the SAP NetWeaver® technology platform. STATISTICA, StatSoft’s flagship analytics software platform, works with SAP NetWeaver for data exchange. Organizations around the world aggregate their data into an analysis-ready format in the SAP NetWeaver Business Warehouse (SAP NetWeaver BW) component. These organizations can now apply an extensive set of proven high-performance data mining and predictive modeling techniques to gain greater insights from their data made available through SAP NetWeaver BW.

The SAP Integration and Certification Center (SAP ICC) has certified that STATISTICA integrates with SAP NetWeaver Business Intelligence 7.0 via the OLAP BAPI® programming interface for SAP NetWeaver Business Warehouse (BW-OBI) 2.0 to exchange critical data with instances of SAP Business Suite software.

STATISTICA’s SAP-certified interface further unlocks the investment in aggregated, cleaned data in SAP NetWeaver BW. STATISTICA provides the integrated analytics platform for the exploration, clustering, forecasting, visualization, and predictive modeling using these data. Insights derived from these data differ by industry.

STATISTICA software is a configurable enterprise analytics, predictive modeling, and text mining software platform. STATISTICA administrators configure standard queries, analysis templates, report templates, and dashboards and publish them to the groups of users who rely on the insights and summaries for improved decision making.

StatSoft initiated development and certification of the interface to SAP NetWeaver BW in response to the needs of its customers with large investments in the SAP solution. For example, a medical equipment and supplies manufacturer needed help monitoring its product performance, specifically defects, customer issues, and how its products work in the field. Before STATISTICA was integrated with SAP NetWeaver BW, analysts at this medical device company pulled the data from their SAP system by hand about once a month and then analyzed it manually. StatSoft’s customer has improved its regulatory compliance and field surveillance initiatives through the use of the STATISTICA analytics platform now integrated with SAP NetWeaver BW.

StatSoft, Inc., founded in 1984, is one of the largest global providers of analytic solutions with offices in 30 countries worldwide. StatSoft’s flagship product, STATISTICA, is the enterprise predictive analytics platform for deploying analytic applications across departments, sites, and organizations.

SAP, SAP NetWeaver, BAPI and all SAP logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries.
All other product and service names mentioned are the trademarks of their respective companies.

STATISTICA Knowledge Base – FAQ


The STATISTICA Knowledge Base provides summary information on the most commonly used conventions, features, and facilities in STATISTICA.


Additional information


Additional information and step-by-step examples are available in our extensive Help files, the STATISTICA Electronic Manual. Remember, when using STATISTICA, you can press the F1 key at any time to access context-sensitive help.


If you’re looking for tutorials, you can visit the StatSoft Channel on YouTube.


If you need to speak with us directly, call Technical Support at 0112346148. You may also e-mail us at: or


Need more individualized help? Check out StatSoft’s training programs.

STATISTICA – Adding Reference Lines to a Plot

Save The Rhino – South Africa – Petition

Dear friends,

The rhino is being hunted to the brink of extinction, driven by growing horn demand in Asia. But EU pressure on China and Vietnam can force international action to save the rhino — sign our petition today to ensure the EU acts!

The rhino is being hunted into extinction and could disappear forever unless we act now. Shocking new statistics show 440 rhinos were brutally killed last year in South Africa alone — a massive increase on five years ago when just 13 had their horns hacked off. European nations could lead the world to a new plan to save these amazing creatures but they need to hear from us first!

To sign the petition click here

Fueling this devastation is a huge spike in demand for rhino horns, used for bogus cancer cures, hangover remedies and good luck charms in China and Vietnam. Protests from South Africa have so far been ignored by the authorities, but Europe has the power to change this by calling for a ban on all rhino trade — from anywhere, to anywhere — when countries meet at the next crucial international wildlife trade summit in July.

The situation is so dire that the threat has even spread into British zoos who are on red-alert for rhino killing gangs! Let’s raise a giant outcry and urge Europe to push for new protections to save rhinos from extinction. When we reach 100,000 signers, our call will be delivered in Brussels, the decision-making heart of Europe, with a crash of cardboard rhinos. Every 50,000 signatures will add a rhino to the crash — bringing the size of our movement right to the door of EU delegates as they decide their position. Sign the petition below then forward this email widely:

So far this year one rhino has been killed every day in South Africa, home to at least 80% of the world’s remaining wild rhinos. Horns now have a street value of over $65,000 a kilo — more expensive than gold or platinum. The South African Environment Minister has pledged to take action by putting 150 extra wardens and even an electric fence along the Mozambique border to try and stem the attacks — but the scale of the threat is so severe that global action is required.

Unless we act today we may lose this magnificent and ancient animal species permanently. Some Chinese are loudly lobbying for the trade in horn to be relaxed, but banning the trade in all rhinos will silence them. With the EU’s leadership, we can bring these international gangsters to justice, put the poachers in prison, and push for public awareness programmes in key Asian countries — and end this horn horror show for good.

In the next few weeks, the EU will be setting its agenda for the next big global meeting in just a few months — our best chance of turning the tide against the slaughter. We know that rhinos will be on their agenda, but only our pressure can ensure they challenge the problem at its source. Let’s build a giant outcry and deliver it in a spectacular fashion — sign now and together we can stop the slaughter across Africa:

In 2010, Avaaz’s actions helped to stop the elephant ivory trade from exploding. In 2012, we can do the same for the rhino. When we speak out together, we can change the world — last year was the worst year ever for the rhino, but this can be the year when we win.

With hope,

Iain, Sam, Maria Paz, Emma, Ricken and the whole Avaaz team

More Information:

Few Rhinos Survive Outside Protected Areas (WWF)

South Africa record for rhino poaching deaths (BBC)

‘Cure for cancer’ rumour killed off Vietnam’s rhinos (The Guardian)

British Zoos on Alert as Rhino Poaching Hits the UK (International Business Times)

STATISTICA SAS – Considering Alternatives to SAS?

Considering Alternatives to SAS?
Do you use SAS for predictive modeling, advanced analytics, business intelligence, insurance or financial applications, or data visualization?

* Quotes from SAS Customers
* How to Proceed?


SAS software is expensive and carries high, unpredictable annual licensing costs. SAS software is difficult to use, requiring specific SAS programming expertise, and it drives users toward dependency on only SAS-specific solutions (e.g., their proprietary data warehouses). Data visualization is integral for analytics, but SAS’s graphics have major shortcomings.

STATISTICA has consistently been ranked the highest in ease of use and customer satisfaction in independent surveys of analytics professionals. Click here to see the results of the most recent Rexer survey (2010), the largest survey of data mining professionals in the industry.

SAS Alternative, Rexer Survey


We offer the breadth of analytics capabilities and performance, including the most comprehensive data mining solution on the market, using more open, modern technologies. StatSoft software is designed to facilitate interfacing with all industry standard components of your computer infrastructure (e.g., ultra-fast integration with Oracle, MS SQL Server, and other databases) instead of locking you into proprietary standards and total dependence on one vendor.

STATISTICA is significantly faster than SAS. StatSoft is a Software Partner of Intel and has developed technologies that leverage Intel CPU architecture to deliver unmatched parallel processing performance (press release with Intel) and rapidly process terabytes of data. StatSoft’s robust, cutting-edge enterprise system technology drives the analytics and analytic data management at some of the largest computer infrastructures in the world at Fortune 100 and Fortune 500 companies.

Quotes from SAS Customers

“We acquired our SAS license seven years ago and quickly learned that with SAS, you do not pay just an annual renewal and support fee – you practically have to “buy” the software again every year. Our first year renewal fee was already 60% of the initial purchase price, and it increased steadily and every year. Two years ago, our annual fee exceeded the initial purchase price we paid, and it keeps going up much faster than the inflation. This is not sustainable.” – CEO, Technology Company

“It took 8 weeks to install SAS Enterprise Miner. The installer just didn’t work. And we’re a midsize company, so we were a low priority for SAS’s technical support.” – Engineer, Chemical Company

“Early in our evaluation, we eliminated SAS from our consideration of fraud detection solutions primarily due to the exorbitant cost.” – Chief Actuary, Insurance Company

“We had used SAS on-demand for my data mining class. A few days before finals, all of our students’ project files were corrupted. Our SAS technical support representative confirmed there was nothing that could be done to restore the files. We’re switching to STATISTICA.” – University Professor

“Now, all graduate students use R. It is getting more difficult to find SAS programmers.” – Head of Statistics, Pharmaceutical Company

“We used SAS until May 2009 when we converted to WPS. The conversion went remarkably smoothly and was completed on time. Not only did we save a substantial amount in licensing fees, we also regained functionality such as Graphs that we had previously removed because of the cost.” – Survey respondent on

How to Proceed?

StatSoft makes it easy to transition your current SAS environment to STATISTICA, either gradually or all at once. STATISTICA offers:

* Direct import/export to SAS files
* Deployment of predictive models to SAS code to score against SAS data sets
* Native integration to run R programs

For a limited time, we offer to qualifying customers a special upgrade program – MSP (Migration from SAS Program) – where the initial software acquisition cost is guaranteed to be below your current SAS annual renewal cost (and StatSoft annual fees are guaranteed to remain always at only 20% of the initial cost, adjusted for CPI). As part of MSP, we also offer discounted migration service fees if our consultants are engaged to facilitate the migration process.
For more information and for specific recommendations to suit your needs, please contact one of our representatives using the form below:

How To Find Critical Values for Statistical Tests

Many analysis tools that are used for hypothesis testing in STATISTICA give a calculated test statistic and a p-value. (For more information about hypothesis testing, see the article, How to Interpret Statistical Analysis Results.) At times, it may be necessary in your hypothesis test to report the test’s critical value. This article describes how to find the critical value for a statistical test.

What Is a Critical Value?

In statistical hypothesis testing, a critical value is the cut-off value for the computed test statistic that defines the statistical significance of the test. The test statistic follows a known distribution, and the analyst will have selected alpha, the type I error rate. For an upper-tailed test, the critical value is the value from that distribution for which P(X > X critical) = alpha, where X is the test statistic under the null hypothesis and X critical is the critical value for the test.

Elements Needed to Find the Critical Value

Statistical tests can be either one- or two-sided, and this is important for finding the critical value of a test. Below are three possible null and alternative hypotheses for a single sample mean test. Given the same alpha, each of the three tests would have a different critical value (or values).

[Figure: null and alternative hypotheses for the three single sample mean tests]
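The three cases can be written out explicitly. The block below is a sketch in standard textbook notation, not a reproduction of the original figure: the symbols µ0 (the hypothesized mean) and t with subscripts p and df (the p-th quantile of the t distribution with df degrees of freedom) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\text{Two-sided:}    \quad & H_0:\ \mu = \mu_0   \ \text{ vs. }\ H_1:\ \mu \neq \mu_0, && \text{reject } H_0 \text{ if } |t_{\text{calc}}| > t_{1-\alpha/2,\,df} \\
\text{Lower-tailed:} \quad & H_0:\ \mu \ge \mu_0 \ \text{ vs. }\ H_1:\ \mu < \mu_0,    && \text{reject } H_0 \text{ if } t_{\text{calc}} < -\,t_{1-\alpha,\,df} \\
\text{Upper-tailed:} \quad & H_0:\ \mu \le \mu_0 \ \text{ vs. }\ H_1:\ \mu > \mu_0,    && \text{reject } H_0 \text{ if } t_{\text{calc}} > t_{1-\alpha,\,df}
\end{aligned}
```

The two-sided test splits alpha across both tails, which is why its cut-off is farther from zero than either one-sided cut-off at the same alpha.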

The distribution that the test statistic follows is also needed to find the critical value. Tests comparing population means may follow a standard Normal or Student’s t distribution. ANOVA significance tests follow the F distribution. Chi Square is another common distribution of test statistics. t, F, and Chi Square all have one or more degrees of freedom parameters that are needed to find the critical value.
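As a minimal sketch of how the choice of tails changes the critical value, the snippet below uses the standard Normal distribution from Python’s standard library. Python is an assumption here (the article itself works in STATISTICA), and the helper name z_critical is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha, two_sided=False):
    """Positive critical value of a standard Normal test statistic.

    Hypothetical helper for illustration: returns z* such that the
    upper-tail area is alpha (one-sided) or alpha / 2 (two-sided).
    """
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

z_one = z_critical(0.05)                   # upper-tailed test: reject if z > z_one
z_two = z_critical(0.05, two_sided=True)   # two-sided test: reject if |z| > z_two
print(f"one-sided: {z_one:.4f}, two-sided: {z_two:.4f}")
```

At alpha = 0.05 these come out near the familiar 1.645 and 1.960. The t, F, and Chi Square distributions work the same way but also need their degrees of freedom.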

Using STATISTICA to Find a Test’s Critical Value

You can use STATISTICA’s Probability Distribution Calculator to find values or areas from various distributions. This tool can be used to find the critical value for a test. Once the distribution is selected, alpha is entered as well as any other required parameters such as degrees of freedom. Then the tool computes the critical value of the test.

For these examples, reading comprehension was measured for students in three grade levels who were administered one of two teaching methods.

Simple t-Test Example

In this one-sample t-test, researchers hypothesized that the average reading comprehension score is significantly different from 55. This is a two-sided test. Data was collected and the one-sample t-test was calculated with the results shown below.

[Figure: one-sample t-test results]

The calculated test statistic is -1.96166 and p = 0.054526. At alpha = 0.05, this two-sided test is not statistically significant because the p-value, 0.054526, is greater than alpha. Let’s find the critical value of the test.

On the Statistics tab in the Base group, click Basic Statistics to display the Basic Statistics and Tables dialog box. Select Probability calculator.


Click the OK button to display the Probability Distribution Calculator.

Since the test follows the t (Student) distribution, select this in the Distribution list. Select the Inverse, Two-tailed and (1-Cumulative p) check boxes. Enter 59 as df, which comes from the t-test output above. We are using alpha = 0.05, so enter this value for p. Click the Compute button to calculate the t critical value. It is found to be 2.000995.


Since the test has a two-sided alternative, the critical (rejection) region is t calc < -2.000995 or t calc > 2.000995. The computed test statistic is -1.96166, which lies between -2.000995 and 2.000995 and therefore outside the critical region, so the conclusion is to fail to reject the null hypothesis. There is insufficient evidence to conclude that the average reading comprehension score is significantly different from 55.
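The same lookup can be cross-checked outside STATISTICA. The sketch below is not how the Probability Distribution Calculator works internally; it is one simple way to invert the t distribution using only Python’s standard library, by integrating the density with Simpson’s rule and bisecting on the result. The function names and the two_tailed flag (which mirrors the calculator’s Two-tailed check box) are hypothetical.

```python
import math

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t, via Simpson's rule on the density."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the density on [0, |x|]
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df, two_tailed=True):
    """Positive critical value with upper-tail area alpha/2 (or alpha)."""
    target = 1 - (alpha / 2 if two_tailed else alpha)
    lo, hi = 0.0, 100.0                # assumed bracket, wide enough for typical use
    for _ in range(60):                # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_two = t_critical(0.05, 59)           # should be close to 2.000995
t_calc = -1.96166                      # test statistic from the t-test output
reject = abs(t_calc) > t_two           # two-sided decision rule
print(f"critical value: {t_two:.6f}, reject H0: {reject}")
```

With df = 59 and alpha = 0.05 the result should agree with the calculator’s 2.000995 to the displayed precision, and the decision is to fail to reject H0.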

Now let’s assume that the test had a one-sided alternative, i.e., we hypothesize that average reading comprehension scores are significantly less than 55. (Note that this example is for illustrating the calculation of critical values only. It is not an acceptable statistical practice to change your hypothesis to suit the data. The one-sided alternative critical values are calculated slightly differently, and this example aims to highlight this difference.)

The critical value for this test is found in much the same way, but the Two-tailed check box is cleared. Additionally, depending on the direction of the hypothesis, the (1-Cumulative p) check box may also be cleared. The alternative hypothesis is that µ < 55. The direction of the inequality also tells the direction of the test. For this test, the (1-Cumulative p) check box should be cleared.


For the one-sided test, the critical value is -1.671093. If t calc < -1.671093, reject H0. Here, -1.96166 < -1.671093, so reject H0. The conclusion for this test is different from the two-sided test, which failed to reject the null. Here, the conclusion would be that the average reading comprehension scores are significantly less than 55. (Note that the t-test output from STATISTICA gives p-values based on the two-sided alternative. For a one-sided alternative, divide the p-value by 2, provided the test statistic falls in the direction of the alternative hypothesis, as it does here.)
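The conversion from a two-sided to a one-sided p-value depends on the sign of the test statistic. A small sketch of the rule for a symmetric distribution such as t, with a hypothetical helper name and Python assumed as the language:

```python
def one_sided_p(two_sided_p, statistic, alternative):
    """Convert a two-sided p-value to a one-sided one (symmetric distributions).

    alternative is 'less' (H1: mean < hypothesized value) or 'greater'.
    Hypothetical helper for illustration only.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    # Does the statistic point in the direction claimed by the alternative?
    in_direction = statistic < 0 if alternative == "less" else statistic > 0
    half = two_sided_p / 2
    return half if in_direction else 1 - half

# Values from the t-test output above
p_less = one_sided_p(0.054526, -1.96166, "less")
print(f"one-sided p-value: {p_less:.6f}")   # half of 0.054526
```

Here the one-sided p-value is 0.027263, which is below alpha = 0.05 and matches the reject decision from the critical value. Had the alternative been that the mean is greater than 55, the same statistic would give 1 - 0.027263 instead.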

ANOVA F-Test Example

Using an example from an ANOVA analysis, let’s compute the critical F value for the significance test for an effect. The ANOVA table below tests for a significant effect of Method, Grade, and the interaction between the two on a student’s reading comprehension. This ANOVA output table gives the calculated F statistics, the associated p-values, and degrees of freedom for each test. (The F distribution requires two degrees of freedom parameters, numerator and denominator. The F statistics are calculated as MSeffect/MSerror. The degrees of freedom for the numerator and denominator come from the effect and error respectively. Additionally, F tests are always one-sided tests.) Using this information and the Probability Distribution Calculator, we can find the critical values of each test for a given alpha.

[Figure: univariate tests of significance (ANOVA table)]

In the Probability Distribution Calculator, select F (Fisher) in the Distribution list. Select the (1-Cumulative p) check box. Enter numerator and denominator degrees of freedom, df1 = 1 and df2 = 54. We are using alpha = 0.05, so enter this value for p.


Click Compute to calculate the critical F value: 4.019541. This is the critical value for the significance test for the METHOD effect.


Given that the null hypothesis is true, P(F calc > 4.019541) = 0.05, so if F calc > F critical = 4.019541, reject H0. The computed test statistic is 1.884, which is less than 4.019541, so the conclusion is to fail to reject the null hypothesis. There is insufficient evidence to conclude that there is a significant difference in the teaching methods.
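The F critical value can be cross-checked in the same spirit. The sketch below is an assumption-laden illustration rather than STATISTICA’s method: it evaluates the upper-tail probability of the F distribution through the regularized incomplete beta function (computed with a standard continued-fraction expansion) and then bisects to find the cut-off. All function names are hypothetical, and only Python’s standard library is used.

```python
import math

def _cont_frac(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    clamp = lambda v: tiny if abs(v) < tiny else v   # avoid division by zero
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / clamp(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))           # even term
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))  # odd term
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _cont_frac(a, b, x) / a
    return 1.0 - front * _cont_frac(b, a, 1.0 - x) / b

def f_upper_tail(f, df1, df2):
    """P(F > f) for an F distribution with (df1, df2) degrees of freedom."""
    return reg_inc_beta(df2 / 2, df1 / 2, df2 / (df2 + df1 * f))

def f_critical(alpha, df1, df2):
    """Value f* with P(F > f*) = alpha, found by bisection."""
    lo, hi = 0.0, 1e4                  # assumed bracket for the search
    for _ in range(100):
        mid = (lo + hi) / 2
        if f_upper_tail(mid, df1, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f_crit = f_critical(0.05, 1, 54)       # METHOD effect: df1 = 1, df2 = 54
reject = 1.884 > f_crit                # computed F statistic from the ANOVA table
print(f"critical F: {f_crit:.6f}, reject H0: {reject}")
```

The result should be close to the 4.019541 reported by the calculator. As a consistency check, with one numerator degree of freedom the F critical value equals the square of the two-sided t critical value for the same denominator degrees of freedom.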


Critical values can be calculated for any statistical test that follows a known distribution. The Probability Distribution Calculator makes it easy to find these test critical values.

Text Mining Series – Enhancing Your Analysis

Text Mining Series – Sources of Text for Mining

Text Mining Series – Automatically Classify Text Documents