Monthly Archives: March 2014
If the sheer volume of this week’s Twitter buzz is any indication, it is clear that Dell Software’s acquisition of Tulsa-based StatSoft (announced three days ago) has surprised, impressed, and befuddled many an observer of the advanced analytics space.
Within hours of Dell’s press release this past Monday morning, plenty of forward-thinking statements and opinions were already being expressed as bloggers and journalists trumpeted the information across social media channels.
For your reading pleasure, here is a short list of just some of the feedback I’ve been able to keep up with. To help you decide what to read, I have taken the liberty of noting what I found to be quick takeaways.
- Dell Boosts Big Data Holdings with StatSoft Purchase
by Todd Nevins
QUICK TAKEAWAY: Dell’s “biggest move yet”
- Dell Acquires StatSoft, Continues Push Toward Cloud and Software Services
by Shawn Hessinger
QUICK TAKEAWAY: Author wonders about impact to “small businesses.”
- Dell Acquires StatSoft
by Martin Butler
QUICK TAKEAWAY: “The StatSoft acquisition is a good move.”
- Dell Expands to Predictive Analytics with StatSoft Buy
by Jennifer LeClaire
QUICK TAKEAWAY: Includes Pund-IT Principal Analyst Charles King’s thoughts about the acquisition.
- Dell makes an uninspired acquisition of data-mining specialist StatSoft
by Derrick Harris
QUICK TAKEAWAY: Author thinks (wrongly) that corporate age and technical relevancy have an inverse relationship.
- Dell Made a Smart Move Buying StatSoft
by Virginia Backaitis
QUICK TAKEAWAY: “StatSoft holds the key that opens the door to computing’s third wave.”
- Dell jumpstarts advanced analytics strategy with StatSoft buy
by Maria Deutscher
QUICK TAKEAWAY: “Today’s acquisition of StatSoft signals [Dell’s] intent to leave the sidelines and establish a solid foothold for growth…on the mid-market.”
- Dell acquires StatSoft and the list of predictive platform vendors gets even shorter…
by Simon Arkell
QUICK TAKEAWAY: Author addresses the value and shortage of platform-based vendors (like StatSoft) as industry consolidates.
StatSoft is proud to announce today that we have joined forces with Dell and Dell’s Information Management Group, one of the largest providers of end-to-end BI and analytic solutions in the market. As of today, StatSoft is part of the Dell organization.
End-to-end advanced analytics solutions. For StatSoft and Dell customers, this means new opportunities and capabilities to enable leading edge analytics technologies to leverage the accelerating growth of data occurring in every industry, to achieve and retain industry leadership. Turning the torrents of data into actionable information is the fundamental mission of StatSoft as well as Dell’s Information Management Group. StatSoft’s big data predictive modeling and data mining solutions for various industries, combined with Dell’s wide range of data management and software capabilities and affordable, leading-edge, and comprehensively supported x86 server platforms can deliver big data analytics at a Dell price-point for unbeatable ROI.
Dell Software already offers a host of tools to manage data and databases across structured and unstructured data sources, including products such as Toad for Oracle, Toad for SQL Server, and Spotlight on SQL Server Enterprise. It also offers tools to integrate data and applications distributed across the organization, including products such as SharePlex and Dell Boomi, the latter of which was recently positioned by Gartner, Inc. in the “Leaders” Quadrant of the Magic Quadrant for Enterprise Integration Platform as a Service.
Making the World More Productive
We are excited to combine forces with Dell, whose shared resources provide myriad opportunities to leverage StatSoft’s analytic solutions in concert with Dell’s hardware solutions and through its numerous industry relationships, including those with SAP HANA, Oracle, Microsoft SQL and PDW, and Cloudera. We look forward to continued growth together with our distinguished list of successful customers in practically every industry, and we thank you for your support.
Information technology research and advisory firm Gartner has unveiled its Magic Quadrant for Advanced Analytics Platforms, publicly highlighting StatSoft’s position with top-tier “ability to execute” advanced solutions.
This particular quadrant report, new to Gartner’s offerings, was released February 24 in a brief session at the Gartner Business Intelligence and Information Management Summit in Sydney, Australia.
Darrel Amarasekera, Managing Director of StatSoft Pacific, was among the audience of vendors and executives, whom he described as “enthusiastic [and] very, very attentive” while Gartner Research Director Lisa Kart skimmed through the report’s contents.
Kart specifically drew the audience’s attention to StatSoft’s status as a new entrant among the top three vendors capable of executing advanced solutions. She shared with the audience some of the strengths of the STATISTICA platform. In the downloadable report (sign-in required), these strengths include STATISTICA’s wide range of functionality with a broad variety of data types; high customer satisfaction with advanced descriptive and predictive analytics; and scalability. In addition, StatSoft received some of the highest evaluations for product reliability and upgrade experience, and STATISTICA was most frequently selected for license cost and speed of model deployment.
Previously, Gartner analysts had combined business intelligence with analytics in their annual Magic Quadrant research reports. However, recent industry changes with big data and predictive analytics have prompted them to develop a standalone “Advanced Analytics” category this year.
Gartner clients can access the complete Advanced Analytics Magic Quadrant report online.
STATISTICA reduces emissions spikes & associated costs with #PredictiveAnalytics at coal coking plant, pays for itself in 6 months
This past spring, Mayato, a data mining and business analytics consulting company based in Germany, conducted its annual study of data mining tools.
The 2013 study focused on multi-media analytics solutions and pitted several major software vendors against one another. Once again, STATISTICA scored very highly and earned top ranking for user friendliness.
Of over 150 analytics tools on the market, Mayato included STATISTICA among its selection of four data mining suites whose functionality they consider to be comprehensive:
- StatSoft: STATISTICA Professional 12
- IBM SPSS Statistics Professional 21
- SAS Enterprise Guide 5.1
- Rapid-I: RapidMiner 5.3 / R (open-source)
Each tool had to prove itself in a test scenario covering all phases of a typical analysis project: from data import through the creation of forecasting models (linear regression) to the interpretation of results. Factors affecting the user experience—stability, speed, documentation, and operation—were also evaluated.
Analyst Peter Neckel at ComputerWoche magazine reviewed the study and its competitors in a German-language article published April 25, 2013.
Neckel noted that STATISTICA outstripped the competitive field in the area of user friendliness, thanks to its modern and consistent user interface for all tasks and products. He also expressed appreciation for STATISTICA’s abundant variety of functions, especially regarding the number of available regression, data preparation, and parameterization methods.
Mayato conducted its field test on a sample of real data sets from JustBook, a hotel booking apps provider seeking to distribute its marketing budget efficiently across online and offline channels.
Complete study results are available at http://www.mayato.com.
Our previous How-To article, How to Deploy Models Using SVB Nodes, covered a topic that is becoming increasingly important, especially in data mining applications with a graphical user interface working with nodes that represent data mining algorithms. Rajiv Bhattarai covered the primary topic of deployment using the original STATISTICA Visual Basic (SVB) nodes. As STATISTICA reflects the rapid advances in technology and makes significant investments to remain a leader in predictive analytics, new nodes have been developed. This is a source of many questions, and this article will help to describe the differences between the scripted SVB nodes and the new STATISTICA Workspace nodes. Further, it will be shown how using the new nodes makes model deployment easier than ever.
- Before the node is run, it will appear with a yellow background. When the node is run, the background will turn from yellow to clear, an indication that you have completed the analysis.
- Additional functionality is represented by icons on the node:
  - Nodes are run by clicking the green arrow icon located at the lower-left corner of the analysis node.
  - Parameters can be edited by clicking the grey gear icon at the upper-left corner of the node.
  - Node results can be viewed by clicking the report icon at the upper-right corner of the node.
  - Downstream results are indicated by a document icon at the lower-right corner of the node.
  - Nodes can be connected by clicking the gold diamond icon at the center-right side of the node, holding down, and drawing an arrow to another node where you can release the click, thereby attaching two nodes together.
- Variable selection can be performed on the analysis node.
- The functionality of the node closely resembles the functionality of the respective interactive analysis. As you can see with the results options for the Boosted Classification Trees above, in the results alone, you have much more control over what output is provided upon completion of the analysis.
- Deployment functionality is built into the node.
The primary goal of churn analysis is to identify those customers who are most likely to discontinue using your service or product. In the dynamic financial services industry, companies are progressively providing products and services with similar features. Amidst this ever-growing competition, the cost of acquiring a new customer typically exceeds the cost of retaining a current customer. Existing customers are a valuable asset. Furthermore, given the nature of the financial services industry, where customers generally tend to stay with a company for a longer term, churning could lead to substantial revenue loss.
With StatSoft’s Churn Analysis Solution, you can identify customers who are likely to churn by making precise predictions, reveal customer segments and reasons for leaving, engage with customers to improve communication and loyalty, calculate attrition rates, and develop effective marketing campaigns to target customers and increase profitability. With STATISTICA’s advanced modeling algorithms and wide array of state-of-the-art tools, you can develop powerful models that aid in accurate prediction of customer behavior and trends and help you avoid losing customers.
- Batch or Real-Time Processing: Use the models you have built to determine churn and indicate, either by batch or in real-time, the customers who are likely to transfer their business to another company.
- Cutting-edge Predictive Analytics: STATISTICA provides a wide variety of basic to sophisticated algorithms to build models which provide the most lift and highest accuracy for improved churn analysis.
- Innovative Data Pre-processing Tools: STATISTICA provides a very comprehensive list of data management and data visualization tools.
- Integrated Workflow: STATISTICA Decisioning Platform provides a streamlined workflow for powerful, rules-based, predictive analytics where business rules and industry regulations are used in conjunction with advanced analytics to build the best models.
- Optimized Results: Compare the latest data mining algorithms side-by-side to determine which models provide the most gain. Produce profit charts with ease.
- Role-Based, Enterprise-Wide Scope: If yours is a multi-user collaborative environment, you can use STATISTICA Enterprise to share data, improve churn models, and benefit from collaborative work with small or large groups.
- Text Mining Unstructured Data: Improve churn models by using powerful text mining algorithms to incorporate unstructured data currently sitting unused in storage.
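Two of the quantities mentioned above, the attrition rate and the lift a churn model provides, can be illustrated with a few lines of plain Python. This is only a minimal sketch and not StatSoft’s implementation: the scored customer list is made-up illustration data, and the helper names `attrition_rate` and `lift_at` are hypothetical. It assumes each customer already has a predicted churn probability from some model.

```python
# Hypothetical scored customers: (predicted churn probability, actually churned?).
# These values are invented purely for illustration.
scored = [
    (0.92, True), (0.85, True), (0.80, False), (0.70, True), (0.55, False),
    (0.40, False), (0.35, True), (0.20, False), (0.10, False), (0.05, False),
]


def attrition_rate(rows):
    """Fraction of customers in `rows` who actually churned."""
    return sum(1 for _, churned in rows if churned) / len(rows)


def lift_at(rows, fraction):
    """Churn rate among the top-scored `fraction` of customers,
    divided by the overall churn rate."""
    ranked = sorted(rows, key=lambda r: r[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))  # size of the targeted group
    return attrition_rate(ranked[:k]) / attrition_rate(rows)


overall = attrition_rate(scored)
top_lift = lift_at(scored, 0.3)
print(f"overall attrition rate: {overall:.2f}")
print(f"lift in top 30% of scores: {top_lift:.2f}")
```

A lift above 1 means that contacting only the highest-scored customers reaches proportionally more churners than contacting customers at random, which is the sense in which one model can provide "more lift" than another.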
Perhaps some readers are aware of Sheena Iyengar’s (classic) jam choice study from 1995, in which a grocery market try-before-you-buy display was set up with 24 sample jars of jam, alternated every few hours with a much smaller display of 6 jars. As described in the NY Times, considerably more customers were drawn to the larger display; however, the share of those customers who went on to buy was only 1/10 the share who bought from the limited 6-jar display. Professor Iyengar hypothesized that “the presence of choice might be appealing as a theory, but in reality, people might find more and more choice to actually be debilitating.”
Certainly, given that the availability of choices does have some value, data categorization is important. But when I ran across Seth Redmore’s recent post about his musical background and the size and scope of musical genres on the market today, I could not believe what he had discovered: a laughably over-zealous list of electronic music categories. Thousands of them.
I am by no means a music industry expert, but it seems clear that when a musician/composer arbitrarily invents a unique name for his personal “brand” of music, such action does not mean a new genre has officially come into being. After all, we are talking about classification of “unstructured” content here (i.e., music), not a scientific taxonomy. As a practical matter in the real world where decisions are made, the differentiation of these so-called genres and sub-genres exists only in the minds of the (likely self-absorbed) composers who coined their names.
From a data collection standpoint, the more categories assigned, the greater the chance of miscategorization, misinterpretation, and confusion. This would only hinder the “shared understanding” Mr. Redmore says can be achieved with data categorization, even if music providers claim such categorization is intended to help consumers find exactly what they want.
My counter-intuitive point here (and maybe Redmore’s, too) is that the consumer cannot possibly know what he wants when faced with so many non-standardized music choices with ridiculously similar genre names like ritual ambient v. black ambient v. doom ambient v. drone ambient v. deep ambient v. death ambient. Mr. Redmore even mentions Netflix with its nearly 77K movie categories! From a marketing standpoint, that is crazy. There is simply no practical reason to attempt the creation of big data where such breadth is detrimental to decision-making. And this would be true whether in the online music room or in the executive board room.