Poll Results: StatSoft Statistica becomes the most popular commercial tool!

For the first time, the number of users of free/open source software exceeded the number of users of commercial software. The usage of big data software grew five-fold. R, Excel, and RapidMiner were the most popular tools, with StatSoft Statistica taking the top commercial tool spot.

The 13th annual KDnuggets Software Poll asked:

What analytics/data mining software you used in the past 12 months for a real project (not just evaluation)

About 28% used commercial software but not free software, 30% used free software but not commercial, and 41% used both.
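The headline comparison follows from these three shares, since anyone who used both kinds of software counts toward each side. A minimal sketch of that arithmetic, using the rounded figures above (which is why the three shares sum to 99% rather than 100%); the variable names are illustrative only:

```python
# Shares of respondents as reported in the poll summary (rounded percentages).
commercial_only = 28
free_only = 30
both = 41

# Respondents who used both kinds of software count toward each total.
free_users = free_only + both
commercial_users = commercial_only + both

print(f"Used free/open source software: {free_users}%")
print(f"Used commercial software: {commercial_users}%")
print(f"Free exceeds commercial: {free_users > commercial_users}")
```

With these rounded inputs, free/open source users come out about two percentage points ahead of commercial users.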

The usage of big data tools grew five-fold: 15% used them in 2012, vs about 3% in 2011.

R, Excel, and RapidMiner are the most popular tools, with StatSoft Statistica becoming the most popular commercial tool, getting more votes than SAS (in part due to a more active campaign from StatSoft users and the lack of such a campaign from SAS).

Among those who wrote analytics code in lower-level languages, R, SQL, Java, and Python were most popular.

This poll also had a very large number of participants and used email verification and other measures to remove unnatural votes (*see note below).

 

What Analytics, Data mining, Big Data software you used in the past 12 months for a real project (not just evaluation) [798 voters]
Tool (votes), % users in 2012, % users in 2011
R (245) 30.7% in 2012, 23.3% in 2011
Excel (238) 29.8% in 2012, 21.8% in 2011
Rapid-I RapidMiner (213) 26.7% in 2012, 27.7% in 2011
KNIME (174) 21.8% in 2012, 12.1% in 2011
Weka / Pentaho (118) 14.8% in 2012, 11.8% in 2011
StatSoft Statistica (112) 14.0% in 2012, 8.5% in 2011
SAS (101) 12.7% in 2012, 13.6% in 2011
Rapid-I RapidAnalytics (83) 10.4% in 2012, not asked in 2011
MATLAB (80) 10.0% in 2012, 7.2% in 2011
IBM SPSS Statistics (62) 7.8% in 2012, 7.2% in 2011
IBM SPSS Modeler (54) 6.8% in 2012, 8.3% in 2011
SAS Enterprise Miner (46) 5.8% in 2012, 7.1% in 2011
Orange (42) 5.3% in 2012, 1.3% in 2011
Microsoft SQL Server (40) 5.0% in 2012, 4.9% in 2011
Other free analytics/data mining software (39) 4.9% in 2012, 4.1% in 2011
TIBCO Spotfire / S+ / Miner (37) 4.6% in 2012, 1.7% in 2011
Oracle Data Miner (35) 4.4% in 2012, 0.7% in 2011
Tableau (35) 4.4% in 2012, 2.6% in 2011
JMP (32) 4.0% in 2012, 5.7% in 2011
Other commercial analytics/data mining software (32) 4.0% in 2012, 3.2% in 2011
Mathematica (23) 2.9% in 2012, 1.6% in 2011
Miner3D (19) 2.4% in 2012, 1.3% in 2011
IBM Cognos (16) 2.0% in 2012, not asked in 2011
Stata (15) 1.9% in 2012, 0.8% in 2011
Bayesia (14) 1.8% in 2012, 0.8% in 2011
KXEN (14) 1.8% in 2012, 1.4% in 2011
Zementis (14) 1.8% in 2012, 3.7% in 2011
C4.5/C5.0/See5 (13) 1.6% in 2012, 1.9% in 2011
Revolution Computing (11) 1.4% in 2012, 1.4% in 2011
Salford SPM/CART/MARS/TreeNet/RF (9) 1.1% in 2012, 10.6% in 2011
Angoss (7) 0.9% in 2012, 0.8% in 2011
SAP (including BusinessObjects/Sybase/Hana) (7) 0.9% in 2012, not asked in 2011
XLSTAT (7) 0.9% in 2012, 0.9% in 2011
RapidInsight/Veera (5) 0.6% in 2012, not asked in 2011
11 Ants Analytics (4) 0.5% in 2012, 5.6% in 2011
Teradata Miner (4) 0.5% in 2012, not asked in 2011
Predixion Software (3) 0.4% in 2012, 0.5% in 2011
WordStat (3) 0.4% in 2012, 0.5% in 2011
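
Each 2012 percentage appears to be the tool's vote count divided by the 798 voters. Because respondents could select several tools, the percentages add up to more than 100%. The sketch below reproduces a few of the 2012 figures from the vote counts; `usage_percent` is an illustrative helper, not part of the poll's methodology, and rounding to one decimal place is an assumption that happens to match the published numbers:

```python
TOTAL_VOTERS = 798  # number of voters reported for the 2012 poll

# Vote counts for a handful of tools, copied from the results above.
votes = {
    "R": 245,
    "Excel": 238,
    "Rapid-I RapidMiner": 213,
    "StatSoft Statistica": 112,
    "SAS": 101,
}

def usage_percent(count, total=TOTAL_VOTERS):
    """Share of voters who selected a tool, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

shares = {tool: usage_percent(count) for tool, count in votes.items()}

for tool, share in shares.items():
    print(f"{tool}: {share}%")
```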

Among tools with at least 10 users, the tools with the highest increase in “usage percent” were

  • Oracle Data Miner, 4.4% in 2012, up from 0.7% in 2011, 505% increase
  • Orange, 5.3% from 1.3%, 315% increase
  • TIBCO Spotfire / S+ / Miner, 4.6% from 1.7%, 169% increase
  • Stata, 1.9% from 0.8%, 130% increase
  • Bayesia, 1.8% from 0.8%, 115% increase
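
The increase in usage percent is a relative change, (new − old) / old × 100. The sketch below recomputes it from the rounded percentages listed above. The results land within a few points of the published increases for most tools and about 24 points high for Oracle Data Miner. The published figures were presumably derived from unrounded percentages, though that is an assumption, not something the poll states. `percent_increase` is an illustrative helper name:

```python
def percent_increase(old, new):
    """Relative change from old to new, expressed as a percentage of old."""
    return 100 * (new - old) / old

# (2011 %, 2012 %) pairs as published, rounded to one decimal place.
rounded_usage = {
    "Oracle Data Miner": (0.7, 4.4),
    "Orange": (1.3, 5.3),
    "TIBCO Spotfire / S+ / Miner": (1.7, 4.6),
    "Stata": (0.8, 1.9),
    "Bayesia": (0.8, 1.8),
}

increases = {
    tool: percent_increase(old, new)
    for tool, (old, new) in rounded_usage.items()
}

for tool, change in increases.items():
    print(f"{tool}: about {change:.0f}% increase (from rounded inputs)")
```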

The three tools with the highest decrease in usage percent were 11 Ants Analytics, Salford SPM/CART/MARS/TreeNet/RF, and Zementis. Their dramatic decrease is probably due to vendors doing much less (or nothing) to encourage their users to vote in 2012 compared with 2011.

Note: Three tools received fewer than 3 votes and were not included in the above table: Clarabridge, Megaputer Polyanalyst/TextAnalyst, and Grapheur/LIONsolver.

Big Data

Use of big data tools grew five-fold, from about 3% of respondents in 2011 to about 15% in 2012.

Big Data software you used in the past 12 months
Apache Hadoop/Hbase/Pig/Hive (67) 8.4%
Amazon Web Services (AWS) (36) 4.5%
NoSQL databases (33) 4.1%
Other Big Data/Cloud analytics software (21) 2.6%
Other Hadoop-based tools (10) 1.3%

We also asked about the popularity of individual languages for data mining. Note that R is included in this table as well as among the higher-level tools above.

Your own code you used for analytics/data mining in the past 12 months in:
R (245) 30.7%
SQL (185) 23.2%
Java (138) 17.3%
Python (119) 14.9%
C/C++ (66) 8.3%
Other languages (57) 7.1%
Perl (37) 4.6%
Awk/Gawk/Shell (31) 3.9%
F# (5) 0.6%


*Note on vote cleaning: To reduce multiple voting, this poll used email verification, which reduced the total number of votes compared to 2011 but made the results more representative.
Furthermore, some vendors were much more active than others in recruiting their users. To give a more objective picture of tool popularity, a large number (over 100) of the “unnatural” votes were removed, leaving 798 votes.


About statsoftsa

StatSoft, Inc. was founded in 1984 and is now one of the largest providers of analytic software worldwide. StatSoft is also the largest manufacturer of enterprise-wide quality control and improvement software systems in the world, and the only company capable of supporting its QC products worldwide, with wholly owned subsidiaries in all major markets (StatSoft has 23 full-service offices, on all continents). Its software is available in more than 10 languages.

Posted on June 1, 2012.
