STATISTICA Text Miner
STATISTICA Text Miner is an optional extension of STATISTICA Data Miner, ideal for translating unstructured text data into meaningful, valuable clusters of decision-making “gold.”
As most users familiar with text mining already know, real-world data comes in a variety of forms and is not always organized or ready to analyze. Text mining digs for underlying information that is not readily apparent in traditional structured data. These data sources can also be extremely large. STATISTICA Text Miner is optimized, and has recently been further enhanced, for working with such data.
How Can You Use STATISTICA Text Miner?
- Analyze the contents of Web pages. For example, users can automatically process and summarize all Web pages of particular companies, message boards, etc.
- Include unstructured notes in predictive data mining projects. For example, users may include responses to open-ended interview questions, patients’ own descriptions of medical symptoms, etc. in data mining projects involving the clustering of patients and symptoms.
- Analyze large document repositories. For example, users may analyze repositories of documents such as narratives of insurance claims, etc., to include such information in fraud detection projects.
STATISTICA Text Miner was specifically designed as a general, open-architecture tool for mining unstructured information. Its feature extraction, feature selection, and other analytic tools apply not only to text documents and Web pages; they can also be used to index, classify, cluster, or otherwise include in your analyses other unstructured information, such as pre-processed bitmaps imported as data matrices.
- Accessing Documents
- Processing Documents
- Analyzing Documents
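The three stages listed above (accessing, processing, and analyzing documents) can be illustrated with a minimal, generic Python sketch. This is not STATISTICA Text Miner's API or its algorithm; the function names, the stop-word list, and the sample insurance-claim narratives are all hypothetical, chosen only to show how unstructured text can be turned into numeric features and then compared.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; real tools ship much larger, language-specific lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "was", "is", "on"}


def access_documents():
    """Accessing: return documents by name (hard-coded sample data for illustration)."""
    return {
        "claim_1": "Water damage to the kitchen floor after a pipe burst.",
        "claim_2": "Pipe burst in the kitchen caused water damage to cabinets.",
        "claim_3": "Rear bumper dented in a parking lot collision.",
    }


def process_document(text):
    """Processing: lowercase, tokenize, drop stop words, and count term frequencies."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def analyze(docs):
    """Analyzing: compute pairwise similarity between all documents."""
    vectors = {name: process_document(text) for name, text in docs.items()}
    names = sorted(vectors)
    return {
        (x, y): cosine_similarity(vectors[x], vectors[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


similarities = analyze(access_documents())
for pair, score in similarities.items():
    print(pair, round(score, 3))
```

In this sketch the two water-damage narratives share several terms and therefore score as more similar to each other than either does to the collision narrative. The resulting term counts or similarity scores are the kind of numeric summary that could then feed a clustering or classification step.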
Integration with STATISTICA, STATISTICA Data Miner, and STATISTICA Enterprise
STATISTICA Text Miner is fully integrated into the STATISTICA line of software. It is not a stand-alone product manufactured by another vendor and “connected” to STATISTICA. Text mining functionality can be integrated into the STATISTICA Data Miner workspace environment, STATISTICA Enterprise, or custom STATISTICA applications.
For example, a customer may:
- automatically access data stored in a data warehouse
- update certain analyses and numeric summaries of the textual information
- publish results to authorized users via the Internet
STATISTICA Text Miner is scalable and uses multithreaded computing technology to extract optimum performance from advanced multiple-processor server hardware.