
Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS PDF

460 Pages·2014·17.492 MB·English

Preview Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS

Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS®
Goutam Chakraborty, Murali Pagolu, Satish Garla
support.sas.com/bookstore

The correct bibliographic citation for this manual is as follows: Chakraborty, Goutam, Murali Pagolu, and Satish Garla. 2013. Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS®. Cary, NC: SAS Institute Inc.

Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS®
Copyright © 2013, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-61290-787-1
All rights reserved. Produced in the United States of America.

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated.

U.S. Government Restricted Rights: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).

SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414.

November 2013

SAS provides a complete selection of books and electronic products to help customers use SAS® software to its fullest potential. For more information about our offerings, visit support.sas.com/bookstore or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

Contents

About This Book
About The Authors
Acknowledgments

Chapter 1 Introduction to Text Analytics
  Overview of Text Analytics
  Text Mining Using SAS Text Miner
  Information Retrieval
  Document Classification
  Ontology Management
  Information Extraction
  Clustering
  Trend Analysis
  Enhancing Predictive Models Using Exploratory Text Mining
  Sentiment Analysis
  Emerging Directions
  Handling Big (Text) Data
  Voice Mining
  Real-Time Text Analytics
  Summary
  References

Chapter 2 Information Extraction Using SAS Crawler
  Introduction to Information Extraction and Organization
  SAS Crawler
  SAS Search and Indexing
  SAS Information Retrieval Studio Interface
  Web Crawler
  Breadth First
  Depth First
  Web Crawling: Real-World Applications and Examples
  Understanding Core Component Servers
  Proxy Server
  Pipeline Server
  Component Servers of SAS Search and Indexing
  Indexing Server
  Query Server
  Query Web Server
  Query Statistics Server
  SAS Markup Matcher Server
  Summary
  References

Chapter 3 Importing Textual Data into SAS Text Miner
  An Introduction to SAS Enterprise Miner and SAS Text Miner
  Data Types, Roles, and Levels in SAS Text Miner
  Creating a Data Source in SAS Enterprise Miner
  Importing Textual Data into SAS
  Importing Data into SAS Text Miner Using the Text Import Node
  %TMFILTER Macro
  Importing XLS and XML Files into SAS Text Miner
  Managing Text Using SAS Character Functions
  Summary
  References

Chapter 4 Parsing and Extracting Features
  Introduction
  Tokens and Words
  Lemmatization
  POS Tags
  Parsing Tree
  Text Parsing Node in SAS Text Miner
  Stemming and Synonyms
  Identifying Parts of Speech
  Using Start and Stop Lists
  Spell Checking
  Entities
  Building Custom Entities Using SAS Contextual Extraction Studio
  Summary
  References

Chapter 5 Data Transformation
  Introduction
  Zipf’s Law
  Term-By-Document Matrix
  Text Filter Node
  Frequency Weightings
  Term Weightings
  Filtering Documents
  Concept Links
  Summary
  References

Chapter 6 Clustering and Topic Extraction
  Introduction
  What Is Clustering?
  Singular Value Decomposition and Latent Semantic Indexing
  Topic Extraction
  Scoring
  Summary
  References

Chapter 7 Content Management
  Introduction
  Content Categorization
  Types of Taxonomy
  Statistical Categorizer
  Rule-Based Categorizer
  Comparison of Statistical versus Rule-Based Categorizers
  Determining Category Membership
  Concept Extraction
  Contextual Extraction
  CLASSIFIER Definition
  SEQUENCE and PREDICATE_RULE Definitions
  Automatic Generation of Categorization Rules Using SAS Text Miner
  Differences between Text Clustering and Content Categorization
  Summary
  Appendix
  References

Chapter 8 Sentiment Analysis
  Introduction
  Basics of Sentiment Analysis
  Challenges in Conducting Sentiment Analysis
  Unsupervised versus Supervised Sentiment Classification
  SAS Sentiment Analysis Studio Overview
  Statistical Models in SAS Sentiment Analysis Studio
  Rule-Based Models in SAS Sentiment Analysis Studio
  SAS Text Miner and SAS Sentiment Analysis Studio
  Summary
  References

Case Studies

Case Study 1 Text Mining SUGI/SAS Global Forum Paper Abstracts to Reveal Trends
  Introduction
  Data
  Results
  Trends
  Summary
  Instructions for Accessing the Case Study Project

Case Study 2 Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions
  Introduction
  Objective
  Step-by-Step Instructions
  Summary

Case Study 3 Features-based Sentiment Analysis of Customer Reviews
  Introduction
  Data
  Text Mining for Negative App Reviews
  Text Mining for Positive App Reviews
  NLP Based Sentiment Analysis
  Summary

Case Study 4 Exploring Injury Data for Root Causal and Association Analysis
  Introduction
  Objective
  Data Description
  Step-by-Step Instructions
  Part 1: SAS Text Miner
  Part 2: SAS Enterprise Content Categorization
  Summary

Case Study 5 Enhancing Predictive Models Using Textual Data
  Data Description
  Step-by-Step Instructions
  Summary

Case Study 6 Opinion Mining of Professional Drivers’ Feedback
  Introduction
  Data
  Analysis Using SAS® Text Miner
  Analysis Using the Text Rule-builder Node
  Summary

Case Study 7 Information Organization and Access of Enron Emails to Help Investigation
  Introduction
  Objective
  Step-by-Step Software Instruction with Settings/Properties
  Summary

Case Study 8 Unleashing the Power of Unified Text Analytics to Categorize Call Center Data
  Introduction
  Data Description
  Examining Topics
  Merging or Splitting Topics
  Categorizing Content
  Concept Map Visualization
  Using PROC DS2 for Deployment
  DEPLOYMENT
  Integrating with SAS® Visual Analytics
  Summary

Case Study 9 Evaluating Health Provider Service Performance Using Textual Responses
  Introduction
  Summary

Index

About This Book

Purpose

Analytics is the key driver of how organizations make business decisions to gain competitive advantage. While the popular press has been abuzz with Big Data, we believe in “it is the analysis, stupid.” Having Big Data means little if that data is not leveraged via analytics to create better value for all stakeholders. One of the primary drivers of Big Data is the advent of social media that has exponentially increased the rate at which textual data is generated on the Internet and the World Wide Web. In addition to data generated via the Internet and the web, organizations have large repositories of textual data collected via forms, reports, customer surveys, voice-of-customers, call-center records and so on. There are numerous organizations that simply collect and store large volumes of unstructured text data, which are yet to be explored to uncover hidden nuggets of useful information that can benefit their business. However, there are not a lot of resources available that can efficiently handle text data for the business analyst community. This book is designed to help industries leverage their textual data and SAS tools to perform comprehensive text analytics.
