
Big Data PDF

252 Pages·2017·13.118 MB·English

Preview Big Data

BIG DATA

About the Author

Dr. Anil Maheshwari is a Professor of Management Information Systems and Director of Center for Data Analytics at Maharishi University of Management, Fairfield, Iowa, USA. He received a bachelor’s degree in Electrical Engineering from Indian Institute of Technology (IIT) Delhi, and further received his MBA degree from Indian Institute of Management (IIM) Ahmedabad. He earned his PhD from Case Western Reserve University, Cleveland, Ohio. He has been a Professor at the University of Cincinnati, City University of New York, among others. His research has been published in prestigious journals and conferences. He teaches data analytics, big data, leadership, and marketing. He has authored many books in data science and leadership.

He has also worked in the global IT industry for over 20 years, including leadership roles at IBM in Austin, Texas. He has completed various leadership and marketing training programs at IBM and also won several awards. He is a practitioner of Transcendental Meditation (TM) and TM-Sidhi techniques. He blogs on IT and Enlightenment at anilmah.com, and is a popular speaker on those topics. He is also a marathoner. He can be reached at [email protected].

BIG DATA

Dr. Anil Maheshwari
Professor of Management Information Systems and Director of Center for Data Analytics
Maharishi University of Management
Fairfield, Iowa, USA.

McGraw Hill Education (India) Private Limited
CHENNAI

McGraw Hill Education Offices: Chennai, New York, St Louis, San Francisco, Auckland, Bogotá, Caracas, Kuala Lumpur, Lisbon, London, Madrid, Mexico City, Milan, Montreal, San Juan, Santiago, Singapore, Sydney, Tokyo, Toronto

Published by McGraw Hill Education (India) Private Limited
444/1, Sri Ekambara Naicker Industrial Estate, Alapakkam, Porur, Chennai - 600 116

Big Data

Copyright © 2017, by Dr. Anil Maheshwari

No part of this publication may be reproduced or distributed in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise or stored in a database or retrieval system without the prior written permission of the publishers. The program listings (if any) may be entered, stored and executed in a computer system, but they may not be reproduced for publication.

This edition can be exported from India only by the publishers, McGraw Hill Education (India) Private Limited

Print Edition:
ISBN-13: 978-93-5260-502-6
ISBN-10: 93-5260-502-0

E-Book Edition:
ISBN-13: 978-93-5260-454-8
ISBN-10: 93-5260-454-7

Managing Director: Kaushik Bellani
Director, Science & Engineering Portfolio: Vibha Mahajan
Lead, Science & Engineering Portfolio: Hemant Jha
Content Development Lead: Shalini Jha
Content Developer: Shehla Mirza
Production Head: Satinder S Baveja
Sr. Manager, Production: Piyaray Pandita
General Manager, Production: Rajender P Ghansela
Manager, Production: Reji Kumar

Information contained in this work has been obtained by McGraw Hill Education (India), from sources believed to be reliable. However, neither McGraw Hill Education (India) nor its authors guarantee the accuracy or completeness of any information published herein, and neither McGraw Hill Education (India) nor its authors shall be responsible for any errors, omissions, or damages arising out of use of this information. This work is published with the understanding that McGraw Hill Education (India) and its authors are supplying information but are not attempting to render engineering or other professional services. If such services are required, the assistance of an appropriate professional should be sought.

Typeset at The Composers, 260, C.A. Apt., Paschim Vihar, New Delhi 110 063 and printed at
Cover Printer:
Visit us at: www.mheducation.co.in

Dedicated to
My family of three lovely ladies
Wife Neerja, and daughters Ankita and Nupur
who taught me love, care and gentleness

Preface

Big Data is a new, and inclusive, natural phenomenon, as big and messy as nature itself. It requires a new kind of consciousness to fathom its scale and scope, and its many opportunities and challenges. Understanding the essentials of Big Data requires suspending many conventional expectations and assumptions about data ... such as completeness, clarity, consistency, and conciseness. Fathoming and taming the multi-layered Big Data is a dream that is slowly becoming a reality. It is a rapidly evolving field that is growing exponentially in value and capabilities.

There are a growing number of books being written on Big Data. They fall mostly in two categories. There are those that focus on business aspects, and discuss the strategic internal shifts required for reaping the business benefits from the many opportunities offered by Big Data. Then there are those that focus on particular technology platforms, such as Hadoop or Spark. This book aims to bring together the business context and the technologies in a seamless way.

Thanks to Maharishi Mahesh Yogi for creating a wonderful university whose consciousness-based environment made writing this evolutionary book possible. Thanks to many current and former students for contributing to this book. Dheeraj Pandey assisted with the Weblog analyzer application and its details. Suraj Thapalia assisted with the Hadoop installation guide. Enkh Tseeleesuren helped write the Spark tutorial.

Thanks to my family for supporting me in this process. My daughters Ankita and Nupur reviewed the book and made helpful comments. My father Mr. R L Maheshwari and brother Dr. Sunil Maheshwari also read the book and enthusiastically approved it. My colleague Dr. Edi Shivaji too reviewed the book.

May the Big Data Force be with you!

Dr. Anil Maheshwari
May 2017, Fairfield, IA

Contents

Preface

1. Wholeness of Big Data
    Introduction
    1.1 Understanding Big Data
    1.2 Capturing Big Data
        1.2.1 Volume of Data
        1.2.2 Velocity of Data
        1.2.3 Variety of Data
        1.2.4 Veracity of Data
    1.3 Benefitting from Big Data
    1.4 Management of Big Data
    1.5 Organizing Big Data
    1.6 Analyzing Big Data
    1.7 Technology Challenges for Big Data
        1.7.1 Storing Huge Volumes
        1.7.2 Ingesting Streams at an Extremely Fast Pace
        1.7.3 Handling a Variety of Forms and Functions of Data
        1.7.4 Processing Data at Huge Speeds
    1.8 Conclusion
    1.9 Organization of the Rest of the Book
    Review Questions
    True/False Questions

SECTION 1

2. Big Data Sources and Applications
    Introduction
    2.1 Big Data Sources
        2.1.1 People-to-People Communications
        2.1.2 People-to-Machine Communications
    2.2 Machine-to-Machine (M2M) Communications
        2.2.1 RFID Tags
        2.2.2 Sensors
    2.3 Big Data Applications
        2.3.1 Monitoring and Tracking Applications
        2.3.2 Analysis and Insight Applications
        2.3.3 New Product Development
    2.4 Conclusion
    Review Questions
    True/False Questions

3. Big Data Architecture
    Introduction
    3.1 Standard Big Data Architecture
    3.2 Big Data Architecture Examples
        3.2.1 IBM Watson
        3.2.2 Netflix
        3.2.3 eBay
        3.2.4 VMWare
        3.2.5 The Weather Company
        3.2.6 TicketMaster
        3.2.7 LinkedIn
        3.2.8 Paypal
        3.2.9 CERN
    3.3 Conclusion
    Review Questions
    True/False Questions

SECTION 2

4. Distributed Computing Using Hadoop
    Introduction
    4.1 Hadoop Framework
    4.2 HDFS Design Goals
    4.3 Master-Slave Architecture
    4.4 Block System
    4.5 Ensuring Data Integrity
