Chapman & Hall/CRC
Computer Science & Engineering / Data Mining and Knowledge Discovery
Big Data Series
BIG DATA
Algorithms, Analytics, and Applications
Edited by
Kuan-Ching Li
Hai Jiang
Laurence T. Yang
Alfredo Cuzzocrea
The collection presented in the book covers fundamental and realistic issues about Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields. ... This book is required understanding for anyone working in a major field of science, engineering, business, and financing.
—Jack Dongarra, University of Tennessee
The editors have assembled an impressive book consisting of 22 chapters written by 57 authors from 12 countries across America, Europe, and Asia. ... This book has great potential to provide fundamental insight and privacy to individuals, long-lasting value to organizations, and security and sustainability to the cyber–physical–social ecosystem ....
—D. Frank Hsu, Fordham University
These editors are active researchers and have done a lot of work in the area of Big Data. They assembled a group of outstanding chapter authors. ... Each section contains several case studies to demonstrate how the related issues are addressed. ... I highly recommend this timely and valuable book. I believe that it will benefit many readers and contribute to the further development of Big Data research.
—Dr. Yi Pan, Georgia State University
Presenting the contributions of leading experts in their respective fields, Big Data: Algorithms, Analytics, and Applications bridges the gap between the vastness of big data and the appropriate computational methods for scientific and social discovery. It covers fundamental issues about Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields such as medicine, science, and engineering.
Overall, the book reports on state-of-the-art studies and achievements in algorithms, analytics, and applications of Big Data. It provides readers with the basis for further efforts in this challenging scientific field that will play a leading role in next-generation database, data warehousing, data mining, and cloud computing research.
K23331
www.crcpress.com
Chapman & Hall/CRC
Big Data Series
SERIES EDITOR
Sanjay Ranka
AIMS AND SCOPE
This series aims to present new research and applications in Big Data, along with the computational tools and techniques currently in development. The inclusion of concrete examples and applications is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of social networks, sensor networks, data-centric computing, astronomy, genomics, medical data analytics, large-scale e-commerce, and other relevant topics that may be proposed by potential contributors.
PUBLISHED TITLES
BIG DATA: ALGORITHMS, ANALYTICS, AND APPLICATIONS
Kuan-Ching Li, Hai Jiang, Laurence T. Yang, and Alfredo Cuzzocrea
Chapman & Hall/CRC
Big Data Series
BIG DATA
Algorithms, Analytics,
and Applications
Edited by
Kuan-Ching Li
Providence University
Taiwan
Hai Jiang
Arkansas State University
USA
Laurence T. Yang
St. Francis Xavier University
Canada
Alfredo Cuzzocrea
ICAR-CNR & University of Calabria
Italy
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2015 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20141210
International Standard Book Number-13: 978-1-4822-4056-6 (eBook - PDF)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Foreword by Jack Dongarra
Foreword by Dr. Yi Pan
Foreword by D. Frank Hsu
Preface
Editors
Contributors
Section I Big Data Management
Hisham Mohamed and Stéphane Marchand-Maillet
Chapter 2 ◾ Scalability and Cost Evaluation of Incremental Data Processing Using Amazon’s Hadoop Service
Xing Wu, Yan Liu, and Ian Gorton
Alexander Thomasian
Chapter 4 ◾ Multiple Sequence Alignment and Clustering with Dot Matrices, Entropy, and Genetic Algorithms
John Tsiligaridis
Section II Big Data Processing
Chapter 5 ◾ Approaches for High-Performance Big Data Processing: Applications and Challenges
Ouidad Achahbar, Mohamed Riduan Abid, Mohamed Bakhouya, Chaker El Amrani, Jaafar Gaber, Mohammed Essaaidi, and Tarek A. El Ghazawi
Chapter 6 ◾ The Art of Scheduling for Big Data Science
Florin Pop and Valentin Cristea
Chapter 7 ◾ Time–Space Scheduling in the MapReduce Framework
Zhuo Tang, Ling Qi, Lingang Jiang, Kenli Li, and Keqin Li
Chapter 8 ◾ GEMS: Graph Database Engine for Multithreaded Systems
Alessandro Morari, Vito Giovanni Castellana, Oreste Villa, Jesse Weaver, Greg Williams, David Haglin, Antonino Tumeo, and John Feo
Chapter 9 ◾ KSC-net: Community Detection for Big Data Networks
Raghvendra Mall and Johan A.K. Suykens
Chapter 10 ◾ Making Big Data Transparent to the Software Developers’ Community
Yu Wu, Jessica Kropczynski, and John M. Carroll
Section III Big Data Stream Techniques and Algorithms
Chapter 11 ◾ Key Technologies for Big Data Stream Computing
Dawei Sun, Guangyan Zhang, Weimin Zheng, and Keqin Li
Chapter 12 ◾ Streaming Algorithms for Big Data Processing on Multicore Architecture
Marat Zhanikeev
Chapter 13 ◾ Organic Streams: A Unified Framework for Personal Big Data Integration and Organization Towards Social Sharing and Individualized Sustainable Use
Xiaokang Zhou and Qun Jin
Chapter 14 ◾ Managing Big Trajectory Data: Online Processing of Positional Streams
Kostas Patroumpas and Timos Sellis
Section IV Big Data Privacy
Chapter 15 ◾ Personal Data Protection Aspects of Big Data
Paolo Balboni
Chapter 16 ◾ Privacy-Preserving Big Data Management: The Case of OLAP
Alfredo Cuzzocrea
Section V Big Data Applications
Chapter 17 ◾ Big Data in Finance
Taruna Seth and Vipin Chaudhary
Chapter 18 ◾ Semantic-Based Heterogeneous Multimedia Big Data Retrieval
Kehua Guo and Jianhua Ma
Chapter 19 ◾ Topic Modeling for Large-Scale Multimedia Analysis and Retrieval
Juan Hu, Yi Fang, Nam Ling, and Li Song
Chapter 20 ◾ Big Data Biometrics Processing: A Case Study of an Iris Matching Algorithm on Intel Xeon Phi
Xueyan Li and Chen Liu
Chapter 21 ◾ Storing, Managing, and Analyzing Big Satellite Data: Experiences and Lessons Learned from a Real-World Application
Ziliang Zong
Chapter 22 ◾ Barriers to the Adoption of Big Data Applications in the Social Sector
Elena Strange
Description: As today’s organizations are capturing exponentially larger amounts of data than ever, now is the time for organizations to rethink how they digest that data. Through advanced algorithms and analytics techniques, organizations can harness this data, discover hidden patterns, and use the newly acqu