Machine Learning. The new AI PDF

211 Pages·2016·0.9 MB·English

Preview Machine Learning. The new AI

MACHINE LEARNING
THE NEW AI

ETHEM ALPAYDIN

The MIT Press
Cambridge, Massachusetts
London, England

© 2016 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

This book was set in Chaparral and DIN by Toppan Best-set Premedia Limited. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Names: Alpaydın, Ethem, author.
Title: Machine learning : the new AI / Ethem Alpaydın.
Description: Cambridge, MA : MIT Press, [2016] | Series: MIT Press essential knowledge series | Includes bibliographical references and index.
Identifiers: LCCN 2016012342 | ISBN 9780262529518 (pbk. : alk. paper)
Subjects: LCSH: Machine learning. | Artificial intelligence.
Classification: LCC Q325.5 .A47 2016 | DDC 006.3/1—dc23
LC record available at https://lccn.loc.gov/2016012342

CONTENTS

Series Foreword
Preface
1 Why We Are Interested in Machine Learning
2 Machine Learning, Statistics, and Data Analytics
3 Pattern Recognition
4 Neural Networks and Deep Learning
5 Learning Clusters and Recommendations
6 Learning to Take Actions
7 Where Do We Go from Here?
Notes
Glossary
References
Further Readings
Index

SERIES FOREWORD

The MIT Press Essential Knowledge series offers accessible, concise, beautifully produced pocket-size books on topics of current interest. Written by leading thinkers, the books in this series deliver expert overviews of subjects that range from the cultural and the historical to the scientific and the technical.

In today’s era of instant information gratification, we have ready access to opinions, rationalizations, and superficial descriptions.
Much harder to come by is the foundational knowledge that informs a principled understanding of the world. Essential Knowledge books fill that need. Synthesizing specialized subject matter for nonspecialists and engaging critical topics through fundamentals, each of these compact volumes offers readers a point of access to complex ideas.

Bruce Tidor
Professor of Biological Engineering and Computer Science
Massachusetts Institute of Technology

PREFACE

A quiet revolution has been taking place in computer science for the last two decades. Nowadays, more and more, we see computer programs that learn—that is, software that can adapt their behavior automatically to better match the requirements of their task. We now have programs that learn to recognize people from their faces, understand speech, drive a car, or recommend which movie to watch—with promises to do more in the future.

Once, it used to be the programmer who defined what the computer had to do, by coding an algorithm in a programming language. Now for some tasks, we do not write programs but collect data. The data contains instances of what is to be done, and the learning algorithm modifies a learner program automatically in such a way so as to match the requirements specified in the data.

Since the advent of computers in the middle of the last century, our lives have become increasingly computerized and digital. Computers are no longer just the numeric calculators they once were. Databases and digital media have taken the place of printing on paper as the main medium of information storage, and digital communication over computer networks supplanted the post as the main mode of information transfer. First with the personal computer with its easy-to-use graphical interface, and then with the phone and other smart devices, the computer has become a ubiquitous device, a household appliance just like the TV or the microwave.
Nowadays, all sorts of information, not only numbers and text but also image, video, audio, and so on, are stored, processed, and—thanks to online connectivity—transferred digitally. All this digital processing results in a lot of data, and it is this surge of data—what we can call a “dataquake”—that is mainly responsible for triggering the widespread interest in data analysis and machine learning.

For many applications—from vision to speech, from translation to robotics—we were not able to devise very good algorithms despite decades of research beginning in the 1950s. But for all these tasks it is easy to collect data, and now the idea is to learn the algorithms for these automatically from data, replacing programmers with learning programs. This is the niche of machine learning, and it is not only that the data continuously has got bigger in these last two decades, but also that the theory of machine learning to process that data to turn it into knowledge has advanced significantly.

Today, in different types of business, from retail and finance to manufacturing, as our systems are computerized, more data is continuously generated and collected. This is also true in various fields of science, from astronomy to biology. In our everyday lives too, as digital technology increasingly infiltrates our daily existence, as our digital footprint deepens, not only as consumers and users but also through social media, an increasingly larger part of our lives is recorded and becomes data. Whatever its source—business, scientific, or personal—data that just lies dormant passively is not of any use, and smart people have been finding new ways to make use of that data and turn it into a useful product or service. In this transformation, machine learning is playing a more significant role.

Our belief is that behind all this seemingly complex and voluminous data, there lies a simple explanation.
That although the data is big, it can be explained in terms of a relatively simple model with a small number of hidden factors and their interaction. Think about millions of customers who buy thousands of products online or from their local supermarket every day. This implies a very large database of transactions; but what saves us and works to our advantage is that there is a pattern to this data. People do not shop at random. A person throwing a party buys a certain subset of products, and a person who has a baby at home buys a different subset—there are hidden factors that explain customer behavior. It is this inference of a hidden model—namely, the underlying factors and their interaction—from the observed data that is at the core of machine learning.

Machine learning is not just the commercial application of methods to extract information from data; learning is also a requisite of intelligence. An intelligent system should be able to adapt to its environment; it should learn not to repeat its mistakes but to repeat its successes. Previously, researchers used to believe that for artificial intelligence to become reality, we needed a new paradigm, a new type of thinking, a new model of computation, or a whole new set of algorithms. Taking into account the recent successes in machine learning in various domains, it can now be claimed that what we need is not a set of new specific algorithms but a lot of example data and sufficient computing power to run the learning methods on that much data, bootstrapping the necessary algorithms from data.

It may be conjectured that tasks such as machine translation and planning can be solved with such learning algorithms that are relatively simple but trained on large amounts of example data—recent successes with “deep learning” support this claim.
Intelligence seems not to originate from some outlandish formula, but rather from the patient, almost brute force use of simple, straightforward algorithms.

It seems that as technology develops and we get faster computers and more data, learning algorithms will generate a slightly higher level of intelligence, which will find use in a new set of slightly smarter devices and software. It will not be surprising if this type of learned intelligence reaches the level of human intelligence some time before this century is over.

While I was working on this book, one of the most prestigious scientific journals, Science, in its July 15, 2015, issue (vol. 349, no. 6245), published a special section on Artificial Intelligence. Though the title announces a focus on artificial intelligence, the dominant theme is machine learning. This is just another indicator that machine learning is now the driving force in artificial intelligence; after the disappointment of logic-based, programmed expert systems in the 1980s, it has revived the field, delivering significant results.

The aim of this book is to give the reader an overall idea about what machine learning is, the basics of some important learning algorithms, and a set of example applications. The book is intended for a general readership, and only the essentials of the learning methods are discussed without any mathematical or programming details. The book does not cover any of the machine-learning applications in much detail either; a number of examples are discussed just enough to give the fundamentals without going into the particulars.

For more information on the machine learning algorithms, the reader can refer to my textbook on the topic, on which this book is heavily based: Ethem Alpaydın, Introduction to Machine Learning, 3rd ed. (Cambridge, MA: MIT Press, 2014).
The content is organized as follows:

In chapter 1, we discuss briefly the evolution of computer science and its applications, to place in context the current state of affairs that created the interest in machine learning—namely, how digital technology advanced from number-crunching mainframes to desktop personal computers and later on to smart devices that are online and mobile.

Chapter 2 introduces the basics of machine learning and discusses how it relates to model fitting and statistics on some simple applications.

Most machine learning algorithms are supervised, and in chapter 3, we discuss how such algorithms are used for pattern recognition, such as faces and speech.

Chapter 4 discusses artificial neural networks inspired by the human brain, how they can learn, and how “deep,” multilayered networks can learn hierarchies at different levels of abstraction.

Another type of machine learning is unsupervised, where the aim is to learn associations between instances. In chapter 5 we talk about customer segmentation and learning recommendations, as popular applications.

Chapter 6 is on reinforcement learning where an autonomous agent—for example, a self-driving car—learns to take actions in an environment to maximize reward and minimize penalty.
