Machine Learning Proceedings 1993. Proceedings of the Tenth International Conference, University of Massachusetts, Amherst, June 27–29, 1993

MACHINE LEARNING
Proceedings of the Tenth International Conference
University of Massachusetts, Amherst
June 27-29, 1993

Morgan Kaufmann Publishers, Inc., San Mateo, California

Production, design, type, and manufacturing management provided by Professional Book Center, Denver, Colorado. Morgan Kaufmann Publishers, Inc., Editorial Office: 2929 Campus Drive, Suite 260, San Mateo, CA 94403. ISBN 1-55860-307-7. © 1993 by Morgan Kaufmann Publishers, Inc. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise) without the prior written permission of the publisher.

96 95 94 93  4 3 2 1

PREFACE

This volume contains the papers presented at the Tenth International Conference on Machine Learning, held at Amherst, Massachusetts, during June 27-29, 1993. Papers for the three informal workshops following the conference can be obtained from the workshop coordinators, whose names and affiliations are given on the following page. The conference attracted 164 paper submissions, from which the 44 papers that appear here were selected by the program committee. A great many people have contributed to the conference.
I thank:
• The authors, who have taken time to produce and present their papers;
• The invited speakers, for sharing their research, insights, and perspective;
• The program committee members, who provided careful and constructive reviews;
• Those who served as adjunct members of the program committee by providing one or more essential reviews, namely Michael Cameron-Jones, Vijay Gullapalli, and Michael Jordan;
• The members of the program committee who met at Amherst in February to make the decisions regarding selection of papers, namely Andrew Barto, Haym Hirsh, Dennis Kibler, Pat Langley, Sridhar Mahadevan, Christopher Owens, Lorenza Saitta, and Steven Whitehead;
• The organizing committee members, who provided useful advice, namely Andrew Barto, Larry Birnbaum, Tom Dietterich, Edwina Rissland, and Derek Sleeman;
• The workshop organizers;
• Siemens Corporate Research, for providing financial support;
• Derek Sleeman, for the financial carry-forward from ML92;
• University Conference Services, particularly Todd Bryda and Mary McCulloch, for handling most of the local arrangements;
• Gwyn Mitchell, for secretarial assistance;
• Professional Book Center, particularly Jennifer Ballentine, for producing this volume;
• Morgan Kaufmann Publishers, for serving as distributor of this volume.
Paul Utgoff, ML93 chair
Amherst, Massachusetts

ORGANIZING COMMITTEE
Andrew Barto, University of Massachusetts at Amherst
Larry Birnbaum, Northwestern University
Tom Dietterich, Oregon State University
Edwina Rissland, University of Massachusetts at Amherst
Derek Sleeman, University of Aberdeen

PROGRAM COMMITTEE
Andrew Barto, University of Massachusetts at Amherst
Larry Birnbaum, Northwestern University
Peter Cheeseman, NASA Ames Research Center
William Cohen, AT&T Bell Laboratories
Oren Etzioni, University of Washington
Usama Fayyad, Jet Propulsion Laboratory
Douglas Fisher, Vanderbilt University
John Grefenstette, Naval Research Laboratory
Haym Hirsh, Rutgers University
Robert Holte, University of Ottawa
Dennis Kibler, University of California at Irvine
Pat Langley, Siemens Corporate Research
Sridhar Mahadevan, IBM Watson
Ryszard Michalski, George Mason University
Tom Mitchell, Carnegie-Mellon University
Ray Mooney, University of Texas at Austin
Katharina Morik, University of Dortmund
Stephen Muggleton, The Turing Institute
Christopher Owens, University of Chicago
Michael Pazzani, University of California at Irvine
Leonard Pitt, University of Illinois at Champaign-Urbana
Ross Quinlan, University of Sydney
Larry Rendell, University of Illinois at Champaign-Urbana
Paul Rosenbloom, Information Sciences Institute, University of Southern California
Stuart Russell, University of California at Berkeley
Lorenza Saitta, University of Torino
Jude Shavlik, University of Wisconsin
Derek Sleeman, University of Aberdeen
Devika Subramanian, Cornell University
Richard Sutton, GTE Laboratories
Kurt VanLehn, University of Pittsburgh
Steven Whitehead, GTE Laboratories
Jan Zytkow, Wichita State University

WORKSHOPS
Reinforcement Learning: What We Know, What We Need
Coordinator: Richard Sutton, GTE Labs MS-44, 40 Sylvan Road, Waltham, MA 02254

Fielded Applications of Machine Learning
Coordinator: Pat Langley, Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540
Knowledge Compilation and Speedup Learning
Coordinator: Devika Subramanian, Department of Computer Science, Cornell University, Ithaca, NY 14853

SCHEDULE - SUNDAY, JUNE 27
8:45 Welcome
9:00 Combining Instance-Based and Model-Based Learning - J. R. Quinlan
9:30 Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection - Carla E. Brodley
10:00 SKICAT: A Machine Learning System for Automated Cataloging of Large-Scale Sky Surveys - Usama Fayyad, Nicholas Weir, S. Djorgovski
10:30 Break
11:00 Invited Talk: Leo Breiman, University of California at Berkeley
12:00 Lunch
1:30 Parallel sessions:
  Database Mining of Subjective Agricultural Data - R. Bharat Rao, Thomas B. Voigt, Thomas W. Fermanian
  Scaling Up Reinforcement Learning for Robot Control - Long-Ji Lin
  Efficiently Inducing Determinations: A Complete and Systematic Search Algorithm that Uses Optimal Pruning - Jeffrey C. Schlimmer
2:00 Parallel sessions:
  Discovering Dynamics - Saso Dzeroski, Ljupco Todorovski
  Adaptive Neurocontrol: How Black-Box and Simple Can It Be? - Jean-Michel Renders, Hugues Bersini, Marco Saerens
  An Efficient Method for Constructing Approximate Decision Trees for Large Databases - Ron Musick, Jason Catlett, Stuart Russell
2:30 Parallel sessions:
  Efficient Domain-Independent Experimentation - Yolanda Gil
  An SE-Tree Based Characterization of the Induction Problem - Ron Rymon
  Hierarchical Learning: Preliminary Results - Leslie Pack Kaelbling
3:00 Break
3:30 Parallel sessions:
  Learning Search Control Knowledge for the Deep Space Network Scheduling Problem - Jonathan Gratch, Steve A. Chien, Gerald F. DeJong
  Lookahead Feature Construction for Learning Hard Concepts - Harish Ragavan, Larry Rendell
  Supervised Learning and Divide-and-Conquer: A Statistical Approach - Michael I. Jordan, Robert A. Jacobs
4:00 Parallel sessions:
  Constraining Learning with Search Control - Jihie Kim, Paul S. Rosenbloom
  Using Decision Trees to Improve Case-Based Reasoning - Claire Cardie
  Density-Adaptive Learning and Forgetting - Marcos Salganicoff
4:30 Parallel sessions:
  Learning Procedures from Interactive Natural Language Instructions - Scott B. Huffman, John E. Laird
  Generalization under Implication by Recursive Anti-Unification - Peter Idestam-Almquist
  Overcoming Incomplete Perception with Utile Distinction Memory - R. Andrew McCallum
6:00 Reception

SCHEDULE - MONDAY, JUNE 28
9:00 Using Qualitative Models to Guide Inductive Learning - Peter Clark, Stan Matwin
9:30 Learning From Entailment: An Application to Propositional Horn Sentences - Michael Frazier, Leonard Pitt
10:00 Learning from Queries and Examples with Tree-Structured Bias - Prasad Tadepalli
10:30 Break
11:00 Invited Talk: Micki Chi, University of Pittsburgh
12:00 Lunch
2:30 ATM Scheduling with Queuing Delay Predictions - Daniel B. Schwartz

SCHEDULE - TUESDAY, JUNE 29
9:00 Constructing Hidden Variables in Bayesian Networks via Conceptual Clustering - Dennis Connolly
9:30 Learning Symbolic Rules Using Artificial Neural Networks - Mark Craven, Jude Shavlik
10:00 Explanation-Based Learning: A Comparison of Symbolic and Neural Network Approaches - Tom M. Mitchell, Sebastian B. Thrun
10:30 Break
11:00 Invited Talk: David Sandler, ThermoTrex
12:00 Lunch
1:30 Business Meeting
2:00 Parallel sessions:
  Concept Sharing: A Means to Improve Multi-Concept Learning - Philip Datta, Dennis Kibler
  Combinatorial Optimization in Inductive Concept Learning - Dunja Mladenic
  Compiling Bayesian Networks into Neural Networks - Eddie Schwalb
2:30 Parallel sessions:
  Multitask Learning: A Knowledge-Based Source of Inductive Bias - Richard A. Caruana
  The Evolution of Genetic Algorithms - Shumeet Baluja
  Synthesis of Abstraction Hierarchies for Constraint Satisfaction by Clustering Approximately Equivalent Objects - Thomas Ellman
3:00 Parallel sessions:
  Online Learning with Random Representations - Richard S. Sutton, Steve Whitehead
  Genetic Programming for Learning of Target Discrimination Functions - Walter Alden Tackett
  GALOIS: An Order-Theoretic Approach to Conceptual Clustering - Claudio Carpineto, Giovanni Romano
3:30 Break
4:00 Invited Talk: Pat Langley, Siemens
6:00 Dinner

The Evolution of Genetic Algorithms: Towards Massive Parallelism
Shumeet Baluja
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213
[email protected]

Abstract

One of the issues in creating any search technique is balancing the need for diverse exploration with the desire for efficient focusing. This paper explores a genetic algorithm (GA) architecture which is more resilient to local optima than other recently introduced GA models, and which provides the ability to focus search quickly. The GA uses a fine-grain parallel architecture to simulate evolution more closely than previous models. In order to motivate the need for fine-grain parallelism, this paper will provide an overview of the two preceding phases of development: the traditional genetic algorithm, and the coarse-grain parallel GA. A test set of 15 problems is used to compare the effectiveness of a fine-grain parallel GA with that of a coarse-grain parallel GA.

1 INTRODUCTION

The effectiveness of heuristic techniques used in machine learning, search, and function optimization resides in the heuristic's ability to balance the need for a diverse set of sampling points with the ability to focus search quickly upon potential solutions. Genetic algorithms (GAs) are general purpose optimization tools designed to search irregular, poorly characterized search spaces. GAs are based upon the ideas of natural selection and genetic recombination. GAs combine the principles of survival of the fittest with a randomized information exchange. Although the information exchange is randomized, GAs are far different than simple random walks, having the ability to recognize trends toward optimal solutions, and to exploit such information by guiding the search toward them.

Unlike the majority of other optimization heuristics, genetic algorithms do not work from a single point in the search space. Methods which only use a single point are susceptible to local optima. GAs continually maintain a population of points from which the search space is explored. This aids in searching multidimensional spaces, in which many variables must be optimized, and in locating global optima.

Genetic algorithms have evolved through three phases, their development motivated by the goal of balancing exploration with focusing. Each of these phases will be individually discussed in the next three sections. The first phase presented the traditional genetic algorithm. A single population of potential solutions is evolved through a series of generations. The second phase introduced the concept of parallel subpopulations. Only a limited amount of swapping of potential solutions is allowed between subpopulations. Each subpopulation evolves independently, with swapping at infrequent intervals. This has been termed coarse-grain parallelism, or the "island" model. The third phase, an extension of the second, introduced fine-grain parallelism. The GAs in this category differ from the previous by relaxing the boundaries between subpopulations. Swapping between subpopulations occurs frequently, and significantly contributes to the effectiveness of this form of genetic algorithm. This phase also differs from the previous two by evolving numerous small subpopulations; in the previous two phases, only a few large subpopulations are evolved.

2 TRADITIONAL GAs

Traditional genetic algorithms maintain a single population of potential solutions for the objective function being optimized. Although a large portion of GA research has been conducted with potential solutions encoded in binary notation, any encoding scheme can be used. The initial group of potential solutions is randomly selected. These potential solutions, termed "chromosomes", are allowed to evolve over a number of generations. At every generation, the fitness of each potential solution in terms of the objective function is calculated, and pairs of solutions are recombined to create the subsequent generation. Recombination is the method by which the parent chromosomes of the current generation donate parts of their potential solutions to the "children" chromosomes which appear in the subsequent generation. The probability that a solution will participate in this recombination increases with its fitness. Thus, although "good" chromosomes are more likely to be chosen for recombination, they are not guaranteed to be chosen. Further, the "children" chromosomes produced are not necessarily better than their parents. Nevertheless, because of the selective pressure applied through a number of generations, the overall trend is toward better chromosomes.

In order to perform expansive search, genetic diversity must be maintained. When diversity is lost, it is possible for the GA to settle into a local optimum. There are two fundamental mechanisms which the traditional GA uses to maintain diversity. The first, mentioned above, is a probabilistic scheme of selecting chromosomes for recombination. This insures that schemata, or common recurring patterns, other than those represented in the best chromosomes, appear in subsequent generations. Exclusively recombining good chromosomes will quickly converge the population without extensive exploration, thereby increasing the possibility of settling into a local optimum. The second mechanism, mutation, is a random change. For example, in binary encoded chromosomes, it is usually a random bit flip. In the traditional GA, the mutation rate is kept at a very small constant.

This algorithm is typically allowed to continue for an arbitrary number of generations. Upon completion, the best chromosome in the final population, or the best chromosome ever found, is returned.

3 COARSE-GRAIN PARALLEL GAs

A coarse-grain parallel genetic algorithm (cgpGA) is based upon the theory of punctuated equilibria. In the paper Distributed Genetic Algorithms for the Floor Plan Design Problem, Cohoon et al. describe the theory [Cohoon, 1988]:

  Punctuated Equilibria is based upon two principles: allopatric speciation and stasis. Allopatric speciation involves the rapid evolution of new species after a small set of members of species, peripheral isolates, becomes segregated into a new environment. Stasis, or stability, of a species, is simply the notion of lack of change. It implies that after equilibria is reached in an environment, there is very little drift away from the genetic composition of species. Ideally, a species would persist until its environment changes (or the species would drift very little). Punctuated Equilibria stresses that a powerful method for generating new species is to thrust an old species into a new environment, where change is beneficial and rewarded. For this reason we should expect a genetic algorithm approach based upon punctuated equilibria to perform better than the typical single environment scheme.

By implication, after some period of evolution, a large portion of the chromosomes in a single population will represent very similar schema. The children chromosomes produced thereafter will be similar to each other and to their parents, thereby rendering recombination operators largely ineffective for further search space exploration. One method of resolving this problem is to partition a single large population into separate subpopulations, each evolving its chromosomes independently from others. The fitness used to determine the probability of selection for recombination is measured relative only to the other members within the subpopulation. Independent evolutions, in separate subpopulations, should yield closely competitive, yet possibly unique results. In the context of a single subpopulation, in order to continue evolution after convergence has started, members of species from outside subpopulations can periodically be introduced. To ensure thorough mixing of chromosomes throughout the population, the swapping does not always occur between the same subpopulations.

Although the sudden injection of new material is an important aspect of these simulations, it is not always effective. It is possible that the subpopulation into which new material is introduced is entirely settled into an equilibrium state. If this is the case, new information may not be incorporated because of its incompatibility with the existing information.

Despite the problems associated with the sudden introduction of new material, parallel subpopulations have proven their effectiveness in two areas. The first is, as mentioned above, to preserve diversity and to ensure perpetual novelty in the population's "gene pool". Through the use of parallel subpopulations, GAs have been able to solve problems which could not be solved in a reasonable amount of time by single population GAs [Whitley & Starkweather, 1990] [Liepins & Baluja, 1991] [Muhlenbein, 1989]. The second use of the subpopulation structure is to emphasize various characteristics in the chromosome. For example, in multi-objective functions, the evaluations in each subpopulation can be used to emphasize different objectives. When members of separate subpopulations are mixed, the genetic information may be combined to reveal chromosomes which are strong with respect to more than a single objective. An exploration of multi-objective function optimization with parallel subpopulations can be found in [Husbands, 1991]. An interesting method of multi-objective optimization using only a single population with multiple fitness measures can be found in [Schaffer & Grefenstette, 1985].
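The traditional GA loop described above, fitness-proportional selection, recombination, a small constant mutation rate, and returning the best chromosome ever found, can be sketched as follows. This is an illustrative sketch only, not code from the paper; the bit-counting ("onemax") objective, the function names, and all parameter values are assumptions chosen for the example.

```python
import random

def onemax(bits):
    """Toy objective: number of 1-bits (any encoding and objective could be used)."""
    return sum(bits)

def select(population, fitnesses):
    """Fitness-proportional (roulette-wheel) selection: fitter chromosomes
    are more likely, but never guaranteed, to be chosen."""
    # +1 keeps the weights valid even if every fitness happens to be zero.
    return random.choices(population, weights=[f + 1 for f in fitnesses], k=1)[0]

def crossover(a, b):
    """One-point recombination: each child inherits parts of both parents."""
    point = random.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, rate=0.01):
    """Mutation: a random bit flip, applied at a small constant rate."""
    return [1 - b if random.random() < rate else b for b in bits]

def traditional_ga(length=32, pop_size=50, generations=100):
    """Single-population GA; returns the best chromosome ever found."""
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=onemax)
    for _ in range(generations):
        fitnesses = [onemax(c) for c in population]
        next_gen = []
        while len(next_gen) < pop_size:
            c1, c2 = crossover(select(population, fitnesses),
                               select(population, fitnesses))
            next_gen += [mutate(c1), mutate(c2)]
        population = next_gen[:pop_size]
        best = max(population + [best], key=onemax)
    return best
```

On this toy objective the selective pressure steadily pushes the population toward high-fitness strings; substituting any other fitness function exercises the same machinery.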

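The coarse-grain parallel (island-model) scheme described in Section 3 (independent subpopulations, fitness measured relative only to members of the same subpopulation, and infrequent swapping that does not always occur between the same subpopulations) might be sketched like this. Again an illustrative sketch, not the paper's implementation; the objective, the migration policy details, and all parameter values are assumptions.

```python
import random

def fitness(bits):
    """Toy objective: number of 1-bits."""
    return sum(bits)

def next_generation(subpop, mutation_rate=0.01):
    """One generation inside a single island; selection pressure is
    relative only to the other members of the same subpopulation."""
    weights = [fitness(c) + 1 for c in subpop]  # +1 keeps weights valid
    children = []
    while len(children) < len(subpop):
        a, b = random.choices(subpop, weights=weights, k=2)
        point = random.randrange(1, len(a))
        child = a[:point] + b[point:]
        children.append([1 - v if random.random() < mutation_rate else v
                         for v in child])
    return children

def island_ga(n_islands=4, island_size=20, length=32,
              generations=200, swap_interval=25, n_migrants=2):
    """Coarse-grain parallel GA: islands evolve independently, with a few
    good chromosomes swapped between islands at infrequent intervals."""
    islands = [[[random.randint(0, 1) for _ in range(length)]
                for _ in range(island_size)]
               for _ in range(n_islands)]
    for gen in range(generations):
        islands = [next_generation(pop) for pop in islands]
        if (gen + 1) % swap_interval == 0:
            # Shuffle the island ordering so that swapping does not always
            # occur between the same subpopulations.
            order = random.sample(range(n_islands), n_islands)
            for src, dst in zip(order, order[1:] + order[:1]):
                migrants = sorted(islands[src], key=fitness)[-n_migrants:]
                # Migrants replace arbitrary members (here, the first few).
                islands[dst][:n_migrants] = [m[:] for m in migrants]
    return max((c for pop in islands for c in pop), key=fitness)
```

Raising `swap_interval` makes the islands more independent; in the limit the scheme degenerates into several unrelated single-population runs, which is the trade-off between diversity and mixing that the text describes.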