Studies in Computational Intelligence 468
Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: kacprzyk@ibspan.waw.pl
For further volumes:
http://www.springer.com/series/7092
Agnieszka Dardzinska
Action Rules Mining
Author
Dr. Agnieszka Dardzinska
Department of Mechanics and
Applied Computer Science
Bialystok University of Technology
Bialystok
Poland
ISSN 1860-949X e-ISSN 1860-9503
ISBN 978-3-642-35649-0 e-ISBN 978-3-642-35650-6
DOI 10.1007/978-3-642-35650-6
Springer Heidelberg New York Dordrecht London
Library of Congress Control Number: 2012954483
© Springer-Verlag Berlin Heidelberg 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any
errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect
to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
To my great Family
Foreword
Knowledge discovery in databases (KDD, also called data mining) is a
well-established branch of informatics. The basis of KDD is created by an
analysis of the given data with the goal of finding unsuspected patterns and
summarizing the data in new ways to make them understandable and usable for
their owners. The analyzed data is often saved in large and complex databases,
in some cases covering a long history of the corresponding topic. KDD has
long been applied in large corporations and in research and development
institutions. However, interest is also growing in small and medium-size
companies and in the public sector (e.g. health care). The growing interest in
KDD brings new challenges both for computer scientists and for practitioners.
One of the challenges is related to the fact that data mining systems
usually can only list results; they are unable to relate the results of
mining to real-world decisions. Thus, a specific challenge is to turn given
data into actionable knowledge. The goal is not only to produce strong,
previously unknown and interesting patterns; it is to suggest an action which
will be useful for the data owner. This need led to the notion of actionable
rules, which was introduced in the 1990s. A rule is considered actionable if a
data owner can start an action based on this rule and if the action is
useful for her/him. Such a specification is, however, too wide and allows
various interpretations. That is why additional definitions were formulated,
including the definition of an action rule. This was introduced by Ras and
Wieczorkowska in 2000. Several branches of research on action rules and
similar notions started, and dozens of papers were published. It is notable that
the important notions of stable and actionable attributes came from the group
of researchers led by Professor Zbigniew Ras. The attribute "Year of birth"
is an example of a stable attribute and the attribute "Level of interest" is an
example of an actionable attribute. These notions represent a clear bridge
between analyzed data and relevant items of domain knowledge. There are
various theoretically interesting and practically important results on action
rules and related topics. It has become clear that there is a strong need for a
comprehensive overview of all achieved results.
Dr. Dardzinska was one of the authors who invented the notion of action
rules, and she continues her research on this topic. She is perfectly
qualified to prepare such a comprehensive overview. The book gives a
smooth introduction to the notion of action rules. Then all necessary
concepts are introduced. The core of the book consists of descriptions of many
algorithms and methods for mining and dealing with action rules. Attention
is also given to handling incomplete values in data. The book emphasizes the
theoretical approaches; however, there are enough examples to make the
book very readable. It can be concluded that the book is a carefully
written, comprehensive description of the state of the art in the area of action
rules. The book can be seen as a unique source of information on action rules
for researchers, teachers and students interested in knowledge discovery in
databases.
Prague, October 2012 Jan Rauch
Contents
1 Introduction
2 Information Systems
2.1 Types of Information Systems
2.2 Types of Incomplete Information Systems
2.3 Simple Query Language
2.3.1 Standard Interpretation of Queries in Complete Information Systems
2.3.2 Standard Interpretation of Queries in Incomplete Information Systems
2.4 Rules
2.5 Distributed Information Systems
2.6 Decision Systems
2.7 Partially Incomplete Information Systems
2.8 Extracting Classification Rules
2.8.1 Attribute Dependency and Coverings
2.8.2 System LERS
2.8.3 Algorithm for Finding the Set of All Coverings (LEM1)
2.8.4 Algorithm LEM2
2.8.5 Algorithm for Extracting Rules from Incomplete Decision System (ERID)
2.9 Chase Algorithms
2.9.1 Tableaux Systems and Chase
2.9.2 Handling Incomplete Values Using CHASE1 Algorithm
2.9.3 Handling Incomplete Values Using CHASE2 Algorithm
3 Action Rules
3.1 Main Assumptions
3.2 Action Rules from Classification Rules
3.2.1 System DEAR
3.2.2 System DEAR2
3.3 E-action Rules
3.3.1 ARAS Algorithm
3.4 Action Rules Tightly Coupled Framework
3.5 Cost and Feasibility
3.6 Association Action Rules
3.6.1 Frequent Action Sets
3.7 Representative Association Action Rules
3.8 Simple Association Action Rules
3.9 Action Reducts
3.9.1 Experiments and Testing
3.10 Meta-action
3.10.1 Discovering Action Paths
References
1 Introduction
Data mining, or knowledge discovery, is frequently referred to in the
literature as the process of extracting interesting information or patterns from
large databases. There are two major directions in data mining research:
patterns and interest. The pattern discovery techniques include
classification, association, and clustering. Interest refers to whether pattern
applications in business, education, medicine, the military or other
organizations are useful or meaningful [16]. Since pattern discovery techniques
often generate large amounts of knowledge, they require a great deal of expert
manual work to post-process the mined results. Therefore, one of the central
research problems in the field relates to reducing the volume of the discovered
patterns, and selecting appropriate interestingness measures. These measures
are intended for selecting and ranking patterns according to their potential
interest to the user. Good measures also allow the time and space costs of the
mining process to be reduced. There are two aspects of interestingness of rules
that have been studied in data mining literature: objective and subjective
measures (Liu et al. [26] (1997), Adomavicius and Tuzhilin [1] (1997) and
Silberschatz and Tuzhilin [57] (1995)). Objective measures are data-driven and
domain-independent. Generally, they evaluate the rules based on their quality
and the similarity between them. Subjective measures, including unexpectedness,
novelty and actionability, are user-driven and domain-dependent. A rule is
actionable if a user can do an action to his advantage based on that rule [26].
This definition, in spite of its importance, is quite vague and it leaves an open
door to a number of different interpretations of actionability. For instance, we
may formulate actions that involve attributes outside the database schema. In
order to narrow it down, a new class of rules (called action rules) constructed
from certain pairs of association rules, extracted from a given database, has
been proposed by Ras and Wieczorkowska [52] (2000). The main idea was
to generate, from a database, a special type of rule which basically forms a
hint to the users, showing a way to re-classify objects with respect to some
distinguished attribute (called a decision attribute). Values of some of the
attributes, used to describe objects stored in a database, can be changed and
A. Dardzinska: Action Rules Mining, SCI 468, pp. 1–4.
DOI: 10.1007/978-3-642-35650-6_1 © Springer-Verlag Berlin Heidelberg 2013