Approximate Boolean Reasoning: Foundations and Applications in Data Mining

Hung Son Nguyen
Institute of Mathematics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland
[email protected]

Table of Contents

Approximate Boolean Reasoning: Foundations and Applications in Data Mining (Hung Son Nguyen)
Introduction
Chapter 1: Preliminaries
  1 Knowledge Discovery and Data Mining
    1.1 Approximate Reasoning Problem
  2 Rough set preliminaries
    2.1 Information systems
    2.2 Rough Approximation of Concept
    2.3 Standard Rough Sets
    2.4 Rough membership function
    2.5 Inductive searching for rough approximations
  3 Rough set approach to data mining
Chapter 2: Boolean Functions
  1 Boolean Algebra
    1.1 Some properties of Boolean algebras
    1.2 Boolean expressions
    1.3 Boolean functions
    1.4 Representations of Boolean functions
  2 Binary Boolean algebra
    2.1 Some classes of Boolean expressions, normal forms
    2.2 Implicants and prime implicants
Chapter 3: Boolean and Approximate Boolean Reasoning Approaches
  1 Boolean Reasoning Methodology
    1.1 Syllogistic reasoning
    1.2 Application of the Boolean reasoning approach in AI
    1.3 Complexity of Prime Implicant Problems
    1.4 Monotone Boolean functions
  2 Approximate Boolean Reasoning
    2.1 Heuristics for prime implicant problems
    2.2 Searching for prime implicants of monotone functions
    2.3 Ten challenges in Boolean reasoning
Chapter 4: Approximate Boolean Reasoning Approach to Rough Set Theory
  1 Rough sets and the feature selection problem
    1.1 Basic types of reducts in rough set theory
    1.2 Boolean reasoning approach for the reduct problem
    1.3 Example
  2 Approximate algorithms for the reduct problem
  3 Malicious decision tables
  4 Rough sets and classification problems
    4.1 Rule-based classification approach
    4.2 Boolean reasoning approach to decision rule induction
    4.3 Rough classifier
Chapter 5: Rough Set and Boolean Reasoning Approach to Continuous Data
  1 Discretization of Real Value Attributes
    1.1 Discretization as a data transformation process
    1.2 Classification of discretization methods
    1.3 Optimal Discretization Problem
  2 Discretization method based on Rough Sets and Boolean Reasoning
    2.1 Encoding of the Optimal Discretization Problem by Boolean functions
    2.2 Discretization by reduct calculation
    2.3 Basic Maximal Discernibility heuristic
    2.4 Complexity of the MD-heuristic for discretization
  3 More complexity results
  4 Attribute reduction vs. discretization
    4.1 Boolean reasoning approach to s-OptiDisc
  5 Bibliography Notes
Chapter 6: Approximate Boolean Reasoning Approach to Decision Tree Methods
  1 Decision Tree Induction Methods
    1.1 Entropy measure
    1.2 Pruning techniques
  2 MD Algorithm
  3 Properties of the Discernibility Measure
    3.1 Searching for binary partitions of symbolic values
    3.2 Incomplete Data
    3.3 Searching for cuts on numeric attributes
    3.4 Experimental results
  4 Bibliographical Notes
    4.1 Other heuristic measures
Chapter 7: Approximate Boolean Reasoning Approach to the Feature Extraction Problem
  1 Grouping of symbolic values
    1.1 Local partition
    1.2 Divide-and-conquer approach to partition
    1.3 Global partition: a method based on the approximate Boolean reasoning approach
  2 Searching for new features defined by oblique hyperplanes
    2.1 Hyperplane Searching Methods
    2.2 Searching for an optimal set of surfaces
Chapter 8: Rough Sets and Association Analysis
  1 Approximate reducts
  2 From templates to optimal association rules
  3 Searching for Optimal Association Rules by rough set methods
    3.1 Example
    3.2 The approximate algorithms
Chapter 9: Rough Set Methods for Mining Large Data Sets
  1 Searching for reducts
  2 Induction of rough classifiers
    2.1 Induction of rough classifiers by lazy learning
    2.2 Example
  3 Searching for best cuts
    3.1 Complexity of searching for best cuts
    3.2 Efficient Algorithm
    3.3 Examples
    3.4 Local and Global Search
    3.5 Further results
    3.6 Approximation of the discernibility measure under the fully dependent assumption
    3.7 Approximate Entropy Measures
  4 Soft cuts and soft decision trees
    4.1 Discretization by Soft Cuts
    4.2 Fuzzy Set Approach
    4.3 Rough Set Approach
    4.4 Clustering Approach
    4.5 Decision Tree with Soft Cuts
References

Abstract. As Boolean algebra has a fundamental role in computer science, the Boolean reasoning approach is likewise a general methodology in Artificial Intelligence. In recent years, the Boolean reasoning approach has proved to be a powerful tool for designing effective and accurate solutions for many problems in rough set theory. This paper presents a more general approach to modern problems in rough set theory as well as to their applications in data mining. This generalized method is called the approximate Boolean reasoning (ABR) approach. We summarize some of the most recent applications of the ABR approach in the development of new efficient algorithms in rough sets and data mining.

Keywords: rough sets, data mining, Boolean reasoning.

Introduction

The concept approximation problem is one of the most important issues in machine learning and data mining. Classification, clustering, association analysis and regression are examples of well-known data mining problems that can be formulated as concept approximation problems. A great effort of many researchers has been devoted to designing newer, faster and more efficient methods for solving the concept approximation problem.

Rough set theory was introduced by [72] as a tool for concept approximation under uncertainty. The idea is to approximate the concept by two descriptive sets, called the lower and upper approximations, which must be extracted from the available training data. The main philosophy of the rough set approach to the concept approximation problem is based on minimizing the difference between the upper and lower approximations (also called the boundary region). This simple but brilliant idea leads to many efficient applications of rough sets in machine learning and data mining, such as feature selection, rule induction, discretization and classifier construction [42].
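As a small illustrative sketch (the Python function, data layout and attribute names below are ours, not part of the monograph), the lower and upper approximations of a concept can be computed from the indiscernibility classes induced by a set of attributes:

from collections import defaultdict

def approximations(universe, concept, attributes):
    """Rough lower/upper approximation of `concept` (a set of object indices).

    universe   : list of objects, each a dict mapping attribute name -> value
    attributes : attribute names defining the indiscernibility relation
    """
    # Two objects are indiscernible if they agree on all listed attributes.
    classes = defaultdict(set)
    for i, obj in enumerate(universe):
        classes[tuple(obj[a] for a in attributes)].add(i)

    lower, upper = set(), set()
    for eq_class in classes.values():
        if eq_class <= concept:      # class lies entirely inside the concept
            lower |= eq_class
        if eq_class & concept:       # class overlaps the concept
            upper |= eq_class
    return lower, upper, upper - lower   # third value: the boundary region

# Toy table (hypothetical data): four objects described by two attributes.
U = [{"a": 1, "b": 0}, {"a": 1, "b": 0}, {"a": 0, "b": 1}, {"a": 0, "b": 0}]
low, upp, boundary = approximations(U, concept={0, 2}, attributes=["a", "b"])
# low == {2}, upp == {0, 1, 2}, boundary == {0, 1}

Minimizing the size of such a boundary region is exactly the criterion referred to above.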
As Boolean algebra has a fundamental role in computer science, the Boolean reasoning approach is likewise a general methodology in Artificial Intelligence. In recent years, the Boolean reasoning approach has proved to be a powerful tool for designing effective and accurate solutions for many problems in rough set theory. This paper presents a more general approach to modern problems in rough set theory as well as to their applications in data mining. This generalized method is called the approximate Boolean reasoning (ABR) approach.

Structure of the paper

Chapter 1: Basic notions of data mining, rough set theory and the rough set methodology in data mining.
Chapter 2: Introduction to Boolean algebra and Boolean functions.
Chapter 3: The Boolean reasoning and approximate Boolean reasoning approaches to problem solving.
Chapter 4: Application of approximate Boolean reasoning (ABR) to feature selection and decision rule generation.
Chapter 5: The rough set and ABR approach to discretization.
Chapter 6: Application of ABR to decision tree induction.
Chapter 7: The ABR approach to the feature extraction problem.
Chapter 8: Rough sets, ABR and association analysis.
Chapter 9: Discretization and decision tree induction methods for large databases.
Chapter 10: Concluding remarks.

Chapter 1: Preliminaries

1 Knowledge Discovery and Data Mining

Knowledge discovery and data mining (KDD) – the rapidly growing interdisciplinary field which merges database management, statistics, machine learning and related areas – aims at extracting useful knowledge from large collections of data.

There is a difference in how the terms "knowledge discovery" and "data mining" are understood by people from the different areas contributing to this new field. In this chapter we adopt the following definitions of these terms [27]:

Knowledge discovery in databases is the process of identifying valid, novel, potentially useful, and ultimately understandable patterns/models in data. Data mining is a step in the knowledge discovery process consisting of particular data mining algorithms that, under some acceptable computational efficiency limitations, find patterns or models in data.

Therefore, the essence of KDD projects relates to interesting patterns and/or models that exist in databases but are hidden among the volumes of data. A model can be viewed as "a global representation of a structure that summarizes the systematic component underlying the data or that describes how the data may have arisen". In contrast, "a pattern is a local structure, perhaps relating to just a handful of variables and a few cases". Usually, a pattern is an expression φ in some language L describing a subset U_φ of the data U (or a model applicable to that subset). The term "pattern" goes beyond its traditional sense to include models or structures in data (relations between facts).

Data mining – an essential step in the KDD process – is responsible for algorithmic and intelligent methods for pattern (and/or model) extraction from data. Unfortunately, not every extracted pattern becomes knowledge. To specify the notion of knowledge for the needs of algorithms in KDD processes, one should define an interestingness value of patterns by combining their validity, novelty, usefulness, and simplicity. Given an interestingness function

    I_F : L → Δ_{I_F}

parameterized by a given data set F, where Δ_{I_F} ⊆ R is the domain of I_F, a pattern φ is called knowledge if, for some user-defined threshold i ∈ Δ_{I_F},

    I_F(φ) > i.
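Read operationally (a hedged sketch only: the support measure and every name below are our own illustration, not the author's definition), the condition simply compares a pattern's interestingness score, computed on the data set F, with the user threshold:

def is_knowledge(pattern, data, interestingness, threshold):
    """True if the pattern counts as knowledge on this data set.

    interestingness : a function playing the role of I_F, here taking the
                      data set explicitly as its second argument
    threshold       : the user-defined value i
    """
    return interestingness(pattern, data) > threshold

# Example: the support of an itemset as a (very simple) interestingness measure.
def support(itemset, transactions):
    return sum(itemset <= t for t in transactions) / len(transactions)

transactions = [{"bread", "milk"}, {"bread"}, {"bread", "milk", "eggs"}]
is_knowledge({"bread", "milk"}, transactions, support, threshold=0.5)   # True: 2/3 > 0.5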
A typical KDD process includes an iterative sequence of the following steps:

1. data cleaning: removing noise and irrelevant data,
2. data integration: possibly combining multiple data sources,
3. data selection: retrieving the relevant data from the database,
4. data transformation,
5. data mining,
6. pattern evaluation: identifying the truly interesting patterns representing knowledge, based on some interestingness measures, and
7. knowledge presentation: presenting the mined knowledge to the user using visualization and knowledge representation techniques.

The success of a KDD project strongly depends on the choice of proper data mining algorithms to extract from the data those patterns or models that are really interesting for the users. Up to now, no universal recipe for assigning to each data set its proper data mining solution exists. Therefore, KDD must be an iterative and interactive process, in which previous steps are repeated in interaction with users or experts in order to identify the data mining method (or combination of methods) most suitable for the studied problem.

One can characterize the existing data mining methods by their goals, functionalities and computational paradigms:

– Data mining goals: The two primary goals of data mining in practice tend to be prediction and description. Prediction involves using some variables or fields in the database to predict unknown or future values of other variables of interest. Description focuses on finding human-interpretable patterns describing the data. The relative importance of prediction and description for particular data mining applications can vary considerably.

– Data mining functionalities: Data mining can be treated as a collection of solutions for some predefined tasks. The major classes of knowledge discovery tasks, also called data mining functionalities, include the discovery of
  • concept/class descriptions,
  • associations,
  • classification,
  • prediction,
  • segmentation (clustering),
  • trend analysis, deviation analysis, and similarity analysis,
  • dependency modeling, such as graphical models or density estimation,
  • summarization, such as finding relations between fields, associations, and visualization; characterization and discrimination are also forms of data summarization.

– Data mining techniques: The type of method used to solve a task is called the data mining paradigm. Some exemplary data mining techniques (computational paradigms) are:
  • RI = rule induction;
  • DT = decision tree induction;
  • IBL = instance-based learning (e.g., nearest neighbour);
  • NN = neural networks;
  • GA = genetic algorithms/programming;
  • SVM = support vector machines;
  • etc.

Many combinations of data mining paradigm and knowledge discovery task are possible. For example, the neural network approach is applicable both to the predictive modeling task and to the segmentation task.

Any particular implementation of a data mining paradigm to solve a task is called a data mining method. Many data mining methods are derived by modifying or improving existing machine learning and pattern recognition approaches so that they can cope with large and untypical data sets. Every method in data mining is required to be accurate, efficient and scalable. For example, in prediction tasks the predictive accuracy refers to the ability of the model to correctly predict the class label of new or previously unseen data. Efficiency refers to the computational cost involved in generating and using the model. Scalability refers to the ability of the learned model to perform efficiently on large amounts of data.

For example, C5.0 is the most popular algorithm for the well-known method from machine learning and statistics called decision trees. C5.0 is quite fast at constructing decision trees from data sets of moderate size, but becomes inefficient for huge and distributed databases. Most decision tree algorithms have the restriction that the training samples should reside in main memory. In data mining applications, very large training sets of millions of samples are common; this restriction limits the scalability of such algorithms, since decision tree construction can become inefficient due to swapping of the training samples in and out of main and cache memories. SLIQ and SPRINT are examples of scalable decision tree methods in data mining.

Each data mining algorithm consists of the following components (see [27]):
1. Model Representation is the language L for describing discoverable patterns. Too limited a representation can never produce an accurate model, so the representational assumptions of a particular algorithm must be made explicit. On the other hand, more powerful representations increase the danger of overfitting and hence of poor predictive accuracy, and more complex representations increase the difficulty both of search and of model interpretation.

2. Model Evaluation estimates how well a particular pattern (a model and its parameters) meets the criteria of the KDD process. Evaluation of predictive accuracy (validity) is based on cross-validation. Evaluation of descriptive quality involves predictive accuracy, novelty, utility, and understandability of the fitted model. Both logical and statistical criteria can be used for model evaluation.

3. Search Method consists of two components:
   (a) In parameter search, the algorithm searches for the parameters which optimize the model evaluation criteria, given the observed data and a fixed model representation.
   (b) Model search occurs as a loop over the parameter search method: the model representation is changed so that a family of models is considered.

1.1 Approximate Reasoning Problem

In this section we describe the issue of approximate reasoning, which can be seen as a connection between data mining and logic. This problem occurs, e.g., during an interaction between two (human or machine) beings which use different languages to talk about objects (cases, situations, etc.) from the same universe. The intelligence skill of such beings, called intelligent agents, is measured by their ability to understand the other agents. This skill is realized in different ways, e.g., by learning or classification (in machine learning and pattern recognition theory), by adaptation (in evolutionary computation theory), or by recognition (in cognitive science).

Logic is a mathematical science that tries to model the way of human thinking and reasoning. The two main components of each logic are the logical language and the set of inference rules. Each logical language contains a set of formulas, or well-formed sentences, of the considered logic. In some logics, the meanings (semantics) of formulas are defined by sets of objects from a given universe. For a given universe U, a subset X ⊂ U is called a concept in a given language L if X can be described by a formula φ of L.

Therefore, it is natural to distinguish two basic problems in approximate reasoning, namely the approximation of unknown concepts and the approximation of reasoning schemes. By the concept approximation problem we denote the problem of searching for a description – in a predefined language L – of concepts definable in another language L*. Not every concept in L* can be exactly described in L; therefore the problem is to find an approximate rather than an exact description of the unknown concepts, and the approximation is required to be as exact as possible. In many applications, the problem is to approximate concepts that are definable either in natural language, or by an expert, or by an unknown process.

For example, let us consider the problem of automatic recognition of "overweight people" from camera pictures. This concept (in the universe of all people) is well understood in medicine and can be determined by BMI (the Body Mass Index)¹.

¹ BMI is calculated as weight in kilograms divided by the square of height in meters. In the simplest definition, people are categorized as underweight (BMI < 18.5), healthy weight (BMI ∈ [18.5, 25)), overweight (BMI ∈ [25, 30)), and obese (BMI ≥ 30.0).
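The footnote's thresholds translate directly into code; the following sketch (the function name and the sample values are ours) merely restates them:

def bmi_category(weight_kg: float, height_m: float) -> str:
    """Categorize a person by Body Mass Index, using the thresholds above."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    elif bmi < 25.0:
        return "healthy weight"
    elif bmi < 30.0:
        return "overweight"
    return "obese"

bmi_category(85.0, 1.75)   # BMI ≈ 27.8 -> "overweight"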
