Jun-Bao Li · Shu-Chuan Chu · Jeng-Shyang Pan

Kernel Learning Algorithms for Face Recognition

Jun-Bao Li
Harbin Institute of Technology
Harbin, People's Republic of China

Shu-Chuan Chu
Flinders University of South Australia
Bedford Park, SA, Australia

Jeng-Shyang Pan
HIT Shenzhen Graduate School
Harbin Institute of Technology
Shenzhen City, Guangdong Province, People's Republic of China

ISBN 978-1-4614-0160-5    ISBN 978-1-4614-0161-2 (eBook)
DOI 10.1007/978-1-4614-0161-2
Springer New York Heidelberg Dordrecht London

Library of Congress Control Number: 2013944551

© Springer Science+Business Media New York 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc.
in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Face recognition (FR) is an important research topic in the pattern recognition area and is widely applied in many fields. Learning-based FR achieves good performance, but linear learning methods are limited in extracting the features of face images, because changes of pose, illumination, and expression give the images a complicated nonlinear character. The recently proposed kernel method is regarded as an effective way to extract such nonlinear features and is widely used. Kernel learning is an important research topic in the machine learning area; many theoretical and applied results have been achieved and are widely applied in pattern recognition, data mining, computer vision, and image and signal processing. Nonlinear problems are largely solved with kernel functions, and system performance measures such as recognition accuracy and prediction accuracy are greatly increased. However, kernel learning methods still face a key problem: the selection of the kernel function and its parameters. Research has shown that the kernel function and its parameters have a direct influence on the data distribution in the nonlinear feature space, and an inappropriate selection will degrade the performance of kernel learning.
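To illustrate why kernel parameter selection matters, consider the widely used Gaussian kernel. The following minimal sketch (the kernel form is standard; the sample points are invented for illustration) shows how the width parameter reshapes the kernel matrix, and hence the implied data distribution in the feature space:

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

# Three toy samples: two nearby points and one distant point.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])

# Too small a width: every sample looks unrelated to every other
# (kernel matrix close to the identity), so structure is lost.
K_narrow = gaussian_kernel_matrix(X, sigma=0.1)

# Too large a width: every sample looks almost identical
# (kernel matrix close to all ones), so structure is also lost.
K_wide = gaussian_kernel_matrix(X, sigma=100.0)
```

Between these two extremes lies a data-dependent sweet spot, which is exactly what motivates the self-adaptive selection criteria discussed in this book.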
Research on self-adaptive learning of the kernel function and its parameters has important theoretical value for solving the kernel selection problem widely faced by kernel learning machines, and equally important practical meaning for improving kernel learning systems. The main contributions of this book are described as follows:

First, for the parameter selection problem faced by kernel learning algorithms, this book proposes a kernel optimization method with a data-dependent kernel. The definition of the data-dependent kernel is extended, and its optimal parameters are obtained by solving the optimization equation created based on the Fisher criterion and the maximum margin criterion. Two kernel optimization algorithms are evaluated and analyzed from two different views.

Second, for the problems of computation efficiency and storage space faced by kernel learning-based image feature extraction, an image matrix-based Gaussian kernel that deals directly with the images is proposed. The image matrix need not be transformed into a vector when the kernel is used in image feature extraction. Moreover, by combining the data-dependent kernel and kernel optimization, we propose an adaptive image matrix-based Gaussian kernel which not only deals directly with the image matrix but also adaptively adjusts the parameters of the kernel according to the input image matrix. This kernel can improve the performance of kernel learning-based image feature extraction.

Third, for the selection of the kernel function and its parameters faced by traditional kernel discriminant analysis, the data-dependent kernel is applied to kernel discriminant analysis. Two algorithms, FC+FC-based adaptive kernel discriminant analysis and MMC+FC-based adaptive kernel discriminant analysis, are proposed.
The algorithms are based on the idea of combining kernel optimization and a linear projection-based two-stage algorithm. They adaptively adjust the structure of the kernel according to the distribution of the input samples in the input space and optimize the mapping of sample data from the input space to the feature space. Thus the extracted features have more class discriminative ability compared with traditional kernel discriminant analysis. For the parameter selection problem faced by traditional kernel discriminant analysis, this book presents the Nonparametric Kernel Discriminant Analysis (NKDA) method, which addresses the degraded classifier performance caused by unsuitable parameter selection. For kernel function and parameter selection, kernel structure self-adaptive discriminant analysis algorithms are proposed and tested with simulations.

Fourth, the recently proposed Locality Preserving Projection (LPP) algorithm has the following problems: (1) the class label information of training samples is not used during training; (2) LPP is a linear transformation-based feature extraction method and is not able to extract nonlinear features; (3) LPP faces a parameter selection problem when it creates the nearest neighbor graph. For these problems, this book proposes a supervised kernel locality preserving projection algorithm, which applies a supervised, parameter-free method to create the nearest neighbor graph. The extracted nonlinear features have the largest class discriminative ability. The improved algorithm solves the above problems faced by LPP and enhances its performance on feature extraction.

Fifth, for the Pose, Illumination, and Expression (PIE) problems faced by image feature extraction for face recognition, three kernel learning-based face recognition algorithms are proposed.
(1) To make full use of the advantages of signal processing and learning-based methods in image feature extraction, a face image feature extraction method combining Gabor wavelets and enhanced kernel discriminant analysis is proposed. (2) The polynomial kernel is extended to a fractional power polynomial model and used for kernel discriminant analysis; a fractional power polynomial model-based kernel discriminant analysis for facial image feature extraction is proposed. (3) In order to make full use of the linear and nonlinear features of images, an adaptive fusion of PCA and KPCA for face image feature extraction is proposed.

Finally, for the training sample number and the kernel function and parameter selection problems faced by Kernel Principal Component Analysis, this book presents a one-class support vector-based Sparse Kernel Principal Component Analysis (SKPCA). Moreover, the data-dependent kernel is introduced and extended to propose the SKPCA algorithm. First, a few meaningful samples are found by solving the constrained optimization equation, and these training samples are used to compute the kernel matrix, which decreases computing time and storage space. Second, kernel optimization is applied to self-adaptively adjust the data distribution of the input samples, and the algorithm performance is improved based on the limited training samples.

The main contents of this book include Kernel Optimization, Kernel Sparse Learning, Kernel Manifold Learning, Supervised Kernel Self-Adaptive Learning, and Applications of Kernel Learning.

Kernel Optimization

This book aims to solve the parameter selection problems faced by kernel learning algorithms, and presents a kernel optimization method with a data-dependent kernel. The book extends the definition of the data-dependent kernel and applies it to kernel optimization. The optimal structure of the input data is achieved by adjusting the parameters of the data-dependent kernel for high class discriminative ability in classification tasks.
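The data-dependent kernel just described can be sketched in code. This follows the general conformal form k̃(x, y) = q(x) q(y) k(x, y) introduced by Amari and Wu, which the book's data-dependent kernel extends; the base kernel, core points, and coefficient values below are illustrative assumptions, not the book's tuned settings:

```python
import numpy as np

def base_kernel(x, y, sigma=1.0):
    """Ordinary Gaussian base kernel k(x, y)."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def conformal_factor(x, cores, alphas, a0=1.0, gamma=1.0):
    """q(x) = a0 + sum_i alpha_i * exp(-gamma * ||x - c_i||^2).
    The coefficients alphas are the parameters that kernel
    optimization tunes (e.g. via a Fisher-criterion objective)."""
    return a0 + sum(a * np.exp(-gamma * np.sum((x - c) ** 2))
                    for a, c in zip(alphas, cores))

def data_dependent_kernel(x, y, cores, alphas):
    """k_tilde(x, y) = q(x) * q(y) * k(x, y): a conformal
    transformation that expands or contracts regions of the
    feature space depending on where the data lie."""
    qx = conformal_factor(x, cores, alphas)
    qy = conformal_factor(y, cores, alphas)
    return qx * qy * base_kernel(x, y)

# With all alphas set to zero, the kernel reduces to the base kernel.
cores = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
x, y = np.array([0.5, 0.0]), np.array([0.0, 0.5])
k_plain = data_dependent_kernel(x, y, cores, alphas=[0.0, 0.0])
k_tuned = data_dependent_kernel(x, y, cores, alphas=[0.5, 0.5])
```

Because only the scalar factors q(·) change, the transformed function remains a valid (symmetric, positive semi-definite) kernel, and optimizing the alphas reshapes the data distribution in feature space without recomputing a new kernel family.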
The optimal parameters are achieved by solving the optimization equation created based on the Fisher criterion and the maximum margin criterion. Two kernel optimization algorithms are evaluated and analyzed from two different views. In practical applications such as image recognition, for the problems of computation efficiency and storage space faced by kernel learning-based image feature extraction, an image matrix-based Gaussian kernel that deals directly with the images is proposed in this book. Matrix Gaussian kernel-based kernel learning performs image feature extraction on the image matrix directly, without transforming the matrix into a vector as required by traditional kernel functions. Combining the data-dependent kernel and kernel optimization, this book presents an adaptive image matrix-based Gaussian kernel that self-adaptively adjusts the parameters of the kernel according to the input image matrix, and the performance of image-based systems is largely improved with this kernel.

Kernel Sparse Learning

For the training sample number and the kernel function and parameter selection problems faced by Kernel Principal Component Analysis, this book presents a one-class support vector-based Sparse Kernel Principal Component Analysis (SKPCA). Moreover, the data-dependent kernel is introduced and extended to propose the SKPCA algorithm. First, the few meaningful samples are found by solving the constrained optimization equation, and these training samples are used to compute the kernel matrix, which decreases computing time and storage space. Second, kernel optimization is applied to self-adaptively adjust the data distribution of the input samples, and the algorithm performance is improved based on the limited training samples.

Kernel Manifold Learning

For the nonlinear feature extraction problem faced by Locality Preserving Projection (LPP)-based manifold learning, this book proposes a supervised kernel locality preserving projection algorithm for creating the nearest neighbor graph.
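The supervised, parameter-free graph construction for LPP can be sketched as follows: instead of choosing a neighborhood size k or a radius, the adjacency is taken directly from the class labels. This is a minimal sketch under stated assumptions; the heat-kernel weighting is illustrative and not necessarily the book's exact formulation:

```python
import numpy as np

def supervised_affinity(X, labels, sigma=1.0):
    """Affinity matrix for supervised LPP: samples are connected
    if and only if they share a class label, so no neighborhood
    size k (and no k-NN search) is needed to build the graph.
    Connected pairs receive a heat-kernel weight
    exp(-||x_i - x_j||^2 / sigma)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    same_class = labels[:, None] == labels[None, :]
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(
        sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.where(same_class, np.exp(-sq_dists / sigma), 0.0)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

# Two nearby samples of class 0 and one distant sample of class 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]])
labels = np.array([0, 0, 1])
W = supervised_affinity(X, labels)
```

Within-class pairs get large weights and between-class pairs get none, so the projection learned from this graph pulls same-class samples together, which is the source of the improved class discriminative ability.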
The extracted nonlinear features have the largest class discriminative ability; the algorithm solves the above problems faced by LPP and enhances its performance on feature extraction. This book presents kernel self-adaptive manifold learning. The traditional unsupervised LPP algorithm is extended to supervised and kernelized learning. Kernel self-adaptive optimization solves the kernel function and parameter selection problems of supervised manifold learning, which improves the algorithm's performance on feature extraction and classification.

Supervised Kernel Self-Adaptive Learning

For the parameter selection problem faced by traditional kernel discriminant analysis, this book presents Nonparametric Kernel Discriminant Analysis (NKDA) to address the degraded classifier performance caused by unsuitable parameter selection. For kernel function and parameter selection, kernel structure self-adaptive discriminant analysis algorithms are proposed and tested with simulations. For the selection of the kernel function and its parameters faced by traditional kernel discriminant analysis, the data-dependent kernel is applied to kernel discriminant analysis. Two algorithms, FC+FC-based adaptive kernel discriminant analysis and MMC+FC-based adaptive kernel discriminant analysis, are proposed. The algorithms are based on the idea of combining kernel optimization and a linear projection-based two-stage algorithm. The algorithms adaptively adjust the structure of the kernel according to the distribution of the input samples in the input space and optimize the mapping of sample data from the input space to the feature space. Thus the extracted features have more class discriminative ability compared with traditional kernel discriminant analysis.

Acknowledgements

This work is supported by the National Science Foundation of China under Grant No. 61001165, the HIT Young Scholar Foundation of the 985 Project, and the Fundamental Research Funds for the Central Universities, Grant No. HIT.BRETIII.201206.

Contents

1 Introduction ..... 1
  1.1 Basic Concept ..... 1
    1.1.1 Supervised Learning ..... 1
    1.1.2 Unsupervised Learning ..... 2
    1.1.3 Semi-Supervised Algorithms ..... 3
  1.2 Kernel Learning ..... 3
    1.2.1 Kernel Definition ..... 3
    1.2.2 Kernel Character ..... 4
  1.3 Current Research Status ..... 6
    1.3.1 Kernel Classification ..... 7
    1.3.2 Kernel Clustering ..... 7
    1.3.3 Kernel Feature Extraction ..... 8
    1.3.4 Kernel Neural Network ..... 9
    1.3.5 Kernel Application ..... 9
  1.4 Problems and Contributions ..... 9
  1.5 Contents of This Book ..... 11
  References ..... 13

2 Statistical Learning-Based Face Recognition ..... 19
  2.1 Introduction ..... 19
  2.2 Face Recognition: Sensory Inputs ..... 20
    2.2.1 Image-Based Face Recognition ..... 20
    2.2.2 Video-Based Face Recognition ..... 22
    2.2.3 3D-Based Face Recognition ..... 23
    2.2.4 Hyperspectral Image-Based Face Recognition ..... 24
  2.3 Face Recognition: Methods ..... 26
    2.3.1 Signal Processing-Based Face Recognition ..... 26
    2.3.2 A Single Training Image Per Person Algorithm ..... 27
  2.4 Statistical Learning-Based Face Recognition ..... 33
    2.4.1 Manifold Learning-Based Face Recognition ..... 34
    2.4.2 Kernel Learning-Based Face Recognition ..... 36
  2.5 Face Recognition: Application Conditions ..... 37
  References ..... 40