Jun-Bao Li · Shu-Chuan Chu · Jeng-Shyang Pan

Kernel Learning Algorithms for Face Recognition
Jun-Bao Li
Harbin Institute of Technology
Harbin, People's Republic of China

Jeng-Shyang Pan
HIT Shenzhen Graduate School
Harbin Institute of Technology
Shenzhen City, Guangdong Province
People's Republic of China

Shu-Chuan Chu
Flinders University of South Australia
Bedford Park, SA, Australia
ISBN 978-1-4614-0160-5        ISBN 978-1-4614-0161-2 (eBook)
DOI 10.1007/978-1-4614-0161-2
Springer New York Heidelberg Dordrecht London

Library of Congress Control Number: 2013944551

© Springer Science+Business Media New York 2014
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Preface
Face recognition (FR) is an important research topic in pattern recognition and is widely applied in many areas. Learning-based FR achieves good performance, but linear learning methods are limited in extracting the features of face images, because changes of pose, illumination, and expression cause the images to exhibit complicated nonlinear characteristics. The recently proposed kernel method is regarded as an effective way to extract nonlinear features and is widely used. Kernel learning is an important research topic in machine learning; both theoretical and applied results have been achieved and are widely applied in pattern recognition, data mining, computer vision, and image and signal processing. Nonlinear problems are largely solved with kernel functions, and system performance, such as recognition accuracy and prediction accuracy, is greatly increased. However, kernel learning methods still suffer from a key problem: the selection of the kernel function and its parameters. Research has shown that the kernel function and its parameters have a direct influence on the data distribution in the nonlinear feature space, and that inappropriate selection degrades the performance of kernel learning. Research on self-adaptive learning of the kernel function and its parameters therefore has important theoretical value for solving the kernel selection problem widely faced by kernel learning machines, and equally important practical significance for improving kernel learning systems.
The main contributions of this book are described as follows:
First, to address the parameter selection problem faced by kernel learning algorithms, this book proposes a kernel optimization method based on the data-dependent kernel. The definition of the data-dependent kernel is extended, and its optimal parameters are obtained by solving an optimization equation based on the Fisher criterion and the maximum margin criterion. Two kernel optimization algorithms are evaluated and analyzed from two different viewpoints.
Second, to address the computational efficiency and storage space problems faced by kernel learning-based image feature extraction, an image matrix-based Gaussian kernel that deals directly with images is proposed. The image matrix need not be transformed into a vector when the kernel is used in image feature extraction. Moreover, by combining the data-dependent kernel with kernel optimization, we propose an adaptive image matrix-based Gaussian kernel that not only deals directly with the image matrix but also adaptively adjusts the kernel parameters according to the input image matrix. This kernel improves the performance of kernel learning-based image feature extraction.
Third, to address the selection of the kernel function and its parameters faced by traditional kernel discriminant analysis, the data-dependent kernel is applied to kernel discriminant analysis. Two algorithms, FC+FC-based adaptive kernel discriminant analysis and MMC+FC-based adaptive kernel discriminant analysis, are proposed. The algorithms are based on the idea of combining kernel optimization with a linear projection-based two-stage algorithm. They adaptively adjust the structure of the kernel according to the distribution of the input samples in the input space and optimize the mapping of sample data from the input space to the feature space; the extracted features therefore have greater class discriminative ability than those of traditional kernel discriminant analysis. Regarding the parameter selection problem of traditional kernel discriminant analysis, this book presents the Nonparametric Kernel Discriminant Analysis (NKDA) method, which addresses the degraded classifier performance caused by unsuitable parameter selection. Regarding kernel function and parameter selection, kernel structure self-adaptive discriminant analysis algorithms are proposed and tested with simulations.
Fourth, the recently proposed Locality Preserving Projection (LPP) algorithm suffers from several problems: (1) the class label information of the training samples is not used during training; (2) LPP is a linear transformation-based feature extraction method and cannot extract nonlinear features; (3) LPP faces a parameter selection problem when it creates the nearest neighbor graph. To address these problems, this book proposes a supervised kernel locality preserving projection algorithm, which applies a supervised, parameter-free method to create the nearest neighbor graph. The extracted nonlinear features have the largest class discriminative ability. The improved algorithm solves the above problems of LPP and enhances its feature extraction performance.
Fifth, to address the Pose, Illumination, and Expression (PIE) problems faced by image feature extraction for face recognition, three kernel learning-based face recognition algorithms are proposed. (1) To make full use of the advantages of signal processing and learning-based methods in image feature extraction, a face image feature extraction method combining the Gabor wavelet with enhanced kernel discriminant analysis is proposed. (2) The polynomial kernel is extended to a fractional power polynomial model and used for kernel discriminant analysis; a fractional power polynomial model-based kernel discriminant analysis for facial image feature extraction is proposed. (3) To make full use of both the linear and nonlinear features of images, an adaptive fusion of PCA and KPCA for face image feature extraction is proposed.
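The fractional power polynomial model in (2) can be sketched as follows; the exponent `d` in (0, 1) replaces the usual integer polynomial degree, and the sign convention here is an illustrative assumption rather than the book's exact formulation:

```python
import numpy as np

def frac_poly_kernel(x, y, d=0.8):
    """Fractional power polynomial model: k(x, y) = (x . y)^d with 0 < d < 1.
    The sign of the inner product is preserved so the kernel remains defined
    when x . y is negative; a hypothetical sketch, not the book's exact code."""
    s = float(np.dot(x, y))
    return np.sign(s) * abs(s) ** d
```

For face images with non-negative pixel intensities the inner product is non-negative, so the sign factor is then redundant.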
Finally, regarding the training sample number and the kernel function and parameter selection problems faced by Kernel Principal Component Analysis, this book presents a one-class support vector-based Sparse Kernel Principal Component Analysis (SKPCA). Moreover, the data-dependent kernel is introduced and extended to propose the SKPCA algorithm. First, a few meaningful samples are found by solving a constrained optimization equation, and these training samples are used to compute the kernel matrix, which decreases computing time and storage space. Second, kernel optimization is applied to self-adaptively adjust the data distribution of the input samples, and the algorithm's performance is improved despite the limited training samples.
The main contents of this book include Kernel Optimization, Kernel Sparse
Learning, Kernel Manifold Learning, Supervised Kernel Self-adaptive Learning,
and Applications of Kernel Learning.
Kernel Optimization
This book aims to solve the parameter selection problems faced by kernel learning algorithms and presents a kernel optimization method based on the data-dependent kernel. The book extends the definition of the data-dependent kernel and applies it to kernel optimization. The optimal structure of the input data is achieved by adjusting the parameters of the data-dependent kernel to obtain high class discriminative ability for classification tasks. The optimal parameters are obtained by solving an optimization equation based on the Fisher criterion and the maximum margin criterion. Two kernel optimization algorithms are evaluated and analyzed from two different viewpoints. For practical applications such as image recognition, and to address the computational efficiency and storage space problems of kernel learning-based image feature extraction, this book proposes an image matrix-based Gaussian kernel that deals directly with images. Matrix Gaussian kernel-based kernel learning performs image feature extraction on the image matrix directly, without transforming the matrix into a vector as traditional kernel functions require. Combining the data-dependent kernel with kernel optimization, this book presents an adaptive image matrix-based Gaussian kernel that self-adaptively adjusts the kernel parameters according to the input image matrix; the performance of image-based systems is greatly improved with this kernel.
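The data-dependent kernel underlying this optimization is commonly written in the conformal form k'(x, y) = q(x) q(y) k(x, y). A minimal sketch follows; the expansion vectors `Z`, the coefficients `alpha`, and both kernel widths are illustrative assumptions (the book obtains the coefficients by optimizing the Fisher or maximum margin criterion, which this sketch does not implement):

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=0.5):
    """Base Gaussian kernel matrix between row-sample arrays X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def data_dependent_kernel(X, Y, Z, alpha, gamma=0.5, gamma1=0.1):
    """Conformal data-dependent kernel k'(x, y) = q(x) q(y) k(x, y), with
    q(x) = 1 + sum_i alpha_i * exp(-gamma1 * ||x - z_i||^2).
    Z holds the expansion vectors and alpha their combination coefficients;
    here they are fixed inputs, purely to illustrate the kernel's form."""
    qX = 1.0 + gaussian_kernel(X, Z, gamma1) @ alpha
    qY = 1.0 + gaussian_kernel(Y, Z, gamma1) @ alpha
    return qX[:, None] * qY[None, :] * gaussian_kernel(X, Y, gamma)
```

Because q(x) only rescales the feature mapping conformally, the modified kernel remains symmetric and positive semi-definite whenever the base kernel is.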
Kernel Sparse Learning
Regarding the training sample number and the kernel function and parameter selection problems faced by Kernel Principal Component Analysis, this book presents a one-class support vector-based Sparse Kernel Principal Component Analysis (SKPCA). Moreover, the data-dependent kernel is introduced and extended to propose the SKPCA algorithm. First, a few meaningful samples are found by solving a constrained optimization equation, and these training samples are used to compute the kernel matrix, which decreases computing time and storage space. Second, kernel optimization is applied to self-adaptively adjust the data distribution of the input samples, and the algorithm's performance is improved despite the limited training samples.
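The sparsity idea can be sketched as KPCA restricted to a small support set. In the following sketch the "meaningful" samples are simply given by index, which is an assumption for illustration; the book selects them with a one-class support vector machine, which is not implemented here:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """Gaussian kernel matrix between row-sample arrays X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sparse_kpca(X, support_idx, n_components=2, gamma=0.5):
    """KPCA computed on a chosen subset of m << n support samples, so only
    an m x m kernel matrix is stored and eigendecomposed."""
    S = X[support_idx]
    m = len(support_idx)
    K = rbf(S, S, gamma)
    # Double-center the kernel matrix (centering in feature space).
    one_m = np.ones((m, m)) / m
    Kc = K - one_m @ K - K @ one_m + one_m @ K @ one_m
    w, V = np.linalg.eigh(Kc)
    order = np.argsort(w)[::-1][:n_components]
    w, V = w[order], V[:, order]
    A = V / np.sqrt(np.maximum(w, 1e-12))   # normalized dual coefficients
    # Project all samples onto the sparse principal directions,
    # applying the matching centering to the cross-kernel matrix.
    Kt = rbf(X, S, gamma)
    one_t = np.ones((X.shape[0], m)) / m
    Kt_c = Kt - one_t @ K - Kt @ one_m + one_t @ K @ one_m
    return Kt_c @ A
```

The eigendecomposition cost drops from O(n^3) to O(m^3), which is the computing-time and storage saving the text refers to.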
Kernel Manifold Learning
Regarding the nonlinear feature extraction problem faced by Locality Preserving Projection (LPP)-based manifold learning, this book proposes a supervised kernel locality preserving projection algorithm together with a supervised method for creating the nearest neighbor graph. The extracted nonlinear features have the largest class discriminative ability; the algorithm solves the above problems of LPP and enhances its feature extraction performance. This book also presents kernel self-adaptive manifold learning: the traditional unsupervised LPP algorithm is extended to supervised and kernelized learning. Kernel self-adaptive optimization solves the kernel function and parameter selection problems of supervised manifold learning, which improves the algorithm's performance in feature extraction and classification.
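One way to realize a supervised, parameter-free neighbor graph is to connect exactly the same-class training pairs, which removes the neighborhood-size parameter. The sketch below kernelizes LPP on such a graph; the graph construction, kernel choice, and ridge term are assumptions for illustration, not necessarily the book's exact algorithm:

```python
import numpy as np
from scipy.linalg import eigh

def rbf(X, Y, gamma=0.5):
    """Gaussian kernel matrix between row-sample arrays X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def supervised_kernel_lpp(X, y, n_components=2, gamma=0.5):
    """Sketch of supervised kernel LPP: adjacency from labels alone
    (W_ij = 1 for same-class pairs), then the kernelized LPP eigenproblem
    K L K a = lam K D K a, keeping the smallest eigenvalues."""
    n = len(y)
    W = (y[:, None] == y[None, :]).astype(float)   # label-based graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                      # graph Laplacian
    K = rbf(X, X, gamma)
    Areg = K @ D @ K + 1e-6 * np.eye(n)            # ridge for stability
    w, V = eigh(K @ L @ K, Areg)                   # ascending eigenvalues
    A = V[:, :n_components]
    return K @ A                                   # embedded training samples
```

Minimizing a'K L K a keeps same-class samples close in the embedding, which is the locality-preserving objective transferred to the kernel feature space.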
Supervised Kernel Self-Adaptive Learning
Regarding the parameter selection problem faced by traditional kernel discriminant analysis, this book presents Nonparametric Kernel Discriminant Analysis (NKDA) to address the degraded classifier performance caused by unsuitable parameter selection. Regarding kernel function and parameter selection, kernel structure self-adaptive discriminant analysis algorithms are proposed and tested with simulations. For the selection of the kernel function and its parameters in traditional kernel discriminant analysis, the data-dependent kernel is applied to kernel discriminant analysis. Two algorithms, FC+FC-based adaptive kernel discriminant analysis and MMC+FC-based adaptive kernel discriminant analysis, are proposed. The algorithms are based on the idea of combining kernel optimization with a linear projection-based two-stage algorithm. They adaptively adjust the structure of the kernel according to the distribution of the input samples in the input space and optimize the mapping of sample data from the input space to the feature space; the extracted features therefore have greater class discriminative ability than those of traditional kernel discriminant analysis.
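As a baseline for the adaptive variants above, a minimal two-class kernel Fisher discriminant (the Fisher criterion evaluated in the kernel feature space) might look as follows; the Gaussian kernel and the ridge term are illustrative choices, and the adaptive algorithms additionally optimize the kernel itself:

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    """Gaussian kernel matrix between row-sample arrays X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_fisher_direction(X, y, gamma=0.5, reg=1e-6):
    """Two-class kernel Fisher discriminant. Returns dual coefficients
    alpha; the projection of a sample x is sum_i alpha_i k(x_i, x)."""
    K = rbf(X, X, gamma)
    n = len(y)
    class_means = []
    N = np.zeros((n, n))                   # within-class scatter (dual form)
    for c in np.unique(y):                 # assumes exactly two classes
        Kc = K[:, y == c]                  # kernel columns of class c
        nc = Kc.shape[1]
        class_means.append(Kc.mean(axis=1))
        N += Kc @ (np.eye(nc) - np.ones((nc, nc)) / nc) @ Kc.T
    d = class_means[0] - class_means[1]
    # Fisher criterion J(alpha) = (alpha . d)^2 / (alpha' N alpha);
    # the maximizer is alpha ~ N^{-1} d (ridge-regularized for stability).
    return np.linalg.solve(N + reg * np.eye(n), d)
```

On well-separated classes, projecting the training samples with the returned coefficients yields clearly separated class means, which is the discriminative ability the text compares against.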
Acknowledgements
This work is supported by the National Science Foundation of China under Grant No. 61001165, the HIT Young Scholar Foundation of the 985 Project, and the Fundamental Research Funds for the Central Universities under Grant No. HIT.BRETIII.201206.
Contents
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Basic Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Supervised Learning. . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . 2
1.1.3 Semi-Supervised Algorithms . . . . . . . . . . . . . . . . . . 3
1.2 Kernel Learning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Kernel Definition . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Kernel Character . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Current Research Status . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.1 Kernel Classification. . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.2 Kernel Clustering. . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 Kernel Feature Extraction . . . . . . . . . . . . . . . . . . . . 8
1.3.4 Kernel Neural Network. . . . . . . . . . . . . . . . . . . . . . 9
1.3.5 Kernel Application. . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Problems and Contributions. . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5 Contents of This Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Statistical Learning-Based Face Recognition . . . . . . . . . . . . . . . . 19
2.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2 Face Recognition: Sensory Inputs. . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Image-Based Face Recognition . . . . . . . . . . . . . . . . 20
2.2.2 Video-Based Face Recognition. . . . . . . . . . . . . . . . . 22
2.2.3 3D-Based Face Recognition. . . . . . . . . . . . . . . . . . . 23
2.2.4 Hyperspectral Image-Based Face Recognition . . . . . . 24
2.3 Face Recognition: Methods . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.1 Signal Processing-Based Face Recognition . . . . . . . . 26
2.3.2 A Single Training Image Per Person Algorithm. . . . . 27
2.4 Statistical Learning-Based Face Recognition . . . . . . . . . . . . . 33
2.4.1 Manifold Learning-Based Face Recognition. . . . . . . . 34
2.4.2 Kernel Learning-Based Face Recognition . . . . . . . . . 36
2.5 Face Recognition: Application Conditions. . . . . . . . . . . . . . . 37
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40