Face Recognition with Support Vector Machines: Global versus Component-based Approach

Bernd Heisele*   Purdy Ho†   Tomaso Poggio
Massachusetts Institute of Technology
Center for Biological and Computational Learning
Cambridge, MA 02142
[email protected]   [email protected]   [email protected]

* with Honda Research Laboratory, Boston, MA, from April 2001
† with Hewlett-Packard, Palo Alto, CA, from July 2001

Abstract

We present a component-based method and two global methods for face recognition and evaluate them with respect to robustness against pose changes. In the component system we first locate facial components, extract them and combine them into a single feature vector which is classified by a Support Vector Machine (SVM). The two global systems recognize faces by classifying a single feature vector consisting of the gray values of the whole face image. In the first global system we trained a single SVM classifier for each person in the database. The second system consists of sets of viewpoint-specific SVM classifiers and involves clustering during training. We performed extensive tests on a database which included faces rotated up to about 40° in depth. The component system clearly outperformed both global systems on all tests.

1. Introduction

Over the past 20 years numerous face recognition papers have been published in the computer vision community; a survey can be found in [4]. The number of real-world applications (e.g. surveillance, secure access, human/computer interface) and the availability of cheap and powerful hardware also led to the development of commercial face recognition systems. Despite the success of some of these systems in constrained scenarios, the general task of face recognition still poses a number of challenges with respect to changes in illumination, facial expression, and pose.

In the following we give a brief overview of face recognition methods. Focusing on the aspect of pose invariance we divide face recognition techniques into two categories: i) global approach and ii) component-based approach.

i) In this category a single feature vector that represents the whole face image is used as input to a classifier. Several classifiers have been proposed in the literature, e.g. minimum distance classification in the eigenspace [18, 20], Fisher's discriminant analysis [1], and neural networks [6]. Global techniques work well for classifying frontal views of faces. However, they are not robust against pose changes since global features are highly sensitive to translation and rotation of the face. To avoid this problem an alignment stage can be added before classifying the face. Aligning an input face image with a reference face image requires computing correspondences between the two face images. The correspondences are usually determined for a small number of prominent points in the face like the center of the eye, the nostrils, or the corners of the mouth. Based on these correspondences the input face image can be warped to a reference face image. In [12] an affine transformation is computed to perform the warping. Active shape models are used in [10] to align input faces with model faces. A semi-automatic alignment step in combination with SVM classification was proposed in [9].

ii) An alternative to the global approaches is to classify local facial components. The main idea of component-based recognition is to compensate for pose changes by allowing a flexible geometrical relation between the components in the classification stage. In [3] face recognition was performed by independently matching templates of three facial regions (both eyes, nose and mouth). The configuration of the components during classification was unconstrained since the system did not include a geometrical model of the face. A similar approach with an additional alignment stage was proposed in [2]. In [23] a geometrical model of a face was implemented by a 2-D elastic graph. The recognition was based on wavelet coefficients that were computed on the nodes of the elastic graph. In [14] a window was shifted
over the face image and the DCT coefficients computed within the window were fed into a 2-D Hidden Markov Model.

We present two global approaches and a component-based approach to face recognition and evaluate their robustness against pose changes. The first global method consists of a face detector which extracts the face part from an image and propagates it to a set of SVM classifiers that perform the face recognition. By using a face detector we achieve translation and scale invariance. In the second global method we split the images of each person into viewpoint-specific clusters. We then train SVM classifiers on each single cluster. In contrast to the global methods, the component system uses a face detector that detects and extracts local components of the face. The detector consists of a set of SVM classifiers that locate facial components and a single geometrical classifier that checks if the configuration of the components matches a learned geometrical face model. The detected components are extracted from the image, normalized in size and fed into a set of SVM classifiers.

The outline of the paper is as follows: Chapter 2 gives a brief overview of SVM learning and of strategies for multi-class classification with SVMs. In Chapter 3 we describe the two global methods for face recognition. Chapter 4 is about the component-based system. Chapter 5 contains experimental results and a comparison between the global and component systems. Chapter 6 concludes the paper.

2. Support Vector Machine Classification

We first explain the basics of SVMs for binary classification [21]. Then we discuss how this technique can be extended to deal with general multi-class classification problems.

2.1. Binary Classification

SVMs belong to the class of maximum margin classifiers. They perform pattern recognition between two classes by finding a decision surface that has maximum distance to the closest points in the training set which are termed support vectors. We start with a training set of points $x_i \in \mathbb{R}^n$, $i = 1, \ldots, \ell$, where each point $x_i$ belongs to one of two classes identified by the label $y_i \in \{-1, 1\}$. Assuming linearly separable data¹, the goal of maximum margin classification is to separate the two classes by a hyperplane such that the distance to the support vectors is maximized. This hyperplane is called the optimal separating hyperplane (OSH). The OSH has the form:

$$f(x) = \sum_{i=1}^{\ell} \alpha_i y_i \, x_i \cdot x + b \tag{1}$$

¹ For the non-separable case the reader is referred to [21].

The coefficients $\alpha_i$ and the $b$ in Eq. (1) are the solutions of a quadratic programming problem [21]. Classification of a new data point $x$ is performed by computing the sign of the right side of Eq. (1). In the following we will use

$$d(x) = \frac{\sum_{i=1}^{\ell} \alpha_i y_i \, x_i \cdot x + b}{\left\| \sum_{i=1}^{\ell} \alpha_i y_i x_i \right\|} \tag{2}$$

to perform multi-class classification. The sign of $d$ is the classification result for $x$, and $|d|$ is the distance from $x$ to the hyperplane. Intuitively, the farther away a point is from the decision surface, i.e. the larger $|d|$, the more reliable the classification result.

The entire construction can be extended to the case of nonlinear separating surfaces. Each point $x$ in the input space is mapped to a point $z = \Phi(x)$ of a higher dimensional space, called the feature space, where the data are separated by a hyperplane. The key property in this construction is that the mapping $\Phi(\cdot)$ is subject to the condition that the dot product of two points in the feature space $\Phi(x) \cdot \Phi(y)$ can be rewritten as a kernel function $K(x, y)$. The decision surface has the equation:

$$f(x) = \sum_{i=1}^{\ell} \alpha_i y_i K(x, x_i) + b$$

again, the coefficients $\alpha_i$ and $b$ are the solutions of a quadratic programming problem. Note that $f(x)$ does not depend on the dimensionality of the feature space.

An important family of kernel functions is the polynomial kernel:

$$K(x, y) = (1 + x \cdot y)^d$$

where $d$ is the degree of the polynomial. In this case the components of the mapping $\Phi(x)$ are all the possible monomials of input components up to the degree $d$.

2.2. Multi-class classification

There are two basic strategies for solving $q$-class problems with SVMs:

i) In the one-vs-all approach $q$ SVMs are trained. Each of the SVMs separates a single class from all remaining classes [5, 17].

ii) In the pairwise approach $q(q-1)/2$ machines are trained. Each SVM separates a pair of classes. The pairwise classifiers are arranged in trees, where each tree node represents an SVM. A bottom-up tree similar to the elimination tree used in tennis tournaments was originally proposed in [16] for recognition of 3-D objects and was applied to face recognition in [7]. A top-down tree structure has been recently published in [15].

There is no theoretical analysis of the two strategies with respect to classification performance. Regarding the training effort, the one-vs-all approach is preferable since only $q$ SVMs have to be trained compared to $q(q-1)/2$ SVMs in the pairwise approach.
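To make the training-effort comparison and the normalized distance of Eq. (2) concrete, here is a minimal Python sketch. It is an illustration only, not the implementation used in the paper: the function names `num_classifiers` and `signed_distance`, as well as the toy support vectors, coefficients and offset, are assumptions made for this example, and only the linear (dot-product) case is covered.

```python
import math


def num_classifiers(q):
    """Return (one_vs_all, pairwise) SVM counts for a q-class problem."""
    if q < 2:
        raise ValueError("need at least two classes")
    return q, q * (q - 1) // 2


def signed_distance(x, support_vectors, alphas, labels, b):
    """Normalized output of a linear SVM as in Eq. (2).

    The numerator is f(x) from Eq. (1); the denominator is the norm of
    w = sum_i alpha_i * y_i * x_i.
    """
    dim = len(x)
    # Weight vector w assembled from the support vectors.
    w = [
        sum(a * y * sv[k] for a, y, sv in zip(alphas, labels, support_vectors))
        for k in range(dim)
    ]
    f = sum(wk * xk for wk, xk in zip(w, x)) + b
    return f / math.sqrt(sum(wk * wk for wk in w))


if __name__ == "__main__":
    # Training effort for a few (hypothetical) database sizes.
    for q in (5, 50, 500):
        one_vs_all, pairwise = num_classifiers(q)
        print(f"q={q}: one-vs-all trains {one_vs_all}, pairwise trains {pairwise}")

    # Toy hyperplane x[0] = 0 built from two hypothetical support vectors.
    d = signed_distance(
        x=[3.0, 0.0],
        support_vectors=[[1.0, 0.0], [-1.0, 0.0]],
        alphas=[0.5, 0.5],
        labels=[1, -1],
        b=0.0,
    )
    print("signed distance:", d)
```

For five classes the two strategies differ only by a factor of two (5 versus 10 machines), but the pairwise count grows quadratically with the number of classes, which is the concern when the number of people in the database is large.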
The run-time complexity of the two strategies is similar: The one-vs-all approach requires the evaluation of $q$, the pairwise approach the evaluation of $q-1$ SVMs. Recent experiments on person recognition show similar classification performances for the two strategies [13]. Since the number of classes in face recognition can be rather large we opted for the one-vs-all strategy where the number of SVMs is linear with the number of classes.

3. Global Approach

Both global systems described in this paper consist of a face detection stage where the face is detected and extracted from an input image and a recognition stage where the person's identity is established.

3.1. Face detection

We developed a face detector similar to the one described in [8]. In order to detect faces at different scales we first computed a resolution pyramid for the input image and then shifted a 58×58 window over each image in the pyramid. We applied two preprocessing steps to the gray images to compensate for certain sources of image variations [19]. A best-fit intensity plane was subtracted from the gray values to compensate for cast shadows. Then histogram equalization was applied to remove variations in the image brightness and contrast. The resulting gray values were normalized to be in a range between 0 and 1 and were used as input features to a linear SVM classifier. Some detection results are shown in Fig. 1.

Figure 1. The upper two rows are example images from our training set. The lower two rows show the image parts extracted by the SVM face detector.

The training data for the face detector were generated by rendering seven textured 3-D head models [22]. The heads were rotated between −30° and 30° in depth and illuminated by ambient light and a single directional light pointing towards the center of the face. We generated 3,590 face images of size 58×58 pixels. The negative training set initially consisted of 10,209 58×58 non-face patterns randomly extracted from 502 non-face images. We expanded the training set by bootstrapping [19] to 13,655 non-face patterns.

3.2. Recognition

We implemented two global recognition systems. Both systems were based on the one-vs-all strategy for SVM multi-class classification described in the previous Chapter.

The first system had a linear SVM for every person in the database. Each SVM was trained to distinguish between all images of a single person (labeled +1) and all other images in the training set (labeled −1). For both training and testing we ran the face detector on the input image to extract the face. We re-scaled the face image to 40×40 pixels and converted the gray values into a feature vector². Given a set of $q$ people and a set of $q$ SVMs, each one associated to one person, the class label $y$ of a face pattern $x$ is computed as follows:

$$y = \begin{cases} n & \text{if } d_n(x) + t > 0 \\ 0 & \text{if } d_n(x) + t \le 0 \end{cases} \tag{3}$$

with $d_n(x) = \max \{ d_j(x) \}_{j=1}^{q}$,

where $d_j(x)$ is computed according to Eq. (2) for the SVM trained to recognize person $j$. The classification threshold is denoted as $t$. The class label $y = 0$ stands for rejection.

² We applied the same preprocessing steps to generate the features as for the face detector described.

Changes in the head pose lead to strong variations in the images of a person's face. These in-class variations complicate the recognition task. That is why we developed a second method in which we split the training images of each person into clusters by a divisive cluster technique [11]. The algorithm started with an initial cluster including all face images of a person after preprocessing. The cluster with the highest variance is split into two by a hyperplane. The variance of a cluster is calculated as:

$$\sigma^2 = \min_{n} \frac{1}{N} \sum_{m=1}^{N} \| x_n - x_m \|^2$$

where $N$ is the number of faces in the cluster. After the partitioning has been performed, the face with the minimum distance to all other faces in the same cluster is chosen to be the average face of the cluster. Iterative clustering stops when a maximum number of clusters is reached³. The average faces can be arranged in a binary tree. Fig. 2 shows the result of clustering applied to the training images of a person in our database. The nodes represent the average faces; the leaves of the tree are some example faces of the final clusters. As expected divisive clustering performs a viewpoint-specific grouping of faces.

We trained a linear SVM to distinguish between all images in one cluster (labeled +1) and all images of other people in the training set (labeled −1)⁴. Classification was done according to Eq. (3) with $q$ now being the number of clusters of all people in the training set.

³ In our experiments we divided the face images of a person into four clusters.
⁴ This is not exactly a one-vs-all classifier since images of the same person but from different clusters were omitted.

Figure 2. Binary tree of face images generated by divisive clustering.

4. Component-based Approach

The global approach is highly sensitive to image variations caused by changes in the pose of the face. The component-based approach avoids this problem by independently detecting parts of the face. For small rotations, the changes in the components are relatively small compared to the changes in the whole face pattern. Changes in the 2-D locations of the components due to pose changes are accounted for by a learned, flexible face model.

4.1. Detection

We implemented a two-level component-based face detector which is described in detail in [8]. The principles of the system are illustrated in Fig. 3. On the first level, component classifiers independently detected facial components. On the second level, a geometrical configuration classifier performed the final face detection by combining the results of the component classifiers. Given a 58×58 window, the maximum continuous outputs of the component classifiers within rectangular search regions around the expected positions of the components were used as inputs to the geometrical configuration classifier. The search regions have been calculated from the mean and standard deviation of the components' locations in the training images. We also provided the geometrical classifier with the precise positions of the detected components relative to the upper left corner of the 58×58 window. The 14 facial components used in the detection system are shown in Fig. 4 (a). The shapes and positions of the components have been automatically determined from the training data in order to provide maximum discrimination between face and non-face images; see [8] for details about the algorithm. The training set was the same as for the whole face detector described in the previous Chapter.

Figure 3. System overview of the component-based face detector using four components. On the first level, windows of the size of the components (solid lined boxes) are shifted over the face image and classified by the component classifiers. On the second level, the maximum outputs of the component classifiers within predefined search regions (dotted lined boxes) and the positions of the detected components are fed into the geometrical configuration classifier.

Figure 4. (a) shows the 14 components of our face detector. The centers of the components are marked by a white cross. The 10 components that were used for face recognition are shown in (b).

4.2. Recognition

To train the face recognizer we first ran the component-based detector over each image in the training set and extracted the components. From the 14 original components we kept 10 for face recognition, removing those that either contained few gray value structures (e.g. cheeks) or strongly overlapped with other components. The 10 selected components are shown in Fig. 4 (b). Examples of the component-based face detector applied to images of the training set are shown in Fig. 5. To generate the input to our face recognition classifier we normalized each of the components in size and combined their gray values into a single feature vector⁵. As for the first global system we used a one-vs-all approach with a linear SVM for every person in the database. The classification result was determined according to Eq. (3).

⁵ Before extracting the components we applied the same preprocessing steps to the detected face image as in the global systems.

Figure 5. Examples of component-based face detection. Shown are face parts covered by the 10 components that were used for face recognition.

5. Experiments

The training data for the face recognition system were recorded with a digital video camera at a frame rate of about 5 Hz. The training set consisted of 8,593 gray face images of five subjects from which 1,383 were frontal views. The resolution of the face images ranged between 80×80 and 130×130 pixels with rotations in azimuth up to about ±40°. The test set was recorded with the same camera but on a separate day and under different illumination and with different background. The set included 974 images of all five subjects in our database. The rotations in depth were again up to about ±40°.

Two experiments were carried out. In the first experiment we trained on all 8,593 rotated and frontal face images in the training set and tested on the whole test set. This experiment contained four different tests: Global approach using one linear SVM classifier for each person, using one linear SVM classifier for each cluster, using one second-degree polynomial SVM classifier for each person, and component-based approach using one linear SVM classifier for each person.

Figure 6. ROC curves when trained and tested on frontal and rotated faces.

In the second experiment we trained only on the 1,383 frontal face images in the training set but tested on the whole test set. This experiment contained three different tests: Global approach using one linear SVM classifier for each person, using one linear SVM classifier for each cluster, and component-based approach using one linear SVM classifier for each person.

Figure 7. ROC curves when trained on frontal faces and tested on frontal and rotated faces.

The ROC curves of these two experiments are shown in Fig. 6 and Fig. 7, respectively. Each point on the ROC curve corresponds to a different value of the classification threshold $t$ from Eq. (3). At the end points of the ROC curves the rejection rate is 0. Some results of the component-based recognition system are shown in Fig. 8.

Figure 8. Examples of component-based face recognition. The first 3 rows and the first image in the last row show correct identification. The last two images in the bottom row show misclassifications due to strong rotation and facial expression.

There are three interesting observations:

• In both experiments the component system clearly outperformed the global systems. This held even though the face classifier itself (5 linear SVMs) was less powerful than the classifiers used in the global methods (5 non-linear SVMs in the global method without clustering, and 20 linear SVMs in the method with clustering).

• Involving clustering led to a significant improvement of the global method when the training set included rotated faces. This is because clustering generates viewpoint-specific clusters that have smaller in-class variations than the whole set of images of a person. The global method with clustering and linear SVMs was also superior to the global system without clustering and a non-linear SVM (see Fig. 6). This shows that a combination of weak classifiers trained on properly chosen subsets of the data can outperform a single, more powerful classifier trained on the whole data.

• Adding rotated faces to the training set improves the results of the global method with clustering and the component method. Surprisingly, the results for the global method without clustering got worse. This indicates that the problem of classifying faces of one person over a large range of views is too complex for a linear classifier. Indeed, the performance significantly improved when using non-linear SVMs with second-degree polynomial kernel.

6. Conclusion

We presented a component-based technique and two global techniques for face recognition and evaluated their performance with respect to robustness against pose changes. The component-based system detected and extracted a set of 10 facial components and arranged them in a single feature vector that was classified by linear SVMs. In both global systems we detected the whole face, extracted it from the image and used it as input to the classifiers. The first global system consisted of a single SVM for each person in the database. In the second system we clustered the database of each person and trained a set of view-specific SVM classifiers.

We tested the systems on a database which included faces rotated in depth up to about 40°. In all experiments the component-based system outperformed the global systems even though we used more powerful classifiers (i.e. non-linear instead of linear SVMs) for the global system. This shows that using facial components instead of the whole face pattern as input features significantly simplifies the task of face recognition.

Acknowledgements

The authors would like to thank V. Blanz, T. Serre, and V. Schmid for their help. The research was partially sponsored by the DARPA HID program. Additional support was provided by a grant from Office of Naval Research under contract No. N00014-93-1-3085, Office of Naval Research under contract No. N00014-00-1-0907, National Science Foundation under contract No. IIS-0085836, National Science Foundation under contract No. IIS-9800032, and National Science Foundation under contract No. DMS-9872936, AT&T, Central Research Institute of Electric Power Industry, Eastman Kodak Company, DaimlerChrysler, Digital Equipment Corporation, Honda R&D Co., Ltd., Merrill Lynch, NEC Fund, Nippon Telegraph & Telephone, Siemens Corporate Research, Inc., and Whitaker Foundation.

References

[1] P. Belhumeur, P. Hespanha, and D. Kriegman. Eigenfaces vs fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711–720, 1997.
[2] D. J. Beymer. Face recognition under varying pose. A.I. Memo 1461, Center for Biological and Computational Learning, M.I.T., Cambridge, MA, 1993.
[3] R. Brunelli and T. Poggio. Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(10):1042–1052, 1993.
[4] R. Chellapa, C. Wilson, and S. Sirohey. Human and machine recognition of faces: a survey. Proceedings of the IEEE, 83(5):705–741, 1995.
[5] C. Cortes and V. Vapnik. Support vector networks. Machine Learning, 20:1–25, 1995.
[6] M. Fleming and G. Cottrell. Categorization of faces using unsupervised feature extraction. In Proc. IEEE IJCNN International Joint Conference on Neural Networks, pages 65–70, 90.
[7] G. Guodong, S. Li, and C. Kapluk. Face recognition by support vector machines. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 196–201, 2000.
[8] B. Heisele, T. Poggio, and M. Pontil. Face detection in still gray images. AI Memo 1687, Center for Biological and Computational Learning, MIT, Cambridge, MA, 2000.
[9] K. Jonsson, J. Matas, J. Kittler, and Y. Li. Learning support vectors for face verification and recognition. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 208–213, 2000.
[10] A. Lanitis, C. Taylor, and T. Cootes. Automatic interpretation and coding of face images using flexible models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):743–756, 1997.
[11] Y. Linde, A. Buzo, and R. Gray. An algorithm for vector quantizer design. IEEE Transactions on Communications, 28(1):84–95, 1980.
[12] B. Moghaddam, W. Wahid, and A. Pentland. Beyond eigenfaces: probabilistic matching for face recognition. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 30–35, 1998.
[13] C. Nakajima, M. Pontil, B. Heisele, and T. Poggio. Person recognition in image sequences: The MIT espresso machine system. Submitted to IEEE Transactions on Neural Networks, 2000.
[14] A. Nefian and M. Hayes. An embedded HMM-based approach for face detection and recognition. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 6, pages 3553–3556, 1999.
[15] J. Platt, N. Cristianini, and J. Shawe-Taylor. Large margin DAGs for multiclass classification. Advances in Neural Information Processing Systems, 2000.
[16] M. Pontil and A. Verri. Support vector machines for 3-d object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 637–646, 1998.
[17] B. Schölkopf, C. Burges, and V. Vapnik. Extracting support data for a given task. In U. Fayyad and R. Uthurusamy, editors, Proceedings of the First International Conference on Knowledge Discovery and Data Mining, Menlo Park, CA, 1995. AAAI Press.
[18] L. Sirovitch and M. Kirby. Low-dimensional procedure for the characterization of human faces. Journal of the Optical Society of America A, 2:519–524, 1987.
[19] K.-K. Sung. Learning and Example Selection for Object and Pattern Recognition. PhD thesis, MIT, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Cambridge, MA, 1996.
[20] M. Turk and A. Pentland. Face recognition using eigenfaces. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, pages 586–591, 1991.
[21] V. Vapnik. Statistical learning theory. John Wiley and Sons, New York, 1998.
[22] T. Vetter. Synthesis of novel views from a single face. International Journal of Computer Vision, 28(2):103–116, 1998.
[23] L. Wiskott, J.-M. Fellous, N. Krüger, and C. von der Malsburg. Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):775–779, 1997.