SUBMITTEDTOIEEETRANS.M.M. 1 Tri-Subject Kinship Verification: Understanding the Core of A Family Xiaoqian Qin, Xiaoyang Tan, and Songcan Chen Abstract—One major challenge in computer vision is to go fairly well - the best result on the challenging LFW (labeled beyond the modeling of individual objects and to investigate the face in the wild) face verification database has reached an bi- (one-versus-one) or tri- (one-versus-two) relationship among accuracy as high as 99.15% [11] - even better than what multiple visual entities, answering such questions as whether a can be done by a human being. However, extending those 5 child in a photo belongs to given parents. The child-parents 1 relationship plays a core role in a family and understanding techniques to characterise the complex relationship among 0 such kin relationship would have fundamental impact on the multiple entities is not trivial. One major reason is due to 2 behavior of an artificial intelligent agent working in the human the fact the appearance gap encountered in a kinship problem l world. In this work, we tackle the problem of one-versus-two is much larger than that in a conventional face recognition u (tri-subject) kinship verification and our contributions are three setting (e.g, given two face images with different sex and J folds: 1) a novel relative symmetric bilinear model (RSBM) introduced to model the similarity between the child and the different ages, verify whether those two subjects are father 1 parents, by incorporating the prior knowledge that a child may and daughter). 2 resemble a particular parent more than the other; 2) a spatially In this sense, kinship learning is a step towards such a voted method for feature selection, which jointly selects the ] trend to capture mutual information among different visual V most discriminative features for the child-parents pair, while entities, particularly multiple face images. Most of current taking local spatial information into account; 3) a large scale C tri-subject kinship database characterized by over 1,000 child- researches[12][13][14][15][16][17],however,mainlyfocuson . parentsfamilies.ExtensiveexperimentsonKinFaceW,Family101 the kinship involving only two subjects (one-versus-one) such s c andournewlyreleasedkinshipdatabaseshowthattheproposed as father-son or mother-daughter, while in practice, kin rela- [ method outperforms several previous state of the art methods, tionship involving more subjects are desirable, for example, whilecouldalsobeusedtosignificantlyboosttheperformanceof 3 one-versus-one kinship verification when the information about in the problem of finding missing children, usually we have v both parents are available. the photos of both parents, and there is no reason preventing 5 us from using images of both parents at the same time for 5 Index Terms—Kinship verification, tri-subject relationship, more effective kinship verification. As another application 5 feature selection. scenario of law enforcement, it would be beneficial to match 2 0 theimageofacriminalsuspectwiththoseofhis/herparentsto I. INTRODUCTION . improve the performance of suspect searching. Motivated by 1 Kinship verification from facial images is an emerging this, [18] assembled a family database containing 45 families 0 problem in computer vision. From an aspect of face recog- 5 with an average of 120 near frontal facial samples per family. 1 nition, kinship provides us with a valuable and operational Fang et. al. [19] collected the Family101 kinship dataset, v: opportunity to construct useful relationship between persons containing14,816faceimagesfrom206nuclearfamilies.Both basedontheirvisualsignals,thusdeepeningourunderstanding i [18] and [19] ask questions concerning more general family X on their semantics. Applications of kin relationships include membership (one-versus-multi) beyond father and son. r faceimageretrieval[1][2][3]/annotation[4][5]/organization, In this paper we focus on the problem of tri-subject (one- a increasingface recognitionrates [6][7],social mediaanalysis versus-two) kinship learning (i.e., son-parents and daughter- [8] [9], finding of missing children, children adoptions [10], parents). This is an important special case of the more ambi- and so on. tious one-versus-multi verification and is largely overlooked Besides its wide applications, kinship learning is also moti- in literatures. The child-parents is the core and the most vated by the long-term goal of computer vision to go beyond basic unit formed in a family and understanding such kind the understanding of a single visual entity (e.g., “whose face of kin relationship would have fundamental impact on the is this?”) and to investigate the bi- or tri- relationship among behavior of an artificial intelligent agent working in a human multiple visual entities, e.g., answering such questions as world. Furthermore, compared to the problem of one-versus- whether a child in a photo belongs to given parents. Actu- multikinshipverification,theone-vs-twoverificationisamore ally, recent research has demonstrated that computer vision convenient and more practical choice - not only because its algorithmshavebeenabletounderstandindividualfaceimage scope is more controllable, but also because the problem by itself is easier to define since otherwise it could be difficult Copyright(c)2013IEEE Xiaoqian Qin, Xiaoyang Tan and Songcan Chen is with the Department to determine kinship relations in a big family genetically and ofComputerScienceandTechnology,NanjingUniversityofAeronauticsand without ambiguity, especially for those people among whom Astronautics,NanjingChina.Correspondingauthor:[email protected]. the kinship ties are weak. XiaoqianQiniswithHuaiyinNormalUniversity. Manuscriptreceivedxxxxx,2015;revisedxxxxx,2015. To address this problem of tri-subject kinship verification, SUBMITTEDTOIEEETRANS.M.M. 2 the key idea of our method is to fully exploit the dependence mouth and nose with spatial Gaussian kernels. In [24], a structure between child and parents in a few aspects: sim- spatial pyramid learning-based feature descriptor is utilized to ilarity measure, feature selection and classifier design. This represent kinship faces. In [25], a gated autoencoder method is based on the observation that compared to the case with is used to encode the resemblance between a parent and a only one image from one of the parents, images from both child, which is trained through minimizing the reconstruc- parents could provide richer information about the kinship tion error given a set of randomly sampled local patches. relation regards to a child, due to the genetic overlapping In [7], dense stereo matching is used to determine kinship between a pair of parents and their child. To this end, our similarity. Other feature sets for kinship verification include contributions are three folds. First, we use a bilinear function Gradient Orientation Pyramid (GGOP) [8], Self Similarity to model the similarity between the parents and the child, Representation (SSR) [26] and prototype-based discriminative with the dependence between them captured by a covariance- feature learning (PDFL) method [27]. Since semantic-related like matrix learnt from the data. To make this more robust, feature sets such as attributes usually show more tolerance we introduce a novel relative bilinear similarity model which to appearance changes, they are naturally used for kinship effectivelyincorporatesthepriorknowledge[20]thatchildren verification [14]. Based on the idea that people look more may resemble a particular parent more than another. like their parents when they smile, [16] proposes to describe Second, we propose a spatially voted method for feature facial dynamics and spatio-temporal appearance over smile selection,whichjointlyselectsthemostdiscriminativefeatures expression and uses these to improve the kinship verification for the child-parents pair, while taking local spatial informa- rate. tionintoaccount.Comparedtotraditionalgroup-basedfeature In [15] the authors show that combining several types selectionmethodssuchasgrouplasso,weessentiallyallowthe of middle-level features is useful. For that purpose, they featuresinawholeimagetocompetewitheachotherandthen introducedamultiviewneighborhoodrepulsedmetriclearning selectthegroupinwhichhigherportionofindividualfeatures method (MNRML) by learning a distance metric under which in the corresponding local region win. By contrast, in group the samples with a kinship relation are pulled close and lasso, features are teamed together beforehand and have to those without a kinship relation are pushed away. [17] and compete with others as a group. Our method is more flexible [28] extract multiple features to characterize face images and than the latter in the sense that it permits fine-grained control maximize the correlation of different features to exploit com- over the contribution of each feature to the establishment of plementary information for kinship verification. Another way one-vs-two kin relationships. to reduce the appearance similarity gap is to use intermediate Finally, we release a new face database specific to the tri- samples which bridge the two sides with large divergence. subject kinship problem, characterized by over 1,000 child- In [13], [29], [30] and [31], such a bridge is constructed by parents groups. State-of-the-art results are achieved using our facial images of parents at the similar ages of their children. method. Interestingly, our experimental results also show that However,itisnoteasytocollectsuchanimagesetinpractice. theaccuracyofone-vs-onebi-kinshipverificationbenefitsalot While most of the above works focus on the bi-subject byreformulatingitasaspecificcaseofone-vs-twotri-kinship (one-versus-one) kinship verification, [18] and [19] deal with verification when the information about a second parent is theone-versus-multikinshiprelation.ParticularlyGhahramani available. et al. [18] addresses the problem of family verification, i.e., This journal paper builds on the earlier conference work predicting whether a query face image has kin relation with [21]. In what follows we briefly review some of the related multiple family members, by fusing similarity of each mem- workinSection2,anddetailourproposedmethodinSection3. ber’s facial image segments. Fang et al. [19] tackle the more Our new kinship database is described in Section 4 and general family membership classification, i.e., given a query experimental results are given in Section 5. We conclude this face image, asking which family it belongs to, and they do paper in Section 6. this with a minimum sparse reconstruction method. Despite the partial success of these methods, we argue that in general itisdifficulttoestablishtherelationshipbetweenasubjectand II. RELATEDWORK some members of his/her family through the face appearance The aim of bi-subject (one-versus-one) kinship verification if the kinship ties between them is weak1. Instead we focus through computer vision is predicting whether a given pair of on the verification of the most basic unit that forms a family, images has kin relation. The research in the field of human thatis,thechild-parents(one-versus-two)relationship.Wecall visualsignalprocessing[22][23]hasprovidedstrongevidence this tri-subject kinship verification. The methods developed that facial appearance is a useful cue for genetic similarity, here can be potentially extended to handle more complex since children look more similar to their parents than other relationship by treating a family tree as an ensemble of tri- adults of the same gender. To find such distinguishable cues relationships. from facial appearance, in an early attempt, Fang et al. [12] III. TRI-SUBJECTKINSHIPVERIFICATION used various features including the skin, hair and eye color, facial structure measures and local/holistic texture. Inthissection,wepresentourmethodfortri-subjectkinship verification. Assume that we are given a set of N training Later,researchersevaluatedvarioustypesoffeaturedescrip- torsforkinshipverification.In[10],theDAISYdescriptorsare 1Forexample,itmakesnosensetoreconstructaman’sfaceimageusing adopted to facilitate local facial patches matching for eyes, hisfather-in-law’s. SUBMITTEDTOIEEETRANS.M.M. 3 Verification y=1 or not between a parent and a child, to be learnt from the training data. f(xf, xm, xc) Since both W ,W are d × d matrix and the similarity f m SBM Sf(xf,. x.c). Sm(xm,. x.c). function is a bilinear function, we call this Symmetric Bilin- xf ... xm ... ear Model (SBM). The bilinear model has many advantages xc Wf xc Wm compared to the simple Euclidean-based model: 1) it is a natural choice to model the similarity between two subjects; ... ... Feature selection xf ... xm ... and 2) it is also a much richer model than a traditional linear model – actually the bilinear model is related to the xc xc Mahalanobis distance (especially when the energy of each ... ... ... feature vector is fixed) and hence it can effectively capture Feature descriptor xf xc xm the correlation between any two feature variables. However, the bilinear model is different from the Mahalanobis distance Face image partition in that its parameter matrix W is not necessarily a positive definitematrix,whichnotonlyindicatesthatitcouldbemore flexible than a traditional metric learning-based method, but also means that what a bilinear model learns is not a metric Fig.1. Theoverallarchitectureoftheproposedmethod. but a classifier. But this is exactly what we need – a model to predict directly whether a given pair of subjects has some samples {(x ,x ,x ,y )}N , where x ,x ,x ∈ Rd fi mi ci i i=1 fi mi ci kind of kinship, rather than the metric between them. respectively denotes the i-th sample of a father, a mother and We further denote the probability that a child x belongs c a child, d is the dimension of the feature representation of a to a pair of parents (x ,x ) as P(y = 1|x ,x ,x ), and it f m f m c sample, and y ∈ {+1,−1} indicates whether this child has i is linked to our verification function f(x ,x ,x ) through a f m c a valid kinship relation with the corresponding two parents. sigmoid function, i.e., Here by kinship relation we mean a very close family type relation, that is, the child is produced by the two parents. p(y =1|x ,x ,x )=σ(f(x ,x ,x )) (2) f m c f m c Ourgoalistolearnafunctionf :(x ,x ,x )→{+1,−1} f m c from the training data to check whether such a kinship wheresigmoidfunctionσ isdefinedtobeσ(x)= 1 .The 1+e−x could be established for three previously never seen images verification function f(xf,xm,xc) is modeled as the linear (x ,x ,x )ofacoupleandachild.Forsimplicityweassume combination of two pieces of evidence, i.e., the similarity of f m c thatthegenderofbothparentsimages(xf,xm)areknownand xc to xm and xf, respectively, thattheyindeedgeneticallyproducesomechildren,butwedo f(x ,x ,x )=β s (x ,x )+β s (x ,x )+b (3) not know whether x is one of them. We also assume that the f m c 1 f f c 2 m m c c gender of the test image x of the child is known. where the combination coefficients β and β are two scalars c 1 2 and b is the similarity threshold term. To learn these parame- ters, we maximize the conditional likelihood defined by Eq. 2 A. Two Bilinear Models for Kinship Verification by plugging Eq. 1 into it, with L regularization added. How 2 The overall architecture of the proposed method is shown to learn the pairwise similarity Eq. 1 will be detailed in the in Figure 1, which can be roughly divided into three stages. next section. Particularly, in the first stage, we partition an image into Alternatively,onecantreattheparentsandthechildassam- overlapping patches and extract a middle-level feature de- ples from two domains. Let us denote the parents domain as scriptor(e.g.,128-dimensionalSIFTfeatures)fromeachpatch P, with data points (x ,x ),(x ,x ),...,(x ,x ), f1 m1 f2 m2 fN mN (location), which are then concatenated into a feature vector and the child domain as C, with data points x ,x ,...,x . c1 c2 cN as the input to the next stage. In the second stage we use With these notations, one can model the similarity between a a spatially voted feature selection method to select the most child x and his/her parents x =(x ,x ) as, c p f m discriminative local facial patches to improve the robustness. Finally in the third stage, we learn the similarity between s (x ,x )=(cid:104)x ,x (cid:105) (4) p p c p c Wp parents and child using bilinear models, based on which the whereW isa2d×dmatrix.ThismodeliscalledAsymmetric final kinship verification is made. p Bilinear Model (ABM) in what follows. In this work, we explore two ways to encode the similarity For the ABM model, our verifier is defined as follows, betweenparentsandachild.Thefirstoneistodecomposethe triples of (xf,xm,xc) into two pairs (xf,xc) and (xm,xc), p(y =1|xf,xm,xc)=σ(sp(xp,xc)+b) (5) and the pairwise similarity between them is respectively, where σ is the sigmoid function. The parameters {W ,b} p sf(xf,xc)=(xf)TWfxc ≡(cid:104)xf,xc(cid:105)Wf (1) are learnt using the following regularized logistic regression s (x ,x )=(x )TW x ≡(cid:104)x ,x (cid:105) objective, m m c m m c m c Wm N where (cid:104)a,b(cid:105) ≡ aTWb, and the transformation matrix (cid:88) W min log(1+exp(−y ((cid:104)x ,x (cid:105) +b))+λ(cid:107)W (cid:107) (6) Wf,Wm essentially encodes the “covariance” relationship Wp,bi=1 i pi ci Wp p ∗ SUBMITTEDTOIEEETRANS.M.M. 4 wherebisthethreshold,and(cid:107)W (cid:107) isthetracenorm,defined Incorporating these into the SBM model, we obtain the p ∗ as (cid:107)W (cid:107) = (cid:80) σ (the σ(cid:48)s are the singular values of W ). following relative symmetric bilinear model (RSBM), p ∗ i i i p With appropriate parameter λ, the trace norm shall force a sR(x ,x )=pfc·(cid:104)x ,x (cid:105) solution with many singular values of Wp being exactly zero. f f c f c Wf (8) This allows a more compact representation of the data, thus sRm(xm,xc)=pmc·(cid:104)xm,xc(cid:105)Wm beingusefulespeciallywhentheoriginalfeaturespaceishigh- One remaining problem is how to determine these priors. dimensional. Equation 6 is a nonsmooth convex objective and Eq.7showsthattheydependontheparametersW andW , f m one can use proximal methods to solved it, where at each whichsuggestsanaturaliterativeprocedure-initializepfcand step the singular values of the standard gradient update are pmc first, then optimize W and W in a supervised manner, f m replaced by their soft-threshold versions. See [32] for details finally update pfc and pmc again. In this work, we learn W f on an efficient implementation of this. and W separately using the same trace-norm regularized m Comparing the SBM model and the ABM model, the SBM logistic regression model as that shown in E.q. 6. learns two simple models first (i.e., by learning two d × d However, updating pfc and pmc is somewhat subtle - the parameter matrices W ,W separately) and then combines f m range of the sigmoid function of E.q. 7 is in [0,1], meaning them with coefficients β1 and β2, while the ABM learns a that when one of pfc, pmc reaches 1 the other one must be bigger model at one time (a 2d×d parameter matrix W ). In p nearly0.Thisisrisky,sincefortheonewith0probability,the other words, the SBM essentially combines two sub-modules contributionofitscorrespondingsimilaritycouldbecancelled (onedoesthefather-childkinshipverificationandtheotherfor out. To prevent this from happening, we update the new the mother-child relation), which not only makes the learning pfc,pmc using a stabilizing term, as follows, taskeasier,butalsoprovides furtherflexibilitytocalibratethe outputs of the two sub-modules such that the final prediction pfc =αpfc+(1−α)pfc new 0 cur (9) (father/mother-child) is as accurate as possible. By contrast, pmc =αpmc+(1−α)pmc, 0<α<1 new 0 cur the ABM model tries to do this in one big step, which is where α ∈ (0,1) is a trade-off parameter, and the stabilizing much harder especially when the size of dataset is relatively terms pfc, pmc are initialized to be 0.5 for each sample, and small(lessthan2Kimagesfortraininginourcase)fora2d×d 0 0 pfc , pmc are priors calculated according to the W or W matrix (for d=400, the total number of parameters would be cur cur f m values estimated in the current iteration. In other words, we 2×400×400=160,000). choose not to trust the currently-estimated similarity prior too much and always regularize it with some fixed stabilizing B. Learning A Relative Pairwise Similarity Measure value.Principallyonecanoptimizethevalueofαbyplugging Note that the SBM model introduced in Eq. 1 is a pairwise Eq. 9 into the corresponding regularized logistic regression similarity model without exploiting the dependence structure objective function while treating W or W as a constant, f m among parents and child, which can be considered as a but in our implementation we set it using a cross validation limitation. In fact, one can interpret the SBM as a likelihood strategy2. model, while to better model the similarity between a father TheproposedRSBMalgorithmissummarizedinAlgorithm and a son for example, one should put it under the context of 1. threesubjects-i.e.,insteadofmodelingthemarginalpairwise similarity(e.g.,p(fatherissimilartoson)),modelingitscondi- C. Spatially Voting for Feature Selection tionalversion(e.g.,p(fatherissimilartoson|father,mother,and son)). One major advantage of this is to allow us to embed The total number of parameters (i.e., Wp, Wf and Wm) variouspriorknowledgeconcerningtri-subjectgroupsintothe for our kinship verification model grows quadratically with similaritymodel.Inthiswork,weareparticularlyinterestedin the dimensions of input features, hence performing feature the prior knowledge that children may resemble a particular selection is needed. It can be observed that some important parent more than another [20] - “Jack looks more like his geneticcharacteristicsforakinshiprelationshiparedistributed father than his mother” or “John has similar appearance with locallyinfaceimages,anditisbettertolearnthembyfinding her mother”. the most discriminative local facial regions (patches) with Let us denote the probability that a child looks more like some supervised information. Furthermore, we want to select his/her father or his/her mother as pfc and pmc respectively, those most discriminative patches from both parents and the i.e., ‘pfc = 1’ means that a child looks more like his father childimagessimultaneouslysuchthatgoodgeneralizationcan than his mother. Taking the parents as references, the child is be obtained. either more like his/her father or more like his/her mother, Onesimplewayforthisistotreateachpatchinanimageas so we have pfc + pmc = 1. We therefore define the two a group and use existing techniques such as group lasso [33] probabilitiesusingthesoftmaxfunction,basedonthepairwise to select a few groups (patches) such that they give the best similarity model defined in Eq. 1, prediction accuracy, see Fig.2 (a) for illustration. However, one drawback of this method is that the feature selection is exp(s (x ,x )) pfc = f f c performed at the level of groups (patches), i.e., the features exp(s (x ,x ))+exp(s (x ,x )) f f c m m c have to be teamed together before competition and this may (7) exp(s (x ,x )) pmc = m m c exp(s (x ,x ))+exp(s (x ,x )) 2Inpractice,asmallvalueofα=0.1usuallyworkswell. f f c m m c SUBMITTEDTOIEEETRANS.M.M. 5 Algorithm 1 Solving the Relative Symmetric Bilinear Model 88xx 1 100-1-414 1 (RSBM) 66 0.8 Input: 44 00..46 22 0.2 Training images: S ={(xfi,xmi,xci,yi)}Ni=1; 001 11000010 22000010 33000010 44000010 55000001 66000010 (a) 011 55 99 1133 1177 2211 2255 2299 3333 3377 4411 4455 4499 Parameters:regularizationtermλ,iterationnumberT,and 1x 10-12 120 trade-off parameter α 100 0.5 80 Output: 60 40 Symmetric transformation matrix W , W ; 0 1000 2000 3000 4000 5000 6000 201 5 913172125293337414549 f m (b) 1: Initialization: 1x 10-12 110200 23:: andDSSeetmcopm=fcp{o=(sxe[m1S,i,1ixn,c.ti.o,.,yt1iw)]},opNi=mse1ct;s=S[f1,=1,{..(.x,f1i],;xci,yi)}Ni=1 Ffmm0e.i50aiegdtt.uhdrol2eed1.0s0id0seiPlrepaec2atc0tcr0ittho0liytnisos3mene0l0elee0edctchttiois4on0dn0tt.0ohHeu4se59mi0rn0e0ogosfvote(6rar0dl)0iia0lspltcuhprseiitmnrgagitnripoaoatunitpvcpehuleaprspsa.2468sot0000ocFs1heoea5rtsnh9(deba1)y3f(,ab1i7tc)mhe2e1tpih2om5ges2rai9ongp3ug3erpo37ipgnl4ora1osst4uehs5dope49 0 2 2 2 0 2 2 2 4: For L=1,2,...,T do competition,andtheselectedgroups(patches)areindicatedontherightwith bluebars,whilethecorrespondingweightsofeachfeaturevectorinagroupis 5: Estimate Wf and Wm by solving regularized logistic showninthelefthistogram;For(b),thediscriminativepowerofeachfeature regression objective (c.f., E.q. (6)); (i.e.,theweightvectoru,seetextfordetails)isfirstestimatedandisshown 6: Update the pairwise similarity with E.q. (8); inthelefthistogram,whilethehistogramontherightshowsforeachpatch how many votes it receives, and the first K patches receiving the highest 7: Estimate pfcucr,pmcucr using E.q. (7); numberofvoteswillbeselected. 8: Update pfc ,pmc by using E.q. (9); new new 9: Set pfc =pfc ; pmc =pmc ; new new 10: end for mappingstructurebetweeneachfeatureandeachpatchbefore- 11: Output Symmetric transformation Wf and Wm. hand, the votes vk received by the k-th patch can be simply calculated as the sum of weights of u corresponding to this (cid:80) patch, i.e., v = u , where u denotes the j-th element k j∈k j j of vector u corresponding to patch k. Note that for a patch of hurt the flexibility of feature selection. To overcome this, we thechildimage,itwouldreceivevotesfromthecorresponding adopt an alternative strategy - competition before grouping. features of both uf and um, while for a father or a mother Thatis,allfeaturesextractedateachlocationinagivenimage patch,itsvotecomesmerelyfromuf orum accordingly.After (c.f., Section III-A on how we extract features) are allowed to voting,weselectthefirstK patcheswiththehighestv value freely compete with each other and then select the groups k for parent and child respectively, where K is set using cross (local regions) in which higher portion of individual features validationoveravalidationsetinourimplementation(thebest win. Hence our method works in a finer granularity than that valueisusuallybetween20and30with49patchesperface.). of the group lasso. The process of our vote-based feature As mentioned previously, after patch selection, we collect selection method and group lasso is shown in Fig.2. for each image the selected K patches and encode them with Particularly,ouralgorithmhastwosteps.Inthefirststep,we SIFT descriptor, which are further concatenated to form a evaluate the discriminative power of each feature of a parent feature vector x for that face. regard to the given child. For this we decompose the triple of (x ,x ,x ) into two pairs of (x ,x ) and (x ,x ). Then f m c f c m c forapairoffather-childfeatures(x ,x ),wefirstconcatenate IV. THETSKINFACEDATABASEANDEVALUATION f c them into a 2d-dimension vector denoted as af, and learn a PROTOCOL weightvectoruf withthesamedimensionusingthefollowing To analyze the behavior of the proposed algorithm for tri- sparse l1 regularized logistic regression objective, subject kinship verification, we constructed a new kinship face database named TSKinFace (Tri-Subject Kinship Face N min(cid:88)log(1+exp(−y ·(cid:104)uf,af(cid:105))+γ(cid:107)uf(cid:107) (10) Database). All images in the database are harvested from uf i i 1 the internet based on knowledge of public figures family i=1 and photo-sharing social network such as flickr.com. During wherey =1ifthepairisapositivesampleand−1otherwise. i images collecting, we impose no restrictions in terms of pose, Solving this will give us a 2d-dimension vector uf with its lighting, expression, background, race, image quality, etc. first half and the second half respectively representing the Fig. 3 shows some image groups of child-parents pair from importance of each feature of the father and the child. The our TSKinFace database. This database will be made publicly same procedure is repeated for the mother-child pairs and available online3 to advance the research and applications yields a vector um. related to this topic. Now, instead of performing feature selection directly using TableIgivesacomparisonbetweenourTSKinFacedatabase theinformationcontainedinu,weusethistovotethepatches andotherexistingkinshipdatabasesofhumanfaces.Itcanbe offaceimagesandselectthosepatchesreceivinghighvotesfor seen that our database is characterized by the largest number face representation. Particularly, after solving the L1 logistic of people and families. Specifically, the number of families regression, we calculate separately for the parent and child contained in our database is over 20 times more than that how many votes per patch received from u. Intuitively, the of [18] and about 5 times more than that of the Family101 more votes a patch receives, the more important it is. Fig.2 (b) illustrates this procedure. Since we know the 3Availableat:http://parnec.nuaa.edu.cn/xtan/data/TSKinFace.html SUBMITTEDTOIEEETRANS.M.M. 6 divided into five folds such that each fold contains nearly the same number of face groups with kinship relation, which facilitates five-fold cross validation experiments. Table II lists the face number index for the five folds of our TSKinFace database.Forfaceimagesineachfold,weconsiderallgroups Fig.3. SomefamilyimagegroupsofourTSKinFacedatabase,whereeach offaceimageswithkinshiprelationaspositivesamples,while groupconsistsofafamilytripleofafather,amotherandachild.Thefirstrow the negative samples are a random combination with a child shows three Father-Mother-Daughter (FM-D) relation families, respectively image and two parents images subjected to the constraint and the second row are three Father-Mother-Son (FM-S) relation families, accordingly. that the child was not produced by them. In general, the number of negative samples is much more than that of the positive samples. In our experiments, each couple and child database.Thesefeaturesmakeourdatasetparticularlysuitable imagesappearedonlyonceinthenegativesamples.Hence,the for one-vs-two type kinship verification. number of positive groups and negative groups are the same. TABLEI TABLEII COMPARISONOFOURTSKINFACEDATABASEANDSOMEEXISTING FACENUMBERINDEXOFEACHFOLDOFTHETSKINFACEDATABASE KINSHIPDATABASESOFHUMANFACES,WHERE“#GROUPS”REFERSTO THENUMBEROFKINSHIPRELATIONGROUPS(BLOOD-RELATIONFAMILY) Fold 1 2 3 4 5 INTHEDATABASE,AND“FAMILYSTRUCTURE”REFERSTOTHE EXISTENCEOFFAMILYRELATIONSHIPINTHEDATABASE FM-D [1,100] [101,200] [201,300] [301,400] [401,502] FM-S [1,102] [103,204] [205,306] [307,408] [409,513] Database #People #Images #Groups Familystructure? CornellKin[12] 300 300 150 NO UBKinFace[34][14] 400 600 200 NO KinFaceW-I[15] 1066 1066 533 NO KinFaceW-II[15] 2000 1000 1000 NO V. EXPERIMENTS Family101[19] 607 14,816 206 YES Database[18] – 5400 45 YES A. The Tri-Subject Kinship Verification TSKinFace 2589 787 1015 YES To the best of our knowledge, there are very few works that tackle the tri-subject kinship verification problem, and it In particular, we are interested in three kinds of child- isverydifficulttofindanexistingmethoddirectlycomparable parents families in real life, i.e., Father-Mother-Daughter toours.Wethereforedesignanaivebaselinebyconcatenating (FM-D), Father-Mother-Son (FM-S) and Father-Mother-Son- thefeaturevectorsofthreevisualentitiesandlearningalinear Daughter (FM-SD). For each type, we collected 274, 285 and SVM for verification. We denote this method as ‘concate- 228 family photos respectively, with one photo per family. nated+SVM’. Usingthese,weconstructedtwokindsoffamily-basedkinship Alternatively, one can use any existing state-of-the-art bi- relations in the TSKinFace database: Father-Mother-Son(FM- subject kinship verification model to score the similarity S) and Father-Mother-Daughter(FM-D). The FM-S and the between a child and his/her parents separately, and then train FM-D contain 513 and 502 groups of tri-subject kinship a linear SVM over these to make the final prediction (c.f., relations (c.f., Fig. 3), respectively. Hence we have 1015 tri- E.q. 3). Here two best performers (on the KinFaceW dataset) subject groups in our database totally. The families included on bi-subject kinship learning, i.e., neighborhood repulsed inourdatabasearediverseintermsofracesaswell.ForFM-S metric learning (NRML) [15], and gated autoencoder [25] relation, there are 343 and 170 groups of tri-subject kinship are adopted as our base models. Furthermore, considering relations for Asian and non-Asian, respectively. And for FM- that the similarity modeling is related to metric learning, we D relation, the numbers for Asian and non-Asian groups are also include two classical metric learning algorithms, i.e., respectively 331 and 171. Information-theoretic metric learning (ITML) [36] and large margin nearest neighbor classification (LMNN) [37] as the Preprocessing All downloaded images undergo the same base models. geometric normalization prior to analysis: face detected and Although they deal with different problem, the image set cropped using our own implemented Viola-Jones detector, based face verification bears some similarities to the prob- rigid scaling and image rotation to place the centers of the lem of tri-subject kinship verification from the respect of two eyes at fixed positions, using the eye coordinates output methodology, i.e., both involve similarity matching between from an eye localizer [35]; image cropping to 64×64 pixels multiple faces. Hence in this work, we also adopt one of the andconversionto8bitgray-scaleimages.Inourexperiments, best performers on the YouTube Face database, i.e., DDML each face image was divided into 7×7 overlapping patches (Discriminative Deep Metric Learning) [38], to score the and the size of each patch is 16 × 16. For each patch, we pairwise similarity between a child and his/her parents, and extracted a 128-dimensional SIFT feature. Except mentioned then train a SVM over these to make the final prediction. otherwise,forallexperimentsdescribedinthiswork,theSIFT Particularly,wetrainadeepmetriclearningnetworkwiththree is adopted as our default feature descriptor. layers using our own implementation, with the threshold τ, Evaluation Protocol We design a verification protocol for the learning rate µ and regularization parameter λ set to be our database following [13] and [15]: the database is equally 3,10−3,10−2, respectively. SUBMITTEDTOIEEETRANS.M.M. 7 Our method is also closely related to Fang et al. [19] in TABLEIII that both deal with the family structure. However, since their CORRECTVERIFICATIONRATES(%)FORDIFFERENTMETHODSONTHE TSKINFACEDATABASE(WHERE“FM-S”,‘FM-D”DENOTE method is mainly designed for kinship classification, it is not “FATHER-MOTHERANDSON”AND“FATHER-MOTHERANDDAUGHTER”, directlycomparabletoours.Butwefollowtheirideastobuild RESPECTIVELY.) a linear SVM-based kinship verifier to make the comparison Method FM-S FM-D avg. feasible. Particularly, we construct a reconstruction errors- Concatenated+SVM 53.5±0.2381 53.2±0.2037 53.4 based representation (at the patch level) for each face using SparseGroupLasso[19] 71.6±0.9644 69.8±0.3485 70.7 sparse group lasso [19], by treating images belonging to the NRML[15] 77.0±0.5831 71.4±0.5933 74.2 same family as a group. Gatedautoencoder[25] 81.9±0.4433 79.6±0.3685 80.8 DDML[38] 82.1±1.0357 79.8±0.5879 81.0 Finally,wecompareseveralvariantsoftheproposedmethod ITML[36] 76.6±0.3753 71.4±0.4087 74.0 in our experiments, as follows, LMNN[37] 75.4±0.7293 70.3±0.7372 72.9 • With/without feature selection (FS): to investigate the ABM(proposed) 78.5±0.3411 73.2±0.3888 75.9 ABM-FS(proposed) 78.6±0.3114 76.9±0.2927 77.8 effectivenessoftheproposedvote-basedfeatureselection ABM-block-FS(proposed) 83.4±0.2508 81.9±0.3025 82.7 method, for both SBM and ABM models, we evaluate SBM(proposed) 82.4±0.3568 78.2±0.4105 80.3 their with/without feature selection versions. SBM-FS(proposed) 82.8±0.2608 79.5±0.2550 81.2 • Working at the block level: The proposed method can SBM-block-FS(proposed) 85.2±0.3031 83.5±0.2985 84.4 RSBM-block-FS(proposed) 86.4±0.4105 84.4±0.3601 85.4 also be applied at the level of blocks (patches), i.e., selectingthemostdiscriminativepatchesfirst,thenlearn- ingthe“covariance”relationshipandmakingverification 1 predictions based on each selected patches, and finally 0.9 aggregatingthesemeta-decisionsthroughlinearSVMfor 0.8 the final verification judgement. 0.7 Concatenated+SVM IRvnoetUlewadnthilvaefeetsasftSouoylrltmeohwemsreswel,etircasitceinoBonntoia(ltFiteniSoden,)a,rilwniMkoeraokld‘liRnelgeSxB(apRtMeSrt-hiBmbeMleobnc)ltkosw-cFkiwSthl’eevmseuplesa.eatinastlhlyae etar evitisop eurT0000....3456 SNGILASTpBMRaBMatMMMerNsLdeNL aGurtooeunpc Loadsesro ABM-FS following default parameter settings: λ = 5.0 in Eq. 6 (but 0.2 SBM-FS ABM-block-FS change to 0.1 if working at the block level); γ = 0.08 in 0.1 SBM-block-FS RSBM-block-FS Eq. 10; α = 0.1 in Eq. 9; and the iteration number T in 00 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Algorithm 1 is empirically set to be 5. The influence of some False positive rate parameters will be investigated in details below, but the exact Fig. 4. The ROC curves of different methods obtained on the TSKinFace setting of these parameters is not critical: the method gives dataset. similar results over a broad range of settings. Comparison with the state of the art methods Fig. 4 fail to model the dependence structure among the three visual gives the ROC (Receiver Operating Characteristic) curves of entities. By contrast, the proposed RSBM model effectively different methods and Table III summarizes the results. One exploits such priors during several stages of the verifica- can see from the table that the performance of the baseline tion pipeline (e.g.,similarity modeling, feature selection) and SVMalgorithmgivesanaverageaccuracyof53.4%,indicating achieves the best verification performance. that the one-vs-two type tri-subject kinship verification is Last but not least, it is interesting to note that the gender a very challenging problem. However, our proposed RSBM plays a significant role in kinship verification - in all the model working at the patch level improves this by over 30%, cases tested, the verification rates on “FM-S” are consistently being the best performer among all the compared methods. higher than those on “FM-D”. One possible reason is that the The closest competitor of our method is the DDML [38], appearance variations appeared in a female subject (daughter) which gives an average accuracy of 81.0% - similar to our are more complex than those in a male subject (son). This methodof‘SBM-FS’,butwiththepriorinformationexploited, seems to be in accordance with earlier psychological research our ‘RSBM-block-FS’ performs better. results[39]thatthekinrecognitionsignalislessevidentfrom Our method also significantly works better than the sparse daughters than from sons. group lasso based method proposed in Fang et al. [19] - one possible explanation is that for a core family group involved The importance of prior knowledge Fig. 5 compares in onlythreesubjects,theassumptionmadein[19]thatanimage detail the FM-S performance of the SBM model with/without of a child should be best reconstructed by face images in exploiting the prior knowledge about the relative difference his/her own family is too strong, although it is reasonable of a child to his/her parents, as a function of the number of under their situation where dozens of face images per family patches selected for each face. One can see that when the are available. number of patches selected for verification is relatively small, Thirdly, we see that simply adopting state of the art metric theRSBMmethodsignificantlyperformsbetterthantheSBM learning methods for tri-subject kinship verification is not a method. For example, using only 20 patches, the accuracy of good choice. This is partly due to the fact that these methods RSBMreachesanverificationaccuracyof86.4%,3.7%higher SUBMITTEDTOIEEETRANS.M.M. 8 88 87 SBM-block-FS FM-S RSBM-block-FS 86.4 86 FM-D )%( ycaruccA n8846 84.3 85.1 82.7 84.3 85.6 85.2 85.4 )%(ycarucca no88882345 oitacifireV 8802 80.9 81.4 itacifireV788901 78 10 1 5 2 0 2 5 3 0 780 0.1 0.2 0.3 0.4 Valu0e.5 of 0.6 0.7 0.8 0.9 1 Value of K Fig.7. Theaverageperformanceof“RSBM-block-FS”asafunctionofthe Fig.5. ComparisonofFM-SperformanceofSBMandRSBM,underdifferent amountofstabilizationα. numberK ofpatches. 88 FM-S 87 FM-D )%86 (ycarucca noitacifireV88882345 81 801 2 3 4 5 6 7 8 9 10 Value of T Fig.8. Theaverageverificationaccuracyof“RSBM-block-FS”asafunction ofthenumberofiterationT. patches. For a fair comparison, we set the parameter that controls the group weights as 0.88 for FM-D and 0.85 for Fig.6. Illustrationofthelearnedpriorknowledgeforfourfamilies.Ineach FM-S, obtaining the same number of selected patches as that family, the image on the left is the input child image, and the two rows of inourvote-basedfeatureselectionmethod.Asthebaselinewe imagesontherightaretheparentsimagesmultipliedbytherespectivelearnt selectthel normlassoalgorithm,whichperformsthefeature priordensityin10iterations(progressivelyfromlefttoright)-thehigherthe 1 probabilitythelighterthepixelvalue. selection without using any spatial information. Figure 9 gives the results. One can see that the proposed spatiallyvotedfeatureselectionscheme(“FS”)performsbetter than that of the SBM model. This highlights the benefits of than the group lasso “GL”, on average improving the perfor- exploiting prior knowledge for complex kinship verification. mance by about 2.3% and 0.4%, respectively on both tasks, To further illustrate this, we visualize the prior knowledge while the simple lasso method performs the worst. learnedbymultiplyingitelementwisebytheimage:seeFig.6 To answer the question of how many patches it needs to for an example. We can observe that some children do look be selected, we investigate the effect of the parameter K more like his/her father than mother, or vice versa, and such (the number of patches selected) on the performance. Fig. 10 information is effectively captured and utilized by our model. shows verification rate as a function of the number of patches Fig.7showstheaverageverificationaccuracyofourRSBM selected for each face, with the ABM model as the verifier. model as a function of the stabilizing term α (c.f., Eq. 9). We One can see that the performance boots from about 60.0% to can see that the RSBM model obtains the best performance when α = 0.1 for both FM-S and FM-D. In general, good performance could be obtained by setting the value of α 84 between 0.05 and 0.3. ABM-L1 82.8 ABM-GL Fig. 8 shows the performance curve of the RSBM model )%82 ASBBMM--LF1S 80.6 80.9 apserafofrumnacnticoenoofftthheeRnuSmBbMermofoditeelrabtoioonssts. Wtoeitcsanhisgeheestthavtatluhee ( ycarucc80 SSBBMM--GFSL 78.6 78.3 79.2 79.5 only after a few iterations. In practice we would recommend A no78 76.9 tEoffseecttiTve=ne5sstoofavsopidatoiavlelyrfitvtiontgin.g for feature selection To itacifireV 7746 75.9 76.0 74.8 75.0 verify this, we compare our feature selection scheme with grouplasso(GL)-bothaimtoautomaticallyfigureoutasetof 72 FM -S FM -D patchesfromfaceimagesforkinshipverification.Particularly, we formulate the problem as the sparse group lasso penalized Fig.9. Correctverificationrates(%)fordifferentfeatureselectionmethodson theTSKinFacedatabase(where“FS”denotesourvotebasedfeatureselection logistic regression in which the groups are defined as the methodwhile“L1”denoteslassoand“GL”denotesgrouplasso) SUBMITTEDTOIEEETRANS.M.M. 9 80 TABLEVI CORRECTVERIFICATIONRATES(%)FORDIFFERENTMETHODSONTHE 5-THFOLDOFTSKINFACEDATABASEWITH10DIFFERENTNEGATIVE )%75 SAMPLESSETS,WHERE’EASY’REPRESENTSTHOSESAMPLESWHICHARE (yca CLASSIFIEDBY“LMNN”CORRECTLYAND’HARD’DENOTESTHOSE ru SAMPLESWHICHARECLASSIFIEDBY“LMNN”INCORRECTLY cca70 no Method Easy Hard avg. itacifireV65 NLCMoRnMNcaNLte[n[13a57te]]d+SVM 7486..431±±0000...000024264 5318..41±±0.000..00215763 4624..29±±5000..00015736 FM-S DDML[38] 77.6±0.0083 55.1±0.0163 66.4±0.0086 FM-D ABM-block-FS(proposed) 87.3±0.0041 54.0±0.0146 70.7±0.0030 60 1 5 10 15 20 25 30 35 40 45 49 SBM-block-FS(proposed) 92.5±0.0037 57.3±0.0143 74.9±0.0036 Value of K RSBM-block-FS(proposed) 94.7±0.0040 63.1±0.0142 78.9±0.0032 Fig. 10. Influence of the number of patches K selected on the verification rates. over 73.0% with only 5 patches. The performance increases withmorepatchesaddeduntil20patchesareselected,andthe improvementisnotevidentafterthatfortheFM-Sverification. While for the FM-D verification, the number of selected patches is better to be kept less than 20 so as to reduce the possible influence of noise. Influence of randomness in negative sample generationsIn the previous experiments the negative samples are randomly generated by combining child and parents from different families but there is only one for each. We now investigate theimpactofsuchrandomness.Particularly,wefirstrandomly Fig. 11. Illustration of some samples in the easy negative set (the top two generate a large negative samples pool containing both the rows)andthehardnegativeset(thebottomtworows). FM-D and the FM-S negative relationship. Although there are many ways to distinguish less distinct negative samples lem,weuseafastimplementation[32]withitscomputational (i.e., hard samples with features of child and parents not complexityO(d3N/g2),whereg istheiterationcounter,while very different) from those very distinct negative samples (i.e, the computational complexity of the estimating part is O(N). easy ones with features of child and parents quite different), Hence the total computational complexity of our proposed for example, by simply measuring the similarity between the RSBM is O(d3N/g2T)+O(NT). child and parents, we choose the “LMNN” method [37] here. That is, those samples correctly classified by the “LMNN” B. Enhancing Bi-Subject Kinship Verification method as negative are categorized as “easy” ones, otherwise Intuitively, having more information about one’s parents is as “hard”. We use this criterion to randomly select 2,000 potentially useful to improve the performance of bi-subject samplesfromthenegativepool,with1,000eachforthe“easy” kinshipverification.Inordertoverifythishypothesis,another and the “hard” category respectively. Some of these samples series of experiments are conducted. This is similar to the tra- areillustratedinFig.11.Forexperimentweequallysplitthose ditional bi-subject verification in thatfour types of kinship re- samples into 10 sets, and run the model trained on the fifth lationswillbeevaluated,i.e.,Father-Son(FS),Father-Daughter fold of our TSKinFace database over them. (FD), Mother-Son (MS), and Mother-Daughter (MD). How- Table VI gives the results. One can see that all the ever, the key difference lies in that we are now given a methods investigated here work consistently and significantly triple including two parents and a child as a test sample. In better on the easy set than on the hard set, indicating that other words, we are interested in, for example, whether the it makes sense to distinguish the two sets based on the information about one’s father is useful to verify the Mother- outputs of the “LMNN” method, while our “RSBM-block- Daughter (MD) relation. FS” method demonstrates the highest robustness against this One simple way for this is to reformulate the bi-subject random confusion. The table also reveals that although some kinship verification problem as a tri-subject problem, since of the methods work well on the easy negative samples, the once a FM-D relationship is established, a FD and a MD performance on the hard ones are not satisfactory in general relationship must be established as well, see Fig. 12 for (withaccuracylessthan70.0%),showingthatfurtherresearch an example. For a bi-subject verification problem shown in is needed on this topic. Fig. 12(a), one can treat it as a tri-subject problem shown ComputationalComplexityWenowbrieflyanalyzethecom- in Fig. 12(b) when the father’s information is available, and putational complexity of the RSBM method, which involves use that result to answer the question of one-vs-one kinship T iterations. In each iteration we solve a regularized logistic verification. regression problem and make the estimation of the weights of Table IV compares the results of these two approaches for pfc andpmc.Tosolvetheregularizedlogisticregressionprob- bi-subject kinship verification. One can see that exploiting SUBMITTEDTOIEEETRANS.M.M. 10 TABLEIV CORRECTRATES(%)OFDIFFERENTMETHODSFORBI-SUBJECTKINSHIPVERIFICATIONWITHTRIPLEINPUTS(COLUMN2AND5)ANDPAIRINPUTS (COLUMN3,4,6AND7,“*”DENOTESTHATTHERESULT(P-VALUES)OFt−TESTFORTHEPERFORMANCECOMPARISONBETWEENPAIRINPUTSAND TRIPLEINPUTSVERIFICATIONISLESSTHAN0.05). Method FM-S FS MS FM-D FD MD SparseGroupLasso[19] 71.6±0.9644 69.1±0.6093 68.7±1.2204 69.8±0.3485 66.8±0.4627(*) 67.9±0.5977 NRML[15] 77.0±0.5831 74.8±0.7279(*) 72.2±0.3360(*) 71.4±0.5933 70.0±0.6716(*) 71.3±0.5853 Gatedautoencoder[25] 81.9±0.4433 79.9±0.6790(*) 78.5±0.5963(*) 79.6±0.3686 74.2±0.3170(*) 76.3±0.2296(*) ITML[36] 76.6±0.3753 75.6±0.3866(*) 72.1±0.3330(*) 71.4±0.4087 70.5±0.4000(*) 70.7±0.4435(*) LMNN[37] 75.4±0.7293 72.7±0.7305 71.5±0.7455(*) 70.3±0.7372 69.8±0.7243(*) 70.1±0.3846 ABM-block-FS(proposed) 83.4±0.2508 83.0±0.5558 82.8±0.5037 81.9±0.3025 80.5±0.4301 81.1±0.4003 SBM-block-FS(proposed) 85.2±0.3031 83.0±0.5558(*) 82.8±0.5037(*) 83.5±0.2985 80.5±0.4301(*) 81.1±0.4003(*) RSBM-block-FS(proposed) 86.4±0.4105 83.0±0.5558(*) 82.8±0.5037(*) 84.4±0.3601 80.5±0.4301(*) 81.1±0.4003(*) TABLEV CORRECTVERIFICATIONRATES(%)FORDIFFERENTMETHODSONTHETSKINFACEDATABASE(WHERE”FS-M”,”MS-F”,”FD-M”AND”MD-F” DENOTE”FATHER-SONANDMOTHER”,”MOTHER-SONANDFATHER”,”FATHER-DAUGHTERANDMOTHER”AND”MOTHER-DAUGHTERANDFATHER”, RESPECTIVELY.)) Method FS-M MS-F FD-M MD-F avg. Concatenated+SVM 53.5±0.2066 53.8±0.1337 53.0±0.1732 53.3±0.2523 53.4 SparseGroupLasso[19] 69.9±0.4927 70.3±0.7489 68.6±0.5350 68.9±0.5070 69.4 NRML[15] 75.6±0.5931 75.8±0.3788 70.8±0.9266 70.2±0.3189 73.1 Gatedautoencoder[25] 79.7±0.4077 80.9±0.4023 78.4±0.4022 79.2±0.4133 79.6 ITML[36] 73.8±0.4921 74.6±0.2564 70.1±0.3727 70.0±0.4950 72.1 LMNN[37] 71.9±0.2372 72.6±0.7213 70.5±0.5129 69.5±0.4232 71.1 ABM(proposed) 78.0±0.4379 78.7±0.3117 73.1±0.4190 73.5±0.5103 75.8 ABM-FS(proposed) 77.9±0.3104 78.0±0.4235 75.1±0.3008 76.5±0.4730 76.9 ABM-block-FS(proposed) 81.3±0.2791 81.8±0.3984 80.4±0.3604 80.7±0.3309 81.1 SBM(proposed) 79.2±0.4113 80.4±0.4103 76.8±0.4216 77.1±0.4010 78.4 SBM-FS(proposed) 81.0±0.3716 81.7±0.3002 78.7±0.3919 79.0±0.2637 80.1 SBM-block-FS(proposed) 82.9±0.1841 83.9±0.4197 81.9±0.1071 81.8±0.2157 82.6 It is well known that a problem like MS or FD verification is quite difficult due to the different genders of two subjects tobeverified.Ourmethodessentiallyprovidesanewsolution to this, and we consider it as one of the major motivations to study the tri-subject kinship verification problem. (a) (b) C. Comparisons with Human Beings in Kinship Verification Fig. 12. When the images of a second parent is available, the traditional To investigate human beings’ performance on the kinship one-vs-one type bi-subject kin verification problem can be reformulated as a one-vs-two one. In this example, once the Father/Mother-Daughter (FM- verification problem, we randomly selected 100 groups of D) relationship is established for the three subjects shown on the right, one cropped grayscale face samples, including 50 positive groups can safely infer that the Mother-Daughter (MD) kinship is validated for the and 50 negative ones. Then we presented these to 10 human subjectsshownontheleft. observerswithagesof20to40yearsoldtoasktheiropinions about the kinship relation. These human observers did not more information about one’s parents is indeed beneficial. receive any training on how to verify kinship from facial Particularly,theperformanceofthemother-son(MS)verifica- images before the experiment, and will completely rely on tion is improved significantly from 72.2% to 77.0% using the their own knowledge accumulated to answer the questions. SVM-based NRML baseline method, while that of the father- Particularly, we conduct two parts of tests on kinship daughter (FD) verification is improved from 80.5% to 84.4% verification. For the first part, 100 child-parent pairs (one-vs- using our RSBM model, and t-test analysis shows that this one) are shown to human observers (“A” ), and for the second improvement is statistically significant. Particularly, the stars part, 100 child-parents groups (one-vs-two) are presented to inTableIVindicatewhethertheimprovementofperformance these observers (“B” ). Obviously, these two types of testing fortripleinputsissignificantcomparedtothatwhenonlytwo are respectively corresponding to the problem of bi-subject subjects are available. For example, the accuracy of ’SBM- andtri-subjectkinshipverification.Werepeatedthisprocedure block-FS’ for father-son verification (FS) is 83.0%, but if two times, one for the FM-S subset and other for the FM-D the information about the mother is known, this improves subset, both from our TSKinFace database. We also run the to 85.2% (as shown in the column of ’FM-S’), which is same experiments using our SBM method for comparison. significantly better than that of FS according to the t-Test and Table VII and Table VIII give the results. One can see hence a star is marked, otherwise there is no star. that “B” can obtain better performance than “A” on the two