
Pairwise Constraint Propagation on Multi-View Data

Zhiwu Lu and Liwei Wang

(Z. Lu is with the Key Laboratory of Data Engineering and Knowledge Engineering (MOE), School of Information, Renmin University of China, Beijing 100872, China (e-mail: [email protected]). L. Wang is with the Key Laboratory of Machine Perception (MOE), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China (e-mail: [email protected]).)

Abstract—This paper presents a graph-based learning approach to pairwise constraint propagation on multi-view data. Although pairwise constraint propagation has been studied extensively, pairwise constraints are usually defined over pairs of data points from a single view, i.e., only intra-view constraint propagation is considered for multi-view tasks. In fact, very little attention has been paid to inter-view constraint propagation, which is more challenging since pairwise constraints are now defined over pairs of data points from different views. In this paper, we propose to decompose the challenging inter-view constraint propagation problem into semi-supervised learning subproblems so that they can be efficiently solved based on graph-based label propagation. To the best of our knowledge, this is the first attempt to give an efficient solution to inter-view constraint propagation from a semi-supervised learning viewpoint. Moreover, since graph-based label propagation has been adopted for basic optimization, we develop two constrained graph construction methods for inter-view constraint propagation, which only differ in how the intra-view pairwise constraints are exploited. The experimental results in cross-view retrieval have shown the promising performance of our inter-view constraint propagation.

Index Terms—Pairwise constraint propagation, multi-view data, label propagation, graph construction, cross-view retrieval

I. INTRODUCTION

As an alternative type of supervisory information that is easier to access than the class labels of data points, pairwise constraints are widely used for different machine learning tasks in the literature. To effectively exploit pairwise constraints for clustering or classification [1]–[4], much attention has been paid to pairwise constraint propagation [5]–[7]. Different from the method [8], which only adjusts the similarities between constrained data points, these approaches can propagate pairwise constraints to other similarities between unconstrained data points and thus achieve better results in most cases. More importantly, given that each pairwise constraint is actually defined over a pair of data points from a single view, these approaches can all be regarded as intra-view constraint propagation when multi-view data is concerned. Since we have to learn the relationships (must-link or cannot-link) between data points, intra-view constraint propagation is more challenging than the traditional label propagation [9]–[14], whose goal is only to predict the labels of unlabeled data points.

However, besides intra-view pairwise constraints, we may also have easy access to inter-view pairwise constraints in multi-view tasks such as cross-view retrieval [15], where each pairwise constraint is defined over a pair of data points from different views (see Fig. 1). In this case, inter-view pairwise constraints still specify the must-link or cannot-link relationships between data points. Since the similarity of two data points from different views is commonly unknown in practice, inter-view constraint propagation is significantly more challenging than intra-view constraint propagation. In fact, very little attention has been paid to inter-view constraint propagation for multi-view tasks in the literature. Although pairwise constraint propagation has been successfully applied to multi-view clustering in [16], [17], only intra-view pairwise constraints are propagated across different views. That is, these two constraint propagation methods have actually ignored both the concept of inter-view pairwise constraints and the strategy of inter-view constraint propagation.

Since multi-view data can be readily decomposed into a series of two-view data, we focus on inter-view constraint propagation across only two views in this paper. However, such inter-view constraint propagation remains a rather challenging task. Fortunately, from a semi-supervised learning viewpoint, we can formulate inter-view constraint propagation as minimizing a regularized energy functional. Specifically, we first decompose the inter-view constraint propagation problem into a set of independent semi-supervised learning [9]–[12] subproblems. By formulating these subproblems uniformly as minimizing a regularized energy functional, we develop an efficient algorithm for inter-view constraint propagation based on the traditional graph-based label propagation technique [9].
In summary, we succeed in giving an insightful explanation of inter-view constraint propagation from a graph-based semi-supervised learning viewpoint.

However, since graph-based label propagation has been adopted for basic optimization, one problem remains to be addressed in inter-view constraint propagation: how to exploit intra-view pairwise constraints for graph construction within each view. In this paper, we develop two constrained graph construction methods for inter-view constraint propagation, which only differ in how the intra-view pairwise constraints are exploited. The first method limits our inter-view constraint propagation to a single view and then utilizes the constraint propagation results to adjust the weight matrix of each view, while the second method formulates graph construction as sparse representation and then directly adds the intra-view pairwise constraints into sparse representation.

The flowchart of our inter-view constraint propagation with constrained graph construction is illustrated in Fig. 1, where only two views (i.e., text and image) are considered. It should be noted that, when multiple views refer to text, image, audio, and so on, the output of our inter-view constraint propagation actually denotes the correlation between different media views. That is, the proposed algorithm can be directly used for cross-view retrieval (also see the examples in Fig. 3), which has drawn much attention recently [15]. For cross-view retrieval, it is not feasible to combine multiple views in the manner of previous multi-view retrieval methods [18], [19]. More notably, the two closely related methods [16], [17] for multi-view clustering are actually incompetent for cross-view retrieval.

Finally, to emphasize our main contributions, we summarize the following distinct advantages of our pairwise constraint propagation on multi-view data:

• We have made the first attempt to give an efficient solution to inter-view constraint propagation from a graph-based semi-supervised learning viewpoint.
• We have developed two constrained graph construction methods so that the intra-view pairwise constraints can also be exploited for inter-view constraint propagation.
• When applied to cross-view retrieval, our inter-view constraint propagation has been shown to achieve promising results with respect to the state of the art.
• Although only evaluated in cross-view retrieval, our inter-view constraint propagation can be readily extended to many other multi-view tasks.

The remainder of this paper is organized as follows. In Section II, we formulate inter-view constraint propagation from a semi-supervised learning viewpoint. In Section III, we develop two constrained graph construction methods for our inter-view constraint propagation. In Section IV, our inter-view constraint propagation is applied to cross-view retrieval. Finally, Sections V and VI provide the experimental results and conclusions, respectively.

[Fig. 1: flowchart. For each view, "Original similarity" and "Intra-PCs" feed a CGC block that produces a "Graph over single view"; the two graphs and the "Inter-PCs" feed the "Inter-CP" block, which outputs the "Final results"; must-link and cannot-link examples between text and image are shown.]
Fig. 1. Illustration of the flowchart of inter-view constraint propagation (Inter-CP) with constrained graph construction (CGC). Here, we only consider two different views: text and image. Moreover, Intra-PCs and Inter-PCs denote intra-view and inter-view pairwise constraints, respectively.
II. INTER-VIEW CONSTRAINT PROPAGATION

In this section, we first formulate inter-view constraint propagation as minimizing a regularized energy functional from a semi-supervised learning viewpoint. Furthermore, we develop an efficient algorithm for inter-view constraint propagation based on the label propagation technique [9].

A. Problem Formulation

Given a set of inter-view pairwise constraints defined over pairs of data points from different views, the goal of inter-view constraint propagation is to learn the cross-view relationships from these initial pairwise constraints. Since the similarity of two data points from different views is unknown in practice, inter-view constraint propagation on multi-view data is much more challenging than the traditional pairwise constraint propagation over a single view. Considering that this multi-view problem can be readily decomposed into a series of two-view subproblems, we focus on inter-view constraint propagation on two-view data in the following.

Let {X, Y} be a two-view dataset, where X = {x_1, ..., x_N} and Y = {y_1, ..., y_M}. It should be noted that we may have N ≠ M. As an example, a two-view dataset is shown in Fig. 1, with image and text being the two different views. For the two-view dataset {X, Y}, we can define a set of initial must-link constraints as M = {(x_i, y_j) : l(x_i) = l(y_j)} and a set of initial cannot-link constraints as C = {(x_i, y_j) : l(x_i) ≠ l(y_j)}, where l(x_i) (or l(y_j)) is the class label of x_i ∈ X (or y_j ∈ Y). Here, the data points x_i and y_j are assumed to share the same class label set. If the class labels are not provided, the inter-view pairwise constraints can be defined only based on the correspondence between the two views, which can be readily obtained from Web-based content (e.g., Wikipedia articles). Several examples of inter-view pairwise constraints are illustrated in Fig. 1.

We can now state that the goal of inter-view constraint propagation is to propagate the two sets of initial pairwise constraints M and C across both X and Y. In fact, this is equivalent to deriving the best solution F* ∈ F from both M and C, with F = {F = {f_ij}_{N×M}}. Here, any exhaustive set of inter-view pairwise constraints is denoted as F ∈ F, where f_ij > 0 means that (x_i, y_j) is a must-link constraint while f_ij < 0 means that (x_i, y_j) is a cannot-link constraint, with |f_ij| denoting the confidence score of (x_i, y_j) being a must-link (or cannot-link) constraint. Hence, F can actually be regarded as the feasible solution set of inter-view constraint propagation.

Although it is difficult to directly find the best solution F* ∈ F to inter-view constraint propagation, we can tackle this challenging problem by decomposing it into a set of independent semi-supervised learning subproblems. More concretely, we first denote the two sets of initial pairwise constraints M and C with a single matrix Z = {z_ij}_{N×M}:

    z_{ij} = \begin{cases} +1, & (x_i, y_j) \in M; \\ -1, & (x_i, y_j) \in C; \\ 0, & \text{otherwise}. \end{cases}    (1)
Moreover, by making vertical and horizontal observations on such initial matrix Z, we decomposethe inter-viewconstraint II. INTER-VIEWCONSTRAINT PROPAGATION propagation problem into independent semi-supervised learn- Inthissection,wefirstformulateinter-viewconstraintprop- ing subproblems, which is also illustrated in Fig. 2. Finally, agation as minimizing a regularized energy functional from a given two graphs GX = {X,WX} and GY = {Y,WY} semi-supervised learning viewpoint. Furthermore,we develop constructed over {X,Y} with WX (or WY) being the edge an efficient algorithm for inter-view constraint propagation weightmatrixdefinedoverthe vertexset X (orY), we utilize based on the label propagation technique [9]. the graph-based label propagation method [9] to uniformly solve these semi-supervised learning subproblems: A. Problem Formulation min kFX −Zk2fro+µXtr(FXTLXFX)+kFY −Zk2fro FX,FY Given a set of inter-view pairwise constraints defined over pairsofdatapointsfromdifferentviews,thegoalofinter-view +µYtr(FYLYFYT)+γkFX −FYk2fro, (2) constraint propagation is to learn the cross-view relationships where µX >0 (µY >0, or γ >0) denotes the regularization from these initial pairwise constraints. Since the similarity of parameter, LX (or LY) denotes the normalized Laplacian two data points from different views is unknown in practice, matrix defined over X (or Y), ||·|| denotes the Frobenius fro inter-view constraint propagation on multi-view data is much norm of a matrix, and tr(·) denotes the trace of a matrix. 3 y1 y2 y3 y4 y5 y6 y7 y8 which can be equivalently transformed into: x1 0 -1 0 1 0 0 1 0 FY(I +µˆYLY)=(1−β)Z +βFX∗, (6) x -1 0 1 -1 0 0 0 0 2 whereµˆY =µY/(1+γ)andβ =γ/(1+γ).Since I+µˆYLY x 1 -1 0 0 0 -1 0 1 is positive definite, we then obtain an analytical solution: 3 x4 0 1 0 0 0 0 -1 0 FY∗ =((1−β)Z +βFX∗)(I +µˆYLY)−1, (7) x 0 0 0 1 0 0 0 -1 which involves time-consuming matrix inverse. In fact, the 5 linear equation (6) can also be efficiently solved using label must-link: 1 cannot-link: -1 propagation [9] with k-NN graph. Fig.2. Illustration oftheinitial matrixZ.Whenwefocusonasinglepair of data points, e.g. (x3,y4) here, the inter-view constraint propagation can Let WX (or WY) denote the weight matrix of the k-NN beviewedasatwo-classsemi-supervisedlearningproblem(innameonly)in graph constructed over X (or Y). The complete algorithm for bothvertical andhorizontal directions, where+1(or-1)denotes positive (or inter-view constraint propagation is summarized as follows: negative) labeled dataand0denotes unlabeled data. (1) Compute two matrices SX = DX−1/2WXDX−1/2 and The first and second terms of the above objective function SY = DY−1/2WYDY−1/2, where DX (or DY) is a arerelatedtothepairwiseconstraintpropagationoverX,while diagonal matrix with its i-th diagonal entry being thethirdandfourthtermsarerelatedtothepairwiseconstraint the sum of the i-th row of WX (or WY); propagation over Y. Moreover, the fifth term can ensure (2) Initialize FX(0)=0, FY∗ =0, and FY(0)=0; that the solutions of these two types of pairwise constraint (3) Iterate FX(t+1) = αXSXFX(t)+(1−αX)((1− propagation are as approximate as possible. Let FX∗ and FY∗ β)Z+βFY∗) until convergenceat FX∗, where αX = be the best solutions of pairwise constraint propagation over µˆX/(1+µˆX) and β =γ/(1+γ); X and Y, respectively. The best solution of our inter-view (4) Iterate FY(t+1) = αYFY(t)SY +(1−αY)((1− constraint propagation is defined as follows: β)Z +βFX∗) until convergence at FY∗, where αY = F∗ =(F∗ +F∗)/2. 
As for the second and fourth terms, they are known as the energy functional [10] (or smoothness) defined over X and Y. In summary, we have formulated inter-view constraint propagation as minimizing a regularized energy functional.

B. Efficient Algorithm

Let Q(F_X, F_Y) denote the objective function in equation (2). The alternating optimization technique can be adopted to solve min_{F_X, F_Y} Q(F_X, F_Y) as follows: 1) fix F_Y = F_Y* and find F_X* = argmin_{F_X} Q(F_X, F_Y*); 2) fix F_X = F_X* and find F_Y* = argmin_{F_Y} Q(F_X*, F_Y).

Pairwise constraint propagation over X: When F_Y is fixed at F_Y*, the solution of min_{F_X} Q(F_X, F_Y*) can be found by solving the following linear equation:

    \frac{1}{2} \frac{\partial Q(F_X, F_Y^*)}{\partial F_X} = (F_X - Z) + \mu_X L_X F_X + \gamma (F_X - F_Y^*) = 0,

which can be equivalently transformed into:

    (I + \hat{\mu}_X L_X) F_X = (1 - \beta) Z + \beta F_Y^*,    (4)

where \hat{\mu}_X = \mu_X/(1+\gamma) and \beta = \gamma/(1+\gamma). Since I + \hat{\mu}_X L_X is positive definite, we then obtain an analytical solution:

    F_X^* = (I + \hat{\mu}_X L_X)^{-1} ((1 - \beta) Z + \beta F_Y^*).    (5)

However, this analytical solution is not efficient for large datasets, since the matrix inverse has a time cost of O(N^3). Fortunately, equation (4) can also be solved efficiently using label propagation [9] with a k-nearest neighbor (k-NN) graph.

Pairwise constraint propagation over Y: When F_X is fixed at F_X*, the solution of min_{F_Y} Q(F_X*, F_Y) can be found by solving the following linear equation:

    \frac{1}{2} \frac{\partial Q(F_X^*, F_Y)}{\partial F_Y} = (F_Y - Z) + \mu_Y F_Y L_Y + \gamma (F_Y - F_X^*) = 0,

which can be equivalently transformed into:

    F_Y (I + \hat{\mu}_Y L_Y) = (1 - \beta) Z + \beta F_X^*,    (6)

where \hat{\mu}_Y = \mu_Y/(1+\gamma) and \beta = \gamma/(1+\gamma). Since I + \hat{\mu}_Y L_Y is positive definite, we then obtain an analytical solution:

    F_Y^* = ((1 - \beta) Z + \beta F_X^*)(I + \hat{\mu}_Y L_Y)^{-1},    (7)

which involves a time-consuming matrix inverse. In fact, the linear equation (6) can also be solved efficiently using label propagation [9] with a k-NN graph.

Let W_X (or W_Y) denote the weight matrix of the k-NN graph constructed over X (or Y). The complete algorithm for inter-view constraint propagation is summarized as follows:

(1) Compute the two matrices S_X = D_X^{-1/2} W_X D_X^{-1/2} and S_Y = D_Y^{-1/2} W_Y D_Y^{-1/2}, where D_X (or D_Y) is a diagonal matrix whose i-th diagonal entry is the sum of the i-th row of W_X (or W_Y);
(2) Initialize F_X(0) = 0, F_Y* = 0, and F_Y(0) = 0;
(3) Iterate F_X(t+1) = \alpha_X S_X F_X(t) + (1 - \alpha_X)((1 - \beta) Z + \beta F_Y^*) until convergence at F_X*, where \alpha_X = \hat{\mu}_X/(1+\hat{\mu}_X) and \beta = \gamma/(1+\gamma);
(4) Iterate F_Y(t+1) = \alpha_Y F_Y(t) S_Y + (1 - \alpha_Y)((1 - \beta) Z + \beta F_X^*) until convergence at F_Y*, where \alpha_Y = \hat{\mu}_Y/(1+\hat{\mu}_Y);
(5) Iterate Steps (3)–(4) until convergence, and output the final solution F* = (F_X* + F_Y*)/2.

According to the convergence analysis in [9], Step (3) converges to F_X* = (1 - \alpha_X)(I - \alpha_X S_X)^{-1}((1 - \beta) Z + \beta F_Y^*), which equals the solution (5) given that \alpha_X = \hat{\mu}_X/(1+\hat{\mu}_X) and S_X = I - L_X. Similarly, Step (4) converges to F_Y* = (1 - \alpha_Y)((1 - \beta) Z + \beta F_X^*)(I - \alpha_Y S_Y)^{-1}, which equals the solution (7) given that \alpha_Y = \hat{\mu}_Y/(1+\hat{\mu}_Y) and S_Y = I - L_Y. In the experiments, we find that Steps (3)–(5) generally converge in very few iterations (<10). Moreover, based on k-NN graphs, the above inter-view constraint propagation algorithm has a time cost of O(kNM), which is proportional to the number of all possible inter-view pairwise constraints. Hence, we consider that this algorithm provides an efficient solution to inter-view constraint propagation (note that even a simple assignment to F* incurs a time cost of O(NM)).
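To make the propagation loop concrete, the following is a minimal NumPy sketch of Steps (1)–(5), assuming the k-NN weight matrices W_X, W_Y and the initial constraint matrix Z of Eq. (1) are already available; fixed iteration counts stand in for the convergence tests, and all names and default parameter values (taken from the cross-validation results in Section V) are illustrative rather than the authors' code.

```python
import numpy as np

def normalize(W):
    # Step (1): S = D^{-1/2} W D^{-1/2}, with D the diagonal row-sum matrix of W.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def inter_cp(WX, WY, Z, alpha_x=0.025, alpha_y=0.025, beta=0.95,
             n_outer=10, n_inner=100):
    """Inter-view constraint propagation: WX is N x N, WY is M x M,
    and Z is the N x M initial constraint matrix of Eq. (1)."""
    SX, SY = normalize(WX), normalize(WY)
    FX = np.zeros_like(Z)                  # Step (2)
    FY = np.zeros_like(Z)
    FY_star = np.zeros_like(Z)
    for _ in range(n_outer):               # Step (5): alternate between the views
        for _ in range(n_inner):           # Step (3): vertical propagation over X
            FX = alpha_x * SX @ FX + (1 - alpha_x) * ((1 - beta) * Z + beta * FY_star)
        FX_star = FX
        for _ in range(n_inner):           # Step (4): horizontal propagation over Y
            FY = alpha_y * FY @ SY + (1 - alpha_y) * ((1 - beta) * Z + beta * FX_star)
        FY_star = FY
    return (FX_star + FY_star) / 2.0       # final solution F* of Eq. (3)
```

With \alpha_X = \hat{\mu}_X/(1+\hat{\mu}_X), each inner loop converges to the corresponding analytical solution (5) or (7), so in practice a small number of outer iterations suffices, as noted above.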
III. CONSTRAINED GRAPH CONSTRUCTION

In the last section, we developed an efficient inter-view constraint propagation algorithm based on the graph-based label propagation technique. However, since graph-based label propagation has been adopted as the basic optimization technique, one problem remains to be addressed in inter-view constraint propagation: how to exploit intra-view pairwise constraints for graph construction within each view. In this section, we develop two constrained graph construction methods for inter-view constraint propagation, which only differ in how the intra-view pairwise constraints are exploited. To ensure that our inter-view constraint propagation algorithm runs efficiently even on large datasets, we utilize the traditional k-NN graph construction as the basis of our constrained graph construction; i.e., the two obtained constrained graphs can be considered variants of the k-NN graph. In the following, we will only elaborate how to construct the graph G_X = {X, W_X} over X. The graph G_Y = {Y, W_Y} over Y can be constructed in exactly the same way.

A. Constrained Weight Adjustment

The first constrained graph construction method limits our inter-view constraint propagation proposed in Section II to a single view (i.e., intra-view constraint propagation over X) and then utilizes the obtained results of intra-view constraint propagation to adjust the weight matrix, which is thus called constrained weight adjustment (CWA). According to the convergence analysis in Section II-B, we construct a k-NN graph over X to speed up our intra-view constraint propagation.

1) Intra-View Constraint Propagation: We have just provided a sound solution to the challenging problem of inter-view constraint propagation in Section II. In this subsection, we further consider pairwise constraint propagation over a single view, where each pairwise constraint is defined over a pair of data points from the same view. In fact, this intra-view constraint propagation problem can also be solved from a semi-supervised learning viewpoint by limiting our inter-view constraint propagation to a single view.

Given the dataset X = {x_1, ..., x_N}, we denote the set of initial must-link constraints as M_X = {(x_i, x_j) : l_i = l_j} and the set of initial cannot-link constraints as C_X = {(x_i, x_j) : l_i ≠ l_j}, where l_i is the label of data point x_i. Similar to our representation of the initial inter-view pairwise constraints, we first denote the initial intra-view pairwise constraints M_X and C_X with a single matrix Z_X = {z_ij^(x)}_{N×N}:

    z_{ij}^{(x)} = \begin{cases} +1, & (x_i, x_j) \in M_X; \\ -1, & (x_i, x_j) \in C_X; \\ 0, & \text{otherwise}. \end{cases}    (8)

Furthermore, by making vertical and horizontal observations on Z_X, we further decompose the intra-view constraint propagation problem into semi-supervised learning subproblems, just as in our interpretation of inter-view constraint propagation from a semi-supervised learning viewpoint.
These subproblems can be similarly merged into a single optimization problem (similar to [20]–[22]):

    \min_{F_v, F_h} \|F_v - Z_X\|_{fro}^2 + \mu \operatorname{tr}(F_v^T L_X F_v) + \|F_h - Z_X\|_{fro}^2 + \mu \operatorname{tr}(F_h L_X F_h^T) + \gamma \|F_v - F_h\|_{fro}^2,    (9)

where \mu > 0 (or \gamma > 0) denotes the regularization parameter, and L_X denotes the normalized Laplacian matrix defined over the k-NN graph. The second and fourth terms of the above equation denote the energy functional [10] (or the smoothness measure) defined over X. In summary, we have also formulated intra-view constraint propagation as minimizing a regularized energy functional.

Similar to what we have done for solving equation (2), we can adopt the alternating optimization technique to find the best solution to the above intra-view constraint propagation problem. Let W_X denote the weight matrix of the k-NN graph constructed over the dataset X. The proposed algorithm for our intra-view constraint propagation is outlined as follows:

(1) Compute S_X = D^{-1/2} W_X D^{-1/2}, where D is a diagonal matrix whose entry (i, i) is the sum of row i of W_X;
(2) Initialize F_v(0) = 0, F_h* = 0, and F_h(0) = 0;
(3) Iterate F_v(t+1) = \alpha S_X F_v(t) + (1 - \alpha)((1 - \beta) Z_X + \beta F_h^*) until convergence at F_v*, where \alpha = \mu/(1+\mu+\gamma) and \beta = \gamma/(1+\gamma);
(4) Iterate F_h(t+1) = \alpha F_h(t) S_X + (1 - \alpha)((1 - \beta) Z_X + \beta F_v^*) until convergence at F_h*;
(5) Iterate Steps (3)–(4) until the stopping condition is satisfied, and obtain F* = (F_v* + F_h*)/2;
(6) Output the normalized solution F* = F*/F*_max, where F*_max denotes the maximum entry of F*.

In the experiments, we find that Steps (3)–(5) generally converge in very few iterations (<10). Moreover, based on the k-NN graph, our algorithm has a time cost of O(kN^2), proportional to the number of all possible pairwise constraints. Hence, it can be considered to provide an efficient solution.

2) Weight Adjustment Using Propagated Constraints: It should be noted that the normalized output F* = {f_ij*}_{N×N} of our intra-view constraint propagation represents an exhaustive set of intra-view pairwise constraints. Our original motivation is to construct a new graph over X that is fully consistent with F*. In fact, we can exploit F* for such graph construction by adjusting the original normalized weight matrix W_X (i.e., 0 ≤ w_ij^(x) ≤ 1) just as in [20]:

    \tilde{w}_{ij}^{(x)} = \begin{cases} 1 - (1 - f_{ij}^*)(1 - w_{ij}^{(x)}), & f_{ij}^* \ge 0; \\ (1 + f_{ij}^*) w_{ij}^{(x)}, & f_{ij}^* < 0. \end{cases}    (10)

Since W̃_X = {w̃_ij^(x)}_{N×N} is nonnegative and symmetric, we then use it as the new weight matrix. Moreover, we can find that w̃_ij^(x) ≥ w_ij^(x) (or < w_ij^(x)) if f_ij* ≥ 0 (or < 0). That is, the new weight matrix W̃_X is derived from the original weight matrix W_X by increasing w_ij^(x) for the must-link constraints with f_ij* > 0 and decreasing w_ij^(x) for the cannot-link constraints with f_ij* < 0. This is entirely consistent with our original motivation of exploiting intra-view pairwise constraints for graph construction.

Once we have constructed the new weight matrix W̃_X over X, we can similarly construct the new weight matrix W̃_Y over Y. Based on these two new weight matrices, our inter-view constraint propagation can be performed with constrained graph construction (CGC) (as shown in Fig. 1) using the constrained weight adjustment (CWA) developed here.
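Eq. (10) amounts to a single vectorized update once the normalized propagation output F* is in hand. A minimal sketch, assuming W is the normalized k-NN weight matrix with entries in [0, 1] and F is symmetric (illustrative names only):

```python
import numpy as np

def constrained_weight_adjustment(W, F):
    # Eq. (10): pull w_ij toward 1 where constraints were propagated as
    # must-links (f_ij >= 0) and shrink w_ij toward 0 where they were
    # propagated as cannot-links (f_ij < 0).
    return np.where(F >= 0.0,
                    1.0 - (1.0 - F) * (1.0 - W),
                    (1.0 + F) * W)
```

Since both branches keep the entries in [0, 1] and preserve symmetry, the result can be used directly as the new weight matrix W̃_X.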
borhood of data point x can thus be defined as: i 1) L -GraphConstructionwithSparseRepresentation: We startwit1htheproblemformulationforsparselinearreconstruc- Li =I−Di−1/2(1+Zi)Di−1/2, (15) tion of each data pointin its k-nearestneighborhood.Given a where Z = [z(x)] ∈ Rk×k, and D is a diagonal data point xi ∈ X, we suppose it can be reconstructed using i jj′ j,j′∈Nk(i) i matrixwithitsj-thdiagonalelementbeingthesumofthej-th itsk-nearestneighbors(theirindicesarecollectedintoN (i)), k rowof1+Z .Here,wedefinethesimilaritymatrix(i.e.1+Z ) whichresultsinanunderdeterminedlinearsystem:x =B α , i i i i i limited to the k-nearest neighborhood N (i) of x based on whereα ∈Rk isa vectorthatstoresunknownreconstruction k i i coefficients, and B = [x ] is an overcomplete dictio- the intra-view pairwise constraints stored in ZX. From this i j j∈Nk(i) normalized Laplacian matrix L , we can derive the Laplacian nary with k bases. According to [23], if the solution for x is i i regularizationterm for the sparse representationproblem (12) sparse enough, it can be recovered by: as αTL α , the same as the original definition in [9]. i i i min ||αi||1, s.t. xi =Biαi, (11) However, we have difficulty in directly incorporating this αi Laplacian regularization term into the sparse representation where||αi||1 is the L1-normofαi. Giventhe kernel(affinity) problem (12), no matter as a part of the objective function matrixA={aij}N×N computedoverX, wemakeuseofthe or a constraintcondition. Hence, we further formulate an L - 1 kernel trick and transform the above problem into: norm version of Laplacian regularization [12], [28]–[30]: mαiin ||αi||1, s.t. xˆi =Ciαi, (12) ||C˜iαi||1 =||Σi12ViTαi||1, (16) where xˆi = [aji]j∈Nk(i) ∈Rk, Ci =[ajj′]j,j′∈Nk(i) ∈Rk×k. where C˜ = Σ21VT, V is a k×k orthonormal matrix with In practice, due to the noise in the data, we can reconstruct i i i i each column being an eigenvector of L , and Σ is a k×k xˆ similar to [24]: xˆ =C α +ζ , where ζ is the noise term. i i i i i i i i diagonal matrix with its diagonal element Σ (j,j) being an The above L -optimization problem can then be redefined by i 1 eigenvalue of L (sorted as Σ (1,1)≤ ... ≤ Σ (k,k)). Given minimizing the L -norm of both reconstruction coefficients i i i 1 thatL isnonnegativedefinite,Σ ≥0(i.e.alltheeigenvalues and reconstruction error: i i ≥ 0). Since L V = V Σ and V is orthonormal, we have i i i i i min ||α′|| , s.t. xˆ =C′α′, (13) L = V Σ VT. Hence, the original Laplacian regularization α′i i 1 i i i αTiL α icainibe reformulated as: i i i whereC′ =[C ,I]∈Rk×2k andα′ =[αT,ζT]T.Thisconvex optimizaitioncainbesolvedbygeneirallineiarpirogrammingand αTi Liαi =αTi ViΣi12Σi21ViTαi =||C˜iαi||22, (17) has a globally optimal solution. which means that our new formulation ||C˜ α || can indeed After we have obtained the reconstruction coefficients for i i 1 be regarded as an L -norm version of the original Laplacian all the data points by the above sparse linear reconstruction, 1 regularization αTL α =||C˜ α ||2. the weight matrix WX ={wi(jx)}N×N can be defined by: 3) L -GraphiCoinsitructioni wiit2h L -Norm Laplacian Reg- 1 1 |α′(j′)|, j ∈N (i),j′ =index(j,N (i)); ularization: After we have formulated L1-norm Laplacian w(x) = i k k (14) regularization based on intra-view pairwise constraints, we ij (0, otherwise, can further incorporate this constrained term into sparse where α′(j′) denotes the j′-th element of the vector α′, and linear reconstruction used for L -graph construction. 
More i i 1 j′ =index(j,N (i)) means that j is the j′-th element of the concretely,byintroducingnoisetermsforlinearreconstruction k setNk(i).BysettingtheweightmatrixWX =(WX+WXT)/2, andL1-normLaplacianregularization,wetransformthesparse weconstructa graphGX ={X,WX}overX,whichiscalled representation problem (12) into as L -graph since it is constructed by L -optimization. 1 1 min ||[αT,ζT,ξT]|| , 2) L -NormLaplacianRegularizationwithIntra-ViewPair- i i i 1 1 αi,ζi,ξi wise Constraints: In the above L1-graph construction, we s.t. xˆ =C α +ζ , 0=C˜ α +ξ , (18) i i i i i i i have ignored intra-view pairwise constraints (see examples in Fig. 1). In fact, this supervisory information can be exploited where the reconstruction error and Laplacian regularization for L1-graph construction through Laplacian regularization with respect to αi are controlled by ζi and ξi, respectively. [9],[10].OurbasicideaistofirstderiveLaplacianregulariza- C I 0 Let α′ = [αT,ζT,ξT]T, C′ = i , and xˆ′ = tion fromintra-viewpairwise constraintsandthen incorporate i i i i i C˜ 0 I i i (cid:20) (cid:21) 6 [xˆT,0T]T. We finally solve the following constrained spare Text query Retrieved images by cross-view retrieval i representation problem for L -graph construction: The watershed lies partly in the Coast 1 Range ecoregion and partly in the WillametteValleyecoregiondesignatedby CA+SA min ||α′|| , s.t. xˆ′ =C′α′, (19) theU.S.EnvironmentalProtectionAgency α′i i 1 i i i (BEaPlcAh).CrReeevkerwseatesrisdheedTthherouhgishtotrhice1lo8w80esr wasamixtureofopenwater,wetlands, whichtakesthesameformastheoriginalsparerepresentation grassland,andforest,whileabovetheflood plain the watershed consisted of closed Inter-CP problem(13).Here,itisnoteworthythatthisconstrainedspare canopyforest.EuropeanAmericans..…. representation (CSR) problem can be solved very efficiently, Three Puerto Ricans were awarded since it is limited to k-nearest neighborhood. The weight DPFisCti.nguLiusihsedF.SerCvaicsetro,CroPsrsi,vattheeyAnwiberael CA+SA matrix WX of the L1-graph GX = {X,WX} can be defined IJrorsiezparhry(aJondseP)FRC.JoMseaprthinRez.Mbaorrtnineizn.PSFaCn the same as equation (14). GInefarmntaryn,uPnuiteartnodRtainckoindeTsturnoiysebdyaprGoveridminang In our CSR formulation, the L1-norm Laplacian regular- hbeeianvgyaatrttailclkeerydfiinret,hseavpirnogcehsiss.pHlaetoroencefirvoemd Inter-CP theDistinguishedServiceCross…… ization can be smoothly incorporated into the original sparse representation problem (12). However, this is not true for Shortandstocky,Hillwasagiftedbatsman whocouldscorequicklywhenrequired. the traditional Laplacian regularization [9], [10], which may ''Wisden'' described Hill as a "specially CA+SA brilliantbatsmanonhardpitches". Hehad introduce extra parameters (hard to tune in practice) into anawkwardcrouchedstance,grippingthe batlowonthehandle.Thislimitedhis theL1-optimizationforsparserepresentation.Meanwhile,our feoffrewcatirvdenreeasschawndhepnowedrriavnindgredbuucetdhhies L -norm Laplacian regularization can induce another type compensatedforthiswithquickfootwork. Inter-CP 1 Hill'sstrongbottomhandandhiskeen…… of sparsity (see the extra noise term ξ ), which can not be i ensured by the traditionalLaplacian regularization.Moreover, Fig.3. Cross-view retrieval examples onthe Wikipedia benchmark dataset the p-Laplacian regularization [31] can also be regarded as [15].Here,theincorrectly retrieved images aremarkedwithredboxes. an ordinary L -generalization of the Laplacian regularization 1 when p = 1. 
According to [32], by defining a matrix C ∈ p [37]–[39], since these three tasksall aim to learnthe relations Rk(k2−1)×k, the p-Laplacian regularization can be formulated between the textand image views. However,evenif only text as ||Cpαi||1, similar to our L1-normLaplacian regularization. and image views are considered, cross-view retrieval is still Hence, we can similarly apply the p-Laplacian regularization quite different from automatic image annotation and image withp=1toconstrainedsparerepresentation.However,such captiongeneration.More concretely,automaticimage annota- Laplacianregularizationincurslargetimecostduetothelarge tionreliesonverylimitedtypesoftextualrepresentationsand matrix Cp even for small neighborhoodsize (e.g. k =90). mainly associates images only with textual keywords, while Once we have constructed the L1-graph GX = {X,WX} cross-viewretrievalisdesignedtodealwithmuchmorerichly over X, we can similarly construct the L1-graph GY = annotated data, motivated by the ongoing explosion of Web- {Y,WY}overY.Basedonthetwoweightmatrices,ourinter- based content such as news archives and Wikipedia pages. viewconstraintpropagationcanbeperformedwithconstrained Similar to cross-view retrieval, image caption generation can graph construction (CGC) (as shown in Fig. 1) using con- also deal with more richly annotated data (i.e. captions) with strained sparse representation (CSR) developed here. respectto the textualkeywordsconcernedin automatic image annotation. However, this task tends to model image captions IV. APPLICATION TOCROSS-VIEW RETRIEVAL as sentences by exploiting certain prior knowledge (e.g. the When multiple views refer to text, image, audio and so <object, action, scene> triplets used in [37]), different from on (see Fig. 3), the output of our inter-view constraint prop- cross-view retrieval that focuses on associating images with agation actually can be viewed as the correlation between complete text articles using no prior knowledge from the differentmediaviews.Aswehavementioned,giventheoutput text view (any general textual representations are applicable F∗ = {fi∗j}N×M of our inter-view constraint propagation, actually once their similarities are provided). (x ,y ) denotes a must-link (or cannot-link) constraint if In the context of cross-view retrieval, one notable recent i j f∗ > 0 (or < 0). Considering the inherent meanings of workis[15]whichfirstlearnsthecorrelationbetweenthetext ij must-link and cannot-link constraints, we can state that: x and image views with canonical correlation analysis (CCA) i and y are “positively correlated” if f∗ > 0, while they are [40] and then achieves the abstraction by representing text j ij “negatively correlated” if f∗ < 0. Hence, we can view f∗ and image at a more general semantic level. However, two ij ij as the correlation coefficient between x and y . The distinct separate steps, i.e. correlation analysis (CA) and semantic i j advantageofsuchinterpretationofF∗asacorrelationmeasure abstraction(SA),areinvolvedinthismodeling,andtheuseof is that F∗ can thus be used for ranking on Y given a query semantic abstraction after CCA (i.e. CA+SA) seems rather ad x or ranking on X given a query y . In fact, this is just the hoc.Fortunately,thisproblemcanbecompletelyaddressedby i j goal of cross-view retrieval which has drawn much attention ourinter-viewconstraintpropagation(Inter-CP).Thesemantic recently [15]. 
Once we have constructed the L_1-graph G_X = {X, W_X} over X, we can similarly construct the L_1-graph G_Y = {Y, W_Y} over Y. Based on the two weight matrices, our inter-view constraint propagation can be performed with constrained graph construction (CGC) (as shown in Fig. 1) using the constrained sparse representation (CSR) developed here.

[Fig. 3: three Wikipedia text queries shown alongside the images retrieved by CA+SA and by Inter-CP.]
Fig. 3. Cross-view retrieval examples on the Wikipedia benchmark dataset [15]. Here, the incorrectly retrieved images are marked with red boxes.

IV. APPLICATION TO CROSS-VIEW RETRIEVAL

When multiple views refer to text, image, audio, and so on (see Fig. 3), the output of our inter-view constraint propagation can actually be viewed as the correlation between different media views. As we have mentioned, given the output F* = {f_ij*}_{N×M} of our inter-view constraint propagation, (x_i, y_j) denotes a must-link (or cannot-link) constraint if f_ij* > 0 (or < 0). Considering the inherent meanings of must-link and cannot-link constraints, we can state that x_i and y_j are "positively correlated" if f_ij* > 0, while they are "negatively correlated" if f_ij* < 0. Hence, we can view f_ij* as the correlation coefficient between x_i and y_j. The distinct advantage of such an interpretation of F* as a correlation measure is that F* can thus be used for ranking on Y given a query x_i, or for ranking on X given a query y_j. In fact, this is just the goal of cross-view retrieval, which has drawn much attention recently [15]. That is, this task can be directly handled by our inter-view constraint propagation, as illustrated by the sketch below.
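Since F* plays the role of a cross-view correlation matrix, retrieval itself reduces to sorting a row or a column of F*; a minimal sketch with illustrative names:

```python
import numpy as np

def retrieve_images(F, i, top_k=10):
    # Rank the image view Y for text query x_i: descending f*_ij along row i.
    return np.argsort(-F[i])[:top_k]

def retrieve_texts(F, j, top_k=10):
    # Rank the text view X for image query y_j: descending f*_ij along column j.
    return np.argsort(-F[:, j])[:top_k]
```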
In this paper, we focus on a special case of cross-view retrieval, i.e., only the text and image views are considered. In this case, cross-view retrieval is somewhat similar to automatic image annotation [33]–[36] and image caption generation [37]–[39], since these three tasks all aim to learn the relations between the text and image views. However, even if only the text and image views are considered, cross-view retrieval is still quite different from automatic image annotation and image caption generation. More concretely, automatic image annotation relies on very limited types of textual representations and mainly associates images only with textual keywords, while cross-view retrieval is designed to deal with much more richly annotated data, motivated by the ongoing explosion of Web-based content such as news archives and Wikipedia pages. Similar to cross-view retrieval, image caption generation can also deal with more richly annotated data (i.e., captions) with respect to the textual keywords concerned in automatic image annotation. However, this task tends to model image captions as sentences by exploiting certain prior knowledge (e.g., the <object, action, scene> triplets used in [37]), different from cross-view retrieval, which focuses on associating images with complete text articles using no prior knowledge from the text view (actually, any general textual representations are applicable once their similarities are provided).

In the context of cross-view retrieval, one notable recent work is [15], which first learns the correlation between the text and image views with canonical correlation analysis (CCA) [40] and then achieves abstraction by representing text and image at a more general semantic level. However, two separate steps, i.e., correlation analysis (CA) and semantic abstraction (SA), are involved in this modeling, and the use of semantic abstraction after CCA (i.e., CA+SA) seems rather ad hoc. Fortunately, this problem can be completely addressed by our inter-view constraint propagation (Inter-CP). The semantic information (e.g., class labels) associated with images and text can be used to define the initial must-link and cannot-link constraints based on the training dataset, while the correlation between the text and image views can be explicitly learnt by the proposed algorithm in Section II. That is, correlation analysis and semantic abstraction have been successfully integrated in our inter-view constraint propagation framework. The effectiveness of such integration as compared to CA+SA [15] is preliminarily verified by the several cross-view retrieval examples shown in Fig. 3. Further verification will be provided in our later experiments. More notably, although only tested in cross-view retrieval, our inter-view constraint propagation can be readily extended to other multi-view tasks, since it has actually learnt the correlation between different views.

[Fig. 4: four panels plotting MAP for the Image Query and Text Query tasks against \alpha_X, \alpha_Y, \beta, and k in turn, with the remaining parameters fixed at \alpha_X = 0.025, \alpha_Y = 0.025, \beta = 0.95, k = 90.]
Fig. 4. The cross-view retrieval results by cross-validation on the training set of the Wikipedia dataset for our Inter-CP algorithm (CSR is used here).

V. EXPERIMENTAL RESULTS

In this section, our inter-view constraint propagation (Inter-CP) algorithm is evaluated in the challenging application of cross-view retrieval. We focus on comparing our Inter-CP algorithm with the state-of-the-art approach [15], since both consider not only correlation analysis (CA) but also semantic abstraction (SA) for the text and image views. Moreover, we also make a comparison with another two closely related approaches that integrate CA and SA for cross-view retrieval similarly to [15] but perform correlation analysis by partial least squares (PLS) [41] and cross-modal factor analysis (CFA) [42] instead of CCA, respectively. In the following, these two CA+SA approaches are denoted as CA+SA (PLS) and CA+SA (CFA), while the state-of-the-art approach [15] is denoted as CA+SA (CCA). Finally, to show the effectiveness of constrained graph construction, we construct four types of graphs for our Inter-CP algorithm: the k-NN graph (k-NN), the L_1-graph using sparse representation (SR), the k-NN graph using constrained weight adjustment (CWA), and the L_1-graph using constrained sparse representation (CSR).

A. Experimental Setup

We select two different datasets for performance evaluation. The first one is a Wikipedia benchmark dataset [15], which contains a total of 2,866 documents derived from Wikipedia's "featured articles". Each document is actually a text-image pair, annotated with a label from the vocabulary of 10 semantic classes. This benchmark dataset [15] is split into a training set of 2,173 documents and a test set of 693 documents. Moreover, the second dataset consists of 8,564 documents in total, crawled from the photo-sharing website Flickr. The image and text views of each document denote a photo and a set of tags provided by the users, respectively. Although such a text representation does not take a free form as that for the Wikipedia dataset, it is rather noisy, since many of the tags may be incorrectly annotated by the users. This Flickr dataset is organized into 11 semantic classes. We split it into a training set of 4,282 documents and a test set of the same size.

For the above two datasets, we take the same strategy as [15] to generate both the text and image representations. More concretely, in the Wikipedia dataset, the text representation for each document is derived from a latent Dirichlet allocation model with 10 latent topics, while the image representation is based on a bag-of-words model with 128 visual words learnt from the extracted SIFT descriptors, just as in [15]. Moreover, for the Flickr dataset, we generate the text and image representations similarly; the main difference is that we select a relatively large visual vocabulary (of size 2,000) for the image representation and refine the noisy textual vocabulary to the size 1,000 by a preprocessing step for the text representation.

In our experiments, the intra-view pairwise constraints used for our CGC and the inter-view pairwise constraints used for our Inter-CP are initially derived from the class labels of the training documents of each dataset. The performance of our Inter-CP with CGC is evaluated on the test set. Here, two tasks of cross-view retrieval are considered: text retrieval using an image query, and image retrieval using a text query. In the following, these two tasks are denoted as "Image Query" and "Text Query", respectively. For each task, the retrieval results are measured with mean average precision (MAP), which has been widely used in the image retrieval literature [18].

Let X denote the text representation and Y denote the image representation. For our Inter-CP algorithm, we perform CGC over X and Y with the same k. The parameters of our Inter-CP algorithm with CGC can be selected by fivefold cross-validation on the training set. For example, according to Fig. 4, we set the parameters of our Inter-CP (CSR is used for CGC) on the Wikipedia dataset as: \alpha_X = 0.025, \alpha_Y = 0.025, \beta = 0.95, and k = 90. It is noteworthy that our Inter-CP with CSR is not sensitive to these parameters. Moreover, the parameters of our Inter-CP with CWA can be similarly set to their respective optimal values. To summarize, we have selected the best values for all the parameters of our Inter-CP algorithm with CGC by cross-validation on the training set. For a fair comparison, we take the same parameter selection strategy for the other closely related algorithms.
B. Retrieval Results

The cross-view retrieval results on the two datasets are listed in Tables I and II, respectively. The immediate observation is that we achieve the best results when both intra-view and inter-view pairwise constraints are exploited by Inter-CP+CWA (or Inter-CP+CSR). This means that our Inter-CP with CGC can most effectively exploit the initial supervisory information provided for cross-view retrieval.

TABLE I
THE CROSS-VIEW RETRIEVAL RESULTS ON THE TEST SET OF THE WIKIPEDIA DATASET MEASURED BY THE MAP SCORES.

Methods          Image Query   Text Query   Average
CA+SA (PLS)      0.250         0.190        0.220
CA+SA (CFA)      0.272         0.221        0.247
CA+SA (CCA)      0.277         0.226        0.252
Inter-CP+k-NN    0.329         0.256        0.293
Inter-CP+SR      0.336         0.259        0.298
Inter-CP+CWA     0.337         0.260        0.299
Inter-CP+CSR     0.343         0.268        0.306

TABLE II
THE CROSS-VIEW RETRIEVAL RESULTS ON THE TEST SET OF THE FLICKR DATASET MEASURED BY THE MAP SCORES.

Methods          Image Query   Text Query   Average
CA+SA (PLS)      0.201         0.168        0.185
CA+SA (CFA)      0.252         0.231        0.242
CA+SA (CCA)      0.280         0.263        0.272
Inter-CP+k-NN    0.495         0.483        0.489
Inter-CP+SR      0.509         0.496        0.503
Inter-CP+CWA     0.521         0.499        0.510
Inter-CP+CSR     0.521         0.505        0.513

As compared to the three CA+SA approaches that perform semantic abstraction after correlation analysis (via PLS, CFA, or CCA), our Inter-CP can seamlessly integrate these two separate steps and thus leads to much better results. Moreover, the effectiveness of our CGC is verified by the comparisons Inter-CP+CWA vs. Inter-CP+k-NN (and Inter-CP+CSR vs. Inter-CP+SR), especially on the Flickr dataset. As for our two CGC methods, CSR is shown to perform better than CWA, which is mainly due to the noise-robustness property of sparse representation.

It should be noted that our Inter-CP algorithm can be considered to provide an efficient solution, since it has a time cost proportional to the number of all possible pairwise constraints. This is also verified by our observations in the experiments. For example, the running time taken by CA+SA (CCA, CFA, or PLS), Inter-CP+k-NN, and Inter-CP+CWA on the Wikipedia dataset is 10, 24, and 55 seconds, respectively. Here, we run all the algorithms (Matlab code) on a computer with a 3GHz CPU and 32GB RAM. Since our Inter-CP with CGC leads to significantly better results, we prefer it to CA+SA in practice, regardless of its relatively larger time cost.

VI. CONCLUSIONS

In this paper, we have investigated the challenging problem of pairwise constraint propagation on multi-view data. By decomposing the inter-view constraint propagation problem into a set of independent semi-supervised learning subproblems, we have uniformly formulated them as minimizing a regularized energy functional. More importantly, these semi-supervised learning subproblems can be solved efficiently using label propagation with the k-NN graph. We then developed two constrained graph construction methods for our inter-view constraint propagation, and the two obtained graphs can be considered variants of the k-NN graph. The experimental results in cross-view retrieval have shown the promising performance of our inter-view constraint propagation with constrained graph construction. For future work, our method will be extended to other multi-view tasks.

ACKNOWLEDGEMENTS

This work was supported by the National Natural Science Foundation of China under Grants 61202231 and 61222307, the National Key Basic Research Program (973 Program) of China under Grant 2014CB340403, the Beijing Natural Science Foundation of China under Grant 4132037, the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China under Grant 14XNLF04, and a grant from Microsoft Research Asia.

REFERENCES

[1] Z. Lu and H. H.-S. Ip, "Generalized competitive learning of Gaussian mixture models," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 4, pp. 901–909, 2009.
[2] Z. Lu, "An iterative algorithm for entropy regularized likelihood learning on Gaussian mixture with automatic model selection," Neurocomputing, vol. 69, no. 13, pp. 1674–1677, 2006.
[3] L. Wang, Z. Lu, and H. Ip, "Image categorization based on a hierarchical spatial Markov model," in CAIP, 2009, pp. 766–773.
[4] Z. Lu, Y. Peng, and J. Xiao, "From comparing clusterings to combining clusterings," in AAAI, 2008, pp. 665–670.
[5] Z. Lu and M. Carreira-Perpinan, "Constrained spectral clustering through affinity propagation," in CVPR, 2008, pp. 1–8.
[6] Z. Li, J. Liu, and X. Tang, "Pairwise constraint propagation by semidefinite programming for semi-supervised classification," in ICML, 2008, pp. 576–583.
[7] S. Yu and J. Shi, "Segmentation given partial grouping constraints," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 173–183, 2004.
[8] S. Kamvar, D. Klein, and C. Manning, "Spectral learning," in IJCAI, 2003, pp. 561–566.
[9] D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Schölkopf, "Learning with local and global consistency," in Advances in Neural Information Processing Systems 16, 2004, pp. 321–328.
[10] X. Zhu, Z. Ghahramani, and J. Lafferty, "Semi-supervised learning using Gaussian fields and harmonic functions," in ICML, 2003, pp. 912–919.
[11] Z. Lu and H. Ip, "Image categorization by learning with context and consistency," in CVPR, 2009, pp. 2719–2726.
[12] Z. Lu and L. Wang, "Noise-robust semi-supervised learning via fast sparse coding," Pattern Recognition, vol. 48, no. 2, pp. 605–612, 2015.
[13] Z. Lu and Y. Peng, "Combining latent semantic learning and reduced hypergraph learning for semi-supervised image categorization," in ACM Multimedia, 2011, pp. 1409–1412.
[14] Z. Lu and H. Ip, "Combining context, consistency, and diversity cues for interactive image categorization," IEEE Trans. Multimedia, vol. 12, no. 3, pp. 194–203, 2010.
[15] N. Rasiwasia, J. Costa Pereira, E. Coviello, G. Doyle, G. Lanckriet, R. Levy, and N. Vasconcelos, "A new approach to cross-modal multimedia retrieval," in ACM Multimedia, 2010, pp. 251–260.
[16] E. Eaton, M. desJardins, and S. Jacob, "Multi-view clustering with constraint propagation for learning with an incomplete mapping between views," in CIKM, 2010, pp. 389–398.
[17] Z. Fu, H. Ip, H. Lu, and Z. Lu, "Multi-modal constraint propagation for heterogeneous image clustering," in ACM Multimedia, 2011, pp. 143–152.
[18] M. Guillaumin, J. Verbeek, and C. Schmid, "Multimodal semi-supervised learning for image classification," in CVPR, 2010, pp. 902–909.
[19] E. Bruno, N. Moenne-Loccoz, and S. Marchand-Maillet, "Design of multimodal dissimilarity spaces for retrieval of video documents," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1520–1533, 2008.
[20] Z. Lu and H. Ip, "Constrained spectral clustering via exhaustive and efficient constraint propagation," in ECCV, vol. 6, 2010, pp. 1–14.
[21] Z. Fu, Z. Lu, H. H.-S. Ip, Y. Peng, and H. Lu, "Symmetric graph regularized constraint propagation," in AAAI, 2011.
[22] Z. Lu and Y. Peng, "Exhaustive and efficient constraint propagation: A graph-based learning approach and its applications," International Journal of Computer Vision, vol. 103, no. 3, pp. 306–325, 2013.
[23] D. Donoho, "For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution," Communications on Pure and Applied Mathematics, vol. 59, no. 7, pp. 797–829, 2004.
[24] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, 2009.
[25] Z. Lu and Y. Peng, "Latent semantic learning by efficient sparse coding with hypergraph regularization," in AAAI, 2011.
[26] H. Cheng, Z. Liu, and J. Yang, "Sparsity induced similarity measure for label propagation," in ICCV, 2009, pp. 317–324.
[27] B. Cheng, J. Yang, S. Yan, and T. Huang, "Learning with ℓ1-graph for image analysis," IEEE Trans. Image Processing, vol. 19, no. 4, pp. 858–866, Apr. 2010.
[28] Z. Lu and Y. Peng, "Image annotation by semantic sparse recoding of visual content," in ACM Multimedia, 2012, pp. 499–508.
[29] Z. Lu and Y. Peng, "Latent semantic learning with structured sparse representation for human action recognition," Pattern Recognition, vol. 46, no. 7, pp. 1799–1809, 2013.
[30] Z. Lu, P. Han, L. Wang, and J.-R. Wen, "Semantic sparse recoding of visual content for image applications," IEEE Trans. Image Processing, vol. 24, no. 1, 2015.
[31] D. Zhou and B. Schölkopf, "Regularization on discrete spaces," in DAGM, 2005, pp. 361–368.
[32] X. Chen, Q. Lin, S. Kim, J. G. Carbonell, and E. P. Xing, "Smoothing proximal gradient method for general structured sparse learning," in UAI, 2011, pp. 105–114.
[33] J. Li and J. Wang, "Automatic linguistic indexing of pictures by a statistical modeling approach," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075–1088, Sept. 2003.
[34] S. Feng, R. Manmatha, and V. Lavrenko, "Multiple Bernoulli relevance models for image and video annotation," in CVPR, vol. 2, 2004, pp. 1002–1009.
[35] Z. Lu, H. Ip, and Y. Peng, "Contextual kernel and spectral methods for learning the semantics of images," IEEE Trans. Image Processing, vol. 20, no. 6, pp. 1739–1750, 2011.
[36] Z. Lu and H. Ip, "Spatial Markov kernels for image categorization and annotation," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 41, no. 4, pp. 976–989, 2011.
[37] A. Farhadi, M. Hejrati, M. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth, "Every picture tells a story: Generating sentences from images," in ECCV, vol. 4, 2010, pp. 15–29.
[38] G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg, and T. Berg, "Baby talk: Understanding and generating simple image descriptions," in CVPR, 2011, pp. 1601–1608.
[39] V. Ordonez, G. Kulkarni, and T. Berg, "Im2Text: Describing images using 1 million captioned photographs," in Advances in Neural Information Processing Systems 24, 2012, pp. 1143–1151.
[40] H. Hotelling, "Relations between two sets of variates," Biometrika, vol. 28, no. 3-4, pp. 321–377, 1936.
[41] H. Wold, "Partial least squares," in Encyclopedia of Statistical Sciences, S. Kotz and N. Johnson, Eds. New York: Wiley, 1985, pp. 581–591.
[42] D. Li, N. Dimitrova, M. Li, and I. K. Sethi, "Multimedia content processing through cross-modal association," in ACM Multimedia, 2003, pp. 604–611.
