Agile Inexact Methods for Spectral Projector-Based Graph Fourier Transforms

Joya A. Deri, Member, IEEE, and José M. F. Moura, Fellow, IEEE

Abstract—We propose an inexact method for the graph Fourier transform of a graph signal, as defined by the signal decomposition over the Jordan subspaces of the graph adjacency matrix. This method projects the signal over the generalized eigenspaces of the adjacency matrix, which accelerates the transform computation over large, sparse, and directed adjacency matrices. The trade-off between execution time and fidelity to the original graph structure is discussed, as are properties such as a generalized Parseval's identity and a total variation ordering of the generalized eigenspaces. The method is applied to 2010-2013 NYC taxi trip data to identify traffic hotspots on the Manhattan grid. Our results show that identical highly expressed geolocations can be identified with the inexact method and with the method based on eigenvector projections, while reducing computation time by a factor of 26,000 and reducing energy dispersal among the spectral components corresponding to the multiple zero eigenvalue.

Index Terms—Jordan decomposition, generalized eigenspaces, directed graphs, sparse matrices, signal processing on graphs, large networks, inexact graph Fourier transforms

I. INTRODUCTION

Graph signal processing [1], [2], [3], [4], [5] allows for analysis of signals over graph structures with the framework of digital signal processing. Properties that have recently been explored include graph filtering [3], [5] and sampling and recovery of graph signals [6], [7], [8]. This paper focuses on considerations for applying adjacency matrix-based graph signal processing as in [1], [9], [10], [11] to real-world networks.

Real-world networks are often characterized by two salient features: (1) extremely large data sets and (2) non-ideal network properties that require extensive computation to obtain suitable descriptions of the network. These features can result in extremely long execution times for the analyses of interest. This paper presents methods that significantly reduce these computation times.

The adjacency matrices of real-world large, directed, and sparse networks may be defective. For such matrices, [1] defines the eigendecomposition of the matrix in terms of the Jordan decomposition. This requires the determination of Jordan chains, which are expensive to compute, potentially numerically unstable, and non-unique with respect to a set of fixed eigenvectors, resulting in non-unique coordinate graph Fourier transforms. In [11], we reformulate the graph signal processing framework so that the graph Fourier transform is unique over defective networks.

Since our objective is to provide methods that can be applied to real-world systems, it is necessary to consider the associated computational cost. Efficient algorithms and fast computation methods are essential, and with modern computing systems, parallelization and vectorization of software allow decreased computation times [12]. In this paper, we provide a method to accelerate computation by relaxing the determination of the Jordan decomposition of an adjacency matrix.

Consider a graph G = G(A) with adjacency matrix A ∈ C^{N×N} and a graph signal s ∈ C^N, which characterizes properties or behaviors at each node of the graph. The graph Fourier transform of [1] is defined as follows. If the adjacency matrix has Jordan decomposition A = VJV^{-1}, the graph Fourier transform of a signal s ∈ C^N over G is defined as

    s̃ = V^{-1} s,    (1)

where V^{-1} is the Fourier transform matrix of G. This is essentially a projection of the signal onto the proper and generalized eigenvectors of A. It is an orthogonal projection when A is normal (A^H A = A A^H) and the eigenvectors form a unitary basis (i.e., V^{-1} = V^H).
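As a concrete illustration of (1), the sketch below computes the GFT of an impulse over a small diagonalizable directed graph. The 4-node directed cycle and the impulse signal are illustrative choices only, not taken from the paper's data.

```python
import numpy as np

# Adjacency matrix of a 4-node directed cycle (illustrative example).
N = 4
A = np.roll(np.eye(N), 1, axis=1)

# For this A the Jordan form J is diagonal, so eig yields A = V J V^{-1}.
eigvals, V = np.linalg.eig(A)

s = np.array([1.0, 0.0, 0.0, 0.0])   # impulse graph signal
s_tilde = np.linalg.solve(V, s)      # GFT of (1): s~ = V^{-1} s

# Applying V inverts the transform and recovers the signal.
assert np.allclose(V @ s_tilde, s)
```

For this normal adjacency matrix the projection is orthogonal; for a defective A the same formula applies, but V must be completed with generalized eigenvectors.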
For defective, or non-diagonalizable, adjacency matrices, which have at least one eigenvalue λ_i with algebraic multiplicity (the exponent in the characteristic polynomial of A) greater than the geometric multiplicity (the kernel dimension of A − λ_i I, or dim Ker(A − λ_i I), where I is the N × N identity matrix), the corresponding eigenvectors may not span C^N. The basis can be completed by computing Jordan chains of generalized eigenvectors [13], [14], but the computation introduces degrees of freedom that render these generalized eigenvectors non-unique. This is addressed in [11] by defining a spectral projector-based GFT in terms of projections onto the Jordan subspaces of the adjacency matrix of the graph. Formally, the GFT of a signal s ∈ C^N over graph G = G(A), where adjacency matrix A with k distinct eigenvalues has Jordan decomposition A = VJV^{-1} and Jordan subspaces {J_ij}, i = 1, ..., k, j = 1, ..., g_i, with g_i = dim Ker(A − λ_i I), is defined as the mapping

    F : S → ⊕_{i=1}^{k} ⊕_{j=1}^{g_i} J_ij
        s ↦ (s̃_11, ..., s̃_{1g_1}, ..., s̃_{k1}, ..., s̃_{kg_k}),    (2)

where each s̃_ij represents the projection of s onto Jordan subspace J_ij parallel to S \ J_ij.

The spectral projector GFT requires Jordan chain vector computations for non-diagonalizable, or defective, adjacency matrices, which can be exceedingly inefficient for large, directed matrices with many nontrivial Jordan subspaces. The desire for a more efficient transform computation motivates the inexact approaches proposed in this paper. In particular, we propose an inexact method for computing the graph Fourier transform that simplifies the choice of projection subspaces. This method is motivated by insights based on graph equivalence classes in [15] and dramatically reduces computation on real-world networks. The runtime vs. fidelity trade-off associated with this method is explored.

To obtain the inexact method, the spectral components are redefined as the generalized eigenspaces of adjacency matrix A ∈ C^{N×N}, and the resulting properties are discussed in Section II. Section III establishes the generalized Parseval's identity with respect to the inexact method. An equivalence class with respect to the inexact method is discussed in Section IV, which leads to a total variation ordering of the new spectral components as discussed in Section V. Section VI discusses the utility of the inexact method for real-world network analysis, and Section VII describes the trade-offs associated with its use. Lastly, Section VIII demonstrates the application of the inexact GFT to New York City taxi data on the Manhattan street network. Our results demonstrate the speed of the inexact method and show that it reduces energy dispersal over the spectral component corresponding to eigenvalue λ = 0.

(This work was partially supported by NSF grants CCF-1011903 and CCF-1513936 and an SYS-CMU grant. The authors are with the Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 USA; email: jderi, [email protected].)

II. AGILE INEXACT METHOD FOR ACCELERATING GRAPH FOURIER TRANSFORM COMPUTATIONS

This section defines the Agile Inexact Method (AIM) for computing the graph Fourier transform (2) (GFT) of a signal. The inexactness of the AIM results from the abstraction of the Jordan subspaces that characterize (2); this concept is discussed further in this section. The agility of the AIM is related to the Jordan equivalence classes introduced in [15], which demonstrate that the degrees of freedom allowed by defective matrices yield GFT equivalence across networks of various topologies. Sections IV, VI, and VII discuss the increased computational efficiency that the AIM allows. We emphasize that reducing computation time is a major objective for formulating the inexact method.

The Agile Inexact Method (AIM) for computing the graph Fourier transform of a signal is defined as the signal projections onto the generalized eigenspaces of the adjacency matrix A. When A has k distinct eigenvalues, the definition of a generalized eigenspace G_i (see [13], [16]) is

    G_i = Ker(A − λ_i I)^{m_i},  i = 1, ..., k,    (3)

for corresponding eigenvalue λ_i and eigenvalue index m_i. Each G_i is the direct sum of the Jordan subspaces of λ_i.

The direct sum of generalized eigenspaces provides a unique decomposition of the signal space. This leads to the definition of the Agile Inexact Method for computing the graph Fourier transform of a graph signal s ∈ S:

    H : S → ⊕_{i=1}^{k} G_i
        s ↦ (s̆_1, ..., s̆_k).    (4)

The method when applied to signal s yields the unique decomposition

    s = Σ_{i=1}^{k} s̆_i,  s̆_i ∈ G_i,    (5)

where each s̆_i is the (oblique) projection of s onto the ith generalized eigenspace G_i parallel to S \ G_i, or ⊕_{j≠i} G_j.

We define the projection operator for the inexact method. Denote eigenvector matrix

    V = [v_{1,1} ··· v_{1,a_1} ··· v_{k,1} ··· v_{k,a_k}],    (6)

where v_{i,j} represents an eigenvector or generalized eigenvector of λ_i and a_i is the algebraic multiplicity of λ_i, i = 1, ..., k, j = 1, ..., a_i. Let V_{i0} represent the N × N matrix that consists of zero entries except for the columns containing v_{i,1} through v_{i,a_i}. The signal expansion components for a signal s ∈ C^N are given by s = V s̃, where

    s̃ = [s̃_{1,1} ··· s̃_{1,a_1} ··· s̃_{k,1} ··· s̃_{k,a_k}]^T.    (7)

Therefore,

    s̆_i = s̃_{i,1} v_{i,1} + ··· + s̃_{i,a_i} v_{i,a_i}    (8)
        = V_{i0} s̃    (9)
        = V_{i0} V^{-1} s    (10)
        = V diag(0_{a_1}, ..., I_{a_i}, ..., 0_{a_k}) V^{-1} s    (11)
        = Z_{i0} s,    (12)

where

    Z_{i0} = V diag(0_{a_1}, ..., I_{a_i}, ..., 0_{a_k}) V^{-1}    (13)

is the first component matrix of the ith generalized eigenspace [13]; here I_{a_i} is the a_i × a_i identity matrix, 0_{a_j} is the a_j × a_j zero matrix, and all remaining entries in (11) and (13) are zero. The matrix Z_{i0} (13) projects s ∈ C^N onto the generalized eigenspace G_i parallel to ⊕_{j≠i} G_j. Note that the projection may be oblique. The reader is directed to Theorems 9.5.2 and 9.5.4 in [13] for more properties of (13).

The AIM provides a unique, Fourier transform-like projection space for a signal that is analogous to the definition (2) based on Jordan subspaces because the signal space is a direct sum of the generalized eigenspaces. The generalized eigenspaces are not necessarily irreducible components of the signal space, however, while the Jordan subspaces do provide an irreducible decomposition of the signal space. In terms of algebraic signal processing [17], [18], this means that the Jordan subspace formulation (2) can define a Fourier transform of a signal, while the generalized eigenspace formulation (4) is inexact and is not strictly a formulation of a true graph Fourier transform. The projections onto the generalized eigenspaces are additions of the spectral components of the GFT (2) and can be considered to be inexact analogues to the spectral components.

Note that the AIM (4) resolves to the Jordan subspace-based GFT (2) whenever the maximum number of Jordan subspaces per distinct eigenvalue is one (i.e., the characteristic and minimal polynomials of A are equal). Otherwise, Jordan subspace information is lost when the AIM is applied.

The following sections present key properties of (4), with particular emphasis on the generalized Parseval's identity, graph equivalence classes with respect to (4), and total variation ordering.

III. GENERALIZED PARSEVAL'S IDENTITY

Since the full graph Fourier basis V may not be unitary, the signal energy can be characterized by a generalized Parseval's identity. This identity is defined in [19] and used in [11] to characterize the spectral components. The identity is presented as a property below:

Property 1 (Generalized Parseval's Identity). Consider graph signal s ∈ C^N over graph G(A), A ∈ C^{N×N}. Let V = [v_1 ··· v_N] be a Jordan basis for A with dual basis matrix W = V^{-H} partitioned as [w_1 ··· w_N]. Let s_V and s_W be the representations of s in bases V and W; that is, s = Σ_{i=1}^{N} ⟨s, w_i⟩ v_i = V s_V and s = Σ_{i=1}^{N} ⟨s, v_i⟩ w_i = W s_W. Then

    ‖s‖² = ⟨s, s⟩ = ⟨s_V, s_W⟩.    (14)

In this way, a biorthogonal basis set can be used to define signal energy. In particular, for a signal s over graph G = G(A), V represents the eigenvector matrix of graph adjacency matrix A, and the columns of V and W = V^{-H} form a dual basis. Thus, the expansions of signal s in these bases yield coefficient vectors s_V and s_W that satisfy (14).

We define the energy of a signal projection onto an AIM GFT component s̆_i. Suppose s̆_i lies in span(v_{i,1}, ..., v_{i,a_i}), where v_{i,1}, ..., v_{i,a_i} are columns of V with corresponding columns w_{i,1}, ..., w_{i,a_i} of W. For signal s over graph G = G(A), write s̆_i in terms of the columns of V as

    s̆_i = α_1 v_{i,1} + ··· + α_{a_i} v_{i,a_i}    (15)

and in terms of the columns of W as

    s̆_i = β_1 w_{i,1} + ··· + β_{a_i} w_{i,a_i}.    (16)

Note that α = (α_1, ..., α_{a_i}) and β = (β_1, ..., β_{a_i}) are subsets of the coefficient vectors s_V and s_W that satisfy (14), respectively. The energy of s̆_i can then be defined as

    ‖s̆_i‖² = ⟨α, β⟩.    (17)

Equation (17) is used in Section VIII to compare the inexact method (4) and the original GFT (2).

IV. GRAPH EQUIVALENCE UNDER THE AGILE INEXACT METHOD

Just as different choices of Jordan subspace bases yield the same GFT over Jordan equivalence classes of graph topologies in [11], different choices of bases for the generalized eigenspaces of the adjacency matrix yield the same inexact GFT. We define a G-equivalence class as follows:

Definition 2 (G-Equivalent Graphs). Consider graphs G(A) and G(B) with adjacency matrices A, B ∈ C^{N×N}. Then G(A) and G(B) are G-equivalent graphs if all of the following are true:
1) The eigenvalues of A and B are identical (including algebraic and geometric multiplicities); that is, G(A) and G(B) are cospectral; and
2) {G_{A,i}}_{i=1}^{k} = {G_{B,i}}_{i=1}^{k}, where G_{A,i} and G_{B,i} are the ith generalized eigenspaces of A and B, respectively, and k is the number of distinct eigenvalues.

The set of all graphs that satisfy Definition 2 with respect to a graph adjacency matrix A with Jordan decomposition A = VJV^{-1} forms the G-equivalence class of G(A).
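The basis freedom behind this equivalence can be sketched numerically: for a small defective matrix (an illustrative example, not one of the paper's networks), two different choices of Jordan chain vectors produce the same oblique projector Z_{i0} onto a generalized eigenspace, hence the same AIM projections.

```python
import numpy as np

# Defective illustrative matrix: lambda = 0 has a length-2 Jordan chain
# spanning G_0 = span{e1, e2}; lambda = 2 has eigenvector e3.
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 2.]])

def projector(V, cols):
    # Z = V I_cols V^{-1}: oblique projector onto the span of selected columns
    S = np.zeros((3, 3))
    S[cols, cols] = 1.0
    return V @ S @ np.linalg.inv(V)

V1 = np.eye(3)                        # Jordan basis: e1, chain vector e2, e3
V2 = V1.copy()
V2[:, 1] = V2[:, 1] + 3.0 * V2[:, 0]  # another valid chain vector for lambda = 0
V2[:, 2] = 5.0 * V2[:, 2]             # rescaled eigenvector for lambda = 2

Z1 = projector(V1, [0, 1])            # projection onto G_0
Z2 = projector(V2, [0, 1])
assert np.allclose(Z1, Z2)            # identical AIM projections
assert np.allclose(Z1 @ Z1, Z1)       # Z is a (possibly oblique) projector
```

Any basis change within each generalized eigenspace leaves the projectors, and therefore the AIM components (5), unchanged.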
Denote by G_A the G-equivalence class of G(A).

The power of the G-equivalence classes comes from the observation that a graph adjacency matrix A = VJV^{-1} may have the same eigenvalues and generalized eigenspaces as a matrix B = ṼJṼ^{-1}. For the AIM (4), the basis provided by the columns of Ṽ yields identical signal projections to that provided by the columns of V. In addition, when Ṽ can be computed without finding Jordan chain vectors, the overall computation time for the eigendecomposition decreases. Section VII demonstrates the utility of these equivalence classes.

In contrast to the Jordan equivalence classes in [15], G-equivalence classes are independent of Jordan subspaces and their dimensions. This allows for greater degrees of freedom in basis choices that yield the same signal projections. As a result, the eigendecomposition required for the AIM (4) can be simpler and faster to compute than that for the GFT (2). A particular case is discussed in greater detail in Section VII.

The next section defines a total variation ordering over the generalized eigenspaces G_i (3) of the graph adjacency matrix. A class total variation is proposed to account for different choices of Fourier bases that correspond to a particular G-equivalence class.

V. TOTAL VARIATION ORDERING

In [11] and [15], we defined an ordering on the GFT spectral components based on the total variation of a signal. This allows ordering the frequency components so as to define low-pass, band-pass, and high-pass graph signals with respect to specific graphs. Using a total variation-based characterization signifies that the variation between a signal and its transformation by the graph adjacency matrix is less for low-pass signals than for high-pass signals. In this section, such an ordering is shown for GFT spectral components corresponding to the Agile Inexact Method (AIM).

Denote by V_i the submatrix of columns of eigenvector matrix V that span G_i, the ith generalized eigenspace. The (graph) total variation of V_i is defined as

    TV_G(V_i) = ‖V_i − A V_i‖_1.    (18)

The definition (18) of total variation has been characterized in [11], where each V_i spans a Jordan subspace instead of a generalized eigenspace. Since the generalized eigenspaces are direct sums of Jordan subspaces (see Section II), many properties hold for the total variation over generalized eigenspaces. Theorem 3 shows that the total variation is invariant to a permutation of node labels. Theorem 4 shows an upper bound on this total variation. These theorems follow immediately from the proofs of Theorems 25 and 28 of [11]; they show the dependence of the total variation (18) on the choice of eigenvector matrix V and motivate the definition of basis-independent class total variations as in [15].

Theorem 3. Let A, B ∈ C^{N×N} such that G(B) is isomorphic to G(A). Let V_{A,i} ∈ C^{N×a_i} be a union of Jordan chains that span generalized eigenspace G_i of matrix A and V_{B,i} ∈ C^{N×a_i} the corresponding union of Jordan chains of B. Then TV_G(V_{A,i}) = TV_G(V_{B,i}).

We briefly note that the generalized eigenspaces of the adjacency matrices A and B in Theorem 3 are not necessarily equivalent but are related by an isomorphism. The reader is referred to [15] for more on isomorphic equivalence classes.

Theorem 4. Consider adjacency matrix A with k distinct eigenvalues and N × a_i eigenvector submatrices V_i with columns corresponding to the union of the Jordan chains of λ_i, i = 1, ..., k. Then the graph total variation is bounded as

    TV_G(V_i) ≤ |1 − λ_i| + 1.    (19)

The bound (19) highlights that the choice of Jordan chain vectors in the eigendecomposition of the adjacency matrix affects the total variation. In order to gain independence from a choice of basis, the definition of total variation can be generalized to a class total variation defined over the G-equivalence class associated with the graph of adjacency matrix A. We define the class total variation of spectral component G_i as the supremum of the graph total variation of V_i over the G-equivalence class:

    TV_{G_A}(G_i) = sup TV_G(V_i),    (20)

where the supremum is taken over all G(B) ∈ G_A with B = VJV^{-1}, span{V_i} = G_i, and ‖V_i‖_1 = 1. In particular, the low-to-high variation sorting is achieved via the function f(λ_i) = |1 − λ_i| + 1.
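For intuition, the total variation (18) can be evaluated directly on a small example. The directed cycle below is illustrative; its eigenvector submatrices are single columns, normalized here so that ‖V_i‖_1 = 1.

```python
import numpy as np

# Directed cycle (illustrative): all eigenvalues are simple, so each V_i is
# a single eigenvector column and TV_G(V_i) has a closed form.
N = 4
A = np.roll(np.eye(N), 1, axis=1)
lams, V = np.linalg.eig(A)

def tv(A, Vi):
    # Graph total variation (18): matrix 1-norm of V_i - A V_i
    return np.linalg.norm(Vi - A @ Vi, ord=1)

for i, lam in enumerate(lams):
    Vi = V[:, [i]]
    Vi = Vi / np.abs(Vi).sum()       # normalize so ||V_i||_1 = 1
    # For an eigenvector column, V_i - A V_i = (1 - lambda_i) V_i, so
    # TV_G(V_i) = |1 - lambda_i|, consistent with the bound (19).
    assert np.isclose(tv(A, Vi), abs(1 - lam))
    assert tv(A, Vi) <= abs(1 - lam) + 1
```

For generalized eigenspaces of dimension greater than one the chain vectors also contribute, which is what the class total variation (20) abstracts away.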
VI. APPLICABILITY TO REAL-WORLD NETWORKS

This section discusses how the AIM can be applied to real-world large, sparse, and directed networks. We discuss examples that illustrate that adjacency matrices for these networks have a single generalized eigenspace of dimension greater than one. We demonstrate the ease of computing the AIM for such networks.

A. Large, Directed, and Sparse Networks

In practice, sparsity in directed networks yields a zero eigenvalue of high algebraic and geometric multiplicities that may not be equal. This behavior can be illustrated using two examples. The first is a directed, strongly connected road network G_rd = G(A_rd) = (V_rd, E_rd) of Manhattan, New York [20]; G_rd contains 6,408 nodes that represent the latitude and longitude coordinates of intersections and mid-intersection locations (available from [21]), as well as 14,418 directed edges, where edge e = (v_1, v_2) represents a road segment along which traffic is legally allowed to move from v_1 to v_2 as determined from Google Maps [20]. The second network example is the largest weakly connected component G_bl = G(A_bl) = (V_bl, E_bl) of a political blog network, with 1,222 blogs as nodes and 19,024 directed edges, where edge e = (v_1, v_2) exists between blogs v_1, v_2 ∈ V_bl if v_1 contains a link to v_2 [22].

Fig. 1: (a) Eigenvalue magnitudes of a directed New York City road network adjacency matrix. The magnitudes are small in the index range 1 to 699 (up to the dotted line). (b) Road network eigenvalues (699 total) in the sorted index range 1 to 699 plotted on the complex plane. (c) Eigenvalue magnitudes of the largest weakly connected component of a political blog network. The magnitudes are small in the index range 1 to 548 (up to the dotted line). (d) Blog eigenvalues (548 total) in the sorted index range 1 to 548.

Figures 1b and 1d show that the numerical eigenvalues of these real-world networks occur in clusters on the complex plane. These eigenvalues have small magnitude (lying left of the dashed lines in Figures 1a and 1c). On first inspection, it is not obvious whether the numerical eigenvalues are spurious eigenvalues clustered around a true eigenvalue of zero, or whether the numerical eigenvalues are indeed the true eigenvalues.¹ This phenomenon is typically observed in systems with multiple eigenvalues [23], [24], [25]. In practice, we can verify that numerical zero is an eigenvalue using either pseudospectra [26] or techniques that explore the numerical stability of the singular value decomposition [27], [28]. For the examples explored here, the singular value decomposition demonstrated that the clusters of low-magnitude eigenvalues represented a numerical zero eigenvalue. This method involves comparing small singular values to machine precision and is discussed in more detail in Section VIII-C.

¹The eigenvalues in Figure 1 were computed in MATLAB. Similar clusters appear using eig without balancing and schur. They also appear when analyzing row-stochastic versions of the asymmetric adjacency matrices.

Furthermore, the adjacency matrices A_rd and A_bl have kernels of dimension 446 and 439, respectively², confirming that eigenvalue λ = 0 has high algebraic (and geometric) multiplicity. In addition, eigenvector matrices V_rd and V_bl computed with MATLAB's eigensolver have numerical rank 6363 < 6408 and 910 < 1222, respectively, implying the existence of nontrivial Jordan subspaces (Jordan chains of length greater than one) for λ = 0. While these general eigenspace properties can be readily determined, a substantial amount of additional computation is required to determine the dimensions of each Jordan subspace for eigenvalue λ = 0. For example, the Jordan subspaces for λ = 0 can be deduced by computing a generalized null space as in [30], but this computation takes O(N³) or O(N⁴) operations for an N × N matrix. On the other hand, the eigenvectors corresponding to eigenvalues of higher magnitude in these networks are full rank. In other words, from a numerical perspective, the Jordan subspaces for these eigenvalues are all one-dimensional.

²The kernels (null spaces) were computed in MATLAB [29] on a 16-GB, 16-core Linux machine.

In such networks, for which sparsity yields a high-multiplicity zero eigenvalue while the other eigenvalues correspond to a full-rank eigenvector submatrix, the AIM (4) can be applied in the following way. Denote the eigenvector matrix V of graph adjacency matrix A such that

    V = [V_known  Ṽ],    (21)

where V_known is the full-rank eigenvector submatrix corresponding to the nonzero eigenvalues, and the columns of submatrix Ṽ span the generalized eigenspace G_0 corresponding to eigenvalue zero.

One way to find a spanning set of columns for Ṽ is to find the kernel of V_known^T, since the range space of V_known and the kernel of V_known^T together span C^N. The resulting inexact graph Fourier basis is then

    V̂ = [V_known  Ker(V_known^T)].    (22)

If the original network has adjacency matrix A = VJV^{-1}, then (22) implies that the network with adjacency matrix Â = V̂JV̂^{-1} is in the same G-equivalence class as the original network with adjacency matrix A = VJV^{-1}. As discussed in Section IV, the AIM GFT is equivalent over both networks. Section VII shows that computing the Fourier basis as in (22) is computationally fast and efficient.

In this way, for large, sparse, and directed real-world networks that exhibit a single zero eigenvalue of high multiplicity, the AIM can simplify the computation of a graph Fourier basis. Instead of computing Jordan chains or generalized null spaces to deduce the structure of the Jordan subspaces, a single kernel computation is needed to compute the spanning basis. This idea is explored further in Section VII.
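A minimal numerical sketch of the completion (22) follows, with a small illustrative V_known standing in for the road network's eigenvector submatrix. NumPy has no null-space helper, so one is built from the SVD here.

```python
import numpy as np

# Illustrative full-rank "known eigenvector" submatrix in C^4 (real here).
V_known = np.array([[1., 0.],
                    [0., 1.],
                    [1., 1.],
                    [0., 0.]])

def kernel(M, tol=1e-10):
    # Orthonormal basis of Ker(M) from the right singular vectors of M.
    _, s, Vh = np.linalg.svd(M)
    null_mask = np.ones(Vh.shape[0], dtype=bool)
    null_mask[: s.size] = s <= tol
    return Vh[null_mask].T

V_missing = kernel(V_known.T)            # complements the range of V_known
V_hat = np.hstack([V_known, V_missing])  # inexact graph Fourier basis (22)

assert V_hat.shape == (4, 4)
assert np.linalg.matrix_rank(V_hat) == 4
assert np.allclose(V_known.T @ V_missing, 0.0)
```

The single SVD/kernel computation replaces the per-chain iterations of the Jordan chain approach, which is the source of the speedup quantified in Section VII.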
Multiple Nontrivial Jordan Subspaces In the case of multiple distinct eigenvalueswith large VII. RUNTIME VS. FIDELITYTRADE-OFF but unequal algebraic and geometric multiplicities, it While maximizing the fidelity of the network rep- maybepossibletocomputeafewJordanchainvectorsto resentation is desirable in general, obtaining as much differentiatethe nontrivialgeneralizedeigenspaces.This knowledge of the Jordan chains as possible is compu- approachisonlycomputationallyefficientif the number tationally expensive. As shown below, execution time of Jordan chain vectors to compute is relatively small, is linear with respect to the number of Jordan chain as discussed in [14], [31]. vectorstocompute.Inpractice,however,computationof When this is not feasible, a coarser inexact method increasingnumbersofchainsrequiresallocatingmemory can be obtained by reformulating (22) so that V known for matrices of increasing size. This memory alloca- consistsofallknowneigenvectorsandisfullrank.Then tion time can increase nonlinearly once specific limits abasisofthekernelofVT canbecomputedtoobtain known (e.g., RAM) imposed by the computing hardware are a set of spanning vectors. exceeded. Since practical applications of GFT analysis It is unknown a priori which basis vectors may be constrained by execution time requirements of KerVT belong to the nontrivial generalized known and available computing hardware, methods that allow eigenspaces of A. This can be overcome by randomly fidelity vs. speedup tradeoffs will prove useful. assigning these vectors to the nontrivial generalized Our inexact GFT approach provides precisely such a eigenspaces, which results in a coarser inexact method. method for trading between higher fidelity analysis (a The dimensionsof each generalizedeigenspace must be greater number of Jordan chains computed) and execu- known so that the correctnumber of vectors is assigned tiontime,asdiscussedin SectionVI.Inthissection,the to each eigenspace. 
The dimensions can be determined details of such trade-offs are discussed in detail. by recursively computing for eigenvalue λ i To illustrate the cost of computing the Jordan chain f(l)=dimKer(A−λ I)l−dimKer(A−λ I)l−1 (23) vectors of A ∈ CN×N that complete the generalized i i eigenspace of eigenvalue λ , consider the general algo- i for l = 2,...,N; this equation provides the number rithm given by: of Jordan chains of length at least l [13]. The value of dimKer(A−λ I)l−1 forl >1 atwhich f(l)=0 is the i 1: function COMPUTE_CHAINS dimension of generalized eigenspace Gi. If |λmax| > 1, 2: V = FULL_RANK_EIGENVECTORS(A) theconditionf(l)=0maynotbeattained;instead,f(l) 3: R = PSEUDOINVERSE(A−λiI) becomesamonotonicallyincreasingfunctionforlargel. 4: N = KER(A−λiI) In this case, the value of dimKer(A − λ I)l−1 for l 5: Nnew=[ ] i 6: for c in 1..maximum chain length do at which f(l) becomes monotonically increasing is the 7: for v in N do approximate generalized eigenspace dimension. 8: if [(A−λiI) v] full rank then 6 9: v2 = R*v [(A−λI) v] is 10: Append v2 to V. 11: Append v2 to Nnew. O(MN(j))·m(MN(j)), (28) 12: end if 13: end for where m(·) is the platform-dependent time per unit of 14: N = Nnew memoryallocationasafunctionofthematrixsize.Since 15: end for each for loop allocates memory in this way, the total 16: end function memory allocation time is bA The eigenvector and pseudoinverse computations are O(MN(j))·m(MN(j)). (29) X pre-processing steps that can be executed just once for j=1 a network with stationary topology. Efficient methods Assumingc is the platform-dependenttime per floating- for these computations include power iterations, QR point operation and SVD constants k = 4 and k′ =22, algorithms, and, for large, sparse matrices, Lanczos the total expected runtime for the Jordan chain compu- algorithms and Krylov subspace methods [14]. 
The key tation can be approximated as bottlenecksofconcernresideintheforloop.Thisloop bA consistsofarank-checkingsteptoverifythatthecurrent c(4MN(j)2+22M3+2MN(j)) eigenvectorisintherangespaceofA−λ I.Thisstepcan X i j=1 be implementedusingthe singularvalue decomposition, +MN(j)m(MN(j)), (30) which has O(kMN2 + k′M3) operations [14] on an M × N matrix, M > N; constants k and k′ could wherecistheplatform-dependenttimeperfloating-point be 4 and 22, respectively, as in the R-SVD algorithm operation and the full SVD complexity coefficients are of[14].Inaddition,thematrix-vectorproductinthefor set to k =4 and k′ =22. loop takes about 2MN operations [14]. The original Equation (30) shows that the runtime for the Jordan dimension of V is M × N(j), where the number of chain computations is approximately linear with the columns N(j) approaches M from below as the num- number of missing vectors one needs to compute. In ber j of vectors traversed increases. The resulting time practice,however,thetimetoallocatememorycanscale complexity for a single iteration is nonlinearly with the number of nodes as the size of the eigenvector matrix approaches system capacity. For O(kMN(j)2+k′M3)+O(2MN(j)). (24) networks at massive scale, this can present a significant problem. Thefirstforloopincompute_chainsiteratesm i In contrast, the algorithm to compute an orthogonal times, where m is the maximum chain length corre- i set of missing vectors is very fast. The algorithm is the sponding to λ . A method for estimating m is demon- i i following: stratedinSectionVIII-D.Thesecondforloopwillfirst run for g iterations, where g is the kernel dimension i i 1: function COMPUTE_MISSING of A−λiI (the geometric multiplicity of λi). On the 2: V = FULL_RANK_EIGENVECTORS(A) lth runof the outerloop, the numberof iterationsof the 3: Vt = TRANSPOSE(A) inner loop equals the difference of kernel dimensions 4: Vnew = KER(Vt) f(l) given by (23). 
Therefore, the total number b of 5: V = [V Vnew] iterations is m mi f(l), or A 6: end function iPl=1 mi(dimKer(A−λiI)mi −dimKer(A−λiI)) (25) The bottleneck in compute_missing is comput- =m (a −g ), (26) ing the kernel of VT, which has time complexity i i i O(kMN(j)2+k′M3)foraSVD-basedimplementation. where a and g are the algebraic and geometric multi- i i In addition, memory allocation for Vt and V each plicitiesofλ ,respectively.Sinceb (26)dependsonthe i A takes O(MN(j)) time. Notably, this allocation is only adjacencymatrixA,ithasanimplicitdependenceonthe performed once in the AIM implementation (22) (O(1) M ×N(j) dimensions of A. The total time complexity in space), as comparedto the m allocationsrequiredin i of the for loops is then the for loop of compute_chains. bA While the output V matrix in compute_missing O(kMN(j)2+k′M3)+O(2MN(j)). (27) yieldsthe same AIM results given by (4) as the original X j=1 eigenvector matrix, the corresponding adjacency ma- trix A may not preserve key structures in the original Inaddition,thetimetoallocatememoryforthematrix grapheadjacency matrix A. In other words, the trade-off 7 forthefastercomputationtimeincompute_missing 1) NYC Taxi Data: The 2010-2013 taxi data we is a loss of fidelity to the original graph. In order work with consists of 700 million trips for 13,000 to improve the fidelity, one may compute a certain medallions [32]. Each trip has about 20 descriptors numberb<b ofJordanchainsasdeterminedbyRAM includingpick upand dropofftimestamps, latitude,and A andothersystemconstraints.Inthisway,theAIMcanbe longitude, as well as the passenger count, trip duration, used to selectively trade off execution time (30) against tripdistance,fare,tip,andtaxpaid.Thedataisavailable fidelity in the computations. as 16.7 GB of compressed CSV files. This section shows that the AIM can be employed to Dynamic representation. 
VIII. NEW YORK CITY TAXI DATA: EIGENDECOMPOSITION AND GRAPH FOURIER TRANSFORM

This section expands on the issues described in Section VI and presents details for computing the eigenvector matrix V for the non-diagonalizable adjacency matrix A of the Manhattan road network. The Agile Inexact Method (AIM) for the graph Fourier transform developed in Section II is then applied to the statistics computed from four years of New York City taxi data. Our method provides a fine-grained analysis of city traffic that should be useful for urban planners, such as in improving emergency vehicle routing.

Sections VIII-B–VIII-E describe our method to find the eigendecomposition of the Manhattan road network. Section VIII-F demonstrates the AIM for computing the graph Fourier transform.

A. Data Set Descriptions

Our goal is to extract behaviors over space and time that characterize taxi movement through New York City based on four years (2010-2013) of New York City taxi data [32]. Since the path of each taxi trip is unknown, an additional processing step is required to estimate taxi trajectories before extracting statistics of interest. For example, if a taxi picks up passengers at Times Square and drops them off at the Rockefeller Center, it is desirable to have statistics that capture not just trip data at the landmarks, but also at the intermediate locations.

Estimating taxi trajectories requires overlaying the taxi data on the New York City road network. Details on the taxi data and the road network are described as follows.

1) NYC Taxi Data: The 2010-2013 taxi data we work with consists of 700 million trips for 13,000 medallions [32]. Each trip has about 20 descriptors including pick-up and drop-off timestamps, latitude, and longitude, as well as the passenger count, trip duration, trip distance, fare, tip, and tax paid. The data is available as 16.7 GB of compressed CSV files.

Dynamic representation. Since the available data provides only static (start and end) information for each trip, an estimate of the taxi trajectory is needed in order to capture the taxi locations interspersed throughout the city at a given time slice. Our method for estimating these trajectories, which is based on Dijkstra's algorithm, is described in detail in [33]. Once taxi paths are estimated, they are used to extract traffic behavior. The statistic of interest we consider here is the average number of trips that pass through a given location at a given time of day.

2) NYC and Manhattan Road Networks: The road network G = (V, E) consists of a set V of |V| = 79,234 nodes and a set E of |E| = 223,966 edges, which we represent as a |V| × |V| adjacency matrix A. The nodes in V represent intersections and points along a road based on geo-data from [21]. Each edge (v_i, v_j) ∈ E, v_i, v_j ∈ V, corresponds to a road segment on which traffic may flow from geo-location v_i to geo-location v_j as determined by Google Maps [34]. An edge of length d_ij > 0 is represented by a nonzero entry in the adjacency matrix so that [A]_ij = d_ij. A zero entry corresponds to the absence of an edge.

The network G is directed since the allowed traffic directions along a road segment may be asymmetric. In addition, G is strongly connected since a path (trajectory) exists from any geo-location to any other geo-location.

Our analysis focuses on the Manhattan grid, which is a subgraph of G with 6,408 nodes and 14,418 edges. This subgraph is also strongly connected. The eigendecomposition in Section VIII-B is based on this network. Expressing the average number of trips that pass through a given location at a fixed time of day as a vector over the Manhattan grid defines a graph signal to which the GFT (2) and AIM method (4) can be applied.

B. Eigendecomposition of Manhattan Road Network

The Manhattan road network described in Section VIII-A provides the adjacency matrix A ∈ R^{6408×6408} for which we compute the eigendecomposition. For this particular adjacency matrix, the eigenvector matrix V_obs ∈ C^{6408×6408} generated by a standard eigenvalue solver such as MATLAB or LAPACK is not full rank, where the rank can be computed with either the singular value decomposition or the QR decomposition; i.e., A is not diagonalizable.
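The weighted, directed adjacency construction described above can be sketched as follows. This is a toy illustration with an assumed 4-node edge list, not the NYC data; at the real scale (|V| = 79,234) a sparse matrix format (e.g. scipy.sparse) would replace the dense array.

```python
import numpy as np

# Hypothetical toy road network: an entry [A]_ij = d_ij > 0 encodes a
# directed road segment of length d_ij from node i to node j;
# a zero entry means no edge.
edges = [(0, 1, 80.0), (1, 2, 65.0), (2, 3, 50.0), (3, 0, 120.0), (1, 3, 90.0)]
A = np.zeros((4, 4))
for i, j, d in edges:
    A[i, j] = d

# Directed graph: the matrix is asymmetric in general.
assert A[0, 1] == 80.0 and A[1, 0] == 0.0
```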
In order to apply the GFT (2) or the original projections onto each eigenvector as in [1], Jordan chain computations are required. These computations are costly as described in Section VII and also potentially numerically unstable. In contrast, the AIM (4) does not require this step.

Since it is instructive to compare the GFT with the AIM results, it is necessary to compute the Jordan chains here. For this reason, the following sections detail our method for computing the Jordan chains for the Manhattan road network.

C. Eigenvalue Verification and Maximum Jordan Chain Length

Since the kernel dimension of A (the geometric multiplicity of the zero eigenvalue) is 446, and there are 699 eigenvalues of magnitude close to zero (as seen in Figure 1a), 699 − 446 = 253 generalized eigenvectors must be determined to complete the Fourier basis.

As discussed in Section VI-A, there is ambiguity in determining whether the 699 eigenvalues of small magnitude represent true zero eigenvalues. Consequently, accurately identifying eigenvalues of high algebraic multiplicity can be a hard problem; see, for example, [24], [25], [26], [27], [31], [14]. Furthermore, these eigenvalues of small magnitude form constellations with centers close to zero, as shown in Figure 1b. References [24] and [25] show that the presence of such constellations may indicate the presence of an eigenvalue of high algebraic multiplicity; [24] states that an approximation of this eigenvalue may be the center of the cluster of numerical eigenvalues, while [25] presents an iterative algorithm to approximate a "pseudoeigenvalue." It is unclear from pure inspection³ whether the observed constellations are a result of the multiplicity of a zero eigenvalue or are the actual eigenvalues of A. For these reasons, it is necessary to verify the existence of a numerical zero eigenvalue before computing the Jordan chains. Our method is explained below; see also [20].

³Considering double-precision floating point (64 bit) and that the number of operations to compute the eigenvalues of an N × N matrix is O(N^3), the expected precision is on the order of 10^{-6} or 10^{-7}. Numerous eigenvalues in Figures 1a and 1b demonstrate this order of magnitude.

To verify a numerical zero eigenvalue, a result from [27] is applied, which states the following:

Result 5 ([27]). Consider a matrix A ∈ C^{N×N} with singular values σ_1(A) ≥ ··· ≥ σ_N(A) that is scaled so that σ_1(A) = 1. Let m_k = dim Ker(A^k) denote the dimension of the kernel of A^k. In addition, let α and δ be positive constants; δ is usually on the order of machine precision and α is significantly greater than δ. Then 0 is a numerically multiple eigenvalue with respect to α and δ if

σ_{N−m_k}(A^k) > α > δ > σ_{N−m_k+1}(A^k),   (31)

for k = 1, 2, ..., h, where h is the maximum Jordan chain length for the zero eigenvalue.

Since the constants α and δ have different orders of magnitude, Equation (31) implies that singular value σ_{N−m_k}(A^k) is significantly greater than σ_{N−m_k+1}(A^k).

Result 5 serves two purposes for our application. First, it verifies the existence of a numerical zero eigenvalue. It also implies that a value of k at which Equation (31) fails cannot be the maximum Jordan chain length of the zero eigenvalue. Accordingly, we propose the following method to find the maximum Jordan chain length h: increment the value of k starting from k = 1, and let k = k' be the first value of k such that Equation (31) fails. Then the maximum Jordan chain length for eigenvalue zero is h = k' − 1.

Table I is obtained by applying Result 5 to the adjacency matrix A of the Manhattan road network. The columns of the table correspond to the power k of A, the dimension m_k of the kernel of A^k, the index N − m_k of the first singular value of A^k to examine, and the values of the singular values at indices N − m_k and N − m_k + 1.

k   m_k   N − m_k   σ_{N−m_k}(A^k)      σ_{N−m_k+1}(A^k)
1   446   5962      1.9270 × 10^{-3}    1.2336 × 10^{-15}
2   596   5812      2.1765 × 10^{-6}    6.9633 × 10^{-16}
3   654   5754      1.4013 × 10^{-8}    3.4250 × 10^{-16}
4   678   5730      1.1853 × 10^{-10}   3.1801 × 10^{-16}
5   692   5716      2.0163 × 10^{-11}   8.4063 × 10^{-14}
6   700   5708      9.6533 × 10^{-11}   8.2681 × 10^{-11}

TABLE I: Singular values to validate existence of a numerical zero.

The results in the table are reasonable since the computational machine precision was on the order of 10^{-16}, and the first four rows of the table display singular values for which δ is on the order of 10^{-15} or 10^{-16} and the constant α is larger by 6 to 12 orders of magnitude. Therefore, the inequality (31) holds for the first four rows of Table I. The inequality begins to fail at k = 5; thus, the expected maximum numerical Jordan chain length is no more than 3 or 4.

D. Jordan Chain Computation

The steps for finding the eigenvectors for eigenvalue zero are described here. This is done to provide ground truth to assess the application of the AIM (4) to real data as shown in Section VIII-F.

To find the eigenvectors for eigenvalue zero, the null space or kernel Ker(A) of A is first computed to find the corresponding eigenvectors. Each of these eigenvectors corresponds to a Jordan block in the Jordan decomposition of A, and the Jordan chains with maximum length h (as determined by applying Result 5 from [27] as shown in Table I) are computed by the recurrence equation [13]

A v_k = λ v_k + v_{k−1} = v_{k−1},   (32)

where k = 1, ..., h and v_0 ∈ Ker(A). If the number of linearly independent proper and generalized eigenvectors equals N, we are done; otherwise, the Fourier basis must be extended.
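The search for the maximum Jordan chain length described above can be sketched as follows. This is an illustrative numpy implementation, not the authors' code: `alpha` and `delta` are assumed thresholds, and in addition to the failure test of (31) the sketch stops once dim Ker(A^k) stops growing, since for an exactly nilpotent toy block the inequality (31) never fails in exact arithmetic while the kernel dimension does saturate as soon as k exceeds h.

```python
import numpy as np

def max_chain_length(A, alpha=1e-8, delta=1e-12, kmax=10):
    # Estimate the maximum Jordan chain length h of the numerical zero
    # eigenvalue via the singular-value test (31) of Result 5.
    # alpha and delta are assumed thresholds for this sketch.
    N = A.shape[0]
    A = A / np.linalg.svd(A, compute_uv=False)[0]  # scale so sigma_1(A) = 1
    Ak = np.eye(N)
    m_prev = 0
    for k in range(1, kmax + 1):
        Ak = Ak @ A
        s = np.linalg.svd(Ak, compute_uv=False)    # descending order
        m_k = int(np.sum(s <= delta))              # numerical dim Ker(A^k)
        if m_k == m_prev:
            return k - 1   # kernel dimension saturated: k exceeds h
        # Test (31); delta > sigma_{N-m_k+1}(A^k) holds by choice of m_k,
        # so only the alpha gap needs checking (1-based sigma indices).
        if not s[N - m_k - 1] > alpha:
            return k - 1
        m_prev = m_k
    return kmax

# Toy matrix (an assumption for illustration): one 3x3 Jordan block at
# eigenvalue 0 plus a simple eigenvalue 2, so h = 3.
A = np.zeros((4, 4))
A[0, 1] = A[1, 2] = 1.0
A[3, 3] = 2.0
assert max_chain_length(A) == 3
```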
Numerical instability. Numerically stable methods such as the power method, QR algorithms, and Jacobi methods for diagonalizable matrices are applicable methods for eigendecomposition; see [14] and the references therein for more details. Furthermore, eigendecompositions of large, sparse matrices can be computed with iterative Lanczos or Krylov subspace methods [14, Chapter 10], [35].

On the other hand, for defective or nearly defective matrices, small perturbations in network structure due to numerical round-off errors can significantly perturb the numerical eigenvalues and eigenvectors; for example, for a defective eigenvalue of A corresponding to a p-dimensional Jordan block, O(ε) perturbations in A can result in O(ε^{1/p}) perturbations in λ [14], [36]. In addition, while computing the Jordan normal form can be numerically unstable [27], [28], [31], [14], forward stability was shown for the Jordan decomposition algorithm in [37] and forms the basis for an SVD-based implementation here. The Jordan chains are generated from the vectors in the kernel of A^h, where h is the maximum Jordan chain length; these computations are based on the SVD or the QR decomposition, and so are more stable. Then the Jordan chain can be constructed by direct application of the recurrence equation (32).

Computation details. A single pass of the Jordan chain algorithm requires about a week to run on a 30-machine cluster of 16 GB RAM/16-core and 8 GB RAM/8-core machines; however, the algorithm does not return a complete set of 253 generalized eigenvectors. To maximize the number of recovered generalized eigenvectors, successive computations were run with different combinations of vectors that span the kernel as starting points for the Jordan chains. After three months of testing, the best run yielded 250 of the 253 missing Jordan chain vectors; the remaining eigenvectors were computed as in (22). The maximum Jordan chain length is two, which is consistent with Result 5 and Table I. As a result, the constructed eigenvector matrix captures a large part of the adjacency matrix structure. The next section describes our inexact alternative to full computation of the Jordan chains, which provides a significant speedup in the determination of a useful basis set.

E. Generalized Eigenspace Computation

In order to compute the generalized eigenspace for the zero eigenvalue, the known eigenvectors are determined as in Section VIII-D. Then the eigenvector matrix is computed as (22). This is a five-minute computation on a 64-bit machine with 16 cores and 16 GB RAM.

F. Demonstration of the Agile Inexact Method

In this section, the original graph Fourier transform, defined as eigenvector projections, is compared to the Agile Inexact Method (AIM) (22) for a graph signal representing the four-year average of trips that pass through Manhattan from June through August on Fridays from 9pm to 10pm.

Figures 2a and 2b show the percentage of energy that is contained in the signal projections onto eigenvectors and onto generalized eigenspaces, respectively, using our previously defined energy metric (Section III). The maximum energy contained in any of the eigenvector projections is 1.6%, as shown in the data point in Figure 2a. The observed energy dispersal in the eigenvector projections corresponding to λ = 0 is a result of oblique projections. The computed Jordan chain vectors are nearly parallel to the proper eigenvectors (angles less than 1 degree), so the signal projection onto a proper or generalized eigenvector v of λ = 0 parallel to C^N \ span(v) has an augmented magnitude compared to the orthogonal projection. The reader is referred to [38] for more on oblique projectors.

In contrast, projecting onto the generalized eigenspaces with the AIM method (4) concentrates energy into two eigenspaces. The data points in Figure 2b at indices 706 and 707 contain 30% and 28% of the signal energy, respectively. On the other hand, the energy concentrated on the generalized eigenspace for
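The augmented magnitudes produced by oblique projections onto nearly parallel basis vectors can be illustrated with a small sketch. This uses a simplified per-term energy (a stand-in for the Section III metric, not the paper's exact definition), and the 2 × 2 basis and signal are assumed toy values: the two columns are within a small fraction of a degree of each other, mimicking Jordan chain vectors that are nearly parallel to the proper eigenvectors.

```python
import numpy as np

def expansion_energies(V, s):
    # Energy of each term of the expansion s = sum_i s_tilde[i] * V[:, i],
    # with coordinates s_tilde = V^{-1} s (simplified stand-in metric).
    s_tilde = np.linalg.solve(V, s)
    return np.sum(np.abs(V * s_tilde) ** 2, axis=0)

# Toy basis (assumed): two nearly parallel columns (angle ~ 0.06 degrees).
V = np.array([[1.0, 1.0],
              [0.0, 1e-3]])
s = np.array([1.0, 0.5])
e = expansion_energies(V, s)

# Oblique projections: each individual term carries far more energy than
# the signal itself, i.e., magnitudes are augmented and energy is dispersed.
assert e.sum() > 1e5 * np.sum(s ** 2)
```

With an orthonormal basis the per-term energies would instead sum exactly to the signal energy, which is why the AIM's eigenspace projections avoid this dispersal.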