TENSOR ALGEBRAS AND DISPLACEMENT STRUCTURE. II. NONCOMMUTATIVE SZEGŐ POLYNOMIALS

T. CONSTANTINESCU AND J. L. JOHNSON

Abstract. In this paper we continue to explore the connection between tensor algebras and displacement structure. We focus on recursive orthonormalization and we develop an analogue of the Szegő type theory of orthogonal polynomials in the unit circle for several noncommuting variables. Thus, we obtain the recurrence equations and Christoffel-Darboux formulas for Szegő polynomials in several noncommuting variables, as well as a Favard type result. Also we continue to study a Szegő type kernel for the N-dimensional unit ball of an infinite dimensional Hilbert space.

Key words: displacement structure, tensor algebras, Szegő polynomials

AMS subject classification: 15A69, 47A57

1. Introduction

In the first part of this paper, [5], we explored the connection between tensor algebras and displacement structure. The displacement structure theory was initiated in [13] as a recursive factorization theory for matrices whose implicit structure is encoded by a so-called displacement equation. This has been useful in several directions including constrained and unconstrained rational interpolation, maximum entropy, inverse scattering, $H^\infty$-control, signal detection, digital filter design, nonlinear Riccati equations, certain Fredholm and Wiener-Hopf equations, etc., see [14]. Aspects of the Szegő theory can be also revealed within the displacement structure theory. Our main goal is to develop an analogue for polynomials in several noncommuting variables of the Szegő theory of orthogonal polynomials on the unit circle. An analogue of the Szegő theory of orthogonal polynomials on the real line is being developed in the companion paper [6].

The paper is organized as follows. In Section 2 we review notation and several results from [5].
In this way, this paper can be read independently of [5]. In Section 3 we introduce orthogonal polynomials in several noncommuting variables associated to certain representations of the free semigroup and discuss their algebraic properties, mostly related to the recursions that they satisfy. In Section 4 we consider several positive definite kernels on the N-dimensional unit ball of an infinite dimensional Hilbert space. In particular, we prove a basic property of the Szegő type kernel studied in [4] by characterizing its Kolmogorov decomposition. In Section 5 we discuss the problem of recovering the representation from orthogonal polynomials and we prove a Favard type result. We plan a more detailed study of applications to multiscale systems in a sequel of this paper.

2. Preliminaries

We briefly review several constructions of the tensor algebra and introduce the necessary notation. We also review the connection with displacement structure theory as established in [7].

2.1. Tensor algebras. The tensor algebra over $\mathbb{C}^N$ is defined by the algebraic direct sum
$$\mathcal{T}_N = \oplus_{k \geq 0} (\mathbb{C}^N)^{\otimes k},$$
where $(\mathbb{C}^N)^{\otimes k}$ denotes the $k$-fold tensor product of $\mathbb{C}^N$ with itself. The addition is the componentwise addition and the multiplication is defined by juxtaposition:
$$(x \otimes y)_n = \sum_{k+l=n} x_k \otimes y_l.$$
If $\{e_1, \ldots, e_N\}$ is the standard basis of $\mathbb{C}^N$, then $\{e_{i_1} \otimes \ldots \otimes e_{i_k} \mid 1 \leq i_1, \ldots, i_k \leq N\}$ is an orthonormal basis of $\mathcal{T}_N$. Let $\mathbb{F}^+_N$ be the unital free semigroup on $N$ generators $1, \ldots, N$ with lexicographic order $\prec$. The empty word $\emptyset$ is the identity element and the length of the word $\sigma$ is denoted by $|\sigma|$. The length of the empty word is $0$. If $\sigma = i_1 \ldots i_k$ then we write $e_\sigma$ instead of $e_{i_1} \otimes \ldots \otimes e_{i_k}$, so that any element of $\mathcal{T}_N$ can be uniquely written in the form $x = \sum_{\sigma \in \mathbb{F}^+_N} c_\sigma e_\sigma$, where only finitely many of the complex numbers $c_\sigma$ are different from $0$.

Another construction of $\mathcal{T}_N$ can be obtained as follows.
Let $S$ be a unital semigroup and denote by $F_0(S)$ the set of functions $\phi : S \to \mathbb{C}$ with the property that $\phi(s) \neq 0$ for only finitely many values of $s$. This set has a natural vector space structure and $B_S = \{\delta_s \mid s \in S\}$ is a vector basis for $F_0(S)$, where $\delta_s$ is the Kronecker symbol associated to $s \in S$. Also, $F_0(S)$ is a unital associative algebra with respect to the product
$$\phi * \psi = \Big(\sum_{s \in S} \phi(s)\delta_s\Big) * \Big(\sum_{t \in S} \psi(t)\delta_t\Big) = \sum_{s,t \in S} \phi(s)\psi(t)\delta_{st}.$$
It is readily seen that $F_0(\mathbb{F}^+_N)$ is isomorphic to $\mathcal{T}_N$. Since each element in $F_0(\mathbb{F}^+_N)$ can be uniquely written as a (finite) sum $\phi = \sum_{\sigma \in \mathbb{F}^+_N} c_\sigma \delta_\sigma$, the isomorphism is the linear extension $\Phi_1$ of the mapping $\delta_\sigma \to e_\sigma$, $\sigma \in \mathbb{F}^+_N$.

Another copy of the tensor algebra is given by the algebra $\mathcal{P}^0_N$ of polynomials in $N$ noncommuting indeterminates $X_1, \ldots, X_N$ with complex coefficients. Each element $P \in \mathcal{P}^0_N$ can be uniquely written in the form $P = \sum_{\sigma \in \mathbb{F}^+_N} c_\sigma X_\sigma$, with $c_\sigma \neq 0$ for only finitely many $\sigma$'s, and $X_\sigma = X_{i_1} \ldots X_{i_k}$ where $\sigma = i_1 \ldots i_k \in \mathbb{F}^+_N$. The linear extension $\Phi_2$ of the mapping $\delta_\sigma \to X_\sigma$, $\sigma \in \mathbb{F}^+_N$, gives an isomorphism of $\mathcal{T}_N$ with $\mathcal{P}^0_N$.

Yet another copy of $\mathcal{T}_N$, inside an algebra of lower triangular operators, allowed for the connection with displacement structure established in [7]. Thus, let $\mathcal{E}$ be a Hilbert space and define $\mathcal{E}_0 = \mathcal{E}$ and, for $k \geq 1$,
$$(2.1)\qquad \mathcal{E}_k = \underbrace{\mathcal{E}_{k-1} \oplus \ldots \oplus \mathcal{E}_{k-1}}_{N\ \text{terms}} = \mathcal{E}_{k-1}^{\oplus N}.$$
For $\mathcal{E} = \mathbb{C}$ we have that $\mathbb{C}_k$ can be identified with $(\mathbb{C}^N)^{\otimes k}$ and $\mathcal{T}_N$ is isomorphic to the algebra $\mathcal{L}^0_N$ of lower triangular operators $T = [T_{ij}] \in \mathcal{L}(\oplus_{k \geq 0} \mathbb{C}_k)$ with the property
$$(2.2)\qquad T_{ij} = \underbrace{T_{i-1,j-1} \oplus \ldots \oplus T_{i-1,j-1}}_{N\ \text{terms}} = T_{i-1,j-1}^{\oplus N},$$
for $j \leq i$, $i,j \geq 1$, and $T_{j0} = 0$ for all sufficiently large $j$'s. The isomorphism is given by the map $\Phi_3$ defined as follows: let $x = (x_0, x_1, \ldots) \in \mathcal{T}_N$ ($x_p \in (\mathbb{C}^N)^{\otimes p}$ is the $p$th homogeneous component of $x$); then $x_p = \sum_{|\sigma| = p} c_\sigma e_\sigma$ and, for $j \geq 0$, $T_{j0}$ denotes the column matrix $[c_\sigma]^T_{|\sigma| = j}$, where "$T$" denotes the matrix transpose.
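The juxtaposition product on $\mathcal{T}_N$ (equivalently, the convolution product on $F_0(\mathbb{F}^+_N)$) is easy to experiment with concretely. The following is a minimal sketch, not from the paper: words of $\mathbb{F}^+_N$ are modeled as Python tuples over $\{1, \ldots, N\}$, elements of $F_0(\mathbb{F}^+_N)$ as dictionaries mapping words to coefficients, and the product is the linear extension of $\delta_s * \delta_t = \delta_{st}$.

```python
# Sketch (not from the paper): elements of F_0(F_N^+) as dicts
# word (tuple over {1, ..., N}) -> complex coefficient.  The product is
# the linear extension of delta_s * delta_t = delta_{st}, i.e.
# multiplication by juxtaposition (concatenation) of words.

def semigroup_product(phi, psi):
    """Convolution on F_0(F_N^+): (phi * psi) = sum phi(s) psi(t) delta_{st}."""
    out = {}
    for s, a in phi.items():
        for t, b in psi.items():
            w = s + t  # juxtaposition: the word st
            out[w] = out.get(w, 0) + a * b
    return {w: c for w, c in out.items() if c != 0}

# Example in T_2: (e_1 + 2 e_2)(e_empty + e_12)
phi = {(1,): 1, (2,): 2}
psi = {(): 1, (1, 2): 1}
prod = semigroup_product(phi, psi)
# prod == {(1,): 1, (2,): 2, (1, 1, 2): 1, (2, 1, 2): 2}
```

Note that the product is noncommutative for $N \geq 2$: $e_1 * e_2 = e_{12} \neq e_{21} = e_2 * e_1$.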
Then $T_{j0} = 0$ for all sufficiently large $j$'s and we can define $T \in \mathcal{L}(\oplus_{k \geq 0} \mathbb{C}_k)$ by using (2.2). Finally, set $\Phi_3(x) = T$.

2.2. Displacement structure. We can now describe the displacement structure of the tensor algebra. We write this connection for $\mathcal{L}^0_N$. Then it can be easily translated into any other realization of the tensor algebra. Let $F_k = [T^k_{ij}] \in \mathcal{L}(\oplus_{k \geq 0} \mathbb{C}_k)$, $k = 1, \ldots, N$, be isometries defined by the formulae: $T^k_{ij} = 0$ for $i \neq j+1$ and $T^k_{i+1,i}$ is a block-column matrix consisting of $N$ blocks of dimension $\dim \mathbb{C}_i$, all of them zero except for the $k$th block, which is the identity on $\mathbb{C}_i$. We have the following result noticed in [7].

Theorem 2.1. Let $T \in \mathcal{L}^0_N$ and define $A = I - TT^*$. Then
$$(2.3)\qquad A - \sum_{k=1}^N F_k A F_k^* = G J_{11} G^*,$$
where
$$G = \begin{bmatrix} 1 & T_{00} \\ 0 & T_{10} \\ 0 & T_{20} \\ \vdots & \vdots \end{bmatrix} \quad \text{and} \quad J_{11} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.$$

The model $\mathcal{L}^0_N$ of the tensor algebra is also useful in order to extend this algebra to some topological tensor algebras; see [12]. Here we consider only the norm topology and denote by $\mathcal{L}_N$ the algebra of all lower triangular operators $T = [T_{ij}] \in \mathcal{L}(\oplus_{k \geq 0} \mathbb{C}_k)$ satisfying (2.2).

2.3. Multiscale processes. Multiscale processes are stochastic processes indexed by nodes on a tree. They became quite popular lately, see [1], [2], and have potential to model the self-similarity of fractional Brownian motion, leading to iterative algorithms in computer vision, remote sensing, etc. Here we restrict our attention to the case of the Cayley tree, in which each node has $N$ branches. The vertices of the Cayley tree are indexed by $\mathbb{F}^+_N$.

Let $(X, \mathcal{F}, P)$ be a probability space and let $\{v_\sigma\}_{\sigma \in \mathbb{F}^+_N} \subset L^2(P)$ be a family of random variables. Its covariance kernel is
$$K(\sigma, \tau) = \int_X v_\sigma \overline{v_\tau}\, dP,$$
and assume that the process is stationary in the sense (considered earlier, e.g. [10]) that:
$$(2.4)\qquad K(\tau\sigma, \tau\sigma') = K(\sigma, \sigma'), \qquad \tau, \sigma, \sigma' \in \mathbb{F}^+_N,$$
$$(2.5)\qquad K(\sigma, \tau) = 0 \ \text{ if there is no } \alpha \in \mathbb{F}^+_N \text{ such that } \sigma = \alpha\tau \text{ or } \tau = \alpha\sigma.$$
Conversely, by the invariant Kolmogorov decomposition theorem, see e.g. [15], Ch.
II, there exists an isometric representation $u$ of $\mathbb{F}^+_N$ on a Hilbert space $\mathcal{K}$ and a mapping $v : \mathbb{F}^+_N \to \mathcal{K}$ such that
$$K(\sigma, \tau) = \langle v(\tau), v(\sigma) \rangle,$$
$$u(\tau)v(\sigma) = v(\tau\sigma),$$
for all $\sigma, \tau \in \mathbb{F}^+_N$, the set $\{v(\sigma) \mid \sigma \in \mathbb{F}^+_N\}$ is total in $\mathcal{K}$, and $u(1), \ldots, u(N)$ are isometries with orthogonal ranges. This class of multiscale processes would be suitable to model branching processes without "past". If a "past" should be attached to a process as above, we could try to consider processes indexed by the nodes of the tree associated to the free group on $N$ generators $1, \ldots, N$. As mentioned in the introduction, we plan to look at this matter in a sequel of this paper. Here we focus on processes with covariance kernel satisfying (2.4) and (2.5). It was shown in [5] that such a kernel has displacement structure. Also, it is clear that for all $j, k \geq 1$,
$$(2.6)\qquad [K(\sigma, \tau)]_{|\sigma| = j,\, |\tau| = k} = \left([K(\sigma', \tau')]_{|\sigma'| = j-1,\, |\tau'| = k-1}\right)^{\oplus N},$$
so that the kernel is determined by the elements $s_\sigma = K(\emptyset, \sigma)$, $\sigma \in \mathbb{F}^+_N$.

By Theorem 1.5.3 in [3], each positive definite kernel $K$ on $\mathbb{F}^+_N$ is uniquely determined by a family of contractions $\{\gamma_{\sigma,\tau} \mid \sigma, \tau \in \mathbb{F}^+_N,\ \sigma \preceq \tau\}$ such that $\gamma_{\sigma,\sigma} = 0$, $\sigma \in \mathbb{F}^+_N$, and otherwise $\gamma_{\sigma,\tau} \in \mathcal{L}(\mathcal{D}_{\gamma_{\sigma+1,\tau}}, \mathcal{D}_{\gamma^*_{\sigma,\tau-1}})$ (for a contraction $T$ between two Hilbert spaces, $D_T = (I - T^*T)^{1/2}$ denotes the defect operator of $T$ and $\mathcal{D}_T$ is the defect space of $T$, defined as the closure of the range of $D_T$; note that in our case the $\gamma_{\sigma,\tau}$ are just complex numbers and the condition $\gamma_{\sigma,\tau} \in \mathcal{L}(\mathcal{D}_{\gamma_{\sigma+1,\tau}}, \mathcal{D}_{\gamma^*_{\sigma,\tau-1}})$ for $\sigma \prec \tau$ encodes the fact that $|\gamma_{\sigma+1,\tau}| = 1$ or $|\gamma_{\sigma,\tau-1}| = 1$ implies that $\gamma_{\sigma,\tau} = 0$; also, $\tau - 1$ denotes the predecessor of $\tau$ with respect to the lexicographic order $\prec$ on $\mathbb{F}^+_N$, while $\sigma + 1$ denotes the successor of $\sigma$). In addition, the positive definite kernel $K$ satisfies (2.4) and (2.5) if and only if
$$(2.7)\qquad \gamma_{\tau\sigma,\tau\sigma'} = \gamma_{\sigma,\sigma'}, \qquad \tau, \sigma, \sigma' \in \mathbb{F}^+_N,$$
$$(2.8)\qquad \gamma_{\sigma,\tau} = 0 \ \text{ if there is no } \alpha \in \mathbb{F}^+_N \text{ such that } \sigma = \alpha\tau \text{ or } \tau = \alpha\sigma.$$
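The stationarity conditions (2.4) and (2.5) can be checked numerically for a kernel built directly from Kolmogorov-type data $u$, $v$. The sketch below is not from the paper: it uses real coefficients (so no conjugation is needed), an arbitrary choice of $v(\emptyset)$, and models vectors of $\mathcal{K}$ as finitely supported dictionaries; $u(\tau)$ prepends the word $\tau$, so $u(1), \ldots, u(N)$ are isometries with orthogonal ranges.

```python
# Toy check (not from the paper) that a kernel built from Kolmogorov-type
# data u, v satisfies (2.4) and (2.5).  Vectors in K are dicts
# word -> real coefficient; u(tau) prepends tau; v(sigma) = u(sigma)v(empty).

N = 2
v0 = {(): 1.0, (1,): 0.5, (2, 1): -0.25}   # an arbitrary choice of v(empty)

def u(tau, vec):
    """The isometry u(tau): e_w -> e_{tau w}, extended linearly."""
    return {tau + w: c for w, c in vec.items()}

def v(sigma):
    return u(sigma, v0)

def K(sigma, tau):
    """K(sigma, tau) = <v(tau), v(sigma)> (real coefficients)."""
    vs, vt = v(sigma), v(tau)
    return sum(vt.get(w, 0.0) * c for w, c in vs.items())

# (2.4): K(tau sigma, tau sigma') = K(sigma, sigma'), here with tau = (2,)
assert K((2, 1), (2,)) == K((1,), ())
# (2.5): K(sigma, tau) = 0 when neither word is a left multiple of the other
assert K((1,), (2,)) == 0.0
```

For this choice of $v(\emptyset)$ one gets, e.g., $K((1), \emptyset) = 0.5 \neq 0$, while $K((1), (2)) = 0$, exactly as (2.5) forces.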
We define $\gamma_\sigma = \gamma_{\emptyset,\sigma}$, $\sigma \in \mathbb{F}^+_N$, and we notice that $\{\gamma_{\sigma,\tau} \mid \sigma, \tau \in \mathbb{F}^+_N,\ \sigma \preceq \tau\}$ is uniquely determined by $\{\gamma_\sigma\}_{\sigma \in \mathbb{F}^+_N}$ by the formula
$$(2.9)\qquad [\gamma_{\sigma,\tau}]_{|\sigma| = j,\, |\tau| = k} = \left([\gamma_{\sigma',\tau'}]_{|\sigma'| = j-1,\, |\tau'| = k-1}\right)^{\oplus N}, \qquad j, k \geq 1.$$

3. Szegő polynomials

We introduce polynomials in several noncommuting variables orthogonal with respect to a positive definite kernel $K$ satisfying (2.4) and (2.5). We extend some elements of the Szegő theory to this setting.

The kernel $K$ being given, we can introduce an inner product on $F_0(\mathbb{F}^+_N)$ in the usual manner:
$$(3.1)\qquad \langle \phi, \psi \rangle_K = \sum_{\sigma,\tau \in \mathbb{F}^+_N} K(\sigma, \tau)\psi(\tau)\overline{\phi(\sigma)}.$$
By factoring out the subspace $\mathcal{N}_K = \{\phi \in F_0(\mathbb{F}^+_N) \mid \langle \phi, \phi \rangle_K = 0\}$ and completing with respect to the norm induced by (3.1) we obtain a Hilbert space denoted $\mathcal{H}_K$. A similar structure can be introduced on $\mathcal{P}^0_N$. Let $P = \sum_{\sigma \in \mathbb{F}^+_N} c_\sigma X_\sigma$, $Q = \sum_{\sigma \in \mathbb{F}^+_N} d_\sigma X_\sigma$ be elements in $\mathcal{P}^0_N$; then define:
$$(3.2)\qquad \langle P, Q \rangle_K = \sum_{\sigma,\tau \in \mathbb{F}^+_N} K(\sigma, \tau)d_\tau\overline{c_\sigma}.$$
By factoring out the subspace $\mathcal{M}_K = \{P \in \mathcal{P}^0_N \mid \langle P, P \rangle_K = 0\}$ and completing with respect to the norm induced by (3.2) we obtain a Hilbert space denoted $L^2(K)$. One can check that the map $\Phi_2$ defined by $\delta_\sigma \to X_\sigma$, $\sigma \in \mathbb{F}^+_N$, extends to a unitary operator from $\mathcal{H}_K$ onto $L^2(K)$.

From now on we assume that for any $\alpha \in \mathbb{F}^+_N$ the matrix $[K(\sigma, \tau)]_{\sigma,\tau \preceq \alpha}$ is invertible. This implies that $\mathcal{M}_K = \{0\}$ and $\mathcal{P}^0_N$ can be viewed as a subspace of $L^2(K)$. Also, for any $\alpha \in \mathbb{F}^+_N$, $\{X_\sigma\}_{\sigma \preceq \alpha}$ is a linearly independent family in $L^2(K)$. Then, the Gram-Schmidt procedure gives a family $\{\varphi_\sigma\}_{\sigma \in \mathbb{F}^+_N}$ of elements in $\mathcal{P}^0_N$ such that
$$(3.3)\qquad \varphi_\sigma = \sum_{\tau \preceq \sigma} a_{\sigma,\tau} X_\tau, \qquad a_{\sigma,\sigma} > 0;$$
$$(3.4)\qquad \langle \varphi_\sigma, \varphi_\tau \rangle_K = 0, \qquad \emptyset \preceq \sigma \prec \tau.$$
An explicit formula for the orthogonal polynomials $\varphi_\sigma$ can be obtained in the same manner as in the classical (one variable) case. Define for $\sigma \in \mathbb{F}^+_N$,
$$(3.5)\qquad D_\sigma = \det [K(\sigma', \tau')]_{\sigma',\tau' \preceq \sigma}$$
and let $\{\gamma_\sigma\}_{\sigma \in \mathbb{F}^+_N}$ be the parameters associated to $K$ as described in Section 2.3.
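The Gram-Schmidt step (3.3)-(3.4) is straightforward to carry out numerically once the Gram matrix of the monomials is known. The following sketch is not from the paper: it assumes real coefficients and uses hypothetical moment values $s_1 = 0.5$, $s_2 = 0.25$ on the words $\emptyset \prec 1 \prec 2$ (with $K(1,2) = 0$, as forced by (2.5)).

```python
# Sketch (not from the paper) of the Gram-Schmidt step (3.3)-(3.4):
# given the Gram matrix G[i][j] = <X_{tau_j}, X_{tau_i}>_K for the ordered
# words tau_0 = empty < tau_1 < ..., produce coefficient rows a[i] with
# phi_i = sum_j a[i][j] X_{tau_j} and positive leading coefficient a[i][i].

def inner(G, x, y):
    """<sum_j x_j X_j, sum_i y_i X_i>_K for real coefficient vectors x, y."""
    n = len(x)
    return sum(G[i][j] * x[j] * y[i] for i in range(n) for j in range(n))

def szego_gram_schmidt(G):
    n = len(G)
    rows = []
    for k in range(n):
        x = [0.0] * n
        x[k] = 1.0                       # start from the monomial X_{tau_k}
        for phi in rows:                 # subtract projections on earlier phi's
            c = inner(G, x, phi)
            x = [xi - c * pi for xi, pi in zip(x, phi)]
        nrm = inner(G, x, x) ** 0.5
        rows.append([xi / nrm for xi in x])   # leading coefficient 1/nrm > 0
    return rows

# Hypothetical Gram matrix on the words (empty, 1, 2), s_1 = 0.5, s_2 = 0.25:
G = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.0],
     [0.25, 0.0, 1.0]]
a = szego_gram_schmidt(G)
```

For this data the leading coefficient of the second polynomial comes out as $1/d_1$ with $d_1 = (1 - |s_1|^2)^{1/2}$, in agreement with (3.7) (note that $\gamma_1 = s_1$).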
Note that since all the matrices $[K(\sigma, \tau)]_{\sigma,\tau \preceq \alpha}$, $\alpha \in \mathbb{F}^+_N$, are assumed to be invertible, it follows that $|\gamma_\sigma| < 1$ for all $\sigma \in \mathbb{F}^+_N$.

Theorem 3.1. (1) $\varphi_\emptyset = 1$ and for $\emptyset \prec \sigma$,
$$(3.6)\qquad \varphi_\sigma = \frac{1}{\sqrt{D_{\sigma-1} D_\sigma}} \det \begin{bmatrix} [K(\sigma', \tau')]_{\sigma' \prec \sigma;\ \tau' \preceq \sigma} \\ 1 \quad X_1 \quad \ldots \quad X_\sigma \end{bmatrix}.$$
(2) For $\sigma = i_1 \ldots i_k$, $\emptyset \prec \sigma$,
$$(3.7)\qquad \varphi_\sigma = \frac{1}{\prod_{1 \leq j \leq k} (1 - |\gamma_{i_j \ldots i_k}|^2)^{1/2}} (X_\sigma + \text{lower order terms}).$$

Proof. The proof is similar to the classical one. Thus, we deduce from the orthogonality condition (3.4) that $\langle \varphi_\sigma, X_{\tau'} \rangle_K = 0$ for $\emptyset \preceq \tau' \prec \sigma$, which implies that $\sum_{\tau \preceq \sigma} a_{\sigma,\tau} K(\tau', \tau) = 0$ for $\emptyset \preceq \tau' \prec \sigma$. Using the Cramer rules for the system
$$\sum_{\tau \preceq \sigma} a_{\sigma,\tau} K(\tau', \tau) = 0, \qquad \emptyset \preceq \tau' \prec \sigma,$$
$$\sum_{\tau \preceq \sigma} a_{\sigma,\tau} X_\tau = \varphi_\sigma,$$
with unknowns $a_{\sigma,\tau}$, we deduce
$$a_{\sigma,\sigma} = \frac{\varphi_\sigma D_{\sigma-1}}{\det \begin{bmatrix} [K(\sigma', \tau')]_{\sigma' \prec \sigma;\ \tau' \preceq \sigma} \\ 1 \quad X_1 \quad \ldots \quad X_\sigma \end{bmatrix}}.$$
Therefore,
$$\varphi_\sigma = \frac{a_{\sigma,\sigma}}{D_{\sigma-1}} \det \begin{bmatrix} [K(\sigma', \tau')]_{\sigma' \prec \sigma;\ \tau' \preceq \sigma} \\ 1 \quad X_1 \quad \ldots \quad X_\sigma \end{bmatrix}.$$
We now compute $a_{\sigma,\sigma}$ and $D_\sigma$ in terms of the parameters $\{\gamma_\sigma\}_{\sigma \in \mathbb{F}^+_N}$ of $K$. First we notice that
$$\left\langle \det \begin{bmatrix} [K(\sigma', \tau')]_{\sigma' \prec \sigma;\ \tau' \preceq \sigma} \\ 1 \quad X_1 \quad \ldots \quad X_\sigma \end{bmatrix}, X_\sigma \right\rangle_K = D_\sigma$$
and since $X_\sigma = \frac{1}{a_{\sigma,\sigma}} \varphi_\sigma + \sum_{\tau \prec \sigma} c_\tau X_\tau$, we deduce
$$D_\sigma = \left\langle \frac{D_{\sigma-1}}{a_{\sigma,\sigma}} \varphi_\sigma,\ \frac{1}{a_{\sigma,\sigma}} \varphi_\sigma + \sum_{\tau \prec \sigma} c_\tau X_\tau \right\rangle_K = \frac{D_{\sigma-1}}{a^2_{\sigma,\sigma}},$$
so that
$$\frac{1}{a^2_{\sigma,\sigma}} = \frac{D_\sigma}{D_{\sigma-1}},$$
which gives (3.6). In order to compute $D_\sigma$ in terms of $\{\gamma_\sigma\}_{\sigma \in \mathbb{F}^+_N}$ we use Theorem 1.5.10 in [3] and the special structure of $D_\sigma$. Thus,
$$D_\sigma = \prod_{\emptyset \prec \sigma', \tau' \preceq \sigma} (1 - |\gamma_{\sigma',\tau'}|^2)$$
and for $\sigma = i_1 \ldots i_k$, $\emptyset \prec \sigma$, we deduce
$$\frac{1}{a^2_{\sigma,\sigma}} = \frac{D_\sigma}{D_{\sigma-1}} = \prod_{1 \leq j \leq k} (1 - |\gamma_{i_j \ldots i_k}|^2).$$
Then,
$$\varphi_\sigma = a_{\sigma,\sigma} X_\sigma + a_{\sigma,\sigma} \sum_{\tau \prec \sigma} c_\tau X_\tau = \frac{1}{\prod_{1 \leq j \leq k} (1 - |\gamma_{i_j \ldots i_k}|^2)^{1/2}} (X_\sigma + \text{lower order terms}),$$
which gives (3.7).

We illustrate this result for $N = 2$. From now on it is convenient to use the notation $d_\sigma = (1 - |\gamma_\sigma|^2)^{1/2}$, $\sigma \in \mathbb{F}^+_N - \{\emptyset\}$.

Example. Let $N = 2$ and assume the positive definite kernel $K$ satisfies the conditions in Theorem 3.1.
We have $D_\emptyset = 1$ and the next three determinants are:
$$D_1 = \det \begin{bmatrix} 1 & s_1 \\ \overline{s_1} & 1 \end{bmatrix} = d_1^2;$$
$$D_2 = \det \begin{bmatrix} 1 & s_1 & s_2 \\ \overline{s_1} & 1 & 0 \\ \overline{s_2} & 0 & 1 \end{bmatrix} = d_1^2 d_2^2;$$
$$D_{11} = \det \begin{bmatrix} 1 & s_1 & s_2 & s_{11} \\ \overline{s_1} & 1 & 0 & s_1 \\ \overline{s_2} & 0 & 1 & 0 \\ \overline{s_{11}} & \overline{s_1} & 0 & 1 \end{bmatrix} = d_1^4 d_2^2 d_{11}^2.$$
Using Theorem 3.1 we can easily calculate the first four orthogonal polynomials of $K$. Thus, $\varphi_\emptyset = 1$ and then
$$\varphi_1 = \frac{1}{d_1} \det \begin{bmatrix} 1 & s_1 \\ 1 & X_1 \end{bmatrix} = -\frac{\gamma_1}{d_1} + \frac{1}{d_1}X_1;$$
$$\varphi_2 = \frac{1}{d_1^2 d_2} \det \begin{bmatrix} 1 & s_1 & s_2 \\ \overline{s_1} & 1 & 0 \\ 1 & X_1 & X_2 \end{bmatrix} = -\frac{\gamma_2}{d_1 d_2} + \frac{\overline{\gamma_1}\gamma_2}{d_1 d_2}X_1 + \frac{1}{d_2}X_2,$$
where we used the fact that $s_2 = d_1\gamma_2$. Then, after some calculations,
$$\varphi_{11} = \frac{1}{d_1^3 d_2^2 d_{11}} \det \begin{bmatrix} 1 & s_1 & s_2 & s_{11} \\ \overline{s_1} & 1 & 0 & s_1 \\ \overline{s_2} & 0 & 1 & 0 \\ 1 & X_1 & X_2 & X_1^2 \end{bmatrix} = -\frac{\gamma_{11}}{d_1 d_2 d_{11}} + \left(-\frac{\gamma_1}{d_1 d_{11}} + \frac{\gamma_{11}\overline{\gamma_1}}{d_1 d_2 d_{11}}\right)X_1 + \frac{\gamma_{11}\overline{\gamma_2}}{d_2 d_{11}}X_2 + \frac{1}{d_1 d_{11}}X_1^2.$$

We establish that the orthogonal polynomials introduced above satisfy equations similar to the classical Szegő difference equations.

Theorem 3.2. The orthogonal polynomials satisfy the following recurrences: $\varphi_\emptyset = 1$ and for $k \in \{1, \ldots, N\}$, $\sigma \in \mathbb{F}^+_N$,
$$(3.8)\qquad \varphi_{k\sigma} = \frac{1}{d_{k\sigma}}\left(X_k \varphi_\sigma - \gamma_{k\sigma} \varphi^\sharp_{k\sigma-1}\right),$$
where $\varphi^\sharp_\emptyset = 1$ and for $k \in \{1, \ldots, N\}$, $\sigma \in \mathbb{F}^+_N$,
$$(3.9)\qquad \varphi^\sharp_{k\sigma} = \frac{1}{d_{k\sigma}}\left(-\overline{\gamma_{k\sigma}} X_k \varphi_\sigma + \varphi^\sharp_{k\sigma-1}\right).$$

Proof. We deduce this result from similar formulae obtained for an arbitrary positive definite kernel. In this way we can show the meaning of the polynomials $\varphi^\sharp_\sigma$, $\sigma \in \mathbb{F}^+_N$. Let $[t_{i,j}]_{i,j \geq 1}$ be a positive definite kernel on $\mathbb{N}$ and assume that each matrix $A(i,j) = [t_{k,l}]_{i \leq k,l \leq j}$, $1 \leq i \leq j$, is invertible. Also, assume $t_{k,k} = 1$ for all $k \geq 1$. Let $F_{i,j}$ be the upper Cholesky factor of $A(i,j)$, so that $F_{i,j}$ is an upper triangular matrix with positive diagonal and $A(i,j) = F^*_{i,j} F_{i,j}$. A dual, lower Cholesky factor is obtained as follows: define the symmetry of appropriate dimension,
$$\mathcal{J} = \begin{bmatrix} 0 & 0 & \ldots & 0 & I \\ 0 & 0 & \ldots & I & 0 \\ \vdots & & & & \vdots \\ 0 & I & \ldots & 0 & 0 \\ I & 0 & \ldots & 0 & 0 \end{bmatrix}$$
and then let $\widetilde{F}_{i,j}$ denote the upper Cholesky factor of $B(i,j) = \mathcal{J} A(i,j) \mathcal{J}$.
If $G_{i,j} = \mathcal{J} \widetilde{F}_{i,j} \mathcal{J}$, then
$$\mathcal{J} A(i,j) \mathcal{J} = B(i,j) = \widetilde{F}^*_{i,j} \widetilde{F}_{i,j} = \mathcal{J} G^*_{i,j} G_{i,j} \mathcal{J},$$
and $G_{i,j}$ is a lower triangular matrix with positive diagonal, called the lower Cholesky factor of $A(i,j)$. Let $P_{i,j}$ be the last column of $F^{-1}_{i,j}$ and let $P^\sharp_{i,j}$ be the first column of $G^{-1}_{i,j}$, that is,
$$P_{i,j} = F^{-1}_{i,j} E, \qquad P^\sharp_{i,j} = G^{-1}_{i,j} \mathcal{J} E,$$
where $E = \begin{bmatrix} 0 & \ldots & 0 & I \end{bmatrix}^T$. Let $\{r_{i,j}\}_{1 \leq i < j}$ be the parameters associated to $[t_{i,j}]_{i,j \geq 1}$ by Theorem 1.5.3 in [3] and let $\rho_{i,j} = (1 - |r_{i,j}|^2)^{1/2}$. We have that
$$(3.10)\qquad P_{1,n} = \frac{1}{\rho_{1,n}} \begin{bmatrix} 0 \\ P_{2,n} \end{bmatrix} - \frac{r_{1,n}}{\rho_{1,n}} \begin{bmatrix} P^\sharp_{1,n-1} \\ 0 \end{bmatrix},$$
$$(3.11)\qquad P^\sharp_{1,n} = -\frac{\overline{r_{1,n}}}{\rho_{1,n}} \begin{bmatrix} 0 \\ P_{2,n} \end{bmatrix} + \frac{1}{\rho_{1,n}} \begin{bmatrix} P^\sharp_{1,n-1} \\ 0 \end{bmatrix}.$$
These formulae are presumably known to the experts. For the sake of completeness we give a proof here based on results and notation from [3]. First we introduce the following elements. For $i < j$,
$$(3.12)\qquad L^{(j)}_i = L(\{r_{i,k}\}_{k=i+1}^j) = \begin{bmatrix} r_{i,i+1} & \rho_{i,i+1} r_{i,i+2} & \ldots & \rho_{i,i+1} \cdots \rho_{i,j-1} r_{i,j} \end{bmatrix};$$
$$C^{(i)}_j = \begin{bmatrix} r_{j-1,j} \\ \vdots \\ r_{i+1,j}\, \rho_{i+2,j} \cdots \rho_{j-1,j} \\ r_{i,j}\, \rho_{i+1,j} \cdots \rho_{j-1,j} \end{bmatrix},$$
and
$$K^{(j)}_i = \begin{bmatrix} r_{i,i+1}\, \rho_{i,i+2} \cdots \rho_{i,j} \\ \vdots \\ r_{i,j-1}\, \rho_{i,j} \\ r_{i,j} \end{bmatrix} = \begin{bmatrix} K^{(j-1)}_i \rho_{i,j} \\ r_{i,j} \end{bmatrix}.$$
Also, we define inductively $D^{(i+1)}_i = \rho_{i,i+1}$ and
$$(3.13)\qquad D^{(j)}_i = D(\{r_{i,k}\}_{k=i+1}^j) = \begin{bmatrix} D^{(j-1)}_i & -K^{(j-1)}_i r_{i,j} \\ 0 & \rho_{i,j} \end{bmatrix}.$$
We also need to review the factorization of unitary matrices. This is an extension of Euler's description of $SO(3)$. First we define
$$R_{j-i}(r_{i,k}) = I_{k-1-i} \oplus \begin{bmatrix} r_{i,k} & \rho_{i,k} \\ \rho_{i,k} & -\overline{r}_{i,k} \end{bmatrix} \oplus I_{j-k-1},$$
where $I_{k-1-i}$ is the identity matrix of size $k - 1 - i$. Then,
$$R_{i,j} = R_{j-i}(r_{i,i+1}) \cdots R_{j-i}(r_{i,j}),$$
and
$$U_{i,j} = R_{i,j}(U_{i+1,j} \oplus 1).$$
It turns out that any unitary matrix can be written as a matrix of the form of $U_{i,j}$. The main idea for the proof of (3.10) is to use the identity
$$(3.14)\qquad U_{i,j} \mathcal{J} G_{i,j} = F_{i,j},$$
which follows from the relations (1.6.10), (6.3.8) and (6.3.9) in [3].
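The elementary blocks entering $R_{j-i}(r_{i,k})$ are $2 \times 2$ Julia-type matrices, and their unitarity is what makes each $R_{j-i}(r_{i,k})$, and hence $U_{i,j}$, unitary. A quick numerical check, not from the paper (the value $r = 0.3 + 0.4i$ is an arbitrary choice):

```python
# Check (not from the paper) that the 2x2 block [[r, rho], [rho, -conj(r)]]
# with rho = (1 - |r|^2)^{1/2} is unitary; padding by identity blocks and
# multiplying such matrices then also yields unitary matrices.

def block(r):
    rho = (1 - abs(r) ** 2) ** 0.5
    return [[r, rho], [rho, -r.conjugate()]]

def times_adjoint(M):
    """Return M M* for a 2x2 complex matrix M."""
    Mstar = [[M[j][i].conjugate() for j in range(2)] for i in range(2)]
    return [[sum(M[i][k] * Mstar[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

R = block(0.3 + 0.4j)
P = times_adjoint(R)   # should be the 2x2 identity
```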
Thus, we notice that (3.14) implies
$$P^\sharp_{i,j} = F^{-1}_{i,j} U_{i,j} E,$$
which is more tractable than the original definition of $P^\sharp_{i,j}$. This is seen from the following calculations. Using formula (1.5.7) in [3], the above definition of $D^{(n)}_1$, and the notation $D^{-(n)}_1 = (D^{(n)}_1)^{-1}$, we obtain that
$$P_{1,n} = F^{-1}_{1,n} E = \begin{bmatrix} 1 & -L^{(n)}_1 D^{-(n)}_1 \\ 0 & F^{-1}_{2,n} D^{-(n)}_1 \end{bmatrix} E = \begin{bmatrix} -L^{(n)}_1 D^{-(n)}_1 E \\ F^{-1}_{2,n} D^{-(n)}_1 E \end{bmatrix}$$
$$= \begin{bmatrix} -L^{(n)}_1 \begin{bmatrix} D^{-(n-1)}_1 & \frac{r_{1,n}}{\rho_{1,n}} D^{-(n-1)}_1 K^{(n-1)}_1 \\ 0 & \frac{1}{\rho_{1,n}} \end{bmatrix} E \\ F^{-1}_{2,n} \begin{bmatrix} D^{-(n-1)}_1 & \frac{r_{1,n}}{\rho_{1,n}} D^{-(n-1)}_1 K^{(n-1)}_1 \\ 0 & \frac{1}{\rho_{1,n}} \end{bmatrix} E \end{bmatrix} = \begin{bmatrix} -L^{(n)}_1 \begin{bmatrix} \frac{r_{1,n}}{\rho_{1,n}} D^{-(n-1)}_1 K^{(n-1)}_1 \\ \frac{1}{\rho_{1,n}} \end{bmatrix} \\ F^{-1}_{2,n} \begin{bmatrix} \frac{r_{1,n}}{\rho_{1,n}} D^{-(n-1)}_1 K^{(n-1)}_1 \\ \frac{1}{\rho_{1,n}} \end{bmatrix} \end{bmatrix}$$
$$= \frac{1}{\rho_{1,n}} \begin{bmatrix} 0 \\ P_{2,n} \end{bmatrix} + \frac{r_{1,n}}{\rho_{1,n}} \begin{bmatrix} -L^{(n-1)}_1 D^{-(n-1)}_1 K^{(n-1)}_1 - \rho_{1,2} \cdots \rho_{1,n-1} \\ F^{-1}_{2,n} \begin{bmatrix} D^{-(n-1)}_1 K^{(n-1)}_1 \\ 0 \end{bmatrix} \end{bmatrix}.$$
The proof of formula (1.6.15) in [3] gives
$$L^{(n-1)}_1 D^{-(n-1)}_1 K^{(n-1)}_1 + \rho_{1,2} \cdots \rho_{1,n-1} = \frac{1}{\rho_{1,2} \cdots \rho_{1,n-1}}$$
and using formula (1.5.6) in [3] we deduce
$$F^{-1}_{2,n} \begin{bmatrix} D^{-(n-1)}_1 K^{(n-1)}_1 \\ 0 \end{bmatrix} = \begin{bmatrix} F^{-1}_{2,n-1} D^{-(n-1)}_1 K^{(n-1)}_1 \\ 0 \end{bmatrix},$$
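The two Cholesky factors used in this proof can also be sketched numerically. The code below is not from the paper and works with a hypothetical real $3 \times 3$ matrix $A$ with unit diagonal rather than the operator-valued setting: $F$ is upper triangular with $A = F^*F$, and $G = \mathcal{J}\widetilde{F}\mathcal{J}$ is lower triangular with $A = G^*G$, where $\widetilde{F}$ is the upper Cholesky factor of $\mathcal{J}A\mathcal{J}$.

```python
# Sketch (not from the paper): the dual Cholesky factors F (upper) and
# G = J Ftilde J (lower) of a positive definite matrix A with unit diagonal,
# where Ftilde is the upper Cholesky factor of B = J A J and J is the
# exchange (reversal) matrix.  Real entries only.

def chol_lower(A):
    """Standard lower Cholesky: A = L L^T with positive diagonal."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = ((A[i][i] - s) ** 0.5 if i == j
                       else (A[i][j] - s) / L[j][j])
    return L

def transpose(M):
    return [list(col) for col in zip(*M)]

def mult(M, Q):
    return [[sum(M[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(M))]

def reverse(M):
    return [row[::-1] for row in M[::-1]]   # the map M -> J M J

A = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.0],
     [0.25, 0.0, 1.0]]
F = transpose(chol_lower(A))                     # upper factor: A = F^T F
G = reverse(transpose(chol_lower(reverse(A))))   # lower factor: A = G^T G
```

Both factors have positive diagonals, so the columns $P = F^{-1}E$ and $P^\sharp = G^{-1}\mathcal{J}E$ are well defined.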