Table Of ContentQuantum Gibbs Sampling
Using Szegedy Operators
Robert R. Tucci
P.O. Box 226
0
1
Bedford, MA 01730
0
2 tucci@ar-tiste.com
n
a
January 14, 2010
J
4
1
]
h
p
-
t
n
a
u Abstract
q
[
We present an algorithm for doing Gibbs sampling on a quantum computer. The
2
v algorithm combines phase estimation for a Szegedy operator, and Grover’s algorithm.
7
Foranyǫ > 0, thealgorithmwillsampleaprobabilitydistributionin ( 1 )stepswith
4 O √δ
6 precision (ǫ). Here δ is the distance between the two largest eigenvalue magnitudes
1 O
of the transition matrix of the Gibbs Markov chain used in the algorithm. It takes
.
0
(1) steps to achieve the same precision if one does Gibbs sampling on a classical
1 O δ
9 computer.
0
:
v
i
X
r
a
1
1 Introduction
In Ref.[1], Szegedy proposed a quantum walk operator for each classical Markov
chain. In Ref.[2], Somma et al. proposed a method for doing simulated annealing on
a quantum computer. In Ref.[3], Wocjan et al. showed how to improve the Somma et
al. algorithm. The algorithms of Somma et al. and Wocjan et al. both use Szegedy
operators. In Ref.[4], I presented computer programs called QuSAnn and Multiplexor
Expander that implement ideas of Refs.[2] and [3], and also some of my own ideas
about quantum multiplexors.
InRef.[5], IdescribedoneparticularalgorithmfordoingGibbsandMetropolis-
Hastings sampling of a classical Bayesian network (i.e., a probability distribution) on
a quantum computer. In this paper, I describe a different algorithm for doing Gibbs
sampling on a quantum computer. Unlike my first algorithm, this one uses Szegedy
operators. Forany ǫ > 0, thisnew algorithmwill sample aBayesian network in ( 1 )
O √δ
steps with precision (ǫ). Here δ is the distance between the two largest eigenvalue
O
magnitudes of the transition matrix of the Gibbs Markov chain used in the algorithm.
It takes (1) steps to achieve the same precision if one does Gibbs sampling on a
O δ
classical computer.
This paper assumes that its reader has read the section entitled “Notation and
Preliminaries” in Ref.[5]. The reader should refer to Refs.[5, 4] for clarification when
any notation of this paper eludes him.
2 Dual Gibbs Markov Chains
In this section, we will discuss dual “Gibbs” Markov chains with transition matrices
M and M , respectively. These two transition matrices are both defined in terms of
1 2
a single classical Bayesian network x.
2.1 Definitions of M and M
1 2
Consider a classical Bayesian net with N nodes, labeled x ,x ,...,x where
nds 1 2 Nnds
x S foreachj. (Asusualinmypapers, Iindicaterandomvariablesbyunderlining
j ∈ xj
them.) Let x = (x ,x ,...,x ). Let x assume values in a set S which has
1 2 Nnds x
N = 2NB elements.
S
Let
π(x) = P(x = x) (1)
for all x S .
x
∈
For N = 3 and x,y S , let
nds x
∈
M (y x) = P (y x ,x )P (y x ,y )P (y y ,y ) , (2)
1 x x x 1 2 3 x x x 2 3 1 x x x 3 1 2
| 1| 2 3 | 2| 3 1 | 3| 1 2 |
2
and
M (y x) = P (y y ,y )P (y y ,x )P (y x ,x ) . (3)
2 x x x 1 2 3 x x x 2 3 1 x x x 3 1 2
| 1| 2 3 | 2| 3 1 | 3| 1 2 |
(M (y x) can be obtained by swapping x and y in the conditioned arguments of
2 i i
|
M (y x).) Note that M (y x) = 1 and M (y x) = 1. Define M and M for
1 | y 1 | y 2 | 1 2
arbitrary Nnds using thPe same pattern. M1 aPnd M2 are transition matrices of the type
typical for Gibbs sampling. (See Ref.[5] for an introduction to Gibbs sampling and
the more general Metropolis-Hastings sampling).
Youcancheck thatπ()isnotadetailedbalanceofeitherM norM separately.
1 2
However, the following property is true. We will refer to this property by saying that
π() is a detailed balance of the pair (M ,M ).
1 2
Claim 1
M (y x)π(x) = M (x y)π(y) (4)
1 2
| |
for all x,y S .
x
∈
proof:
Let P(x ,x ,...) stand for P(x = x ,x = x ,...). Assume N = 3 to begin
j k j j k k nds
with. One has
M (y x) P(y x ,x )P(y x ,y )P(y y ,y )
1 1 2 3 2 3 1 3 1 2
| = | | | (5)
M (x y) P(x x ,x )P(x x ,y )P(x y ,y )
2 1 2 3 2 3 1 3 1 2
| | | |
P(y ,x ,x )P(y ,x ,y )P(y ,y ,y )
1 2 3 2 3 1 3 1 2
= (6)
P(x ,x ,x )P(x ,x ,y )P(x ,y ,y )
1 2 3 2 3 1 3 1 2
P(y ,x ,x )P(y ,y ,x )P(y)
1 2 3 1 2 3
= (7)
P(x)P(y ,x ,x )P(y ,y ,x )
1 2 3 1 2 3
P(y)
= . (8)
P(x)
A proof for an arbitrary number N of nodes follows the same pattern.
nds
QED
2.2 Eigenvalues of M , M and M
1 2 hyb
Let
Λ (y x) = M (y x) , (9)
j j
| q |
for j = 1,2 and x,y S . It’s convenient to define a hybrid function of M and M ,
x 1 2
∈
as follows:
3
M (y x) = Λ (x y)Λ (y x) (10)
hyb 2 1
| | |
for x,y S . (Note that unlike M (y x) and M (y x), M (y x) is not a probability
x 1 2 hyb
∈ | | |
function in y, its first argument.)
Define the quantum states
(π)η = [π(x)]η x (11)
| i | i
Xx
for η = 1,1. (Note that only the η = 1 state is normalized in the sense of quantum
2 2
mechanics.)
Claim 2
M π = π for j = 1,2 , (12)
j
| i | i
and
M √π = √π . (13)
hyb
| i | i
Also, M , M and M have the same eigenvalues.
1 2 hyb
proof:
Taking the square root of both sides of the pair detailed balance statement
Eq.(4), we get
Λ (y x) π(x) = Λ (x y) π(y) . (14)
1 2
| |
p p
Therefore,
1 1
M (y x) = Λ (x y) Λ (x y) π(y) = M (x y) π(y) . (15)
hyb 2 2 2
| | π(x) | π(x) |
p p
p p
Hence,
M (y x)π(x) = M (x y)π(y) = π(y) , (16)
1 2
| |
Xx Xx
M (x y)π(y) = M (y x)π(x) = π(x) , (17)
2 1
| |
Xy Xy
and
1
M (y x) π(x) = M (x y) π(y) π(x) = π(y) . (18)
hyb 2
| π(x) |
Xx p Xx p p p
p
OrdertheelementsofthefinitesetS insomepreferredway. Usethispreferred
x
order to represent M , M and M as matrices. Define a diagonal matrix D whose
1 2 hyb
4
diagonal entries are the numbers π(x) for each x S , with the x ordered in the
x
∈
preferred order:
D = diag[(π(x)) ] . (19)
x
∀
Since
M2T = D−1M1D ,MhTyb = D−21M2D21 , (20)
it follows that
det(M λ) = det(M λ) = det(M λ) (21)
1 2 hyb
− − −
for any λ C.
∈
QED
Lettheeigenvalues1 ofM (andalsoofM andM )bem ,m ,...m C
hyb 1 2 0 1 NS−1 ∈
with m = 1 > m m ... m . Define m to be the corresponding
0 | 1| ≥ | 2| ≥ | NS−1| | ji
eigenvectors of M (but not necessarily of M and M ). Thus
hyb 1 2
M m = m m , (22)
hyb j j j
| i | i
for j = 0,1,...,N 1. In particular, m = √π .
S 0
− | i | i
For each j, define ϕ [0, π] and η [0,2π) so that m = eiηj cosϕ . (Thus,
j ∈ 2 j ∈ j j
cosϕ 0). Note that m = 1 so ϕ = 0. The M eigenvalue gap δ is defined as
j 0 0 1
≥
ϕ2
δ = 1 m . δ 1 when ϕ is small.
−| 1| ≈ 2 1
3 Q-Embeddings U and U
1 2
In this section, we will define a “q-embedding” U of M , for j = 1,2. (For more
j j
information about q-embeddings, see Ref.[5].)
For simplicity, we begin this section by considering a Bayesian net with only
3 nodes x ,x ,x , and such that each of these nodes is binary (i.e., S = Bool for
1 2 3 xj
j = 1,2,3). At the end of this section, we will show how to remove these restrictions
and make our treatment valid for general Bayesian networks.
Using the same language as Ref.[5], consider a unitary matrix U of the form
1
shown in Fig.1, with its multiplexor gates defined as follows. Let x k Bool
jh i ∈
and x k Bool for any j,k. U has 3 analogous gates (a.k.a. nodes) labeled
′jh i ∈ 1
(x 2 ,x 2 ,x 2 ), (x 3 ,x 3 ,x 3 ), and (x 4 ,x 4 ,x 4 ). Consider the first
′1h i 3h i 2h i ′2h i ′1h i 3h i ′3h i ′2h i ′1h i
ofthesefordefiniteness. LettheprobabilityamplitudeA(x 2 ,x 2 ,x 2 x 1 ,x 1 ,x 1 )
′1h i 3h i 2h i| ′1h i 3h i 2h i
of node (x 2 ,x 2 ,x 2 ) satisfy the constraint
′1h i 3h i 2h i
1There must be a single eigenvalue 1 and all others must have a magnitude strictly smaller than
one because of the Frobenius-PerronTheorem. The eigenvalues may be complex.
5
x
1
1
2
x 1
2
3 2 x 1
U = 4 3 2 x3
1 Ry 1 1
4 Ry 3 x2 1
Ry 4 x3 1
Figure 1: Unitary matrix U expressed as a product of quantum multiplexors.
1
A(x 2 ,x 2 ,x 2 x 1 = 0,x 1 ,x 1 ) (23)
′1h i 3h i 2h i| ′1h i 3h i 2h i
= qPx1|x3,x2(x′1h2i|x3h2i,x2h2i)δxx22hh21iiδxx33hh21ii . (24)
If we indicate non-zero entries by a plus sign,
000 001 010 011
···
(x′1,x3,x2)=000 + ···
+
001
···
+
010
···
A = + (25)
011
···
+
100
···
+
101
···
+
110
···
+
111
···
G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27)
→ eiθ~bσY ⊗P~b = G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) , (26)
X
~b Bool2
∈
for some θ R. Here the right pointing arrow means that the expression at the
~b ∈
origin of the arrow can be extended to the expression at the target of the arrow.
From the above definition of U , it follows that, for x,x,y,y Bool3,
1 ′ ′
∈
6
y x
1 1
h | | i
y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) x
2 2
h | | i
hy| U |xi = hy3| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |x3i (27)
y′ 1 0 ⊗3 y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) 0
h | | i h 1′| | i
hy2′| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |0i
y 0
h 3′| | i
= Λ (y x)δ(y,x) . (28)
1 ′
|
Hence,
x x
U1 | i = | i or U1|0i⊗3|xi = (Λ1|xi)|xi . (29)
0 ⊗3 Λ1 |xi
| i
R 4 x 1
y 1
4 3 x 1
R
y 2
2
U = 4 3 Ry x3 1
2 3 2 x 1
1
2 x 1
2
x 1
3
Figure 2: Unitary matrix U expressed as a product of quantum multiplexors.
2
Besides U , it is convenient to consider another unitary matrix called U . We
1 2
define U to be of the form of Fig.2, where the multiplexors are defined in such a way
2
that U satisfies, for all x,x,y,y Bool3,
2 ′ ′
∈
y 0
1
h | | i
hy2| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |0i
y 0 3 y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) 0
h | U | i⊗ = h 3| | i (30)
y 2 x
h ′| | ′i y G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) x
h 1′| | ′1i
hy2′| G#(cid:31)(cid:24)(cid:30)(cid:25)(cid:29)(cid:26)(cid:28)(cid:27) |x′2i
y x
= Λ2(hy3′x|′)δ(y′,x′) . | ′3i (31)
|
7
Hence
U2 |0i⊗3= Λ2 |x′i or U2|x′i|0i⊗3 = |x′i(Λ2|x′i) . (32)
x x
′ ′
| i | i
U is called the q-embedding of M for j = 1,2.
j j
N
Ba
A =
R
y
R N
y
Bb
R
y
Figure 3: The unitary matrix A is a quantum embedding of a probability matrix
P(b a), where a has N bits and b has N bits.
Ba Bb
|
Sofarwehaveconsideredtheq-embeddingsU andU forthecaseofaclassical
1 2
Bayesian network x with 3 binary nodes. What if x has N nodes and some of those
nds
nodes have more than 2 states? In that case, we must use several qubits (horizontal
lines) for each node x (and an equal number of qubits for the dual node x ) in
i ′i
Figs.1 and 2. More specifically, suppose P(x x ,x ,...,x ) equals P(b a) where
1| 2 3 Nnds |
a BoolNBa and b BoolNBb. For the number of bits N , define the number of
Ba
∈ ∈
states N = 2NBa. Likewise, let N = 2NBb. The constraint Eq.(24) generalizes to
Sa Sb
A(b,a˜ ˜b = 0,a) = P(b a)δa˜ , (33)
| | a
p
where a,a˜ BoolNBa and b,˜b BoolNBb. Eq.(33) can be expressed in matrix form
∈ ∈
as follows:
˜
(b = 0,a)
→
(b,a˜) D0,0
[A(b,a˜ ˜b = 0,a)] = D1,0 , (34)
| ↓
···
DNSb−1,0
where, for all b BoolNBb, Db,0 RNSaXNSa are diagonal matrices with entries
∈ ∈
(Db,0) = P(b a)δa˜ . (35)
a,a˜ | a
p
8
By adding more columns to the matrix of Eq.(34), one can extended it (see section
entitled “Q-Embeddings” in Ref.[5]) to a square matrix which can be expressed in
terms of multiplexors as in Fig.3.
The Markov Blanket MB(i) for a node x of the classical Bayesian network x
i
satisfies (see section entitled “Notation and Preliminaries” in Ref.[5])
P(x x ) = P(x x ) . (36)
i i c i MB(i)
| {} |
If the set MB(i) is strictly smaller than the set i c, this property can be used to
{ }
reduce the number of controls for the multiplexor in U and U corresponding to
1 2
P(x x ).
i i c
| {}
Giventhetwo q-embeddings U andU foraBayesian network x, wecandefine
1 2
a unitary matrix U as follows
U = U2†U1 . (37)
Matrix U has the following highly desirable property:
Claim 3 For any j,k 0,1,...,N 1 ,
S
∈ { − }
0 m
h | U | ki = m δk . (38)
m 0 j j
j
h | | i
proof:
0 m y ΛT x x m
mh | U2†U1 |0 ki = m y (cid:20) hy| 2 (cid:21)(cid:20) Λ |xi (cid:21) h | ki (39)
h j| | i Xy,x h j| i h | 1| i
= m y ΛT(y x)Λ (y x) x m (40)
h j| i 2 | 1 | h | ki
Xy,x
= m M m = m δk . (41)
h j| hyb| ki j j
QED
4 Szegedy Quantum Walk Operator W
In this section, we will define a special type of Szegedy quantum walk operator W
corresponding to a Bayesian net x. We will then find the eigenvalues of W.
4.1 Definition of W
As in Ref.[4], define the projection operator πˆ and its dual projection operator πˇ by
9
0 0 —
πˆ = | ih | , πˇ = πˆ = . (42)
— l l 0 0
| ih |
Then the Szegedy quantum walk operator W for the Bayesian net x is defined by
W = U( 1)πˇU ( 1)πˆ . (43)
†
− −
4.2 Eigenvalues of W
To find the eigenvalues of W, we will use the following identities.
Claim 4
πˆ m 0 = m 0 , (44a)
j j
| i | i
πˆ(U ) m 0 = m m 0 , (44b)
j j j
l | i | i
πˆ( U ) m 0 = m m 0 , (44c)
l † | j i ∗j| j i
for all j 0,1,...,N 1 .
S
∈ { − }
proof:
From the definition of πˆ, we see that
0 0
πˆ | i = | i . (45)
m m
j j
| i | i
Also,
0 0 0 m 0
πˆ(U ) | i = | ih | U | ji = m | i , (46)
l m m m 0 j m
| ji Xk | kih k| | i | ji
and
0 0 m 0 0
πˆ(l U†) |mi = |mih 0k| U† |mi = m∗j |mi . (47)
| ji Xk | kih | | ji | ji
QED
An immediate consequence of Claim 4 is that
m 0 U m 0 = m 0 πˆU m 0 = m δk , (48)
h j | l | k i h j | l | k i j j
for j,k 0,1,...,N 1 .
S
∈ { − }
Note that since m = 1, Eq.(48) implies that
0
m 0 = U m 0 . (49)
0 0
| i l | i
10