CONSTRUCTIVE APPROXIMATION

Subscription Information

Constructive Approximation is published quarterly by Springer-Verlag New York Inc.

NORTH AMERICA: Institutional subscription rate: $114.00, including postage and handling. Subscriptions are entered with prepayment only. Please mail your order to: Springer-Verlag New York Inc., Service Center Secaucus, 44 Hartz Way, Secaucus, NJ 07094, USA. Tel. (201) 348-4033, Telex 12 59 94.

ALL OTHER COUNTRIES: Subscription rate: DM 264 plus postage and handling. Please place your order with your bookdealer or mail directly to: Springer-Verlag, Heidelberger Platz 3, D-1000 Berlin 33, Germany. Tel. (030) 82071, Telex 183319.

BACK VOLUMES AND MICROFORM EDITIONS: Back volumes are available on request from the publisher. Microform editions are available from: University Microfilms International, 300 N. Zeeb Road, Ann Arbor, MI 48106.

CHANGE OF ADDRESS: Please inform the publisher of your new address 6 weeks before you move. All communications should include both old and new addresses (with Zip codes) and should be accompanied by a mailing label from a recent issue.

Copyright

Submission of a manuscript implies: that the work described has not been published before (except in the form of an abstract or as part of a published lecture, review, or thesis); that it is not under consideration for publication elsewhere; that its publication has been approved by all coauthors, if any, as well as by the responsible authorities at the institute where the work has been carried out; that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher; and that the manuscript will not be published elsewhere in any language without the consent of the copyright holders.

All articles published in this journal are protected by copyright, which covers the exclusive rights to reproduce and distribute the article (e.g., as offprints), as well as all translation rights. No material published in this journal may be reproduced photographically or stored on microfilm, in electronic data bases, video disks, etc., without first obtaining written permission from the publisher.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.

While the advice and information in this journal is believed to be true and accurate at the date of its going to press, neither the authors, the editors, nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Photocopies may be made for personal or in-house use beyond the limitations stipulated under Sections 107 or 108 of U.S. Copyright Law, provided a fee is paid. This fee is US $0.20 per page. All fees should be paid to the Copyright Clearance Center, Inc., 27 Congress Street, Salem, MA 01970, USA, stating the ISSN 0176-4276, the volume, and the first and last page numbers of each article copied. The copyright owner's consent does not include copying for general distribution, promotion, new works, or resale. In these cases, specific written permission must first be obtained from the publisher.

ISBN 978-1-4899-6816-6    ISBN 978-1-4899-6886-9 (eBook)
DOI 10.1007/978-1-4899-6886-9
ISSN: 0176-4276

© 1989 by Springer Science+Business Media New York
Originally published by Springer-Verlag New York Inc. in 1989.
Constr. Approx. (1989) 5: 1-2
CONSTRUCTIVE
APPROXIMATION
© 1989 Springer-Verlag New York Inc.
Preface
This special issue marks an exciting moment in the history of mathematics.
Let me explain. I believe that mathematics holds the key to the control of the
massive amounts of data which mark the new socioeconomic era. Technologies
come and others go, directly in accord with their abilities to handle more
information faster. Companies rise and fall with these technologies.
Concurrent with the publication of this special issue there has appeared a new
technology for processing and transmitting images. A single digital image of size
1024 x 1024 with 24 bits/pixel requires three megabytes of computer storage, and
takes half an hour to communicate by telephone. This communication time can
be reduced to seconds by replacing the images by fractals. The approximation
of data by fractals is the central thrust of this special issue.
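The arithmetic behind these figures is simple to reproduce; the telephone-line rate of 14.4 kbit/s used below is an assumed, merely representative value:
$$1024 \times 1024 \times 24 = 25{,}165{,}824 \text{ bits} = 3{,}145{,}728 \text{ bytes} \approx 3 \text{ MB},$$
$$\frac{25{,}165{,}824 \text{ bits}}{14{,}400 \text{ bits/s}} \approx 1748 \text{ s} \approx 29 \text{ minutes}.$$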
The theorems herein are blueprints for new technological devices. Almost any
one can be interpreted in a discrete setting, converted to digital algorithms,
translated to software, optimized, then converted to hardware designs, and passed
to a silicon foundry. The product may be part of a new communications device.
The difference between now and earlier eras is the speed with which the technology transfer can take place: from mathematics to a component inside a commercially available television set in less than three years. Mathematicians
hold the key to this process.
Fractal geometry is a new focus for mathematical research. This research draws both on classical analysis and computer graphics, and on interplay with the rapidly developing applications. What the body of knowledge which we might now want to call "fractal geometry" will finally look like is not clear: so much is going on, and so many diverse branches of study involve fractals in one way or another while containing great creativity beyond their fractal theme, that it is impossible yet to say how the area will eventually be defined. One thing is clear, however: an important part of the area will belong to approximation theory. It is the aim of this special issue to support this assertion and to go some
way toward introducing, if only by illustration, the main ingredients and directions
which will belong to Fractal Approximation Theory.
In the following paragraphs I give some of the questions that an "approximation
theory" must address, and I say how this issue addresses them.
What objects will Fractal Approximation Theory be concerned with approximating? Complicated subsets of $\mathbf{R}^2$, and finite Borel measures supported on these subsets. An example of the former is provided by wriggly, nondifferentiable functions, such as might occur in modeling a time series of temperatures in
a jet-engine exhaust. Another example is a subset of $\mathbf{R}^2$ which represents a picture of a fern, black on a white background. Measures on $\mathbf{R}^2$ occur in great diversity in the theory of dynamical systems; how they should be approximated becomes a pressing question. Measures also provide models for greytone photographs: their approximation is equivalent to image compression!
What sort of things will Fractal Approximation Theory use as approximating
entities? And how are the approximating entities to be computed? In this issue
the focus is on the use of deterministic fractals and invariant measures of Markov
processes as approximating entities. The papers by Bedford, and by Barnsley,
Elton, and Hardin are in this area. One approach is the use of attractors of
iterated function systems built up from a special class of transformations in the
underlying space, such as affine transformations. The approximants introduced
by Elton and Yan give the flavor of this type of problem. They introduce a special
class of approximants which are quite easy to work with.
The approximating entities are computed by set iteration, and by random
iteration algorithms which exploit ergodic theorems, such as the ergodic theorem
of Elton. Withers uses Newton's method to compute a sequence of fractal
approximations to the graph of a given function, and he uses ergodicity to
compute the necessary derivative at each step!
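To make the set-iteration idea concrete, here is a minimal sketch (our own illustration, not taken from any paper in this issue) that iterates the Hutchinson set map for three illustrative affine contractions of the plane; after a few iterations the resulting point set approximates the attractor:

    # Deterministic set iteration for an IFS: apply every map to every point,
    # so the n-th iterate approximates the attractor A = w_1(A) u ... u w_N(A).
    # The three affine maps below (a Sierpinski-type example) are illustrative only.

    def sierpinski_maps():
        return [
            lambda p: (0.5 * p[0],        0.5 * p[1]),
            lambda p: (0.5 * p[0] + 0.5,  0.5 * p[1]),
            lambda p: (0.5 * p[0] + 0.25, 0.5 * p[1] + 0.5),
        ]

    def set_iteration(maps, start, n_iterations):
        """Iterate the set map S -> union of w_j(S) starting from a finite set."""
        current = set(start)
        for _ in range(n_iterations):
            current = {w(p) for p in current for w in maps}
        return current

    if __name__ == "__main__":
        points = set_iteration(sierpinski_maps(), [(0.0, 0.0)], 8)
        print(len(points), "points approximating the attractor")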
Third, what criteria will Fractal Approximation Theory use to decide whether
or not an approximation is a good one? What quantities do we try to make agree
between the approximations and the objects which are being approximated?
What are the distances between target and approximation which are to be
minimized? The Hausdorff dimension of approximating sets associated with
iterated function systems of affine maps is dealt with in the paper of Geronimo
and Hardin. The paper of Bedford introduces the notion of the distribution of
Hölder exponents as a characteristic of the graph of a nondifferentiable function. He gives explicit formulas relating these important approximation criteria to fractal dimensions.
The paper by Wallin is important because it underlies another fascinating area: the approximation and continuation properties of functions whose domains are fractal subsets of $\mathbf{R}^n$. How does one go about approximating the temperature along the coast of Sweden?
MICHAEL BARNSLEY
Georgia Institute of Technology
Atlanta, Georgia
Constr. Approx. (1989) 5: 3-31
CONSTRUCTIVE
APPROXIMATION
© 1989 Springer-Verlag New York Inc.
Recurrent Iterated Function Systems
Michael F. Barnsley, John H. Elton, and Douglas P. Hardin
Abstract. Recurrent iterated function systems generalize iterated function systems as introduced by Barnsley and Demko [BD] in that a Markov chain (typically with some zeros in the transition probability matrix) is used to drive a system of maps $w_j: K \to K$, $j = 1, 2, \ldots, N$, where $K$ is a complete metric space. It is proved that under "average contractivity," a convergence and ergodic theorem obtains, which extends the results of Barnsley and Elton [BE]. It is also proved that a Collage Theorem is true, which generalizes the main result of Barnsley et al. [BEHL] and which broadens the class of images which can be encoded using iterated map techniques. The theory of fractal interpolation functions [B] is extended, and the fractal dimensions for certain attractors are derived, extending the technique of Hardin and Massopust [HM]. Applications to Julia set theory and to the study of the boundary of IFS attractors are presented.
1. Introduction
Let $X$ be a complete metric space with metric $d$. Let $w_j: X \to X$ be Lipschitz maps, $j = 1, 2, \ldots, N$. Let $(p_{ij})$ be an $N \times N$ row-stochastic matrix. Then we call $\{X, w_j, p_{ij},\ i, j = 1, 2, \ldots, N\}$ a recurrent iterated function system (IFS), whether or not $(p_{ij})$ is technically "recurrent" (i.e., irreducible). The focus of a recurrent IFS is random walks in $X$ of the following nature: specify a starting point $x_0 \in X$ and a starting code $i_0 \in \{1, 2, \ldots, N\}$. Choose a number $i_1 \in \{1, 2, \ldots, N\}$ with the (conditional) probability that $i_1 = j$ being $p_{i_0 j}$, and then define $x_1 = w_{i_1} x_0$. Then pick $i_2 \in \{1, 2, \ldots, N\}$, with the probability that $i_2 = j$ being $p_{i_1 j}$, and go to the point $x_2 = w_{i_2} x_1 = w_{i_2} w_{i_1} x_0$. Continue in this way to generate an orbit $\{x_n\}_{n=0}^{\infty}$.
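This random walk is easy to simulate; the following minimal sketch (our own illustration, with $X = \mathbf{R}^2$, two affine maps, and a $2 \times 2$ matrix chosen arbitrarily) generates an orbit $x_0, x_1, x_2, \ldots$ by driving the choice of map with the Markov chain $(p_{ij})$:

    import random

    # Random walk of a recurrent IFS: the index sequence i_1, i_2, ... is a Markov
    # chain with row-stochastic matrix P = (p_ij), and x_n = w_{i_n}(x_{n-1}).
    # The maps and matrix here are placeholders; any Lipschitz maps on X will do.

    def recurrent_ifs_orbit(maps, P, x0, i0, n_steps, seed=0):
        """Generate the orbit x_0, x_1, ..., x_n driven by the Markov chain (p_ij)."""
        rng = random.Random(seed)
        x, i = x0, i0
        orbit = [x]
        for _ in range(n_steps):
            # choose the next index j with probability p_{i j}
            i = rng.choices(range(len(maps)), weights=P[i])[0]
            x = maps[i](x)
            orbit.append(x)
        return orbit

    if __name__ == "__main__":
        w = [lambda p: (0.5 * p[0], 0.5 * p[1]),
             lambda p: (0.5 * p[0] + 0.5, 0.5 * p[1] + 0.5)]
        P = [[0.0, 1.0],   # some entries may be zero, restricting map sequences
             [0.5, 0.5]]
        orbit = recurrent_ifs_orbit(w, P, x0=(0.0, 0.0), i0=0, n_steps=10000)
        print(orbit[-1])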
Our concern in this paper is with existence, uniqueness, convergence to, and characterization of limit sets (attractors) $A \subset X$, and of associated invariant (stationary) measures whose support is $A$. $A$ may be described as follows: $x \in A$ iff every neighborhood of $x$ contains infinitely many $x_n$'s, for almost all orbits. The empirical distribution along an orbit converges to the stationary measure, for almost all orbits. (The description given of $A$ does not quite follow from the statement about the stationary measure, which is of interest; see Section 2.)
It is also very important to consider limits when composing maps in the reverse order $w_{i_1} \cdots w_{i_n} x$, which is exploited, and connections with the random walk clarified, in Section 2. In the case of uniformly contractive maps this is especially useful, and in Section 3 this point of view is used exclusively, in a more general setting, to give an elegant characterization of the attractor as the unique, attractive fixed point of a certain set map, using the Hausdorff metric. (Actually, a more precise invariance result is obtained for an N-tuple of sets, based on the connection structure of the chain; that is, which maps are allowed to follow which, i.e., which $p_{ij}$ are not zero.)

Date received: September 4, 1986. Communicated by Edward B. Saff.
AMS classification: 28D99, 41A99, 58F11, 60F05, 60G10, 60J05.
Key words and phrases: Iterated function systems, Attractor, Random maps, Markov chain, Ergodic, Lyapunov exponent, Fractal, Dimension.
By having some entries in $(p_{ij})$ equal to zero, the allowable map sequences in
the random walk are restricted, and this gives rise to limit sets with geometries
not obtainable by earlier iterated function systems, which is one motivation for
our work; see Section 5.
Key references which underlie the present work are [BD], [H], [Bed], [D],
[BEHL], [HM], [BE], [E], and [BA]. The structure of this paper is as follows.
In Section 2 we consider the existence, uniqueness, and convergence questions
referred to above. In Section 3 we describe the Collage Theorem for recurrent
IFS, and in so doing extend the concept of recurrent IFS to multiple spaces and
set maps. In Section 4 we compute the fractal dimension for various recurrent
IFS attractors, using the Perron-Frobenius theorem for the connection matrix.
In Section 5 we give examples, including combinatorial fractal functions,
boundaries of attractors of IFS, and Julia set applications.
2. Ergodicity of the Random Walk
Let $(X, d)$ be a complete separable locally compact metric space. We consider a random walk (i.e., a discrete-time stochastic process) in $X$ arising from iteratively applying Lipschitz maps chosen according to a finite state-space Markov chain, as described in the Introduction.

Let $(p_{ij})$ be an irreducible $N \times N$ row-stochastic matrix, i.e., $\sum_{j=1}^{N} p_{ij} = 1$ for all $i$, $p_{ij} \ge 0$ for all $i, j$, and for any $i, j$ there exist $i_1, i_2, \ldots, i_n$ with $i_1 = i$ and $i_n = j$ such that $p_{i_1 i_2} p_{i_2 i_3} \cdots p_{i_{n-1} i_n} > 0$. Let $w_j$, $j = 1, \ldots, N$, be Lipschitz maps on $X$.
The random walk described informally in the Introduction can be formulated as follows: let $i_0, i_1, \ldots$ be a Markov chain in $\{1, \ldots, N\}$, with transition probability matrix $(p_{ij})$; then our random walk is the process
$$Z_n = w_{i_n} w_{i_{n-1}} \cdots w_{i_1} Z_0, \qquad n = 1, 2, \ldots.$$
Now $(Z_n)$ is not a Markov process on $X$, but $\bar Z_n = (Z_n, i_n)$ is a Markov process on $\bar X = X \times \{1, \ldots, N\}$ with transition probability function
$$\bar p((x, i), B) = \sum_{j=1}^{N} p_{ij}\, 1_B(w_j x, j);$$
this is the probability of transfer from $(x, i)$ into the Borel set $B \subset \bar X$ in one step of the process.
Let $(m_i)$ be the unique stationary initial distribution for the Markov chain on $\{1, \ldots, N\}$; i.e.,
$$\sum_{i=1}^{N} m_i p_{ij} = m_j, \qquad j = 1, \ldots, N.$$
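For a concrete $(p_{ij})$ the stationary row vector $(m_i)$ can be found numerically; the brief sketch below (our own, with an arbitrary irreducible $3 \times 3$ example) uses power iteration $m \leftarrow mP$:

    # Compute the stationary distribution (m_i) of an irreducible row-stochastic
    # matrix (p_ij) by power iteration: m <- m P until convergence.
    # The 3x3 matrix below is an arbitrary example.

    def stationary_distribution(P, tol=1e-12, max_iter=100000):
        n = len(P)
        m = [1.0 / n] * n
        for _ in range(max_iter):
            new_m = [sum(m[i] * P[i][j] for i in range(n)) for j in range(n)]
            if max(abs(new_m[j] - m[j]) for j in range(n)) < tol:
                return new_m
            m = new_m
        return m

    if __name__ == "__main__":
        P = [[0.0, 0.5, 0.5],
             [0.3, 0.0, 0.7],
             [0.6, 0.4, 0.0]]
        m = stationary_distribution(P)
        # check that sum_i m_i p_ij = m_j
        print(m, [sum(m[i] * P[i][j] for i in range(3)) for j in range(3)])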
We show that if the maps are logarithmically contractive on the average after some number of iterations (see Theorem 2.1), then there is a unique initial distribution which makes the Markov process $(\bar Z_n)$ stationary (this is also called the invariant measure), and more importantly, for any starting value $(x_0, i_0)$, the empirical distribution of a trajectory $x_0, w_{i_1} x_0, w_{i_2} w_{i_1} x_0, \ldots$ will converge with probability one to the $X$-projection $\mu$ of the stationary initial distribution. Furthermore, if $A$ is the support of $\mu$, then $x \in A$ iff for any neighborhood of $x$, almost all trajectories visit the neighborhood infinitely often. This is perhaps surprising because from the convergence to $\mu$ of the empirical distribution along trajectories it follows that $x \notin A$ implies that for some neighborhood of $x$, the proportion of the number of visits to the neighborhood approaches 0 for almost all trajectories, whereas we are making the stronger assertion that some neighborhood of $x$ will only be visited finitely many times, almost surely (a.s.).
This average contractivity condition appeared in [BE] concerning the case when the sequence of maps is independent and identically distributed (i.i.d.) (in the present setup, this would mean $p_{ij} = p_j$, $i = 1, \ldots, N$, for each $j$), and was generalized in the case of i.i.d. affine maps to infinitely many maps in [BA]. It is equivalent in those cases to a negative Lyapunov exponent condition, as pointed out in [BA], and this will be seen to be true here also.
An important point in the proof will be to run the Markov chain "backward in time"; let us explain. Consider the matrix $(q_{ij})$ given by
$$q_{ij} = \frac{m_j p_{ji}}{m_i},$$
which is row stochastic, irreducible, and also satisfies $\sum_{i=1}^{N} m_i q_{ij} = m_j$, as is easily checked. The $q_{ij}$ are called inverse transition probabilities in Chapter 15 of [F].
The reason is as follows: consider the Markov chain $(i_0, i_1, \ldots)$ above, with transition probability matrix $(p_{ij})$ and initial distribution $(m_i)$. The probability that $(i_1, i_2, \ldots, i_n) = (j_1, \ldots, j_n)$ is then
$$\sum_{j_0=1}^{N} m_{j_0} p_{j_0 j_1} p_{j_1 j_2} \cdots p_{j_{n-1} j_n} = \sum_{j_0=1}^{N} m_{j_0} \frac{m_{j_1}}{m_{j_0}} q_{j_1 j_0} \cdots \frac{m_{j_n}}{m_{j_{n-1}}} q_{j_n j_{n-1}},$$
which is the probability that a Markov chain with transition probability matrix $(q_{ij})$ and initial distribution $(m_i)$ will have for its first $n$ values $(j_n, j_{n-1}, \ldots, j_1)$.
Let $P$ be the probability measure on $\Omega = \{\mathbf{i} = (i_0, i_1, \ldots)\}$ corresponding to the "forward" chain; that is, $P$ is given on "thin cylinders" by $P(i_0, i_1, \ldots, i_n) = m_{i_0} p_{i_0 i_1} \cdots p_{i_{n-1} i_n}$. Let $Q$ be the probability measure on $\Omega$ corresponding to the "backward" chain; i.e., $Q(i_0, i_1, \ldots, i_n) = m_{i_0} q_{i_0 i_1} \cdots q_{i_{n-1} i_n}$.
We show that under our hypotheses, for the backward process, $\lim_{n \to \infty} w_{i_1} \cdots w_{i_n} x = Y(\mathbf{i})$ exists and is independent of $x$ for $Q$-almost all (a.a.) $\mathbf{i}$. Note that this is very different from the iterative process $w_{i_n} \cdots w_{i_1} x$ we originally discussed, where $i_1, i_2, \ldots$ are chosen according to $P$. This process does not converge pointwise, but its trajectories distribute ergodically as the measure $\mu$, which is obtainable from the limit of the backward process as
$\mu(B) = Q(Y^{-1}(B))$. This is simply because for all $n$, $w_{i_n} \cdots w_{i_1} x$ has the same distribution under $P$ as does $w_{i_1} \cdots w_{i_n} x$ under $Q$.
If all the maps $w_i$ are uniform contractions, then it is easy to see that $\lim_{n \to \infty} w_{i_1} \cdots w_{i_n} x = Y(\mathbf{i})$ exists for all $\mathbf{i}$ (not just $Q$-a.e.), and that $Y$ is continuous with the product topology on $\Omega$, and range $Y = A$ is a compact set in $X$ which is exactly the support of $\mu$, called the attractor. This is discussed in detail in Section 3, where an invariance result ("Collage Theorem") is given for a special decomposition of $A$ into compact subsets. But even in this uniformly contractive case, the trajectories of the random walk (the forward process), $w_{i_n} \cdots w_{i_1} x$, converge only in the distribution sense (the points along the trajectory continue to dance about), and only with probability one.
We hope this detailed discussion of running time backward and inverse probabilities will be helpful in clarifying the connection between the "symbolic dynamics" point of view $w_{i_1} w_{i_2} \cdots w_{i_n} x$ and the "ergodic" point of view $w_{i_n} \cdots w_{i_1} x$, and why the measures are the same; this matter had been a little unclear to the authors previously.
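The contrast between the two orders of composition can also be seen in a small numerical experiment (our own; two contractive affine maps on the line and, for simplicity, i.i.d. index choices rather than a general Markov chain): the backward compositions settle down to a limit while the forward iterates keep moving about the attractor.

    import random

    # Forward iterates w_{i_n}...w_{i_1}x keep moving around the attractor, while
    # backward compositions w_{i_1}...w_{i_n}x converge to a point Y(i).
    # Two contractive maps on the line and i.i.d. indices, purely for illustration.

    w = [lambda t: 0.5 * t, lambda t: 0.5 * t + 1.0]
    rng = random.Random(1)
    idx = [rng.randrange(2) for _ in range(25)]
    x0 = 0.3

    forward, forward_orbit = x0, []
    for i in idx:
        forward = w[i](forward)            # each new map is applied outermost
        forward_orbit.append(forward)

    backward_orbit = []
    for n in range(1, len(idx) + 1):
        y = x0
        for k in range(n - 1, -1, -1):     # apply w_{i_n} first, w_{i_1} last
            y = w[idx[k]](y)
        backward_orbit.append(y)

    print("forward :", [round(v, 4) for v in forward_orbit[-5:]])
    print("backward:", [round(v, 4) for v in backward_orbit[-5:]])

The last few backward values printed are essentially identical, while the forward values continue to jump; both index sequences are the same.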
Now for a precise statement and proof of the convergence and ergodic results.
For a Lipschitz map $w: X \to X$, define
$$\|w\| = \sup_{x \ne y} \frac{d(wx, wy)}{d(x, y)}.$$
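For a concrete map this quantity can be estimated by sampling; the sketch below (our own illustration, on $X = \mathbf{R}$ with the usual metric) only produces a lower bound for $\|w\|$:

    import math
    import random

    # Crude numerical estimate of the Lipschitz norm ||w|| = sup d(wx, wy)/d(x, y)
    # by sampling pairs of points; here X = R with d(x, y) = |x - y|, and the map
    # is an arbitrary smooth example. Random sampling only gives a lower bound.

    def lipschitz_estimate(w, lo=-5.0, hi=5.0, pairs=100000, seed=0):
        rng = random.Random(seed)
        best = 0.0
        for _ in range(pairs):
            x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
            if x != y:
                best = max(best, abs(w(x) - w(y)) / abs(x - y))
        return best

    if __name__ == "__main__":
        print(lipschitz_estimate(lambda t: 0.4 * math.sin(t)))   # close to 0.4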
Theorem 2.1. Assume that, for some $n$,
$$E_P\bigl(\log \|w_{i_n} \cdots w_{i_1}\|\bigr) < 0$$
(see above for the definition of $P$); that is,
$$\sum_{i_1} \cdots \sum_{i_n} m_{i_1} p_{i_1 i_2} \cdots p_{i_{n-1} i_n} \log \|w_{i_n} \cdots w_{i_1}\| < 0.$$
(This is equivalent to a negative Lyapunov exponent for the process $w_{i_1}, w_{i_2}, \ldots$; see the proof.) Then:
(i) For $Q$-a.a. $\mathbf{i}$, $w_{i_1} \cdots w_{i_n} x \to Y(\mathbf{i})$, which does not depend on the choice of $x \in X$.
(ii) Define $\tilde\mu(B) = Q(\mathbf{i}: (Y(\mathbf{i}), i_1(\mathbf{i})) \in B)$, the distribution of $(Y, i_1)$ on $\bar X$. Then $\tilde\mu$ is the unique stationary initial distribution for the Markov process $\bar Z_n = (Z_n, i_n)$. Furthermore, if $\tilde\nu$ is any probability measure on $\bar X$ satisfying $\tilde\nu(X \times \{i\}) = m_i$, $i = 1, \ldots, N$, then $\bar Z_n^{\tilde\nu}$ converges in distribution to $\tilde\mu$, where $\bar Z_n^{\tilde\nu}$ represents the Markov process with initial distribution $\tilde\nu$. In particular, the random walk on $X$ converges in distribution to the measure $\mu(B) = \tilde\mu(B \times \{1, \ldots, N\})$. (The given condition on $\tilde\nu$ may be expressed as requiring the marginal distribution of $i_0$ to be $(m_i)$.)
(iii) (Ergodic theorem). For every $x$, for $P$-a.a. $\mathbf{i}$,
$$\frac{1}{n} \sum_{k=1}^{n} f(w_{i_k} \cdots w_{i_1} x) \to \int f \, d\mu$$
for all $f \in C(X)$, the bounded continuous functions on $X$. In other words, starting at any $x$, the empirical distribution of a trajectory converges with probability one to $\mu$.
(iv) If $A$ is the support of $\mu$, then $x \in A$ iff for every neighborhood of $x$, almost all trajectories visit the neighborhood infinitely often (recall that the support $A$ of $\mu$ is defined as follows: $x \in A$ iff every neighborhood of $x$ has positive $\mu$-measure; $A$ is a closed set).
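For specific maps and a specific matrix, the hypothesis of Theorem 2.1 can be checked numerically. In the sketch below (our own illustration, with one-dimensional affine maps, for which $\|w\|$ is exactly the absolute slope and norms multiply under composition; the maps, matrix, and block length are arbitrary choices), the block expectation $E_P(\log\|w_{i_n} \cdots w_{i_1}\|)$ is estimated by averaging over a long run of the chain:

    import math
    import random

    # Monte Carlo check of the average-contractivity hypothesis of Theorem 2.1:
    # estimate E_P( log ||w_{i_n} ... w_{i_1}|| ) over blocks of length n taken from
    # a long run of the chain, started near stationarity by a burn-in.
    # For 1-D affine maps w_j(t) = slopes[j]*t + shift_j, ||w_j|| = |slopes[j]| and
    # the norm of a composition is the product of the slopes.

    slopes = [1.2, 0.4, 0.5]               # one map expands, two contract
    P = [[0.0, 0.5, 0.5],
         [0.3, 0.0, 0.7],
         [0.6, 0.4, 0.0]]

    def estimate(n=5, blocks=100000, seed=0):
        rng = random.Random(seed)
        state, total = 0, 0.0
        for _ in range(100):               # burn-in toward the stationary chain
            state = rng.choices(range(len(P)), weights=P[state])[0]
        for _ in range(blocks):            # consecutive blocks of length n
            log_norm = 0.0
            for _ in range(n):
                log_norm += math.log(slopes[state])
                state = rng.choices(range(len(P)), weights=P[state])[0]
            total += log_norm
        return total / blocks

    if __name__ == "__main__":
        # a clearly negative value is consistent with the hypothesis of Theorem 2.1
        print(estimate())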
Proof. (i) First note that since the distribution of $(i_1, \ldots, i_n)$ under $P$ is the same as $(i_n, \ldots, i_1)$ under $Q$, as pointed out above, we have $E_Q \log \|w_{i_1} \cdots w_{i_n}\| < 0$ also. Since $(i_1, i_2, \ldots)$ is a stationary ergodic process under $P$ and $Q$, the proof of the Furstenberg-Kesten theorem given on p. 40 of [Kr] shows that this is equivalent to
$$\lim_{n \to \infty} \frac{1}{n} \log \|w_{i_n} \cdots w_{i_1}\| = -a, \qquad P\text{-a.s.},$$
and to
$$\lim_{n \to \infty} \frac{1}{n} \log \|w_{i_1} \cdots w_{i_n}\| = -a, \qquad Q\text{-a.s.},$$
where $a > 0$ ($-a$ is the Lyapunov exponent). (The proof in [Kr] refers to linear maps, but there is no change in the proof needed for our case, or for reversing the order.)
For the remainder of the proof of (i), we borrow the more elegant method of [BA], rather than the earlier proof of [BE]. Fix $x$. Now
$$d(w_{i_1} \cdots w_{i_n} x,\; w_{i_1} \cdots w_{i_{n+1}} x) \le \|w_{i_1} \cdots w_{i_n}\|\, C(x),$$
where $C(x) = \max_{1 \le i \le N} d(w_i x, x)$. For $Q$-a.a. $\mathbf{i}$, we may choose $n_0$ (depending on $\mathbf{i}$) so that $n \ge n_0 \Rightarrow \|w_{i_1} \cdots w_{i_n}\| < e^{-na/2}$. Thus
$$\sum_{n=1}^{\infty} d(w_{i_1} \cdots w_{i_n} x,\; w_{i_1} \cdots w_{i_{n+1}} x) < \infty,$$
so $w_{i_1} \cdots w_{i_n} x$ is Cauchy and converges to, say, $Y(\mathbf{i})$, for $Q$-a.a. $\mathbf{i}$. Furthermore, $d(w_{i_1} \cdots w_{i_n} x, w_{i_1} \cdots w_{i_n} y) \le \|w_{i_1} \cdots w_{i_n}\|\, d(x, y) \to 0$ $Q$-a.s., so $Y$ does not depend on $x$.
(ii) Let $\tilde\nu$ be any probability measure on $\bar X$ satisfying $\tilde\nu(X \times \{i\}) = m_i$, $i = 1, \ldots, N$. Then if the Markov process is given initial distribution $\tilde\nu$, $\mathbf{i}$ will have distribution $P$, since $i_0$ will have marginal distribution $(m_i)$. For each $j$, let
$$\nu_j(B) = \frac{\tilde\nu(B \times \{j\})}{\tilde\nu(X \times \{j\})}$$
(note $\tilde\nu(X \times \{j\}) = m_j > 0$ for all $j$). This is the conditional distribution of $Z_0$ given $i_0 = j$. Thus for all $f \in C(\bar X)$,
$$E f(\bar Z_n^{\tilde\nu}) = \int\!\!\int f(w_{i_n} \cdots w_{i_1} x, i_n)\, d\nu_{i_0}(x)\, dP(\mathbf{i}) = \int\!\!\int f(w_{i_1} \cdots w_{i_n} x, i_1)\, d\nu_{i_{n+1}}(x)\, dQ(\mathbf{i}).$$
Now fix $x_0$. For $Q$-a.a. $\mathbf{i}$, $d(w_{i_1} \cdots w_{i_n} x, w_{i_1} \cdots w_{i_n} x_0) \to 0$ for every $x$, so
$$\int\!\!\int f(w_{i_1} \cdots w_{i_n} x, i_1)\, d\nu_{i_{n+1}}(x)\, dQ(\mathbf{i}) - \int\!\!\int f(w_{i_1} \cdots w_{i_n} x_0, i_1)\, d\nu_{i_{n+1}}(x)\, dQ(\mathbf{i}) \to 0.$$
But
$$\int\!\!\int f(w_{i_1} \cdots w_{i_n} x_0, i_1)\, d\nu_{i_{n+1}}(x)\, dQ(\mathbf{i}) = \int f(w_{i_1} \cdots w_{i_n} x_0, i_1)\, dQ(\mathbf{i}) \to \int f(Y, i_1)\, dQ = \int f\, d\tilde\mu.$$
This shows $\bar Z_n^{\tilde\nu}$ converges in distribution to $\tilde\mu$.
It remains to show that $\tilde\mu$ is a stationary initial distribution, and is unique. Let $\tilde\nu$ be any stationary initial distribution. Then $\tilde\nu(X \times \{i\}) = m_i$, $i = 1, \ldots, N$, since $i_0$ must have marginal distribution $(m_i)$ in order that $(i_n)$ be stationary, since the chain is irreducible. For $f \in C(\bar X)$, let $Tf(\bar x) = E f(\bar Z_1^{\bar x})$, where $\bar Z_n^{\bar x}$ is the Markov process with $\bar Z_0^{\bar x} = \bar x$; this is the usual Markov operator on $C(\bar X)$. The adjoint $T^*$ restricted to Borel measures has the following interpretation: if $\tilde\nu$ is the distribution of $\bar Z_0$, then $T^* \tilde\nu$ is the distribution of $\bar Z_1$, and $T^{*n} \tilde\nu$ is the distribution of $\bar Z_n$. Thus from what was just shown, if $\tilde\nu$ is a stationary initial distribution, $\tilde\nu = T^{*n} \tilde\nu \to_{w^*} \tilde\mu$, so $\tilde\mu$ is the only possible stationary initial distribution. Furthermore, if $\tilde\nu$ is any distribution satisfying $\tilde\nu(X \times \{i\}) = m_i$ for all $i$, then $T^* \tilde\nu$ satisfies this condition also, since $T^* \tilde\nu(X \times \{i\})$ is just the marginal distribution of $i_1$, which is $(m_i)$ since $i_0$ was given the stationary initial distribution $(m_i)$. Thus choosing $\tilde\nu$ to be, for example, $\delta_x \times (m_i)$, we have $T^{*n} \tilde\nu \to_{w^*} \tilde\mu$, and $T^*(T^{*n} \tilde\nu) \to_{w^*} T^* \tilde\mu$ since $T^*$ is $w^*$-$w^*$ continuous; but $T^*(T^{*n} \tilde\nu) = T^{*n}(T^* \tilde\nu) \to_{w^*} \tilde\mu$ also, so $T^* \tilde\mu = \tilde\mu$, so $\tilde\mu$ is stationary.
(iii) This is the only place where the assumption that $X$ is separable and locally compact is needed.

Now $\bar Z_n^{\tilde\mu}$ is an ergodic stationary process, since $\tilde\mu$ is the unique stationary initial distribution (see Lemma 1 of [E]). Now let $f \in C_c(X)$, the continuous functions on $X$ with compact support. By the classical pointwise ergodic theorem, with probability one,
$$\frac{1}{n} \sum_{k=1}^{n} f(\bar Z_k^{\tilde\mu}) \to \int f\, d\mu$$
(we consider $f$ to be defined on $\bar X$ by $f(x, i) = f(x)$). But $d(Z_n^{\tilde\mu}, Z_n^x) = d(w_{i_n} \cdots w_{i_1} Z_0, w_{i_n} \cdots w_{i_1} x) \le \|w_{i_n} \cdots w_{i_1}\|\, d(Z_0, x) \to 0$ with probability one, where $Z_n^x$ is the Markov process with $Z_0^x = x$ and $i_0$ distributed as $(m_i)$. Since $f$ is uniformly continuous, we get
$$\frac{1}{n} \sum_{k=1}^{n} f(Z_k^x) \to \int f\, d\mu$$
also, with probability one. Since $C_c(X)$ is separable, we can get this for all $f \in C_c(X)$ simultaneously, with probability one. Finally, a simple argument using Urysohn's lemma extends this to $C(X)$ (note $(Z_n^x)$ is tight since it converges in distribution).
(iv) That $x \in A$ implies that for any neighborhood of $x$, almost all trajectories visit the neighborhood infinitely often follows immediately from (iii), which says in fact that the proportion of visits is asymptotically positive.