PARTICLE METHODS FOR THE ESTIMATION OF CREDIT PORTFOLIO LOSS DISTRIBUTIONS

RENÉ CARMONA AND STÉPHANE CRÉPEY

Abstract. The goal of the paper is the numerical analysis of the performance of Monte Carlo simulation based methods for the computation of credit portfolio loss distributions in the context of Markovian intensity models of credit risk. We concentrate on two of the most frequently touted methods of variance reduction in the case of stochastic processes: importance sampling (IS) and interacting particle systems (IPS) based algorithms. Because the subtle differences between these methods are often misunderstood, as IPS is often regarded as a mere particular case of IS, we describe in detail the two kinds of algorithms, and we highlight their fundamental differences. We then proceed to a detailed comparative case study based on benchmark numerical experiments chosen for their popularity in quantitative finance circles.

(RC) Bendheim Center for Finance, ORFE, Princeton University, Princeton, NJ 08544, USA
(SC) Département de Mathématiques, Université d'Évry Val d'Essonne, 91025 Évry Cedex, France

Date: February 2009 (This Version).
Key words and phrases. Importance Sampling, Interacting Particle Systems, Rare Events, Credit Portfolios, Loss Distribution Estimation.
The first named author (RC) was partially supported by NSF-FRG-DMS-0455982 while the research of the second named author (SC) benefited from the support of the Europlace Institute of Finance and the "Chair Risque de Crédit" (Fédération Bancaire Française).

Contents
1. Introduction 2
2. Importance Sampling and Interacting Particle Systems 4
2.1. Point Process Set-Up 4
2.2. Importance Sampling for Markov Chains 5
2.3. Twisted Feynman-Kac Expectations 6
2.4. Interacting Particle Systems for the Computation of Rare Events 7
3. Benchmark Models 10
3.1. Local Intensity Model 10
3.2. Homogeneous Groups Model 11
4. Case of Independent Obligors 12
4.1. Algorithms 12
4.2.
Results for IS 15
4.3. Results for IPS 16
5. General Case with Default Contagion 16
5.1. Armageddon 18
5.2. Algorithms 19
5.3. Results for IS 20
5.4. Results for IPS 20
6. Summary and Conclusions 22
Appendix A. IS and IPS for Diffusion Processes 23
References 24

1. Introduction

The purpose of this paper is to study the performance of Monte Carlo methods for the computation of credit portfolio loss distributions, in the context of Markovian intensity models of credit risk. For simplicity, we assume homogeneity of individual losses given default, so that the portfolio loss can be identified with the number of defaults in the portfolio. Since most Monte Carlo methods can produce equally sharp estimates of the main parts of these distributions, they need to be tested and compared on their ability to estimate accurately the rare events, which typically requires variance reduction techniques. Denoting by $L_T$ the cumulative loss (the number of defaults according to our homogeneity assumption) at time $T$ of a credit portfolio, we are interested in the computation of the probabilities $p_\ell(T) = \mathbb{P}\{L_T = \ell\}$ for $\ell$ in $\{0,\cdots,n\}$, where $n$ is the number of names of the portfolio. The naive Monte Carlo estimate
(1) $$p_\ell^m(T) = \frac{\#\{j;\ 1 \le j \le m,\ L_T(\omega_j) = \ell\}}{m},$$
computed from a set of $m$ Monte Carlo samples $\omega_1,\cdots,\omega_m$, is given by the proportion of samples giving exactly $\ell$ losses. It suffers from several shortcomings. First, it is very unlikely that the Monte Carlo samples will reach the loss level $\ell$ when $\{L_T = \ell\}$ is a rare event. Furthermore, the variance of the estimator, which is equal to
$$\sigma_\ell^{2,m}(T) = \frac{1}{m}\,p_\ell(T)\bigl(1 - p_\ell(T)\bigr),$$
can be quite large. In the factor copulae models most commonly used in practice, the conditional loss distribution can be recovered analytically by various means, and the unconditional distribution can then be obtained by quadrature (numerical integration).
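Before turning to variance reduction, the naive estimator (1) can be sketched as follows. This is a toy illustration of ours, not the paper's model: the one-period sampler with independent obligors, the function names, and all parameter values are assumptions for the sketch.

```python
import random

random.seed(0)

def sample_loss(n=125, p_default=0.02):
    """Toy one-period loss sampler: independent obligors, each defaulting
    with probability p_default (a stand-in for a path simulation of L_T)."""
    return sum(random.random() < p_default for _ in range(n))

def naive_mc_loss_distribution(m=10_000, n=125):
    """Naive estimate (1): p_ell^m(T) = #{j : L_T(omega_j) = ell} / m."""
    counts = [0] * (n + 1)
    for _ in range(m):
        counts[sample_loss(n)] += 1
    return [c / m for c in counts]

probs = naive_mc_loss_distribution()
# For ell far in the tail the estimate is typically exactly 0 even though
# p_ell(T) > 0 -- the shortcoming that motivates IS and IPS below.
print(sum(probs), probs[:6], probs[20])
```

The bulk of the distribution (here centered near $125 \times 0.02 = 2.5$ defaults) is estimated sharply, while tail levels such as $\ell = 20$ receive no samples at all.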
As for Monte Carlo calculations, generating a direct sample of the unconditional distribution is typically too slow. However, it is possible to apply various variance reduction techniques to the simulation of the conditional distribution, possibly in conjunction with a shift on the common factor (see for example Glasserman et al. [7, 9, 8]). Note that for the specific purpose of valuing credit portfolio loss derivatives by simulation, various direct (unconditional) variance reduction techniques are available, among them the use of the portfolio loss at maturity as a control variate.

In this paper we consider Markovian intensity models of credit risk, such as those used for pricing purposes in Frey and Backhaus [5], Bielecki et al. [2], or Herbertsson [10]. Financial applications can be divided into two categories depending on whether one is interested in pricing or risk management. Accordingly, probability models are organized into risk-neutral and statistical models. Some of the interesting features of Markovian intensity models of credit risk are:
∙ to provide a consistent pricing and hedging theory, as opposed to the abovementioned static models, which are essentially pricing tools (even if they can be used for hedging in an ad-hoc manner);
∙ to account for contagion effects, which play an important role in pricing, especially since the start of the sub-prime crisis.
We shall see in Section 5 the potential interest of IPS with respect to IS in relation to this second bullet point. But pricing is not the only application of these models. Indeed, they can also be used for the purpose of the thorny risk management issue highlighted by the credit crisis. Note that the probabilities of the rare events of interest are significantly smaller under the statistical measure than under the risk-neutral probability, for which contagion effects are most prevalent.
Consequently, methods such as those presented in this paper are especially important in that context.

In Markovian models, one should be able to compute the loss distribution by numerical resolution of the forward Kolmogorov equations. However, practical implementation of deterministic numerical schemes is precluded by the curse of dimensionality for models of dimension greater than a few units. Simulation approaches appear as the only reasonable alternative. Unfortunately, simulation methods are typically slow, and their accuracy depends strongly on the control of the variances of the estimates (see, e.g., Glasserman [6] for a general reference on these issues).

Importance sampling (IS for short) is often regarded as the method of choice when it comes to variance reduction. In this paper we show how explicit changes of measures can be implemented in Markovian models of portfolio credit risk, and we document the efficiency of these IS estimators. However, we argue that IS does not always make sense. Sometimes, it is not clear which change of measure will significantly reduce the variance, and moreover, producing Monte Carlo random samples after the change of measure can be impractical if at all possible.

We thus present an alternative variance reduction method based on the properties of twisted Feynman-Kac expectations and the approximation of their values by interacting particle systems (IPS for short), also called in this paper implicit importance sampling as opposed to the previous explicit importance sampling. The use of IPS for the computation of the probabilities of rare events was introduced by Del Moral and Garnier in [14]. We refer the reader to Del Moral's account [13] for the general theory in a textbook form. An important feature of IPS estimates is that they can be computed by generating Monte Carlo samples from the original model distribution $\mathbb{P}$ rather than from the changed measure as in the case of importance sampling.
In many cases, this represents an important advantage of IPS over IS. Indeed, in many practical applications, large programs are used to generate Monte Carlo samples. These programs have been developed based on complex mathematical models and extensive historical data bases (see for example the Standard & Poor's documentation on CPDOs [15]). The level of intricacy of their implementation files precludes any change of measure and the generation of Monte Carlo samples from a different distribution. They need to be used as is, as a black box. IPS algorithms can be used without knowing what is in the black boxes. This is a major advantage of IPS over IS in this realm of applications.

Remark. A first application of IPS to the computation of credit portfolio losses and CDO tranche spreads was considered by Carmona, Fouque and Vestal in [3] in the framework of structural models of credit risk. In the present paper we consider the application of IS and IPS to the computation of credit portfolio loss distributions in a Markovian intensity model.

2. Importance Sampling and Interacting Particle Systems

2.1. Point Process Set-Up. Our goal is to compute the marginal probability distributions of a (continuous time) point process which we denote by $\{L_t\}_{t \ge 0}$, in analogy with the notation used for the loss process (number of defaults) of a credit portfolio in the introduction. We assume that at each time $t$, the value $L_t$ of the point process can be derived from the values of a finite set of factors, say $x = (x^1,\cdots,x^d)$, whose time evolution can be described in discrete time by a Markov chain. In other words, we are interested in the evaluation of the probabilities of events relative to a (possibly time inhomogeneous) Markov chain $\{X_n\}_n$, the background factor $X_n$ being a random element of a general measurable space $(E_n,\mathcal{E}_n)$ which can change with $n$. We denote by $K_n(x_{n-1},dx_n)$ the transition kernel of the background Markov chain at time $n$, and we denote by $\{Y_n\}_n$ the historical process of $\{X_n\}_n$ defined by:
$$Y_n = (X_0,\cdots,X_n) \in F_n = E_0 \times \cdots \times E_n.$$
Next, we let $M_n(y_{n-1},dy_n)$ denote the Markov transition kernels of the inhomogeneous Markov chain $\{Y_n\}_n$. Finally, for each integer $n \ge 0$ we consider non-negative measurable functions $w_n$ on $F_n$ equipped with the product $\sigma$-field, and we interpret these functions as weight functions. Note that for IPS it is assumed in [14, 13] that these weight functions are bounded and bounded away from zero in the sense that:
(2) $$\sup_{(y_n,y_n') \in F_n \times F_n} \frac{w_n(y_n)}{w_n(y_n')} < \infty.$$
This restrictive boundedness assumption can be relaxed in many cases. However, we only mention it for the sake of completeness since we are not interested in mathematical proofs of convergence results. Indeed, when in need of convergence results for the IPS algorithm, we directly appeal to [14].

Both IS and IPS computations rely on changes from the original probability measure to absolutely continuous twisted measures, in order to compute quantities of the form
$$\mathbb{E}\{f_n(Y_n)\}$$
for suitable functions $f_n$ (like, for instance, the indicator function of some "rare" set of interest). Yet, despite this similarity, the two methods differ in very significant ways.

2.2. Importance Sampling for Markov Chains. Importance Sampling is especially useful when not enough Monte Carlo samples from the original distribution contribute significantly to the computation of the desired probabilities or expectations. In this case, typical estimates are unstable and carry higher variance than desired.
A simple fix could be to use random samples from a probability distribution obtained by deformation of the original measure in such a way that more Monte Carlo samples contribute significantly, and to correct the naive estimator (1) for the fact that the samples were generated from a different distribution.

Importance Sampling is well known in the context of random variables, or (see Appendix A) in the context of diffusions. Let us specialize the description of the method to the case of interest in this paper. Given a set of weight functions $w_n$ as before, for each fixed $n$, we let $\tilde{\mathbb{P}}_n$ stand for the twisted probability measure defined by its density with respect to the original measure $\mathbb{P}$:
(3) $$\frac{d\tilde{\mathbb{P}}_n}{d\mathbb{P}} = \frac{\prod_{1 \le i \le n} w_i(Y_i)}{\mathbb{E}\{\prod_{1 \le i \le n} w_i(Y_i)\}}.$$
We then have:
$$\mathbb{E}\{f_n(Y_n)\} = \tilde{\mathbb{E}}\Big\{f_n(Y_n)\,\frac{d\mathbb{P}}{d\tilde{\mathbb{P}}_n}\Big\}$$
(4) $$= \tilde{\mathbb{E}}\Big\{f_n(Y_n)\prod_{1 \le i \le n} w_i^-(Y_i)\Big\}\,\mathbb{E}\Big\{\prod_{1 \le i \le n} w_i(Y_i)\Big\},$$
where we used the notation $w_i^-$ for the inverse $1/w_i$ of the $i$-th weight function. Assuming $\mathbb{E}\{w_i(Y_i)\,|\,\mathcal{F}^X_{i-1}\} = 1$ for every $i \ge 1$, we see that the second expectation in the right hand side is equal to one and that (4) further reduces to
(5) $$\mathbb{E}\{f_n(Y_n)\} = \tilde{\mathbb{E}}\Big\{f_n(Y_n)\prod_{1 \le i \le n} w_i^-(Y_i)\Big\}.$$
For the computation of $\mathbb{E}\{f_n(Y_n)\}$, the importance sampling algorithm based on $\tilde{\mathbb{P}}_n$ relies on the generation of $m$ Monte Carlo samples $\tilde{\zeta}^j_n = (\tilde{\xi}^j_0,\tilde{\xi}^j_1,\cdots,\tilde{\xi}^j_n)$ for $j = 1,2,\cdots,m$ under the twisted distribution $\tilde{\mathbb{P}}_n$. These samples are generated as follows. For each $j$, and given a fixed initial condition $\tilde{\xi}_0$ for $X_0$, for each $i = 1,\cdots,n$, use the transition kernel $\tilde{K}_i$ of the Markov chain $X$ at time $i-1$ under the twisted probability measure $\tilde{\mathbb{P}}$ to generate a sample $\tilde{\xi}^j_i$ from $\tilde{\xi}^j_{i-1}$. Then for each fixed Monte Carlo sample of size $m$, we have the following unbiased estimate:
(6) $$\mathbb{E}\{f_n(Y_n)\} \approx \mathbb{E}^m_n\Big\{f_n(\tilde{\zeta}_n)\prod_{1 \le i \le n} w_i^-(\tilde{\xi}_0,\cdots,\tilde{\xi}_i)\Big\},$$
where $\mathbb{E}^m_n$ refers to expectation with respect to the empirical distribution of the $m$ samples $\tilde{\zeta}^j_n$.
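As an illustration of the scheme above, the following toy sketch (our own example, not the paper's credit model) implements the IS estimator (6) for a standard Gaussian random walk, for which the twisted kernel and the weights are explicit: tilting each increment to $N(\theta,1)$ corresponds to the weights $w_i(Y_i) = \exp(\theta(X_i - X_{i-1}) - \theta^2/2)$, which satisfy $\mathbb{E}\{w_i(Y_i)\,|\,\mathcal{F}^X_{i-1}\} = 1$, and the inverse weights in (6) telescope. All function names and parameter values are ours.

```python
import math
import random

random.seed(0)

def is_rare_event_prob(n=20, a=15.0, m=100_000, theta=0.75):
    """IS estimate of P(X_n >= a) for the walk X_i = X_{i-1} + eps_i,
    eps_i ~ N(0, 1), X_0 = 0.  Samples are drawn under the twisted kernel
    K~_i(x, .) = N(x + theta, 1); the inverse weights prod_i w_i^-(Y_i)
    telescope to exp(-theta * X_n + n * theta**2 / 2)."""
    total = 0.0
    for _ in range(m):
        x_n = sum(random.gauss(theta, 1.0) for _ in range(n))  # X_n under P~
        if x_n >= a:
            total += math.exp(-theta * x_n + 0.5 * n * theta**2)
    return total / m

# exact value: P(N(0, 20) >= 15) = 0.5 * erfc(15 / sqrt(20) / sqrt(2)) ~ 4e-4
print(is_rare_event_prob())
```

Choosing $\theta = a/n$ makes the mean of the twisted walk end exactly at the threshold, the standard exponential-tilting heuristic; plain Monte Carlo of the same size would estimate this probability from only a few dozen hits, with a substantially larger relative error.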
For the record, we notice that this estimator is asymptotically consistent in the sense that it converges to $\mathbb{E}\{f_n(Y_n)\}$ as $m \to \infty$ (see for example Glasserman [6]).

Remark 1. Two conditions need to be satisfied for this change of measure to be practical. First, one needs to be able to compute the new transition kernel $\tilde{K}$ of the chain under $\tilde{\mathbb{P}}$, and most importantly, one needs to be able to generate efficiently Monte Carlo samples from the distribution $\tilde{K}(x_{n-1},dx_n)$. Second, one needs to be able to compute the distorted integrand appearing in the right hand side of (6), and in particular the product of the inverses of the weight functions, with a low overhead. As in the case of the applications given in this paper, this is typically done by resorting to a suitable version of the Girsanov theorem, which is the time-honored method to change measure for stochastic processes.

Remark 2. In order to minimize the variance, one should choose, for each integer $i \ge 1$, a weight function $w_i$ such that $\prod_{1 \le i \le n} w_i$ is proportional to $|f_n|$. But of course, in this case, the second expectation in the right hand side of (4) is typically unknown (not equal to 1), since it is the quantity that we aim at computing (at least for $f_n \ge 0$, $f_n = |f_n| = \prod_{1 \le i \le n} w_i$).

2.3. Twisted Feynman-Kac Expectations. The purpose of this subsection is to give a crash course on Feynman-Kac path measures. We describe their subsequent approximations by interacting particle systems in the next subsection. The basic material is borrowed from [13], and the actual application to the Monte Carlo computation of probabilities of rare events from [14]. Technical details are included for the sake of completeness and to ease the introduction of a specific set of notations.

For any bounded measurable function $f_n$ on $F_n$, we define the Feynman-Kac expectation $\gamma_n(f_n)$ by:
(7) $$\gamma_n(f_n) = \mathbb{E}\Big\{f_n(Y_n)\prod_{1 \le i < n} w_i(Y_i)\Big\}.$$
Note that as a non-negative linear form on a cone of non-negative functions, $\gamma_n$ can be viewed as a measure.
We shall denote by $\eta_n$ the corresponding normalized measure, which is naturally defined as
(8) $$\eta_n(f_n) = \frac{\mathbb{E}\{f_n(Y_n)\prod_{1 \le i < n} w_i(Y_i)\}}{\mathbb{E}\{\prod_{1 \le i < n} w_i(Y_i)\}} = \frac{\gamma_n(f_n)}{\gamma_n(1)}.$$
Notice that
(9) $$\gamma_{n+1}(1) = \gamma_n(w_n) = \eta_n(w_n)\,\gamma_n(1) = \prod_{i=1}^n \eta_i(w_i).$$
This seemingly innocent remark will play a crucial role in the following. Consequently, for any given bounded measurable function $f_n$, we have
(10) $$\gamma_n(f_n) = \eta_n(f_n)\prod_{1 \le i < n} \eta_i(w_i).$$
The above relationship has the merit of relating the un-normalized expectations in the left hand side to normalized twisted expectations in the right hand side. Using the notation $w_i^-$ for the inverse of the weight function $w_i$ introduced earlier, we have (to be compared with (4) in the case of IS):
$$\mathbb{E}\{f_n(Y_n)\} = \mathbb{E}\Big\{f_n(Y_n)\prod_{1 \le i < n} w_i^-(Y_i)\prod_{1 \le i < n} w_i(Y_i)\Big\} = \gamma_n\Big(f_n\prod_{1 \le i < n} w_i^-\Big)$$
(11) $$= \eta_n\Big(f_n\prod_{1 \le i < n} w_i^-\Big)\prod_{1 \le i < n} \eta_i(w_i).$$
This shows that expectations over the original process can be computed if one can compute normalized twisted expectations. This is in fact possible, in a dynamic way, because, as in classical filtering theory, it is easily checked that the sequence of normalized twisted probability measures forms a well defined dynamical system in the space of probability measures. Indeed we have
(12) $$\eta_n = \Phi_n(\eta_{n-1}), \qquad \eta_1 = M_1(X_0,\,\cdot\,),$$
where the nonlinear operators $\Phi_n$ giving the dynamics are defined as
(13) $$\Phi_n(\eta) = \frac{1}{\eta(w_{n-1})}\int_{F_{n-1}} \eta(dy_{n-1})\,w_{n-1}(y_{n-1})\,M_n(y_{n-1},\,\cdot\,).$$

2.4. Interacting Particle Systems for the Computation of Rare Events. The IPS method is based on the deformation of the successive transitions of the Markov chain by way of mutations and selections, in order to force the chain into the rare events of interest. Because the deformations of the chain can be understood mathematically, at least locally in time, as changes of measures, this strategy is reminiscent of classical importance sampling as described earlier.
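The measure dynamics (12)-(13) can be sanity-checked numerically. The toy script below is our own construction: a two-state chain with state-dependent weights $w_i(y_i) = g(x_i)$, a special case of the general path weights, with all values chosen for illustration. It evaluates $\eta_n(f)$ once by brute-force path enumeration from definition (8), and once by iterating the (marginalized) operators $\Phi_n$; the two computations agree.

```python
import itertools

K = [[0.9, 0.1],
     [0.3, 0.7]]              # transition kernel K(x, x') on states {0, 1}
g = [1.0, 4.0]                # weights w_i(y_i) = g(x_i), state-dependent
x0, n = 0, 6

# (a) direct: eta_n(f) = E[f(X_n) prod_{1<=i<n} g(X_i)] / E[prod_{1<=i<n} g(X_i)]
num = den = 0.0
for path in itertools.product(range(2), repeat=n):
    xs = (x0,) + path
    prob = 1.0
    for i in range(n):
        prob *= K[xs[i]][xs[i + 1]]
    weight = 1.0
    for i in range(1, n):                  # product over 1 <= i < n
        weight *= g[xs[i]]
    num += prob * weight * xs[-1]          # test function f(x) = x
    den += prob * weight
eta_direct = num / den

# (b) recursion (12)-(13): eta_1 = K(x0, .), then
#     eta_n(x') proportional to sum_x eta_{n-1}(x) g(x) K(x, x')
eta = list(K[x0])
for _ in range(2, n + 1):
    new = [sum(eta[x] * g[x] * K[x][xp] for x in range(2)) for xp in range(2)]
    z = sum(new)
    eta = [v / z for v in new]
eta_recursive = eta[1]                     # eta_n(f) with f(x) = x

print(eta_direct, eta_recursive)           # identical up to rounding
```

The direct computation costs $O(2^n)$ while the recursion costs $O(n)$, which is the point of working with the dynamical system (12) rather than with definition (8).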
However, as seen in the context of diffusion processes (see Appendix A), IPS involves Feynman-Kac changes of measures as in Section 2.3, whereas Girsanov-like changes of measure underlie IS. As we already emphasized in the introduction, this implies the important practical difference that while the Monte Carlo samples of an importance sampling computation are generated from the twisted distribution, the Monte Carlo samples used in an IPS Monte Carlo computation are generated under the original distribution of the chain. As we shall see in detail now, all we need in order to implement the IPS Monte Carlo computations is a black box capable of generating Monte Carlo samples from the distribution of the chain: no need to open the box to perform the changes necessary to generate samples from the twisted distribution!

2.4.1. Model Simulation. For the purpose of numerical computations and Monte Carlo estimation of expectations of the form (11), we introduce approximations of the above probability distributions by convex combinations of Dirac measures, and we show that the time evolution of the measures $\eta_n$ given by the dynamical system (12) and (13) implies a natural time evolution for the point masses of the convex combinations of Dirac measures, hence the interpretation of these approximations as an interacting particle system.

We choose a ('large') integer $m$ which we shall interpret as the number of particles. A particle at time $n$ is an element
$$\zeta^j_n = (\xi^j_{0,n},\xi^j_{1,n},\cdots,\xi^j_{n,n}) \in F_n = E_0 \times E_1 \times \cdots \times E_n,$$
where the superscript $j$ of the particle ranges from 1 to $m$. We start with an initial configuration $\zeta_1 = (\zeta^j_1)_{1 \le j \le m}$ that consists of $m$ independent and identically distributed random samples from the distribution
$$\eta_1(d(x_0,x_1)) = M_1(X_0,d(x_0,x_1)) = \delta_{X_0}(dx_0)\,K_1(x_0,dx_1),$$
where we use the notation $\delta_x$ for the Dirac measure at the point $x$. In other words, the $\zeta^j_1 = (\xi^j_{0,1},\xi^j_{1,1}) = (X_0,\xi^j_{1,1}) \in F_1 = E_0 \times E_1$ are such that the $\xi^j_{1,1}$ are all independent with the same distribution $K_1(X_0,\,\cdot\,)$. Based on the transition given by the dynamic equation (13), we define the transition $\zeta_{n-1} \to \zeta_n$, from $F_{n-1}^m$ into $F_n^m$, for the particles which provide the approximation of $\eta_n$, according to the transition probability
(14) $$\mathbb{P}\{\zeta_n \in d(y^1_n,\cdots,y^m_n)\,|\,\zeta_{n-1}\} = \prod_{j=1}^m \Phi_n(\rho(\zeta_{n-1}))(dy^j_n),$$
where $\rho(\zeta_{n-1})$ is the empirical measure defined by
$$\rho(\zeta_{n-1}) = \frac{1}{m}\sum_{j=1}^m \delta_{\zeta^j_{n-1}}$$
and $d(y^1_n,\cdots,y^m_n)$ represents an infinitesimal neighborhood of the point $(y^1_n,\cdots,y^m_n)$ in $F_n^m$. Recalling the definition of the operators $\Phi_n$ giving the dynamics of the $\eta_n$, one sees that (14) is the superposition of two clearly identifiable elementary transitions, a selection followed by a mutation. In other words:
$$F_{n-1}^m \ni \zeta_{n-1} \xrightarrow{\ \text{selection}\ } \hat{\zeta}_{n-1} \in F_{n-1}^m \xrightarrow{\ \text{mutation}\ } \zeta_n \in F_n^m$$
as follows. The selection stage is performed by resampling with replacement (i.e. choosing independently) $m$ path-particles
$$\hat{\zeta}^j_{n-1} = (\hat{\xi}^j_{0,n-1},\hat{\xi}^j_{1,n-1},\cdots,\hat{\xi}^j_{n-1,n-1}) \in F_{n-1},$$
with possible repetitions, according to the Gibbs measure
(15) $$\sum_{j=1}^m \frac{w_{n-1}(\zeta^j_{n-1})}{\sum_{l=1}^m w_{n-1}(\zeta^l_{n-1})}\,\delta_{\zeta^j_{n-1}}.$$
Then, the mutation stage is performed by extending each selected path-particle $\hat{\zeta}^j_{n-1}$ into a path-particle $\zeta^j_n \in F_n = F_{n-1} \times E_n$ of the form
$$\zeta^j_n = (\xi^j_{0,n},\xi^j_{1,n},\cdots,\xi^j_{n,n}) = (\hat{\xi}^j_{0,n-1},\hat{\xi}^j_{1,n-1},\cdots,\hat{\xi}^j_{n-1,n-1},\xi^j_{n,n}),$$
where the $m$ samples $\xi^j_{n,n}$ are independently drawn from the distributions $K_n(\hat{\xi}^j_{n-1,n-1},\,\cdot\,)$.

2.4.2. Convergence Results. We are now in a position to quote the theoretical result on which the Monte Carlo approximations are based. See for example [13] or [14] for details. For each fixed $n$ we have
$$\lim_{m \to \infty} \eta^m_n = \eta_n$$
in distribution, where the empirical measures $\eta^m_n$ are defined by:
$$\eta^m_n = \frac{1}{m}\sum_{j=1}^m \delta_{\zeta^j_n}.$$
This result is screaming for the introduction of the particle approximation
$$\gamma^m_n(f_n) = \eta^m_n(f_n)\prod_{1 \le i < n} \eta^m_i(w_i)$$
for $\gamma_n(f_n)$. The main result of [14] which we use below states that $\gamma^m_n$ is an unbiased estimator of $\gamma_n$, in the sense that for any integer $p \ge 1$ and any bounded measurable function $f_n$ on $F_n$ with $\sup|f_n| \le 1$, we have
$$\mathbb{E}\{\gamma^m_n(f_n)\} = \gamma_n(f_n),$$
and in addition
$$\sup_{m \ge 1} \sqrt{m}\,\mathbb{E}\{|\gamma^m_n(f_n) - \gamma_n(f_n)|^p\}^{1/p} \le c_p(n),$$
for some positive constant $c_p(n) < \infty$ whose value does not depend upon the particular choice of the function $f_n$.

In view of (11), we thus get the following unbiased (for fixed $m$) and asymptotically convergent (as $m \to \infty$) estimate of $\mathbb{E}\{f_n(Y_n)\}$:
$$\gamma^m_n\Big(f_n\prod_{1 \le i < n} w_i^-\Big) = \eta^m_n\Big(f_n\prod_{1 \le i < n} w_i^-\Big)\prod_{1 \le i < n} \eta^m_i(w_i)$$
(16) $$= \mathbb{E}^m_n\Big\{f_n(\zeta_n)\prod_{1 \le i < n} w_i^-(\xi_{0,n},\cdots,\xi_{i,n})\Big\}\prod_{1 \le i < n} \mathbb{E}^m_i\{w_i(\zeta_i)\},$$
where as before, for every $i$ the notation $\mathbb{E}^m_i$ refers to expectation under the empirical distribution defined by the $\zeta^j_i$'s. More importantly, the variance of the estimator can be analyzed. In case the expectation of interest is tantamount to the probability of a "rare" event
(17) $$\mathbb{E}\{f_n(Y_n)\} = \mathbb{E}\{\mathbf{1}_A(V(Y_n))\} = \mathbb{P}\{V(Y_n) \in A\},$$
where $V$ is a function from $E_n$ to $\mathbb{R}$, the conclusion is that in order to minimize this variance, one should use weight functions $w$ favoring the occurrence of the rare event without involving too large normalizing constants. Moreover, the choice of $w$ should give rise to an algorithm that can be easily implemented. We use the guidelines of this and the previous subsection for rare events of the form $\{V(X_n) = \ell\}$ (since our aim is to estimate portfolio loss distributions at fixed points in time; it is interesting to note, however, that IPS estimates are also available for more general 'path-dependent' events as in (17)).
3. Benchmark Models

We now introduce the family of Markovian intensity models of credit risk which we use to illustrate and compare explicit implementations of IS and IPS. Even though the basic principle of IS does not require the Markov property, we saw that IPS algorithms are based on a background Markov chain. For this reason, we described IS in Section 2.2 with a definite Markovian bent, and the choice of the models used as implementation test-beds is based on the specification of suitable Markovian factors.

3.1. Local Intensity Model. Within the context of the Top-Down approach to credit risk, we assume that the cumulative loss process $L = \{L_t;\ t \ge 0\}$ of a credit portfolio of $n$ names (we can think of $n = 125$) is modeled as a Markov point process. More specifically, we assume that $L_0 = 0$ and that $L$ is a pure birth process with local intensity $\lambda(t,L_t)$ given by a deterministic function $\{\lambda(t,i)\}_{t \ge 0,\,i \ge 0}$ satisfying $\lambda(t,i) = 0$ for $i \ge n$ (see, for instance, [12, 2, 4]). This last condition guarantees that the process $L$ is actually stopped at the level $n$, as there are only $n$ names in the pool. In any case, conditionally on the information $\mathcal{F}_t = \mathcal{F}^L_t$ available at time $t$, up to first order in $dt$, the probability of a jump in the infinitesimal time interval $(t,t+dt)$ is given by $\lambda(t,L_t)\,dt$.

The infinitesimal generator $\mathcal{G}_t$ of the process is given by the $(n+1) \times (n+1)$ matrices
$$\mathcal{G}_t = \begin{pmatrix} -\lambda(t,0) & \lambda(t,0) & 0 & \cdots & 0 & 0 \\ 0 & -\lambda(t,1) & \lambda(t,1) & \cdots & 0 & 0 \\ & & & \cdots & & \\ 0 & 0 & 0 & \cdots & -\lambda(t,n-1) & \lambda(t,n-1) \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}.$$
For each time $t$, we denote by $p(t)$ the $(n+1)$-dimensional vector $p(t) = [p_i(t)]_{i=0,1,\cdots,n}$ of loss probabilities $p_i(t) = \mathbb{P}\{L_t = i\}$. They satisfy the forward Kolmogorov equation:
(18) $$(\partial_t - \mathcal{G}^*_t)\,p = 0 \ \text{ on } (0,+\infty), \qquad p(0) = \delta_0,$$
which is in fact a system of ordinary differential equations. Here and throughout the paper, $\mathcal{G}^*$ denotes the adjoint (or transpose) of a generic matrix $\mathcal{G}$. In the time-homogeneous case (when the intensities $\lambda(t,i)$ do not depend upon $t$, in which case
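To make the preceding sections concrete, here is a minimal sketch (ours, not the paper's implementation) of the IPS selection/mutation loop of Section 2.4.1 applied to the local intensity model of Section 3.1, estimating a tail probability $\mathbb{P}\{L_T \ge \ell\}$. We use increment weights $w_k(Y_k) = \exp(\alpha(X_k - X_{k-1}))$, one natural choice of potential in the spirit of [14]; the intensity function $\lambda$ and every parameter value are illustrative. Because these weights and the inverse-weight product in (16) only involve the last increment (the product telescopes to $\exp(-\alpha(X_{n-1} - X_0))$), each particle only needs to carry its current and previous loss levels.

```python
import math
import random

random.seed(1)

def birth_step(x, dt, lam, n_names):
    """One period of the pure birth loss process: exponential holding
    times with local intensity lam(x), capped at n_names defaults."""
    t = 0.0
    while x < n_names and lam(x) > 0.0:
        t += random.expovariate(lam(x))
        if t > dt:
            break
        x += 1
    return x

def ips_tail_prob(ell, T=1.0, n_steps=10, m=5000, alpha=0.8,
                  n_names=125, lam=lambda i: 0.5 * (1.0 + i)):
    """IPS estimate of P(L_T >= ell) in the local intensity model with
    increment potential w_k = exp(alpha * (X_k - X_{k-1}))."""
    dt = T / n_steps
    prev = [0] * m                               # each particle's X_{k-1}
    cur = [birth_step(0, dt, lam, n_names) for _ in range(m)]
    log_norm = 0.0                               # log prod_{i<n} eta_i^m(w_i)
    for _ in range(2, n_steps + 1):
        w = [math.exp(alpha * (c - p)) for c, p in zip(cur, prev)]
        log_norm += math.log(sum(w) / m)         # eta_{k-1}^m(w_{k-1})
        idx = random.choices(range(m), weights=w, k=m)   # selection (15)
        prev = [cur[j] for j in idx]
        cur = [birth_step(x, dt, lam, n_names) for x in prev]  # mutation
    # estimator (16): inverse weights telescope to exp(-alpha * X_{n-1})
    est = sum(math.exp(-alpha * p) for c, p in zip(cur, prev) if c >= ell)
    return math.exp(log_norm) * est / m
```

For $\alpha = 0$ the weights are constant and the scheme degenerates to plain Monte Carlo (the selection stage resamples uniformly); increasing $\alpha$ pushes particles toward high-loss regions while the correction in (16) keeps the estimator unbiased, all without ever simulating the loss process under a twisted kernel.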