IEEE COMMUNICATIONS LETTERS
arXiv:1701.01938v1 [cs.IT] 8 Jan 2017

Polar Coding for the Binary Erasure Channel with Deletions

Eldho K. Thomas, Vincent Y. F. Tan, Senior Member, IEEE, Alexander Vardy, Fellow, IEEE, and Mehul Motani, Senior Member, IEEE

Abstract: We study the application of polar codes in deletion channels by analyzing the cascade of a binary erasure channel (BEC) and a deletion channel. We show how polar codes can be used effectively on a BEC with a single deletion, and propose a list decoding algorithm with a cyclic redundancy check for this case. The decoding complexity is $O(N^2 \log N)$, where $N$ is the blocklength of the code. An important contribution is an optimization of the amount of redundancy added to minimize the overall error probability. Our theoretical results are corroborated by numerical simulations which show that the list size can be reduced to one and the original message can be recovered with high probability as the length of the code grows.

Index Terms: Polar codes, deletions, binary erasure channel, cascade, list decoding, cyclic redundancy check, candidate set

E. K. Thomas is with the Institute of Computer Science, University of Tartu, Estonia, 51014 (email: eldhokt@gmail.com). V. Y. F. Tan and M. Motani are with the Department of Electrical & Computer Engineering, National University of Singapore, Singapore 117583 (emails: vtan@nus.edu.sg, motani@nus.edu.sg). A. Vardy is with the Department of Electrical & Computer Engineering, University of California San Diego, La Jolla, CA 92093, USA, and the School of Physical & Mathematical Sciences, Nanyang Technological University, Singapore 637371 (email: avardy@ucsd.edu).
This work is partially funded by a Singapore Ministry of Education (MoE) Tier 2 grant (R-263-000-B61-112).

I. INTRODUCTION

Polar codes, invented by Arıkan [1], are the first provably capacity-achieving codes with low encoding and decoding complexity. Arıkan's presentation of polar codes includes a successive cancellation decoding algorithm, which generally does not perform as well as the state-of-the-art error-correcting codes at finite block lengths [2]. To improve the performance of polar codes, Tal and Vardy [3] devised a list decoding algorithm. The initial work of Arıkan considers binary symmetric memoryless channels. There have been attempts to study polar codes for other channels, e.g., the AWGN channel [4]. However, there are not many constructions of polar codes for channels with memory; see [5] and references therein.

The deletion channel is a canonical example of a non-stationary, non-ergodic channel with memory. It deletes symbols arbitrarily and the positions of the deletions are unknown to the receiver. A survey by Mitzenmacher [6] discusses the major developments in the understanding of deletion channels in greater detail. To date, the Shannon capacity of deletion channels, in general, remains unknown. However, there have been attempts to find upper and lower bounds on the capacity of deletion channels [7], [8].

Our motivation is partly the work of Dolecek and Anantharam [9], in which the run length properties of Reed-Muller (RM) codes were exploited to correct a certain number of substitutions together with a single deletion; our work involves correcting erasures rather than substitutions. RM codes and polar codes have similar algebraic structures and therefore polar codes are also potential candidates for correcting single deletions. However, they cannot be used directly on deletion channels since the polarization of a channel with memory has not been well-studied. Developing polarization techniques for deletion channels is beyond the scope of this study. Instead, motivated by decoders that are possibly defective and delete symbols arbitrarily, we consider polar codes over a binary erasure channel (BEC) and an adversarial version of the deletion channel with one deletion, and provide a list decoding algorithm to successfully recover the original message with high probability^1 (w.h.p.). Unlike RM codes, polar codes do not have rich run length properties. Instead, we use the successive cancellation algorithm [1] for decoding. In addition, we provide a detailed analysis of the error probability, which was lacking in [9]. Channel cascades were studied previously in [10] but our model has not been previously considered in the literature. We argue that the capacity of the cascade can be achieved; in contrast, [9] does not discuss capacity issues.

^1 In this letter, we use the term w.h.p. to mean with probability tending to 1 as the blocklength of the code $N$ tends to infinity.

II. PRELIMINARIES

A. Polar Codes

We consider polar codes of length $N = 2^n$ constructed recursively from the kernel $G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$. Given an information vector (message) $u_1^N = (u_1, \ldots, u_N)$ where $u_i \in \mathbb{F}_2$, a codeword $x_1^N$ is generated using the relation $x_1^N = u_1^N B_N G_2^{\otimes n}$, where $G_2^{\otimes n}$ is the $n$-th Kronecker product of $G_2$ and $B_N$ is a bit-reversal permutation matrix, defined explicitly in [1]. The vector $x_1^N$ is transmitted through $N$ independent copies of a binary discrete memoryless channel (BDMC) $W : \mathbb{F}_2 \to \mathcal{Y}$ with transition probabilities $\{W(y|x) : x \in \mathbb{F}_2, y \in \mathcal{Y}\}$ and capacity $C(W)$. As $n$ grows, the individual channels start polarizing. That is, a subset of the channels tends to noise-free channels and the others tend to completely noisy channels. The fraction of noise-free channels tends to the capacity $C(W)$.

The polarization behavior suggests using the noise-free channels to transmit information bits, while setting the inputs to the noisy channels to values that are known a priori to the decoder (i.e., the frozen bits). That is, a message vector $u_1^N$ consists of information bits and frozen bits (often set to zero), where $\mathcal{I} \subset \{1, \ldots, N\}$ of size $k$ is the information set and its complement $\bar{\mathcal{I}}$ is the set of frozen bits. This scheme achieves capacity [1].

Denote the channel output by $y_1^N = (y_1, \ldots, y_N)$ and the $i$-th synthesized subchannel with input $u_i$ and output $(y_1^N, u_1^{i-1})$ by $W_N^{(i)}$ for $i = 1, \ldots, N$.
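The encoding relation $x_1^N = u_1^N B_N G_2^{\otimes n}$ of Section II-A can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `bit_reverse` and `polar_encode` are hypothetical, the message is assumed to be a plain list of 0/1 integers, and the selection of information and frozen positions is left out.

```python
def bit_reverse(index, width):
    """Reverse the `width`-bit binary representation of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def polar_encode(u):
    """Return x = u * B_N * (n-fold Kronecker power of G_2) over GF(2)."""
    length = len(u)
    if length == 0 or length & (length - 1):
        raise ValueError("block length must be a power of two")
    width = length.bit_length() - 1
    # Apply the bit-reversal permutation B_N to the input vector.
    x = [u[bit_reverse(j, width)] for j in range(length)]
    # Apply the Kronecker power of G_2 = [[1, 0], [1, 1]] with in-place
    # butterflies; this uses O(N log N) XOR operations.
    step = 1
    while step < length:
        for start in range(0, length, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x
```

For $N = 4$, `polar_encode([0, 1, 0, 0])` returns `[1, 0, 1, 0]`, the second row of $B_4 G_2^{\otimes 2}$. Because this matrix is its own inverse over $\mathbb{F}_2$, applying `polar_encode` twice returns the input.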
The transition probability matrix $W_N^{(i)}$ is defined as

$$W_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i) := \sum_{u_{i+1}^N \in \mathbb{F}_2^{N-i}} \frac{1}{2^{N-1}} W_N(y_1^N \mid u_1^N),$$

where $W_N(y_1^N \mid u_1^N) := \prod_{i=1}^{N} W(y_i \mid x_i)$ and $x_1^N = u_1^N B_N G_2^{\otimes n}$ is the codeword corresponding to the message $u_1^N$. The encoding complexity of polar coding is $O(N \log N)$ [1].

B. Successive Cancellation Decoding

Arıkan [1] proposed a successive cancellation (SC) decoding scheme for polar codes. Given $y_1^N$ and the estimates $\hat{u}_1^{i-1}$ of $u_1^{i-1}$, the SC algorithm estimates $u_i$. The following logarithmic likelihood ratios (LLRs) are used to estimate each $u_i$ for $i = 1, \ldots, N$:

$$L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) = \log \frac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid u_i = 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} \mid u_i = 1)}.$$

The estimate of an unfrozen bit $u_i$ is determined by the sign of the LLR, i.e., $\hat{u}_i = 0$ if $L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) \ge 0$ and $\hat{u}_i = 1$ otherwise. It is known that polar codes with SC decoding achieve capacity with decoding complexity of $O(N \log N)$ [1].

C. Adversarial Deletion Channel

We suppose that $N$ bits are sent over a channel and exactly $d$ bits are deleted. We call this a $d$-deletion channel. That is, for $N$ bits sent, the decoder only receives $N - d$ bits after $d$ deletions, and the positions of the deletions are not known to the receiver. Note that this is not the probabilistic deletion channel in which each symbol is independently deleted with some fixed probability $q \in (0, 1)$ [8].

III. PROBLEM SETTING AND MODEL

Consider the 1-deletion channel ($d = 1$ in the definition in Section II-C), where exactly one bit is deleted. We suppose that $N = 2^n$ where $n \in \mathbb{N}$. A message vector $u_1^N$ is encoded using the polar encoder and is sent across $N$ uses of a BEC, each with erasure probability $p \in (0, 1)$; we denote this length-$N$ channel by $W_1$. The output vector is passed through a 1-deletion channel $W_2$. We denote this cascade of $W_1$ and $W_2$ as $\mathbf{W}$ and call it a BEC-1-Deletion Cascade. This model is shown in Fig. 1. The output of $\mathbf{W}$ is denoted as $\tilde{y}_1^{N-1}$. Note that $\mathbf{W}$ permits erasures and a single deletion. That is, a message $u_1^N$ is sent across $\mathbf{W}$ and a vector $\tilde{y}_1^{N-1}$ is received. A decoder is designed in such a way that w.h.p., a list (of linear size in $N$) containing an estimate $\hat{u}_1^N$ of the original message $u_1^N$ is returned.

[Figure 1: block diagram. $u_1^N$ → Encoder → $x_1^N$ → $W_1$ → $y_1^N$ → $W_2$ → $\tilde{y}_1^{N-1}$ → Decoder → $\mathcal{L}$]
Fig. 1. BEC-1-Deletion Cascade. $W_1 = \mathrm{BEC}(p)^N$ is the length-$N$ BEC, $W_2$ is the 1-deletion channel, and $\mathcal{L}$ is the list of possible messages.

IV. CODING FOR THE BEC-1-DELETION CASCADE

A. Reconstruction of the BEC Output

A message $u_1^N$ is sent over a BEC-1-Deletion cascade using the polar encoder described in Section II-A and $\tilde{y}_1^{N-1}$ is received. In order to decode $\tilde{y}_1^{N-1}$, we use the SC algorithm (refer to Section II-B). Since the position of the deletion is unknown, we first identify a set of length-$N$ vectors, called the candidate set, each of which contains $\tilde{y}_1^{N-1}$ as a subsequence. A naïve algorithm to construct the candidate set would be to insert each of $0$, $1$, $e$ in the $N$ locations before and after each symbol of $\tilde{y}_1^{N-1}$. We then apply the SC algorithm to each vector in the candidate set.

For example, suppose $N = 4$ and the received vector is $\tilde{y}_1^3 = 01e$. Then the following set $\mathcal{S}$ includes all vectors which contain the subsequence $01e$:

$$\mathcal{S} = \{001e, 101e, e01e, 011e, 0e1e, 010e, 01ee, 01e0, 01e1\}.$$

The size of this set can be further reduced if we notice that inserting $e$ at the $N$ positions is enough to identify all possible messages that can produce $\tilde{y}_1^{N-1}$ after a single deletion. The reason is as follows. Suppose the $i$-th symbol is deleted from $y_1^N$. Instead of inserting $0$ or $1$ at position $i$, we insert an erasure symbol $e$. Since a polar code correcting $\alpha \approx Np'$ (where $p' < p$) erasures also corrects $\alpha + 1$ erasures w.h.p. under the SC decoding algorithm, this new length-$N$ vector decodes to the correct message w.h.p. no matter which symbol was at position $i$. We state this observation formally:

Proposition 1. Suppose $u_1^N$ is sent over a BEC-1-Deletion cascade $\mathbf{W}$. (See Fig. 1.) The size of the candidate set $\mathcal{A}$ (constructed above) is $N - \alpha$, where $\alpha$ is the number of erasures present in the received string $\tilde{y}_1^{N-1}$.

Proof: The candidate set is

$$\mathcal{A} = \{(\tilde{y}_1^{i-1}, e, \tilde{y}_i^{N-1}) : i = 1, 2, \ldots, N\} \subset \{0, 1, e\}^N,$$

where $\tilde{y}_1^{N-1}$ is the received string. Suppose that the $j$-th symbol of $\tilde{y}_1^{N-1}$ is $e$. Inserting another $e$ before the $j$-th symbol $e$ forms the vector $\tilde{y}_1^{j-1} e e \tilde{y}_{j+1}^{N-1}$. This vector repeats if we insert $e$ again after the $j$-th symbol $e$. Therefore, considering the non-erasure bits of $\tilde{y}_1^{N-1}$ and inserting exactly one erasure symbol $e$ at positions before and after these non-erasure bits produces unique vectors in the candidate set $\mathcal{A}$. Since the number of erasure symbols is $\alpha$, the total number of vectors in $\mathcal{A}$ is $N - \alpha$.

We remark that as $N \to \infty$, by the law of large numbers $\alpha / N \to p$ and hence $|\mathcal{A}| \approx N - Np$, where $p \in (0, 1)$ is the erasure probability of the BEC.

B. List Decoding

After the construction of the set $\mathcal{A}$, the problem reduces to the decoding of each vector in $\mathcal{A}$ using the SC algorithm. Since $|\mathcal{A}| = N - \alpha$, we get a list of messages of size at most $N - \alpha$ at the end of the whole decoding procedure.

Let $\mathrm{SC}(y_1^N)$ denote the SC decoding of $y_1^N$, and define

$$\mathcal{L} = \{u_1^k : u_1^k = \hat{u}_1^N|_{\mathcal{I}},\ \hat{u}_1^N = \mathrm{SC}(y_1^N),\ y_1^N \in \mathcal{A}\}, \tag{1}$$

as the list of messages returned by the set $\mathcal{A}$, where $\mathcal{I}$ is the information set.

Since we insert the erasure symbol $e$ at each of the $N$ possible positions (including the deleted position), the original message sent belongs to $\mathcal{L}$ w.h.p.
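The channel model of Section III and the candidate-set construction of Proposition 1 can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names `bec_one_deletion` and `candidate_set` are hypothetical, symbols are stored as the integers 0 and 1 plus the string `"e"` for an erasure, and the deleted position is drawn at random here even though the paper treats the deletion as adversarial.

```python
import random

ERASURE = "e"


def bec_one_deletion(codeword, p, rng):
    """Pass bits through BEC(p), then delete exactly one symbol.

    Returns (received, bec_output, deleted_index). Choosing the deleted
    position at random is an assumption made only for this illustration.
    """
    bec_output = [ERASURE if rng.random() < p else bit for bit in codeword]
    deleted_index = rng.randrange(len(bec_output))
    received = bec_output[:deleted_index] + bec_output[deleted_index + 1:]
    return received, bec_output, deleted_index


def candidate_set(received):
    """Insert one erasure at each of the N positions and drop duplicates."""
    seen = set()
    candidates = []
    for i in range(len(received) + 1):
        vector = tuple(received[:i]) + (ERASURE,) + tuple(received[i:])
        if vector not in seen:
            seen.add(vector)
            candidates.append(vector)
    return candidates
```

For the received string $01e$ with $N = 4$, `candidate_set([0, 1, "e"])` yields the three vectors $e01e$, $0e1e$ and $01ee$, in agreement with $|\mathcal{A}| = N - \alpha = 4 - 1$. The BEC output with the deleted symbol replaced by an erasure is always one of the candidates, which is the vector the SC decoder is expected to decode correctly w.h.p.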
Arıkan [1] proved that the probability of error $P_e^{(N)}$ vanishes asymptotically for polar codes over any BDMC. A more precise estimate was provided by Arıkan and Telatar [11], who showed that for any $\beta \in (0, 1/2)$, $P_e^{(N)} \le 2^{-N^{\beta}}$ for sufficiently large block lengths $N$. Therefore, under SC decoding, the vectors in $\mathcal{A}$ return all possible messages that can produce the string $\tilde{y}_1^{N-1}$ under a single (adversarial) deletion.

C. Recovering the Correct Message from the List via Cyclic Redundancy Check (CRC)

Naturally, there can be multiple $u_1^k \in \mathcal{M}$ that belong to the list $\mathcal{L}$ and it may not be easy to single out the original message. However, by applying a simple pre-coding technique using an $r$-bit CRC (or a code having an $r \times (k + r)$ parity check matrix) [3], [12], the original message can be detected from the list, albeit with some additional probability of error. We describe here how to recover the correct message w.h.p.

Recall that we have $N - k$ frozen bits that we usually set to zero. Instead of setting all of them to zero, we set $N - k - r$ frozen bits to zero, where $r$ is a small number we optimize in Section IV-D. The remaining $r$ bits will contain the $r$-bit CRC value of the $k$ unfrozen bits (or simply the parity bits). To generate an $r$-bit CRC, we select a polynomial of degree $r$, called a CRC polynomial, having $r + 1$ coefficients. We then divide the message (by treating it as a binary polynomial) by this CRC polynomial to generate a remainder of degree at most $r - 1$, with $r$ coefficients in total. We append these $r$ coefficients at the end of the $k$-bit message to generate a $(k + r)$-bit vector. To verify that the correct message is received, we perform the polynomial division again to check if the remainder is zero. For more details on the choice of CRC polynomials, please refer to [13]. We send these $k + r$ bits across the cascade. This new encoding is a slight variation of the original polar coding scheme [1]. Also, note that the original information rate $R = \frac{k}{N}$ is preserved. However, the rate of the polar code is slightly increased to $R_{\mathrm{polar}} = \frac{k + r}{N}$.

To summarize, we encode the message $u_1^k$ of length $k$ into a length-$(k + r)$ vector $u_1^{k+r} \in \mathcal{C}'$ having redundancy $r$, where $|\mathcal{C}'| = 2^k$. Then we apply the polar coding scheme to the codebook $\mathcal{C}'$. This results in a polar code $\mathcal{C}$ of length $N$ and size $2^{k+r}$, where only the subset $\mathcal{C}' \subset \mathcal{C}$ carries information that we wish to transmit. The codeword $x_1^N \in \mathcal{C}$ corresponding to the original message $u_1^k$ is then passed through the BEC-1-Deletion channel, which outputs a vector $\tilde{y}_1^{N-1}$. After constructing the set $\mathcal{A}$ by inserting $e$ at each of the $N$ possible positions, we apply the SC algorithm to every vector in $\mathcal{A}$. However, not all of the resulting vectors in $\mathcal{C}$ carry information. We can check this using the initial $r$-bit CRC (or the parity check matrix). All vectors which fail the CRC check are removed and we then select the message with the maximum likelihood from the list.

D. Analysis and Optimization of the Overall Error Probability

Suppose $H$ denotes the $r \times (k + r)$ parity check matrix with rows $\{h_i : i = 1, \ldots, r\}$ that is being used for adding parity to the $k$-bit message. Then the set of messages that carry any information can be identified as

$$\widehat{\mathcal{M}} := \{u_1^k : u_1^{k+r} H^T = 0,\ u_1^{k+r} \in \widehat{\mathcal{L}}\},$$

where $\widehat{\mathcal{L}}$ is the modified version of (1) according to the new polar coding scheme, defined as

$$\widehat{\mathcal{L}} := \{u_1^{k+r} : u_1^{k+r} = \hat{u}_1^N|_{\mathcal{I} \cup \mathcal{P}},\ \hat{u}_1^N = \mathrm{SC}(y_1^N),\ y_1^N \in \mathcal{A}\},$$

and where $\mathcal{P} \subset \bar{\mathcal{I}}$ is the set of parity bits ($\bar{\mathcal{I}}$ is the set of frozen bits). If the rows of $H$ are chosen uniformly and independently from $\{0, 1\}^{k+r}$, the probability that a vector $u_1^k$ is in $\widehat{\mathcal{M}}$ is

$$\Pr\big(u_1^k \in \widehat{\mathcal{M}}\big) = \Pr\big(\langle h_i, u_1^{k+r} \rangle = 0,\ \forall\, i = 1, \ldots, r\big) = \frac{1}{2^r},$$

where $u_1^{k+r} \in \widehat{\mathcal{L}}$. That is, a message in $\widehat{\mathcal{L}}$ is wrongly identified as the original message with probability $1/2^r$. However, the true message sent satisfies the parity-check condition $u_1^{k+r} H^T = 0$. Therefore, by the union bound, the total probability that an incorrect message is returned is upper bounded as

$$P_{\mathrm{TotErr}}^{(N)} \le \frac{|\widehat{\mathcal{L}}|}{2^r} + |\mathcal{A}|\, P_e^{(N)}, \tag{2}$$

where $P_e^{(N)}$ is the probability of error of the SC decoding algorithm and $|\widehat{\mathcal{L}}| \le |\mathcal{A}| \approx N(1 - p)$ for a single deletion. To maintain $R_{\mathrm{polar}} \approx R$ (that is, as the block length $N$ grows, $R_{\mathrm{polar}}$ converges to $R$) while minimizing the upper bound on $P_{\mathrm{TotErr}}^{(N)}$ in (2), we have to choose $r$ carefully.

For a single deletion, the size of the candidate set is $|\mathcal{A}| \approx N(1 - p)$ and hence $|\widehat{\mathcal{L}}| \le N(1 - p)$ w.h.p. From Hassani et al. [14], the rate-dependent error probability of the polar code for the BEC with rate $R_{\mathrm{polar}}$ is

$$P_e^{(N)} = 2^{-2^{\frac{n}{2} + \frac{\sqrt{n}}{2} Q^{-1}\left(\frac{R_{\mathrm{polar}}}{C(\mathbf{W})}\right) + o(\sqrt{n})}},$$

where $N = 2^n$, $Q(x) := \frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp\left(-\frac{t^2}{2}\right) dt$ is the complementary Gaussian cumulative distribution function, and $C(\mathbf{W})$ is the capacity of the channel cascade.

From (2),

$$P_{\mathrm{TotErr}}^{(N)} \le |\mathcal{A}| \left[ 2^{-r} + 2^{-2^{\frac{n}{2} + \frac{\sqrt{n}}{2} Q^{-1}\left(\frac{R_{\mathrm{polar}}}{C(\mathbf{W})}\right) + o(\sqrt{n})}} \right]. \tag{3}$$

It can be verified easily that the first term in the square brackets in (3) is decreasing in $r$ and the second term, with $R_{\mathrm{polar}} = \frac{k + r}{N}$, is increasing in $r$. To optimize the upper bound in (3), we set the exponents of the two terms to be equal (neglecting the insignificant $o(\sqrt{n})$ term), i.e.,

$$r = 2^{\frac{n}{2} + \frac{\sqrt{n}}{2} Q^{-1}\left(\frac{k + r}{N C(\mathbf{W})}\right)} = \sqrt{N} \cdot 2^{\frac{\sqrt{\log_2 N}}{2} Q^{-1}\left(\frac{k + r}{N C(\mathbf{W})}\right)},$$

where we used the fact that $N = 2^n$.

Now we find an expression for $r$ in terms of the backoff from capacity. To transmit the code at a rate close to the capacity, for a small constant $\delta > 0$, assume that $R = (1 - \delta)(1 - p)$, where $C(\mathbf{W}) = 1 - p$ since a polar code over the BEC-1-Deletion cascade achieves the capacity of the BEC; this is a simple consequence of [15, Problem 3.14] and the fact that the list size is polynomial. Then the rate $R_{\mathrm{polar}} = R + \frac{r}{N} \le \left(1 - \frac{\delta}{2}\right)(1 - p)$ for $N$ large enough. Therefore,

$$r = \sqrt{N} \cdot 2^{\frac{\sqrt{\log_2 N}}{2} Q^{-1}\left(1 - \frac{\delta}{2}\right)}.$$
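As a rough numerical illustration, the expression $r = \sqrt{N} \cdot 2^{\frac{\sqrt{\log_2 N}}{2} Q^{-1}(1 - \delta/2)}$ can be evaluated with the Python standard library. This is only a sketch: the function names `q_inverse` and `parity_bits` are hypothetical, the $o(\sqrt{n})$ term is ignored, and any value of $\delta$ passed in is an assumption rather than a setting taken from the paper.

```python
import math
from statistics import NormalDist


def q_inverse(y):
    """Inverse of the Gaussian tail function Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - y)


def parity_bits(n, delta):
    """Evaluate r = sqrt(N) * 2 ** ((sqrt(n) / 2) * Q^{-1}(1 - delta / 2)).

    N = 2 ** n is the block length and delta is the backoff from capacity.
    The o(sqrt(n)) term of the analysis is neglected (an assumption).
    """
    exponent = (math.sqrt(n) / 2.0) * q_inverse(1.0 - delta / 2.0)
    return math.sqrt(2 ** n) * 2.0 ** exponent
```

Since $Q^{-1}(1 - \delta/2)$ is negative for small $\delta$, the value returned by `parity_bits` stays below $\sqrt{N}$, and it grows as the backoff $\delta$ is increased.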
Let $z = Q^{-1}\left(1 - \frac{\delta}{2}\right)$. Since $\delta \approx 0$, $z \ll 0$. Then $Q(z) = 1 - \frac{\delta}{2}$ and hence $\frac{\delta}{2} = Q(-z)$. Since $Q(-z)$ decays as $e^{-z^2/2}$ as $z \to -\infty$, $z^2 = 2 \ln \frac{2}{\delta}$. Then $z = -\sqrt{2 \ln \frac{2}{\delta}}$. Therefore, the optimal value of the number of parity bits $r$ is

$$r = \sqrt{N} \cdot 2^{-\sqrt{\frac{(\log_2 N)\left(\ln \frac{2}{\delta}\right)}{2}}} = \Theta(\sqrt{N}).$$

This is a rate-dependent choice of $r$ (through $\delta$) that simultaneously ensures that $R_{\mathrm{polar}} \to R$ and that the upper bound on $P_{\mathrm{TotErr}}^{(N)}$ in (2) is minimized.

E. Finite Number of Deletions

Now consider the cascade of a BEC and a $d$-deletion channel where $d \in \mathbb{N}$ is finite. This model can be analyzed using the same techniques presented here. The only difference is the size of the candidate set $\mathcal{A}$. By using the same arguments as in the 1-deletion case, we construct $\mathcal{A}$ by inserting erasure symbols at $d$ positions and $|\mathcal{A}| = \binom{N}{d} - \alpha$. Therefore, the list size satisfies $|\widehat{\mathcal{L}}| \le \binom{N}{d} - \alpha$. Since the models are similar, a CRC construction and an error probability analysis for the BEC-$d$-Deletion cascade, similar to those presented in Sections IV-C and IV-D respectively, can be performed. In addition, we see that even if the number of deletions grows as $d = o\left(\frac{N}{\log N}\right)$, the capacity of the BEC is achieved because the list size $|\widehat{\mathcal{L}}| \le N^d$ is still subexponential.

F. Complexity of the Decoding Algorithm

The encoding complexity for the BEC-1-Deletion cascade is the same as that for standard polar codes, i.e., $O(N \log N)$. However, the SC decoding algorithm has to be applied to all vectors in the candidate set $\mathcal{A}$ of size $N - \alpha$ (cf. Prop. 1). Thus, the complexity of the decoding algorithm for the BEC-1-Deletion cascade is $O(N^2 \log N)$ and that for the BEC-$d$-Deletion cascade is $O(N^{d+1} \log N)$. Although the complexity of the decoding algorithm increases by a factor of $O(N)$ for each additional deletion, decoding can still be performed in polynomial time.

V. SIMULATION RESULTS

In this section, we demonstrate the utility of the proposed algorithm by performing numerical simulations. The simulations are carried out in MATLAB using code provided in [16] with the following parameters.^2 Let $n = \log_2 N$ vary from 6 to 11. The erasure probability of the BEC is $p = 0.3$. Thus, the capacity of the cascade is $C(\mathbf{W}) = 0.7$. We consider three different code rates: $R = 0.50$, $0.55$ and $0.60$. We fix $r = \lceil 0.7 \sqrt{N} \rceil$ and the $r$-bit CRC polynomial is chosen according to [13]. The error probability is computed by averaging over 1000 independent runs.

We encode a random length-$\lceil RN \rceil$ message using an $r$-bit CRC polynomial so that the input of the encoder is a length-$(k + r)$ vector and the output is an $N$-bit vector. This vector is then transmitted through a BEC-1-Deletion cascade and received as a length-$(N - 1)$ vector. The CRC list decoder then computes a list of possible messages given the channel output. Fig. 2 shows that, with a suitable choice of the number of CRC bits $r$ and CRC polynomials, as $N$ grows, the list is of size 1 and contains only the original message w.h.p.

[Figure 2: "Plot of Error Probabilities against Log of Block Length". Vertical axis: overall error probabilities (logarithmic scale, $10^{-3}$ to $10^{0}$). Horizontal axis: $n = \log_2 N$ from 6 to 11. Curves: $R = 0.60$, $0.55$ and $0.50$, each for $|\mathcal{L}| = 1$ and $|\mathcal{L}| \ge 1$.]
Fig. 2. Plot of error probabilities against $n = \log_2 N \in \{6, \ldots, 11\}$. The solid line is the error probability of obtaining a list of size 1, which is exactly the original message. The broken line is the error probability of obtaining a list of size at least 1 containing the original message. The list size after using CRC is small even though it is not exactly 1. Data points that are not available indicate that the simulated error probability over 1000 runs is exactly 0.

^2 The MATLAB code to reproduce the simulations is provided at https://www.ece.nus.edu.sg/stfpage/vtan/commL code.zip.

REFERENCES

[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels", IEEE Trans. Inform. Theory, vol. 55, no. 7, pp. 3051-3073, Jul 2009.
[2] S. H. Hassani, K. Alishahi and R. L. Urbanke, "Finite-length scaling for polar codes", IEEE Trans. Inform. Theory, vol. 60, no. 10, pp. 5875-5898, Oct 2014.
[3] I. Tal and A. Vardy, "List decoding of polar codes", IEEE Trans. Inform. Theory, vol. 61, no. 5, pp. 2213-2226, May 2015.
[4] E. Abbe and A. Barron, "Polar coding schemes for the AWGN channel", Proceedings of the ISIT, 2011, pp. 194-198.
[5] R. Wang, J. Honda, H. Yamamoto and R. Liu, "Construction of polar codes for channels with memory", Proceedings of the Fall ITW, Jeju Island, South Korea, 2015, pp. 187-191.
[6] M. Mitzenmacher, "A survey of results for deletion channels and related synchronization channels", Probability Surveys, vol. 6, pp. 1-33, 2009.
[7] R. Venkataramanan, S. Tatikonda, and K. Ramchandran, "Achievable rates for channels with deletions and insertions", IEEE Trans. Inform. Theory, vol. 59, no. 11, pp. 6990-7013, Nov 2013.
[8] S. Diggavi, M. Mitzenmacher and H. D. Pfister, "Capacity upper bounds for the deletion channel", Proceedings of the ISIT, 2007, pp. 1716-1720.
[9] L. Dolecek and V. Anantharam, "Using Reed-Muller RM(1;m) codes over channels with synchronization and substitution errors", IEEE Trans. Inform. Theory, vol. 53, no. 4, pp. 1430-1443, Apr 2007.
[10] A. Kiely and J. Coffey, "On the capacity of a cascade of channels", IEEE Trans. Inform. Theory, vol. 39, no. 4, pp. 1310-1321, Apr 1993.
[11] E. Arıkan and I. E. Telatar, "On the rate of channel polarization", Proceedings of the ISIT, 2009, pp. 1493-1495.
[12] K. Niu and K. Chen, "CRC-aided decoding of polar codes", IEEE Comm. Letters, vol. 16, no. 10, pp. 1668-1671, Oct 2012.
[13] P. Koopman and T. Chakravarty, "Cyclic redundancy code (CRC) polynomial selection for embedded networks", International Conference on Dependable Systems and Networks, 2004, pp. 145-154.
[14] S. H. Hassani, R. Mori, T. Tanaka and R. L. Urbanke, "Rate-dependent analysis of the asymptotic behavior of channel polarization", IEEE Trans. Inform. Theory, vol. 59, no. 4, pp. 2267-2276, Apr 2013.
[15] A. El Gamal and Y.-H. Kim, "Network information theory", Cambridge University Press, 2012.
[16] H. Vangala, Y. Hong and E. Viterbo, "Efficient algorithms for systematic polar encoding", IEEE Comm. Letters, vol. 20, no. 1, pp. 17-20, Jan 2016.

