On the Successive Cancellation Decoding of Polar Codes with Arbitrary Linear Binary Kernels

Zhiliang Huang, Shiyi Zhang, Feiyan Zhang, Chunjiang Duanmu, and Ming Chen

arXiv:1701.03264v2 [cs.IT] 17 Jan 2017

Zhiliang Huang, Shiyi Zhang, Feiyan Zhang, and Chunjiang Duanmu are with the School of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, 321004, China (e-mail: zlhuang, syzhang, zhangfy, [email protected]). Ming Chen is with the School of Information Science and Engineering, the National Mobile Communications Research Laboratory, Southeast University, Nanjing, 210096, China (e-mail: [email protected]).

Abstract: A method for efficient successive cancellation (SC) decoding of polar codes with high-dimensional linear binary kernels (HDLBK) is presented and analyzed. We devise an $l$-expressions method which obtains simplified recursive formulas of the SC decoder in likelihood ratio form for arbitrary linear binary kernels, reducing the complexity of the corresponding SC decoder. By considering the bit-channel transition probabilities $W_G^{(\cdot)}(\cdot|0)$ and $W_G^{(\cdot)}(\cdot|1)$ separately, a $W$-expressions method is proposed to further reduce the complexity of the HDLBK based SC decoder. For an $m\times m$ binary kernel, the complexity of the straightforward SC decoder is $O(2^m N\log N)$. With $W$-expressions, we reduce this complexity to $O(m^2 N\log N)$ when $m\le 16$. Simulation results show that $16\times 16$ kernel polar codes offer significant advantages in terms of error performance compared with $2\times 2$ kernel polar codes under SC and list SC decoders.

Index Terms: Polar codes, exponent, successive cancellation decoding, high-dimension kernel, $l$-expressions, $W$-expressions.

I. INTRODUCTION

Polar codes were introduced by Arıkan [1] as the first family of capacity-achieving codes with an explicit construction method and low encoding/decoding complexities for the class of binary-input discrete memoryless channels (B-DMCs). Arıkan's original polar codes are based on the kernel matrix

$$G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$$

and its $n$th Kronecker power $G_2^{\otimes n}$, corresponding to a linear code with block length $N = 2^n$. With Arıkan's $2\times 2$ kernel, it was shown in [2] that the probability of block error under successive cancellation (SC) decoding is $o(2^{-2^{n\beta}})$ with $\beta = 0.5$. It was conjectured in [1] that channel polarization is a general phenomenon, and it was shown in [3] that the probability of block error under SC decoding is $o(2^{-m^{n\beta}})$ for a general kernel $G_m$ of size $m\times m$. $\beta$ is called the exponent of the kernel and can exceed 0.5 for large $m$ [3].

Based on the above, many researchers have constructed high-dimensional kernels with large exponents. Based on BCH codes, Korada et al. [3] provided a construction of binary kernels with large exponent. Mori and Tanaka [4] proposed a construction of non-binary kernels with large exponent based on Reed-Solomon codes. In [5], code decompositions were used to design good linear and nonlinear binary kernels. In [6], constructions were presented for kernels with maximum exponents up to dimension 16.

However, it was pointed out in [3] that the complexity of the straightforward SC decoder for $G_m$ polar codes behaves like $O(2^m N\log_m N)$, so it is not practical for high-dimensional kernels such as $m = 16$. At present, to the best of our knowledge, there is no efficient SC decoding of large dimension kernels, although the definition of the exponent is based on SC decoding. In [7] and [8], the authors tried to generalize the idea of $G_2$ SC decoding to high-dimensional binary kernels. But their methods only work on kernels with very small dimension, because they need a tree structure for the bit-channel graph [7], which does not hold for larger kernels, even with $m = 6$.

In this paper, we propose a low complexity SC decoder for arbitrary binary linear kernels. For $G_2$, we have $l_2^{(1)} = l_1 \diamond l_2$ and $l_2^{(2)} = l_1^{1-2u_1} l_2$, where $l_1 \diamond l_2 = (l_1 l_2 + 1)/(l_1 + l_2)$. These are called the $l$-expressions in this paper, and they lead to a low complexity SC decoding of $G_2$ polar codes [1]. Our basic idea, as for $G_2$, is to obtain $l$-expressions for an arbitrary binary linear kernel $G_m$.

Fig. 1 shows the error performance of polar codes with different kernels under SC and list SC (LSC) decoding over the binary-input additive white Gaussian noise (AWGN) channel. It can be seen that the error performance of polar codes is almost decided by the exponent of the kernel. $G_6$ and $G_7$ have smaller exponents than $G_2$, and the error performance of polar codes with $G_6$ and $G_7$ is worse than with $G_2$ even with longer block length. For kernels with close exponents, high-dimensional kernel polar codes have better error performance than $G_2$ kernel polar codes even with shorter block length, such as $G_{14}^{\otimes 2}$ and $G_{15}^{\otimes 2}$. $G_{16}$ polar codes achieve significant error performance gains over $G_2$ polar codes under SC and LSC decoding, although the exponent of $G_{16}$ is only a little bigger than that of $G_2$.

[Fig. 1. Block error rate versus $E_b/N_0$ (dB). The code rate is 0.5. Top: SC decoding performance of polar codes with kernels $G_2, G_6, G_7, G_{14}, G_{15}, G_{16}$ on the BPSK-modulated Gaussian channel. All codes are constructed using the Gaussian Approximation method [9] at $E_b/N_0 = 2$ dB. Bottom: List SC [13] decoding performance (list sizes $L = 1, 2, 4, 8$, block length 4096) of polar codes with kernels $G_2$ and $G_{16}$. The $G_2^{\otimes 12}$ code is constructed via the method proposed in [14] at $E_b/N_0 = 2$ dB, and the $G_{16}^{\otimes 3}$ code is based on the Monte Carlo method proposed in [1, Sec. IX] at $E_b/N_0 = 2$ dB.]

Next, we use an example to show our main idea.

Example 1 ($l$-expressions for an optimal $G_6$ kernel):

$$G_6 = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 1
\end{pmatrix}$$

The $l$-expressions for this kernel:

$$l_6^{(1)} = l_1 \diamond l_2 \diamond l_3 \diamond l_4 \diamond l_5 \diamond l_6$$
$$l_6^{(2)} = (l_1 \diamond l_3 \diamond l_4)(l_2 \diamond l_5 \diamond l_6)$$
$$l_6^{(3)} = (l_1 \diamond l_4\, l_3) \diamond (l_2 \diamond l_6\, l_5)$$
$$l_6^{(4)} = l_1 \diamond \big(l_2 (l_3 l_5) \diamond (l_4 l_6)\big) \boxtimes l_4 \diamond \big((l_1^{-1} l_2) \diamond (l_3 l_5)\, l_6\big)$$
$$l_6^{(5)} = (l_1 l_2) \diamond (l_4 l_6)\,(l_3 l_5)$$
$$l_6^{(6)} = l_1 l_2 l_4 l_6$$

In Example 1, we give the $l$-expressions for a $6\times 6$ optimal kernel. For this kernel, the complexity of the straightforward SC decoder is $O(2^6 N\log N)$. With the $l$-expressions, we reduce the complexity to $O(7N\log N)$, where 7 is the length (defined later on) of the $l$-expressions.

The above $G_6$ kernel is given in [6], and optimal means that this kernel has the maximum exponent among all $6\times 6$ kernels. $l_1,\cdots,l_6$ are the channel likelihood ratios defined by $l_i = W(y_i|0)/W(y_i|1)$ ($W$ is the channel), and $l_6^{(1)},\cdots,l_6^{(6)}$ are the likelihood ratios of the bit-channels. $l_i l_j$ means $l_i \times l_j$. For $l_6^{(4)}$, the two parts connected by $\boxtimes$ are called sub-expressions. $\boxtimes$ is the same as $\times$ and is used specially for separating sub-expressions. The priority of the three operators is $\boxtimes < \times < \diamond$, so $l_i \diamond l_j\, l_k = (l_i \diamond l_j)\times l_k$.

In Example 1, we omit the influence of the known values $u_1^{i-1}$ on $l_6^{(i)}$. For example, with known values, $l_6^{(3)} = (l_1^{(1-2(u_1+u_2))} \diamond l_4\, l_3) \diamond (l_2^{(1-2u_2)} \diamond l_6\, l_5)$; this is explained in section III-C.

By our analysis, the $l$-expressions based SC decoder is good for medium kernel sizes such as $m \le 10$. However, it becomes impractical for larger kernel sizes such as $m = 16$. So, similar to the $l$-expressions, we propose a $W$-expressions method to further reduce the complexity of the SC decoder for larger dimension kernels by considering the bit-channel transition probabilities $W_G^{(\cdot)}(\cdot|0)$ and $W_G^{(\cdot)}(\cdot|1)$ separately. Our main achievement is: using the $W$-expressions method, we show that the complexity of the $G_m$ SC decoder is $O(m^2 N\log N)$ for the optimal kernels given in [6] when $m \le 16$.

The rest of the paper is organized as follows. In section II, we introduce the basic definitions and point out our basic task. In section III, we give details of how to get $l$-expressions for an arbitrary binary kernel matrix. In section IV, similar to the $l$-expressions, we present a $W$-expressions method to further reduce the complexity of the SC decoder with high-dimensional kernels. We also give complexity analyses of the $l$-expressions and $W$-expressions based SC decoders in that section. Construction methods of polar codes with high-dimensional kernels are presented in section V.

II. PRELIMINARIES

A. Notations

We write $W: \{0,1\} \to \mathcal{Y}$ to denote a B-DMC with input alphabet $\{0,1\}$, output alphabet $\mathcal{Y}$, and transition probabilities $W(y|x)$, $x \in \{0,1\}$, $y \in \mathcal{Y}$. We use the notation $a_1^N$ for a row vector $(a_1,\cdots,a_N)$. For a general kernel matrix $G_m$ (all kernels used in this paper are linear kernels given in [6]), $W_{G_m}: \{0,1\}^m \to \mathcal{Y}^m$ is defined by

$$W_{G_m}(y_1^m|u_1^m) \triangleq \prod_{i=1}^{m} W\big(y_i|(u_1^m G_m)_i\big). \tag{1}$$

Then the bit-channels $W_{G_m}^{(i)}: \{0,1\} \to \mathcal{Y}^m \times \{0,1\}^{i-1}$, $1 \le i \le m$, are defined by

$$W_{G_m}^{(i)}(y_1^m, u_1^{i-1}|u_i) \triangleq \frac{1}{2^{m-1}} \sum_{u_{i+1}^m} W_{G_m}(y_1^m|u_1^m). \tag{2}$$

For SC decoding, the basic task is to calculate the following values:

$$l_m^{(i)} = \frac{W_{G_m}^{(i)}(y_1^m, u_1^{i-1}|u_i = 0)}{W_{G_m}^{(i)}(y_1^m, u_1^{i-1}|u_i = 1)}
= \frac{\sum_{u_{i+1}^m} W\big(y_1|(u^m_{1,u_i=0} G_m)_1\big) \cdots W\big(y_m|(u^m_{1,u_i=0} G_m)_m\big)}{\sum_{u_{i+1}^m} W\big(y_1|(u^m_{1,u_i=1} G_m)_1\big) \cdots W\big(y_m|(u^m_{1,u_i=1} G_m)_m\big)} \tag{3}$$

where $u^m_{1,u_i=0}$ means $(u_1^{i-1}, u_i = 0, u_{i+1}^m)$.

To simplify notation, we use the following form instead of (3):

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) \cdots W(y_i|\mathbf{u}_i) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\tilde{\mathbf{u}}_1) \cdots W(y_i|\tilde{\mathbf{u}}_i) \cdots W(y_m|\tilde{\mathbf{u}}_m)} \tag{4}$$

where $\mathbf{u}_k = (u^m_{1,u_i=0} G_m)_k$, $k = 1,\cdots,m$, and $\bar{\mathbf{u}}_k = \mathbf{u}_k + 1$. Let $g_{ik}$ denote the element of $G_m$ in the $i$th row and $k$th column. In the denominator of (4), $\tilde{\mathbf{u}}_k = (u^m_{1,u_i=1} G_m)_k$; that is, $\tilde{\mathbf{u}}_k = \bar{\mathbf{u}}_k$ if $g_{ik} = 1$, and $\tilde{\mathbf{u}}_k = \mathbf{u}_k$ otherwise.

If $\tilde{\mathbf{u}}_1 = \bar{\mathbf{u}}_1$ in (4), the channel expression $W(y_1|\mathbf{u}_1)$ in the numerator is different from $W(y_1|\bar{\mathbf{u}}_1)$ in the denominator. We call this one difference for $l_m^{(i)}$.

It should be noticed that $\mathbf{u}_i$ is viewed as an expression of binary variables, not as a value, in our algorithm. For example, $\mathbf{u}_1 = u_4 + u_5 + u_6$ in the definition of $l_6^{(3)}$ by (3). Let $s_i$ denote the set of variables contained in $\mathbf{u}_i$, so $s_1 = \{u_4, u_5, u_6\}$ in the previous example. $\mathbf{u}_i \cap \mathbf{u}_j = \emptyset$ means $s_i \cap s_j = \emptyset$, and $\mathbf{u}_i \subset \mathbf{u}_j$ means $s_i \subset s_j$.

All operations in this paper are over GF(2). So, if $\mathbf{u}_1 = u_4 + u_5$ and $\mathbf{u}_2 = u_5 + u_6$, then $\mathbf{u}_1 + \mathbf{u}_2 = u_4 + u_6$.

We write $1_{\mathbf{u}_1 = 0}$ to denote the indicator function of the equation $\mathbf{u}_1 = 0$; thus, $1_{\mathbf{u}_1=0}$ equals 1 if $\mathbf{u}_1 = 0$ and 0 otherwise.

In Example 1, $l_6^{(4)}$ consists of two sub-expressions connected by $\boxtimes$. We say that the length of $l_6^{(4)}$ is 2. The lengths of the other $l$-expressions are all 1. So the total length is 7 for this example.
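The $l$-expressions of Example 1 can be compared numerically with the brute-force definition (3). The following is a minimal sketch for illustration only and is not taken from the paper: the helper names (`encode`, `brute_force_lr`, `diamond`, `l_expressions_G6`) and the randomly drawn transition probabilities are assumptions, and the known inputs $u_1^{i-1}$ are set to zero, matching the convention of Example 1.

```python
import itertools
import random

# Optimal 6x6 kernel of Example 1; row r corresponds to input u_{r+1}.
G6 = [
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]


def encode(u, G):
    """Return x = u G over GF(2)."""
    m = len(G)
    return [sum(u[r] * G[r][k] for r in range(m)) % 2 for k in range(m)]


def brute_force_lr(G, W, i, known):
    """l_m^(i) from definition (3).

    W[k] = (W(y_k|0), W(y_k|1)); i is 1-based; known holds u_1..u_{i-1}.
    The cost is exponential in m - i, which is what the l-expressions avoid.
    """
    m = len(G)
    sums = []
    for ui in (0, 1):
        total = 0.0
        for tail in itertools.product((0, 1), repeat=m - i):
            x = encode(list(known) + [ui] + list(tail), G)
            p = 1.0
            for k in range(m):
                p *= W[k][x[k]]
            total += p
        sums.append(total)
    return sums[0] / sums[1]


def diamond(*ls):
    """a <> b = (a b + 1) / (a + b), folded left over all arguments."""
    acc = ls[0]
    for l in ls[1:]:
        acc = (acc * l + 1.0) / (acc + l)
    return acc


def l_expressions_G6(l):
    """Closed forms of Example 1 (known inputs hidden, i.e. set to zero)."""
    l1, l2, l3, l4, l5, l6 = l
    return [
        diamond(l1, l2, l3, l4, l5, l6),
        diamond(l1, l3, l4) * diamond(l2, l5, l6),
        diamond(diamond(l1, l4) * l3, diamond(l2, l6) * l5),
        # The two factors below are the sub-expressions joined by the boxtimes sign.
        diamond(l1, l2 * diamond(l3 * l5, l4 * l6))
        * diamond(l4, diamond(l2 / l1, l3 * l5) * l6),
        diamond(l1 * l2, l4 * l6) * (l3 * l5),
        l1 * l2 * l4 * l6,
    ]


# Arbitrary positive channel values; any choice should give agreement.
random.seed(0)
W = [(random.uniform(0.05, 1.0), random.uniform(0.05, 1.0)) for _ in range(6)]
l = [w0 / w1 for w0, w1 in W]
closed_form = l_expressions_G6(l)

for i in range(1, 7):
    reference = brute_force_lr(G6, W, i, [0] * (i - 1))
    print(i, reference, closed_form[i - 1])
```

Each printed row pairs the value obtained by summing over all $2^{m-i}$ unknown inputs with the value of the corresponding $l$-expression, which uses only a handful of $\diamond$ and $\times$ operations.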
B. Basic task

By using (3), the total computational cost of $l_m^{(1)},\cdots,l_m^{(m)}$ is $O(m 2^m)$. We call these calculations the inside kernel calculation. A polar code defined by $G_m^{\otimes n}$ with block length $N = m^n$ needs to recursively implement the inside kernel calculation $N\log_m N/m$ times for SC decoding. So the complexity of SC decoding behaves like $O(2^m N\log_m N)$ for a general kernel matrix $G_m$. It grows exponentially with the kernel size, so it is not practical for large kernel sizes such as $m = 16$. Therefore, our basic task is to reduce the computational cost of (3).

III. BIT-CHANNEL LIKELIHOOD EXPRESSIONS FOR $G_m$

In this section, we propose our method to generate the $l$-expressions of $l_m^{(i)}$, $i = 1,\cdots,m$, for an arbitrary kernel $G_m$. We begin with an example to illustrate the method, and we name some functions in the description of the example. Then a high-level description of the $l$-expressions generating algorithm is given in terms of these functions. Finally, we give details of these functions in the general case, together with their proofs.

A. Illustration Example

In Example 2, we use $l_6^{(4)}$ to illustrate the $l$-expressions generating algorithm. Here $\overline{x} = x + 1$ over GF(2).

Example 2:

$$l_6^{(4)} = \frac{\sum_{u_5^6} W(y_1|u_5+u_6)W(y_2|u_5+u_6)W(y_3|u_5)W(y_4|u_6)W(y_5|u_5)W(y_6|u_6)}{\sum_{u_5^6} W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)W(y_3|u_5)W(y_4|\bar{u}_6)W(y_5|u_5)W(y_6|u_6)} \tag{5}$$

$$= \frac{\sum_{u_5^6} W(y_1|u_5+u_6)W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4|u_6)W(y_6|u_6)}{\sum_{u_5^6} W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4|u_6)W(y_6|u_6)} \tag{6}$$

$$\boxtimes\ \frac{\sum_{u_5^6} W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4|u_6)W(y_6|u_6)}{\sum_{u_5^6} W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4|\bar{u}_6)W(y_6|u_6)} \tag{7}$$

$$= l_1 \diamond \frac{\sum_{u_5^6} W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4,y_6|u_6)\,1_{u_5+u_6=0}}{\sum_{u_5^6} W(y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4,y_6|u_6)\,1_{u_5+u_6=1}}
\ \boxtimes\ \frac{\sum_{u_6} W(y_{12},y_3,y_5|u_6)W(y_4|u_6)W(y_6|u_6)}{\sum_{u_6} W(y_{12},y_3,y_5|u_6)W(y_4|\bar{u}_6)W(y_6|u_6)} \tag{8}$$

$$= l_1 \diamond \left(l_2\,\frac{\sum_{u_6} W(y_3,y_5|u_6)W(y_4,y_6|u_6)}{\sum_{u_6} W(y_3,y_5|\bar{u}_6)W(y_4,y_6|u_6)}\right)
\ \boxtimes\ l_4 \diamond \frac{\sum_{u_6} W(y_{12},y_3,y_5,y_6|u_6)\,1_{u_6=0}}{\sum_{u_6} W(y_{12},y_3,y_5,y_6|u_6)\,1_{u_6=1}} \tag{9}$$

$$= l_1 \diamond \big(l_2 (l_3 l_5) \diamond (l_4 l_6)\big) \boxtimes l_4 \diamond \big((l_1^{-1} l_2) \diamond (l_3 l_5)\, l_6\big) \tag{10}$$

$$= l_1^{(1-2(u_1+u_2+u_3))} \diamond \big(l_2^{(1-2u_2)} (l_3^{(1-2u_3)} l_5) \diamond (l_4 l_6)\big) \boxtimes l_4 \diamond \big((l_1^{-(1-2(u_1+u_2+u_3))} l_2^{(1-2u_2)}) \diamond (l_3^{(1-2u_3)} l_5)\, l_6\big) \tag{11}$$

Based on definition (3), we get (5). Actually, we omit the known values $a_1 = u_1+u_2+u_3$, $a_2 = u_2$, $a_3 = u_3$ in (5) and add them back in (11). Omitting the known values at first and adding them in the final step is the function we call hide known values.

Define $W(y_3,y_5|u_5) = W(y_3|u_5)W(y_5|u_5)$ in (6) and (7). Then $l_{y_3,y_5} \triangleq \frac{W(y_3,y_5|0)}{W(y_3,y_5|1)} = l_3 l_5$. This function is called zero-variable-combine.

By adding a mid term, we get (6) and (7) from (5). $\boxtimes$ is the same as the common multiplication $\times$. Obviously, $(5) = (6) \boxtimes (7)$. We call (6) and (7) sub-expressions, and $\boxtimes$ is used specially for separating sub-expressions. It can be seen that there is only one difference in each of (6) and (7). We call the function from (5) to $(6) \boxtimes (7)$ the extend method.

With the one difference property in (6), we get the left part of (8). This is our key step, and it is called the fundamental step.

Next, let $W(y_{12}|u_5+u_6) = W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)$ in (7). This is zero-variable-combine. Then we get the right part of (8) by defining $W(y_{12},y_3,y_5|u_6) = \sum_{u_5} W(y_{12}|u_5+u_6)W(y_3,y_5|u_5)$. We call this function one-variable-combine. With this function, we have $l_{y_{12},y_3,y_5} = l_{y_{12}} \diamond l_{y_3,y_5} = l_1^{-1} l_2 \diamond l_3 l_5$.

The left part of (9) is obtained by setting $u_5 = u_6$ and $u_5 = u_6 + 1$ in each channel expression of the numerator and denominator, respectively, of the left part of (8). The right part of (9) is obtained by applying zero-variable-combine, with $W(y_{12},y_3,y_5,y_6|u_6) = W(y_{12},y_3,y_5|u_6)W(y_6|u_6)$, and then the fundamental step to the right part of (8). Implementing the fundamental step in the left part of (9) and setting $u_6 = 0$ and $u_6 = 1$ in both the left and right parts of (9), we get (10). Replacing $l_1$ by $l_1^{1-2a_1}$, $l_2$ by $l_2^{1-2a_2}$ and $l_3$ by $l_3^{1-2a_3}$ in (10), we get (11).

B. High-Level Description of The Algorithm

We have named the functions hide known values, zero-variable-combine, extend method, one-variable-combine and fundamental step in the description of Example 2. For a complete description of the $l$-expressions generating algorithm, we need two more functions: standard expression transform and symmetric expression transform. These two functions are not necessary, but they can simplify the final $l$-expressions. Also, we use two simplifications to denote zero-variable-combine and one-variable-combine together.

Based on these functions, we give a high-level description of the $l$-expressions generating procedure in Algorithm 1. In Algorithm 1, every step works on the result of the previous step. After one pass of the while loop, the number of variables in the expressions is reduced by at least 1 (as from the left part of (8) to the left part of (9), where $u_5^6$ reduces to $u_6$). So the algorithm stops after at most $m-i+1$ while loops.

Algorithm 1: Bit-channel likelihood expression generating procedure
  input: A lower-triangular kernel matrix $G_m$, index $i$ and channel output likelihood ratios $l_1,\cdots,l_m$
  output: Bit-channel likelihood expression $l_m^{(i)}$ as a function of $l_1,\cdots,l_m$
  // The algorithm starts from $l_m^{(i)}$ in the form (4)
  early processing: implement hide known values and standard expression transform on $l_m^{(i)}$
  while $i+1 \le m$ do
    for each sub-expression do
      implement two simplifications as much as possible
      implement symmetric expression transform
      implement extend method
      implement fundamental step
    end for
  end while

C. Details of functions

1) Hide known values: Given a kernel $G_m$, let $G_A$ and $G_B$ be the submatrices of $G_m$ consisting of the first $i-1$ rows and the last $m-i+1$ rows, respectively.

Remember that our basic task is to simplify (3). Let $a_j = (u_1^{i-1} G_A)_j$, $j = 1,\cdots,m$. We have

$$W\big(y_j|(u_1^m G_m)_j\big) = W\big(y_j|a_j + (u_i^m G_B)_j\big).$$

Since the $a_j$ are known values, we can replace (3) by the following expression:

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W\big(y_1|(u^m_{i,u_i=0} G_B)_1\big) \cdots W\big(y_m|(u^m_{i,u_i=0} G_B)_m\big)}{\sum_{u_{i+1}^m} W\big(y_1|(u^m_{i,u_i=1} G_B)_1\big) \cdots W\big(y_m|(u^m_{i,u_i=1} G_B)_m\big)} \tag{12}$$

Proposition 1 (Hide known values): Assume we get the expression $l^{(i)}_{m,(12)} = f_i(l_1,\cdots,l_m)$ from (12) by Algorithm 1. Then the real expression of $l_m^{(i)}$ using (3) is $l^{(i)}_{m,(3)} = f_i(l_1^{(1-2a_1)},\cdots,l_m^{(1-2a_m)})$. $\square$

Proof: For an arbitrary $j \in \{1,\cdots,m\}$,

$$l_{j,(3)} = \frac{W(y_j|a_j + 0)}{W(y_j|a_j + 1)} = \left(\frac{W(y_j|0)}{W(y_j|1)}\right)^{1-2a_j} = (l_{j,(12)})^{1-2a_j}$$

where $l_{j,(3)}$ and $l_{j,(12)}$ denote the likelihood ratio of $y_j$ as it appears in (3) and (12), respectively.

Based on Proposition 1, we implement the algorithm on (12) instead of (3). After we get the final $l$-expressions, we replace $l_j$ by $l_j^{1-2a_j}$ for each $j = 1,\cdots,m$.

2) Standard Expression Transform:

Definition 1 (Standard expression): A standard likelihood expression $l_m^{(i)}$ has the following form:

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} \cdots W(y_{i+1}|u_{i+1}) \cdots W(y_m|u_m)}{\sum_{u_{i+1}^m} \cdots W(y_{i+1}|u_{i+1}) \cdots W(y_m|u_m)}; \tag{13}$$

that is, $\mathbf{u}_j = u_j$, $j = i+1,\cdots,m$, where $\mathbf{u}_j$ denotes the GF(2) expression at position $j$ as in (4).

Lemma 1 (Standard expression transform): A likelihood expression $l_m^{(i)}$ defined by a lower triangular matrix $G_m$ can be transformed to a standard expression.

Proof: First, we give an example of the standard expression transform. Consider the lower triangular matrix (rows 3, 4, 5 correspond to $u_3$, $u_4$, $u_5$)

$$G_5 = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 1
\end{pmatrix}.$$

By a linear transform of rows 3, 4, 5 of $G_5$, we get (rows 3, 4, 5 now correspond to $t_3$, $t_4$, $t_5$)

$$\acute{G}_5 = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1
\end{pmatrix}.$$

Let $l_5^{(2)}$ and $\acute{l}_5^{(2)}$ be the likelihood ratio expressions defined by $G_5$ and $\acute{G}_5$, respectively. Notice that $\acute{l}_5^{(2)}$ is a standard expression. Therefore, we only need to show $l_5^{(2)} = \acute{l}_5^{(2)}$. Since there is a one-to-one correspondence between $u_3^5 \in \{0,1\}^3$ and $t_3^5 \in \{0,1\}^3$, it is easy to see that $l_5^{(2)} = \acute{l}_5^{(2)}$ by the definition.

The above method can be easily generalized to any $l_m^{(i)}$ of a lower triangular matrix $G_m$. It also provides a procedure to implement the standard expression transform.

In Lemma 1, we suppose the kernel $G_m$ is a lower triangular matrix, since all of the optimal kernels given in [6] are lower triangular matrices. In fact, we do not need the lower triangular assumption in Lemma 1, because, given any $G_m$ and $l_m^{(i)}$, we can always transform $G_C$ (the submatrix of $G_m$ consisting of the last $m-i$ rows) to a lower triangular form by row transformation and column rearrangement.

3) Fundamental Step:

Lemma 2 (Fundamental step): Given a likelihood ratio expression with only one difference between the numerator and denominator, such as

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)},$$

and assuming that $\mathbf{u}_1$ and $\mathbf{u}_m$ contain $u_{i+1}$, we have

$$l_m^{(i)} = l_1 \diamond \frac{\sum_{u_{i+2}^m} W(y_2|x_2) \cdots W(y_m|x_m)}{\sum_{u_{i+2}^m} W(y_2|\tilde{x}_2) \cdots W(y_m|\tilde{x}_m)} \tag{14}$$

where, in the numerator of (14), $x_k = \mathbf{u}_k + \mathbf{u}_1$ if $\mathbf{u}_k$ contains $u_{i+1}$, and $x_k = \mathbf{u}_k$ otherwise, $k = 2,\cdots,m$. In the denominator of (14), $\tilde{x}_k = \bar{x}_k$ if $\mathbf{u}_k$ contains $u_{i+1}$, and $\tilde{x}_k = x_k$ otherwise.

Proof: Let $M = W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)$ and define

$$W(y_2^m|0) \triangleq W(y_2^m|\mathbf{u}_1 = 0) \triangleq \sum_{u_{i+1}^m} M \cdot 1_{\mathbf{u}_1=0},$$
$$W(y_2^m|1) \triangleq W(y_2^m|\mathbf{u}_1 = 1) \triangleq \sum_{u_{i+1}^m} M \cdot 1_{\mathbf{u}_1=1}.$$

Then

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) M 1_{\mathbf{u}_1=0} + \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) M 1_{\mathbf{u}_1=1}}{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1) M 1_{\mathbf{u}_1=0} + \sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1) M 1_{\mathbf{u}_1=1}}$$
$$= \frac{W(y_1|0)W(y_2^m|0) + W(y_1|1)W(y_2^m|1)}{W(y_1|1)W(y_2^m|0) + W(y_1|0)W(y_2^m|1)}$$
$$= l_1 \diamond \frac{W(y_2^m|\mathbf{u}_1 = 0)}{W(y_2^m|\mathbf{u}_1 = 1)}$$
$$= l_1 \diamond \frac{\sum_{u_{i+1}^m} W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m) 1_{\mathbf{u}_1=0}}{\sum_{u_{i+1}^m} W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m) 1_{\mathbf{u}_1=1}}$$
$$= l_1 \diamond \frac{\sum_{u_{i+2}^m} W(y_2|x_2) \cdots W(y_m|x_m)}{\sum_{u_{i+2}^m} W(y_2|\tilde{x}_2) \cdots W(y_m|\tilde{x}_m)}.$$

In Lemma 2, we assume that $\mathbf{u}_1$ contains $u_{i+1}$. Then we have $x_k = \mathbf{u}_k + \mathbf{u}_1$ if $\mathbf{u}_k$ contains $u_{i+1}$, and $x_k = \mathbf{u}_k$ otherwise, $k = 2,\cdots,m$, in the numerator of (14). In fact, we do not need this assumption. Assume $\mathbf{u}_1 = u_{i_1} + u_{i_2} + \cdots + u_{i_t}$, $1 \le t \le m$, $i < i_1 < i_2 < \cdots < i_t \le m$. We can choose any $u_j$, $j \in \{i_1, i_2, \cdots, i_t\}$. Then $x_k = \mathbf{u}_k + \mathbf{u}_1$ if $\mathbf{u}_k$ contains $u_j$, and $x_k = \mathbf{u}_k$ otherwise, $k = 2,\cdots,m$, in the numerator of (14). However, we always choose $u_{i_1}$ in the algorithm.

Example 3 (Fundamental step for $l_3^{(1)}$):

$$l_3^{(1)} = \frac{\sum_{u_2^3} W(y_1|u_2+u_3)W(y_2|u_2)W(y_3|u_3)}{\sum_{u_2^3} W(y_1|\overline{u_2+u_3})W(y_2|u_2)W(y_3|u_3)}$$
$$= l_1 \diamond \frac{\sum_{u_2^3} W(y_2|u_2)W(y_3|u_3)\,1_{u_2+u_3=0}}{\sum_{u_2^3} W(y_2|u_2)W(y_3|u_3)\,1_{u_2+u_3=1}}$$
$$= l_1 \diamond \frac{\sum_{u_3} W(y_2|u_3)W(y_3|u_3)}{\sum_{u_3} W(y_2|\bar{u}_3)W(y_3|u_3)}.$$

By Lemma 2, we reduce the variables from $u_{i+1}^m$ to $u_{i+2}^m$. So if there is still only one difference between the denominator and numerator of the reduced expression (consider the left part of (8)), we can continue to implement Lemma 2. Then we get the final likelihood expression after implementing Lemma 2 $m-i$ times.

If there is more than one difference between the numerator and denominator of an expression, we can define some mid-terms to extend the expression so that each extended expression has only one difference.

Proposition 2 (Extend method): Given a likelihood ratio expression with two differences between the numerator and denominator, such as

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1)W(y_2|\bar{\mathbf{u}}_2) \cdots W(y_m|\mathbf{u}_m)},$$

we have

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)} \tag{15}$$
$$\boxtimes\ \frac{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1)W(y_2|\bar{\mathbf{u}}_2) \cdots W(y_m|\mathbf{u}_m)}. \tag{16}$$

In Proposition 2, we divide the given $l_m^{(i)}$ into the two parts (15) and (16) by the operator $\boxtimes$, and there is only one difference in each of (15) and (16). $\boxtimes$ is the same as the common multiplication $\times$; it is used specially in the extend method and its priority is lower than $\times$.

It is easy to see that the extend method can be generalized to any number of differences. However, the cost is an increase in the length of the expression.

4) Symmetric expression transform: The length of the final expression of $l_m^{(i)}$ depends on the number of differences of $l_m^{(i)}$. Using the symmetric property of the bit-channel, the number of differences can be reduced for a given $l_m^{(i)}$.

Proposition 3 (Symmetric property of bit-channel): Given a bit-channel expression

$$W_{G_m}^{(i)}(\cdot|u_i = 1) = \sum_{u_{i+1}^m} W(y_1|\bar{\mathbf{u}}_1) \cdots W(y_i|\bar{\mathbf{u}}_i) \cdots W(y_m|\mathbf{u}_m),$$

and assuming that only $\mathbf{u}_1$ and $\mathbf{u}_i$ contain $u_{i+1}$, we have

$$W_{G_m}^{(i)}(\cdot|u_i = 1) = \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) \cdots W(y_i|\mathbf{u}_i) \cdots W(y_m|\mathbf{u}_m).$$

Proof: This follows by changing the variable $u_{i+1}$ to $\bar{u}_{i+1}$. Actually, we can change any subset of $u_{i+1}^m$. Let $(u_{i_1},\cdots,u_{i_j})$ denote the changed subset. For $\mathbf{u}_k$, $k = 1,\cdots,m$, if $\mathbf{u}_k$ contains an odd number of elements of the subset, $\mathbf{u}_k$ changes to $\bar{\mathbf{u}}_k$; otherwise it does not change.

Proposition 4 (Symmetric expression transform): Given a likelihood ratio expression $l_m^{(i)}$ in the form (4), we apply the symmetric property of the bit-channel to all subsets of $u_{i+1}^m$ for the denominator of $l_m^{(i)}$ and obtain $2^{m-i}$ equivalent likelihood ratio expressions. Assume that $\acute{l}_m^{(i)}$ has the least number of differences among these expressions. Then we replace $l_m^{(i)}$ by $\acute{l}_m^{(i)}$.

Proposition 4 describes a procedure to ensure that $l_m^{(i)}$ has the least number of differences. To do that, it needs $2^{m-i}-1$ tests, which is acceptable for small kernel sizes. Actually, by our experiments, we just need to test the one-element and two-element subsets of $u_{i+1}^m$ to acquire the equivalent expression of $l_m^{(i)}$ with the least number of differences, up to kernel size $m = 16$.

5) Two Simplifications: In this section, we propose two obvious ways to simplify the expressions.

Proposition 5 (Zero-variable-combination): Given a likelihood ratio expression of the form

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_1) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\tilde{\mathbf{u}}_1)W(y_2|\tilde{\mathbf{u}}_1) \cdots W(y_m|\tilde{\mathbf{u}}_m)},$$

we have

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_{12}|\mathbf{u}_1) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_{12}|\tilde{\mathbf{u}}_1) \cdots W(y_m|\tilde{\mathbf{u}}_m)},$$

where $W(y_{12}|\mathbf{u}_1) = W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_1)$. Then in the final likelihood expression, $l_{y_{12}} = l_1 l_2$.

Proposition 6 (One-variable-combination): Given a likelihood ratio expression of the form

$$l_m^{(i)} = \frac{\sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+1}^m} W(y_1|\tilde{\mathbf{u}}_1)W(y_2|\mathbf{u}_2) \cdots W(y_m|\tilde{\mathbf{u}}_m)}$$

and assuming $\mathbf{u}_2 = u_{i+1}$, $\mathbf{u}_2 \subset \mathbf{u}_1$ and $\mathbf{u}_2 \cap \mathbf{u}_k = \emptyset$, $k = 3,\cdots,m$, we have

$$l_m^{(i)} = \frac{\sum_{u_{i+2}^m} W(y_{12}|\mathbf{u}_1+\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)}{\sum_{u_{i+2}^m} W(y_{12}|\tilde{\mathbf{u}}_1+\mathbf{u}_2) \cdots W(y_m|\tilde{\mathbf{u}}_m)}$$

where $W(y_{12}|\mathbf{u}_1+\mathbf{u}_2) = \sum_{u_2} W(y_1|\mathbf{u}_1)W(y_2|\mathbf{u}_2)$. Then in the final likelihood expression, $l_{y_{12}} = l_1 \diamond l_2$. $\square$

Example 4 (One-variable-combination for $l_3^{(1)}$):

$$l_3^{(1)} = \frac{\sum_{u_2^3} W(y_1|u_2+u_3)W(y_2|u_2)W(y_3|u_3)}{\sum_{u_2^3} W(y_1|\overline{u_2+u_3})W(y_2|u_2)W(y_3|u_3)}
= \frac{\sum_{u_3} W(y_{12}|u_3)W(y_3|u_3)}{\sum_{u_3} W(y_{12}|\bar{u}_3)W(y_3|u_3)}$$

where $W(y_{12}|u_3) = \sum_{u_2} W(y_1|u_2+u_3)W(y_2|u_2)$.

TABLE I
$C_l(m)$ FOR DIFFERENT KERNELS

  m          2      3      4      5      6      7      8
  Exponent   0.5    0.42   0.5    0.43   0.451  0.457  0.5
  C_l(m)     1.0    1.0    1.0    1.0    1.2    2.0    3.6

  m          9      10     11     12     13     14     15      16
  Exponent   0.461  0.469  0.477  0.492  0.488  0.494  0.5008  0.518
  C_l(m)     3.8    6      17     49     95     278    793     2487
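Proposition 1 and the final form (11) of Example 2 can be checked against definition (3) for every choice of the known inputs $u_1, u_2, u_3$. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function names `l64_with_known` and `brute_force_l64` are hypothetical, the channel values are drawn at random, and the kernel is the $G_6$ of Example 1.

```python
import itertools
import random

# G_6 from Example 1; row r corresponds to input u_{r+1}.
G6 = [
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]


def diamond(a, b):
    """a <> b = (a b + 1) / (a + b)."""
    return (a * b + 1.0) / (a + b)


def l64_with_known(l, u1, u2, u3):
    """Equation (11): expression (10) with l_j replaced by l_j^(1 - 2 a_j)."""
    l1, l2, l3, l4, l5, l6 = l
    a1 = (u1 + u2 + u3) % 2  # known part of position 1 (sum over GF(2))
    a2, a3 = u2, u3          # known parts of positions 2 and 3
    l1 = l1 ** (1 - 2 * a1)
    l2 = l2 ** (1 - 2 * a2)
    l3 = l3 ** (1 - 2 * a3)
    left = diamond(l1, l2 * diamond(l3 * l5, l4 * l6))    # from sub-expression (6)
    right = diamond(l4, diamond(l2 / l1, l3 * l5) * l6)   # from sub-expression (7)
    return left * right                                   # boxtimes is ordinary multiplication


def brute_force_l64(W, u1, u2, u3):
    """Definition (3) for i = 4, summing over u5 and u6."""
    sums = []
    for u4 in (0, 1):
        total = 0.0
        for u5, u6 in itertools.product((0, 1), repeat=2):
            u = [u1, u2, u3, u4, u5, u6]
            x = [sum(u[r] * G6[r][k] for r in range(6)) % 2 for k in range(6)]
            p = 1.0
            for k in range(6):
                p *= W[k][x[k]]
            total += p
        sums.append(total)
    return sums[0] / sums[1]


random.seed(1)
W = [(random.uniform(0.05, 1.0), random.uniform(0.05, 1.0)) for _ in range(6)]
l = [w0 / w1 for w0, w1 in W]

for known in itertools.product((0, 1), repeat=3):
    print(known, brute_force_l64(W, *known), l64_with_known(l, *known))
```

The loop covers all eight values of $(u_1, u_2, u_3)$, so it exercises both signs of each exponent $1-2a_j$.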
IV. REDUCE COMPLEXITY BY W-EXPRESSIONS

In this section, we propose our method to generate the $W$-expressions of $W_{G_m}^{(i)}(\cdot|u_i)$, $i = 1,\cdots,m$, for an arbitrary kernel $G_m$. Firstly, we analyse the complexity of the $l$-expressions based SC decoder and show that the $l$-expressions method is not acceptable for larger kernels such as $m = 16$. Secondly, we analyse one drawback of the $l$-expressions method and propose a $W$-expressions method to overcome it. Then an example of $W$-expressions is given to make the method clearer. Finally, the complexity analysis of the $W$-expressions based SC decoder is given; it contains our main achievement.

A. Complexity Analysis of $l$-expressions

Let $C_l(m)$ denote the average length of the $l$-expressions for a kernel $G_m$. Actually, $C_l(m)$ is the average number of sub-expressions. For a kernel $G_m$, the complexity of calculating a sub-expression is $O(m)$, and $m\,C_l(m)$ sub-expressions need to be computed for the inside kernel calculation. So the cost of the inside kernel calculation is $O(m^2 \cdot C_l(m))$. Because the inside kernel calculation needs to be implemented $N\log_m N/m$ times, the complexity of the $l$-expressions based SC decoder is $O(C_l(m) \cdot mN\log_m N)$ for a general kernel $G_m$.

Table I gives the results of $C_l(m)$ obtained by implementing Algorithm 1 for kernels up to $m = 16$. It can be seen that the $l$-expressions method is good for small kernels such as $m \le 10$. However, $C_l(m)$ increases very fast with the kernel size $m$. Actually, $G_{16}$ is the first kernel which obtains significant advantages in terms of error performance compared with $G_2$. But $C_l(16)$ is about 2487 times $C_l(2)$. This means that the complexity of the $G_{16}$ based SC decoder is about $2487 \cdot 16/(2 \cdot \log_2 16) = 4974$ times that of the $G_2$ based SC decoder with the same block length. This is not acceptable.

B. Reduce Complexity by $W$-expressions

It was shown in the previous subsection that the complexity of the $l$-expressions based SC decoder is unacceptable for large kernels. One problem is that we cannot implement the two simplifications in some cases because of the relation between the numerator and denominator of $l_m^{(i)}$. For this reason, a method called $W$-expressions is proposed to further reduce the complexity of the SC decoder by considering the numerator and denominator of $l_m^{(i)}$ separately.

In $W$-expressions, we focus on the following equation:

$$W_{G_m}^{(i)}(\cdot|u_i) = \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) \cdots W(y_m|\mathbf{u}_m) \tag{17}$$

where $\mathbf{u}_k$ denotes the GF(2) expression at position $k$, as in (4).

Definition 2: Let $B(y_i) = (B(y_i|0), B(y_i|1))$. Define

$$B(y_i)^{-1} = (B(y_i|1), B(y_i|0))$$
$$S(B(y_i)) = B(y_i|0) + B(y_i|1)$$
$$B(y_i) \cdot B(y_j) = \big(B(y_i|0)B(y_j|0),\ B(y_i|1)B(y_j|1)\big)$$
$$B(y_i) \diamond B(y_j) = \big(B(y_i|0)B(y_j|0) + B(y_i|1)B(y_j|1),\ B(y_i|0)B(y_j|1) + B(y_i|1)B(y_j|0)\big)$$

Lemma 3 (Fundamental step for $W$-expressions): Given a channel expression $W_{G_m}^{(i)}(\cdot|u_i)$ as in (17), and assuming that $\mathbf{u}_1$ and $\mathbf{u}_m$ contain $u_{i+1}$, we have

$$W_{G_m}^{(i)}(\cdot|u_i) = \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) \cdots W(y_i|\mathbf{u}_i) \cdots W(y_m|\mathbf{u}_m)
= S\Big(B(y_1) \cdot \Big(\sum_{u_{i+2}^m} W(y_2|x_2) \cdots W(y_m|x_m),\ \sum_{u_{i+2}^m} W(y_2|\tilde{x}_2) \cdots W(y_m|\tilde{x}_m)\Big)\Big) \tag{18}$$

where, in the first (upper) component of (18), $x_k = \mathbf{u}_k + \mathbf{u}_1$ if $\mathbf{u}_k$ contains $u_{i+1}$, and $x_k = \mathbf{u}_k$ otherwise, $k = 2,\cdots,m$. In the second (lower) component of (18), $\tilde{x}_k = \bar{x}_k$ if $\mathbf{u}_k$ contains $u_{i+1}$, and $\tilde{x}_k = x_k$ otherwise.

Proof: Let $M = W(y_2|\mathbf{u}_2) \cdots W(y_m|\mathbf{u}_m)$ and define

$$W(y_2^m|0) \triangleq W(y_2^m|\mathbf{u}_1 = 0) \triangleq \sum_{u_{i+1}^m} M \cdot 1_{\mathbf{u}_1=0},$$
$$W(y_2^m|1) \triangleq W(y_2^m|\mathbf{u}_1 = 1) \triangleq \sum_{u_{i+1}^m} M \cdot 1_{\mathbf{u}_1=1}.$$

Then

$$W_{G_m}^{(i)}(\cdot|u_i) = \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) M 1_{\mathbf{u}_1=0} + \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) M 1_{\mathbf{u}_1=1}$$
$$= W(y_1|0)W(y_2^m|0) + W(y_1|1)W(y_2^m|1)$$
$$= S\big(B(y_1) \cdot (W(y_2^m|0), W(y_2^m|1))\big)$$
$$= S\Big(B(y_1) \cdot \Big(\sum_{u_{i+2}^m} W(y_2|x_2) \cdots W(y_m|x_m),\ \sum_{u_{i+2}^m} W(y_2|\tilde{x}_2) \cdots W(y_m|\tilde{x}_m)\Big)\Big).$$

From (17) to (18), we separate $B(y_1)$ from (17), and the two remaining parts are in the same form as (17). Therefore, we can use Lemma 3 repeatedly and obtain an expression of $W_{G_m}^{(i)}(\cdot|u_i)$ as a function of $B(y_1),\cdots,B(y_m)$.

The two remaining parts in (18) are called sub-expressions for $W$-expressions. So the length of the $W$-expressions increases by 1 after implementing the fundamental step from (17) to (18). For $W$-expressions, hide known values and standard expression transform are the same as for $l$-expressions. In the final $W$-expressions, we have $B(y_{12}) = B(y_1) \cdot B(y_2)$ for zero-variable-combination and $B(y_{12}) = B(y_1) \diamond B(y_2)$ for one-variable-combination. The symmetric expression transform and the extend method are not needed for $W$-expressions.

Based on the above, we give the $W$-expressions generating procedure in Algorithm 2.

Algorithm 2: $W$-expressions generating procedure
  input: A kernel matrix $G_m$, index $i$ and channel outputs $B(y_1),\cdots,B(y_m)$
  output: $W_{G_m}^{(i)}(\cdot|u_i)$ as a function of $B(y_1),\cdots,B(y_m)$
  // Starts from $W_{G_m}^{(i)}(\cdot|u_i) = \sum_{u_{i+1}^m} W(y_1|\mathbf{u}_1) \cdots W(y_m|\mathbf{u}_m)$
  early processing: implement hide known values and standard expression transform on $W_{G_m}^{(i)}(\cdot|u_i)$
  while $i+1 \le m$ do
    for each sub-expression do
      implement two simplifications as much as possible
      implement fundamental step for $W$-expressions
    end for
  end while

C. Illustration Example for $W$-expressions

Example 5:

$$W_{G_6}^{(4)}(y_1^6, u_1^3|1) = \sum_{u_5^6} W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)W(y_3|u_5)W(y_4|\bar{u}_6)W(y_5|u_5)W(y_6|u_6) \tag{19}$$
$$= \sum_{u_5^6} W(y_1,y_2|u_5+u_6)W(y_3,y_5|u_5)W(y_4,y_6|u_6) \tag{20}$$
$$= \sum_{u_6} W(y_1,y_2,y_3,y_5|u_6)W(y_4,y_6|u_6) \tag{21}$$
$$= \sum_{u_6} W(y_1,y_2,y_3,y_5,y_4,y_6|u_6) \tag{22}$$
$$= S\big((B(y_1)^{-1} \cdot B(y_2)) \diamond (B(y_3) \cdot B(y_5)) \cdot (B(y_4)^{-1} \cdot B(y_6))\big) \tag{23}$$

$$W_{G_6}^{(4)}(y_1^6, u_1^3|0) = \sum_{u_5^6} W(y_1|u_5+u_6)W(y_2|u_5+u_6)W(y_3|u_5)W(y_4|u_6)W(y_5|u_5)W(y_6|u_6) \tag{24}$$
$$= S\big((B(y_1) \cdot B(y_2)) \diamond (B(y_3) \cdot B(y_5)) \cdot (B(y_4) \cdot B(y_6))\big) \tag{25}$$

In Example 5, we use $W_{G_6}^{(4)}(\cdot|u_4)$ to illustrate the $W$-expressions method. Based on definition (2), we get (19). Then, defining $W(y_1,y_2|u_5+u_6) = W(y_1|\overline{u_5+u_6})W(y_2|u_5+u_6)$, $W(y_3,y_5|u_5) = W(y_3|u_5)W(y_5|u_5)$ and $W(y_4,y_6|u_6) = W(y_4|\bar{u}_6)W(y_6|u_6)$, we get (20). Then we have $B(y_1,y_2) = B(y_1)^{-1} \cdot B(y_2)$, $B(y_3,y_5) = B(y_3) \cdot B(y_5)$ and $B(y_4,y_6) = B(y_4)^{-1} \cdot B(y_6)$. These are zero-variable-combine steps.

(21) is obtained by defining $W(y_1,y_2,y_3,y_5|u_6) = \sum_{u_5} W(y_1,y_2|u_5+u_6)W(y_3,y_5|u_5)$. This is one-variable-combine. Then $B(y_1,y_2,y_3,y_5) = B(y_1,y_2) \diamond B(y_3,y_5)$.

Using zero-variable-combine again, we get (22). Then $B(y_1,y_2,y_3,y_5,y_4,y_6) = B(y_1,y_2,y_3,y_5) \cdot B(y_4,y_6)$.

Implementing the fundamental step for $W$-expressions, we get (23).

Similarly to $W_{G_6}^{(4)}(\cdot|1)$, we give the $W$-expression of $W_{G_6}^{(4)}(\cdot|0)$ in (25).

Comparing Example 5 with Example 2, it can be seen that the $W$-expressions reduce the length of the expressions from 2 to 1. By our experiments, $W$-expressions offer significant advantages in terms of the length of expressions for larger kernels.

TABLE II
$C_W(m)$ FOR DIFFERENT KERNELS

  m          2     3     5     6     7     8     9
  C_W(m)     1     1     1     1     1.1   1.4   1.6

  m          10    11    12    13    14    15    16
  C_W(m)     2.1   2.5   3     4     4.9   6.1   6.7

D. Complexity Analysis of $W$-expressions

In Example 1, it can be seen that $l_3 l_5$ needs to be calculated only once for $l_6^{(4)}$. We call this inside expression simplification. For the complexity analysis of $l$-expressions, we do not consider the inside expression simplification, since it gives no significant complexity reduction. However, considering the inside expression simplification gives a significant complexity reduction for $W$-expressions. For example, the length of the $W$-expression of $W_{G_{16}}^{(5)}$ is 512. But we only need to calculate 16 sub-expressions, since the other sub-expressions are repetitions of these 16 sub-expressions.

Let $C_W(m)$ denote the average length of the generated $W$-expressions for a general kernel $G_m$. Then the complexity of the $W$-expressions based SC decoder is $O(C_W(m) \cdot mN\log_m N)$ for the $G_m$ based polar code. Table II gives the results of $C_W(m)$ obtained by using Algorithm 2 for the optimal kernels [6] up to $m = 16$. It means that the complexity of the $W$-expressions based SC decoder is $O(m^2 N\log N)$ when $m \le 16$.

[Fig. 2. Block error rate versus $E_b/N_0$ (dB) for SC decoding with the $G_2^{\otimes 12}$ polar code on the BPSK-modulated Gaussian channel, constructed using the Tal-Vardy method [14] and the Monte Carlo method [1] at $E_b/N_0 = 2$ dB. The code rate is 0.5.]

[Fig. 3. $P_{fer}$ (upper bound on the probability of block error) versus $E_b/N_0$ (dB) for SC decoding with $G_2^{\otimes 4}$ and $G_{16}^{\otimes 1}$ polar codes on the BPSK-modulated Gaussian channel, constructed using the GA-DE method [9] and the Monte Carlo method [1] at $E_b/N_0 = 2$ dB. The code rate is 0.5.]
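The pair operations of Definition 2 and the $W$-expressions (23) and (25) of Example 5 lend themselves to a direct numerical check. The following is a hedged sketch, not the authors' code: the names `inv`, `dot`, `diamond`, `S`, `w_expr_G6_bit4` and `brute_force_sum` are illustrative, the pairs $B(y_k) = (W(y_k|0), W(y_k|1))$ are drawn at random, the known inputs $u_1^3$ are hidden (set to zero), and the constant factor $1/2^{m-1}$ of (2) is dropped, as it is in (19) and (24).

```python
import itertools
import random

# G_6 from Example 1; row r corresponds to input u_{r+1}.
G6 = [
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]


def inv(b):
    """B^{-1}: swap the two components."""
    return (b[1], b[0])


def dot(a, b):
    """B(y_i) . B(y_j): component-wise product (zero-variable-combine)."""
    return (a[0] * b[0], a[1] * b[1])


def diamond(a, b):
    """B(y_i) <> B(y_j): sum out one shared binary variable (one-variable-combine)."""
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def S(b):
    """S(B) = B(.|0) + B(.|1)."""
    return b[0] + b[1]


def w_expr_G6_bit4(B, u4):
    """Expression (25) for u4 = 0 and expression (23) for u4 = 1."""
    B1, B2, B3, B4, B5, B6 = B
    if u4 == 1:
        # Positions 1 and 4 contain u4, so their pairs are swapped.
        B1, B4 = inv(B1), inv(B4)
    # Operator priority: diamond binds tighter than the dot product.
    return S(dot(diamond(dot(B1, B2), dot(B3, B5)), dot(B4, B6)))


def brute_force_sum(B, u4):
    """Unnormalised sum in (19)/(24) over u5 and u6."""
    total = 0.0
    for u5, u6 in itertools.product((0, 1), repeat=2):
        u = [0, 0, 0, u4, u5, u6]
        x = [sum(u[r] * G6[r][k] for r in range(6)) % 2 for k in range(6)]
        p = 1.0
        for k in range(6):
            p *= B[k][x[k]]
        total += p
    return total


random.seed(2)
B = [(random.uniform(0.05, 1.0), random.uniform(0.05, 1.0)) for _ in range(6)]

for u4 in (0, 1):
    print(u4, brute_force_sum(B, u4), w_expr_G6_bit4(B, u4))
```

Each of the two $W$-expressions is a single chain of pair operations, which is the sense in which the length drops from 2 (for $l_6^{(4)}$ in Example 2) to 1; the ratio of the two results is the likelihood ratio $l_6^{(4)}$.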
V. CODES CONSTRUCTION

Two methods are used to construct polar codes with high-dimensional kernels. One is the Monte Carlo method [1]; the other is the Gaussian Approximation based density evolution (GA-DE) method [9].

A. Monte Carlo method

Arıkan [1, Section IX] provides a Monte Carlo approach for constructing polar codes. The Monte Carlo approach assumes that the all-zero codeword is transmitted. Firstly, for a bit-channel $W_N^{(i)}$, $i \in \{1,\cdots,N\}$, it assumes that $u_1^{i-1} = 0_1^{i-1}$ are known values. Then it uses the SC decoder to evaluate the reliability of $W_N^{(i)}$. Finally, based on the reliabilities of $W_N^{(i)}$, $i \in \{1,\cdots,N\}$, it chooses the best bit-channels as the information set $\mathcal{A}$, which defines the polar code.

Fig. 2 compares the error performance of $G_2^{\otimes 12}$ polar codes constructed by the Monte Carlo method and by the Tal-Vardy method [14]. The Tal-Vardy method is considered the optimal construction method [14]. Fig. 2 shows that the Monte Carlo method achieves the same error performance as the Tal-Vardy method. Therefore, it is conceivable that the Monte Carlo approach is an optimal method for constructing polar codes.

B. Gaussian Approximation

A first efficient construction of polar codes in the general case, based on density evolution [10], was given by Mori and Tanaka [11]. Then Trifonov demonstrated that polar codes can be efficiently constructed using the GA-DE method [12].

With $l$-expressions, it is straightforward to construct polar codes by using the GA-DE method. However, by our experiments, polar codes constructed by the GA-DE method become inaccurate as the kernel size $m$ becomes larger.

Let $W_1, W_2, \ldots, W_N$ be the bit-channels and let $P_e(W_i)$ denote the probability of error on the $i$th bit-channel [14]. Then a union bound on the frame error rate of polar codes is $P_{fer} \le \sum_{i \in \mathcal{A}} P_e(W_i)$, where $\mathcal{A}$ is the information set of the code [1].

Fig. 3 shows $P_{fer}$ versus $E_b/N_0$ under the GA-DE and Monte Carlo methods. For the small block length $N = 16$, the Monte Carlo method can be considered an accurate method for computing $P_e(W_i)$. So Fig. 3 confirms that polar codes constructed by the GA-DE method become inaccurate as the kernel size $m$ grows.

REFERENCES

[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051-3073, Jul. 2009.
[2] E. Arıkan and I. E. Telatar, "On the rate of channel polarization," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Seoul, Korea, Jun./Jul. 2009, pp. 1493-1495.
[3] S. B. Korada, E. Sasoglu, and R. Urbanke, "Polar codes: Characterization of exponent, bounds, and constructions," IEEE Trans. Inf. Theory, vol. 56, no. 12, pp. 6253-6264, Dec. 2010.
[4] R. Mori and T. Tanaka, "Source and channel polarization over finite fields and Reed-Solomon matrices," IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2720-2736, May 2014.
[5] N. Presman, O. Shapira, S. Litsyn, T. Etzion, and A. Vardy, "Binary polarization kernels from code decompositions," IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2227-2239, May 2015.
[6] H. Lin, S. Lin, and K. A. S. Abdel-Ghaffar, "Linear and nonlinear binary kernels of polar codes of small dimensions with maximum exponents," IEEE Trans. Inf. Theory, vol. 61, no. 10, pp. 5253-5270, Oct. 2015.
[7] G. Bonik, S. Goreinov, and N. Zamarashkin, "Construction and analysis of polar code and concatenated polar codes: practical approach," Jul. 2012, [Online] Available: http://arxiv.org/abs/1207.4343
[8] X. Wang, Z. Zhang, and L. Zhang, "On the SC decoder for any polar code of length $N = l^n$," in Proc. IEEE Wireless Communications and Networking Conference (WCNC), Istanbul, Turkey, Apr. 2014, pp. 485-489.
[9] S.-Y. Chung, T. J. Richardson, R. L. Urbanke, "Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation," IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 657-670, Feb. 2001.
[10] T. J. Richardson and R. L. Urbanke, Modern Coding Theory. Cambridge, U.K.: Cambridge Univ. Press, 2008.
[11] R. Mori and T. Tanaka, "Performance and construction of polar codes on symmetric binary-input memoryless channels," in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Seoul, Korea, Jun./Jul. 2009, pp. 1496-1500.
[12] P. Trifonov, "Efficient design and decoding of polar codes," IEEE Trans. Commun., vol. 60, no. 11, pp. 3221-3227, Nov. 2012.
[13] I. Tal and A. Vardy, "List decoding of polar codes," IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213-2226, May 2015.
[14] I. Tal and A. Vardy, "How to construct polar codes," IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6562-6582, Oct. 2013.

