
Automatic Generation of Typographic Font from a Small Font Subset

Tomo Miyazaki, Tatsunori Tsuchiya, Yoshihiro Sugaya, Shinichiro Omachi, Masakazu Iwamura, Seiichi Uchida, and Koichi Kise

Abstract—This paper addresses the automatic generation of a typographic font from a subset of characters. Specifically, we use a subset of a typographic font to extrapolate additional characters. Consequently, we obtain a complete font containing a number of characters sufficient for daily use. The automated generation of Japanese fonts is in high demand because a Japanese font requires over 1,000 characters. Unfortunately, professional typographers create most fonts, resulting in significant financial and time investments for font generation. The proposed method can take over the majority of the font creation work, because designers no longer need to create most of the characters of a new font by hand. The proposed method uses strokes from given samples for font generation. The strokes, from which we construct characters, are extracted by exploiting a character skeleton dataset. This study makes three main contributions: a novel method of extracting strokes from characters, which is applicable to both standard fonts and their variations; a fully automated approach for constructing characters; and a selection method for sample characters. We demonstrate our proposed method by generating 2,965 characters in 47 fonts. Objective and subjective evaluations verify that the generated characters are similar to handmade characters.

Index Terms—Font generation, Active shape model, Computer aided manufacturing, Kanji character.

I. INTRODUCTION

Automatic generation of Japanese fonts is important for lowering the costs associated with creating kanji¹ characters. All of the characters in a font need to be designed in a particular style and size, a process generally performed by humans. Moreover, font creation is professional and time-consuming work.

¹Kanji are Chinese characters adapted for the Japanese language.
The cost problem is especially crucial in languages that contain a very large number of characters, such as Chinese, Korean, and Japanese. Most commercial fonts contain at least 1,006 kanji characters, which are defined as those in daily use by the Japanese Industrial Standards Committee². Chinese and Korean each use more than 6,000 characters. These numbers are far greater than those of other major languages, e.g., 26 characters in English, 27 in Spanish, 49 in Hindi, 28 in Arabic, and 33 in Russian. Thus, it takes a few years even for a professional designer to create a font.

²http://www.jisc.go.jp

Previous work on character generation is summarized in Table I. We categorized previous studies by their inputs and outputs into three problem settings. The first problem setting is to generate a font using parameters, for handwritten [1]–[6] or typographic fonts [7]–[12]. The second problem setting is to generate a font by blending several complete fonts³, for handwritten [13]–[15] or typographic fonts [16]–[19]. The third problem setting is to generate a font by extrapolating characters given a subset of a handwritten [20]–[27] or typographic font [28]. Each problem setting has been an important research topic.

³A complete font contains a sufficient number of characters for daily use, e.g., 26 for English and over a thousand for Japanese.

In this study, we address the problem of generating a typographic font from a given subset of characters. From a subset of a typographic font that includes only a small number of characters, we generate the missing characters to obtain a complete typographic font. Although the proposed method requires a subset of a particular typographic font, the burden of creating a subset of a font is much less than creating an entire font. This benefit is highlighted in languages containing many characters, such as Chinese and Japanese. The proposed method uses sample characters to extract strokes and constructs characters by deploying them. Fig. 1 illustrates an overview of the proposed method, which begins with stroke extraction from sample characters. Strokes are extracted using the skeletons of the samples. We take the skeletons of a target character from the skeleton dataset and transform them into the structure of the target font. Finally, we select some strokes and deploy them on the target skeletons.

The proposed method accepts samples in an image format. For practicality, the ideal font generation method must accept characters in an image format and require only a small subset of the font as input. First, we consider the image format. Generating characters from samples in an image format is far more useful than from those in a vector format, because image formats are more convenient for sample collection. For example, when users encounter text in an unknown font, it is difficult to find samples in a vector format, whereas an image is available immediately upon taking a photo of the text. Second, sample collection is time-consuming work and could be an obstacle to the popularization of a font generation method if too many samples are needed.

Our automatic generation framework makes three main contributions, each of which is briefly introduced below and described in detail in subsequent sections.

The first contribution of this study is a stroke extraction method. We propose a method that extracts character strokes by applying a novel active contour model that determines the boundaries of the strokes to extract. Some strokes are damaged because of the complexity of characters; therefore, we incorporate a restoration process for fixing these damaged strokes.

The second contribution of this study is a method for the automatic construction of characters in two phases: modifying skeletons toward the target font and deploying strokes on them. The proposed method extracts the transformation parameters. With these parameters, we can unify the skeletons in size and geometry. Because our method transforms skeletons from a standard font (such as Mincho) to fit the target font, it is unnecessary to build skeleton datasets for a large number of fonts. This approach helps us with variety and scalability; we can generate as many characters as the skeletons in the dataset will allow (210,000 characters are available at this time). For automation, we define an energy function to deploy a stroke so that an appropriate one can be added to the skeleton.

The third contribution of this study is a sample selection method. We address which characters are suitable for the proposed method and how we can select them based on the energy function used in character generation. Finding samples is a combinatorial problem, and it is difficult to obtain a solution analytically.
Therefore, we apply a genetic algorithm to find an approximate solution.

TABLE I
CATEGORIES OF RELATED WORK

Input (category)                                  | Handwriting font                        | Typography font
Parameters (parameter-based methods)              | [1]–[6]: {θ, ...} → {A, ..., Z}          | [7]–[12]: {θ, ...} → {A, ..., Z}
Complete fonts (font blending methods)            | [13]–[15]: {A, ......, Z} → {A, ..., Z}  | [16]–[19]: {A, ......, Z} → {A, ..., Z}
An incomplete font (character extrapolation)      | [20]–[27]: {A, B} → {A, ..., Z}          | [28], this paper: {A, B} → {A, ..., Z}

Fig. 1. Overview of the proposed method.

II. RELATED WORK

In this section, we review the relevant work categorized in Table I.

Parameter-based methods: The methods in this category use parameters acquired by analyzing fonts. One of the most famous methods is Metafont [7], which uses the shapes of pen tips and pen-movement paths as parameters for character generation. [8] was inspired by Metafont. Further, [9], [10] analyzed characters in the Times font family to extract parameters. Character properties, such as thickness and aspect ratio, are changed using these parameters. A method for generating Japanese characters was addressed by Tanaka et al. [11], who created a database of skeletons of Japanese characters, including kanji. The characters were generated by placing thick lines on the skeleton data. A famous noncommercial font⁴ was created by applying this method. Kamichi extended the skeleton database in [12].

⁴https://www.tanaka.ecc.u-tokyo.ac.jp/ktanaka/Font/

There have also been attempts to analyze handwritten characters. Bayoudh et al. used a Freeman chain code to analyze alphabetical characters [1]. Djioua and Plamondon used a Sigma-lognormal model supported by the kinematic theory [2]; an interactive system was developed so that the user can easily fit a Sigma-lognormal model to alphabetical characters. Wada et al. extracted the trajectories of alphabetical characters and replaced them using a genetic algorithm [3]. Zheng and Doermann adopted a thin-plate spline to model an alphabetical character and generated a new alphabetical character by calculating the intermediate of the two [4]. Handwriting models for robot arms were developed in [5], [6].

Parameter-based methods generate clean characters. However, they are only applicable to particular fonts that have been analyzed by humans. For instance, the method in [11] accepts only two fonts: Mincho and Gothic. This drawback comes from the methods' requirement for precise analysis. In addition, the methods cannot be applied to Japanese fonts, since over a thousand characters would need to be parameterized. Moreover, special equipment, such as an interactive system or a device for acquiring trajectories, is required to accurately parameterize alphabetical characters. Unlike these methods, the method proposed in this study generates characters without requiring precise analysis. Thus, only a few character sample images are required. As we discussed in the previous section, preparing sample images is a simple task.
Fig. 2. Visualization of the GlyphWiki data and our dataset. (a) Data from GlyphWiki drawn in Mincho. The red triangles represent the control points. The line types of the left and right strokes are "curve" and "straight" in the middle. All start and end shapes are different. (b) The skeletons in our dataset. The control points, line type, start shape, and end shape are attributes.

Font blending methods: Methods in this category receive several complete fonts and generate a font by blending them. Xu et al. generated Chinese calligraphy characters using a weighted blend of strokes in different styles [13]. They decomposed samples into radicals and single strokes based on rules defined by expert knowledge. An improved method [14] considers the spatial relationship of strokes. Choi et al. generated handwritten Hangul characters using a Bayesian network trained with a given font [15].

Suveeranont and Igarashi addressed the generation of alphabetical characters for typographic fonts [16]. They generated characters by blending predefined characters from miscellaneous complete fonts. This method is based on a vector format, in contrast with the proposed method, which accepts an image format. Campbell and Kautz learned a manifold of standard fonts of alphabetical characters [17]; locations on the manifold represent a new font. Feng et al. used a wavelet transform to blend two fonts [18]. Tenenbaum and Freeman used complete fonts as references to form characters from the generation results [19].

The methods in this approach require human supervision to generate a desired font; since they blend fonts automatically, the result is not always the desired font. In order to avoid this problem, blending weights [13] or an interactive system [16] are employed. However, sweeping the weights and operating the system themselves require human supervision. In contrast, the proposed method generates a typographic font with little human supervision. Instead of blending fonts, we construct characters with strokes extracted from sample characters in the desired font. Therefore, the proposed method can directly provide the user with the desired font.

Character extrapolation methods: The aim of methods in this category is to extrapolate characters not included in a given subset. Attempts to do this for handwritten fonts can be found in [20]–[27]. Lin et al. generated characters with components extracted from given characters [20], using an annotated font in which the positions and sizes of components were labeled. The extraction of the components was performed on electronic devices so that characters can be easily decomposed. Zong and Zhu developed a character generation method using machine learning [21]. They decomposed the given characters into components by analyzing the orientation of strokes; components were assigned to a reference font with a similarity function trained by a semi-supervised algorithm. Wang et al. focused on the spatial relationships of the character components for decomposition and generation [22]. An active contour model was used for decomposition in [23].

Character component decomposition is a crucial technique for the methods in this category. However, most use naive decompositions that rely on spatial relationships [22]–[26] or special devices [20], [27]. It is difficult to extract components when they are connected or decorated. To the best of our knowledge, the method in [28] is the only one applicable to characters in fonts with decorations. Saito et al. applied a patch transform [29] to samples and generated alphabetical characters in a wide range of fonts [28]. However, the generated results did not meet the criteria for practical use. In this paper, we propose an adaptive active contour model for component extraction. With the proposed method, we can obtain natural character strokes even in decorated characters.

III. SKELETON DATASET

First, we address the character skeleton dataset used in this study. We created the dataset based on GlyphWiki [30], which contains data for more than 210,000 characters⁵. GlyphWiki utilizes a wiki format, allowing anybody to contribute to and maintain the database. Each stroke of a character is stored in the KAGE format [12], which uses four attributes: control points, line type, start shape, and end shape. There are six line types, seven start shapes, and 15 end shapes. The number of control points depends on the line type and is at most four. The control points and line types lead to a rough stroke; the start and end shapes then provide the details of a stroke. Fig. 2(a) illustrates a character in the KAGE format in Mincho.

⁵It includes characters created by users that are not used publicly.

We created skeletons of the strokes in a character using the GlyphWiki data. Fig. 2(b) illustrates the skeletons of a character in our dataset. We draw a line of a skeleton from the first control point to the last control point. An additional line of a skeleton is drawn according to the start shape and end shape; for example, the end shape is drawn on the middle stroke in Fig. 2(b). We draw a line according to the line type and the number of control points: a straight line is drawn when the line type is straight and two control points are given; a B-spline curve is drawn when the line type is curved and three control points are given; a straight line and a B-spline curve are drawn when the line type is vertical and four control points are given. We include the attributes with each skeleton.
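For readers who want to prototype the dataset handling, the KAGE attributes described above map naturally onto a small record type. The following Python sketch is illustrative only; the class and field names are ours and are not part of GlyphWiki or the KAGE specification.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class KageStroke:
    # The four per-stroke attributes stored in the dataset.
    control_points: List[Point]   # at most four points, depending on line type
    line_type: str                # e.g., "straight", "curve", "vertical"
    start_shape: int              # one of the seven start shapes
    end_shape: int                # one of the 15 end shapes

    def drawing_rule(self) -> str:
        """Return how the rough stroke is drawn from the control points."""
        n = len(self.control_points)
        if self.line_type == "straight" and n == 2:
            return "straight line between the two control points"
        if self.line_type == "curve" and n == 3:
            return "B-spline curve through the three control points"
        if self.line_type == "vertical" and n == 4:
            return "straight line followed by a B-spline curve"
        return "unsupported combination of line type and control points"

# Example: a straight stroke with two control points.
stroke = KageStroke([(10.0, 20.0), (10.0, 180.0)], "straight", 0, 0)
print(stroke.drawing_rule())
```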
We extract points on a skeleton by regular sampling from the start point to the end point. Therefore, a skeleton is composed of a set of sampling points. Using these sampling points, we define a skeleton S as

S = \begin{bmatrix} \cdots & x_i & \cdots \\ \cdots & y_i & \cdots \\ \cdots & 1 & \cdots \end{bmatrix},   (1)

where x_i and y_i represent the x- and y-coordinates of the i-th sampling point, respectively. We adopt homogeneous coordinates: P = [x, y, 1]^T denotes a vector representing a sampling point, so that S = [\cdots, P_i, \cdots]. We denote a skeleton in our dataset by S, a skeleton in the samples by \hat{S}, and a transformed skeleton⁶ by \tilde{S} (see Fig. 1).

⁶We describe the details of the transformed skeleton in Section V-A.

IV. STROKE EXTRACTION

When the proposed method receives samples as input, we extract the strokes of the samples. Fig. 3 shows examples of extracted strokes.

Fig. 3. Stroke extraction. The black pixels represent the extracted strokes.

A. Adjusting skeletons to samples

The proposed method maximally exploits the skeletons of sample characters; however, the skeletons of the samples are not given. The skeletons could easily be extracted if the strokes were separated, but strokes often overlap. Quality skeleton extraction is essential for the proposed method, and it is difficult to obtain accurate skeletons automatically with existing extraction methods such as [31]. Therefore, we prepared a graphical user interface (GUI) for skeleton adjustment, in which the user adjusts the skeletons to the sample, e.g., by dragging control points with a mouse; Fig. 4 shows the GUI. Three techniques reduce the manual effort of this adjustment. First, the scale of the skeletons is adjusted to fit the sample; we calculate a scaling factor from rectangles around the skeleton and the sample. The second technique is rotation adjustment: we extract points from the skeleton by sampling and, likewise, extract points from the medial axis of the sample, which is obtained by a morphological operation. With these points, we use iterative closest point matching [32] to calculate the rotation matrix. Third, the control points move cooperatively when the start point is moved by hand, e.g., if the start point moves up, the other control points also move up. The scale and rotation adjustments are performed only once, before manual adjustment.

We emphasize that skeleton adjustment is not difficult and can be completed in minutes. It is unnecessary to adjust all of the skeletons in the dataset; according to our experimental results, approximately 15 samples are sufficient for the proposed method. Therefore, skeleton adjustment is significantly less work than creating thousands of characters by hand.

Fig. 4. Skeleton adjustment GUI. (a) Screenshot of the GUI initiating an adjustment. Note that the skeleton is not matched to the sample at this point because the skeletons are in Mincho. (b) Adjustment result.

B. Skeleton relation assignment

We determine the connectivity of the skeletons in the samples. We define relations between two skeletons as follows:
• continuous: the start or end of a skeleton is connected to the start or end of another.
• connecting: the start or end of a skeleton is connected to the body of another.
• connected: the body of a skeleton is connected to the start or end of another.
• crossing: the skeleton crosses over another.
• isolated: the skeleton is isolated from the others.
The body is the part of a skeleton without the start or end parts. Examples of each relation are illustrated in Fig. 5.

Fig. 5. Skeleton relations of the red skeletons to the black skeletons: (a) continuous, (b) connecting, (c) connected, (d) crossing, and (e) isolated.

Two skeletons are in contact when the distance between them is small. Therefore, we measure the distance d between S and S' as

d = \begin{cases} \min_{i,j} \left( \| Q_{i,j} - P_i \|_2 + \| Q_{i,j} - P'_j \|_2 \right) & \text{if } Q \text{ exists}, \\ \infty & \text{otherwise}. \end{cases}   (2)

Fig. 6. Measurement of the distance between two skeletons. The sampling points are drawn with black circles.
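Eq. (2) can be prototyped directly once skeletons are stored as ordered arrays of sampling points. The sketch below approximates the tangent at each sampling point by finite differences and intersects the two perpendicular lines numerically; the helper names and the tolerance values are ours, not part of the paper.

```python
import numpy as np

def _normals(points):
    """Unit normals at each sampling point, from finite-difference tangents."""
    tangents = np.gradient(points, axis=0)                      # (n, 2) approximate tangents
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True) + 1e-12
    return np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)  # rotate tangents by 90 degrees

def skeleton_distance(S, S_prime):
    """Distance d of Eq. (2) between two skeletons given as (n, 2) point arrays."""
    S, S_prime = np.asarray(S, float), np.asarray(S_prime, float)
    N, N_prime = _normals(S), _normals(S_prime)
    best, best_pair = np.inf, None
    for i, (p, n) in enumerate(zip(S, N)):
        for j, (q, m) in enumerate(zip(S_prime, N_prime)):
            A = np.column_stack([n, -m])          # solve p + t*n == q + s*m for (t, s)
            if abs(np.linalg.det(A)) < 1e-9:      # perpendicular lines are parallel: no Q
                continue
            t, _ = np.linalg.solve(A, q - p)
            Q = p + t * n
            cost = np.linalg.norm(Q - p) + np.linalg.norm(Q - q)
            if cost < best:
                best, best_pair = cost, (i, j)
    return best, best_pair   # (d, (i_bar, j_bar)); d is inf when no Q exists
```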
Here, Q_{i,j} is the point at which two lines cross; these lines are perpendicular to S and S' and pass through P_i and P'_j, respectively. Fig. 6 illustrates an example of Q. Using the \bar{i} and \bar{j} that minimize d, we can determine which part of S is in contact with S', e.g., the start of S is in contact with the body of S' when \bar{i} is small and \bar{j} is large. We use three functions to indicate contact: I_s for the start, I_e for the end, and I_b for the body. For n points of a skeleton, the functions evaluate to true or false as follows: I_s(\bar{i}) = True if \bar{i}/n < 0.05, otherwise False; I_e(\bar{i}) = True if \bar{i}/n > 0.95, otherwise False; I_b(\bar{i}) = True if 0.05 ≤ \bar{i}/n ≤ 0.95, otherwise False. The parameters 0.05 and 0.95 were determined experimentally. We assign a relation R to S against S' as follows:

R = \begin{cases} \text{continuous} & \text{if } d < 2(\tau + \tau') \wedge (I_s(\bar{i}) \vee I_e(\bar{i})) \wedge (I_s(\bar{j}) \vee I_e(\bar{j})), \\ \text{connecting} & \text{if } d < 2(\tau + \tau') \wedge (I_s(\bar{i}) \vee I_e(\bar{i})) \wedge I_b(\bar{j}), \\ \text{connected} & \text{if } d < 2(\tau + \tau') \wedge I_b(\bar{i}) \wedge (I_s(\bar{j}) \vee I_e(\bar{j})), \\ \text{crossing} & \text{if } d < 2(\tau + \tau') \wedge I_b(\bar{i}) \wedge I_b(\bar{j}), \\ \text{isolated} & \text{otherwise}, \end{cases}   (3)

where τ represents the thickness of a stroke in a sample. We can draw a character image with τ. Fig. 7 illustrates the skeletons and the created characters with various values of τ. We empirically determined τ by creating character images with various τ and choosing the τ that minimizes the Hamming distance between the created and sample character images.

Fig. 7. (a) Skeletons and (b) created characters with τ = 1, 10, 20, 40.

We then reconsider the relations obtained above. Since the relations are based on the skeletons, it is necessary to verify that they are consistent with the sample images. The relations may occasionally differ from the samples because of human error during adjustment and because of the parameters: the parameter 2(τ + τ') in Eq. (3) is set to a severe value in order to detect all stroke contacts without omission. Consequently, an incorrect relation may be assigned.
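Under the same assumptions, the contact indicators and Eq. (3) translate almost literally into code. The sketch reuses skeleton_distance from the previous snippet; the thresholds 0.05 and 0.95 and the 2(τ + τ') test follow the text, while the function names are ours.

```python
def contact_indicators(k, n):
    """I_s, I_e, I_b for sampling index k on a skeleton with n points."""
    r = k / n
    return r < 0.05, r > 0.95, 0.05 <= r <= 0.95

def assign_relation(S, S_prime, tau, tau_prime):
    """Relation R of S against S_prime, following Eq. (3)."""
    d, pair = skeleton_distance(S, S_prime)
    if pair is None or d >= 2 * (tau + tau_prime):
        return "isolated"
    i_bar, j_bar = pair
    s_i, e_i, b_i = contact_indicators(i_bar, len(S))
    s_j, e_j, b_j = contact_indicators(j_bar, len(S_prime))
    if (s_i or e_i) and (s_j or e_j):
        return "continuous"
    if (s_i or e_i) and b_j:
        return "connecting"
    if b_i and (s_j or e_j):
        return "connected"
    if b_i and b_j:
        return "crossing"
    return "isolated"
```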
Fig. 8. Relation check. (a) Skeleton of the segmentation results. (b) Sample of the segmentation results. Focusing on the two segments indicated by the red circle, the relation is isolated in (a), but there is only one connected component in (b).

The procedure for reconsideration is as follows. We classify all pixels of the sample to the nearest skeleton in terms of Euclidean distance; Fig. 8(a) illustrates the segmentation results. Subsequently, we choose a connected component from the sample using these results. Then, we determine the number of connected components obtained. If the relation is isolated, there must be two connected components. Likewise, if the relation is connecting, connected, or crossing, there must be only one connected component. This check complements Eq. (3).

C. Adaptive active contour model

We propose an adaptive active contour model (AACM) for character stroke extraction. The AACM is able to distinguish even strokes that are in contact with others, determining the boundary of each stroke. Inspired by Snakes [33], we seek the boundary by optimizing a spline curve. In addition, we incorporate constraints and adaptive energy modification into the AACM so that strokes can be extracted from complex characters.

The AACM is defined as a spline curve that minimizes the energy function

E_{AACM} = E_{int} + E_{img},   (4)

where E_int and E_img take into account the smoothness of the curves and the fit to objects, respectively. E_int follows [33]. E_img must be determined for each task. If we set E_img to −|∇I|, where ∇I represents the gradient image of I, E_img pulls the AACMs to the edges. For our purposes, it is unnecessary for a boundary to be very close to its stroke; instead, it may be rather relaxed. If the boundary is too close, it may cross the stroke, and consequently the extracted strokes are damaged. Therefore, in this paper, we define E_img as

E_{img} = G_v * I_{SMP} + I_{SK},   (5)

where G_v is a Gaussian kernel with variance v, and I_SMP and I_SK are the grayscale images⁷ of the sample and its skeletons, respectively. E_img has a high energy⁸ around the strokes; thus, the boundary remains separated from them. We use the boundaries of the segmentation results as the initial AACMs. We adaptively modify E_img and set some points that the AACMs must pass through.

⁷Pixel values are 0 if they are background, and 255 otherwise.
⁸The pixel value represents the energy: 0 is the lowest and 255 is the highest.
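A minimal sketch of Eq. (5) and of the adaptive modification used for touching strokes is given below, assuming the 0/255 image convention of footnotes 7 and 8. Interpreting "decrease E_img by 1/β1" as division by β1 and using a circular neighborhood of radius β2τ are our assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_image_energy(sample_img, skeleton_img, sigma=3.0):
    """E_img of Eq. (5): Gaussian-blurred sample image plus skeleton image."""
    return gaussian_filter(sample_img.astype(float), sigma) + skeleton_img.astype(float)

def decrease_energy_near_contact(E_img, skeleton_img, contact_xy, tau,
                                 beta1=2.0, beta2=1.5):
    """Adaptive modification for connecting/continuous strokes: scale E_img by 1/beta1
    within radius beta2*tau of the contact point, except on the skeleton pixels."""
    E = E_img.copy()
    h, w = E.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cx, cy = contact_xy
    near = (xx - cx) ** 2 + (yy - cy) ** 2 <= (beta2 * tau) ** 2
    mask = near & (skeleton_img == 0)     # "except for I_SK": keep the skeleton energy intact
    E[mask] /= beta1
    return E
```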
In the case where the relation of the target stroke is isolated, which is the simplest case, we directly apply an AACM. Fig. 9 illustrates the results of the extraction of an isolated stroke.

Fig. 9. Extraction of an isolated stroke. (a) Initial AACM (blue) and minimized AACM (orange) on E_img. (b) Extracted stroke.

In the case where the relation of the target stroke is connecting or continuous, we force an AACM to go through the point at which the strokes are in contact. Then, we decrease E_img by 1/β1 within the range of length β2τ from that point, except for I_SK. This decrease operation encourages the AACM to pass through the stroke intersection. β1 and β2 are constant values that were determined empirically. With the modified E_img and the constraints, we minimize the AACM and extract the target. Fig. 10 demonstrates the extraction for this case; the AACM naturally passes through the intersection due to the decreased energy.

Fig. 10. Stroke extraction for connecting or continuous stroke relations. (a) Initial AACM (blue) and minimized AACM (orange) on E_img. (b) Extracted stroke.

Where the relation of the target stroke is crossing or connected, we decrease E_img by 1/β1 within the range of β2 τ_others from the other strokes. Then, we minimize the AACM with the modified E_img and extract the target along the minimized AACM.

D. Stroke restoration

We reshape the extracted strokes, which often have defects at points where contact occurs; see Fig. 11(a). An intuitive method for restoration is the removal of contours around the defects and their reconnection with cubic Bézier curves. A cubic Bézier curve is drawn with four points (P1, P2, P3, P4). The curve starts from P1 and ends at P4; P2 and P3 serve as control points that give the curve its direction.

We apply cubic Bézier curves to a stroke using POTRACE [34], generating a number of curves that represent short parts of the contour. Then, we remove the curves within a distance of τ from the extracted stroke; see Fig. 11(c). We create a cubic Bézier curve between two curves so that the contour becomes continuous; see Fig. 11(d). We set P1 and P4 of the new curve to P4 and P1, respectively, of the remaining curves. Fig. 12 illustrates the new curve. We calculate P2 and P3 by

P2 = P1 + γ(P1 − P1'),   (6)
P3 = P4 + γ(P4 − P4'),   (7)

where γ is a constant value. P1' and P4' represent the nearest control points to P1 and P4 in the remaining curves, respectively. P1 − P1' and P4 − P4' represent local gradients. Hence, P2 and P3 are points moved along the local gradients from P1 and P4, resulting in a smooth curve.

Fig. 11. Restoration of an extracted stroke. (a) Damaged stroke. (b) Restored stroke. (c) Contours removed with cubic Bézier curves; red rectangles represent P1 and P4, and pink circles represent P2 and P3. Cyan lines represent contours. (d) The created curves, indicated by red lines.

Fig. 12. Points of a new Bézier curve. The red curve represents the new curve, and the green curves represent those remaining.
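The gap-bridging rule of Eqs. (6) and (7) amounts to a few vector operations. In the sketch below, γ defaults to 0.5 purely for illustration (the paper only states that it is a constant), and cubic_bezier is a standard Bézier evaluator for visualizing the result.

```python
import numpy as np

def bridge_control_points(P1, P1_nearest, P4, P4_nearest, gamma=0.5):
    """Eqs. (6)-(7): place P2 and P3 along the local gradients at the two open ends.
    P1_nearest and P4_nearest are the nearest control points on the remaining curves."""
    P1, P1_nearest, P4, P4_nearest = map(np.asarray, (P1, P1_nearest, P4, P4_nearest))
    P2 = P1 + gamma * (P1 - P1_nearest)
    P3 = P4 + gamma * (P4 - P4_nearest)
    return P2, P3

def cubic_bezier(P1, P2, P3, P4, num=50):
    """Sample the cubic Bezier curve defined by the four points."""
    t = np.linspace(0.0, 1.0, num)[:, None]
    P1, P2, P3, P4 = map(np.asarray, (P1, P2, P3, P4))
    return ((1 - t) ** 3) * P1 + 3 * ((1 - t) ** 2) * t * P2 \
           + 3 * (1 - t) * t ** 2 * P3 + (t ** 3) * P4
```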
V. STROKE DEPLOYMENT

We generate characters by deploying strokes on skeletons. Our character generation method consists of two phases: skeleton modification and stroke deployment.

A. Skeleton modification

The results of character generation would be strange, even with perfect strokes, if the style of the skeleton dataset were different from the target font. Thus, we modify the skeletons to be similar to the target font. A feasible method for modification is the use of a transformation matrix, which we estimate from the skeletons of the samples. Specifically, we seek two transformation matrices, T_sz and T_aff: T_sz adjusts the size of the skeletons and the centroid translation, and T_aff adjusts the affine transformation of the skeletons. Modification is carried out by applying T_sz and T_aff to the dataset.

We estimate T_sz from the rectangles of the skeletons. Let H and W denote the height and width of a character in the dataset, respectively. Likewise, let \hat{H} and \hat{W} be the height and width of the skeletons of a sample character, respectively. With an output image size of I_W by I_H, we calculate T_sz by averaging the size transformations over the sample characters C = {..., c_i, ...} as

T_{sz} = \frac{1}{|C|} \sum_{c_i \in C} \begin{bmatrix} \hat{W}_{c_i}/W_{c_i} & 0 & (I_W - \hat{W}_{c_i})/2 \\ 0 & \hat{H}_{c_i}/H_{c_i} & (I_H - \hat{H}_{c_i})/2 \\ 0 & 0 & 1 \end{bmatrix}.   (8)

We estimate T_aff using the skeletons of the sample and the dataset. However, it is difficult to estimate T_aff directly from the skeletons because of the complex structure of the characters: if the characters are complex, they have many skeletons, and the affine transformations of the skeletons cancel each other out. Eventually, T_aff becomes a trivial matrix, such as the identity matrix. To avoid this problem, we divide the characters into groups and calculate T_aff by averaging the affine transformation matrices obtained from each group. We divide a character using the relations of the skeletons; groups consist of skeletons whose relations are continuous, connecting, connected, and crossing. Fig. 13 illustrates the grouping results.

Fig. 13. Groups of skeletons: (a) skeleton and (b), (c), and (d) groups.

We use a function f_T to calculate the affine transformation matrix that fits a stroke S to a stroke S'. We formulate f_T based on the least-squares method, as Eq. (9), which can be solved analytically as Eq. (10):

f_T(S, S') = \arg\min_{T \in \mathcal{T}} \| S - T S' \|^2   (9)
           = (S^\top S)^{-1} S^\top S',   (10)

where \mathcal{T} represents the set of all possible affine transformation matrices. We calculate T_aff by averaging the affine transformation matrices from S to \hat{S} as

T_{aff} = \frac{1}{n_{groups}} \sum_{c_i \in C} \sum_{k} \sum_{j \in G_i^k} f_T(S_j, \hat{S}_j),   (11)

where G_i^k denotes the k-th group of skeletons in character c_i and n_{groups} is the total number of groups in C.

Finally, we modify the skeletons by applying T_sz and T_aff. A transformed skeleton is obtained using the equation

\tilde{S} = T_{aff} T_{sz} S.   (12)
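In homogeneous coordinates, the least-squares fit of Eq. (9) is a linear solve. The sketch below computes the standard normal-equation solution for a 3×3 transformation and then applies Eq. (12); whether the paper stores skeletons as rows or columns in Eq. (10) is not fully specified, so treat the orientation here as an assumption.

```python
import numpy as np

def fit_affine(S, S_prime):
    """Least-squares fit of Eq. (9): find T minimizing ||S - T S_prime||^2.
    S and S_prime are 3 x n skeletons in homogeneous coordinates, as in Eq. (1)."""
    # Solve S_prime^T T^T ~= S^T column-wise; lstsq returns the normal-equation solution.
    T_t, *_ = np.linalg.lstsq(S_prime.T, S.T, rcond=None)
    return T_t.T                                   # 3 x 3 transformation matrix

def modify_skeleton(S, T_sz, T_aff):
    """Eq. (12): apply the size and affine corrections to a dataset skeleton."""
    return T_aff @ T_sz @ S

# Tiny usage example with two corresponding skeletons of four sampling points.
S_hat = np.array([[0, 1, 2, 3], [0, 0, 1, 1], [1, 1, 1, 1]], float)   # sample skeleton
S = np.array([[0, 2, 4, 6], [0, 0, 2, 2], [1, 1, 1, 1]], float)       # dataset skeleton
T = fit_affine(S_hat, S)   # fits S_hat ~= T S
```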
B. Stroke deployment

Here, we describe the framework for selecting the stroke that is most suitable for a target skeleton \tilde{S}. Stroke deployment has an impact on the appearance of the generated characters; therefore, the appropriate stroke must be chosen for each target skeleton.

First, we select \hat{S}, which fits \tilde{S}, from the set of extracted strokes \hat{\mathcal{S}}. Then, we determine the stroke \hat{I} corresponding to the selected \hat{S}:

\hat{I} = f_{img}(\hat{S}),   (13)
\hat{S} = \arg\min_{\hat{S}_i \in \hat{\mathcal{S}}} E(\hat{S}_i, \tilde{S}),   (14)

where f_img is a function that gives the stroke corresponding to \hat{S} and E is an energy function associated with the skeletons \hat{S} and \tilde{S}. Once \hat{I} is determined, we apply f_T(\hat{S}, \tilde{S}) to \hat{I} and deploy the transformed \hat{I} on \tilde{S}. We repeat the selection until strokes for all target skeletons are determined. Finally, a character is generated by integrating the selected strokes.

We define E as

E(\hat{S}, \tilde{S}) = E_s(\hat{S}, \tilde{S}) + E_s(f_{data}(\hat{S}), f_{data}(\tilde{S})) + E_a(f_{data}(\hat{S}), f_{data}(\tilde{S})),   (15)

where f_data(S) gives the skeleton corresponding to S in the skeleton dataset. The energy E utilizes three terms in inspecting \hat{S}. The first term measures the distance between skeletons in a sample character and a target character; when this term is small, the two skeletons are similar, and \hat{I} will therefore naturally fit the target skeleton. The second term also measures the distance between two strokes, but only using the dataset; if the skeletons in the dataset are similar, the stroke is favorable for the target skeleton, and this term improves the accuracy of the energy function. The third term measures distance using adjacent skeletons, making the distance more global than when focusing on only two skeletons.

The energy E_s measures the distance between two skeletons. We define E_s(S, S') as

E_s(S, S') = \| S' - f_T(S, S') S \|_1 + \| S - f_T(S', S) S' \|_1 + \sum_{k=1}^{3} I_k(S, S'),   (16)

where I_k is 1 if the inspected attribute of the two skeletons differs and 0 otherwise; I_k inspects the line type, start shape, and end shape when k = 1, 2, and 3, respectively. The first term of E_s measures the distance between the two skeletons. The second term of E_s takes into account the distortion from the transformation f_T(S, S'); a less distorted skeleton is favored. The third term of E_s incorporates attribute differences. We define E_a as

E_a(S, S') = \begin{cases} \| S_{st} - S'_{st} \|_1 + \| S_{ed} - S'_{ed} \|_1 & \text{if } S_a \text{ exists}, \\ 50 & \text{otherwise}, \end{cases}   (17)

where S_a = \{ S_{st}, S_{ed}, S'_{st}, S'_{ed} \}, and S_st represents a skeleton connected to the start point of S. In the case where several skeletons are connected to S at the start point, we choose the one whose center is closest to S. Likewise, S_ed represents a skeleton connected to the end point of S. If S_st, S_ed, S'_st, or S'_ed is unfavorable, E_a has a large value.

VI. SAMPLE SELECTION

The proposed method uses C, a font subset, to generate a large number of characters. The generation results are deeply influenced by C. Therefore, it is important to analyze which characters are suitable elements for C. With an optimal C, we are able to maximize the capability of the proposed method. We define the process of seeking the optimal C as sample selection.

In sample selection, we use the skeleton dataset to seek C; note that sample images are not used. Since the skeletons in the dataset are the fundamental data, if the skeletons of C in the dataset are suitable, C in a target font can be expected to work well.

In order to keep sample selection feasible, we seek C in a subset of the dataset that we define as validation characters. In this study, we adopt the 1,006 characters of kyoiku-kanji, which contain the elemental characters of Japanese. Since the number of characters in the dataset exceeds 210,000, it is time-consuming to use the entire dataset. As we show in the experimental results in Section VII, the proposed method is able to generate acceptable results with a selected C, verifying the effectiveness of our approach.

Sample selection is based on a genetic algorithm. There are a large number of candidates for the optimal C, even within the subset. Candidates undergo crossover, evaluation, and selection processes, and an optimal candidate can be obtained over many iterations of these processes. We define a function f_selection that calculates the energy of a candidate C_i using the set of skeletons S_i of C_i. Let S_v represent the set of skeletons of the validation characters:

f_{selection}(S_i) = \alpha f_e(S_i) + (1 - \alpha) f_r(S_i),   (18)
f_e(S_i) = \sum_{s_v \in S_v} \left( \min_{s \in S_i} E_s(s, s_v) + E_a(s, s_v) \right),   (19)
f_r(S_i) = \frac{|S_i| + N_{cross}(S_i) + N_{con}(S_i)}{|N|},   (20)

where |·| represents the cardinality, N_cross gives the number of skeletons whose relations are crossing, and N_con gives the number of skeletons whose relations are continuous, connecting, and connected. f_r measures the complexity of C and serves as a regularization term. We control the complexity of the samples with the constant value α.
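The candidate energy of Eqs. (18)–(20) is easy to express once the pairwise energies are available. In the sketch below, E_s and E_a are assumed to be callables implementing Eqs. (16) and (17), and the relation counts and the normalization constant are passed in, since their bookkeeping depends on the dataset; α = 0.6 is the value chosen in the experiments.

```python
def f_selection(candidate, validation, E_s, E_a, n_cross, n_con, n_total, alpha=0.6):
    """Candidate energy of Eqs. (18)-(20).

    candidate / validation: lists of skeletons (S_i and S_v in the text).
    E_s, E_a: pairwise energy functions from Eqs. (16)-(17).
    n_cross, n_con: counts of crossing and continuous/connecting/connected skeletons.
    n_total: normalization constant for the regularizer f_r.
    """
    # Eq. (19): for every validation skeleton, the cost of its best-matching candidate skeleton.
    f_e = sum(min(E_s(s, s_v) + E_a(s, s_v) for s in candidate) for s_v in validation)
    # Eq. (20): complexity regularizer over the candidate subset.
    f_r = (len(candidate) + n_cross + n_con) / n_total
    # Eq. (18): weighted combination of fit and complexity.
    return alpha * f_e + (1.0 - alpha) * f_r
```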
VII. EXPERIMENTAL RESULTS

We demonstrate character generation with the proposed method using kanji. We used five fonts as the target fonts: Ibara 43), Gcomic 44), Onryo 45), Tsunoda 46), and Zinpen 47), where the indices represent the numbers in the Appendix. Fig. 14 shows the original characters in the five fonts. The characters in each font have distinctive strokes; for instance, two parallel lines are used in Zinpen. It is very difficult for existing methods to generate characters in these characteristic fonts.

Fig. 14. The Ibara, Gcomic, Onryo, Tsunoda, and Zinpen fonts (from top to bottom).

We created sample images by drawing characters at a 360-pt font size using the Qt library⁹. The size of the images is 500 by 500 pixels. We implemented the proposed method in C++, and the experiments were carried out on a Windows machine with a dual-core CPU. Fifteen characters were chosen as the samples in each font. The coefficients were set as follows: β1 = 2.0 and β2 = 1.5. We generated 2,965 characters, which comprise the set of JIS level-1.

⁹http://www.qt.io/

For comparison, we used the existing method [28] based on patch transformation developed by Saito et al. To the best of our knowledge, [28] is the only method that is applicable to characters in various fonts. [28] divides samples into grid squares and generates characters by deploying the squares. The frameworks of the proposed method and [28] are similar: both exploit samples, extract components of characters, and generate characters. However, the extracted components are significantly different. The components in [28] are small pieces of characters that lack meaning, whereas the components in the proposed method are complete strokes.

We employed our sample selection method; the results are summarized in Table II. We took the validation characters from the 1,006 kyoiku-kanji characters, which are those taught to elementary school students as determined by the Japanese Ministry of Education. The initial candidates are randomly generated. We fixed the number of elements in a candidate to 15 characters. In each iteration of the algorithm, 20 candidates survive as good candidates and 150 new candidates are created. The maximum number of iterations is set to 1,000. We varied α and carried out sample selection; the minimum f_selection and f_r are listed in Table II. The obtained samples are complex characters at high values of α and simpler at low α values. We use the samples at α = 0.6 as the sample characters in the following experiments.

TABLE II
SAMPLE SELECTION RESULTS
  α     f_selection     f_r
 0.8       0.017       0.673
 0.6       0.027       0.447
 0.4       0.030       0.387
(The selected sample characters themselves appear as images in the original table.)

A. Character generation results

We generated characters in the five fonts using the proposed and existing methods; the results are shown in Fig. 15. We generated 2,965 characters in each font, though only 75 characters are shown due to space limitations. Almost all characters generated by the proposed method appear clean and have good readability. Moreover, the font characteristics, such as the slant of the characters in Ibara, are successfully reconstructed. It is not easy to distinguish the original characters from those generated by the proposed method at a glance. In addition, we report the complexity of the strokes of the generated characters: 1 (min), 29 (max), 12.5 (avg), and 4.5 (std).

Fig. 15. The characters generated in five fonts: (a) Ibara, (b) Gcomic, (c) Onryo, (d) Tsunoda, and (e) Zinpen. From top to bottom in each subfigure, results are shown for the original characters, the proposed method, and the method of Saito et al. [28].

B. Similarity evaluation

We evaluated the generated characters as images. We compared the generated characters with the original images using the Chamfer distance [35]. Let d_cham(A, B) be the Chamfer distance from image A to image B. The distance is defined as

d_{cham}(A, B) = \sum_{p \in \hat{A}} \min_{q \in \hat{B}} |p - q|,

where \hat{A} and \hat{B} are the sets of edge points of A and B, respectively. We obtained \hat{A} and \hat{B} by applying a Canny edge detector [36]. In general, d_cham(A, B) is not equal to d_cham(B, A); therefore, the symmetric formulation is often used:

d'_{cham}(A, B) = d_{cham}(A, B) + d_{cham}(B, A).   (21)

We employed d'_cham in this study.

A subset of the generated characters was used for this evaluation. Specifically, we used the 1,006 characters mentioned above. We resized the images of the original and generated characters to 100 by 100 pixels and calculated the Chamfer distance between the generated characters and the originals. Table III summarizes the average Chamfer distances. The proposed method is closer to the originals for all five fonts than the existing method. In particular, the proposed method is greatly superior to the existing method for the Tsunoda font. Tsunoda has a distinctive skeleton, as can be seen in the spaces between strokes. Since the proposed method successfully modified the skeleton for Tsunoda, the generated characters are close to the originals. These results numerically demonstrate the effectiveness of the proposed method.

TABLE III
AVERAGE CHAMFER DISTANCE
              Ibara   Gcomic   Onryo   Tsunoda   Zinpen
 Saito [28]    4.7     4.2      4.6     6.8       3.8
 Proposed      4.1     3.7      4.2     4.1       3.2

C. Evaluation using character recognition

We evaluated the generated characters by using them as training data for a character recognition system. The test data are character images of kyoiku-kanji in each font; therefore, the number of classes is 1,006. A simple recognition method with the Chamfer distance is used in this study: we calculated the Chamfer distance between the training and test data and classified each test sample as the nearest character. The test data are created as images, each including one JPEG-compressed character. We created the test data by drawing characters at a 60-pt font size on 100 by 100 pixel images; the background is white, and the foreground is black. Table IV summarizes the results. The proposed method achieved higher performance than the existing method for all fonts.

TABLE IV
RECOGNITION RESULTS (%)
              Ibara   Gcomic   Onryo   Tsunoda   Zinpen
 Saito [28]    29.0    61.5     56.5    7.4       68.8
 Proposed      48.3    77.4     67.7    28.6      77.4
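Both the similarity and the recognition evaluations reduce to the symmetric Chamfer distance of Eq. (21) plus a nearest-neighbor rule. The sketch below uses OpenCV's Canny detector and a k-d tree for the nearest-edge queries; the 100 by 100 resizing follows the experimental setup, while the Canny thresholds and the function names are ours.

```python
import cv2
import numpy as np
from scipy.spatial import cKDTree

def edge_points(img):
    """Edge point set of a grayscale (uint8) character image."""
    resized = cv2.resize(img, (100, 100))
    edges = cv2.Canny(resized, 100, 200)      # thresholds are illustrative
    return np.argwhere(edges > 0).astype(float)

def chamfer(A_pts, B_pts):
    """Directed Chamfer distance: sum over A of the nearest distance in B."""
    dists, _ = cKDTree(B_pts).query(A_pts)
    return float(dists.sum())

def symmetric_chamfer(img_a, img_b):
    """d'_cham of Eq. (21) between two character images."""
    A, B = edge_points(img_a), edge_points(img_b)
    if len(A) == 0 or len(B) == 0:
        return float("inf")
    return chamfer(A, B) + chamfer(B, A)

def classify(test_img, training_images):
    """1-NN classification: label of the nearest training image under d'_cham."""
    return min(training_images,
               key=lambda label: symmetric_chamfer(test_img, training_images[label]))

def recognition_rate(test_set, training_images):
    """Fraction of test images (dict label -> image) assigned their true class."""
    correct = sum(classify(img, training_images) == label
                  for label, img in test_set.items())
    return correct / len(test_set)
```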
D. Subjective evaluation

We present the results of two subjective evaluations, one based on the similarity of each font to the original characters and one based on the appearance of the generated characters. Both subjective evaluations were carried out by 14 participants.

The first subjective evaluation is of the similarity between the generated and original characters. At the beginning of this evaluation, we showed the original characters to the participants. Then, 10 idioms consisting of four characters were displayed. The idioms were made from original characters, characters generated by the proposed method, or characters generated by the existing method. We use the mean opinion score (MOS) over the results: the participants assigned scores ranging from 1 (bad) to 5 (excellent) to the idioms according to their impression of the similarity. Table V summarizes the results. The proposed method receives higher MOSs than the existing method. Moreover, the MOSs of the proposed method are relatively close to those of the originals.

TABLE V
RESULTS OF A SUBJECTIVE EVALUATION OF FONT SIMILARITY TO ORIGINAL CHARACTERS
              Ibara   Gcomic   Onryo   Tsunoda   Zinpen
 Originals     4.5     4.8      4.5     4.9       4.9
 Saito [28]    1.1     1.3      1.5     1.1       1.3
 Proposed      4.3     4.4      4.6     4.3       4.6

The second subjective evaluation is of appearance. We asked the participants to select characters that might have been generated by computers. The test data consisted of 150 characters from each font, i.e., 50 characters each from the original characters, those generated by the existing method, and those generated by the proposed method. Therefore, the test data consist of a total of 750 characters. Fig. 16 illustrates the test data shown to the participants. We counted the number of participants who believed the characters to have been generated artificially; then, we calculated the averages and normalized the numbers. Table VI summarizes the results. A character's appearance is natural if the value is low and unnatural if it is high: natural means that the characters look handmade, and unnatural means artificial. Most of the characters generated by the existing method are classified as generated characters, since their values are in the range 0.77–0.99; the appearance of the characters generated by the method in [28] is far from handmade. In contrast, the values for the proposed method are much lower than those for [28]. In particular, the characters generated by the proposed method for Ibara and Onryo are close to the original characters. According to these results, it is difficult to distinguish the original characters from those generated by the proposed method, even for a human. Therefore, the proposed method successfully generated characters with a good appearance.

Fig. 16. Examples of test data and answers. (a) Test data shown to the participants. (b) Answers to the test data. Generated characters are gray.

TABLE VI
RESULTS OF THE SUBJECTIVE EVALUATION OF APPEARANCE
              Ibara   Gcomic   Onryo   Tsunoda   Zinpen
 Originals     0.06    0.02     0.04    0.01      0.01
 Saito [28]    0.96    0.9      0.86    0.98      0.94
 Proposed      0.11    0.09     0.07    0.13      0.08

E. Varied font generation

In order to demonstrate the font generation capability of the proposed method, we performed a generation experiment with 42 fonts: four standard fonts used in Windows (Gothic, Meiryo, Mincho, and YuMincho), six calligraphy handwriting fonts 5)–10), 17 pen handwriting fonts 11)–27), and 15 artificial fonts 28)–42). The fonts are varied and include Mincho, Gothic, clerical, antique, personal handwriting, and professional handwriting styles. It is worth noting that the number of fonts used in most existing methods is small; in particular, we used a significantly large number of handwritten fonts, 23. In this experiment, we generated 2,965 characters. We illustrate examples of the results in Fig. 17. The results are promising: the proposed method generated clean characters, and the style of each font is reproduced.

Fig. 17. Generation results of the proposed method and the original characters in various fonts. Each row shows one font. (a) Proposed. (b) Original.

VIII. CONCLUSION

This paper focused on the problem of generating characters for a typographic font by exploiting a subset of the font and a skeleton dataset. The proposed method extracts character strokes and constructs characters by selecting and deploying the strokes on the skeletons. The proposed method successfully extracts natural strokes from sample character images. It is worthwhile to emphasize the importance of stroke extraction from a small number of character images: the proposed method only requires a font subset, as small as 15 samples, which is a feasible number of samples to collect. With such a small subset, the proposed method can generate thousands of characters. Sample collection is further eased because this method can extract characters from image formats.

An experimental evaluation was conducted with five characteristic fonts that are difficult to generate with existing methods. We evaluated the generated characters by objective and subjective evaluations; all results indicated that the characters generated by our proposed method have a comparable appearance, usefulness, and readability to the original characters. Furthermore, we carried out experiments with subsets of 42 fonts to demonstrate the generative capability of the proposed method. In our future work, we will attempt to automatically adjust skeleton samples and conduct experiments with a larger number and greater variation of fonts.

APPENDIX
LIST OF FONTS
1) MS Gothic
2) MS Meiryo
3) MS Mincho
4) MS YuMincho
5) Aoyagi kouzan
6) Aoyagi reisho
7) Eishi kaisho
8) aiharahude
9) jetblack
10) riitf
11) azukifont
12) chigfont
13) gyate
14) hosofuwafont
15) KajudenB
16) KajudenR
17) kiloji
18) KTEGAKI
19) mitimasu
20) seifuu
21) setofont
22) SNsanafon
23) TAKUMISFONT-B
24) uzurafont
25) apjapanesefonth
26) KFhimaji
27) ruriiro
