From Pāṇinian Sandhi to Finite State Calculus
Malcolm D. Hyman
Max Planck Institute for the History of Science, Berlin
First International Sanskrit Computational Linguistics Symposium, Paris, 2007

Overview
1. Research context
2. An XML vocabulary for Pāṇinian rules
3. From Pāṇinian rules to an FST
4. Implications: remarks on linguistic description

Research context
Ongoing work on modeling components of Sanskrit grammar according to Pāṇinian principles:
  nominal inflection
  verbal inflection (using Dhātupāṭha)
  stem formation (perfect stem, participial stems...)
  morphophonology (sandhi)

Methodology
How closely to follow Pāṇini?
  Practical concerns dictate an incremental approach.
  We are obliged to interpret Pāṇini.
  Research results concerning both Indian grammatical methods and facts of the Sanskrit language will emerge from computational studies.

Building blocks of an XML model
The rules model not only a Pāṇinian sūtra, but also its context and its interpretation.
  An XML schema
  A sound-based encoding (SLP1)
  A regular expression dialect (PCREs)

The SLP1 encoding
(Each pair of rows gives the transliteration, then the SLP1 character.)

  a   ā   i   ī   u   ū
  a   A   i   I   u   U

  ṛ   ṝ   ḷ   ḹ
  f   F   x   X   *

  e   ai  o   au
  e   E   o   O

  k   kh  g   gh  ṅ
  k   K   g   G   N

  c   ch  j   jh  ñ
  c   C   j   J   Y

  ṭ   ṭh  ḍ   ḍh  ṇ
  w   W   q   Q   R

  t   th  d   dh  n
  t   T   d   D   n

  p   ph  b   bh  m
  p   P   b   B   m

  y   r   l   v
  y   r   l   v

  ś   ṣ   s   h
  S   z   s   h

  * anusvāra = M; visarga = H

The rule element
8.3.23 mo ’nusvāraḥ
<rule source="m" target="M" rcontext="[@(wb)][@(hal)]" ref="A.8.3.23"/>
(We may need more than one rule to express a sūtra.)

The macro element
We need some means for translating Pāṇini’s metalanguage, e.g. sound classes (pratyāhāras):
<macro name="JaS" value="JBGQDjbgqd" c="voiced stop"/>

The mapping element
1.1.2 adeṅ guṇaḥ
<mapping name="guna" ref="A.1.1.2">
  <map from="@(a)" to="a"/>
  <map from="@(i)" to="e"/>
  <map from="@(u)" to="o"/>
  <map from="@(f)" to="a"/>
  <map from="@(x)" to="a"/>
</mapping>

The function element
<function name="gunate">
  <rule source="[@(a)@(i)@(u)]" target="%(guna($1))"/>
  <rule source="[@(f)@(x)]" target="%(guna($1)) %(semivowel($1))"/>
</function>
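
A minimal Python sketch of how the rule, macro, mapping and function elements above might be interpreted. It is not the processor described in these slides. Only the JaS value, the rule for 8.3.23 and the guna map come from the slides. Everything else is an assumption made for illustration: the values of the wb and hal macros, reading rcontext as a regular-expression lookahead, the membership of the vowel classes @(a), @(i), @(u), @(f), @(x), the semivowel mapping, and the helper names expand_macros and apply_rule.

```python
import re

# Macro table. "JaS" is taken from the slides; "wb" and "hal" are
# assumed values chosen for this sketch.
MACROS = {
    "JaS": "JBGQDjbgqd",                        # voiced stops (from the slides)
    "wb": " ",                                  # assumption: a space marks a word boundary
    "hal": "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh",  # assumption: all SLP1 consonants
}

MACRO_REF = re.compile(r"@\((\w+)\)")


def expand_macros(pattern: str) -> str:
    """Replace every @(name) reference with the value of that macro."""
    return MACRO_REF.sub(lambda m: MACROS[m.group(1)], pattern)


def apply_rule(text: str, source: str, target: str, rcontext: str = "") -> str:
    """Rewrite source as target wherever rcontext follows it.

    Assumption: rcontext is a right-hand context that is tested but not
    consumed, so it is compiled here as a lookahead.
    """
    regex = expand_macros(source)
    if rcontext:
        regex += "(?=" + expand_macros(rcontext) + ")"
    return re.sub(regex, lambda _m: target, text)


# The guna mapping (1.1.2), with each vowel class assumed to cover the
# short and the long vowel.
GUNA = {"a": "a", "A": "a", "i": "e", "I": "e", "u": "o", "U": "o",
        "f": "a", "F": "a", "x": "a", "X": "a"}

# Hypothetical semivowel mapping; it is referenced but not shown in the slides.
SEMIVOWEL = {"f": "r", "F": "r", "x": "l", "X": "l"}


def gunate(vowel: str) -> str:
    """Guna grade of a vowel; vocalic r and l also receive their semivowel."""
    return GUNA[vowel] + SEMIVOWEL.get(vowel, "")


if __name__ == "__main__":
    # 8.3.23: word-final m becomes anusvara (M) before a consonant.
    print(apply_rule("tam gacCati", "m", "M", "[@(wb)][@(hal)]"))
    print(gunate("i"), gunate("u"), gunate("f"))
```

Under these assumptions, apply_rule("tam gacCati", "m", "M", "[@(wb)][@(hal)]") gives "taM gacCati", while "tam eva" is left unchanged because the following word begins with a vowel. gunate("f") gives "ar".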
Description: