Revised November 21, 2002. Optimal Paradigms1 John J. McCarthy University of Massachusetts, Amherst Abstract Transderivational Correspondence and Uniform Exponence are two recent theories of surface resemblances among morphologically related words. This chapter presents Optimal Paradigms theory (OP), which incorporates elements of both. In OP, candidates consist of entire inflectional paradigms. Within each candidate paradigm, there is a correspondence relation from every paradigm member to every other paradigm member. Faithfulness constraints on this intraparadigmatic correspondence relation resist alternation within the paradigm. This model is illustrated and supported with a type of evidence that has not figured in previous discussions, the templatic structure of the Classical Arabic verb. Arabic verb templates pose a problem for Generalized Template Theory, which demands that templatic restrictions emerge from independently motivated constraints. The OP model supplies the necessary link between this restrictive theory of templates and the Arabic data. 1. Introduction Traditional ideas of analogy, SPE(cid:146)s phonological cycle, and recent proposals about output- output faithfulness all have the same goal: to account for surface resemblances among morphologically related words. For instance, the trisyllabic participial form lightening has a syllabic n because of its relationship to the verb lighten (cid:151) a relationship not shared with the disyllabic noun lightning. Phenomena like this have figured prominently in discussions of analogy and its successors. 1For feedback on earlier versions of this chapter, I am grateful to the editors of this volume, to the participants in phonology seminars at the University of Massachusetts, to an audience at the University of Tromsł (particularly Curt Rice), and to John Alderete, Jonathan Bobaljik, Andries Coetzee, Paul de Lacy, Diamandis Gafos, Maria Gouskova, Yoonjung Kang, Maria- Rosa Lloret, Linda Lombardi, Paula MenØndez-Benito, Joe Pater, Alan Prince, Lisa Selkirk, and Jeroen van de Weijer. 1 In one thread of recent research (see §2 below), words have a surface resemblance because of shared membership in a paradigm. A paradigm is a set of inflected forms based on a common lexeme or stem: +lighten, lightens, lightened, lightening, is an example. The idea is that members of a paradigm should have similar surface phonology, such as the syllabic n that appears consistently throughout the lighten paradigm. In this article, I will introduce a novel formalization of surface resemblance through shared paradigm membership, couched within Optimality Theory (Prince and Smolensky 1993) and correspondence theory (Benua 1997, McCarthy and Prince 1995, 1999). In this Optimal Paradigms model (OP), an OT constraint hierarchy evaluates candidates consisting of entire paradigms. Within a paradigm qua candidate, every output realization of a lexeme stands in correspondence with every other output realization of that lexeme. Faithfulness constraints on this intraparadigmatic output- output correspondence relation resist alternation within the paradigm. Through interaction with markedness and other faithfulness constraints, they account for surface resemblance and failure of resemblance among members of a paradigm. The OP model is illustrated and supported by a type of data that has not figured in previous discussions of the surface-resemblance problem. Certain properties of Classical Arabic root-and- pattern morphology, I will argue, cannot be satisfactorily explained without the OP model. This argument also shows how OP supports the minimalist goals of Generalized Template Theory (GTT), which seeks to eliminate templates and similar stipulations from linguistic theory, replacing them with independently motivated constraints. The analysis of Arabic templates sets in clear contrast the differences between OP and serial-derivational models, including proposals for melding OT with Lexical Phonology (see §4). 2 2 Background to and Overview of the OP Model This section provides an informal introduction to the Optimal Paradigms model and to the analysis of Arabic templates. Later sections fill in the details of the theory and the analysis. Benua(cid:146)s (1997) Transderivational Correspondence Theory (TCT) says that a morphologically derived surface form stands in a relation of output-output (O-O) correspondence with its morphologically simplex counterpart, called the base. For example, lightening is in O-O correspondence with lighten. Harmonic evaluation proceeds recursively, from the base outward, by a principle of Base Priority. The surface phonology of the base lighten is carried over, by obedience to O-O faithfulness constraints, to the derived form lightening. Base Priority disallows influences running in the other direction, so derived lightening can never influence the base lighten via O-O correspondence. Because of Base Priority, TCT is inherently asymmetric: if form A influences the phonology of a morphologically related form B, then form B can never influence the phonology of form A. Another approach to the surface-resemblance problem is inherently symmetric, requiring that morphologically related words (or even individual morphemes) resemble one another without assigning priority: (1) a. Uniform Exponence (Kenstowicz 1996) Minimize the differences in the realization of a lexical item (morpheme, stem, affix, word). b. Metrical Consistency (Burzio 1994: 228) Every morpheme must be as metrically consistent as possible. In later work, Burzio (1996, 1999) generalizes Metrical Consistency to a principle of Anti- Allomorphy, which requires consistent realization of morphemes in all their phonological properties, not just stress. With this generalization, these principles become equivalent: morphemes, stems, or 3 words should not vary in their phonological realizations. Unlike TCT, this theory is inherently symmetric: it is possible for form A to influence the phonology of form B, while at the same time form B influences the phonology of form A. These two theories of surface resemblance among related words are more complementary than competing. TCT deals with the classic cases of cyclic or stratal behavior, such as English condØnse/c(cid:242)ndŁnsÆtion (Chomsky and Halle 1968, Pater 2000): ranked faithfulness constraints on surface forms, formalized under correspondence theory, enforce similarity, but by virtue of Base Priority they effectively enforce it only on the derived form, not the basic form. Uniform Exponence (UE) has been applied to paradigm-uniformity phenomena inter alia, preventing alternations among forms that are related inflectionally, typically where no morphologically simplex base is identifiable. TCT is not applicable to inflectional paradigms because it is an asymmetric, base-prioritizing theory (though see Benua 1997: Chapter 6 for a different view). In TCT, the base is the first step in the recursive evaluation. The derived form, which is the next step in the recursive evaluation, is obtained from the base by applying a morphological operation, such as affixation. Inflectional paradigms have no base in this sense: Latin amat (cid:145)he loves(cid:146) is not derived from amÇ (cid:145)I love(cid:146) or vice- versa; rather, both are derived from the lexeme /am-/. This deficiency in TCT might be remedied by invoking morphosyntactic unmarkedness, rather than morphological underivedness, as a way of identifying the base form in an inflectional paradigm (cf. Kenstowicz 1996, Kiparsky 1998: 31-2, Kraska-Szlenk 1995, Kurylowicz 1945-1949, Selkirk 2001). If this criterion is applied, then Latin amat is presumably the base of the /am-/ verbal paradigm. But this method of base identification will not work for the Arabic example discussed here (see §4.3). Just as TCT has difficulties with inflectional morphology, so too UE runs into problems with derivational morphology. If applied to derivation, UE overpredicts surface-resemblance effects. With UE, it is possible for the derived form to influence the phonology of the base or for a bound 4 morpheme in one word to influence the phonology of that morpheme in another word. Solid synchronic examples of these predictions do not seem to exist.2 Furthermore, UE represents more of an intuition than a usable phonological principle. In OT, a constraint is a function from a linguistic expression to zero or more violation-marks. Injunctions like (cid:147)minimize the differences(cid:148) or (cid:147)be as metrically consistent as possible(cid:148) are not well-defined constraints. Moreover, expressions like (cid:147)minimize X(cid:148) and (cid:147)as X as possible(cid:148) incorporate into themselves part of the definition of EVAL, and so they should not appear in constraint definitions (McCarthy 2002: 40). (See van de Weijer (1999) for an improved formalization of UE and Buckley (1999) for related discussion.) The Optimal Paradigms (OP) model proposed here synthesizes the best elements of TCT and UE. From TCT it draws the idea of using correspondence theory as a foundation. Correspondence theory supplies a range of well-defined, rankable constraints enforcing resemblance between forms. From UE, OP incorporates the idea of evaluating surface resemblance symmetrically across inflectionally related forms. The central premises of the OP model are therefore these: 2For example, Kenstowicz (1996) proposes that the leveling of the prefix des- as deh- in aspirating dialects of Spanish is a consequence of UE acting on all instances of this bound morpheme. But there is no evidence that this is part of the synchronic grammar: since deh- shows up in all contexts, there is no justification for underlying /des-/. Another example: Burzio (1994: 201) proposes that English words with the suffix -ic have penult stress to maintain uniformity with the same words ending in -ical: e.g., acadØmical affects acadØmic (cf. Chomsky and Halle 1968: 88). One problem with this analysis is that many words in -ic have no related form in -ical (sulfuric, Ethiopic, Olympic, Byronic) or they have a related form that is found in dictionaries but not used (taxonomic(al), semantic(al), prosodic(al), genetic(al)). Another problem is that other suffixes, such as -id and -ish, have the same stress behavior without a longer derivative to explain it. 5 (2) OP in Outline a. Candidates consist of entire inflectional paradigms, where an inflectional paradigm contains all and only the words based on a single lexeme (for similar ideas, see Bonet and Lloret 2001, Kenstowicz 1996: 385, McCarthy 1998, Raffelsiefen 1995, 1999, Tesar and Smolensky 2000).3 b. Markedness and input-output faithfulness constraints evaluate all members of the candidate paradigm. The violation-marks incurred by each paradigm member are added to those incurred by all the others.4 c. The stem (output form of the shared lexeme) in each paradigm member is in a correspondence relation U with the stem in every other paradigm member. (That is, for OP every candidate paradigm P there is a relation U on P(cid:215)P.) There is no distinctive base(cid:151) OP rather, every member of a paradigm is a base of sorts with respect to every other member.5 d. There is a set of output-output faithfulness constraints on the U correspondence relation. OP For example, suppose we have a language with no suffix in the singular and the suffix -i in the plural. Suppose this language also has coronal palatalization before i. From the lexeme /mat/, GEN will produce such candidate paradigms as +mat, mati,, +mat, ma±i,, and +ma±, ma±i,. Each candidate paradigm brings with it a correspondence relation U that relates the stems in each OP 3Here, I assume that paradigms are (cid:147)flat(cid:148), consisting of a list of all paradigm members. It is conceivable, however, as John Alderete and Diamandis Gafos point out, that paradigms have internal hierarchical structure. For example, Latin noun paradigms might decompose into separate subparadigms for singular and plural: ++pater, patris, ..., , +patre+s, patrum, ..., , sg. pl. (glosses: ++father (nom.), father (gen.), ...,, +fathers (nom.), fathers (gen.), ...,,). The correspondence relations and OP faithfulness constraints can be adapted to this proposal in various ways (cid:151) e.g., perhaps OP faith is violated once for every (sub)paradigm that hosts an alternation, or perhaps there are distinct correspondence relations (and distinct OP faithfulness constraints) within and between subparadigms. This possibility, though certainly intriguing, will not be pursued here because the evidence under discussion does not require it. 4The violation profile of a form is a vector representing all of its constraint violations in rank order, such as +*, **, (cid:216), ***, (Samek-Lodovici and Prince 1999). The violation profile of an entire paradigm is the vector sum of the violation profiles of all members of that paradigm. E.g., +*, **, (cid:216), ***, + +(cid:216), *, **, (cid:216), = +*, ***, **, ***,. 5Limitation of the correspondence relation to the shared lexeme recalls Alderete(cid:146)s (1998) notion of stem-to-stem correspondence, which is required in his accounts of pre-accentuation in Cupeæo and accent shift in Japanese. Since U is a relation on P(cid:215)P, every member of a paradigm is also in correspondence with itself. This is harmless, OP since self-correspondence can never lead to faithfulness violations. 6 paradigm member: mat U ma±i and, symmetrically, ma±i U mat. (The portions standing in OP OP OP correspondence are underlined.) The candidate +mat, ma±i, violates the constraint OP-IDENT(high) (or whatever feature distinguishes t from ±).6 The OP model presupposes a distinction between inflectional morphology, which is organized into paradigms, and derivational morphology, which is organized hierarchically by the relation (cid:147)is derived from(cid:148). (See Spencer (1991: Chapter 6) for a review of the issues surrounding this assumption.) Derivational morphology, I assume, is analyzed within TCT, as before. But inflectional paradigms are different from derivational hierarchies; in paradigms, all members are co-equal in their potential to influence the surface phonology of other members of the paradigm. This is formalized by U correspondence, which gives every paradigm member a chance to affect any other member. OP Whether it does or does not depends on the ranking.7 As we will see in greater detail in §4 and §5, this model predicts certain interactional patterns that set it apart from other approaches, particularly TCT. One pattern is overapplication-only. Given a language with a general process of coronal palatalization, there are two ways to level alternations within the paradigm of /mat/: +ma±, ma±i, or +mat, mati,. The first of these paradigms shows overapplication of the palatalization process (cid:151) there is palatalization of /t/ even in the unsuffixed form, where the conditioning i is absent. The second paradigm shows underapplication of palatalization: the process is blocked in the suffixed form mati because there is no palatalization in the unsuffixed form mat. OP-IDENT(high) is satisfied either way, but the paradigm with 6Technically, the paradigm +mat, ma±i, receives two marks from OP-IDENT(high), one for the mat U ma±i correspondence OP relation and the other for its symmetric counterpart. The paradigm +ma, ma±i, incurs one violation of OP-MAX for the ma U ma±i relation and one violation of OP-DEP for the ma±i U ma relation. OP OP 7Jeroen van de Weijer and the members of the Leiden Phonology Group raise an objection: the phonological effects of derivational and inflectional morphology are sometimes the same. For example, English stress-neutral suffixes can be both derivational (-ness) and inflectional (-ing). This is exactly as the OP model predicts: under ranking permutation, we expect to find cases where OP faithfulness constraints, which pertain to inflection, and OO faithfulness constraints, which pertain to derivation, are ranked similarly. The model also predicts, however, that inflection and derivation can act differently (cid:151) for example, only derivation can show true underapplication effects, and only inflection can show phonological influences from different paradigm members simultaneously (see §4.3). 7 underapplication cannot be obtained in the OP model. (Some care is required in defining what over- and underapplication mean in the context of a constraint-based theory like OT, so this statement should not be applied indiscriminately. See §5.2.) The problem with underapplication is that it competes with overapplication. Overapplication satisfies the high-ranking markedness constraint that is responsible for the basic palatalization process, but underapplication does not. Underapplication does better on IO faithfulness, but that is irrelevant, because the assumed existence of the process in the language as a whole shows that IO- IDENT(high) is ranked below the responsible markedness constraint. This means that there is only one way for underapplication to win: some other constraint must block overapplication. For an example of overapplication-only, see §4.1. For examples where overapplication is blocked and underapplication happens instead, see §4.2 and §5.2. For a general evaluation of the overapplication- only hypothesis, see §5.2. And for the reduplicative parallel, upon which this argument is based, see McCarthy and Prince (1995, 1999). A related prediction of OP is attraction to the unmarked. I will call a paradigm member an attractor if other members of its paradigm are forced to resemble it by visibly active OP faithfulness constraints. For example, in the leveled paradigm +ma±, ma±i, from /mat/, the form ma±i is the attractor, with ma± forced to resemble it by OP-IDENT(high). Now suppose we have a situation where there are two different ways to satisfy an OP constraint (cid:151) two different ways to level a paradigm, depending on which member is acting as the attractor. That is, there are candidates +A1, B1, and +A2, B2, that equally satisfy the high-ranking OP constraint, but differ in which paradigm member is doing the attracting: in the first paradigm, A1 is the attractor, but in the second paradigm, B2 is the attractor. Unless IO faithfulness is decisive, the winning paradigm will be determined by markedness, according to the following logic: (i) Identify the highest-ranking markedness constraint that favors A1 over A2. Call it 8 M(A1™A2). (ii) Identify the highest-ranking markedness constraint that favors B2 over B1. Call it M(B2™B1). (iii) If M(A1™A2) dominates M(B2™B1), then A1 is the superior attractor and so the paradigm +A1, B1, wins. (iv) But if M(B2™B1) dominates M(A1™A2), then B2 is the superior attractor and so the paradigm +A2, B2, wins. In other words, the markedness of the attractor is what matters. Attraction to the unmarked follows directly from the theory: in OP, the markedness violations of a candidate paradigm are the summed markedness violations of its individual members. The markedness violations incurred by +A1, B1, are those incurred by A1 or B1, so if the A1-favoring markedness constraint dominates the B2-favoring one, the paradigm that contains A1 is optimal. Here are some hypothetical examples to illustrate this prediction; for real-life cases, see §4.2 and Raffelsiefen (1995, 1999). Overapplication vs. underapplication in +ma±, ma±i, vs. +mat, mati, is perhaps the simplest example that can be constructed; indeed, overapplication-only is a special case of attraction to the unmarked. In +ma±, ma±i,, the suffixed form ma±i is the attractor, while in +mat, mati, unsuffixed mat is the attractor. Which paradigm wins depends on which markedness constraint is higher ranked: M(ma±i™mati) or M(mat™ma±). Under the assumption that this language has a general process of coronal palatalization, M(ma±i™mati) is top-ranked, so overapplication wins. A more complex example can be constructed from a language with a -u suffix in the singular and -i in the plural, with Japanese-style phonology of t before these vowels: affrication to ° before u and palatalization to ± before i. Then there is competition between two different ways to overapply, +ma°u, ma°i, vs. +ma±u, ma±i,. By attraction to the unmarked, the choice between them comes down to this question: is M(ma°u™ma±u) ranked higher or lower than M(ma±i™ma°i)? The answer 9 could go either way; in fact, this might be the only situation where these two constraints can be brought into conflict. The OP model also predicts the possibility of majority-rules effects, where the pattern that is most common in a paradigm acts as an attractor for others. Majority-rules effects are not a routine matter in the OP approach; the empirical circumstances and constraint rankings that will produce them are highly specific, as we will see in §5.1. But when conditions are propitious, we expect to see results like the following. Stems followed by a consonant-initial suffix alternate one way, in accordance with undominated markedness constraints. Stems followed by a vowel-initial suffix alternate another way, also in accordance with those undominated constraints. If markedness does not decide how stems with no suffix will alternate, then they will be attracted to the pattern that is more common in the rest of the paradigm, which of course depends on whether consonant-initial or vowel-initial suffixes happen to be more frequent. This result follows from minimization of OP faithfulness violations (cid:151) though some OP faithfulness violation is unavoidable because markedness forces differences between the two suffixed conditions, fewer violations of OP faithfulness are incurred if the unsuffixed form conforms to the more common of the two suffixed patterns. It is important to realize that attraction to the unmarked, overapplication-only, and majority- rules effects are not special stipulations or auxiliary principles. Rather, they are consequences of the OP model that devolve from its basic architecture. It is also important to realize that OP, as a theory of paradigms, only asserts these claims about inflectional morphology, not derivational. If inflectional morphology turns out to conform to these predictions, then the OP theory receives strong support. If the predictions turn out to be wrong, then the problem is profound and there is no easy way to patch around it because the predictions are so deeply connected to the tenets of the theory. Needless to say, whether they are right or wrong, our theories should always make such strong, falsifiable claims. 10
Description: