Grammaticalization in Somali and the development of morphological tone

We demonstrate how grammaticalization may cause the restructuring of prosodic systems, leading to the development of new prosodic types. Using data from Northern Somali and related varieties, we show how a restricted tone system has developed into one of morphological tone through the grammaticalization of independent words into bound forms. Diachronic change has weakened the “accentual” properties of the High tone in Somali, and the tone patterns are synchronically best analyzed as properties of morphological constructions rather than of prosodic domains.


Introduction.
One of the puzzles in Somali that has been the topic of debate involves data of the type illustrated in table 1. In his seminal paper on the prosodic system of Somali, Hyman (1981) proposed that there is at most one High tone per word in Somali, assigned to the final or penultimate mora of the word. The challenge to this approach comes from progressive forms such as keénayaa (in which the High tone is "too far" from the right edge) and keénayó (which has two High tones). Hyman (1981; 175) proposes that progressive forms have an internal word boundary between the stem and the suffixes: (keén)(ayaa) and (keén)(ayó). This way, one can keep the proposed generalizations regarding the number and location of High tones.
The idea that the High tone is associated with the word domain, and that one can use the High tone as a "diagnostic" of wordhood, has lived on in later works on the prosodic system of Somali, e.g. Le Gac (2002); Downing & Nilsson (2019); Green & Morrison (2016, 2018) (though see Kaldhol & Stausland Johnsen Forthcoming for arguments that the rule of one High tone per word is not synchronically productive). However, an explicit analysis of the tone patterns in the progressive forms has not been proposed in these later works.
The present paper aims to revisit the challenge posed by the progressive forms of verbs, and to account for the tone patterns illustrated in table 1 by demonstrating how they have developed. We do so by combining the tools of grammaticalization studies (e.g. Heine & Narrog 2015) and prosodic typology (e.g. Hyman 2006, 2009). The goal of the present paper is to demonstrate that while the distribution of the High tone in Somali used to be restricted within a word domain at a previous stage of the language, grammaticalization has weakened these restrictions, causing a new system to develop: the synchronic system is one of morphological tone in which the tone patterns are associated with different morphological constructions.
This paper is structured as follows. Section 2 provides background on grammaticalization studies. Section 3 outlines the basic properties of the tone system in Somali, and relates them to the broader study of prosodic typology. Section 4 outlines the changes that have taken place in the verbal domain, while section 5 discusses the consequences these changes have had for the overall tone system. Section 6 concludes.

Grammaticalization.
Givón famously stated that "Today's morphology is yesterday's syntax" (Givón 1971; 413). The development of inflectional affixes from function words that have become bound over time is amply documented (see e.g. Kouteva et al. 2019). For example, the Spanish future suffix -ré has developed from a Latin construction consisting of an infinitive verb + habēre 'to have' (ibid. p. 340). These types of developments are instances of grammaticalization, and can be captured by a cline of wordhood, as illustrated in (1). A grammaticalized form is one which has moved from left to right on the cline.
(1) The cline of wordhood (Hopper & Traugott 2003; 7)
    content item > grammatical word > clitic > affix
In the present paper, we will be less concerned about changes in meaning (e.g. how content items develop more grammatical meaning through changes such as semantic bleaching) and more about changes in form; that is, we will focus on the aspect of grammaticalization typically referred to as coalescence. Coalescence refers to a gradual increase in boundness, after which function words become 'glued' to content words (see e.g. Haspelmath 2012). This is also known as morphologization (Joseph 2003). When a function word is grammaticalized and develops into an affix, this change also affects the relationship between this grammaticalized form and other elements within its construction. This dimension of grammaticalization is captured in the following definition:
    A grammaticalization is a diachronic change by which the parts of a constructional schema come to have stronger internal dependencies. (Haspelmath 2004; 26)
One example that illustrates this is the Greek future marker tha, which has developed from the reduction of thélō hína 'I wish that' (Kouteva et al. 2019; 453). This is called univerbation, which Lehmann defines as "the union of two syntagmatically adjacent word forms into one" (Lehmann 2020; 206).
Univerbation is furthermore characterized by gradience and gradualness:
    In principle, univerbation takes place at the moment that a construction is converted into a word. Since, however, the concept of word itself does not have neat boundaries, this process is not, in fact, an instantaneous conversion, but rather a transition. (Lehmann 2020; 209)
Grammaticalization studies are in part the result of recognizing this gradience, and the fluidity of the concepts on the cline of wordhood (Hopper & Traugott 2003; 7).

Tone patterns in Northern Somali.
Somali is a Cushitic language spoken on the Horn of Africa. In the present paper, we focus on the dialect group referred to as Northern Somali.
Some clarification of what we mean by this is in order here, since Somali dialectology has a long history of confusing terminology (see Tosco 2012 for an overview). Northern Somali has three main subgroups: Northeastern Somali, Northwestern Somali, and Benaadir Somali (note that there are disagreements about whether Benaadir Somali should be considered a part of Northern Somali, as in e.g. Tosco 2012; 272, or a separate dialect group, as in e.g. Banti 2011; 693). Much work on Somali is based on Northwestern Somali (e.g. Andrzejewski 1956; Saeed 1999). While there is no "standard" Somali as such, the written Somali language is based mainly on Northern Somali. In the present paper, we are concerned mainly with Northwestern and Benaadir Somali, for which Tosco (2012; 272) uses the term Central-Northern Somali. Northeastern Somali, as we will see, has retained more "archaic" features. It should also be noted that a process of koineization or dialect-levelling took place after the written language was officially launched in 1972 (Banti 2009). What we say here about Central-Northern Somali is true of this spoken koiné as well.
Another dialect group that will be relevant below is Maay. There have been disagreements about whether this should be considered another dialect group within Somali, or a separate language, and there appears to be internal diversity within this group such that it is more of a continuum (Lamberti 1986b; 23-24; Tosco 2012; 271). For more on the classification of Somali dialects and the history of Somali dialectology studies, see Abdirachid (2011).
The main properties of the tone system of the Northern Somali group can be summarized as follows: the tonal contrast is between a High tone (kú 'in') and a Low tone (kù 'you'). One can describe the system by referring exclusively to the High tone, so one can analyze the contrast as High vs. Ø (toneless), with default insertion of Low tones. Consider the examples in table 2 (an acute accent indicates a High tone; no accent indicates a default Low tone).

Table 2. Tonal gender marking (columns: Masculine, Feminine)
Armstrong (1934) was the first to document the relationship between tone and gender revealed in examples like the ones in table 2; Hyman (1981) was the first to analyze them with the mora as the Tone-Bearing Unit (TBU). This move allows one to state the relationship between tone and gender in this particular declension in simple terms: feminine nouns have a High tone on the final mora, masculine nouns have a High tone on the penultimate mora (note that only vowels are moraic in Somali and that coda consonants cannot be tone-bearing). The High tone is further restricted in its distribution in ways that have been associated with "accent" systems. The so-called accentual properties that tones may have include the ones listed in (2) (from Downing 2010; Hyman 2009; 220).
(2) Accentual properties potentially found in tone systems
As stated above, the tonal contrast in Somali may be analyzed as privative, H vs. Ø. However, the High tone is not obligatory, because there are toneless words such as wada 'together' and kala 'apart'. Many verb forms are toneless (see below as well as Saeed 1999).
The two accentual properties of interest in the present paper are culminativity and demarcativity. The High tone in Somali is typically analyzed as exhibiting both of these properties. As exemplified in table 2 above, all Somali roots, which we define here as the minimal unit with a lexical meaning, have at most one High tone (in line with culminativity). This High tone is assigned to the final or penultimate mora (in line with demarcativity). Furthermore, in many morphological constructions, only the rightmost morpheme has a High tone, as illustrated by the alternations in (3)-(5) (see e.g. Green & Morrison 2016; Kaldhol 2021, Forthcoming; Saeed 1999).
An important point for the present purposes is that both culminativity (at most one High tone per word) and demarcativity (the High tone marks the edge of a word) reference the unit word. We saw in section 2 above that in grammaticalization studies, the word is not a primitive, but a label used to mark a point on a continuum from more bound to less bound. When conceptualized this way, the notion word is of a gradual and gradient nature.
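The two restrictions just defined can be stated as simple checks over mora-level tone strings. The following is a minimal sketch, not part of the analysis itself: it assumes that every vowel letter is one mora, that an acute accent marks a High tone, and that the whole orthographic form is the domain being tested. The function names are hypothetical, and the vowel inventory is restricted to a, e, i, o, u (so transcriptions with other vowel symbols are not covered).

```python
import unicodedata

VOWELS = set("aeiou")   # assumption: each vowel letter counts as one mora
ACUTE = "\u0301"        # combining acute accent, used here to mark a High tone


def mora_tones(form: str) -> str:
    """Return one character per mora: 'H' for High, '0' for toneless."""
    # NFD splits "é" into "e" + combining acute, so the accent can be read off.
    chars = unicodedata.normalize("NFD", form.lower())
    tones = []
    for i, ch in enumerate(chars):
        if ch in VOWELS:
            is_high = chars[i + 1 : i + 2] == ACUTE
            tones.append("H" if is_high else "0")
    return "".join(tones)


def is_culminative(form: str) -> bool:
    """At most one High tone in the form."""
    return mora_tones(form).count("H") <= 1


def is_demarcative(form: str) -> bool:
    """No High tone further left than the penultimate mora."""
    return "H" not in mora_tones(form)[:-2]


for form in ["kú", "wada", "keénayaa", "keénayó"]:
    print(form, mora_tones(form), is_culminative(form), is_demarcative(form))
```

Under these assumptions, toneless wada passes both checks vacuously, keénayaa fails only the demarcativity check, and keénayó fails both, which matches the description of the progressive forms in the text.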
In the present paper, we argue that the distribution of the High tone in Somali is readily accounted for by grammaticalization: what looks like "exceptions" to culminativity and demarcativity are the result of a gradual increase in boundness, which has caused function words to become 'glued' to content words, weakening the accentual properties of the High tone. We turn to these developments in the next section.

Coalescence in the verbal domain.
The main puzzle in the verbal domain in Northern Somali is the tone patterns of progressive forms such as the ones illustrated in (6)-(7). First, keénayaa is a problem for a demarcativity analysis because the High tone is "too far" from the right edge. Second, keénayó is also a problem for a culminativity analysis due to the presence of two High tones.
Table 5. The progressive aspect (surviving row, positive and negative forms: keénayaan, keénayáan; cúnayaan, cúnayáan)
The positive forms violate demarcativity, since the High tone is "too far" from the right edge (e.g. keénayaa). The negative forms also violate culminativity, since there are two High tones: one on the stem, and one on the suffix (e.g. keénayó). The differences between the present forms in table 3 (e.g. keenaa, keenó) and the progressive forms in table 5 are that there is an additional suffix -ay- between the stem and the person-number suffixes, and a High tone on the mora preceding -ay- (keén-ay-aa, keén-ay-ó). Moreno (1955; 79) analyzes the progressive -ay- as being an instantiation of the lexical verb hay 'have, hold' (whose present tense paradigm is provided in table 3, bottom right corner). He points to periphrastic constructions in other dialects that show the relationship to hay more transparently. Examples are illustrated in (8).
(8) sheenә háaye 'I am bringing' (Saeed 1982; 25)
We should note that although accent marks are provided in these sources, it is not clear exactly what they mean or if they can be compared directly to the High tone in Central-Northern Somali. What is important for the present purposes is that the existence of periphrastic progressives in related dialects motivates the analysis of the Northern Somali progressive (cúnayaa) as resulting from a former periphrastic construction.
As explained in section 2, univerbation is a gradual process, and we may thus expect to see forms at different stages of the grammaticalization process co-existing within the same dialect group. Such a situation is reported for varieties of Maay, as exemplified in (9).
(9) Variation
    sheenә háayaan ∼ sheenáayaan 'they are bringing it' (Maay; Saeed 1999; 24-25)
4.2. WEAKENING OF THE ACCENTUAL PROPERTIES OF THE HIGH TONE. Nearly 70 years have passed since Moreno (1955), and we are now in a position to build on his work by connecting it to grammaticalization studies as well as prosodic typology, both of which are fields that have progressed substantially over the past decades. First, we can rephrase the relationship between the progressive suffix and the verb hay: it is not that -ay- is an instantiation of the verb hay synchronically; rather, the progressive forms are the result of a gradual coalescence of a former periphrastic construction (*keéni hayaa) into a synthetic one (keénayaa). To put this in the terms outlined in section 2, the auxiliary verb hay has moved along the cline of wordhood, and become the suffix -ay- 'PROG'. As a result, the relationship to hay 'have' is no longer transparent. In other words, the parts of the progressive constructional schema have come to have stronger internal dependencies, as two syntagmatically adjacent forms have coalesced into one (*keéni hayaa ⇒ keénayaa).
Second, we are tying the data to the study of prosodic typology, by demonstrating how grammaticalization has restructured the tone system: while tone was subject to distributional restrictions (culminativity and demarcativity) at a previous stage of the language, grammaticalization has weakened these restrictions:
(10) Weakening of demarcativity:
     *keéni hayaa ⇒ keénayaa
     *cúni hayaa ⇒ cúnayaa
(11) Weakening of culminativity:
     *keéni hayó ⇒ keénayó
     *cúni hayó ⇒ cúnayó
The fact that the progressive forms have a High tone on the stem (e.g. keénayaa, keénayó) is explained by the diachronic origin of these constructions: the progressive used to be periphrastic, and consisted of an infinitive form with a High tone (e.g. *keéni) followed by an auxiliary verb (e.g. *keéni hayaa, *keéni hayó). We argue that the univerbation of the periphrastic progressive forms into synthetic ones has blurred the relationship between tones and words, and that in this way grammaticalization has restructured the prosodic system.

Proposal.
We propose that grammaticalization has caused the High tone in Somali to lose its accentual properties (demarcativity, culminativity). At a previous stage, tone was tied to metrical structure and prosodic domains in the sense that the High tone was assigned to the final or penultimate mora of words (the High tone was demarcative), and there was at most one High tone per word (the High tone was culminative). As function words have developed into affixes and become bound to content words, new tone patterns have been introduced, causing a new system to develop. The synchronic system is one of morphological tone in which the tone patterns are associated with morphological constructions rather than prosodic domains.
We do not aim to provide a synchronic analysis of the tone patterns here, as we take the position that a diachronic account of the facts gives a satisfactory answer to questions such as why this system looks the way it does, and how it developed. For more on diachronic explanations, see e.g. Blevins (2004); Bybee (2003). However, a few notes on synchronic implications are in order: the tone patterns are predictable from the grammatical features of constructions (e.g. progressive, negative), and thus they can also be analyzed as properties of the constructions.
For example, one can analyze the suffixes as having different tones themselves, as well as different tonal effects on the verb stem. We illustrate this by returning to the examples presented at the beginning of the paper, repeated here in table 6.
We see that there is a whole range of tone-morphology interactions in Somali: first, there is tonal imperative marking (kéen). In this case, the penultimate High tone is an exponent of a grammatical feature. Second, we can categorize the verbal suffixes in Somali into Low-toned (as in keen-aa; these could also be analyzed as toneless) and High-toned (keen-ó), alongside the type constituted by the infinitive and the progressive suffix, which themselves are Low-toned but co-occur with a High tone on the final mora of the stem (keén-i, keén-ay-aa, keén-ay-ó). In autosegmental terms, this latter pattern can be captured with a floating H tone which associates to the mora preceding the suffix -ay-. Note also that the progressive suffix is segmentally homophonous with the past tense suffix -ay, which however does not co-occur with a High tone on the stem (12). In a morpheme-based approach, one can put the floating High tone into the underlying representation of the suffix and stipulate an association rule. In a constructionist approach, one can say that both the suffix and the High tone are properties of the construction as a whole; the constructions in table 6 thus consist of a mapping between different configurations of elements and different meanings.
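The morpheme-based option described above can be made concrete with a small sketch. It is an illustration under stated assumptions rather than a proposed synchronic analysis: each suffix is stored with its own surface tone, and a hypothetical flag `floating_h` marks suffixes that place a High tone on the final mora of whatever precedes them. The class, the function names, and the labels for the four suffixes are all assumptions introduced for the example; tonal imperative marking (kéen) and the past tense suffix are left out.

```python
import unicodedata
from dataclasses import dataclass
from typing import Sequence

VOWELS = set("aeiou")   # assumption: each vowel letter counts as one mora
ACUTE = "\u0301"        # combining acute accent, used here to mark a High tone


@dataclass(frozen=True)
class Suffix:
    form: str                 # surface shape, including the suffix's own tone
    floating_h: bool = False  # docks a High tone on the preceding mora


def dock_high(base: str) -> str:
    """Associate a High tone to the final mora of `base`."""
    chars = unicodedata.normalize("NFD", base)
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] in VOWELS:
            if chars[i + 1 : i + 2] != ACUTE:   # do not double-mark
                chars = chars[: i + 1] + ACUTE + chars[i + 1 :]
            break
    return unicodedata.normalize("NFC", chars)


def build(stem: str, suffixes: Sequence[Suffix]) -> str:
    """Concatenate stem and suffixes, applying floating High tones left to right."""
    out = stem
    for suffix in suffixes:
        if suffix.floating_h:
            out = dock_high(out)
        out += suffix.form
    return unicodedata.normalize("NFC", out)


# Hypothetical labels for the suffix types discussed in the text.
LOW_AA = Suffix("aa")                  # Low-toned (or toneless)
HIGH_O = Suffix("ó")                   # High-toned
INF_I = Suffix("i", floating_h=True)   # infinitive: High tone on the stem
PROG_AY = Suffix("ay", floating_h=True)  # progressive: High tone on the stem

print(build("keen", [LOW_AA]))            # keenaa
print(build("keen", [HIGH_O]))            # keenó
print(build("keen", [INF_I]))             # keéni
print(build("keen", [PROG_AY, LOW_AA]))   # keénayaa
print(build("keen", [PROG_AY, HIGH_O]))   # keénayó
```

Storing the surface tone on each suffix and the stem effect as a separate flag keeps the two kinds of tonal behaviour apart; a constructionist encoding would instead attach both to an entry for the construction as a whole.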
We believe the "word" is a useful language-particular descriptive category (in the sense of Haspelmath 2018) for capturing the distribution of the High tone at a previous stage of the language (and in the dialects whose High tone still has accentual properties). However, the tone patterns in Central-Northern Somali have been morphologized. Diachrony accounts for the distributional facts, and referencing the "word" to account for them synchronically is unnecessary. Furthermore, the weakening of the accentual properties of the High tone has in turn weakened the evidence for the "word" itself. We do not attempt here to define the "word" domain in Central-Northern Somali, or for that matter, to decide whether there is a "word" domain at all in this dialect group. The a priori assumption that (grammatical or phonological) "words" are relevant units for all languages has been questioned by e.g. Bickel et al. (2009); Haspelmath (2011); Schiering et al. (2010); Tallman (2020) (contra the hypotheses in frameworks such as the Prosodic Hierarchy; e.g. Nespor & Vogel 1986; Selkirk 1984). In our view, the question is whether there is any synchronically productive rule at all that requires reference to such a domain (we are not convinced that this is the case; see Kaldhol & Stausland Johnsen Forthcoming for discussion).

Concluding remarks.
Mithun has demonstrated how the "recognition of the processes involved in grammaticalization can provide valuable tools as we seek to explain the patterns that occur in languages" (Mithun 2011; 192). The present paper has aimed to extend this line of exploration to the study of prosodic typology and tone, by demonstrating how grammaticalization may cause the restructuring of prosodic systems, leading to the development of new prosodic types.
As explained in section 2, we have been concerned with form rather than meaning. One of the processes involved in grammaticalization is semantic bleaching, in which content items develop more grammatical meaning over time (for example, lexical verbs may develop into auxiliary verbs). We have instead focused on how function words (in this case the auxiliary verb hay) may develop into affixes as the result of a gradual increase in boundness.
Within prosodic typology, the distributional restrictions that tones may have are described in terms of so-called "accentual properties" (see e.g. Downing 2010 and Hyman 2009). These properties make reference to a word domain (e.g. one tone per word, tone marking edges of words). This categorical and a priori given "word" notion found in the study of prosodic typology thus contrasts with the gradual and gradient "word" notion conceptualized in grammaticalization studies.
By combining insights from the two approaches, we have argued that the gradual changes involved in grammaticalization have weakened the accentual properties of the High tone in Somali, in such a way that the prosodic system has been reorganized, and the tone patterns have been morphologized. At a previous stage, tone was subject to distributional restrictions within a word domain, but the synchronic system is one of morphological tone in which the tone patterns are associated with morphological constructions. Hence a gradual and gradient conception of wordhood and boundness may prove elucidating for the study of prosodic typology, as the developments that have taken place in Somali illustrate how one type of prosodic system can evolve into another.