Sich ausgehen: Actuality entailments and further notes from the perspective of an Austrian German motion verb construction

The paper contains notes on the processes of interpretation and language change, starting out from the construction that emerged in Austrian German from the motion verb gehen, ‘go’, joined by the reflexive and the particle aus, ‘out’. The primary empirical case made is that the construction is fully implicative. Furthermore, it is suggested that concepts such as Maximize Presupposition and co-development in contact can be useful tools in the equipment of researchers working on semantic change. Finally, a methodological point is made towards bridging synchronic and historical data collection processes.


Introduction.
The goal is to make some observational and theoretical points concerning an original motion verb joined by the particle aus and the reflexive in Austrian German (henceforth SAG), illustrated in (1):
(1) Eine Tasse Kaffee geht sich vor dem Termin aus.
    a cup coffee goes itself before the appointment out
    'It's possible to have a cup of coffee before the appointment. / There is enough time to have a cup of coffee before the appointment.'
I largely follow the description and analysis of the phenomenon elaborated in Gergel & Kopf-Giammanco (forthcoming). The original motion verb construction has come to express a type of modality related to scales and sufficiency in Present-day Austrian German. Epistemic, deontic and purely circumstantial readings not directly related to scales are typically out. SAGs are indeed distinct in several ways from run-of-the-mill modals and so-called enough constructions (ECs), with which they share some but not all characteristics (cf. Nadathur 2019 on the latter). The current objective is to fill certain gaps in the extant literature, more specifically:
i. Buttress the view that SAGs are more implicative than classical modalizing expressions. The fact that SAGs and different types of ECs differ is discussed in Gergel & Kopf-Giammanco (forthcoming). What I focus on below is the distinction from other modals in German, the implicativity of which has occasionally been claimed (without sufficient support, as I will argue). I will present results that relativize the implicative view for the modals and show distinctions in the judgments with respect to the inferences generated by SAGs.
ii. Underline the necessity of considering the role of presuppositions in language change, by suggesting a diachronic version of Maximize Presupposition (Heim 1991).
iii. Sketch co-development as a mode of semantic change with the hope of accounting for some puzzles in the current diachronic literature. I will make the apparently counter-intuitive point that going down two paths may be better than going down one. Specifically, language-internal semantics and language-external factors can both be heavily at work in semantic change.
The follow-up question will be how to ascertain the impact of one development (say the native one in the course of change), when the other one (e.g. a different language) has already interfered and produced a composed result. In this connection, the very final part of this paper sketches a more general solution for a methodological impasse encountered in historical semantics compared with field-work methods currently utilized in semantic research, which is hoped to be of use in further work in the field.
The structure of this paper follows the points above (Section 2 discusses point (i), etc.).

SAG as an implicative construction.
This section compares the implicative behavior of SAGs with that of German modals and offers new empirical material supporting the claims.
2.1. MINI-BACKGROUND ACTUALITY ENTAILMENTS & MODALS.
Implicative predicates are known as those predicates which give rise to inferences (more recently known as actuality entailments, AEs) that their complements hold true in the actual (and not just in some possible) world or situation. Even the literature that calls such inferences 'entailments' does not commit to their exact status, so I will use AE similarly, as a descriptive label for the inferential effect, without inquiring into its theoretical nature here, but establishing to what degree it exists in the first place.
An insight of Bhatt (1999) and Hacquard (2006) has drawn attention to the fact that modal expressions can also give rise to AEs, but in a selective manner. The research focus has been on languages that possess an overt morphological marking of the perfective/imperfective distinction, such as Hindi or the Romance languages. For such languages, it has been shown that AEs only obtain in the perfective. For example, see the French (2), from Hacquard (2006:13), where it is only the perfective that induces the inference that Jane not only could but actually did take the train.
(2) Pour aller au zoo, Jane {pouvait / a pu} prendre le train
    to go to zoo Jane can-past-impf / can-past-pfv take the train
The question of what happens with modals and modal expressions in languages that do not encode the (im)perfective distinction morphologically has, to my knowledge, been addressed less frequently. German is a good representative in this respect. Even though it has an analytic form known as Perfekt, 'perfect', for most purposes this is functionally and semantically equivalent to a simple preterite. There is a rough North-South divide in usage, and register correlations exist (the perfect being often considered more colloquial than the preterite). Crucially, however, there is no systematic opposition, such that, for example, the synthetic form (the preterite) would function as a marker of imperfective aspect and the Perfekt as a genuinely opposing perfectivizing form. Stechow et al. (2004) note that German modals give rise to actuality entailments. Gergel (2017a), while focusing on Old English, also observes actualistic behavior in attested examples in German. In Old English, counterexamples can be found; that is, cases of past modals that are interpreted non-actualistically or even as counter-to-fact. For German, while corpus examples are interpretable actualistically, a broad corpus study that focuses on the language should be helpful in determining their behavior.

HOW IMPLICATIVE ARE GERMAN MODALS?
Moreover, to my knowledge, speakers' intuitions have not been consulted much thus far. I will argue that such intuitions, when confronted with the direct question of whether a particular modalized eventuality also held in the actual world, are in fact considerably more variable than might be suspected from earlier observations. In what follows, I describe the results of an experiment run in Graz, Styria (Austria) and Saarbrücken, Saarland (Germany). I first report on the results obtained for the modals and then compare them with SAGs (for the relevant Austrian speakers) in the next subsection. This offers an opportunity to obtain an initial empirical estimate of the AE profile of German modals in the first place, and then to use it as a foil for the behavior of the SAG construction extant in Austrian German.
A total of 74 speakers, consisting mostly (but not exclusively) of undergraduate students, answered questions containing core modals. The speakers were tested with infinitival complements of the modals in the four major types of eventualities: accomplishments, achievements, activities and states. The background to this is that the type of eventuality could in principle play a role as, for instance, modalized states may be more likely to be interpreted epistemically, and epistemic modals typically resist AEs (according to Hacquard 2006 and others, for structural reasons in languages like French). The questionnaire included 40 questions for each speaker, of which 20 were fillers on matters unrelated to modality (e.g., testing classical entailment patterns and other unrelated factors). Ten additional items for each participant were semi-fillers in the sense that they included items that can be interpreted modally in some sense or another (such as SAGs, but also other modal periphrases) but which were not core modals. The four core modals tested were können, 'can', dürfen, 'may', müssen, 'must', and sollen, 'shall'. Modals and eventuality types were administered in all combinations. As no differences could be detected for the core modals between the two major varieties (Austrian and Federal German with 43 and 31 speakers respectively), I report the results for the entire population tested for the core modals. The experimental set-up included only the modalized sentences (and no additional context), followed by the question of interest. This corresponds to the way data are typically presented in the theoretical literature on other languages (cf. example (2) above), where, however, they are based on introspection (cf. Gergel & Kopf-Giammanco forthcoming for some discussion of pros and cons of minimal contexts in another type of informal questionnaire).
To illustrate the current set-up, for a minimal-context sentence such as (3a), informants were asked the question in (3b):
(3) a. Bernd konnte ein Bild von der Katze malen.
       Bernd could a picture of the cat draw
       'Bernd was able to draw a picture of the cat.'
    b. Hat Bernd ein Bild von der Katze gemalt?
       has Bernd a picture of the cat drawn
       'Did Bernd draw a picture of the cat?'
The possible answers formed a graded forced choice regarding the degree of actualization of the event, ranging from "no" (counting 0) to "yes" (counting 4), via the intermediate stages ("rather no", 1, "maybe", 2, and "rather yes", 3). The scale is ordinal, so, as usual, ratios on it carry no significance (for instance, a rating of 4 is not to be understood as twice as actualistic as a rating of 2). Notice further that a rating of 2 is still just a "maybe" and does not indicate any bias towards an actuality "yes". Based on their labelling, it is only ratings of 3 and 4 that can be taken as an indication of an actualistic interpretation. Table 1 presents the results across the four eventuality types for the four modals können, 'can,' dürfen, 'may,' müssen, 'must,' and sollen, 'shall,' in the preterite and the perfect, respectively. The findings clearly indicate that sollen is not actualistic at all; both in the preterite and the perfect, it is on average below the 'maybe' level 2. This may be expected if we take into account some of its additional properties, such as the identity in form between preterite and conjunctive present. It remains noteworthy that the perfect of sollen is almost just as non-actualistic.
At the same time, a descriptive tendency to interpret events as if they held in the actual world can be observed for the complements of the other three modals. But the tendency is not as clear as suggested in some of the corpus-based or introspective literature, and it is not categorical. As a point of further comparison, I will put these findings into perspective with regular entailments and SAG patterns shortly below. For now, notice that, unlike in Romance and other aspect-marking languages, what can certainly be refuted, as the figures for the modals confirm, is that a correlation between the perfect (to the detriment of the non-Perfekt) and actuality entailments should hold. For German, not only is there no significant preference for the perfect in actualistic interpretation, but in some of the cells in Table 1 it is the preterite which induces higher confidence of actuality. This holds notably for the modal that recorded the highest AE ratings, namely können, 'can': while it averaged 3.34 in the preterite, the average perfect rating was lower, at 2.89. This leads me to conclude that no AE-biasing effect of the perfect exists in German and that können, 'can', is the most actualistic modal.
Clearly, more could be said about the core German modals. For instance, the average ratings have been based on an equal weighting of the aspectual classes, although the data points were not always equal in number for practical reasons. As can be expected, the averages for each eventuality type differ. They are detailed for können in Table 2. Data such as the ones in Table 2 (and similar data for the other modals, not discussed for space reasons) should be taken with a grain of salt, as the distinction with respect to eventuality types was not a primary concern. Furthermore, such data do not (need to) match corpora, as the distinct eventuality types do not have to be equally frequent in use.
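The coding of the answer scale described above can be made explicit in a few lines. The following Python sketch is only illustrative: the function names and the sample answers are hypothetical and not part of the study, and treating an average of at least 3 as actualistic extends the labelling of individual ratings to averages, which is an assumption made here for exposition.

```python
from statistics import mean

# Answer labels and their numeric codes, as described for the questionnaire.
SCALE = {"no": 0, "rather no": 1, "maybe": 2, "rather yes": 3, "yes": 4}


def mean_rating(answers):
    """Average the coded answers (a descriptive summary of an ordinal scale)."""
    return mean(SCALE[a] for a in answers)


def is_actualistic(score):
    """Assumption: only scores at or above 'rather yes' (3) count as actualistic."""
    return score >= SCALE["rather yes"]


# Hypothetical answers to a single item, for illustration only.
sample = ["yes", "rather yes", "maybe", "yes"]
print(mean_rating(sample), is_actualistic(mean_rating(sample)))

# Averages reported in the text for können.
for form, score in {"preterite": 3.34, "perfect": 2.89}.items():
    print(form, is_actualistic(score))
```

On this reading, the preterite average of können falls on the actualistic side of the scale, while its perfect average stays just below it.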

THE ACTUALISTIC PROFILE OF SAGS.
Having established a baseline for the core modals, I now address the question of how SAGs perform with respect to AEs. While the general experimental set-up was shared with the modals above (and will hence not be repeated), some specific notes regarding SAGs are in order. First, only the judgments of Austrian speakers are considered here, as the construction is not known to the majority of the general public in Federal German (cf. Gergel & Kopf-Giammanco forthcoming for qualifications; the statement can largely be assumed for the Saarland region, where the Federal German questionnaire was run). None of the speakers consulted in the Saarland version of the experiment indicated either an Austrian or a Bavarian background; most of the native speakers who were considered in Germany indicated a dialectal background either directly from the Saar territory or from the neighboring regions. Thus, while Federal German speakers were also confronted with SAG items as fillers, they did not always provide judgments and quite often noted, in a free comment line provided in the experiment, that they did not know the construction or questioned its grammaticality in German (without specifically having been asked about this; recall that this is an actuality experiment, not a grammaticality experiment). I will hence only consider the AE profile as extracted from those speakers who could be expected to have reliable native judgments of SAGs, namely the ones tested on the Austrian version of the experiment (I will return to a possible role of non-native judgments in controlled contexts in Section 4).
Second, a putative preterite/perfect divide was not tested for SAGs for two main reasons. As was already shown above, even with the modals, with which the preterite is still fully productive as a form in all varieties of German, the divide in form does not translate into consistently higher ratings for the perfect (unlike in, e.g., French). Furthermore, SAGs are generally rather colloquial constructions and Austrian German, as a Southern variety of German (mostly Bavarian, and partly Alemannic in its dialectal origin), has a clear preference for the perfect (to the detriment of the preterite) in most colloquial situations. Hence, SAGs were tested only in the perfect to ascertain their native actualistic profile. The results of the elicitation for SAGs are detailed over the distinct types of eventualities in Table 3.

Accomplishment   Achievement   Activity   State
3.83             3.88          4.0        3.58
Table 3. SAG ratings of actuality across eventuality types

The overall average rating is 3.82. Once again, no particular focus has been set on a detailed inquiry into the different eventualities; for instance, only one predicate has been tested for each. But the overall rating is rather strongly actualistic and more so than for the modals discussed above. Recall that the maximally possible rating was 4, which indicates the impression of full certainty that the event took place, and that the only core modal that scored above 3 was können. But even können did so (ironically) only in the preterite, namely at 3.34, while averaging 2.89 in the perfect. This does not mean, overall, that German modals cannot be actualistic with the help of enriched contexts, e.g., in narratives or actual conversations. But conversely, the result does mean that SAGs are virtually maximally actualistic.
To strengthen the point that SAGs are as actualistic as it gets, let's see how semantic entailments turn out to be graded on the same task (also performed in German and in the same set-up as explained above; it is only for quick readability that I will render the key building blocks of the contexts in English). Four sentences that create logical entailments were tested as fillers. To give away their overall rating (here, too, they were weighted equally): they averaged 3.74. Two sentences had conjunction (where A is entailed by A and B), one had an inference from a subset to a superset situation (swimming > doing sports), and finally one had an inference involving a universal in the subject of the type (all Austrians know x > all Styrians know x). The oscillations observed in the class of classical entailments did not directly reflect a particular lexical choice of a trigger. Thus, both the lowest and the highest average rating were obtained with a conjunctive context. An inference along the lines of 'Conny eats fruits and vegetables' to 'Conny eats fruits' was rated at an average of 3.58, while the inference from 'Jonas likes to dance tango and drink gin' to 'Jonas likes to dance tango' obtained an average of 3.91. This is not the place to speculate on such differences, neither syntactically nor in terms of co-occurrence patterns etc. (e.g., whether the NP-conjoining strategy elicits different reactions per se, or whether fruits and vegetables is more predictable than tango dancing and gin drinking; and even if it were, why should such items produce different results?). The bottom line remains that the overall average of 3.74 for semantic entailments is very close to the maximum of 4, and recall that the average for SAGs was no lower, at 3.82. This leads me to conclude that unlike the German modals, which show a more varied distribution empirically, SAGs are fully actualistic.
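The comparison drawn in this section rests on simple, equally weighted averages, which can be retraced as follows. The sketch uses only figures reported above (the Table 3 ratings, the können averages, and the overall entailment average); the variable names are introduced here for illustration, and the individual entailment items are left out because only two of the four ratings are reported.

```python
# Table 3: SAG ratings of actuality by eventuality type (scale 0-4).
sag = {"accomplishment": 3.83, "achievement": 3.88, "activity": 4.0, "state": 3.58}

# Equal weighting of the four eventuality types, as in the text.
sag_mean = sum(sag.values()) / len(sag)

# Averages reported in the text, used as points of comparison.
reference = {
    "classical entailments": 3.74,
    "können (preterite)": 3.34,
    "können (perfect)": 2.89,
}

print(f"SAG mean: {sag_mean:.2f}")  # SAG mean: 3.82
for name, value in reference.items():
    print(f"SAG minus {name}: {sag_mean - value:+.2f}")
```

All three differences come out positive, which is the descriptive basis for treating SAGs as at least as actualistic as the entailment fillers.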

Presuppositions over time.
SAGs can be given an entry as in (4) for a start, slightly adapting and simplifying the entry discussed in Gergel & Kopf-Giammanco (forthcoming):
(4) Let S be a sentence containing a SAG based on a gradable expression GRD (e.g. on a time scale), a contextually available entity x as an argument of GRD, and Q a proposition which is subordinated to the SAG predicate. Then, evaluated with respect to a world w:
    a. SAG (and thereby S) presupposes a degree d_nec that is necessary for Q;
    b. SAG (and thereby S) presupposes that Q is desirable;
    c. S asserts (via the interpretation function) that x has at least degree d_nec of GRD in w;
    d. In case GRD induces an action-characterizing eventuality, SAG (and thereby S) presupposes the contextual causal sufficiency of a manifestation of d_nec-GRD for Q.
The proposition Q is either introduced directly via a (typically finite) clause or indirectly by a key, e.g., the cup of coffee in (1), from which a proposition concerning the drinking of a cup of coffee is inferred. The only generalization I make here is that the key is never a causer. Three out of the four features of (4) are presuppositional. This properly includes a slight correction of Nadathur's (2019) recent approach to sufficiency; some such component is common to many approaches to ECs, as a necessary and a sufficient condition on instantiation that has to be implemented in a selective way at the semantics-pragmatics interface (cf. Hacquard 2005 for a different implementation). Diachronically, this translates into assuming that once sufficiency is conveyed by the motion verb construction, its presuppositional entry also enters the stage.
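The division of labor between presupposed and asserted content in (4) can be laid out schematically. The following is only a sketch under stated assumptions: the predicate names nec, des, suff and act are placeholders introduced here for illustration, and the layout is not the formalization of the cited work.

```latex
% Sketch only: nec, des, suff and act are placeholder predicates (assumptions).
% Requires amsmath.
\begin{align*}
\text{Presupposed:}\quad
  & \mathrm{nec}_w(d_{nec}, Q)
    && \text{(4a): } d_{nec} \text{ is necessary for } Q \\
  & \mathrm{des}_w(Q)
    && \text{(4b): } Q \text{ is desirable} \\
  & \mathrm{act}(\mathrm{GRD}) \rightarrow \mathrm{suff}_w(d_{nec}, \mathrm{GRD}, Q)
    && \text{(4d): contextual causal sufficiency} \\
\text{Asserted:}\quad
  & \mathrm{GRD}_w(x) \geq d_{nec}
    && \text{(4c)}
\end{align*}
```

Read this way, only the comparison in (4c) is at issue, while the remaining three conditions must already hold in the context for S to be used felicitously.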
Presuppositions can come about in several other guises. For SAGs, an important component emerging from (4) that is unrelated to the sufficiency part is the desirability of the prejacent, where the desire holder is usually contextually bound. Gergel & Kopf-Giammanco (forthcoming) observe a number of predicates in relevant corpus examples consisting of the motion verb and particles which appear to denote potentially desirable events (just as there are undesirable ones). One possibility that might have been instantiated, then, is that an earlier implicature deriving from such contexts became conventionalized and encoded as a presupposition. A different type of path for the development of a presupposition is outlined in the case studies reported in Beck & Gergel (2015), where the preposition meaning 'against' contributed its core lexical meaning (and thus not an implicature in this case) to yield an originally counterdirectional presupposition. I take the further investigation of diverse trajectories of presuppositions to be a necessity in the field. I assume here that the origin of presuppositional elements can indeed be quite varied.
But the natural follow-up question is: what kind of more general tendencies for presuppositions can be expected? Eckardt (2006, 2009) proposes the principle of Avoid Pragmatic Overload (APO). Eckardt does not offer quantitative corpus data on APO, but suggests (simplifying somewhat) that presuppositions, just like implicatures, tend to be eliminated when they are too numerous. Given that implicatures have already been widely discussed in language change and my current focus is on presuppositions, I will ignore the former but suggest that for the latter, it may be worth exploring an additional possibility for many case studies, according to which their overt marking tends to be particularly advantageous over time (rather than necessarily a liability to be avoided). For synchronic stages, Heim (1991) already suggested a tendency of maximizing presuppositions in conversations, which is currently widely explored in much experimental literature (cf. e.g., Bade 2016 for views on different synchronic triggers). For our immediate purposes regarding language evolution over time: if some pressure of such a principle exists in human communication (whatever its ultimate cause may be), then it can reasonably be expected to also yield certain cumulative effects over time. Notice that this proposal, while substantially distinct, is not the opposite of APO, as the claim is not that presuppositions should constantly increase (as opposed to being eliminated/avoided) until, say, some processing boundary stops the increase. In fact, if the standard assumption is made that the pragmatics of communication is, on a deeper level, relatively stable across languages and (recorded) language stages, then it would be quite counterintuitive to expect an increase in presuppositions per se. The only increase that can be expected (all things being equal) is with respect to the (overt) marking of presupposition triggers over time.
Specifically, I suggest (5):
(5) MaxPMoT (Maximize Presupposition Marking over Time): Increase the signaling of presuppositions over time by using presupposition triggers when possible and appropriate.
The first piece of motivation for testing a hypothesis such as (5) for different classes of presupposition triggers lies in the possibility that Heim's Maximize Presupposition yields some cumulative effects over time in the following sense. If there are marking gaps in certain paradigms in which the conversational contexts include shared propositions for the common ground and certain items are occasionally recruited to fill in those gaps, then over time it will be advantageous for them to be recruited systematically. For example, if definiteness was not yet marked in many cases (in which the relevant presuppositions were contextually satisfied) at early stages of, say, Romance or Germanic varieties, then the increase in use over time may well be due to something like MaxPMoT. Key presuppositional markers ('triggers') can grammaticalize in many different ways, both in terms of their pragmatic and syntactic paths (e.g., whether the definite is conventionalized as pre- or post-nominal, both positional possibilities having been chosen in Germanic as well as in Romance, as is presumably well known). But once the relevant overt functional markers have developed, it appears to be difficult to undo them in the process of language change. That is, all things being equal, the marking of presuppositions will be resilient in the sense that it will neither disappear nor become de-presuppositionalized. I am not yet aware of a situation/language in which a definite article has changed its meaning from presupposing that there is, e.g., a unique king of France towards asserting that there is a unique king of France. Resilience, at the same time, cannot mean hard-wired immortality for triggers. However, the presuppositional part in the entries of triggers should not be easily eliminated in a volatile fashion when, e.g., conversational situations become too convoluted.
But triggers can surely disappear just like other lexical items when, for instance, an even better suited competitor emerges and becomes recruited. A case in point for this substitution is the fate of the adverb eft, a very robust and widely attested version of 'again', in Old/Middle English, ultimately replaced by the newcomer again (cf. Gergel 2017b for an outline and details of this competition). When it comes to the marking of the common ground, a considerable amount of work is thus imaginable diachronically. With the advent of updated notions of common ground, the task could become even more challenging, but also interesting (cf. e.g. Bar-Asher Siegal & Boneh 2016 for a recent assessment of the synchronic picture, and Puhl & Gergel 2020 for a different case study).

Co-development.
The section suggests that co-development in meaning change may be an opportunity, rather than an impediment, towards explaining puzzles in the emergence of SAGs and other constructions that have, with different degrees of plausibility, been co-influenced by contact. The current point is thus intrinsically tied not only to language comparison but to language contact and language-external factors. In this area, interesting and well-known attempts at generalizations have been made (cf. Thomason & Kaufman 1988, Winford 2003, van Gelderen 2016), but usually the formal semantic and the contact linguistic side have stayed disjoint. Eckardt's (2006) seminal contribution, for example, argues against contact and emphasizes the truth-conditional re-organization of the original motion verb go towards the development of the English futurate of imminence. While the latter point is in my understanding strong and almost undeniable, the former appears as somewhat less necessary. Why couldn't a construction with a boost from direct or indirect borrowing (with different degrees of importance in different borrowing situations) still develop in a more-or-less predictable way according to (thus accelerated) principles of compositionality? Emulating form and then applying the regularities of one's own grammar represents, after all, the usual course of events in multiple domains of first and second language acquisition (Yang & Montrul 2017 and the literature cited therein). I therefore assume that such considerations merit a place in the semantic domain as well. An argument presented (cf. e.g., Eckardt 2006: Chapter 4 for richer discussion) against contact with French is that other German(ic) varieties such as those spoken in Alsace and Luxembourg have apparently not imported the construction even though the speakers are competent in French as well. This certainly makes Eckardt's point that contact with French cannot be the (sole) driving force in all Germanic-Romance situations.
At the same time, the argument is very close to the thorny actuation issue, which cannot be expected to be solved by the contact camp (just as it is not solved by other camps working on language change). Simply because a form-meaning pairing is not borrowed in region A does not mean that it could not have been borrowed in region B, as both the contact and the native (usage and grammar) data points are conspicuously distinct. As far as contact between German and French is concerned, sheer individual calques involving the motion verb can be observed in border regions. For example, in the federal state of Saarland, a region closely related in terms of its dialectal background to the languages and varieties of Luxembourg, the periphrastic heiraten gehen, 'go get married' (Maike Puhl, p.c.) for aller se marier is a common calque, despite the fact that the general population is less bilingual. A more prevalent argument against contact is that the data at the critical periods are scarce. This seems critical, but it could be due to a more colloquial use of the construction in different set-ups. At the same time, both the influence of French in terms of verbal periphrases (cf. Trips & Stein 2012 for a recent review and assessment) and the strong general impact of French lexis long after the ruling classes in England stopped speaking French (cf. Ingham 2018 for a convincing case for bilingual competence in socially key domains) would support the role of contact.
For SAGs, a partially similar case can be made, perhaps even stronger on the side of contact. Nineteenth-century Vienna had a vibrant bilingual community with over 25,000 registered inhabitants whose first language was Czech/Slovak/Bohemian (Newerkla 2013: 2), and Czech/Bohemian is claimed to have (had) a partially similar construction (cf. Glettler 1985, Hoffmannová 2007). Although the discussion in the sources is usually limited regarding the exact properties of the Slavic situation, two facts are notable and strengthen the possibility of contact. First, other historical varieties of German appear to lack SAGs. Synchronic borrowing from Austrian is often observable nowadays at the level of individual speakers who have had exposure to Austrian; with hundreds of thousands of Federal German speakers in Austria in modern times this is not a surprise. But overall the construction is still considered an Austriacism and many speakers of Federal German do not know it, as mentioned. Second, all the morphosyntactic building blocks that seem to be required are available just as well in virtually all varieties of German (e.g., there are, both at present and at earlier stages in non-Austrian varieties of German, a number of verb-particle combinations that could come somewhat close in meaning too, cf. Gergel & Kopf-Giammanco forthcoming).
While the field of contact seems to be a potentially rich one to uncover certain leaps in otherwise fully compositional processes, I would like to end this contribution by advertising a way in which development could potentially be factored out from co-development in practical terms. This proposal also relates to other difficulties of data collection in diachronic situations (beyond contact), where the judgments of native speakers can typically not be consulted any longer. If we extrapolate from other cases of difficult data collection such as first-language acquisition, following in spirit approaches of the type advocated in Gleitman et al. (2005), then a good possibility arises by doing the following. When speakers of a particular variety are not available, ask speakers of a closely related variety/language and test their intuitions.
(6) The human diachronic simulation paradigm (HUDSPA): Humans confronted with new meaning-form pairings modeled after an attested semantic change will react similarly when they are placed in conditions that resemble those of the actual change.