Using the Cognitive Approach to Coherence Relations for Discourse Annotation

The Cognitive approach to Coherence Relations (CCR; Sanders, Spooren, & Noordman, 1992) was originally proposed as a set of cognitively plausible primitives to order coherence relations, but is also increasingly used as a discourse annotation scheme. This paper provides an overview of new CCR distinctions that have been proposed over the years, summarizes the most important discussions about the operationalization of the primitives, and introduces a new distinction (DISJUNCTION) to the taxonomy to improve the descriptive adequacy of CCR. In addition, it reflects on the use of CCR as an annotation scheme in practice. The overall aim of the paper is to provide an overview of state-of-the-art CCR for discourse annotation that can form, together with the original 1992 proposal, a comprehensive starting point for anyone interested in annotating discourse using CCR.


Introduction
Annotating coherence relations is the process of assigning to each relation inferred between two segments in a text the label that best captures it. To annotate coherence relations, researchers make use of discourse annotation schemes. Discourse annotation schemes differ greatly in the number of relations they distinguish, ranging from two (Grosz & Sidner, 1986) to 81 relations (Carlson & Marcu, 2001). On the one hand, this is due to disagreement about how many distinct coherence relations language users actually infer and how specific these relations are. On the other hand, these differences seem to be caused by the varying purposes of the annotation schemes and the research traditions they originate from.
One approach to describing coherence relations that has been around for a while is the Cognitive approach to Coherence Relations (CCR; Sanders, Spooren, & Noordman, 1992, 1993). Not originally designed as a discourse annotation approach, CCR defines four basic cognitive primitives that can be used to order the set of coherence relations language users infer between segments in a text. Since its introduction, CCR has primarily been used as a basis for experimental and acquisition research on discourse coherence; this research includes both studies aimed at verifying the cognitive relevance of CCR's primitives and studies in which CCR's primitives are used as a point of departure for researching discourse coherence (see Sanders & Evers-Vermeul, 2019 for an overview).
CCR is also increasingly used as a basis for discourse annotation, as is evidenced by the list of projects that have used CCR to annotate coherence relations included as Appendix A. Using CCR as a discourse annotation scheme can be appealing for several reasons. Since it consists of cognitively relevant primitives, CCR is applicable cross-linguistically. Indeed, it has successfully been used in discourse annotation projects covering several different languages: Dutch (e.g., Evers-Vermeul, 2005; Spooren & Sanders, 2008; Stukker, 2005; Vis, 2011), English (Hoek, Zufferey, Evers-Vermeul, & Sanders, 2017; Rehbein, Scholman, & Demberg, 2016), German (Pit, 2003), French (Degand & Pander Maat 2001; Pander Maat & Degand 2001; Pit, 2003), Spanish (Santana, Spooren, Nieuwenhuijsen, & Sanders, 2018), and Mandarin Chinese (Li, Evers-Vermeul, & Sanders, 2013; Li, Sanders, & Evers-Vermeul, 2016; Xiao, Li, Sanders, & Spooren, to appear). In addition, CCR's primitives present a systematic approach to the categorization of coherence relations and have been shown to correspond to the distribution of connectives in various languages (e.g., Knott & Sanders, 1998; Li, 2014; Pit, 2003; Sanders & Spooren, 2015; Wei, 2018). CCR's individual primitives also make it feasible to employ naive annotators in annotation projects; Scholman, … show that undergraduate students can use a stepwise version of CCR to produce decent quality annotations without extensive training. Not being entirely dependent on expert annotators helps cut down on the time and expense of traditional annotation projects and opens up the possibility of crowd-sourcing annotations. Furthermore, CCR's value combinations are often much more informative than end labels and can provide a better insight into annotator disagreements (Demberg, Scholman, & Asr, 2019; see also Section 4.1). Finally, the CCR taxonomy is easily applied to only a subset of relations. When, for instance, only considering coherence relations involving some form of contrast, or relations signaled by because, it is clear which primitives and distinctions should be included in the annotation 'tag set'; for approaches that use end labels this is not necessarily as obvious.
There also appear to be some downsides to using CCR as a discourse annotation approach. Since CCR was designed to "identify the primitives in terms of which the set of coherence relations can be ordered," it does not constitute a "complete descriptively adequate taxonomy of coherence relations" (Sanders et al., 1992:4). Since the original 1992 proposal, several additional distinctions have been proposed that aim to improve the descriptive adequacy of the taxonomy. However, these proposals are distributed over several individual papers. In addition, there appears to be some imbalance in how well the approach is developed for different types of relations: there has been a lot of debate on how to operationalize CCR's primitives in the causal domain, but less so for other types of relations. Similarly, several new distinctions have been proposed and frequently used within the domain of causal relations (e.g., VOLITIONALITY, PURPOSE), while fewer additional distinctions have been suggested for other types of relations.
After outlining the basic considerations of CCR and the original CCR taxonomy in Section 2, this paper provides an overview of new CCR distinctions that have been proposed over the years, summarizes the most important discussions about the operationalization of the primitives, and introduces a new distinction (DISJUNCTION) to the taxonomy to further improve the descriptive adequacy of CCR (Section 3). Finally, Section 4 discusses several important issues that the use of the CCR as an annotation scheme in practice presents; taking note of these points can help in successfully using CCR in corpus annotation. Overall, this paper thus provides an overview of state-of-the-art CCR for discourse annotation, and forms, together with the original 1992 proposal, a comprehensive starting point for anyone interested in annotating discourse using CCR.

The Cognitive approach to Coherence Relations
The original CCR taxonomy was proposed in Sanders, Spooren, and Noordman (1992, 1993), and is very much in line with work by Hobbs (1978, 1979, 1990) and Kehler (1995, 2002), who also consider coherence relations to be cognitive entities, and approach coherence relations by formulating a limited set of organizing principles. Sanders et al. (1992:2) define the concept of coherence relation as "an aspect of meaning of two or more discourse segments that cannot be described in terms of the meaning of the segments in isolation." Coherence relations are the reason that "the meaning of two discourse segments is more than the sum of the parts" (Sanders et al., 1992:2). This basic property of coherence relations is referred to as the relational surplus; the criterion that CCR's primitives have to be features of the relational surplus is the relational criterion.
In CCR, discourse relations are considered to hold between segments that are minimally clauses (e.g., Evers-Vermeul, 2005; Sanders & van Wijk, 1996); we will refer to this as the clausal criterion. The clausal criterion is closely related to the basic definition of coherence relations in CCR, since clauses are the smallest grammatical units that can function meaningfully in isolation (see Hoek, Evers-Vermeul, & Sanders [2017] for a more elaborate discussion of the clausal criterion).
CCR considers coherence relations to be cognitive constructs. Its taxonomy is therefore intended to be cognitively plausible. For a distinction to meet the cognitive plausibility criterion, it should be observable in or make relevant predictions about language acquisition and language processing (Sanders et al., 1992). In addition, evidence for cognitive plausibility can be drawn from the system of linguistic markers. Knott and Dale (1994) argue that the distinctions made by connectives and cue phrases are indicative of the distinctions made in the minds of language users (see also Knott & Sanders, 1998).
Because CCR defines coherence relations as cognitive constructs, the labels attributed to coherence relations in annotation should correspond to the relation that holds in the mental representation of the discourse, i.e., the inferred relation. If a relation is marked by a connective or cue phrase in the text, it may well be the case that the annotated relation does not correspond to what is explicitly signaled by the linguistic marker. (1), for example, is marked by the connective and, but the relation that is inferred is a causal relation: the not marrying is interpreted as a consequence, albeit jokingly, of the chips-eating.
(1)
In focusing primarily on the relations that hold in the mental representation of a discourse, CCR's approach to the depiction of coherence relations is distinctly different from 'bottom-up' annotation approaches that seem to place more focus on the linguistic markers of coherence relations, such as for example the Penn Discourse Treebank (PDTB; Prasad et al., 2008).
The original CCR primitives meet all three of CCR's criteria. They are properties of the relational surplus, thereby satisfying the relational criterion. They can also be used to describe relations that hold between clauses or larger discourse segments, thereby satisfying the clausal criterion. Finally, all primitives are cognitively plausible. The difference between positive and negative relations, which are distinguished from each other by the polarity primitive (see Section 2.1), can for instance be observed in processing (positive relations are processed faster than negative relations: Clark, 1974; Murray, 1997; Wason & Johnson-Laird, 1971), language acquisition (positive relations are acquired earlier than negative relations: Bates, 1976; Bloom et al., 1980; Eisenberg, 1980; Evers-Vermeul & Sanders, 2009), and the linguistic system (positive and negative relations are prototypically signaled by different connectives).
The remainder of this section will give an overview of the four original CCR primitives: polarity, basic operation, source of coherence, and order of the segments.

POLARITY
Discourse relations hold between two propositions, expressed by S1, which refers to the first segment in the linear order of segments, and S2, which refers to the second segment. A relation with a positive value for POLARITY features P (antecedent) and Q (consequent), as in (2). A relation has a negative value for POLARITY if it features a negative counterpart of P, not-P, or Q, not-Q, as in (3).
(2) [We liked Bob]S1 because [he was both different and apologetic.]S2
(3) [They … never failed to invite us to their houses]S1 although [they knew we would never come.]S2
In (2), S1 presents a consequence (Q) of the cause (P) in S2. In (3), however, S1 is a contrastive consequence (not-Q) of the cause (P) in S2; a logical consequence of knowing that someone never accepts your invitations would be to stop inviting them. Positive relations are often expressed with connectives such as and or because. Negative relations are often signaled by connectives such as but or although. Although positive relations can often be turned into negative relations by negating one of the arguments, it should be noted that relations with a negative value for POLARITY do not necessarily contain lexical negation, as is illustrated by (4). Similarly, relations containing lexical negation can have a positive value for POLARITY, as can be seen in (5).
(4) Although [it's inspired by the vinyl bars of Japan,]S1 [this spot chooses accessibility over authenticity.]S2
(5) [I don't make them a lot]S1 because [I don't think it's fair to the other cookies.]S2

BASIC OPERATION
The category of BASIC OPERATION takes two values: causal and additive. A relation is causal if there is an implication relation between the two arguments (P → Q), as in (6). Conditional relations, as in (7), also involve an implication relation and are categorized as having a causal BASIC OPERATION under the original CCR proposal.
(6) [Phone service in the greater Chicago area was tied up for two hours Christmas Eve]S1 because [some kid called a phone-in show to get a wife for his father.]S2
(7) If [there was a fan club]S1 [I'd be the president.]S2
A relation is additive if there is no causal relation between the segments and the only relation that can be inferred between the segments is P & Q, as in (8).
(8) [I'm worried]S1 and [I'm confused.]S2

SOURCE OF COHERENCE
The main distinction made in the SOURCE OF COHERENCE of a discourse relation is between objective and subjective. A discourse relation is objective when its two segments are related by their locutionary meaning; the relation is observable in the real world, as in (9). In a subjective relation, the segments are related because of the illocutionary meaning of one or both segments; such relations involve the speaker's reasoning, as in (10). Subjective discourse relations often provide a reason or motivation for a claim or conclusion.
(9) [A Harry Potter festival that was supposed to take place near Glasgow this summer has been cancelled,]S1 because [too many people wanted to go.]S2
(10) [Knitted gifts are great]S1 because [they are timeless and will last forever if taken proper care of.]S2
A specific type of subjective relations are speech act relations. In a speech act relation one of the segments relates to the performance of the speech act in the other segment, for instance by offering a motivation or justification, as in (11), or by indicating the relevance of an utterance, as in (12). Speech act relations can also hold between two speech acts, as in (13).
(11) [How long are they going to take to cook?]S1 Because [you've got twelve minutes to go.]S2
(12) [There is a wonderful theatre program,]S1 if [she's interested in that.]S2
(13) [Why would it take an unusual woman to keep him company?]S1 And [why was he wearing a Russian astronaut on his lapel?]S2
The SOURCE OF COHERENCE values are highly comparable to Sweetser's (1990)

ORDER OF THE SEGMENTS
Discourse relations generally consist of two segments. The linearly first segment is always referred to as S1; the linearly second segment is always S2. The ORDER OF THE SEGMENTS feature refers to how P and Q of the BASIC OPERATION map onto S1 and S2. It takes two values: basic if S1 expresses P and S2 expresses Q, as in (14), and non-basic if S1 expresses Q and S2 expresses P, as in (15).
(14) Because [they live in sub-tropical climates,]S1 [African penguins have to cope with both cooling down on land and keeping warm in the water.]S2
(15) [I had to talk loud]S1 because [the movie was loud!]S2
The ORDER OF THE SEGMENTS is only relevant to causal relations, since additive relations are symmetrical in this respect.
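For annotation tooling, the four primitives can be treated as independent features whose value combination describes a relation. The following is a minimal sketch in Python and is not part of the original CCR proposal; the class and field names (`CCRAnnotation`, `source`, `order`, `antecedent_segment`) are hypothetical. It records only the values that the text states explicitly for examples (8), (14), and (15), leaves everything else unannotated (`None`), and enforces the point that ORDER OF THE SEGMENTS is only relevant to causal relations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


class BasicOperation(Enum):
    CAUSAL = "causal"
    ADDITIVE = "additive"


class SourceOfCoherence(Enum):
    OBJECTIVE = "objective"
    SUBJECTIVE = "subjective"


class Order(Enum):
    BASIC = "basic"          # S1 expresses P, S2 expresses Q
    NON_BASIC = "non-basic"  # S1 expresses Q, S2 expresses P


@dataclass(frozen=True)
class CCRAnnotation:
    """One annotated relation (hypothetical container); None = not annotated."""
    polarity: Optional[Polarity] = None
    basic_operation: Optional[BasicOperation] = None
    source: Optional[SourceOfCoherence] = None
    order: Optional[Order] = None

    def __post_init__(self):
        # ORDER OF THE SEGMENTS only applies to causal relations, so this
        # sketch rejects an order on anything not annotated as causal.
        if self.order is not None and self.basic_operation is not BasicOperation.CAUSAL:
            raise ValueError("order is only annotated for causal relations")

    def antecedent_segment(self) -> Optional[str]:
        """Return the segment that expresses P, or None if order is unknown."""
        if self.order is None:
            return None
        return "S1" if self.order is Order.BASIC else "S2"


# Only the values stated in the text for these examples are filled in.
example_8 = CCRAnnotation(basic_operation=BasicOperation.ADDITIVE)
example_14 = CCRAnnotation(basic_operation=BasicOperation.CAUSAL, order=Order.BASIC)
example_15 = CCRAnnotation(basic_operation=BasicOperation.CAUSAL, order=Order.NON_BASIC)
```

Keeping each primitive as a separate optional field, rather than storing a single end label, mirrors the idea that a relation is described by a combination of values and allows partial annotation of a subset of primitives.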

Extensions of the original CCR taxonomy
POLARITY, BASIC OPERATION, SOURCE OF COHERENCE, and ORDER OF THE SEGMENTS are the original four primitives of the CCR taxonomy. Since the 1992 proposal, there has been a lot of discussion on how to operationalize the primitives, as well as proposals for new primitives or additional distinctions to CCR for discourse annotation. In this section, we provide an overview of the most important developments since the original CCR proposal.

Additional distinctions within original primitives
There have been proposals for additional distinctions within certain parts of the original CCR taxonomy. Unlike the original primitives, these additional distinctions apply only to a (small) subset of coherence relations. The proposed distinctions allow annotators to make more fine-grained contrasts, thus improving the descriptive adequacy of the CCR taxonomy. Additional distinctions have been proposed within the class of positive relations (TEMPORALITY), the class of positive objective causal relations (VOLITIONALITY and PURPOSE) and within the class of negative relations (DIRECTNESS).
It should be noted that many of the new distinctions have been proposed for only a subset of relations. This does not necessarily mean that the same distinction could not also be annotated for other types of relations. VOLITIONALITY (Section 3.1.2), for example, divides the subset of positive causal relations into volitional and non-volitional causal relations, a distinction that has been argued to be cognitively relevant on the basis of evidence from language processing, language acquisition, and linguistic systems. While VOLITIONALITY could also be annotated for, for instance, conditional relations, there is no clear evidence that the distinction between volitional and non-volitional is as cognitively relevant within the class of conditional relations as it is within the class of causal relations. If such evidence were to be found, the VOLITIONALITY distinction could easily be extended to apply to all implication relations with a positive value for POLARITY; the same holds for other distinctions as well. Limiting additional distinctions to only those subsets for which there is empirical evidence for the distinction's cognitive relevance helps to create a balance between descriptive adequacy on the one hand, and cognitive plausibility on the other.

TEMPORALITY
It has recently been proposed to extend the CCR taxonomy with a new distinction: TEMPORALITY. The original CCR proposal considers temporal relations to be a subtype of positive additive relations; Sanders, Spooren, and Noordman (1992:28) state that "the properties distinguishing temporal relations from other additive relations concern the referential meaning of the individual segments." TEMPORALITY is thus taken to be a propositional, rather than a relational feature of coherence relations, and, as such, does not meet all the criteria necessary to be adopted into the CCR taxonomy. Evers-Vermeul, Hoek, and Scholman (2017), however, argue that TEMPORALITY does meet the relational criterion. They show that the temporal information in the propositional content of the segments is not always sufficient to establish a temporal coherence relation. In addition, they argue that the ordering of discourse segments in time can only be determined for a combination of the discourse segments, not for segments in isolation. As such, TEMPORALITY is a feature of the relational surplus and meets the relational criterion. TEMPORALITY is then argued to also meet all other CCR criteria. Temporal relations can hold between clauses, and the relevance of TEMPORALITY is observable in language processing, language acquisition, and in the connective inventory of several different languages.
After discussing other options, Evers-Vermeul, Hoek, and Scholman (2017) argue that the best way of incorporating temporal relations in CCR is adding another primitive to the taxonomy. The proposed primitive distinguishes between relations that are ordered in time and relations that are not ordered in time. Positive additive relations that are ordered in time are relations that are most prototypically referred to as 'temporal relations.' As is shown in Figure 1, two additional steps make more fine-grained distinctions within the set of relations that are ordered in time: between sequential and synchronous relations and between sequential relations that are chronologically ordered and sequential relations that have an anti-chronological order. While not explicitly included in the original CCR taxonomy, the use of a primitive that includes additional distinctions relevant to only a subset of relations is in line with later proposals for additional distinctions, such as VOLITIONALITY (see Section 3.1.2).
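The three steps described above can be read as a small decision procedure. The sketch below is an illustrative Python rendering and not an implementation taken from Evers-Vermeul, Hoek, and Scholman (2017); the function name `temporality_value` and its boolean parameters are hypothetical. A later step is consulted only when the earlier step makes it relevant, and a step that has not been annotated is passed as `None`.

```python
from typing import Optional


def temporality_value(ordered_in_time: bool,
                      sequential: Optional[bool] = None,
                      chronological: Optional[bool] = None) -> str:
    """Walk the three TEMPORALITY steps and return the most specific value."""
    # Step 1: is the relation ordered in time at all?
    if not ordered_in_time:
        return "non-temporal"
    # Step 2: sequential versus synchronous.
    if sequential is None:
        return "temporal"      # step 2 was not annotated
    if not sequential:
        return "synchronous"
    # Step 3: only sequential relations have a chronological direction.
    if chronological is None:
        return "sequential"    # step 3 was not annotated
    return "chronological" if chronological else "anti-chronological"
```

For instance, `temporality_value(True, True, False)` yields the value for a sequential relation presented in anti-chronological order.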

[Figure 1. The TEMPORALITY distinction in three steps: (1) Temporal vs. Non-temporal; (2) Sequential vs. Synchronous; (3) Chronological vs. Anti-chronological.]
Although it was not explicitly addressed in Evers-Vermeul, Hoek, and Scholman (2017) whether the TEMPORALITY distinction is applicable to all coherence relations, we consider temporal order to be especially productive for relations with a positive value for POLARITY. With TEMPORALITY as a separate primitive, two different types of order can be distinguished for causal and conditional relations: implication order, as depicted by the original ORDER OF THE SEGMENTS primitive, i.e., basic versus non-basic order, and temporal order, i.e., chronological versus anti-chronological order. These two orders will coincide for many relations, as in the conditional relation in (16). The relation has a basic ORDER OF THE SEGMENTS, i.e., S1 expresses P and S2 expresses Q. It also has a chronological temporal order, i.e., the event expressed in S1, buying a Railcard online, occurs before the event expressed in S2, replacing that Railcard. Sometimes, however, the two orders diverge, as in the conditional relation in (17), which has basic order, but anti-chronological order, since the event expressed by S1, gaining muscle, occurs after the event expressed by S2, lifting as heavy as possible. The idea that there is an underlying temporal order that is opposite to the ORDER OF THE SEGMENTS is underlined by the fact that the relation in (17) can be paraphrased as you should always lift as heavy as possible, because then you will gain muscle, while a similar construction cannot be used to paraphrase (16).

(16) If [you bought your Railcard online,]S1 [you will need to get a replacement online.]S2
(17) If [you want to gain muscle,]S1 [you should always lift as heavy as possible.]S2
The addition of the TEMPORALITY feature has two important advantages. First, it opens up the possibility to investigate differences and similarities in the use of causal and conditional relations with a temporal order on the one hand and purely temporal relations on the other. Second, it helps make finer-grained distinctions within the class of implication relations: relations in which implication order and temporal order do not coincide correspond to, for instance, relations annotated as ENABLEMENT or PROBLEM-SOLUTION in the RST-DT (Carlson & Marcu, 2001) or as IMPLICIT ASSERTION in the PDTB 2.0 (PDTB Research Group 2008). It should be noted that while the TEMPORALITY feature (specifically the third temporal ordering step) can be annotated for causal and conditional relations, this does not imply that these relations in CCR would belong to the class of TEMPORALS distinguished in RST-DT or PDTB, since the implication relation is considered to be a more salient (stronger) feature of these relations.
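The interaction between implication order and temporal order can be made concrete with a small sketch. The Python below is an illustration under stated assumptions and is not part of the CCR literature: the function name `antecedent_precedes_consequent` and the string values are hypothetical, and the treatment of non-basic order is derived here from the definitions of the two orders rather than from an example in the text. It computes whether the event in the antecedent P occurs before the event in the consequent Q.

```python
def antecedent_precedes_consequent(order: str, temporal_order: str) -> bool:
    """Combine implication order and temporal order (hypothetical helper).

    order:          "basic" (S1 = P, S2 = Q) or "non-basic" (S1 = Q, S2 = P)
    temporal_order: "chronological" (S1 event first) or
                    "anti-chronological" (S2 event first)
    """
    if order not in {"basic", "non-basic"}:
        raise ValueError(f"unknown order: {order!r}")
    if temporal_order not in {"chronological", "anti-chronological"}:
        raise ValueError(f"unknown temporal order: {temporal_order!r}")
    s1_is_antecedent = order == "basic"
    s1_event_first = temporal_order == "chronological"
    # P comes first in time exactly when the segment expressing P is also
    # the segment whose event occurs first.
    return s1_is_antecedent == s1_event_first
```

Under this sketch, the values given for (16), basic and chronological, yield `True`, whereas the values given for (17), basic and anti-chronological, yield `False`.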

VOLITIONALITY
It has been proposed that within the class of positive objective causal relations, a distinction can be made between volitional and non-volitional relations (e.g., Pander Maat & Sanders, 2000; Sanders et al., 1992; Stukker, Sanders, & Verhagen, 2008; see also Mann & Thompson, 1988). Volitional causal relations involve a thinking actor who is responsible for an event in the consequent of the relation, as in (18), where the making event in S1 is a volitional action. Non-volitional causal relations do not involve a volitional action. In the relation in (19), for example, the consequent does not involve an agent; one fact leads to the other. It should be noted that some languages have dedicated connectives for non-volitional causal relations, such as daardoor 'that is why' and doordat 'because of the fact that' in Dutch (e.g., Stukker et al., 2008).
(18) [I make them a lot]S1 because [I have this indescribable need to constantly have new pillows.]S2
(19) [The game has changed]S1 because [the way we communicate has changed.]S2
Pander Maat and Sanders (2000) propose that volitional causal relations have something in common with subjective causal relations (see Section 3.3). Both types of relations involve a Subject of Consciousness (SoC), a thinking entity involved in the relation. The main difference between volitional causal relations and subjective causal relations is that in subjective relations the SoC is involved in the construal of the relation (see Section 3.3.1), whereas in volitional causal relations, the SoC is not. Instead, the SoC in a volitional causal relation is usually an agent. In addition, the SoC in volitional relations is typically explicitly mentioned (onstage: see Section 3.3.2), though it may also be inferable, for instance in a passive construction (see also Section 3.1.3). While the speaker is responsible for the action in S1 and the fact in S2, the causal relation does not stem from the speaker's mind and is observable in the real world. Non-volitional causal relations do not involve an SoC at all.

PURPOSE
Another distinction within the class of positive objective causal relations is PURPOSE. Purpose relations feature a volitional action for which the motivation is an intended result. In (20), for instance, the adding of the smell is done to achieve the intended result of people knowing when there is a gas leak. While the relation in (20) has an explicit agent (they), the agent in (21) is implicit in the passive construction in S1. Since the agent in (21) is not absent but merely unmentioned, the relation can still be classified as a positive, objective causal relation specified for PURPOSE. Several different languages have connectives that typically express PURPOSE relations, such as so that or in order to in English, or zodat 'so that' in Dutch.
(20) The gas is odorless, but [they add the smell]S1 so [you know when there's a leak.]S2
(21) [Services are being enhanced to remain open 24 hours]S1 so that [no one will have to stay on the streets during the cold snap.]S2
For causal relations specified for PURPOSE, determining the ORDER OF THE SEGMENTS is not entirely straightforward (e.g., Sanders et al., 1992). On the one hand, the relations in (20) and (21) are very similar to result relations (i.e., positive causal relations with basic order). On the other hand, they also bear similarities to volitional causal relations with a non-basic order like the one in (18), because the intended result is the motivation for executing the volitional action in the first place (see also Reese et al., 2007:12-13). In CCR, the intended result in positive causal relations specified for PURPOSE should be considered the consequent, Q, while the intentional action should be considered the antecedent, P. The ORDER OF THE SEGMENTS in (20) and (21) is therefore basic.

DIRECTNESS
Pander Maat (1998) evaluates the original CCR taxonomy with respect to negative relations. He argues that the original primitive inventory is insufficient to capture all major distinctions between relations with a negative value for POLARITY. On the basis of a corpus annotation study and using linguistic evidence, primarily from the Dutch connective inventory, he proposes a new distinction to be applied to negative additive coherence relations: DIRECTNESS.[6] Pander Maat (1998) posits that in negative additive relations, the two segments are compared to each other. This comparison is direct if "the propositions are themselves incompatible" (Pander Maat 1998:192); the propositional content of S1 is in direct contrast to the propositional content of S2. The comparison can also be indirect, in which case the results or conclusions on the basis of propositions are incompatible. Direct, negative, objective, additive relations contain, for instance, a semantic contrast. In (22) the statements about Neilia and Jill are directly compared. In (23), on the other hand, an indirect, negative, objective, additive relation, it is not the segments themselves that are in contrast to each other, but rather the results of both segments ('conflicting causal forces'); daily gains imply an improvement, but the second segment indicates a trend in the opposite direction.
(22) [Neilia would always be Mommy,]S1 but [Jill was Mom.]S2
(23) [Stock market notches daily gain,]S1 but [posts largest weekly drop since early 2016]S2
Within negative, subjective, additive relations, DIRECTNESS mainly distinguishes between qualifications and concessions. Sanders, Spooren, and Noordman (1992) categorize concessions as negative, subjective, additive relations. In their view (see also Spooren, 1989), concessions are relations that feature two arguments in favor of opposing views, see Figure 2.[7] Concessions are similar to relations with conflicting causal forces, as in (23). In concessions, the conclusion that can be drawn on the basis of the first segment is incompatible with the conclusion that can be drawn on the basis of the second segment. Since the causality is not found between the segments, but rather between the segments and their associated inferences, the relation between S1 and S2 is not an implication relation and, as such, the relation in Figure 2 is considered an additive relation. (24) and (25) are actual examples of concessions. In (24), the inference made on the basis of the first segment, "I won't agree with you," is in contrast with the inference made on the basis of the second segment, "I will agree with you." In (25), the contrast holds between "you can write it yourself" and "we will have someone else write it."
(24) "This is a beautiful house." "Thank you. I never know what to say when somebody says that. [You don't want to agree]S1 but on the other hand, [it feels weird to disagree and say 'no it's a dump'."]S2
(25) [I'm sure you would like to write the book yourself,]S1 but [your record is not what I might call promising, book-finishing-wise.]S2
In qualifications, the second segment "cancels the strongest interpretation of the first statement" (Pander Maat 1998:186). As is illustrated in Figure 3, qualifications are similar to concessions, but the conclusion made on the basis of S2 directly contrasts with the propositional content of S1. (26) contains an actual example of a qualification relation. The proposition expressed in S1, "I don't know any blind people," is qualified by the statement that the speaker does know someone with a pretty severe eye condition, which implies that he does know someone who is practically blind.
(26) [I, personally, don't know any blind people,]S1 though [the guy I used to buy my newspaper from had pretty bad cataracts.]S2
Pander Maat (1998) further distinguishes four specific types of qualifications: simple qualification, exceptions, qualified denial, and denied intensification. The differences between the four types mainly refer to the direction of the qualification (weakening or intensifying) and to whether a stronger or weaker interpretation of the first segment should only be made to a certain extent or not at all. Each subtype of qualification, however, follows the general relation configuration illustrated in Figure 3. As such, the different qualification subtypes seem too fine-grained and segment-specific to be incorporated into the CCR taxonomy by means of additional distinctions (this is also not something Pander Maat [1998] proposes). However, taking note of these specific variations may be helpful in recognizing qualifications during annotation.
Including the DIRECTNESS distinction within negative additive relations helps make the CCR taxonomy more descriptively accurate. In addition, Pander Maat (1998:199, Table 1) demonstrates that the differences between direct and indirect negative relations can be observed in the Dutch connective system, which suggests that the distinction is also cognitively plausible. Finally, as Pander Maat (1998) points out, DIRECTNESS makes the CCR taxonomy more consistent, since the original 1992 proposal conflated SOURCE OF COHERENCE and DIRECTNESS for negative additive relations; the class of negative objective additive relations only included direct comparisons, while the class of negative subjective additive relations only included indirect comparisons.
[6] … Figure 1), since it appears to be irrelevant to some types of negative relations, fixed for others, and only a true distinction for a few combinations of primitives/distinctions. In addition, the PERSPECTIVE distinction mainly appears to be a property of the propositional content of the segments, rather than a relational feature. We will therefore not discuss the PERSPECTIVE distinction at length here.
[7] Outside of CCR, concession is also often used to refer to negative causal relations, e.g., "although she studied hard, she failed the exam."

Proposing a new distinction: DISJUNCTION
CCR was recently used as a tool to map other discourse annotation schemes onto each other (see . The relation labels from RST, PDTB, and SDRT were 'translated' into CCR's primitives, enabling a more accurate and straightforward comparison between the different frameworks than just comparing the end labels would have allowed. While CCR was able to capture the majority of distinctions, several extra features had to be formulated to ascribe a unique set of primitives and features to each relation label from a framework.8 Most extra features were similar to the distinctions discussed in Section 3.1 in that they were relevant to only a small subset of relations, and defined more specific instances of a certain relation type (e.g., LIST relations as a specific instance of positive additive relations). A notable exception was DISJUNCTION, a feature that distinguishes disjunction relations, in which the two segments are presented as alternatives, from other additive relations. Whereas RST, PDTB, and SDRT all include disjunctions as a specific relation type, the original CCR taxonomy is unable to adequately capture the distinction between disjunctions and other types of additive relations.

Disjunctions in CCR
As the main reason for not including an "alternation relation" in their taxonomy, Sanders, Spooren, and Noordman (1992:29) refer to the "unclear status of alternation." While some of the existing approaches to discourse coherence treated DISJUNCTION relations as a distinct class of relations, for instance "on a par with conjoining, temporal, and implication," as Longacre (1983) does, others considered them a subcategory of additive relations (e.g., Halliday & Hasan, 1976). In addition, as Sanders et al. (1992:29) point out, there was "also confusion about the nature of the alternation relation;" while some considered disjunctions to be primarily inclusive (e.g., Longacre, 1983), others considered disjunctions to be primarily exclusive (e.g., Gamut, 1982; Levinson, 1983).
Here, we would like to argue in favor of including an additional distinction in the CCR taxonomy that can account for disjunction relations. Not only would such a distinction improve the descriptive adequacy of the taxonomy, it also seems to meet all criteria set by the CCR approach. First of all, disjunction relations hold between clauses, see (27), thereby satisfying the basic clausal criterion. DISJUNCTION is also a feature of the relational surplus, since the meaning of the relation as a whole is more specific than just the segments in isolation; without DISJUNCTION, as in (27'), the two segments would not be considered alternatives and both segments would be considered to be true. (27) [You either know it]S1 or [you don't.]S2 (27') You know it // you don't know it The final criterion that relational features have to meet before they can be included into the CCR taxonomy is cognitive plausibility. There seems to be ample linguistic evidence from connective inventories to suggest that DISJUNCTION is a cognitively plausible distinction, since many languages have connectives that prototypically mark disjunctions, for instance or or either or in English, of in Dutch, oder in German, ou in French, and o in Spanish. As discussed in Section 2, other evidence related to the cognitive plausibility of features of coherence relations can be derived from language acquisition and language processing. Disjunction at the discourse level, however, does not seem to have received a lot of attention in these fields. A notable exception is a self-paced reading study by Staub and Clifton (2006). This experiment compares reading times of disjunctions in past tense and future tense signaled by or or either or. Staub and Clifton (2006) find that readers benefit more from the presence of either in the past tense condition than in the future tense condition. 
This suggests that when encountering a connective indicating DISJUNCTION after the first segment, readers have to update the truth-conditional status of S1. This effect is much smaller, or even absent, in the future tense condition because the truth-conditional status of those segments is already uncertain. Staub and Clifton's (2006) experiment thus shows that DISJUNCTION can affect language processing and, as such, provides additional evidence in favor of the cognitive plausibility of DISJUNCTION. While evidence pertaining to the cognitive plausibility of DISJUNCTION is fairly limited, there currently does not seem to be any evidence against DISJUNCTION being a cognitively plausible distinction at the discourse level. Since DISJUNCTION currently appears to meet CCR criteria and would improve CCR's descriptive adequacy, we suggest adopting DISJUNCTION as an additional primitive in CCR. Further investigating the cognitive plausibility of DISJUNCTION at the discourse level seems a fruitful endeavor for future research; any counter-evidence this research may uncover should be taken into account in future evaluations of the DISJUNCTION primitive.

DISJUNCTION as a new distinction in CCR
In line with the original Sanders, Spooren, and Noordman (1992) paper, we consider disjunctions to be a specific type of additive relation. Here, however, we propose to include DISJUNCTION as an additional distinction in the CCR taxonomy, applicable only to the class of additive relations. Similar to the additional distinctions discussed in Section 3.2, DISJUNCTION will carry the values alternative, in which case the segments are presented as alternatives, and not alternative, in which case the segments are not presented as alternatives. Additive relations that are alternative are the relations prototypically referred to as the class of DISJUNCTIONS; additive relations that are not alternative are all other types of additive relations. As mentioned in Section 3.2.1, DISJUNCTIONS can be exclusive, in which case the alternatives cannot hold at the same time, as in (27), or inclusive, in which case they can, as in (28).
(28) [A little sweetener can take them [= waffles] from supper table to breakfast table]S1 or [even turn them into dessert.]S2
It is possible to distinguish between the inclusive and exclusive disjunctions using the POLARITY primitive (see also . Since the two segments can hold at the same time, inclusive disjunctions have a positive value for POLARITY: P & Q. Exclusive disjunctions, on the other hand, always involve the negative counterpart of either P or Q: P & not-Q or not-P & Q. In (27), for instance, either you know it, in which case you do not not know it, or you do not know it, in which case you do not know it. The SOURCE OF COHERENCE primitive applies to disjunctions as it does to other types of coherence relations; disjunctions can be either objective or subjective. In (27), the segments present alternatives that hold in the real world. As such, (27) has an objective value for SOURCE OF COHERENCE.
(29) presents two alternative opinions or claims, making the relation subjective; note that the two segments do not necessarily present real-world alternatives, since it could both be true that 'this person' is just stupid and has lost her mind, or that neither proposition holds. (30) is also subjective, since the DISJUNCTION holds between two speech acts, specifically between two questions. (29) Either [this person has lost her presence of mind]S1 or [she is just stupid.]S2 (30) [Are you just feeling lazy]S1 or [do you need a break?]S2 Since DISJUNCTIONS are considered to be a subtype of additive relations, the ORDER OF THE SEGMENTS primitive does not apply. It should be noted that 'disjunctions' are sometimes considered to include unless-relations (e.g., PDTB Research Group, 2007;Reese et al., 2007: unless you know it, you don't know it has a meaning highly similar to the relation in [27]). In CCR, relations marked by unless are categorized as negative conditional relations; this also holds for relations not specifically marked by unless but with a similar interpretation.
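The way DISJUNCTION combines with POLARITY and SOURCE OF COHERENCE can be written down as a small data structure. The following is only a minimal sketch under our own assumptions: the class name, field names, and helper function are hypothetical and not part of CCR itself. It encodes exclusive disjunctions as negative and inclusive disjunctions as positive, as proposed above, and leaves SOURCE OF COHERENCE open where the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdditiveRelation:
    """Hypothetical record for an additive relation (illustration only)."""
    polarity: str                 # "positive" or "negative"
    alternative: bool             # value of the DISJUNCTION distinction
    source: Optional[str] = None  # "objective" or "subjective"; None if not stated


def disjunction(exclusive: bool, source: Optional[str] = None) -> AdditiveRelation:
    # Exclusive disjunctions involve the negative counterpart of P or Q,
    # so they receive negative POLARITY; inclusive ones receive positive POLARITY.
    polarity = "negative" if exclusive else "positive"
    return AdditiveRelation(polarity=polarity, alternative=True, source=source)


# (27) "You either know it or you don't": exclusive, objective.
ex27 = disjunction(exclusive=True, source="objective")
# (28) the waffle example: inclusive; its SOURCE OF COHERENCE is not encoded here.
ex28 = disjunction(exclusive=False)

print(ex27)
print(ex28)
```

Because ORDER OF THE SEGMENTS does not apply to additive relations, the sketch deliberately has no field for it.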

Operationalizing SOURCE OF COHERENCE: segment-internal distinctions
The distinction between objective and subjective relations (or a similar distinction) is, as mentioned in Section 2.3, very common in theories about discourse and discourse annotation approaches.
Although researchers seem to agree on prototypical examples, the SOURCE OF COHERENCE of a relation can be difficult to determine in the practice of actual corpus annotation (e.g., . A proposal to improve the application of this primitive in the annotation of real-world examples is to make use of paraphrase tests, in which the segments of the relation are inserted in a paraphrase that makes explicit either a subjective or objective reading, for instance the fact that P causes S's claim/advice/conclusion that Q can be used to test whether positive causal relations with a basic order are subjective . Another practice that seems to facilitate determining the SOURCE OF COHERENCE of a relation is to consider the relation in its larger context, for example the whole text Sanders & Spooren, 2013). It has been proposed that determining the SOURCE OF COHERENCE of a relation is difficult because while there are highly prototypical instances of objective and subjective relations, there are also many less prototypical examples (e.g., Degand & Sanders, 1999;Stukker & Sanders, 2012);9 non-prototypical examples are harder to classify than more prototypical examples. Several papers explore what makes a relation prototypically subjective or objective. Relevant features include the identity of the Subject of Consciousness, the explicit presence of the Subject of Consciousness, and the propositional attitude of the segments, each of which will be elaborated on in the rest of the section. Using these individual features can facilitate the process of determining a relation's SOURCE OF COHERENCE, as will be explained in Section 4.3. At the same time, the individual features are also used as additional distinctions within the SOURCE OF COHERENCE primitive to examine connective profiles in a more fine-grained way (e.g., Li, 2014;Santana et al., 2018;Xiao et al., to appear). 
While this section focuses on the additional distinctions pertaining to the SOURCE OF COHERENCE primitive that have been proposed in previous literature (which may all help in operationalizing this primitive), Section 4.3 discusses an additional issue that appears to complicate the annotation of SOURCE OF COHERENCE: the distinction between SOURCE OF COHERENCE and truth value; since this issue does not come with additional distinctions that can be annotated, we discuss it in Section 4, along with other issues that researchers may encounter when using CCR in corpus annotation.

Identity of the Subject of Consciousness
Pander Maat and Sanders (2000) propose that subjective relations involve a Subject of Consciousness (SoC) that is responsible for the construal of the relation; the relation stems from the SoC's mind (see also Pander Maat & Degand, 2001;Pit, 2003;J. Sanders, Sanders, & Sweetser, 2012;Sanders, J. Sanders, & Sweetser, 2009, among others). Subjective causal relations, for instance, involve the SoC's reasoning, as in (31). As was mentioned in Section 3.1.2, objective relations have either no SoC (non-volitional relations) or an SoC that is not responsible for the construal of the relation but is present as the agent of a volitional action (volitional relations). (31) [It must have been turkey mating season in Northern California]S1 because [we've never seen so many turkeys strutting around.]S2 In subjective coherence relations, the SoC is usually the speaker (Pander Maat & Sanders, 2000): either the speaker or author of the discourse, as in (31), or the speaker responsible for the content of a direct quote, as in (32). Alternatively, the SoC can be another actor in the discourse whose perspective is taken, as in (33). In (33), S2 is a conclusion made on the basis of information in S1. It does not explicitly say 'so Tarzan concludes that the natives must be very near,' but it is clear that the conclusion is drawn by Tarzan. Tarzan is the thinking entity responsible for the construal of the relation and therefore the SoC. In examples like (33), a third person actor temporarily becomes the speaker, although it would be even more accurate to say that in examples like these there is a 'blend' between the perspectives of the author or speaker and the discourse participant. (e.g., J. , J. Sanders & Spooren, 1997

Explicit presence of the Subject of Consciousness
Not only the identity of the SoC, but also the extent to which the SoC is explicitly present in the relation has been argued to bear on the subjectivity of a relation. Langacker (1990, 1991, 2006) proposes that utterances with an explicitly mentioned, 'onstage,' speaker are more objective than utterances where the speaker is left implicit, or 'offstage,' since an explicitly mentioned speaker becomes itself the focus of attention. This view is applied to coherence relations by, for instance, Pit (2003), Spooren (2013, 2015), and Stukker and , who show that relations with onstage SoCs, as in (34), are less prototypically subjective than relations in which the SoC remains offstage, as in (34'). However, relations with an onstage speaker SoC do tend to be considered to be subjective relations if the relation is centered around a subjective judgment, opinion, or conclusion (e.g., Pander Maat & Degand, 2001; Pander Maat & Sanders, 2000; Pit, 2003; Sanders & Evers-Vermeul, 2019; Wei, 2018).10
(34) [I think all glitter should be banned,]S1 because [it's microplastic.]S2
(34') [All glitter should be banned,]S1 because [it's microplastic.]S2
It should be noted that a subjective relation can explicitly mention someone whose identity corresponds to the identity of the SoC, and still have an implicit SoC. In (35), for instance, the SoC is the speaker, but he is not explicitly mentioned in his role as SoC (as would be the case in "which I think was a bummer"). Instead, he is merely explicitly mentioned as an actor in the event in S2 that is used to motivate the judgment in S1.
(35) I made it through the night without getting fired. [Which was a bummer]S1 because [I had spent the days previous applying for new serving jobs through Craigslist, just in case.]S2
10 As was mentioned in Footnote 8, some researchers have proposed that subjectivity is a scalar, rather than a categorial notion (Pander Maat & Degand, 2001; Degand & Pander Maat, 2003). Under this view, (34') would be considered to be more subjective than (34). In this paper, we adopt a categorial view of the SOURCE OF COHERENCE primitive (while recognizing that relations can differ in how prototypically subjective they are), since this approach is most in line with the common annotation practice of assigning labels to relations.

Propositional attitude of the segments
A final feature of coherence relations that is relevant to its SOURCE OF COHERENCE is the propositional attitude of the segments (e.g., Li, 2014;Li et al., 2016;Sanders & Spooren, 2009, 2015; are they, for instance, judgments, speech acts, or facts? Subjective relations prototypically involve judgments or speech acts, while objective relations prototypically feature facts. For implication relations, the propositional attitude of the consequent, Q, is most crucial (Li, 2014).

State-of-the-art CCR for discourse annotation
This section gave an overview of the most important developments in CCR since the original 1992 proposal when it comes to discourse annotation. Figure 4 provides a schematic overview of state-of-the-art CCR for discourse annotation. The overview is a flowchart resulting in unique value combinations at the bottom of the scheme. As is indicated by the grey shading and the prominence of POLARITY, BASIC OPERATION, and SOURCE OF COHERENCE, these are the only primitives relevant to all coherence relations. The distinctions in red squares are only relevant to the subset of relations below the primitive value to which they are attached. As such, they duplicate the set of relations below that primitive value. The numbers at the bottom of the scheme refer to the numbers in Table 1, where a simple, prototypical example is provided for each value combination in CCR. The segment-internal distinctions for SOURCE OF COHERENCE discussed in Section 3.3 are not explicitly incorporated in the scheme, but are considered to be part of the objective-subjective distinction within SOURCE OF COHERENCE. For TEMPORALITY, we only included the temporal order step for positive causal relations; by definition, these relations contain an underlying sequential temporal order.

Figure 4.

No. | Value combination | Example
1 | positive causal objective basic chronological -volitional -purpose | Because it was raining, the streets were getting wet.
2 | positive causal objective basic chronological +volitional -purpose | Because it was raining, Jill brought her umbrella.
3 | positive causal objective basic chronological +volitional +purpose | Joe put up a tarp over the party area to prevent everyone from getting wet.
4 | positive causal objective non-basic anti-chronological -volitional -purpose | The streets were getting wet because it was raining.
5 | positive causal objective non-basic anti-chronological +volitional -purpose | Jill brought her umbrella because it was raining.
6 | positive causal objective non-basic anti-chronological +volitional +purpose | To prevent everyone from getting wet, Joe put up a tarp over the party area.
7 | positive causal subjective basic chronological | The streets are wet, so it must be raining.
8 | positive causal subjective basic anti-chronological | To prevent everyone from getting wet, you should cover the party area with a tarp.
9 | positive causal subjective non-basic chronological | You should cover the party area with a tarp to prevent everyone from getting wet.
10 | positive causal subjective non-basic anti-chronological | It must be raining, since the streets are wet.
11 | positive conditional objective basic chronological | If it rains, Jill will bring an umbrella.
12 | positive conditional objective non-basic anti-chronological | Jill will bring an umbrella if it rains.
[...]
22 | [...] | Legos are great. Reading is also wonderful.
23 | positive additive subjective -temporal +alternative | Jill is usually described as being great company or even as being someone who makes any party a success.
24 | negative causal objective basic | Even though it was raining, the streets stayed dry.
25 | negative causal objective non-basic | The streets stayed dry, even though it was raining.
26 | negative causal subjective basic | Even though it is raining, you should not bring an umbrella.
27 | negative causal subjective non-basic | You should not bring an umbrella, even though it is raining.
28 | negative conditional objective basic | Unless the skies have cleared, we are bringing an umbrella.
29 | negative conditional objective non-basic | We are bringing an umbrella, unless the skies have cleared.
30 | negative conditional subjective basic | Unless it is absolutely pouring down, you should not bring an umbrella.
31 | negative conditional subjective non-basic | You should not bring an umbrella, unless it is absolutely pouring down.
32 | negative additive objective direct -alternative | Jill brought an umbrella, but her friend did not.
33 | negative additive objective indirect -alternative | The rain is making the streets wet, but the sun is drying them really quickly.
34 | negative additive objective direct +alternative | The whole party it was either drizzling or pouring down.
35 | negative additive subjective direct +alternative | Every party last year was either really great or it was a total disaster.
36 | negative additive subjective direct -alternative | Rain is the absolute worst, though the smell of a light drizzle after a sunny day is pretty wonderful.
37 | negative additive subjective indirect -alternative | Going to that party sounds like fun, but it is pouring down outside.

Table 1. Prototypical examples for each value combination in state-of-the-art CCR for discourse annotation.

CCR as an annotation scheme in practice
In the introduction, we mentioned several advantages of using CCR for discourse annotation; it consists of cognitively plausible distinctions, is applicable cross-linguistically, and can be used by non-expert annotators. We also mentioned some potential problems researchers could run into when starting to use CCR, most of which we aim to solve in the current paper. We provided an overview of all proposed additional primitives and distinctions and gave a summary of several discussions that have been carried out over separate research papers. This eliminates the need to sift through many different research papers to create an overview of state-of-the-art CCR. In addition, we took inventory of the full CCR taxonomy to see if there were any potential extra distinctions that would be eligible to be adopted into CCR and would increase the approach's descriptive adequacy. This led to our proposal for DISJUNCTION as a new distinction in CCR.
In this section, we reflect on the use of CCR as an annotation scheme in practice, because the use of primitives presents certain challenges that an end-label approach does not. We discuss several issues that are relevant to take into account when implementing the CCR taxonomy in a discourse annotation project:11 different options for calculating inter-annotator agreement (Section 4.1), assumptions about the independence of primitives and distinctions, and the possibility of also using end labels when using CCR (Section 4.2). We also discuss a further point concerning the operationalization of SOURCE OF COHERENCE. While the distinction between subjective and objective relations also exists in other discourse annotation frameworks, no framework applies this distinction as systematically to all types of relations as CCR (see . When a decision about whether a relation is objective or subjective has to be made in another framework, this will be done by comparing the relation that has to be annotated to the objective and subjective counterpart of the candidate relation (e.g., considering the relation definitions of RST-DT's CAUSE versus PRAGMATIC CAUSE). In CCR, a relation's SOURCE OF COHERENCE always has to be determined based on a general definition of the SOURCE OF COHERENCE primitive, which may be part of the reason why annotating this distinction is so difficult. In practice, a relation's SOURCE OF COHERENCE appears to be commonly confused with its truth value. We clarify this distinction in Section 4.3. Taking note of the issues discussed in Sections 4.1, 4.2, and 4.3 can help to successfully apply CCR, both the original primitives discussed in Section 2 and the later proposed primitives and distinctions discussed in Section 3, in corpus annotation.

Calculating inter-annotator agreement
When annotating coherence relations, researchers have to rely heavily on their own interpretation of the discourse, which is why discourse annotation is, at least to some extent, a subjective endeavor (e.g., . To demonstrate that annotation has been done reliably and reproducibly, researchers can report an inter-annotator agreement measure: either simple percentage agreement or a chance-corrected numerical index that indicates the amount of agreement between two independent coders (e.g., Cohen's Kappa, Krippendorff's Alpha, Gwet's AC1). For annotation efforts that make use of end labels to categorize coherence relations, the basis for calculating inter-annotator agreement is a confusion matrix like the one in Table 2. Since CCR uses primitives rather than end labels, calculating inter-annotator agreement requires additional consideration: while one option is to mimic the approach in Table 2 and treat all primitive and distinction value combinations as end labels (e.g., 'positive causal subjective non-basic,' 'negative additive subjective indirect'), it is also possible to calculate agreement for each primitive or distinction individually (e.g., BASIC OPERATION: causal vs. additive). Treating all primitive and distinction value combinations as end labels has the main advantage that it allows for a comparison of the inter-annotator agreement score to those from annotation efforts using another framework (this approach was for instance taken in Rehbein et al., 2016).12 However, the CCR 'end labels' that are being compared are not entirely equivalent; relations have minimally three values (e.g., 'positive additive objective'), but can have up to six values (e.g., 'positive causal objective basic volitional purpose').
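To make the 'end label' option concrete, the sketch below computes simple percentage agreement and Cohen's Kappa over complete value combinations, each treated as a single label. It is an illustration only: the function name and the four toy annotations are our own assumptions and do not come from any of the annotation projects cited here.

```python
from collections import Counter


def cohen_kappa(labels1, labels2):
    """Return (observed agreement, Cohen's Kappa) for two coders."""
    if len(labels1) != len(labels2) or not labels1:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(labels1)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels1, labels2)) / n
    # Expected chance agreement from each coder's marginal label distribution.
    c1, c2 = Counter(labels1), Counter(labels2)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if expected == 1.0:
        # Both coders used one and the same label throughout.
        return observed, 1.0
    return observed, (observed - expected) / (1 - expected)


# Hypothetical toy data: each value combination is used as one end label.
coder1 = [
    "positive causal subjective non-basic",
    "positive causal subjective non-basic",
    "negative additive subjective indirect",
    "negative additive subjective indirect",
]
coder2 = [
    "positive causal subjective non-basic",
    "positive causal subjective non-basic",
    "negative additive subjective indirect",
    "positive causal subjective non-basic",
]

agreement, kappa = cohen_kappa(coder1, coder2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Note that this treats a disagreement on a single value and a disagreement on every value as equally severe, which is one reason to complement it with an analysis per primitive.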
Calculating agreement for each primitive or distinction, on the other hand, is much easier than taking an 'end label' approach. In addition, it generates a clear overview of where exactly confusions or disagreements arise, which can be extremely valuable for further annotator training. In both Rehbein et al. (2016) and , for instance, lowest agreement scores were reported for the SOURCE OF COHERENCE primitive (81.3%/κ=.63 and 75%-81%/κ=.52-.62, respectively; see also  for a discussion on reaching sufficient agreement on SOURCE OF COHERENCE); highest agreement in both studies was reached on ORDER (86.2%/κ=.87 and 94%-100%/κ=.88-1.00, respectively).13 Calculating agreement separately for each primitive or distinction makes it impossible, however, to check whether there is a systematic confusion between specific value combinations (e.g., 'negative objective causal non-basic' and 'negative additive subjective indirect'), either because of annotator bias or because of a closer resemblance between two types of relations than the value combinations may suggest (see also Section 4.2). In addition, annotations can be dependent on the annotation of the other primitives or distinctions, especially when it comes to distinctions relevant to only a subset of relations. If one coder categorizes a relation as causal, while the other one marks it as additive, the two coders do not have the same number or type of other primitives and distinctions to annotate; coder 1 will for instance have to determine whether the relation is CONDITIONAL, while coder 2 has to make a decision on the DISJUNCTION distinction (see also Scholman et al., 2016 on the interdependence of annotations in CCR).
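The interdependence described above can be made visible when agreement is computed per primitive. The following minimal sketch, with hypothetical key names and toy annotations of our own, reports for each primitive or distinction how often both coders annotated it, how often they agreed, and how often only one coder had to annotate it at all because of an earlier, diverging decision.

```python
from collections import defaultdict


def per_primitive_agreement(coder1, coder2):
    """Tally agreement separately for every primitive or distinction."""
    stats = defaultdict(lambda: {"both": 0, "agree": 0, "only_one": 0})
    for a, b in zip(coder1, coder2):
        for primitive in set(a) | set(b):
            if primitive in a and primitive in b:
                stats[primitive]["both"] += 1
                stats[primitive]["agree"] += int(a[primitive] == b[primitive])
            else:
                # Only one coder reached this distinction for this item.
                stats[primitive]["only_one"] += 1
    return dict(stats)


# Hypothetical toy annotations; the keys are a simplified subset of CCR.
coder1 = [
    {"polarity": "positive", "basic_operation": "causal",
     "source": "objective", "order": "basic"},
    {"polarity": "negative", "basic_operation": "additive",
     "source": "subjective", "directness": "indirect", "alternative": "-"},
]
coder2 = [
    {"polarity": "positive", "basic_operation": "additive",
     "source": "objective", "alternative": "-"},
    {"polarity": "negative", "basic_operation": "additive",
     "source": "subjective", "directness": "direct", "alternative": "-"},
]

stats = per_primitive_agreement(coder1, coder2)
for primitive, counts in sorted(stats.items()):
    print(primitive, counts)
```

The `only_one` count is the part that a plain per-primitive agreement score hides: those items cannot be compared on that distinction, so they have to be reported or handled explicitly.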
When using the full CCR taxonomy in an annotation project, it thus seems worth exploring the inter-annotator agreement both from the perspective of value combinations and for each individual primitive and distinction separately. The combination of both approaches will provide the most informative overview of annotations, as is also illustrated in Section 4.2. When calculating inter-annotator agreement scores, it should be considered whether the annotation process, as well as the configuration in which the annotations are being analyzed, matches the assumptions of the inter-annotator agreement statistic used; an inter-annotator agreement statistic may be unequipped to be used for annotations that are not independent or annotations that involve an uneven number of steps.14

Independence of primitives and the use of end labels in addition to primitives
While CCR's primitives are formulated as separate features, in practice the primitives seem to be slightly less independent than they may seem on the basis of the original taxonomy. First of all, the exact operationalization of a specific primitive or distinction can vary depending on other primitive values. As will be elaborated on in Section 4.3, determining the SOURCE OF COHERENCE for conditional relations involves a frequent problem that is much less often encountered in other types of relations: distinguishing between subjectivity and truth value. In addition, agreeing on the BASIC OPERATION of relations with a positive value for POLARITY tends to be much easier than determining the BASIC OPERATION of negative relations; distinguishing between positive additive and positive causal relations is simple compared to distinguishing between negative additive and negative causal relations.
[...] instance RST and SDRT. For example, depicting the overall discourse structure of a text involves annotating many higher-order relations (which often appear to be implicit: e.g., , Patterson & Kehler, 2013) and when annotating the same text, frameworks that connect all segments and chunks of segments into one top-level relation will annotate more relations than frameworks that do not (e.g., Demberg et al., 2019). Since differences between annotation frameworks or projects may thus influence inter-annotator agreement scores, differences in inter-annotator agreement scores should be interpreted with care; a lower score does not necessarily mean a worse performance by the coders.
13 It should be noted that  did not annotate POLARITY, since relations were selected on the basis of their connectives, which were all judged to be unambiguous in their POLARITY.
14 Discussing the basic assumptions of commonly used inter-annotator agreement statistics and relating them to the possible ways in which CCR annotations could be analyzed is beyond the scope of this paper, but see for instance Zhao, Liu, and Deng (2013) for a comprehensive overview of the basic assumptions of many inter-annotator agreement statistics. For a more general discussion on inter-annotator agreement in discourse annotation, see for instance  or
Another indication that primitives may not always be entirely independent from each other is that annotations may reveal a relatively frequent confusion between two types of relations that differ in multiple values. Based on the taxonomy, disagreement between relations that differ in only one value seems much more likely, and this type of confusion was indeed the most frequent type of disagreement in the annotation of the English relations in the parallel corpus (88% of all disagreements). The most common exception was a disagreement between annotators in which one annotator coded the relation as negative causal objective, (non-)basic, while the other coded it as negative additive subjective indirect, or vice versa. 15 An example of such a relation can be found in (36). On the one hand, this relation could be analyzed as a negative objective causal relation, since setting targets and deadlines could plausibly lead to those targets and deadlines being met; the relation in (36) could then be analyzed as P leading to not-Q. On the other hand, the relation could also be analyzed as a negative subjective indirect additive relation (concession); the conclusion that can be drawn on the basis of S1 is "we are doing great," while the conclusion that can be drawn on the basis of S2 is "we are not doing so great." (36) We learned from that programme that implementation was not good enough. We have a solid base of more than 200 legal acts in the environment. [We already have ambitious targets and deadlines in programmes,]S1 but [they have not all been met.]S2 In practice, it can sometimes be harder to distinguish between two types of coherence relations than would be expected on the basis of the primitive and distinction value combinations in the CCR taxonomy. The observation that in practice, primitives are slightly less independent than they may seem to be in the taxonomy makes comparing annotations between coders using value combinations worthwhile. 
In addition, it makes it useful to explore the operationalization of a specific primitive or distinction within a specific subset of relations, e.g., TEMPORALITY within causal relations versus additive relations, or SOURCE OF COHERENCE within conditional versus causal versus additive relations. Another possible solution is to use end labels in addition to the primitive value combinations. Some types of relations, especially highly specific types of relations, seem to become easier to recognize after the researcher has become more familiar with relations that carry that specific combination of primitive and distinction values. It is, for example, very likely that inter-annotator agreement on relations with a negative value for POLARITY can be improved more by focusing on the exact difference between qualifications (negative additive subjective direct; see Section 3.1.4) and concessions (negative additive subjective indirect; see Section 3.1.4) than by further discussing the individual primitives.
Occasionally, it may thus seem easier to use end labels during annotation than individual primitives and distinctions. When encountering the relation in (37) in an annotation project using PDTB 2.0 (PDTB Research Group, 2007), the relation label that should be chosen is fairly straightforward: exception. Using CCR, however, determining that (37) is a negative objective additive relation is, by comparison, much less obvious. Similarly, attributing a label to a relation like the one in (38) when using Carlson and Marcu's (2001) version of RST is simple: otherwise. Arriving at an annotation in CCR is much more involved: a negative objective conditional relation with basic order.
(37) Don't let the internet fool you - making hard boiled eggs in the microwave oven is trouble. If you try to hard boil eggs in your microwave you're likely to end up with a big mess to clean up. The rapid heat from the microwaves creates a lot of steam in the egg. [The steam has nowhere to go]S1 except [to explode out.]S2
(38) [When adding wine to a sauce, make sure you allow most of the alcohol to cook off;]S1 otherwise, [the sauce may have a harsh, slightly boozy taste.]S2
However, differences in how easy it is to annotate certain types of relations exist not just between CCR and annotation approaches with end labels, but between annotation approaches in general. (37) is simple to categorize using PDTB 2.0, but is much harder to label using Carlson and Marcu's (2001) version of RST; (38) is straightforwardly labeled using Carlson and Marcu's (2001) annotation scheme, but much more difficult to categorize using PDTB 2.0. In sum, when using CCR for discourse annotation, it can be worthwhile to explore which distinctions can be made more reliably with end labels in addition to the individual primitives. Another benefit of using end labels to refer to specific combinations of primitive values is that end labels can make talking about specific relation types much more convenient. It is for instance much easier to talk about RESULT relations than to repeatedly mention 'positive objective basic order causal relations.' In such situations, the most obvious solution would be to define a relation type in terms of CCR primitives and distinctions and give it a single name to refer to the specific relation type. We took this approach ourselves in Section 3.1.4 of this paper, where we used qualification to refer to negative additive subjective direct relations and concession to refer to negative additive subjective indirect relations. CCR's primitive approach is thus not incompatible with the use of end labels. The original CCR proposal by Sanders, Spooren, and Noordman (1992) already gives an overview of possible end labels that can be used to refer to specific combinations of primitive values. 
Being aware of which specific value combinations correspond to which type of end labels also makes it easier to compare CCR to other discourse annotation approaches and existing literature on coherence relations.
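As an illustration of how end labels and value combinations can be used side by side, the following minimal Python sketch represents an annotation as a combination of values, looks up an end label for it, and lists the values on which two coders disagree. It is not part of CCR or of any existing annotation tool: the class Relation, its field names (in particular directness, a hypothetical name for the direct/indirect distinction), and the functions end_label and differing_values are assumptions made for this example. The three end labels in the table are the ones defined in terms of value combinations in this paper; fields left as None are treated as not applicable and are skipped in the comparison.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class Relation:
    """One annotation as a combination of primitive/distinction values.

    Field names are illustrative assumptions, not official CCR terminology.
    """
    polarity: str                      # "positive" or "negative"
    basic_operation: str               # "causal", "additive" or "conditional"
    source_of_coherence: str           # "objective" or "subjective"
    order: Optional[str] = None        # "basic" or "non-basic"; None if not applicable
    directness: Optional[str] = None   # "direct" or "indirect"; hypothetical field name


# End labels mentioned in the text, defined as value combinations.
END_LABELS = {
    Relation("negative", "additive", "subjective", directness="direct"): "qualification",
    Relation("negative", "additive", "subjective", directness="indirect"): "concession",
    Relation("positive", "causal", "objective", order="basic"): "result",
}


def end_label(relation: Relation) -> Optional[str]:
    """Return the end label for a value combination, or None if none is defined."""
    return END_LABELS.get(relation)


def differing_values(a: Relation, b: Relation) -> List[str]:
    """List the fields on which two annotations disagree.

    Fields that are None (not applicable) in either annotation are skipped.
    """
    diffs = []
    for f in fields(Relation):
        va, vb = getattr(a, f.name), getattr(b, f.name)
        if va is None or vb is None:
            continue
        if va != vb:
            diffs.append(f.name)
    return diffs


# The two competing analyses of example (36).
coder_a = Relation("negative", "causal", "objective", order="basic")
coder_b = Relation("negative", "additive", "subjective", directness="indirect")

print(differing_values(coder_a, coder_b))  # ['basic_operation', 'source_of_coherence']
print(end_label(coder_b))                  # concession
```

Skipping non-applicable fields is a design choice of this sketch; a project could equally decide to count a value that only one coder assigned as a disagreement.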

SOURCE OF COHERENCE versus truth value
A common source of confusion in annotating coherence relations in a corpus pertains to the relationship between SOURCE OF COHERENCE and truth value. Objective relations are defined to hold between two events in the real world, but this does not mean that the relation that is established between the two segments is necessarily true. In (39), for instance, the relation signaled by because is a positive volitional objective causal relation in which an SoC performs a volitional action for a specific reason. The relation as a whole, however, is a conclusion by the speaker, as is also indicated by so; the speaker draws a conclusion or makes a claim about the unfolding of events in the real world. While the relation between S1 and S2 in (39) is an objective causal relation, the relation between the two combined segments and the rest of the discourse is subjective. We have found that in practice it can sometimes be difficult to distinguish between the SOURCE OF COHERENCE of the relation that is being annotated and the SOURCE OF COHERENCE at a higher discourse level.
(39) So, [you're really just apologizing]S1 because [you need my advice.]S2
Example (40) is a fragment extracted from the Europarl corpus. The relation at the end of the fragment is highly similar to the one in (39), although it is slightly more complicated. The final sentence of (40) is a positive volitional objective relation, since it holds between an intentional act and a reason for that act. However, from the fragment it is clear that the discourse relation is the speaker's answer to the question "why is this being done?" The relation is claimed to be true: the reason for the intentional act is invented or hypothesized by the speaker. This does not, however, mean that the relation itself becomes subjective. Internally, the way in which S1 relates to S2 is objective, and without context, there would probably be no confusion. 
The subjective nature of the final sentence in (40) arises from, and can be captured by, it as a whole being a claim and part of a subjective relation; the speaker claims that it is being done not for the benefit of the consumer, not because consumers demand it, but rather because it is preferred that consumers do not know about it. The distinction between SOURCE OF COHERENCE and truth value seems especially relevant to conditional relations, since they often seem to entail speaker involvement. Conditionals, of which the content usually has not been realized, are often predictions. In the relation in (41), for example, the speaker announces what his party will do in a certain scenario. Similar to the relations in (39) and (40), the relation between the two segments in (41) is objective, while the relation as a whole is a prediction. Here too, the SOURCE OF COHERENCE within the relation is not the same as the SOURCE OF COHERENCE of the relation that holds at a higher discourse level, i.e., between the combined segments as a whole and the preceding discourse.
(41) If [we find that any Member of this House or their employees collaborated with the BBC in this farrago]S1 [we will expose them to the opprobrium of this House.]S2
Removing the conditionality from the discourse relation helps when annotating the SOURCE OF COHERENCE of a conditional relation; if the resulting causal relation is objective, the conditional relation is also objective. Without the conditionality, (41) would become 'there has been a collaboration with the BBC, which is why we will expose them,' a positive volitional objective causal relation. It is also not uncommon for conditional relations to express the speaker's negative stance toward the antecedent actually taking place, and therefore toward the entire prediction (sometimes also called counterfactual or irrealis). 
In English, indicating that something is unlikely to come true can for instance be done by means of a distanced verb form, e.g., if we found that. Speakers can also encode that the event definitely did not take place, e.g., if we had found that. Even though negative stance seems to emphasize the presence of a speaker, it does not usually influence the SOURCE OF COHERENCE between the segments of the conditional relation. With negative stance added, the relation in (41), for example, still expresses that if one real-world event occurs, it leads to another real-world event.
In general, an increased awareness of the difference between SOURCE OF COHERENCE and truth value, and of the way in which the SOURCE OF COHERENCE at a higher discourse level can influence how the SOURCE OF COHERENCE between two segments is perceived, can help improve the quality and reliability of discourse annotation.

Conclusion
The Cognitive approach to Coherence Relations was originally proposed as a set of cognitively plausible primitives to order coherence relations, but is also increasingly used as an annotation scheme for classifying coherence relations (see Appendix A). In this paper, we gave an overview of the most important developments within CCR from the point of view of discourse annotation. We discussed proposals for new primitives and additional distinctions, and summarized the discussion on how to operationalize an original primitive, SOURCE OF COHERENCE. In addition, we argued in favor of adding a new distinction to CCR: DISJUNCTION. Finally, we discussed some practical issues we encountered during a recent annotation project using CCR. As a whole, this paper gives an overview of state-of-the-art CCR for discourse annotation. As such, it can be used, together with the original 1992 proposal, as a point of departure for anyone interested in annotating coherence relations using the Cognitive approach to Coherence Relations.