| INTRODUCTION

Orthodoxy in the contemporary debate on knowledge ascriptions holds that the truth-value of knowledge ascriptions is purely a matter of truth-related factors such as what evidence the putative knower has and whether her beliefs are true. Adherents of orthodoxy are thus united in rejecting views like contextualism (e.g., Blome-Tillmann, 2014; DeRose, 2009), relativism (e.g., MacFarlane, 2014), and pragmatic encroachment (e.g., Fantl & McGrath, 2009; Hawthorne, 2004; Stanley, 2005). These latter views hold that the truth-value of knowledge ascriptions also depends on conversational or psychological factors such as the salience of error-possibilities or practical factors such as what is at stake in the context of utterance (contextualism), the context of assessment (relativism), or the context of the subject to whom knowledge is ascribed (pragmatic encroachment).

One widely acknowledged challenge to orthodoxy comes from so-called salient alternative effects on knowledge ascriptions. It turns out that our willingness to ascribe knowledge depends on which error-possibilities are salient to us (e.g., Alexander, Gonnerman, & Waterman, 2014; Buckwalter, 2014; Gerken, Alexander, Gonnerman, & Waterman, 2020; Nagel, Juan, & Mar, 2013; Schaffer & Knobe, 2012). This is at least initially puzzling from the perspective of orthodoxy because the truth-value of knowledge ascriptions supposedly does not depend on such parameters. Numerous orthodox defense strategies have been offered, which usually appeal to conversational pragmatics (e.g., Rysiew, 2001) or psychological biases (e.g., Gerken, 2017; Nagel, 2010b; Williamson, 2005). Such defenses of orthodoxy remain controversial, however, and pressure remains to join the unorthodox camp.

A much less widely acknowledged challenge to orthodoxy comes from so-called practical factor effects on knowledge ascriptions, the putative phenomenon that our willingness to ascribe knowledge depends on practical factors such as what is at stake or how much time we have to ponder the issue. 1 If practical factor effects exist, they yield a similarly pressing challenge to orthodoxy. This is because, according to orthodoxy, the truth-value of knowledge ascriptions is independent of practical factors in just the same way in which it is independent of salient alternative facts. But the status of practical factor effects is highly contested.

In some sense, practical factor effects trivially exist. A number of philosophers have reported varying intuitions about knowledge ascriptions depending on what is at stake (e.g., Fantl & McGrath, 2002, pp. 67-68; Stanley, 2005, pp. 5-6). Their intuitions seem to be affected by the parameters in question. The interesting issue (at least from our perspective) is how widely these intuitions are shared. Philosophers are bound to be biased toward their favored theory, or they might be "moved by the authoritative judgments of others" (DeRose, 2009, p. 49, n. 2). Such distortions seem particularly relevant for practical factor effects, where intuitions are less than fully robust. So, are these intuitions shared by the uninitiated folk? Experimental results have so far not given a clear answer.

Practical factor effects on knowledge ascriptions have been tested in two experimental paradigms. In the first paradigm (the canonic paradigm), participants read a short story describing a protagonist in either a high or a low stakes situation. Then they rate their agreement with a given knowledge claim. The evidence available to the protagonist is described in the story, and the description of the evidence is held fixed across the high and the low stakes condition. In the second experimental paradigm (the evidence seeking paradigm), the evidence available to the protagonist is no longer described in the story. Instead, participants rate how much evidence our protagonist needs to acquire knowledge in either a low or a high stakes condition.

Practical factor effects on knowledge ascriptions have proven hard to confirm in the canonic experimental paradigm. Many studies find no effects at all (e.g., Feltz & Zarpentine, 2010; Buckwalter, 2010, 2014; Buckwalter & Schaffer, 2015, pp. 216-218; Rose et al., 2019; Francis, Beaman, & Hansen, 2019), and even in the few studies where effects are found, they remain mostly very small and/or unstable (May, Sinnott-Armstrong, Hull, & Zimmerman, 2010; Sripada & Stanley, 2012; Pinillos & Simpson, 2014, pp. 23-25). 2 Here are two hypotheses that could explain this pattern of data in existing canonic studies: (a) we are seeing stakes effects in the latter studies that are somehow blocked in the former, or (b) the latter studies merely report noise resulting from variations in extraneous variables; none of the mentioned studies confirms stakes effects. Hypothesis (b) can easily seem to be the more attractive option. It is difficult to see what should block the effects in all the former studies, while possible sources of noise in the latter studies are often easy to find. 3

Studies using the evidence seeking paradigm yield more robust results. Participants consistently offer higher evidence ratings in the high stakes condition than in the low stakes condition (Pinillos, 2012; Pinillos & Simpson, 2014; Buckwalter & Schaffer, 2015, pp. 208-209; Francis et al., 2019). 4 But a number of authors suggest that this does not confirm stakes effects on knowledge ascriptions. We are supposedly seeing stakes effects on a deontic modal that allegedly features in the task description participants receive (Buckwalter & Schaffer, 2015, pp. 207-218; Rose et al., 2019, p. 240).

1 We will be working with an intuitive notion of stakes here. See, for example, Anderson and Hawthorne (2019) for initial attempts to sharpen this notion. Notice that when we speak of stakes affecting knowledge ascriptions, we do not want to suggest that they always affect knowledge ascriptions or do so in a very systematic way. Anderson and Hawthorne (2019), for instance, argue that plausible generalizations of this kind are hard to come by. We hold instead that stakes can affect knowledge ascriptions and do so in the cases we will consider. See Weatherson (2012, p. 84) and Fantl and McGrath (2019, p. 261) for related ideas.

In sum, there is little evidence to date for practical factor effects on knowledge ascriptions, and a natural assumption would be that such effects simply do not exist. The goal of our paper is to show that, contrary to this natural conclusion, there are practical factor effects on knowledge ascriptions, and that these effects yield a substantive challenge to orthodoxy. One central contribution will be a novel experimental paradigm to test practical factor effects. It trades on the idea that people retract knowledge ascriptions when practical factors relevantly shift.

The structure of the paper is as follows. We begin by explaining how retraction data may reveal practical factor effects on knowledge ascriptions (Section 2). We present corresponding studies to show that such effects exist (Sections 3 and 4). We go on to address two major concerns about the idea that our data challenge orthodoxy. One concern is that the effects we observe can be reduced to the already established salient alternative effects and, as such, do not pose a distinctive explanatory challenge. We explain why practical factor effects may seem reducible to salient alternative effects, and object to the resulting proposal with a further study (Section 5). Another concern is that the effects we find are not as strong as one might have hoped. We explain why they still speak in favor of unorthodox positions (Section 6). We conclude that, contrary to common opinion, practical factor effects exist, and yield a substantive challenge to orthodoxy (Section 7).

2 Turri (2017, pp. 143-146) reports what may appear to be robust stakes effects, but he plausibly suggests that we are seeing a "deferral" effect instead (pp. 147-151). Deferral cannot play a role in our studies below because they feature a knowledge ascription (rather than denial) across the board. Turri, Buckwalter, and Rose (2016) report further potential robust stakes effects in canonic studies. Their "stakes" manipulation, however, varies various factors at once, including the putatively known proposition. It is thus difficult to trace the effect to stakes as we understand them. See also Gerken (2017, p. 106) on how the "complement clause" affects knowledge ascriptions.

3 Take Pinillos and Simpson's truck cases, for instance (see their web appendix). The low stakes case features a truck carrying "a dozen eggs". The high stakes case features a truck carrying a bomb that can "kill many innocent people from the immediate area." Participants presumably associate different types of trucks with these cargos, and this might suffice to explain why they take it to be less clear that the high stakes truck will make it across a "rickety wooden bridge." Similar things can be said about their coin vignettes (pp. 23-24). You normally do not win $10,000 (as in the high stakes vignette) just by counting 134 coins unless there is some trick involved that makes the task harder. See Buckwalter and Schaffer (2015, pp. 218-228) for further concerns, which will be addressed in more detail below.

4 Turri (2017, pp. 151-152) reports the only evidence seeking studies that fail to confirm practical factor effects. A number of factors could explain the absence of an effect in these studies. For instance, Turri uses a highly abstract rating scale, and he introduces the prompt as a "more general question", which might lead participants to abstract away from the specifics of the case description. See Gerken (2017, p. 283) for related observations.

Our focus will be on what is at stake as one specific type of practical factor. The retraction paradigm could be used to test other types of practical factors such as time constraints (e.g., Anderson & Hawthorne, 2019; Shin, 2014). It could also be used to further corroborate salient alternative effects. We hope the potential for further applications of the retraction paradigm underscores the fruitfulness of our approach.

2 | STAKES EFFECTS AND RETRACTION

Gerken (2017, p. 34) characterizes practical factor effects on knowledge ascriptions as "patterns of shifting knowledge ascriptions, or evaluations of them, that are due to varying practical factors". This definition is helpfully broad in that it does not specify a type of "evaluation", or assessment, as uniquely relevant for practical factor, or more specifically, stakes effects. We might find stakes effects in truth-value assessments just as in appropriateness assessments or in evaluations of how much evidence we need for knowledge in a given case. We might also find them in whether people see a need to retract a previous knowledge claim.

Indeed, many philosophers present retraction data as one important incarnation of stakes effects on knowledge ascriptions. MacFarlane (2005, p. 202), for instance, considers a case where he ascribes knowledge before there is an upward shift in parameters such as what is at stake. His verdict is this: "If challenged, I will retract my earlier claim". Dimmock and Huvenes (2014, p. 3244) similarly hold that an upward shift in stakes makes it "natural and appropriate" for someone to "admit that what he said earlier is false and retract his previous assertion".

Retraction intuitions are evidently relevant for the question of whether there are stakes effects on knowledge ascriptions. We should test these intuitions before we draw conclusions about the putative non-existence of stakes effects on knowledge ascriptions. The subsequent studies fill this gap.

| STUDY 1

Our first study tested whether people can be led to retract previous knowledge claims when the stakes rise. We presented participants with one of three scenarios. In one scenario, STAKES, the stakes change before participants assess whether they would retract a previous knowledge claim. In another scenario, NEUTRAL, nothing of relevance changes before participants are asked to consider retraction. In the last scenario, EVIDENCE, the evidential situation changes before retraction judgments are issued. If there are stakes effects on knowledge ascriptions, we should see higher retraction rates in STAKES than in NEUTRAL. The EVIDENCE scenario was included as a baseline. We expect higher retraction rates in EVIDENCE than in NEUTRAL, but we have no prediction about how EVIDENCE relates to STAKES.

Notice one respect in which our study differs from related retraction studies in the literature on epistemic modals (Knobe & Yalcin, 2014; Marques, 2015). In extant retraction studies on epistemic modals, participants normatively assess retractions. For instance, they rate whether one is "required" to retract in a given case or whether retraction would be "appropriate". Our studies do not use such normative assessments but descriptive assessments instead. Our participants answer questions about what they "would say" in a given case. This is in line with how retraction data have been presented in the literature. MacFarlane (2005, p. 202), for instance, starts out with the observation that he "will" retract in the relevant situations rather than that retraction is "required" or "appropriate".

There is a methodological question of whether normative or descriptive data yield more interesting input for theorizing. We will not take a stand on this here and just assume that our descriptive data yield input that is interesting to a notable degree. We take it that theories of knowledge ascriptions are typically designed, at least in part, to explain ordinary speaker behavior, so we hope this assumption is uncontroversial enough.

| Method

One hundred and fifty-three participants were recruited through Prolific Academic (58% female, mean age 35). Participants were instructed that they would be asked to picture themselves in a scenario and answer a question about what they would say in the scenario. Then they were randomly assigned to one of three conditions. In each condition, they received a scenario description that started out as follows (the scenarios are variants of the typo scenarios from Pinillos (2012, p. 199)):

Picture yourself in the following scenario: You and Hannah have been writing a joint paper for an English class. You have agreed to proofread the paper. You've carefully proofread the paper 3 times and used a dictionary if necessary. You spotted and corrected a few typos, but you didn't find any typos in the last round anymore. You meet up with Hannah to finally submit the paper. Hannah asks whether you think there are no typos in the paper anymore. You respond: "I know there are no typos anymore." At this point, …

Depending on the condition, the scenario description continued in one of the following ways.

NEUTRAL … Hannah reveals to you for the first time that she's always been a big fan of the Backstreet Boys. You've never liked the Backstreet Boys, but since you like Hannah, you promise to listen to a few songs she particularly recommends. You doubt that it will change your mind but agree that it doesn't hurt to give it a try. As you're about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

STAKES … Hannah reveals to you for the first time that it is extremely important for her to get an A in the English class. Her scholarship depends on it, and she'll have to leave college if she loses the scholarship. If there is a typo left in the paper, she's very unlikely to get an A, so it is extremely important to her that there are no typos in the paper. As you're about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

EVIDENCE … Hannah reveals to you for the first time that she's secretly read your previous term papers and always spotted lots of typos in them even when you said you had carefully proofread them. She apologizes for not telling you earlier. You are slightly disappointed but forgive her. Hannah is a good friend, and you appreciate that she was honest with you in the end. As you're about to submit the paper, Hannah asks whether you stand by your previous claim that you know there are no typos in the paper. You respond:

After reading the story, participants in all conditions had to choose between the response options "I do" or "I don't" (in randomized order). The specific instruction read as follows: "Please pick the response you would be more likely to give". Immediately below, participants could rate how confident they were in their response on a 7-point Likert scale ranging from "very unconfident" to "very confident". Points in between were labeled "somewhat confident/unconfident," "confident/unconfident," and "neither confident nor unconfident". Participants responded to an attention check on a new screen (asking them to recall how many times they had proofread the paper according to the story) before concluding the study.

One might worry that our case descriptions do not specify whether the factivity condition for knowledge is satisfied, that is, that the case descriptions leave open whether the paper contains remaining typos. But this is as it should be, for this factor should be entirely irrelevant to how participants respond to our prompt. A corresponding stipulation would make the case descriptions unnecessarily wordy and maybe even misleading by presenting irrelevant information as if it was relevant. We ask participants whether they would retract their previous knowledge ascription in the case described. We thus ask them what they would say, not what would be the case. Their response is going to depend, for example, on whether they would deem the evidence they are described to have as sufficient for knowledge. It will also depend on whether they would hold a belief in the target proposition. Whether the target proposition would actually be true, however, should have no additional influence. To see this clearly, just ask yourself what information you would need to respond to our prompt. You will see that a stipulation about the factivity condition is not among these factors.

| Results

The responses of two participants were discarded because they failed the attention check. We analyzed the remaining data by first looking at the binary responses alone ("I do" vs. "I don't"). As a second step, we integrated confidence levels. We did this by calculating a composite score for each participant. The response "I do" was coded as 1, the response "I don't" was coded as -1, and the composite score resulted from a multiplication with the reported confidence level ranging from 1 (very unconfident) to 7 (very confident). For instance, a very confident "I don't" yielded a score of -7, while a very unconfident "I do" yielded a score of 1.
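The coding scheme just described can be illustrated with a minimal sketch. The function name and the input validation are illustrative assumptions; only the sign coding and the 1 to 7 confidence range come from the description above.

```python
def composite_score(response: str, confidence: int) -> int:
    """Signed confidence: positive for "I do", negative for "I don't".

    Hypothetical helper; the name and the error handling are assumptions.
    """
    if response not in ("I do", "I don't"):
        raise ValueError("response must be 'I do' or 'I don't'")
    if not 1 <= confidence <= 7:
        raise ValueError("confidence must lie between 1 and 7")
    sign = 1 if response == "I do" else -1  # stand by = 1, retract = -1
    return sign * confidence


# The two examples mentioned in the text
very_confident_retraction = composite_score("I don't", 7)
very_unconfident_stand_by = composite_score("I do", 1)
print(very_confident_retraction, very_unconfident_stand_by)
```

On this coding, scores run from -7 to 7 and never take the value 0, since every participant picks one of the two responses.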

| Analysis of binary results

The percentages of participants who chose to stand by their previous knowledge claim (I do) and the percentages of retractors (I don't) in each condition are shown in Table 1.

5.9% of the participants in NEUTRAL chose to retract compared to 24% in STAKES and 58% in EVIDENCE. A chi-square test for independence was performed to determine the significance of these differences. A significant effect of story type was observed, χ²(2, N = 151) = 34.169, p < .001, Cramer's V = .476 (medium effect), prompting further pair-wise comparisons between conditions. The differences were significant in all cases. Participants were more inclined to retract in STAKES than in NEUTRAL, χ²(1, N = 101) = 6.554, p = .010, Cramer's V = .255 (medium effect). They were more inclined to retract in EVIDENCE than in STAKES, χ²(1, N = 100) = 11.947, p = .001, Cramer's V = .346 (medium effect). And they were more inclined to retract in EVIDENCE than in NEUTRAL, χ²(1, N = 101) = 31.683, p < .001, Cramer's V = .560 (medium effect).
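The binary analysis can be retraced with a short computation. The sketch below rests on assumptions that are not stated in the text: the cell counts (3 of 51 retractors in NEUTRAL, 12 of 50 in STAKES, 29 of 50 in EVIDENCE) are inferred from the reported percentages and sample sizes, and the test is taken to be a Pearson chi-square without continuity correction. The function name is illustrative.

```python
import math


def chi_square_and_cramers_v(table):
    """Pearson chi-square (no continuity correction) and Cramer's V
    for a table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0]))
    cramers_v = math.sqrt(chi2 / (n * (smaller_dim - 1)))
    return chi2, cramers_v


# ASSUMED counts [retract, stand by], inferred from the reported percentages
neutral = [3, 48]
stakes = [12, 38]
evidence = [29, 21]

overall = chi_square_and_cramers_v([neutral, stakes, evidence])
stakes_vs_neutral = chi_square_and_cramers_v([stakes, neutral])
print("all conditions:     chi2 = %.3f, V = %.3f" % overall)
print("STAKES vs. NEUTRAL: chi2 = %.3f, V = %.3f" % stakes_vs_neutral)
```

Under these assumptions, the omnibus and pair-wise statistics come out close to the values reported above.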

| Analysis of composite scores

Mean composite scores by condition are shown in Figure 1. As before, there was a statistically significant difference between conditions as determined by one-way ANOVA (F(2,148) = 24.376, p < .001, η² = .248 [large effect]). A Tukey HSD post hoc test revealed that participants were more inclined to retract in STAKES (M = 3.32, SD = 4.40, p = .036, Cohen's d = .50 [medium effect]) and EVIDENCE (M = -.40, SD = 5.19, p < .001, Cohen's d = 1.37 [large effect]) than in NEUTRAL (M = 5.43, SD = 2.82). They were also more inclined to retract in EVIDENCE than in STAKES (p < .001, Cohen's d = .88 [large effect]).
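The relation between the reported means, standard deviations, and effect sizes can be illustrated as follows. This is a reconstruction under assumptions, not a procedure stated in the text: the sketch standardizes each mean difference by a standard deviation pooled over all three conditions and treats the group sizes as equal. The variable and function names are hypothetical.

```python
import math

# Reported means and SDs of the composite scores in Study 1
study1 = {
    "NEUTRAL": (5.43, 2.82),
    "STAKES": (3.32, 4.40),
    "EVIDENCE": (-0.40, 5.19),
}


def pooled_sd(sds):
    """Square root of the mean variance; ASSUMES equal group sizes."""
    return math.sqrt(sum(sd ** 2 for sd in sds) / len(sds))


def cohens_d(cond_a, cond_b, sd):
    """Absolute mean difference between two conditions in units of sd."""
    return abs(study1[cond_a][0] - study1[cond_b][0]) / sd


# ASSUMPTION: the standardizer pools over all three conditions
sd_all = pooled_sd([sd for _, sd in study1.values()])

for pair in [("NEUTRAL", "STAKES"), ("NEUTRAL", "EVIDENCE"), ("STAKES", "EVIDENCE")]:
    print(pair, round(cohens_d(pair[0], pair[1], sd_all), 2))
```

With this choice of standardizer, the results lie within rounding distance of the reported effect sizes. Pooling over only the two conditions being compared would give somewhat different values.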

| Discussion

Our study confirms stakes effects on knowledge ascriptions. Retraction becomes approximately four times more likely when the stakes go up (5.9% vs. 24%). The effect is not as strong as the effect of changes in one's evidential situation (5.9% vs. 58%). But this should not be worrisome. Knowledge ascriptions are affected by different parameters, and the evidence parameter may have a stronger influence than the stakes parameter.

Notice that unlike in the case of evidence seeking studies, one cannot argue that we are actually seeing stakes effects on a deontic modal (as suggested in Buckwalter & Schaffer, 2015, pp. 207-218 and Rose et al., 2019, p. 240). The simple reason is that our studies do not feature a relevant modal in the task participants are asked to perform. This puts pressure on the idea that the deontic modal theory is true even for evidence seeking studies (see Pinillos & Simpson, 2014, pp. 29-38; Francis et al., 2019; and Dinges, 2020 for further data that the deontic modal theory cannot explain).

| STUDY 2

The following study aimed to replicate the results from the previous study with a different background story.

| Method

One hundred and fifty-two participants were recruited through Prolific Academic (57% female, mean age 33). The study design was as before except that we used the following bank case vignettes rather than the typo vignettes from above (see DeRose (1992) for the original bank cases):

Picture yourself in the following scenario: You are driving home from work on a Friday afternoon with a colleague, Peter. You plan to stop at the bank to deposit your paychecks. As you drive past the bank, you notice that the lines inside are very long, as they often are on Friday. Peter asks whether you know whether the bank will be open tomorrow, on Saturday. If it is open tomorrow, you can come back tomorrow, when the lines are shorter. You remember having been at the bank three weeks before on a Saturday. Based on this, you respond: "I know the bank will be open tomorrow". At this point, …

NEUTRAL … you receive a phone call from your partner. S/he tells you that one of your children has gotten sick and that they are still waiting at the doctor's office to get an appointment. S/he asks whether you can water the plants if you come home and prepare dinner. There's enough food at home so you don't have to buy anything extra. You agree. As you hang up, Peter asks whether you stand by your previous claim that you know the bank will be open tomorrow. You respond:

STAKES … you receive a phone call from your partner. S/he tells you that it is extremely important that your paycheck is deposited by Saturday at the latest. A very important bill is coming due, and there is too little in the account. You realize that it would be a disaster if you drove home today and found the bank closed tomorrow. As you hang up, Peter asks whether you stand by your previous claim that you know the bank will be open tomorrow. You respond:

EVIDENCE … you receive a phone call from your partner. S/he tells you that s/he was at a different branch of your bank earlier today. A sign said that the branch no longer opens on Saturdays. You see a similar sign in the branch you were about to visit. You can't properly read the sign from the distance, but it seems to concern the opening hours. As you hang up, Peter asks whether you stand by your previous claim that you know the bank will be open tomorrow. You respond:

As before, participants made a binary choice between "I do" and "I don't" and rated their confidence levels afterwards. Again, they received an attention check on a new screen (recall how many weeks have passed since their last visit at the bank) before concluding the study.

| Results

| Analysis of binary results

All participants passed the attention check. Table 2 shows the proportions of participants who chose to retract and the proportions of participants who chose to stand by their initial knowledge claim for each condition. 9.8% of our participants chose to retract their previous knowledge claim in NEUTRAL compared to 48% in STAKES and 96.1% in EVIDENCE. A chi-square test for independence reveals that these differences are significant, χ²(2, N = 152) = 76.302, p < .001, Cramer's V = .709 (large effect). Pair-wise comparisons show a significant difference between NEUTRAL and STAKES, χ²(1, N = 101) = 17.996, p < .001, Cramer's V = .422 (medium effect), EVIDENCE and STAKES, χ²(1, N = 101) = 29.126, p < .001, Cramer's V = .537 (medium effect), and NEUTRAL and EVIDENCE, χ²(1, N = 102) = 76.185, p < .001, Cramer's V = .864 (large effect).

| Analysis of composite scores

Mean composite scores by condition are shown in Figure 2. Again, there was a statistically significant difference between conditions as determined by one-way ANOVA (F(2,149) = 83.997, p < .001, η² = .530 [large effect]). A Tukey HSD post hoc test revealed that participants were more inclined to retract in STAKES (M = 1.10, SD = 5.08, p < .001, Cohen's d = 1.06 [large effect]) and EVIDENCE (M = -4.53, SD = 2.46, p < .001, Cohen's d = 2.55 [large effect]) than in NEUTRAL (M = 5.06, SD = 3.27). They were also more inclined to retract in EVIDENCE than in STAKES (p < .001, Cohen's d = 1.50 [large effect]).

TABLE 2 Percentages by condition of participants who retract/stand by

                      Neutral (%)   Stakes (%)   Evidence (%)
Stand by (I do)       90.2          52           3.9
Retract (I don't)     9.8           48           96.1

| Discussion

The study replicated our previous findings. It shows even more clearly than before that high stakes lead to retractions of previous knowledge claims. The effect of new evidence is still stronger than the effect of stakes. As discussed above, this is not a worry for the view that knowledge ascriptions are sensitive to stakes.

| STUDY 3

One may grant that our studies reveal stakes effects on knowledge ascriptions, but deny that these effects pose a distinctive explanatory challenge to proponents of orthodoxy. In particular, one might present what we will call the reduction hypothesis. This hypothesis says that our stakes effects (and maybe stakes effects in general) are reducible to the already established salient alternative effects in the sense that any given explanation of the latter more or less trivially yields an explanation of the former. Thus, stakes effects do not add a substantial explanatory burden. We should look for an account of salient alternative effects. Whatever theory we end up with, an account of stakes effects will simply fall out (see Buckwalter & Schaffer, 2015, pp. 222-223 for a related suggestion).

To motivate the reduction hypothesis, one may present what we will call the mediation hypothesis. This hypothesis says that (a) participants who picture themselves in a high-stakes situation (as in STAKES) think of additional error-possibilities, and that (b) this alone causes them to retract earlier knowledge ascriptions. If the mediation hypothesis holds, the reduction hypothesis naturally follows. To fully explain stakes effects, we have to explain why people think of more error-possibilities when the stakes rise (compare (a)), and why this in turn makes them retract earlier knowledge attributions (compare (b)). The first part of the explanation will plausibly come from psychology, where it has been independently observed that high stakes lead people to consider a wider range of hypotheses (e.g., Mayseless & Kruglanski, 1987, pp. 175-178; Buckwalter & Schaffer, 2015, p. 222). The second part of the explanation arguably follows from any given explanation of salient alternative effects. After all, an explanation of salient alternative effects plausibly just is an explanation of why thinking of additional error-possibilities affects knowledge ascriptions. So given the mediation hypothesis, the reduction hypothesis naturally follows, that is, stakes effects reduce to salient alternative effects in the sense that we can explain the former based on any given account of the latter.

FIGURE 2 Mean composite scores by condition (evidence, stakes, neutral). Error bars show 95% CI

The subsequent study tests this proposal. In particular, it tests the mediation hypothesis and thus the claim that stakes effects are fully mediated through the number of error-possibilities people think of. If the mediation hypothesis can be confirmed, the reduction hypothesis naturally follows. Meanwhile, if the mediation hypothesis fails, the reduction hypothesis becomes unmotivated. It remains possible that stakes effects reduce to salient alternative effects, but proponents of orthodoxy would no longer be justified in relying on this assumption.

| Method

| Pretest

The major challenge in designing the present study was to find a way to control for the number of error-possibilities participants think of (or the number of "generated" error-possibilities, as we will also sometimes say) when they respond to our retraction tasks. Unless we can measure this factor, we cannot assess whether it mediates stakes effects on knowledge ascriptions. 7 The first step we took in finding such a measure was to generate a list of error-possibilities that would naturally come to our participants' minds. To generate this list, we ran a small pretest (N = 20). We asked participants to imagine themselves in a bank-case-like situation where they had formed the belief that the bank will be open on the next Saturday based on their memory of having been at the bank 3 weeks before. Then we stipulated that this belief was wrong and that the bank was actually closed. Participants now received the following instruction: "What has happened? Why is the bank closed tomorrow? Write down the first thing that comes to mind". We systematized the responses and ended up with the following four error-possibilities:

(1) The bank has changed its hours since my last visit;
(2) there is a bank holiday tomorrow I forgot about;
(3) I have wrongly remembered my visit at the bank 3 weeks before (e.g., I was there on a Friday or it was a different branch);
(4) the bank is closed tomorrow due to staff training.

This list of error-possibilities was employed in our main study as described below. (See the Appendix for the precise vignette we used in the pretest and a systematized list of the responses participants gave.)

7 The experimental approach we will take differs from the approach Buckwalter and Schaffer (2015, pp. 224-225) took to control for salient alternatives. They attempt to modify the case descriptions such that both the low and the high stakes condition feature what they call "high salience." We worry though that their modification of the low stakes case actually turns it into a high stakes case. The case description mentions that "a small percentage" of people die from eating "a single Mongolian pine nut." It is stipulated that the protagonist has only a slight allergy. Still, the stakes might be high for her to the extent that she cares for the well-being of others (who might die).

| Main study

One hundred and seventy-three participants were recruited through Prolific Academic (61% female, mean age 32). They were assigned to the NEUTRAL or the STAKES story from the bank case study above (we dropped the EVIDENCE condition this time). The experimental design was as in the bank case study up to and including the point where participants indicated whether they would retract or stand by their previous knowledge claim and how confident they were in their response. After giving these responses, they moved on to a new screen where we told them that we were interested in their thoughts while answering the previous questions. A list of the above error-possibilities (1) to (4) was provided (in random order) and participants could click either "Yes" or "No" for each error-possibility to indicate whether they had thought of it. All error-possibilities were visible at once. The specific instruction read as follows: "We are interested in your thoughts while answering the previous questions. Did you consciously think of the possibility that...". This instruction was followed by the error-possibilities (1) to (4) and the respective "Yes/No" buttons. The study concluded with the familiar attention check on a separate screen (recall how much time has passed since the last visit to the bank).

We asked participants for their "conscious" thoughts for the following reasons. First, the results from, for example, Mayseless and Kruglanski (1987, pp. 175-178) used above to motivate the mediation hypothesis concern the conscious generation of error-possibilities. Participants in their studies even wrote down the possibilities in question. Second, we are ultimately interested in whether stakes effects can be reduced to salient alternative effects. Salient alternative effects are naturally characterized in terms of the idea that people become less prone to ascribe knowledge when novel error-possibilities are brought to their attention, that is, when they consciously think of novel error-possibilities. Sub-conscious error-possibilities do not play a role.

| Results

The study replicated the results from the previous studies. Responses from three respondents were discarded because they failed the attention check. The remaining participants were more inclined to retract in STAKES (45%) than in NEUTRAL (12%) as measured in binary responses (χ²(1, N = 170) = 22.760, p < .001, Cramér's V = .366 [medium effect]) and composite scores (STAKES [M = .76, SD = 5.28] vs. NEUTRAL [M = 4.75, SD = 3.63]; t(168) = 5.74, p < .001, d = .88 [large effect]).
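As a rough check on the reported effect sizes, the following sketch recomputes Cramér's V from the χ² statistic and Cohen's d from the two reported condition means and standard deviations. The group sizes are not reported in this section, so the pooled standard deviation assumes equal group sizes; the function names are illustrative and not taken from the original analysis.

```python
import math


def cramers_v(chi2, n, rows=2, cols=2):
    """Cramér's V for an r x c contingency table."""
    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))


def cohens_d_equal_n(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD, ASSUMING equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Values as reported for the binary responses and the composite scores.
v = cramers_v(chi2=22.760, n=170)
# Only the magnitude of d is used, so the order of the two groups does not matter.
d = abs(cohens_d_equal_n(4.75, 3.63, 0.76, 5.28))

print(f"Cramér's V = {v:.3f}, |d| = {d:.2f}")
```

Under the equal-group-size assumption, both values agree with the reported figures after rounding.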

To test our mediation hypothesis, we determined the number of error-possibilities each participant claimed to have thought of, that is, the number of times they had clicked "Yes" on the error-possibility screen. We call this variable the number of generated error-possibilities. 40% of our participants generated none of the error-possibilities we provided, 22.4% one, 18.8% two, 12.4% three, and 6.5% all four.

A bootstrap mediation analysis (Hayes, 2013) was performed with condition as the independent variable, composite score as the dependent variable and number of generated error-possibilities as the mediator. Results are shown in Figure 3 and Table 3. Participants generated more error-possibilities when the stakes were high (a = .81). When they had generated more error-possibilities, they were more inclined to retract their previous knowledge attribution (b = -1.65). A bootstrap confidence interval for the indirect effect (ab = -1.34) based on 5,000 bootstraps was entirely below zero (95% CI for indirect effect = -2.12 to -.68) suggesting that there was an indirect effect. However, a direct effect of condition on composite scores remained (c′ = -2.65).
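The logic of such an analysis can be sketched as follows. This is a minimal percentile-bootstrap illustration in plain Python, not the procedure from Hayes (2013) as actually run by the authors. The data are synthetic, and every generating parameter, variable name and helper function below is an assumption made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(xs, ys):
    """Population covariance (the 1/n factor cancels in the slopes below)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def paths(x, m, y):
    """OLS slopes: a from m ~ x; b and c_prime from y ~ x + m."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / det
    c_prime = (sxy * smm - smy * sxm) / det
    return a, b, c_prime


def bootstrap_ci(x, m, y, n_boot=2000, seed=0):
    """95% percentile interval for the indirect effect a*b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            a, b, _ = paths([x[i] for i in idx],
                            [m[i] for i in idx],
                            [y[i] for i in idx])
        except ZeroDivisionError:
            continue  # skip degenerate resamples (no variance in x or m)
        estimates.append(a * b)
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic data only: all generating parameters are arbitrary assumptions.
rng = random.Random(42)
x = [i % 2 for i in range(170)]  # 0 = NEUTRAL, 1 = STAKES
m = [sum(rng.random() < 0.25 + 0.2 * xi for _ in range(4)) for xi in x]  # 0..4
y = [4.5 - 2.5 * xi - 1.5 * mi + rng.gauss(0, 3) for xi, mi in zip(x, m)]

a, b, c_prime = paths(x, m, y)
total = cov(x, y) / cov(x, x)  # total effect from y ~ x alone
low, high = bootstrap_ci(x, m, y)
print(f"a={a:.2f} b={b:.2f} ab={a * b:.2f} c'={c_prime:.2f} "
      f"95% CI for ab=[{low:.2f}, {high:.2f}]")
```

In this linear setup the total effect decomposes exactly into the direct effect plus the indirect effect (c = c′ + ab). An indirect effect whose interval excludes zero, together with a remaining direct effect, corresponds to partial mediation.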

| Discussion

The results described put pressure on the mediation hypothesis. On this hypothesis, the generation of error-possibilities alone causes retraction. This assumption is required to derive the reduction hypothesis, according to which we can fully explain stakes effects via salient alternative effects. Our study shows, however, that the generation of error-possibilities is only partially responsible for stakes effects. Thus, the mediation hypothesis fails, and the reduction hypothesis becomes unmotivated. Stakes effects do pose a distinctive explanatory challenge to orthodox positions.

One might worry that we should have provided a wider range of error-possibilities on the respective screen, or that we should have split up the given error-possibilities into more fine-grained alternatives. This could further weaken the direct effect. Note, however, that the direct effect of high stakes on retraction in our studies was roughly twice as large as the indirect effect through generated error-possibilities (2.65 vs. 1.34 reduction in composite scores). It is doubtful from our perspective that the suggested modifications will make it disappear entirely, but we welcome further empirical research.
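The comparison of direct and indirect effects can be made explicit with a few lines of arithmetic. The sketch below uses only the two reported coefficients and assumes the standard linear decomposition of the total effect into direct plus indirect effect; the variable names, including `share_mediated`, are illustrative.

```python
direct = -2.65    # reported direct effect of condition on composite scores
indirect = -1.34  # reported indirect effect via generated error-possibilities

total = direct + indirect          # total effect under c = c' + ab
ratio = direct / indirect          # how much larger the direct effect is
share_mediated = indirect / total  # proportion of the total effect mediated

print(f"total = {total:.2f}, direct/indirect = {ratio:.2f}, "
      f"share mediated = {share_mediated:.2f}")
```

On these figures, roughly a third of the total effect runs through generated error-possibilities.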

One might further worry about the reliability of participants in reporting the error-possibilities they had previously thought of. This worry seems unfounded though. For instance, the short temporal distance makes it very unlikely that participants had already forgotten what they had thought about. 8 It could be argued that people who chose to retract felt pressure afterwards to say that they had thought of more error-possibilities in order to rationalize their decision. If anything, though, this should strengthen the indirect effect because it would tighten the connection between more generated error-possibilities (#EP) and retraction (COMP). So, if anything, the cards are stacked in favor of the mediation hypothesis, and yet our study disconfirms it.

8 The so-called "thought listing technique" is an established experimental paradigm that relies on similar assumptions (Cacioppo & Petty, 1981).

Finally, it might be noted that our mediation hypothesis is distinct from the mediation-through-seriousness hypothesis, according to which (i*) high stakes lead people to take additional error-possibilities seriously, and (ii*) this alone leads them to retract. 9 Our mediation hypothesis may also be distinct from the mediation-through-salience hypothesis, according to which (i**) high stakes make new error-possibilities salient, and (ii**) this alone leads people to retract. 10 One may worry that these latter hypotheses could also be used to reduce stakes effects to salient alternative effects, even if our mediation hypothesis fails. Thus, even if we can refute the mediation hypothesis, the reduction hypothesis remains well-motivated.

The mentioned hypotheses may be true, and they may be interesting in their own right. Unlike our mediation hypothesis, however, they do not motivate the reduction hypothesis, that is, the thesis that any given explanation of salient alternative effects automatically yields an explanation of stakes effects.

Consider the mediation-through-seriousness hypothesis. Blome-Tillmann (2014), for instance, presents a version of contextualism, according to which this hypothesis is true. On his view, knowledge is defined in terms of one's ability to rule out the error-possibilities taken seriously in the context at hand. Stakes effects supposedly arise because high stakes lead people to take more error-possibilities seriously (pp. 14-15). The reduction hypothesis does not follow. Suppose, for instance, that we explain salient alternative effects as follows (e.g., Dinges, 2018a; Hawthorne, 2004; Williamson, 2005). The conscious thought of an error-possibility leads us to overestimate the probability of error via something like the availability heuristic (Tversky & Kahneman, 1973). Since knowledge is incompatible with a high probability of error, this leads us to deny knowledge. Such an account does not help at all in explaining stakes effects on knowledge ascriptions even given the mediation-through-seriousness hypothesis. It does not help to explain why high stakes lead us to take more error-possibilities seriously (compare (i*)). The availability account arguably explains why we take more error-possibilities seriously once we consciously think of them. After all, we presumably take possibilities more seriously the more probable we judge them to be. But this gets us nowhere as far as stakes are concerned. 11 The availability account does not help to explain either why taking more error-possibilities seriously leads us to deny knowledge (compare (ii*)). The explanation of this connection should presumably come from an analysis of knowledge of the type favored by Blome-Tillmann, but the availability account presupposes no such analysis.

Consider next the mediation-through-salience hypothesis. This hypothesis just collapses into our mediation hypothesis if we think of salience as, or as closely enough connected to, what we consciously think of. One may construe salience in other ways. For instance, one may appeal to the idea that salience is a conversational rather than a psychological phenomenon (see, e.g., Gerken, 2017, pp. 24-29 for this distinction) or that salience requires "vivid and concrete" presentation (Schaffer & Knobe, 2012, p. 694). On these interpretations, though, the mediation-through-salience hypothesis does not look very plausible. Assume a notion of salience of the indicated sort, where salience is, say, a conversational factor or requires vivid and concrete presentation. The mediation-through-salience hypothesis would now say that high stakes lead to the conversational salience of new error-possibilities or their vivid and concrete presentation. But it seems clear that you can be in a high stakes situation without being in any conversation at all and without any error-possibilities being presented to you. Moreover, and as before, it is doubtful whether the mediation-through-salience hypothesis, so understood, allows us to reduce stakes effects to salient alternative effects. For instance, we find salient alternative effects even for cases where the protagonist merely thinks of a novel error-possibility (Alexander et al., 2014) and where the error-possibility in question is not presented in a particularly vivid or concrete fashion (Alexander et al., 2014; Gerken et al., 2020; Nagel et al., 2013). This makes it unlikely that the indicated notions of salience even play a role in a general explanation of salient alternative effects.

9 See, for example, Blome-Tillmann (2014, pp. 14-15) for an idea along these lines, when "taking seriously" is spelled out in terms of what we presuppose to be false. Buckwalter and Schaffer (2015, pp. 222-223) may subscribe to a similar hypothesis if "worrying" is understood as tantamount to "taking seriously".

10 See, for example, Cohen (2000, p. 98) and Pinillos (2016, p. 355) for such hypotheses.

11 One might envisage the following, additional mediation hypothesis here. High stakes lead us to take more error-possibilities seriously because high stakes lead us to consciously think of more error-possibilities, which in turn leads us to take them seriously. The availability account could now be recycled to explain the latter step. However, we end up with a view where stakes effects are fully mediated through the conscious consideration of additional error-possibilities, and this type of position has been ruled out by our study.

| STRENGTH OF THE DATA

Our studies suggest that there are stakes effects on knowledge ascriptions and that these effects cannot be reduced to salient alternative effects. One may still doubt that this gives us a reason to abandon orthodoxy. Only a minority of our participants chose to retract, and this may seem too little. Buckwalter and Schaffer (2015, p. 221), for instance, discuss similar results and observe that they "remain a far cry from the strong flip from 'knowledge' to 'ignorance'" that was characteristically assumed in early discussions of stakes effects. They correspondingly "doubt that such modest results should be very encouraging" for opponents of orthodoxy. We think this worry is unconvincing, for the following reasons.

There are two ways to spell out the worry. First, one could argue that proponents of orthodoxy can explain weak effects too. There is thus no pressure to reject their position. Second, one could hold that even if proponents of orthodoxy cannot explain the data, unorthodox positions do not fare any better. Orthodox positions undergenerate, predicting no stakes effects, while unorthodox positions overgenerate, predicting stronger stakes effects than we observe.

We will address these worries in turn, beginning with the first worry, according to which weak data can be explained from an orthodox perspective. Of course, we cannot show that the data are impossible to explain from an orthodox perspective. Novel accounts could always be developed. We can show though that the data provide a prima facie challenge to this position along the lines of the familiar challenge from salient alternative effects. The basic reasoning is simple. According to orthodoxy, the truth-value of knowledge ascriptions does not depend on stakes, not even a bit. The stakes-shift in STAKES should thus be perceived as just as immaterial to the initial knowledge ascriptions as the continuation in NEUTRAL, which is not the case. 12 Of course, as with salient alternative effects, proponents of orthodoxy can appeal to familiar pragmatic (e.g., Brown, 2006, p. 426), doxastic (e.g., Nagel, 2008), or psychological (e.g., Gerken, 2017) resources to respond. Such accounts, however, face equally familiar concerns.

Take doxastic accounts, according to which people retract previous knowledge claims only because high stakes make them lose their beliefs and thereby knowledge. First, while such accounts are consistent with the findings from our specific studies, they do not readily apply to related results in evidence seeking studies. There we find stakes effects even when it is stipulated that the protagonist retains her belief (Pinillos, 2012, p. 203), when it is stipulated that she is unaware of what is at stake (Pinillos, 2012, pp. 202-203), and when participants respond to the question of when the protagonist is in a position to know the target proposition (Dinges, 2020; see also Pynn, 2014, p. 130). All of these findings require special pleading on behalf of the doxastic account to the extent that we seek a unified account of stakes effects across different experimental paradigms. Second, it is unclear whether the doxastic account really is compatible with orthodoxy rather than being a version of pragmatic encroachment. If knowledge entails belief, and if beliefs can change in virtue of shifts in stakes as doxasticism would seem to have it, then knowledge can shift in virtue of shifts in stakes too, that is, we get pragmatic encroachment. Correspondingly, the chief objection to pragmatic encroachment (according to which this view validates awkward counterfactuals such as "I know that p, but I wouldn't know that p if it was more important to be right") applies with equal force to doxasticism. 13

On the second construal, the present worry has it that even if proponents of orthodoxy cannot explain the data, unorthodox positions do not explain them either because they predict stronger results (e.g., near universal retraction rates in STAKES). We respond by showing that unorthodox positions can easily be spelled out such that they do not make this problematic prediction.

As indicated, there are a number of unorthodox positions. For concreteness, it will be useful to focus on a specific view, which we will call the "simple pragmatic encroachment view", or SPEV for short. According to SPEV, knowledge entails evidence for the putatively known proposition. How much evidence? Practical factors set the threshold. The more is at stake, say, the more evidence knowledge requires.

To begin with, let us look at how the overgeneration worry arises for SPEV. It may seem that the best way for proponents of SPEV to explain the results from our retraction studies is as follows. The protagonist's evidence is good enough for knowledge in NEUTRAL, where the stakes are low. The (same) evidence is not good enough in STAKES, where the stakes are high. Participants realize this and correspondingly consider their initial knowledge claim to be false in STAKES, retracting for this reason. 14 On this picture, one would naturally expect more consistent responses. If the evidence available to the protagonist is not good enough for knowledge in STAKES, why do we not see at least the majority of participants retract?

13 For further discussion of these and additional worries with doxasticism, see, for example, Fantl and McGrath (2009, pp. 44-46), Pinillos (2012, pp. 202-203), Sripada and Stanley (2012, pp. 20-23), Shin (2014, pp. 173-177), Pynn (2014, pp. 129-131), Stoutenburg (2016, pp. 2037-2039), Gerken (2017, p. 287), and Dinges (2020). As for pragmatic accounts, Dimmock and Huvenes (2014, pp. 3244-3247) argue that these accounts cannot explain retraction data as we find them in our studies, and they raise a number of additional concerns. For yet further worries with pragmatic accounts, see, for example, Pinillos (2012, pp. 203-204), Pinillos and Simpson (2014, p. 39, n. 17), Blome-Tillmann (2013), Petersen (2014), Roeber (2014), Kindermann (2016), Stoutenburg (2016, pp. 2033-2037), and Dinges (2018b). Gerken's psychological heuristic proxy account has difficulties explaining why stakes effects remain in studies that explicitly shift the participants' focus on an epistemic rather than a practical assessment (Pinillos, 2012, pp. 203-204).

14 There is a question of whether the stakes change in STAKES or whether they are merely revealed to have been high all along. We assume the latter in the main text. If the stakes change, the initial utterance will come out true even by SPEV. Retraction may still be warranted because even though the protagonist knew the target proposition, she no longer knows it now. This is reason enough to retract, or at least not to stand by, one's previous utterance.

In response, notice the open-endedness of the scenario descriptions. In our typo stories, for instance, it is stated that the protagonist has carefully proofread the paper three times. Nothing is said about, for example, how long the paper is or how good a proofreader the protagonist happens to be. These factors will be filled in differently depending on individual background assumptions. Proponents of SPEV can argue that participants in STAKES retract only to the extent that they perceive their initial evidence as insufficient for knowledge in STAKES. Skilled proofreaders, for instance, may stand by their claim even in this context. Austin (1956) captures the general spirit of this approach (see also Boyd & Nagel, 2014):

When we come down to cases, it transpires in the very great majority that what we had thought was our wanting to say different things of and in the same situation was really not so - we had simply imagined the situation slightly differently: which is all too easy to do, because of course no situation (and we are dealing with imagined situations) is ever "completely" described.
Austin (1956, pp. 9-10)

Similar things can be said about our bank cases. The assumed strength of one's evidence depends on background assumptions about, for example, the reliability of banks or one's memory. There will be individual differences. These differences explain differences in retraction behavior according to SPEV. 15

Notice that it is actually unsurprising that more general retraction rates are difficult to generate given SPEV. Proponents of SPEV will (or at least can and maybe should) say that the evidential threshold for knowledge is high even in low stakes conditions. For knowledge is always a fairly demanding evidential state. Maybe they will say that low stakes knowledge requires an evidential probability of .8. When the stakes rise, the threshold goes up, but given that we are close to 1 from the start, we should not expect overly dramatic shifts. Maybe the threshold goes to .9. Universal retraction will occur due to shifts in stakes only if we manage to describe cases where the evidential probability is universally perceived to fall into the small window between .8 and .9, while all other conditions for knowledge are stably satisfied. Given how heavily evidential probabilities depend on individual background assumptions, given how multifarious the notion of knowledge is and given the unclarity about where exactly the relevant thresholds lie to begin with, these requirements are difficult to meet within the confines of a vignette suitable for survey studies. 16

To further explore this proposal, one could control for various background assumptions by asking participants about, for example, their estimate of how probable it is that banks change their hours. On the suggested proposal, we should find that retraction responses are partially mediated by at least some of these factors. We are optimistic that this prediction will be borne out, but we will leave confirmation to future research.

15 The open-endedness of the scenario descriptions may also help to explain why stakes effects do not appear in canonic studies. See, for example, Nagel (2010a, p. 429, n. 6) and Pinillos and Simpson (2014, p. 14).

16 It will not do to just stipulate that the evidential probability of the target proposition is, say, .85 (in those terms). Participants presumably cannot make sense of this semi-technical notion. It will not do either to fix evidential probabilities by introducing statistical evidence from, for example, a lottery. Such evidence is generally perceived as insufficient for knowledge (e.g., Turri & Friedman, 2014), thus participants would only deny knowledge across the board, in both the low stakes and the high stakes condition.
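The threshold reasoning behind SPEV's reply can be illustrated with a toy model. The sketch below uses the two example thresholds from the text (.8 and .9) and assumes, purely for illustration, that participants' perceived evidential probabilities vary according to a Beta(8, 2) distribution. The distribution, the function names and the response labels are assumptions and not part of the original studies.

```python
import random

LOW_STAKES_THRESHOLD = 0.8   # example value used in the text
HIGH_STAKES_THRESHOLD = 0.9  # example value used in the text


def predicted_response(perceived_probability):
    """Toy SPEV prediction for one participant in the STAKES condition."""
    if perceived_probability < LOW_STAKES_THRESHOLD:
        return "no knowledge in either condition"
    if perceived_probability < HIGH_STAKES_THRESHOLD:
        return "retract in STAKES"
    return "stand by in STAKES"


def retraction_rate(probabilities):
    """Share of participants whose perceived probability falls in the window."""
    retracting = sum(predicted_response(p) == "retract in STAKES"
                     for p in probabilities)
    return retracting / len(probabilities)


# Assumption: perceived probabilities differ between participants because
# background assumptions differ; Beta(8, 2) is an arbitrary illustrative choice.
rng = random.Random(7)
sample = [rng.betavariate(8, 2) for _ in range(1000)]
print(f"predicted retraction rate: {retraction_rate(sample):.2f}")
```

On this toy model, only participants whose perceived probability falls in the window between the two thresholds are predicted to retract, so any spread in background assumptions keeps the predicted retraction rate well below 100%.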

Our studies suggest that there are stakes effects on knowledge ascriptions that cannot be reduced to salient alternative effects. So far, we have considered cases where protagonists self-ascribe knowledge before the stakes rise. Future studies should vary these factors, exploring third-person cases, cases featuring knowledge denials rather than ascriptions, cases where the stakes are lowered rather than raised and cases where practical factors other than stakes (e.g., time constraints) shift. Our retraction paradigm provides a novel framework to explore all of these issues and to thereby deepen our understanding of stakes effects on knowledge ascriptions. How should we explain the data we have obtained so far? We have not settled on an account, but we hope to have shown that there is evidence that poses a serious challenge to orthodoxy about knowledge ascriptions.

Mind & Language. 2021;36:729-749. wileyonlinelibrary.com/journal/mila


Pragmatic encroachment on knowledge can (but does not have to) be understood as a deep metaphysical thesis that need not be reflected in folk judgments (e.g., Brown, 2013, pp. 240-241).

This way of calculating composite scores is taken from Gerken et al. (2020).

See Pinillos and Simpson (2014, p. 39, n. 15) for related observations.
