Introduction

Epistemic contextualism is the view that the verb "know" is a context sensitive expression. More specifically, according to contextualism, in order for us to truthfully say a person "knows" a proposition, that person must meet the evidential standard set by our context and, critically, the standard changes across contexts. Contextualists motivate their view based on a set of empirical claims about competent speakers' linguistic behavior in certain situations (Lewis 1996; DeRose 2009; Cohen 2013).

* This is the penultimate version of a paper to appear in Australasian Journal of Philosophy. Please cite the final, published version if possible.

The most famous way of illustrating the idea involves a pair of cases about a man who wants to deposit a check and is deciding whether to wait in a long line at the bank on a Friday afternoon, or come back on Saturday morning when the line would be short (DeRose 2009). But the question arises: is this bank actually open tomorrow (Saturday) morning? The man visited this bank two Saturdays ago and it was open then, but banks do sometimes change their hours. In the "low stakes" version of the case, nothing serious hinges on whether he deposits the check before the weekend is over, and the man says, "I know that the bank is open tomorrow." In the "high stakes" version of the case, something very serious hinges on whether he deposits the check before the weekend is over, and the man says, "I don't know that the bank is open tomorrow." Contextualists claim that people will judge that both knowledge statements are true. ("Knowledge statement" includes both attributions and denials of knowledge.) That is, competent speakers will judge that the man truthfully says he "knows" in the low stakes version, and that the man truthfully says he "doesn't know" in the high stakes version.

For the sake of argument, suppose that people behave as contextualists predict. We might doubt that this is a surprising prediction, because we might expect this behavioral pattern even if contextualism is false. One possibility is that, across the pair of cases, people draw different inferences about whether the agent satisfies the (invariant) requirements of knowledge. It is widely assumed that in order for you to know that a proposition is true, it must be that the proposition is true, that you have good evidence for the proposition's truth, and that you think that the proposition is true. Each requirement on knowledge provides a possible explanation for why competent speakers would judge that the agent can truthfully say "I know" in the low case and "I don't know" in the high case. For example, people might assume that someone who says "I don't know" is less likely to have the relevant belief (compare Bach 2005; Nagel 2008), or to have a true belief, or to have good evidence than someone who says "I do know." This could lead people to judge that the agent knows in the one case but not in the other, which in turn would lead people to judge that the one agent truthfully says "I know," whereas the other agent truthfully says "I don't know." Similarly, depending on how serious the situation is, people might draw different inferences about the truth, the agent's evidence, and what the agent thinks.

Contrary to these suggestions, contextualists have claimed that people's judgments about the invariant requirements of knowledge do not shift across the pair of cases. For instance, some propose that it is "natural" to interpret the agent as equally confident in the low and high stakes cases (DeRose 2009, pp. 190-3). Some propose it is clear that "nothing changes" about the quality of the agent's evidence across the cases (Lawlor 2013, p. 81; see also DeRose 2009, p. 2). And some imply that we can or will "assume" that the relevant proposition is true in each case (DeRose 2009, p. 2). 1 In other words, contextualists propose that the shift in "knowledge" judgments cannot be explained by appealing to widely accepted invariant conditions on knowledge (e.g. truth, evidence, belief) because our underlying judgments about these factors do not shift. Thus, something else must explain the shift in "knowledge" judgments, something which does shift across the cases. Contextualists propose a shifting evidential standard associated with "know" to explain the behavior.

1 This same assumption is not necessarily shared by previous experimental work on the issue.

Another possibility is that the shift in "knowledge" judgments is due to a critical confound. The low and high stakes cases differ not only in how much is at stake, but also in whether the agent says "I know" or "I don't know." People might simply defer to others' self-regarding knowledge statements, regardless of whether the stakes vary. That is, people might assume that others are well positioned to report on their own mental states. Contextualists have not addressed this concern about a "deferral" confound. Accordingly, at least three questions are important for evaluating the motivation for contextualism. First, do "knowledge" judgments shift across the pair of cases? Second, do judgments about truth, belief, or evidence shift across the pair of cases? Third, is the confound of deferral an innocent imperfection or a genuine problem? If "knowledge" judgments shift, the other judgments do not shift, and the confound of deferral is an innocent imperfection, then the principal motivation offered for contextualism is supported; otherwise, it is undermined.

Prior research has not addressed the second or third questions and has yielded mixed results regarding the first question. Some results seemed inconsistent with contextualist predictions about shifting "knowledge" judgments (e.g. Buckwalter 2010; Feltz & Zarpentine 2010; Buckwalter 2014; see also May, Sinnott-Armstrong, Hull & Zimmerman 2010), but others seemed consistent with the predictions (e.g. Hansen & Chemla 2013; see also Alexander, Gonnerman & Waterman 2014). Contextualists have made several methodological objections to these studies, particularly emphasizing the importance of eliciting metalinguistic judgments about a knowledge statement's truth value and of having the agent say "I know" in the low case and "I don't know" in the high case (DeRose 2009, p. 49, n. 2; DeRose 2011). Others express dissatisfaction at the design of previous work on the topic (Schaffer 2006). For example, some argue that contextualist thought experiments are poorly constructed because they are supposed to produce similar results across the two cases rather than a difference, leading to the problem of interpreting a null result (Hansen & Chemla 2013, pp. 292, 295). But this last criticism is incorrect because contextualism does not predict a null result. Instead, contextualism predicts that people will agree with the knowledge statement in both low and high conditions. Testing this requires a significance test against the neutral midpoint for a scaled response (or against chance for a dichotomous response). If the results are significantly above the midpoint in both conditions, then contextualism's prediction is vindicated. Consistent with this, agreement might still be significantly higher in one of the conditions. (As observed below in Experiment 1, this actually happens sometimes.)
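As a rough illustration of the midpoint test just described, the following Python sketch computes a one-sample t statistic against the neutral midpoint of a seven-point scale, separately for each condition. The ratings are invented for illustration and are not data from any experiment, and the function name `one_sample_t` is hypothetical. A p-value would then be read from the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics

MIDPOINT = 4  # neutral point of a 1-7 agreement scale


def one_sample_t(ratings, mu0=MIDPOINT):
    """Return (mean difference, t statistic, degrees of freedom, Cohen's d)."""
    n = len(ratings)
    mean_diff = statistics.mean(ratings) - mu0
    sd = statistics.stdev(ratings)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd / math.sqrt(n))
    return mean_diff, t, n - 1, mean_diff / sd


# Hypothetical ratings, invented for illustration only.
low = [5, 6, 5, 4, 6, 7, 5, 6, 5, 6]
high = [5, 4, 5, 3, 5, 6, 4, 5, 5, 4]

# Each condition is tested against the midpoint on its own;
# no between-condition difference is required for "agreement in both".
for label, ratings in (("Low", low), ("High", high)):
    md, t, df, d = one_sample_t(ratings)
    print(f"{label}: MD = {md:.2f}, t({df}) = {t:.2f}, d = {d:.2f}")
```

The point of running two separate tests is that "agreement in both conditions" is a claim about each mean relative to the midpoint, which is logically independent of whether the two means differ from each other.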

In this paper, I report a series of behavioral experiments that address all three questions. According to the results, (1) "knowledge" judgments shift in the way contextualists predict, (2) judgments about truth, belief, and evidence also shift across the pair of cases, and (3) the deferral confound is a genuine problem. Overall, these results undermine the principal motivation offered for contextualism in the literature to date.

Experiment 1

This experiment tests whether judgments about "knowledge" and potential requirements of knowledge shift across a classic pair of low/high cases.

Method

Participants. Two hundred and one participants (aged 19-71 years, mean age = 31 years; 94% reporting English as a native language; 77 female) were tested. Participants were U.S. residents, recruited and tested online using Amazon Mechanical Turk and Qualtrics, and compensated $0.40 for approximately 2-3 minutes of their time. Repeat participation was prevented within and across experiments. Participants were recruited and tested the same way in all experiments reported here.

Materials and Procedure. Participants were randomly assigned to one of two conditions, Low and High, in a between-subjects design. Each participant read a single story. The stories were based on the original pair of "bank cases" (DeRose 2009, pp. 1-2). Keith and Jane are driving home from work on Friday afternoon. Keith plans to deposit a check they just received, but then he sees that the line inside the bank is very long. He says that he will just return tomorrow instead of waiting in line. It is not important for the check to be deposited before Monday in the Low condition, but it is very important in the High condition. Jane questions Keith calmly in the Low condition, but she questions him anxiously in the High condition. After Jane questions him, Keith claims to know that the bank is open tomorrow in the Low condition, but he claims not to know in the High condition. Here is the text for the stories:

(Low/High) Keith and his wife Jane are driving home from work on Friday afternoon. They just received a large check from a client, which Keith plans to deposit in their bank account. It is [not/very] important for him to deposit the check before Monday: [they definitely do/otherwise they won't] have enough money in their account for all their checks to clear.

As they drive past the bank, they see that the lines inside are very long. Keith says, "I hate waiting in line. I'll just come back tomorrow morning instead."

Jane responds [calmly/anxiously], "This is really [not/very] important, [but/and] lots of banks are closed on Saturdays. Do you know that our bank is open tomorrow?"

Keith answers, "It was two Saturdays ago that I went to our bank, and it was open. So, [yes, I do/no, I don't] know that our bank is open tomorrow."

After reading the story, participants rated their agreement with six statements. At the end of the story:

1. It's true that the bank is open tomorrow.

2. Keith believes that the bank is open tomorrow.

3. Keith has good evidence that the bank is open tomorrow.

4. When Keith said, "I [do/don't] know," what he said was true.

5. The situation is potentially very serious for Keith.

6. Keith should come back tomorrow morning instead.

Statement 5 was included as a manipulation check, with the expectation that participants would disagree in the Low condition and agree in the High condition. Statement 4 was included to test whether people agree that the knowledge attribution and the knowledge denial are true. The other four statements were included to test whether judgments about other potential requirements of knowledge also differ across the conditions.

Responses were collected on a standard seven-point Likert scale, 1 ("Strongly Disagree") -7 ("Strongly Agree"), left-to-right on the participant's screen. (Participants never saw numerical labels.) The statements were presented in random order and appeared on the participant's screen all at once in a matrix table, while the story remained at the top of the screen.

After rating the statements, participants proceeded to a new screen where they answered three comprehension questions from memory (response options rotated randomly):

1. Keith and his wife were driving home on _____. (Friday/Saturday)

2. It was _____ ago that Keith was at the bank. (two Saturdays/one Saturday)

3. Keith said that he _____ know that the bank is open tomorrow. (does/does not)

Finally, participants went to a new screen and completed a brief demographic questionnaire. At no point could participants return to a previous screen. The same is true for all studies reported in this paper.
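The bracketed alternatives in the story text above mark the only places where the Low and High versions differ, with the Low wording listed first. As a minimal sketch of how such a template could be expanded into the two stories, assuming that reading of the notation (the names `ALTERNATIVE` and `render` are hypothetical and not part of the study materials):

```python
import re

# Matches a bracketed pair such as "[not/very]":
# the first option is the Low wording, the second is the High wording.
ALTERNATIVE = re.compile(r"\[([^\[\]/]+)/([^\[\]/]+)\]")


def render(template, condition):
    """Fill every [low/high] slot with the wording for the given condition."""
    index = {"Low": 1, "High": 2}[condition]
    return ALTERNATIVE.sub(lambda match: match.group(index), template)


template = (
    'Jane responds [calmly/anxiously], "This is really [not/very] important, '
    '[but/and] lots of banks are closed on Saturdays."'
)

print(render(template, "Low"))
print(render(template, "High"))
```

Keeping both versions in one template makes it easy to verify that the conditions differ only in the intended slots.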

Results

Participants answered the comprehension questions correctly 97% of the time, indicating that they understood and remembered the story's details extremely well. Participant response to the "seriousness" statement shows that the experimental manipulation was extremely effective. (See Table 1A.) 3 Participants in both conditions tended to agree that Keith's self-regarding knowledge statement was true: one sample t-tests (test proportion = 4), Low, t(101) = 6.95, p < .001, MD = 1.02 [0.73, 1.31], d = 0.69 (medium effect size); High, t(98) = 2.00, p = .048, MD = 0.36 [0.01, 0.72], d = 0.20 (small effect size). 4 (See Table 1A for descriptive statistics.) The modal response was [...]

3 Preliminary tests revealed no main or interaction effects of participant gender on the dependent measures. Preliminary tests revealed two small, unpredicted main effects of participant age. Older participants (median split = 28 years) were more likely to attribute belief overall, independent samples t-test, younger/older, M = 5.45/5.94, SD = 1.48/1.20, t(194) = -2.60, p = .010, MD = -0.49 [-0.87, -0.12], d = 0.37 (small effect size). Similarly, older participants were more likely to agree overall that Keith's self-regarding knowledge statement was true, M = 4.42/4.99, SD = 1.80/1.50, t(195) = -2.46, p = .015, MD = -0.57 [-1.03, -0.11], d = 0.35 (small effect size). I ignore these small and unpredicted demographic effects in the main text.

Table 1A. Experiment 1: Mean response (standard deviation in parentheses) to the statements in the Low and High conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Low          High         t       df   p      MD     d     LLCI   ULCI
Truth          5.08 (1.23)  4.27 (1.12)  4.84    199  <.001  0.81   0.69  0.48   1.13
Belief         6.36 (0.97)  4.99 (1.37)  8.15    176  <.001  1.37   1.23  1.04   1.70
Evidence       5.66 (1.22)  4.75 (1.53)  4.64    187  <.001  0.91   0.68  0.52   1.30
"Know"         5.02 (1.48)  4.36 (1.81)  2.81    189  .006   0.66   0.41  0.20   1.12
Serious        2.78 (1.89)  5.85 (1.06)  -14.21  160  <.001  -3.06  2.25  -3.49  -2.64
Actionability  4.72 (1.71)  2.26 (1.28)  11.54   187  <.001  2.45   1.69  2.03   2.87
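The Table 1A entries can be roughly cross-checked from the summary statistics alone. The sketch below makes two assumptions that the text does not state: that rows with df below 199 reflect Welch's unequal-variance correction, and that d was obtained as 2t/√df. The group sizes (102 in Low, 99 in High) are inferred from the one-sample degrees of freedom reported in the Results. The function names are hypothetical.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def d_from_t(t, df):
    """Effect size approximated as 2|t| / sqrt(df); an assumed formula."""
    return 2 * abs(t) / math.sqrt(df)


# Belief row of Table 1A: Low 6.36 (0.97), High 4.99 (1.37).
# Group sizes are inferred, not reported directly.
t, df = welch_from_summary(6.36, 0.97, 102, 4.99, 1.37, 99)
print(f"t({df:.0f}) = {t:.2f}, d = {d_from_t(t, df):.2f}")
```

Under these assumptions the result lands close to the tabled Belief values (t = 8.15, df = 176, d = 1.23), with small discrepancies expected because the inputs are rounded to two decimals.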

Conceptual Replication

To ensure that these results are not due to peculiarities of one particular story, I conducted a conceptual replication based on another famous low/high pair from the literature (Cohen 1999). The [...]

After reading the story, participants answered these questions:

1. It's true that the flight is direct.

2. Stewart believes that the flight is direct.

3. Stewart has good evidence that the flight is direct.

4. When Stewart said, "I [do/don't] know," what he said was true.

5. The situation is potentially very serious for Stewart.

6. Stewart should buy the plane tickets to Chicago.

The results replicated the findings from Experiment 1. (See Table 1B.) Participants in both conditions tended to agree that Stewart's self-regarding knowledge statement was true: one sample t-tests, Low, t(51) = 2.06, p = .045; High, t(48) = 6.14, p < .001. The modal response was "Agree" (= 6) in both conditions.

Table 1B. Conceptual replication of Experiment 1: Mean response (standard deviation in parentheses) to the statements in the Low and High conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Low          High         t       df   p      MD     d     LLCI   ULCI
Truth          4.73 (1.19)  4.24 (1.55)  1.77    99   .079   0.49   0.36  -0.06  1.03
Belief         6.65 (0.57)  4.63 (1.70)  7.92    58   <.001  2.02   2.08  1.51   2.53
Evidence       5.13 (1.51)  4.71 (1.47)  1.42    99   .160   0.42   0.29  -0.17  1.01
"Know"         4.44 (1.55)  5.41 (1.61)  -3.07   99   .003   -0.97  0.61  -1.56  -0.34
Serious        3.25 (1.57)  6.04 (1.37)  -9.50   99   <.001  -2.79  1.91  -3.37  -2.21
Actionability  5.31 (1.25)  4.76 (1.63)  1.92    99   .057   0.55   0.39  -0.01  1.12

Discussion

Participants agreed that a knowledge attribution was true in a low stakes case, and that a knowledge denial was true in a high stakes case. However, judgments about several potential invariant requirements of knowledge also differed across the cases. In the high stakes case, participants were more likely to doubt that the relevant proposition is true, they judged the evidence to be worse, they were less willing to attribute belief, and they were more likely to deny that the agent should act.

Experiment 2

Familiar contextualist test cases contain a potentially important confound. More specifically, the confound is that the low and high stakes cases differ not only in how much is at stake, but also in the self-regarding knowledge statement made by the agent. It could be that people tend to defer to an agent's own self-regarding knowledge statements, which leads them to agree with the self-regarding knowledge statement in the low stakes case ("I know") and high stakes case ("I don't know"). 6 The present experiment tests this deferral hypothesis by comparing results from two cases differing only in whether the agent says "I know" or "I don't know."

6 This differs from the "rule of accommodation" (Lewis 1979). Deferral might happen because we flexibly interpret an expression's meaning to make it come out true, but a simpler explanation is just that we assume that people tend to be right about their own mental states.

Method

Participants. One hundred ninety-nine new participants (aged 18-65 years, mean age = 31 years; 97% reporting English as a native language; 80 female) were tested. Data was not collected from one person who declined to sign the consent form.

Materials and Procedure. [...]

Results

[...] (See Table 2A.) In each condition the modal response to the knowledge statement was "Somewhat agree" (= 5).

7 Preliminary tests revealed no main or interaction effects of participant gender or age on the dependent measures.

Table 2A. Experiment 2: Mean response (standard deviation in parentheses) to the statements in the Yes and No conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Yes          No           t       df   p      MD     d     LLCI   ULCI
Truth          5.00 (1.18)  4.64 (1.19)  2.12    197  .035   0.36   0.30  0.03   0.69
Belief         6.51 (0.63)  4.90 (1.49)  9.99    135  <.001  1.61   1.72  1.29   1.93
Evidence       5.48 (1.30)  5.14 (1.33)  1.83    197  .069   0.34   0.26  -0.02  0.71
"Know"         4.58 (1.67)  4.49 (1.78)  0.39    197  .694   0.09   0.06  -0.39  0.58
Serious        2.54 (1.27)  2.50 (1.38)  0.19    197  .849   0.04   0.02  -0.34  0.41
Actionability  4.84 (1.55)  4.41 (1.72)  1.86    197  .065   0.43   0.27  -0.02  0.89

Conceptual Replication

Again I conducted a conceptual replication based on the story about the direct flight. The procedures and questions were exactly the same as in the conceptual replication for Experiment 1.

Here is the text for the stories: [...]

The results replicated the findings from Experiment 2. (See Table 2B.) Participants in each condition tended to agree with the self-regarding knowledge statement: one sample t-tests, Yes, t(49) = 2.21, p = .032; No, t(50) = 2.19, p = .033. The modal response to the knowledge statement was "Somewhat agree" (= 5) in the Yes condition and "Agree" (= 6) in the No condition.

Table 2B. Conceptual replication of Experiment 2: Mean response (standard deviation in parentheses) to the statements in the Yes and No conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Yes          No           t       df   p      MD     d     LLCI   ULCI
Truth          4.98 (1.62)  4.96 (1.51)  0.06    99   .951   0.02   0.00  -0.60  0.64
Belief         6.50 (0.79)  4.90 (1.58)  6.46    74   <.001  1.60   1.50  1.11   2.09
Evidence       4.82 (1.67)  5.18 (1.42)  -1.15   99   .252   -0.36  0.23  -0.97  0.26
"Know"         4.54 (1.73)  4.61 (1.98)  -0.18   99   .855   -0.07  0.04  -0.80  0.67
Serious        3.92 (1.40)  3.51 (1.53)  1.41    99   .163   0.41   0.28  -0.17  0.99
Actionability  5.12 (1.35)  5.25 (1.41)  -0.49   99   .625   -0.14  0.10  -0.68  0.41

Discussion

In addition to differing in how much is at stake, contextualist test cases also differ in whether the agent self-attributes knowledge or self-denies knowledge. The present experiment tested whether this difference produces the principal datum that contextualists cite as motivation for their view, even when stakes do not vary. The results showed that it did: when the only difference between conditions was that the agent said either "I do know" or "I don't know," participants were equally likely to agree with the statement. Participants in both conditions were equally likely to judge the situation unserious, and the differences in ratings of truth, actionability, and evidence were all greatly diminished from those observed in Experiment 1. The difference in belief attribution across conditions remained large, so deferral might go through belief attribution. Another possibility is that deferral is merely an instance of a more general agreement bias, whereby people tend to endorse assertions (Krosnick 1999, p. 552; see also Gilbert, Krull & Malone 1990). Further investigation into why deferral occurs is warranted but unimportant for present purposes. The crucial point is that the confound of deferral is a genuine problem for contextualist test cases.

Experiment 3

Instead of asking participants to rate whether an agent's self-regarding knowledge statement is true, in this experiment I asked participants to recommend whether the agent should self-attribute knowledge in order to speak truthfully. Because the agent makes no self-regarding knowledge statement, participants could not simply defer to one, which avoids the deferral confound.

Method

Participants. Two hundred and two participants (aged 18-63 years, mean age = 30 years; 95% reporting English as a native language; 78 female) were tested.

Materials and Procedure. Participants were randomly assigned to one of two conditions, Low and High, in a between-subjects design. Each participant read a single story very similar to the ones used in Experiment 1. The difference is that this time the story did not end with Keith making a self-regarding knowledge statement. Instead, it ended when Keith says, "It was two Saturdays ago that I went to our bank, and it was open." After reading the story, participants were instructed, "Keith wants to correctly answer Jane's question. In light of that, please rate your agreement with the following statement." Here is the test statement:

In order for Keith's answer to be true, he should say, "Yes, I do know."

Participants then went to a new screen and rated their agreement with the same statements about truth, belief, evidence, seriousness, and actionability from Experiment 1, in the same way as in Experiment 1. The question about seriousness was included as a manipulation check, with the expectation that participants would disagree in the Low condition and agree in the High condition. After rating the statements, on a new screen participants answered a comprehension question from memory:

Keith and his wife were driving home on _____. (Friday/Saturday)

Results

Participants answered the comprehension question correctly 96% of the time. 8 Response to the "seriousness" statement again showed that the experimental manipulation was extremely effective. (See Table 3A.) The principal question is whether participants agreed more strongly in the Low condition that Keith should self-attribute knowledge in order to speak truthfully. Participants in both conditions were equally likely to agree that Keith should self-attribute knowledge in order to speak truthfully. (See Table 3A.) Mean response in each condition was significantly above the neutral midpoint (= 4): one sample t-tests, Low, t(101) = 4.66, p < .001, MD = 0.79 [0.46, 1.13], d = 0.46 (small effect size); High, t(99) = 3.81, p < .001, MD = 0.64 [0.31, 0.97], d = 0.38 (small effect size). The modal response in each condition was "Agree" (= 6).

8 Preliminary tests revealed no main effects of participant age or gender on response to the knowledge statement. Participant gender had a small unpredicted main effect on quality of evidence, with women tending to rate the evidence better than men: men/women, M = 5.25/5.64, SD = 1.24/1.14, t(200 [...]

Table 3A. Experiment 3: Mean response (standard deviation in parentheses) to the statements in the Low and High conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Low          High         t       df   p      MD     d     LLCI   ULCI
Truth          5.14 (1.33)  4.75 (1.19)  2.18    200  .030   0.39   0.31  0.04   0.74
Belief         6.27 (0.77)  6.16 (0.88)  0.98    200  .328   0.11   0.13  -0.12  0.35
Evidence       5.59 (1.20)  5.21 (1.22)  2.23    200  .027   0.38   0.32  0.04   0.71
"Know"         4.79 (1.72)  4.64 (1.68)  0.64    200  .520   0.15   0.09  -0.32  0.63
Serious        2.48 (1.50)  5.76 (1.20)  -17.18  192  <.001  -3.28  2.48  -3.66  -2.90
Actionability  5.00 (1.49)  3.00 (1.73)  8.81    200  <.001  2.00   1.25  1.55   2.48

Conceptual Replication

Again I conducted a conceptual replication based on the story about the direct flight. The procedures and questions were almost the same as in the conceptual replication for Experiment 1. The difference is that this time the story did not end with Stewart making a self-regarding knowledge statement. Instead, it ended when Stewart says, "It was two Saturdays ago that I flew this airline to Chicago, and the flight was direct." After reading the story, participants were instructed, "Stewart wants to correctly answer Jill's question. Please rate your agreement with the following statement." Here is the test statement:

In order for Stewart's answer to be true, he should say, "Yes, I do know."

Participants then went to a new screen and rated their agreement with the same statements about truth, belief, evidence, seriousness, and actionability from the conceptual replication of Experiment 1.

The results replicated the findings from Experiment 3. (See Table 3B.) Participants in both conditions were equally likely to agree that Stewart should self-attribute knowledge in order to speak truthfully. (See Table 3B.) Mean response in each condition was significantly above the neutral midpoint (= 4): one sample t-tests, Low, t(49) = 3.96, p < .001; High, t(50) = 3.18, p = .003. The modal response was "Somewhat Agree" (= 5) in Low and "Agree" (= 6) in High.

Table 3B. Conceptual replication of Experiment 3: Mean response (standard deviation in parentheses) to the statements in the Low and High conditions along with the results from independent samples t-tests.

                                                                          95% CI for MD
Measure        Low          High         t       df   p      MD     d     LLCI   ULCI
Truth          4.54 (1.01)  4.71 (1.19)  -0.75   99   .453   -0.17  0.15  -0.60  0.27
Belief         6.16 (1.00)  6.27 (0.67)  -0.68   99   .498   -0.12  0.13  -0.45  0.22
Evidence       4.86 (1.34)  5.41 (1.24)  -2.14   99   .035   -0.55  0.43  -1.06  -0.04
"Know"         4.90 (1.61)  4.76 (1.72)  0.41    99   .684   0.14   0.08  -0.52  0.79
Serious        4.02 (1.68)  6.14 (0.75)  -8.14   67   <.001  -2.12  1.99  -2.64  -1.60
Actionability  5.14 (1.25)  5.16 (1.48)  -0.06   99   .951   -0.02  0.01  -0.56  0.52

Discussion

This experiment tried to find evidence for contextualism while avoiding the confound of deferral. Instead of retrospectively rating an agent's self-regarding knowledge statement, participants prospectively recommended whether the agent should self-attribute knowledge in order to speak truthfully. The results did not support contextualism. Whether the agent was in a high stakes or a low stakes situation, participants were equally likely to recommend the self-attribution.

Before moving on, it is worth addressing one issue. In Experiment 1, participants in the high stakes condition tended to agree that the agent spoke truthfully when he said "I don't know." But in Experiment 3, participants in the high stakes condition recommended saying "I do know" in order to speak truthfully. These results are not inconsistent, because the two experiments differed in whether the agent self-denies knowledge. (Experiment 2 showed that people defer to self-regarding knowledge statements.)

Experiment 4

The present experiment attempts to detect evidence for contextualism in a new way. More specifically, I asked participants about the strength of evidence required to "know."

Method

Participants. Ninety-nine new participants (aged 18-74 years, mean age = 31 years; 97% reporting English as a native language; 41 female) were tested. Data was not collected from one person who declined to sign the consent form.

Materials and Procedure. Participants were randomly assigned to one of two conditions, Low and High, in a between-subjects design. The procedures and stories for the conditions were very similar to Experiment 1, except this time the story ends with Jane posing a question to Keith about evidence and knowledge. The stories were the same as in Experiment 1 through the point where Jane asks, "Do you know that our bank is open tomorrow?" At that point, each story ends as follows:

Jane continues, "This actually raises a more general question I've been considering. On a scale of 1 to 10, with 10 being the highest, how strong must your evidence be in order to know that the bank is open tomorrow?"

After reading the story, participants were instructed, "Keith wants to correctly answer Jane's question. In order for Keith's answer to be true, what should he say?" Beneath the instructions was an open sentence to complete:

"Knowing requires evidence that rates _____ on the scale."

Responses were collected on a ten-point scale, 1-10, left-to-right on the participant's screen. Participants then went to a new screen and answered a comprehension question from memory:

Keith and his wife were driving home on _____. (Friday/Saturday)

Results

Participants answered the comprehension question correctly 92% of the time. 9 The critical question is whether assignment to condition affected response to the question about evidence. If the standards for knowledge shift across contexts, then mean response should be higher in the High condition. That is, participants in the High condition should think that Keith should say that knowledge requires stronger evidence. Contrary to that prediction, there was no effect of condition, independent samples t-test, High/Low, M = 8.62/8.31, SD = 1.98/1.86, t(97) = -0.81, p = .418, n.s.

Conceptual Replication

Again I conducted a conceptual replication with the story about the direct flight. The procedures and stories for the conditions were very similar to the conceptual replication of Experiment 1, except this time the story ends with Jill posing a question to Stewart about evidence and knowledge. The stories were the same up through the point where Jill asks, "Do you know that the flight is direct to Chicago?" Then each story ends as follows:

Jill continues, "This actually raises a more general question I've been considering. On a scale of 1 to 10, with 10 being the highest, how strong must your evidence be in order to know that the flight is direct?"

After reading the story, participants were instructed, "Stewart wants to correctly answer Jill's question. In order for Stewart's answer to be true, what should he say?" Beneath the instructions was an open sentence to complete:

"Knowing requires evidence that rates _____ on the scale."

The results replicated the findings from Experiment 4. There was no effect of condition on response to the test question, independent samples t-test, High/Low, M = 9.08/8.59, SD = 1.29/1.56, t(98) = 1.69, p = .093, d = 0.34.

Discussion

If contextualism is true, then people should think that, in order to speak truthfully, someone in a high stakes situation must say that the evidential standard for knowledge is higher than someone in a low stakes situation. The results did not support this prediction. Whether the agent was in a high stakes or a low stakes situation, participants recommended identifying a similar evidential standard. The small numerical difference between the two situations in the conceptual replication (Experiment 4B) reached what is sometimes called a "marginal" or "trending" level of statistical significance. Perhaps future work could build on this to detect evidence motivating contextualism.

Conclusion

Epistemic contextualism is principally motivated by a set of empirical claims about linguistic intuitions and behavior. It has its "basis in ordinary language," which allegedly provides "evidence of the very best type" that the standards of "knowledge" are context sensitive (DeRose 2009, p. 48). The results from four experiments show that some but not all contextualist behavioral predictions were correct. Contextualists correctly predicted that people will agree when someone says "I know" in a low stakes situation, and when someone says "I don't know" in a related high stakes situation. However, people also draw different inferences about (what are widely agreed to be) invariant requirements of knowledge, including whether the target proposition is true, whether the agent believes the target proposition, the quality of the agent's evidence, and how the agent should act. Accordingly, rejecting contextualism does not force one to "resist the intuitions" that lead people to endorse different self-regarding knowledge statements in the low and high stakes cases (cf. DeRose 2011, p. 89).

Moreover, contextualist test cases contain a critical confound. The confound is that the low and high stakes cases also differ in whether the agent says "I know" or "I don't know." However, even when the stakes are held constant, people tend to agree when the agent says "I know" and when the agent says "I don't know" (Experiment 2). Some contextualists have insisted that, when testing their view, it is critical to include explicit utterances of "I know" and "I don't know," because this is "a significant part of what's behind the intuitions [contextualists] actually appeal to" (DeRose 2011, pp. 86-7). However, they failed to realize that this difference alone is enough to produce the relevant pattern of agreement with "knowledge" statements.

In short, the principal extant motivation for contextualism fails. Contextualists still owe us a distinguishing prediction of their view, something we would confidently expect only if contextualism were true, or which contextualism seems uniquely suited to explain. Absent such a prediction, contextualism is an idle hypothesis and we should not accept it. Of course, it is consistent with my findings that such a prediction will be made and vindicated. The present research was not designed to test all possible such predictions and future work exploring this possibility should not be ruled out (for a potential starting point, see the marginally significant difference observed in the conceptual replication of Experiment 4).

Some might be tempted to object that my results demonstrate only one thing: we cannot count on ordinary people to properly assess cases. Instead, we should trust the expert intuitions of trained philosophers, some of whom detect the relevant nuances and respond in line with contextualist predictions. In response, this objection is unavailable to leading contextualists, whose "main positive argument" is based on "straightforward data concerning" the ordinary behavior of competent speakers (DeRose 2009, p. 69). Settling for this objection would amount to a transparent and feeble bait-and-switch. We were sold a batch of "ordinary language" goods, but all we got for our money was a raincheck on some "expert" intuitions.

The current findings suggest some general lessons for philosophers, including some critical improvements to standard methodology. First, for any discussion informed by patterns in judgments, including knowledge judgments, philosophers should not ignore basic tendencies in human cognition. For example, it appears that, other things being equal, we defer to others' self-regarding mental state attributions. Accordingly, we should not be impressed if a philosophical hypothesis predicts this deferential pattern. Fulfilling this prediction does not count in the theory's favor. It is an idle prediction.

Second, philosophers should remember that judgments about cases are influenced not only by explicitly stated details, but also by the inferences people draw from them. For instance, suppose that increasing what is at stake makes people more likely to deny knowledge. We should not automatically interpret this as a sign that knowledge (or "knowledge," or our concept of knowledge) is partly constituted by stakes (or variable evidential standards or truth conditions), in addition to factors such as truth, belief and evidence. For variation in underlying judgments about truth, belief or evidence could cause the change in knowledge judgments (Turri & Buckwalter in press). And given how much of human cognition occurs automatically and unconsciously (Bargh, Schwader, Hailey, Dyer & Boothby 2012; Nosek, Hawkins & Frazier 2011; Haidt 2007; Wilson 2002), philosophers should not assume that we do not make such underlying judgments just because we do not introspect them. Relatedly, theorists proposing cases cannot, by stipulation, magically cause us to "hold fixed" our underlying judgments and then legitimately proclaim that a difference must be due to some other factor.

Finally, for discussions hinging on patterns in ordinary usage, philosophers should be quick to rely on controlled experiments to separate wheat from chaff. This is especially true early in a research program, when experimental results can provide formative clues putting researchers on more promising paths. An enormous amount of time and energy has been devoted to contextualism in epistemology over the past two decades. In that time, it went from a novelty to a standard topic covered in textbooks, handbooks, anthologies, encyclopedias, and survey courses (e.g. Turri 2014; Pritchard 2014; Steup, Turri & Sosa 2013; Rysiew 2011; BonJour 2010; Dancy, Sosa


Indicates paragraph break on the participant's screen.

Degrees of freedom (rounded to the nearest whole number) are adjusted lower for some dependent measures because of unequal variance on the measure across conditions.

Neither participant gender nor age affected response to the test question about evidence.