Responding to recent concerns about the reliability of the published literature in psychology and other disciplines, we formed the X-Phi Replicability Project to estimate the reproducibility of experimental philosophy. Drawing on a representative sample of 40 x-phi studies published between 2003 and 2015, we enlisted 20 research teams across 8 countries to conduct a high-quality replication of each study in order to compare the results to the original published findings. We found that x-phi studies – as represented in our sample – successfully replicated about 70% of the time. We discuss possible reasons for this relatively high replication rate in the field of experimental philosophy and offer suggestions for best research practices going forward.
Replicability is widely taken to ground the epistemic authority of science. However, in recent years, important published findings in the social, behavioral, and biomedical sciences have failed to replicate, suggesting that these fields are facing a “replicability crisis.” For philosophers, the crisis should not be taken as bad news but as an opportunity to do work on several fronts, including conceptual analysis, history and philosophy of science, research ethics, and social epistemology. This article introduces philosophers to these discussions. First, I discuss precedents and evidence for the crisis. Second, I discuss methodological, statistical, and social-structural factors that have contributed to the crisis. Third, I focus on the philosophical issues raised by the crisis. Finally, I discuss proposed solutions and highlight the gaps that philosophers could focus on.
The reward system of science is the priority rule. The first scientist making a new discovery is rewarded with prestige, while second runners get little or nothing. Michael Strevens, following Philip Kitcher, defends this reward system, arguing that it incentivizes an efficient division of cognitive labor. I argue that this assessment depends on strong implicit assumptions about the replicability of findings. I question these assumptions on the basis of metascientific evidence and argue that the priority rule systematically discourages replication. My analysis leads us to qualify Kitcher and Strevens’s contention that a priority-based reward system is normatively desirable for science.
Advocates of the self-corrective thesis argue that scientific method will refute false theories and find closer approximations to the truth in the long run. I discuss a contemporary interpretation of this thesis in terms of frequentist statistics in the context of the behavioral sciences. First, I identify experimental replications and systematic aggregation of evidence (meta-analysis) as the self-corrective mechanism. Then, I present a computer simulation study of scientific communities that implement this mechanism to argue that frequentist statistics may or may not converge upon a correct estimate, depending on the social structure of the community that uses it. Based on this study, I argue that methodological explanations of the “replicability crisis” in psychology are limited and propose an alternative explanation in terms of biases. Finally, I conclude by suggesting that scientific self-correction should be understood as an interaction effect between inference methods and social structures.
The experimental interventions that provide evidence of causal relations are notably similar to those that provide evidence of constitutive relevance relations. In the first two sections, I show that this similarity creates a tension: there is an inconsistent triad between Woodward’s popular interventionist theory of causation, Craver’s mutual manipulability account of constitutive relevance in mechanisms, and a variety of arguments for the incoherence of inter-level causation. I argue for an interpretation of the views in which the tension is merely apparent. I propose to explain inter-level relations without inter-level causation by appealing to the notion of fat-handed interventions, and I offer an argument against inter-level causation that dissolves the problem.
The finding that intuitions about the reference of proper names vary cross-culturally was one of the early milestones in experimental philosophy. Many follow-up studies investigated the scope and magnitude of such cross-cultural effects, but our paper provides the first systematic meta-analysis of studies replicating this finding. In the light of our results, we assess the existence and significance of cross-cultural effects for intuitions about the reference of proper names.
The enduring replication crisis in many scientific disciplines casts doubt on the ability of science to estimate effect sizes accurately, and in a wider sense, to self-correct its findings and to produce reliable knowledge. We investigate the merits of a particular countermeasure, replacing null hypothesis significance testing with Bayesian inference, in the context of the meta-analytic aggregation of effect sizes. In particular, we elaborate on the advantages of this Bayesian reform proposal under conditions of publication bias and other methodological imperfections that are typical of experimental research in the behavioral sciences. Moving to Bayesian statistics would not solve the replication crisis single-handedly. However, the move would eliminate important sources of effect size overestimation for the conditions we study.
Scientists, for the most part, want to get it right. However, the social structures that govern their work undermine that aim, and this leads to nonreplicable findings in many fields. Because the social structure of science is a decentralized system, it is difficult to intervene. In this article, I discuss how we might do so, focusing on self-corrective-labor schemes. First, I argue that we need to implement a scheme that makes replication work outcome independent, systematic, and sustainable. Second, I use these three criteria to evaluate extant proposals, which place the responsibility for replication on original researchers, consumers of their research, students, or many labs. Third, on the basis of a philosophical analysis of the reward system of science and the benefits of the division of cognitive labor, I propose a scheme that satisfies the criteria better: the professional scheme. This scheme has two main components. First, the scientific community is organized into two groups: discovery researchers, who produce new findings, and confirmation researchers, whose primary function is to do confirmation work. Second, a distinct reward system is established for confirmation researchers so that their career advancement is separated from whether they obtain positive experimental results.
Envy is pervasive in academia. What are its epistemic effects? I present a characterization of envy that captures some of its essential features according to the philosophical literature. I use this characterization to illustrate a classic argument that views envy as collectively disadvantageous. Then, based on insights from the social epistemology of science, I evaluate this argument in the context of academic research. I argue that given the nature of epistemic goods, the best strategies available to the envious academic typically lead to collective epistemic benefits. I conclude by presenting a challenge for the design of epistemic institutions: it is difficult to restructure institutions to reduce envy without severe epistemic drawbacks.
It is common in psychiatry and other sciences to describe an individual or a type of individual in terms of its disposition to manifest specific effects in a particular range of circumstances. According to one understanding, dispositions are statistical regularities of an individual or type of individual in specific circumstances. According to another understanding, dispositions are properties of individuals in virtue of which such regularities hold. This entry considers a number of ways of making each of these senses of disposition more precise while discussing a number of dangers lurking in careless use of the concept of a disposition.