Severe testing as a basic concept in a Neyman–Pearson philosophy of induction

Deborah G. Mayo; Aris Spanos


British Journal for the Philosophy of Science 57 (2):323-357 (2006)

Abstract

Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm (type I and II errors, significance levels, power, confidence levels), they have been the subject of philosophical controversy and debate for over 60 years. Both current and long-standing problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test's (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities is to ensure that only statistical hypotheses that have passed severe or probative tests are inferred from the data. The severity criterion supplies a meta-statistical principle for evaluating proposed statistical inferences, avoiding classic fallacies from tests that are overly sensitive, as well as those not sensitive enough to particular errors and discrepancies.

1 Introduction and overview
  1.1 Behavioristic and inferential rationales for Neyman–Pearson (N–P) tests
  1.2 Severity rationale: induction as severe testing
  1.3 Severity as a meta-statistical concept: three required restrictions on the N–P paradigm
2 Error statistical tests from the severity perspective
  2.1 N–P test T(α): type I, II error probabilities and power
  2.2 Specifying test T(α) using p-values
3 Neyman's post-data use of power
  3.1 Neyman: does failure to reject H warrant confirming H?
4 Severe testing as a basic concept for an adequate post-data inference
  4.1 The severity interpretation of acceptance (SIA) for test T(α)
  4.2 The fallacy of acceptance (i.e., an insignificant difference): Ms Rosy
  4.3 Severity and power
5 Fallacy of rejection: statistical vs. substantive significance
  5.1 Taking a rejection of H0 as evidence for a substantive claim or theory
  5.2 A statistically significant difference from H0 may fail to indicate a substantively important magnitude
  5.3 Principle for the severity interpretation of a rejection (SIR)
  5.4 Comparing significant results with different sample sizes in T(α): large n problem
  5.5 General testing rules for T(α), using the severe testing concept
6 The severe testing concept and confidence intervals
  6.1 Dualities between one and two-sided intervals and tests
  6.2 Avoiding shortcomings of confidence intervals
7 Beyond the N–P paradigm: pure significance, and misspecification tests
8 Concluding comments: have we shown severity to be a basic concept in a N–P philosophy of induction?


Author Profiles

Aris Spanos

Virginia Tech

Deborah Mayo

Virginia Tech


DOI

10.1093/bjps/axl003


Analytics

Added to PP
2009-01-28

Downloads
358 (#54,134)

6 months
48 (#84,486)


Citations of this work

Pursuit and inquisitive reasons. Will Fleisher - 2022 - Studies in History and Philosophy of Science Part A 94 (C):17-30.

Conceptual challenges for interpretable machine learning. David S. Watson - 2022 - Synthese 200 (2):1-33.

What type of Type I error? Contrasting the Neyman–Pearson and Fisherian approaches in the context of exact and direct replications. Mark Rubin - 2021 - Synthese 198 (6):5809-5834.

The objectivity of Subjective Bayesianism. Jan Sprenger - 2018 - European Journal for Philosophy of Science 8 (3):539-558.

Eight journals over eight decades: a computational topic-modeling approach to contemporary philosophy of science. Christophe Malaterre, Francis Lareau, Davide Pulizzotto & Jonathan St-Onge - 2020 - Synthese 199 (1-2):2883-2923.


