Significance Tests: Their Logic and Early History
Dissertation, Stanford University (1981)
Abstract
Significance tests are the mainstay of much experimental analysis. They formalize reasoning of the following sort: assuming hypothesis h, evidence e is improbable; e is observed; therefore, reject h. Most philosophical work on induction concerns either a simpler form of reasoning, such as Reichenbach's straight rule, or a more powerful form, such as Neyman-Pearson confidence intervals, Bayesian posterior densities, or Fisher's fiducial probabilities. By focusing on this simple, common method of inductive inference, many of the subtleties of the problem of scientific induction become apparent.

Three features of the logic of significance tests are isolated. The test statistic must single out the correct aspect of the evidence for inference. The stringency measure must correctly formalize the improbability of the evidence. Composite hypotheses pose additional problems, since they do not stipulate exact probabilities for all outcomes.

Until Karl Pearson's 1895 paper on skew frequency curves, statisticians chiefly used the Normal curve. With many different frequency curves available, the question of goodness of fit became crucial. Pearson proposed the Chi-Squared statistic as a measure of fit. He justified it by its relation to correlation, a category which replaced causation within Pearson's positivist philosophy. In fact, Chi-Squared works well only when correlation models adequately describe the phenomena of concern.

Levels of significance measure test stringency as the probability of the observed value of the test statistic falling in the tails of the test statistic's density. This practice stems from the theory of errors of observation. Probable error, defined in terms of tail areas under the Normal density, measures the precision of a series of observations. Early significance tests using the Normal density measured stringency in multiples of the probable error.
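The tail-area logic described above can be sketched in a few lines of Python. This is an illustrative sketch, not material from the dissertation: it computes Pearson's goodness-of-fit statistic for invented die-rolling data, measures stringency as the upper tail area of the chi-squared density (via a standard series for the regularized incomplete gamma function), and rejects when that area falls below a conventional level.

```python
import math

def chi_squared_stat(observed, expected):
    # Pearson's goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def regularized_lower_gamma(a, x, terms=200):
    # Series expansion P(a, x) = x^a e^{-x} / Gamma(a) * sum_n x^n / (a(a+1)...(a+n)).
    total, term = 0.0, 1.0 / a
    for n in range(terms):
        total += term
        term *= x / (a + n + 1)
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_tail_area(x, df):
    # Upper tail area P(X >= x) for a chi-squared variable with df degrees of freedom.
    return 1.0 - regularized_lower_gamma(df / 2.0, x / 2.0)

# Hypothetical data: a die rolled 60 times; h = "the die is fair".
observed = [5, 8, 9, 8, 10, 20]
expected = [10] * 6
stat = chi_squared_stat(observed, expected)   # large when the fit is poor
p = chi2_tail_area(stat, df=5)                # stringency measured as a tail area
reject = p < 0.05                             # rejection iff the statistic lies in the tails
```

The rejection rule makes concrete the practice the abstract questions: improbability of the evidence is identified with the area beyond the observed value, not with the probability of the observed value itself.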
Nowadays many densities are used in significance testing, but stringency is still measured by tail areas: rejection occurs if and only if the test statistic takes a value in the tails. No completely satisfactory analysis of this practice now exists.

In 1904 Pearson extended Chi-Squared to test the composite hypothesis of statistical independence. This extension yielded inferences that conflicted with those based on other tests of independence. In 1922, by means of his new concept of degrees of freedom, R. A. Fisher proposed a solution. Degrees of freedom measure the informativeness of an hypothesis. From 1922 onward, significance tests test not only the putative truth of an hypothesis but also its informativeness. Fisher's solution violates a rule of implication: if h implies i, then evidence sufficient to reject i is sufficient to reject h. This rule is widely endorsed by philosophers; indeed, Hempel calls it a condition of adequacy for any theory of confirmation. But if h implies i, h is more informative than i. Consequently, if we test for informativeness, then even when h implies i, evidence sufficient to reject i need not be sufficient to reject the more informative h. Since the introduction of degrees of freedom, significance tests have checked both informativeness and truth. This examination of significance testing reveals aspects of induction missed by analyses from first principles.
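Fisher's degrees-of-freedom correction for the independence test can be illustrated with a short sketch, again not taken from the dissertation; the 2x2 table is invented. Because the composite hypothesis of independence leaves the marginal probabilities unspecified, they are estimated from the data, and each estimated parameter costs a degree of freedom: an r-by-c table has (r - 1)(c - 1) degrees of freedom rather than the rc - 1 Pearson's 1904 extension effectively assumed.

```python
def independence_test(table):
    # table: list of rows of observed counts in an r-by-c contingency table.
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    # Expected counts under independence, with marginals estimated from the data:
    # E[i][j] = (row total i) * (column total j) / grand total.
    stat = sum(
        (table[i][j] - rows[i] * cols[j] / total) ** 2 / (rows[i] * cols[j] / total)
        for i in range(len(rows))
        for j in range(len(cols))
    )
    # Fisher's 1922 correction: fitting the marginals reduces the degrees of
    # freedom from rc - 1 to (r - 1)(c - 1); for a 2x2 table, from 3 to 1.
    df = (len(rows) - 1) * (len(cols) - 1)
    return stat, df

stat, df = independence_test([[10, 20], [20, 10]])
```

The reduced df shifts the reference density against which the statistic's tail area is computed, which is how the correction changes the resulting inferences.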