The Devil in the Data: Machine Learning & the Theory-Free Ideal

Abstract

Machine learning (ML) refers to a class of computer-facilitated methods of statistical modelling. ML modelling techniques are now being widely adopted across the sciences. A number of outspoken representatives of the general public, computer science, various scientific fields, and philosophy of science share the belief that ML will radically disrupt scientific practice or the variety of epistemic outputs science is capable of producing. This belief is held, at least in part, because its adherents take ML to stand on a novel epistemic footing relative to the classical mathematical or statistical modelling approaches utilised in science. Namely, they take modelling with ML to be a “theory-free” enterprise, in the sense that it does not rest essentially on input from human conceptual grasp of the target phenomenon or on domain expertise. I take this view to arise from a further, more deeply entrenched belief that data is worldly and objective; i.e., data is viewed as recapitulating or representing with perfect fidelity the structure or properties of the natural systems from which it is sampled. Yet most contemporary philosophers of science accept, in one version or another, the thesis of the theory-ladenness or theory-mediation of observation or measurement. It follows that most philosophers of science (and, I will venture, many scientists) hold irreconcilable views on the nature of data, an internal tension which threatens the integrity of their appraisals of ML in science. Once the thesis of theory-ladenness is taken on board, and its implications for the nature of data are taken seriously, there is no reason to believe that ML differs fundamentally in its epistemic footing from established mathematical modelling approaches in the sciences.
I show that, in scientific projects which wield the tools of ML, usage and interpretation can differ from “standard practice” without this implying any difference in the epistemic or representational status of those tools.

Analytics

Added to PP
2023-06-03

Author's Profile

Mel Andrews
University of Cincinnati (PhD)
