Towards a Benchmark for Scientific Understanding in Humans and Machines

Minds and Machines 34 (1):1-16 (2024)
Abstract

Scientific understanding is a fundamental goal of science. However, there is currently no good way to measure the scientific understanding of agents, whether these are humans or Artificial Intelligence systems. Without a clear benchmark, it is challenging to evaluate and compare different levels of scientific understanding. In this paper, we propose a framework for creating a benchmark for scientific understanding, using tools from the philosophy of science. We adopt a behavioral conception of understanding, according to which genuine understanding should be recognized as an ability to perform certain tasks. We extend this notion of scientific understanding by considering a set of questions that gauge different levels of scientific understanding, covering information retrieval, the capability to arrange information to produce an explanation, and the ability to infer how things would be different under different circumstances. We suggest building a Scientific Understanding Benchmark (SUB), formed by a set of these tests, allowing for the evaluation and comparison of scientific understanding. Benchmarking plays a crucial role in establishing trust, ensuring quality control, and providing a basis for performance evaluation. By aligning machine and human scientific understanding, we can improve their utility, ultimately advancing scientific understanding and helping to discover new insights within machines.

Analytics

Added to PP
2024-04-26

Author's Profile

Henk W. de Regt
Radboud University
