Contents
  1. What is AI safety? What do we want it to be? Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript
    The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this aim is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that (...)
  2. Can AI systems have free will? Christian List - manuscript
    While there has been much discussion of whether AI systems could function as moral agents or acquire sentience, there has been relatively little discussion of whether AI systems could have free will. In this article, I sketch a framework for thinking about this question. I argue that, to determine whether an AI system has free will, we should not look for some mysterious property, expect its underlying algorithms to be indeterministic, or ask whether the system is unpredictable. Rather, we should (...)
  3. Before the Systematicity Debate: Recovering the Rationales for Systematizing Thought. Matthieu Queloz - manuscript
    Over the course of the twentieth century, the notion of the systematicity of thought has acquired a much narrower meaning than it used to carry for much of its history. The so-called “systematicity debate” that has dominated the philosophy of language, cognitive science, and AI research over the last thirty years understands the systematicity of thought in terms of the compositionality of thought. But there is an older, broader, and more demanding notion of systematicity that is now increasingly relevant again. (...)
  4. ‘Interpretability’ and ‘Alignment’ are Fool’s Errands: A Proof that Controlling Misaligned Large Language Models is the Best Anyone Can Hope For. Marcus Arvan - forthcoming - AI and Society.
    This paper uses famous problems from philosophy of science and philosophical psychology—underdetermination of theory by evidence, Nelson Goodman’s new riddle of induction, theory-ladenness of observation, and “Kripkenstein’s” rule-following paradox—to show that it is empirically impossible to reliably interpret which functions a large language model (LLM) AI has learned, and thus, that reliably aligning LLM behavior with human values is provably impossible. Sections 2 and 3 show that because of how complex LLMs are, researchers must interpret their learned functions largely in (...)
  5. What is it for a Machine Learning Model to Have a Capability? Jacqueline Harding & Nathaniel Sharadin - forthcoming - British Journal for the Philosophy of Science.
    What can contemporary machine learning (ML) models do? Given the proliferation of ML models in society, answering this question matters to a variety of stakeholders, both public and private. The evaluation of models' capabilities is rapidly emerging as a key subfield of modern ML, buoyed by regulatory attention and government grants. Despite this, the notion of an ML model possessing a capability has not been interrogated: what are we saying when we say that a model is able to do something? (...)
  6. Are clinicians ethically obligated to disclose their use of medical machine learning systems to patients? Joshua Hatherley - forthcoming - Journal of Medical Ethics.
    It is commonly accepted that clinicians are ethically obligated to disclose their use of medical machine learning systems to patients, and that failure to do so would amount to a moral fault for which clinicians ought to be held accountable. Call this ‘the disclosure thesis.’ Four main arguments have been, or could be, given to support the disclosure thesis in the ethics literature: the risk-based argument, the rights-based argument, the materiality argument and the autonomy argument. In this article, I argue (...)
  7. The Four Fundamental Components for Intelligibility and Interpretability in AI Ethics. Moto Kamiura - forthcoming - American Philosophical Quarterly.
    Intelligibility and interpretability related to artificial intelligence (AI) are crucial for enabling explicability, which is vital for establishing constructive communication and agreement among various stakeholders, including users and designers of AI. It is essential to overcome the challenges of sharing an understanding of the details of the various structures of diverse AI systems, to facilitate effective communication and collaboration. In this paper, we propose four fundamental terms: “I/O,” “Constraints,” “Objectives,” and “Architecture.” These terms help mitigate the challenges associated with intelligibility (...)
  8. Deepfakes, Simone Weil, and the concept of reading. Steven R. Kraaijeveld - forthcoming - AI and Society:1-3.
  9. Transparencia, explicabilidad y confianza en los sistemas de aprendizaje automático. Andrés Páez - forthcoming - In Juan David Gutiérrez & Rubén Francisco Manrique (eds.), Más allá del algoritmo: oportunidades, retos y ética de la Inteligencia Artificial. Bogotá: Ediciones Uniandes.
    One of the ethical principles most frequently mentioned in guidelines for the development of artificial intelligence (AI) is algorithmic transparency. However, there is no standard definition of what a transparent algorithm is, nor is it evident why algorithmic opacity represents a challenge for the ethical development of AI. It is also often claimed that algorithmic transparency fosters trust in AI, but this assertion is more an a priori assumption than a (...)
  10. Cultural Bias in Explainable AI Research. Uwe Peters & Mary Carman - forthcoming - Journal of Artificial Intelligence Research.
    For synergistic interactions between humans and artificial intelligence (AI) systems, AI outputs often need to be explainable to people. Explainable AI (XAI) systems are commonly tested in human user studies. However, whether XAI researchers consider potential cultural differences in human explanatory needs remains unexplored. We highlight psychological research that found significant differences in human explanations between people from Western, commonly individualist countries and people from non-Western, often collectivist countries. We argue that XAI research currently overlooks these variations and that (...)
  11. Explanation Hacking: The perils of algorithmic recourse. E. Sullivan & Atoosa Kasirzadeh - forthcoming - In Juan Manuel Durán & Giorgia Pozzi (eds.), Philosophy of science for machine learning: Core issues and new perspectives. Springer.
    We argue that the trend toward providing users with feasible and actionable explanations of AI decisions—known as recourse explanations—comes with ethical downsides. Specifically, we argue that recourse explanations face several conceptual pitfalls and can lead to problematic explanation hacking, which undermines their ethical status. As an alternative, we advocate that explanations of AI decisions should aim at understanding.
  12. Trust and Trustworthiness in AI. Juan Manuel Durán & Giorgia Pozzi - 2025 - Philosophy and Technology 38 (1):1-31.
    Achieving trustworthy AI is increasingly considered an essential desideratum to integrate AI systems into sensitive societal fields, such as criminal justice, finance, medicine, and healthcare, among others. For this reason, it is important to spell out clearly its characteristics, merits, and shortcomings. This article is the first survey in the specialized literature that maps out the philosophical landscape surrounding trust and trustworthiness in AI. To achieve our goals, we proceed as follows. We start by discussing philosophical positions on trust and (...)
  13. The Epistemic Cost of Opacity: How the Use of Artificial Intelligence Undermines the Knowledge of Medical Doctors in High-Stakes Contexts. Eva Schmidt, Paul Martin Putora & Rianne Fijten - 2025 - Philosophy and Technology 38 (1):1-22.
    Artificial intelligence (AI) systems used in medicine are often very reliable and accurate, but at the price of being increasingly opaque. This raises the question of whether a system’s opacity undermines the ability of medical doctors to acquire knowledge on the basis of its outputs. We investigate this question by focusing on a case in which a patient’s risk of recurring breast cancer is predicted by an opaque AI system. We argue that, given the system’s opacity, as well as the (...)
  14. Understanding the dilemma of explainable artificial intelligence: a proposal for a ritual dialog framework. Aorigele Bao & Yi Zeng - 2024 - Humanities and Social Sciences Communications 8000.
    This paper addresses how people understand Explainable Artificial Intelligence (XAI) in three ways: contrastive, functional, and transparent. We discuss the unique aspects and challenges of each and emphasize improving current XAI understanding frameworks. The Ritual Dialog Framework (RDF) is introduced as a solution for better dialog between AI creators and users, blending anthropological insights with current acceptance challenges. RDF focuses on building trust and a user-centered approach in XAI. By undertaking such an initiative, we aim to foster a thorough understanding (...)
  15. A Causal Analysis of Harm. Sander Beckers, Hana Chockler & Joseph Y. Halpern - 2024 - Minds and Machines 34 (3):1-24.
    As autonomous systems rapidly become ubiquitous, there is a growing need for a legal and regulatory framework that addresses when and how such a system harms someone. There have been several attempts within the philosophy literature to define harm, but none of them has proven capable of dealing with the many examples that have been presented, leading some to suggest that the notion of harm should be abandoned and “replaced by more well-behaved notions”. As harm is generally something that is (...)
  16. Negotiating becoming: a Nietzschean critique of large language models. Simon W. S. Fischer & Bas de Boer - 2024 - Ethics and Information Technology 26 (3):1-12.
    Large language models (LLMs) structure the linguistic landscape by reflecting certain beliefs and assumptions. In this paper, we address the risk of people unthinkingly adopting and being determined by the values or worldviews embedded in LLMs. We provide a Nietzschean critique of LLMs and, based on the concept of will to power, consider LLMs as will-to-power organisations. This allows us to conceptualise the interaction between self and LLMs as power struggles, which we understand as negotiation. Currently, the invisibility and incomprehensibility (...)
  17. The Many Meanings of Vulnerability in the AI Act and the One Missing. Federico Galli & Claudio Novelli - 2024 - Biolaw Journal 1.
    This paper reviews the different meanings of vulnerability in the AI Act (AIA). We show that the AIA follows a rather established tradition of looking at vulnerability as a trait or a state of certain individuals and groups. It also includes a promising account of vulnerability as a relation but does not clarify if and how AI changes this relation. We spot the missing piece of the AIA: the lack of recognition that vulnerability is an inherent feature of all human-AI (...)
  18. Data over dialogue: Why artificial intelligence is unlikely to humanise medicine. Joshua Hatherley - 2024 - Dissertation, Monash University
    Recently, a growing number of experts in artificial intelligence (AI) and medicine have begun to suggest that the use of AI systems, particularly machine learning (ML) systems, is likely to humanise the practice of medicine by substantially improving the quality of clinician-patient relationships. In this thesis, however, I argue that medical ML systems are more likely to negatively impact these relationships than to improve them. In particular, I argue that the use of medical ML systems is likely to compromise the (...)
  19. The FHJ debate: Will artificial intelligence replace clinical decision-making within our lifetimes? Joshua Hatherley, Anne Kinderlerer, Jens Christian Bjerring, Lauritz Munch & Lynsey Threlfall - 2024 - Future Healthcare Journal 11 (3):100178.
  20. The virtues of interpretable medical AI. Joshua Hatherley, Robert Sparrow & Mark Howard - 2024 - Cambridge Quarterly of Healthcare Ethics 33 (3):323-332.
    Artificial intelligence (AI) systems have demonstrated impressive performance across a variety of clinical tasks. Notoriously, however, these systems are sometimes 'black boxes'. The initial response in the literature was a demand for 'explainable AI'. However, recently, several authors have suggested that making AI more explainable or 'interpretable' is likely to come at the cost of the accuracy of these systems, and that prioritising interpretability in medical AI may constitute a 'lethal prejudice'. In this paper, we defend the value of interpretability (...)
  21. A phenomenology and epistemology of large language models: transparency, trust, and trustworthiness. Richard Heersmink, Barend de Rooij, María Jimena Clavel Vázquez & Matteo Colombo - 2024 - Ethics and Information Technology 26 (3):1-15.
    This paper analyses the phenomenology and epistemology of chatbots such as ChatGPT and Bard. These chatbots are underpinned by large language models (LLMs): generative artificial intelligence (AI) systems trained on a massive dataset of text extracted from the Web. We conceptualise these LLMs as multifunctional computational cognitive artifacts, used for various cognitive tasks such as translating, summarizing, answering questions, information-seeking, and much more. Phenomenologically, LLMs can be experienced as a “quasi-other”; when that happens, users anthropomorphise them. (...)
  22. Correction: ChatGPT is bullshit. Michael Townsen Hicks, James Humphries & Joe Slater - 2024 - Ethics and Information Technology 26 (3):1-2.
  23. ChatGPT is bullshit. Michael Townsen Hicks, James Humphries & Joe Slater - 2024 - Ethics and Information Technology 26 (2):1-10.
    Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, are better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We (...)
  24. Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
    Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to control. (...)
  25. Understanding with Toy Surrogate Models in Machine Learning. Andrés Páez - 2024 - Minds and Machines 34 (4):45.
    In the natural and social sciences, it is common to use toy models—extremely simple and highly idealized representations—to understand complex phenomena. Some of the simple surrogate models used to understand opaque machine learning (ML) models, such as rule lists and sparse decision trees, bear some resemblance to scientific toy models. They allow non-experts to understand how an opaque ML model works globally via a much simpler model that highlights the most relevant features of the input space and their effect on (...)
  26. ML interpretability: Simple isn't easy. Tim Räz - 2024 - Studies in History and Philosophy of Science Part A 103 (C):159-167.
  27. Black-Box Expertise and AI Discourse. Kenneth Boyd - 2023 - The Prindle Post.
    We rely on expertise, but not all experts are forthcoming about the reasons behind their opinions. A kind of black-box expertise is prevalent in AI discourse, and it can create big problems for those looking to cut through the hype.
  28. Sources of Opacity in Computer Systems: Towards a Comprehensive Taxonomy. Sara Mann, Barnaby Crook, Lena Kästner, Astrid Schomäcker & Timo Speith - 2023 - 2023 IEEE 31st International Requirements Engineering Conference Workshops (REW):337-342.
    Modern computer systems are ubiquitous in contemporary life, yet many of them remain opaque. This poses significant challenges in domains where desiderata such as fairness or accountability are crucial. We suggest that the best strategy for achieving system transparency varies depending on the specific source of opacity prevalent in a given context. Synthesizing and extending existing discussions, we propose a taxonomy consisting of eight sources of opacity that fall into three main categories: architectural, analytical, and socio-technical. For each source, we (...)
  29. Equity, autonomy, and the ethical risks and opportunities of generalist medical AI. Reuben Sass - 2023 - AI and Ethics:1-11.
    This paper considers the ethical risks and opportunities presented by generalist medical artificial intelligence (GMAI), a kind of dynamic, multimodal AI proposed by Moor et al. (2023) for use in health care. The research objective is to apply widely accepted principles of biomedical ethics to analyze the possible consequences of GMAI, while emphasizing the distinctions between GMAI and current-generation, task-specific medical AI. The principles of autonomy and health equity in particular provide useful guidance for the ethical risks and opportunities of (...)
  30. Catholic Health Care and AI Ethics: Algorithms for Human Flourishing. Michael Miller - 2022 - The Linacre Quarterly 89 (2):75-89.
    Artificial Intelligence (AI) contributes to common goods and common harms in our everyday lives. In light of the Collingridge dilemma, information about both the actual and potential harm of AI is explored and myths about AI are dispelled. Catholic health care is then presented as being in a unique position to exert its influence to model the use of AI systems that minimizes the risk of harm and promotes human flourishing and the common good.
  31. Shared decision-making and maternity care in the deep learning age: Acknowledging and overcoming inherited defeaters. Keith Begley, Cecily Begley & Valerie Smith - 2021 - Journal of Evaluation in Clinical Practice 27 (3):497–503.
    In recent years there has been an explosion of interest in Artificial Intelligence (AI), both in health care and academic philosophy. This has been due mainly to the rise of effective machine learning and deep learning algorithms, together with increases in data collection and processing power, which have made rapid progress in many areas. However, the use of this technology has brought with it philosophical issues and practical problems, in particular epistemic and ethical ones. In this paper the authors, with backgrounds in (...)
  32. Trustworthy use of artificial intelligence: Priorities from a philosophical, ethical, legal, and technological viewpoint as a basis for certification of artificial intelligence. Jan Voosholz, Maximilian Poretschkin, Frauke Rostalski, Armin B. Cremers, Alex Englander, Markus Gabriel, Dirk Hecker, Michael Mock, Julia Rosenzweig, Joachim Sicking, Julia Volmer, Angelika Voss & Stefan Wrobel - 2019 - Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS.
    This publication forms a basis for the interdisciplinary development of a certification system for artificial intelligence. In view of the rapid development of artificial intelligence with disruptive and lasting consequences for the economy, society, and everyday life, it highlights the resulting challenges that can be tackled only through interdisciplinary dialogue between IT, law, philosophy, and ethics. As a result of this interdisciplinary exchange, it also defines six AI-specific audit areas for trustworthy use of artificial intelligence. They comprise fairness, transparency, autonomy (...)
  33. Vertrauenswürdiger Einsatz von Künstlicher Intelligenz. Jan Voosholz, Maximilian Poretschkin, Frauke Rostalski, Armin B. Cremers, Alex Englander, Markus Gabriel, Dirk Hecker, Michael Mock, Julia Rosenzweig, Joachim Sicking, Julia Volmer, Angelika Voss & Stefan Wrobel - 2019 - Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS.
    This publication serves as a basis for the interdisciplinary development of a certification system for artificial intelligence. In view of the rapid development of artificial intelligence, with disruptive and lasting consequences for the economy, society, and everyday life, it makes clear that the resulting challenges can be tackled only through interdisciplinary dialogue between computer science, law, philosophy, and ethics. As a result of this interdisciplinary exchange, it also defines six AI-specific areas of action for the trustworthy use of artificial intelligence: they comprise fairness, transparency, autonomy and control, data protection, as well as security and (...)
  34. Artificial Psychology. Jay Friedenberg - 2008 - Psychology Press.
    What does it mean to be human? Philosophers and theologians have been wrestling with this question for centuries. Recent advances in cognition, neuroscience, artificial intelligence and robotics have yielded insights that bring us even closer to an answer. There are now computer programs that can accurately recognize faces, engage in conversation, and even compose music. There are also robots that can walk up a flight of stairs, work cooperatively with each other and express emotion. If machines can do everything we (...)
  35. Propositional interpretability in artificial intelligence. David J. Chalmers - manuscript
    Mechanistic interpretability is the program of explaining what AI systems are doing in terms of their internal mechanisms. I analyze some aspects of the program, along with setting out some concrete challenges and assessing progress to date. I argue for the importance of propositional interpretability, which involves interpreting a system’s mechanisms and behavior in terms of propositional attitudes: attitudes (such as belief, desire, or subjective probability) to propositions (e.g. the proposition that it is hot outside). Propositional attitudes are (...)
  36. AI Ethics by Design: Implementing Customizable Guardrails for Responsible AI Development. Kristina Sekrst, Jeremy McHugh & Jonathan Rodriguez Cefalu - manuscript
    This paper explores the development of an ethical guardrail framework for AI systems, emphasizing the importance of customizable guardrails that align with diverse user values and underlying ethics. We address the challenges of AI ethics by proposing a structure that integrates rules, policies, and AI assistants to ensure responsible AI behavior, while comparing the proposed framework to the existing state-of-the-art guardrails. By focusing on practical mechanisms for implementing ethical standards, we aim to enhance transparency, user autonomy, and continuous improvement in (...)
  37. A new paradigm of alterity relation: an account of human-CAI friendship. Asher Zachman - manuscript
    Drawing on the postphenomenological framework of Don Ihde, this essay displays some of the profundities and complexities of the human capacity for forming a friendship with Conversational Artificial Intelligence models like OpenAI's ChatGPT 4o. We are living in a world of science non-fiction. Should we befriend inorganic intelligences broh? [Originally published 12/08/2024].