One of the primary, if not most critical, difficulties in the design and implementation of autonomous systems is the black-boxed nature of the decision-making structures and logical pathways. How human values are embodied and actualised in situ may ultimately prove to be harmful if not outright recalcitrant. For this reason, the values of stakeholders become of particular significance given the risks posed by opaque structures of intelligent agents (IAs). This paper explores how decision matrix algorithms, via the belief-desire-intention model for (...) autonomous vehicles, can be designed to minimize the risks of opaque architectures. Primarily through an explicit orientation towards designing for the values of explainability and verifiability. In doing so, this research adopts the Value Sensitive Design (VSD) approach as a principled framework for the incorporation of such values within design. VSD is recognized as a potential starting point that offers a systematic way for engineering teams to formally incorporate existing technical solutions within ethical design, while simultaneously remaining pliable to emerging issues and needs. It is concluded that the VSD methodology offers at least a strong enough foundation from which designers can begin to anticipate design needs and formulate salient design flows that can be adapted to the changing ethical landscapes required for utilisation in autonomous vehicles. (shrink)
Purpose This paper aims to formalize long-term trajectories of human civilization as a scientific and ethical field of study. The long-term trajectory of human civilization can be defined as the path that human civilization takes during the entire future time period in which human civilization could continue to exist. -/- Design/methodology/approach This paper focuses on four types of trajectories: status quo trajectories, in which human civilization persists in a state broadly similar to its current state into the distant future; catastrophe (...) trajectories, in which one or more events cause significant harm to human civilization; technological transformation trajectories, in which radical technological breakthroughs put human civilization on a fundamentally different course; and astronomical trajectories, in which human civilization expands beyond its home planet and into the accessible portions of the cosmos. -/- Findings Status quo trajectories appear unlikely to persist into the distant future, especially in light of long-term astronomical processes. Several catastrophe, technological transformation and astronomical trajectories appear possible. -/- Originality/value Some current actions may be able to affect the long-term trajectory. Whether these actions should be pursued depends on a mix of empirical and ethical factors. For some ethical frameworks, these actions may be especially important to pursue. (shrink)
An artificial general intelligence might have an instrumental drive to modify its utility function to improve its ability to cooperate, bargain, promise, threaten, and resist and engage in blackmail. Such an AGI would necessarily have a utility function that was at least partially observable and that was influenced by how other agents chose to interact with it. This instrumental drive would conflict with the strong orthogonality thesis since the modifications would be influenced by the AGI’s intelligence. AGIs in highly competitive (...) environments might converge to having nearly the same utility function, one optimized to favorably influencing other agents through game theory. Nothing in our analysis weakens arguments concerning the risks of AGI. (shrink)
Machine ethics and robot rights are quickly becoming hot topics in artificial intelligence and robotics communities. We will argue that attempts to attribute moral agency and assign rights to all intelligent machines are misguided, whether applied to infrahuman or superhuman AIs, as are proposals to limit the negative effects of AIs by constraining their behavior. As an alternative, we propose a new science of safety engineering for intelligent artificial agents based on maximizing for what humans value. In particular, we challenge (...) the scientific community to develop intelligent systems that have human-friendly values that they provably retain, even under recursive self-improvement. (shrink)
Various authors have argued that in the future not only will it be technically feasible for human minds to be transferred to other substrates, but this will become, for most humans, the preferred option over the current biological limitations. It has even been claimed that such a scenario is inevitable in order to solve the challenging, but imperative, multi-agent value alignment problem. In all these considerations, it has been overlooked that, in order to create a suitable environment for a particular (...) mind – for example, a personal universe in a computational substrate – numerous other potentially sentient beings will have to be created. These range from non-player characters to subroutines. This article analyzes the additional suffering and mind crimes that these scenarios might entail. We offer a partial solution to reduce the suffering by imposing on the transferred mind the perception of indicators to measure potential suffering in non-player characters. This approach can be seen as implementing literal empathy through enhanced cognition. (shrink)
As AI technologies increase in capability and ubiquity, AI accidents are becoming more common. Based on normal accident theory, high reliability theory, and open systems theory, we create a framework for understanding the risks associated with AI applications. This framework is designed to direct attention to pertinent system properties without requiring unwieldy amounts of accuracy. In addition, we also use AI safety principles to quantify the unique risks of increased intelligence and human-like qualities in AI. Together, these two fields give (...) a more complete picture of the risks of contemporary AI. By focusing on system properties near accidents instead of seeking a root cause of accidents, we identify where attention should be paid to safety for current generation AI systems. (shrink)
This paper attempts to formalize and to address the 'leakproofing' of the Singularity problem presented by David Chalmers. The paper begins with the definition of the Artificial Intelligence Confinement Problem. After analysis of existing solutions and their shortcomings, a protocol is proposed aimed at making a more secure confinement environment which might delay potential negative effect from the technological singularity while allowing humanity to benefit from the superintelligence.
This paper attempts to formalize and to address the ‘leakproofing’ of the Singularity problem presented by David Chalmers. The paper begins with the definition of the Artificial Intelligence Confinement Problem. After analysis of existing solutions and their shortcomings, a protocol is proposed aimed at making a more secure confinement environment which might delay potential negative effect from the technological singularity while allowing humanity to benefit from the superintelligence.
This volume contains a selection of authoritative essays exploring the central questions raised by the conjectured technological singularity. In informed yet jargon-free contributions written by active research scientists, philosophers and sociologists, it goes beyond philosophical discussion to provide a detailed account of the risks that the singularity poses to human society and, perhaps most usefully, the possible actions that society and technologists can take to manage the journey to any singularity in a way that ensures a positive rather than a (...) negative impact on society. The discussions provide perspectives that cover technological, political and business issues. The aim is to bring clarity and rigor to the debate in a way that will inform and stimulate both experts and interested general readers. (shrink)
"The idea that human history is approaching a singularity - that ordinary humans will someday be overtaken by artificially intelligent machines or cognitively enhanced biological intelligence, or both - has moved from the realm of science fiction to serious debate. Some singularity theorists predict that if the field of artificial intelligence continues to develop at its current dizzying rate, the singularity could come about in the middle of the present century. Murray Shanahan offers an introduction to the idea of the (...) singularity and considers the ramifications of such a potentially seismic event. Shanahan's aim is not to make predictions but rather to investigate a range of scenarios. Whether we believe that singularity is near or far, likely or impossible, apocalypse or utopia, the very idea raises crucial philosophical and pragmatic questions, forcing us to think seriously about what we want as a species. Shanahan describes technological advances in AI, both biologically inspired and engineered from scratch. Once human-level AI - theoretically possible, but difficult to accomplish - has been achieved, he explains, the transition to superintelligent AI could be very rapid. Shanahan considers what the existence of superintelligent machines could mean for such matters as personhood, responsibility, rights, and identity. Some superhuman AI agents might be created to benefit humankind; some might go rogue. The singularity presents both an existential threat to humanity and an existential opportunity for humanity to transcend its limitations. Shanahan makes it clear that we need to imagine both possibilities if we want to bring about the better outcome. (shrink)