Results for 'multiagent systems, robocup soccer, keepaway, reinforcement learning, reward design'

993 found
  1.
    An Experimental Study of Reward Design in Multiagent Continuing Tasks: The RoboCup Soccer Keepaway Task as an Example.Nobuyuki Tanaka & Sachiyo Arai - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21 (6):537-546.
    In this paper, we discuss guidelines for the reward design problem, which defines when and what amount of reward should be given to the agent(s), within the context of the reinforcement learning approach. We take keepaway soccer as a standard task of the multiagent domain, one which requires skilled teamwork. The difficulties of designing a reward for this task stem from the following features: i) since it belongs to the class of continuing tasks, which (...)
  2.
    Deep Reinforcement Learning for Vectored Thruster Autonomous Underwater Vehicle Control.Tao Liu, Yuli Hu & Hui Xu - 2021 - Complexity 2021:1-25.
    Autonomous underwater vehicles are widely used to accomplish various missions in the complex marine environment; the design of a control system for AUVs is particularly difficult due to the high nonlinearity, variations in hydrodynamic coefficients, and external force from ocean currents. In this paper, we propose a controller based on deep reinforcement learning in a simulation environment for studying the control performance of the vectored thruster AUV. RL is an important method of artificial intelligence that can learn behavior (...)
  3.
    Optimization of English Online Learning Dictionary System Based on Multiagent Architecture.Ying Wang - 2021 - Complexity 2021:1-10.
    As a universal language in the world, English has become a necessary language communication tool under the globalization of trade. An intelligent, efficient, and reasonable English language-assisted learning system helps to further improve the English ability of language learners. The English online learning dictionary, as an important query tool for English learners, is an important part of English online learning. This paper will optimize the design of an English online learning dictionary system based on multiagent architecture. Based on the hybrid (...) cooperative algorithm, this paper will improve on the disadvantages of the online English learning dictionary system and propose an appropriate dictionary application evaluation function. At the same time, an improved reinforcement learning algorithm is introduced into the corresponding English online learning dictionary navigation problem so as to improve the efficiency of the online English learning dictionary system. As a result, the English online learning dictionary becomes more intelligent and efficient. In this paper, the new online learning dictionary system optimization algorithm is proposed and compared with the traditional system algorithm. The experimental results show that the algorithm proposed in this paper solves the collaborative confusion problem of the English learning online dictionary to a certain extent and further solves the corresponding navigation problem so as to improve efficiency.
  4. Karlsruhe Brainstormers: A Reinforcement Learning Approach to Robotic Soccer. In P. Stone, T. Balch & G. Kraetzschmar (eds.), RoboCup 2000: Robot Soccer World Cup IV. [REVIEW]M. Riedmiller & A. Merke - 2001 - In P. Bouquet (ed.), Lecture Notes in Artificial Intelligence. Kluwer Academic Publishers.
  5.
    Iterative Learning Tracking Control of Nonlinear Multiagent Systems with Input Saturation.Bingyou Liu, Zhengzheng Zhang, Lichao Wang, Xing Li & Xiongfeng Deng - 2021 - Complexity 2021:1-13.
    A tracking control algorithm of nonlinear multiple agents with undirected communication is studied for each multiagent system affected by external interference and input saturation. A control design scheme combining iterative learning and adaptive control is proposed to perform parameter adaptive time-varying adjustment and prove the effectiveness of the control protocol by designing Lyapunov functions. Simulation results show that the high-precision tracking control problem of the nonlinear multiagent system based on adaptive iterative learning control can be well realized (...)
  6.
    Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.
    Can humans get arbitrarily capable reinforcement learning agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design (...)
  7.
    Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices.Pedro Juan Rivera Torres, Carlos Gershenson García, María Fernanda Sánchez Puig & Samir Kanaan Izquierdo - 2022 - Complexity 2022:1-15.
    The area of smart power grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power in a resilient grid, while managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities and novel methodologies to detect, classify, and isolate faults and failures and model and simulate processes with predictive algorithms and analytics. In this paper, we showcase the application of a complex-adaptive, self-organizing (...)
  8.
    Action control, forward models and expected rewards: representations in reinforcement learning.Jami Pekkanen, Jesse Kuokkanen, Otto Lappi & Anna-Mari Rusanen - 2021 - Synthese 199 (5-6):14017-14033.
    The fundamental cognitive problem for active organisms is to decide what to do next in a changing environment. In this article, we analyze motor and action control in computational models that utilize reinforcement learning (RL) algorithms. In reinforcement learning, action control is governed by an action selection policy that maximizes the expected future reward in light of a predictive world model. In this paper we argue that RL provides a way to explicate the so-called action-oriented views of (...)
  9.
    Reinforcement Learning-Based Collision Avoidance Guidance Algorithm for Fixed-Wing UAVs.Yu Zhao, Jifeng Guo, Chengchao Bai & Hongxing Zheng - 2021 - Complexity 2021:1-12.
    A deep reinforcement learning-based computational guidance method is presented, which is used to identify and resolve the problem of collision avoidance for a variable number of fixed-wing UAVs in limited airspace. The cooperative guidance process is first analyzed for multiple aircraft by formulating flight scenarios using multiagent Markov game theory and solving it by machine learning algorithm. Furthermore, a self-learning framework is established by using the actor-critic model, which is proposed to train collision avoidance decision-making neural networks. To (...)
  10.
    Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning.Xiaoyi Long, Zheng He & Zhongyuan Wang - 2021 - Complexity 2021:1-7.
    This paper suggests an online solution for the optimal tracking control of robotic systems based on a single critic neural network (NN)-based reinforcement learning method. To this end, we rewrite the robotic system model in state-space form, which facilitates the realization of optimal tracking control synthesis. To maintain the tracking response, a steady-state control is designed, and then an adaptive optimal tracking control is used to ensure that the tracking error achieves convergence in an optimal sense. (...)
  11.
    Online Supervised Learning with Distributed Features over Multiagent System.Xibin An, Bing He, Chen Hu & Bingqi Liu - 2020 - Complexity 2020:1-10.
    Most current online distributed machine learning algorithms have been studied in a data-parallel architecture among agents in networks. We study online distributed machine learning from a different perspective, where the features about the same samples are observed by multiple agents that wish to collaborate but do not exchange the raw data with each other. We propose a distributed feature online gradient descent algorithm and prove that local solution converges to the global minimizer with a sublinear rate O 2 T. Our (...)
  12.
    A Stable Distributed Neural Controller for Physically Coupled Networked Discrete-Time System via Online Reinforcement Learning.Jian Sun & Jie Li - 2018 - Complexity 2018:1-15.
    The large scale, time-varying nature, and diversification of physically coupled networked infrastructures such as the power grid and transportation systems complicate their controller design, implementation, and expansion. To tackle these challenges, we suggest an online distributed reinforcement learning control algorithm with a one-layer neural network for each subsystem (or agent) to adapt to variations in the networked infrastructures. Each controller includes a critic network and an action network for approximating the strategy utility function and desired control (...)
  13.
    Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning.Rui Wang, Xianghua Gan, Qing Li & Xiao Yan - 2021 - Complexity 2021:1-17.
    We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies considering a continuous density function of demand, in our paper the customer demand depends on the price of current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneously ordering and pricing policy to maximize the expected discounted profit over the planning horizon. (...)
  14.
    SAwSu: An Integrated Model of Associative and Reinforcement Learning.Vladislav D. Veksler, Christopher W. Myers & Kevin A. Gluck - 2014 - Cognitive Science 38 (3):580-598.
    Successfully explaining and replicating the complexity and generality of human and animal learning will require the integration of a variety of learning mechanisms. Here, we introduce a computational model which integrates associative learning (AL) and reinforcement learning (RL). We contrast the integrated model with standalone AL and RL models in three simulation studies. First, a synthetic grid‐navigation task is employed to highlight performance advantages for the integrated model in an environment where the reward structure is both diverse and (...)
  15.
    Combination of fuzzy control and reinforcement learning for wind turbine pitch control.J. Enrique Sierra-Garcia & Matilde Santos - forthcoming - Logic Journal of the IGPL.
    The generation of the pitch control signal in a wind turbine (WT) is not straightforward due to the nonlinear dynamics of the system and the coupling of its internal variables; in addition, they are subjected to the uncertainty that comes from the random nature of the wind. Fuzzy logic has proved useful in applications with changing system parameters or where uncertainty is relevant as in this one, but the tuning of the fuzzy logic controller (FLC) parameters is neither straightforward nor (...)
  16.
    Dynamic Large-Scale Server Scheduling for IVF Queuing Network in Cloud Healthcare System.Yafei Li, Hongfeng Wang, Li Li & Yaping Fu - 2021 - Complexity 2021:1-15.
    As one of the most effective medical technologies for the infertile patients, in vitro fertilization has been more and more widely developed in recent years. However, prolonged waiting for IVF procedures has become a problem of great concern, since this technology is only mastered by the large general hospitals. To deal with the insufficiency of IVF service capacity, this paper studies an IVF queuing network in an integrated cloud healthcare system, where the two key medical services, that is, egg retrieval (...)
  17.
    Evolutionary psychology, learning, and belief signaling: design for natural and artificial systems.Eric Funkhouser - 2021 - Synthese 199 (5-6):14097-14119.
    Recent work in the cognitive sciences has argued that beliefs sometimes acquire signaling functions in virtue of their ability to reveal information that manipulates “mindreaders.” This paper sketches some of the evolutionary and design considerations that could take agents from solipsistic goal pursuit to beliefs that serve as social signals. Such beliefs will be governed by norms besides just the traditional norms of epistemology. As agents become better at detecting the agency of others, either through evolutionary history or individual (...)
  18. HCI Model with Learning Mechanism for Cooperative Design in Pervasive Computing Environment.Hong Liu, Bin Hu & Philip Moore - 2015 - Journal of Internet Technology 16.
    This paper presents a human-computer interaction model with a three-layer learning mechanism in a pervasive environment. We begin with a discussion around a number of important issues related to human-computer interaction, followed by a description of the architecture for a multi-agent cooperative design system for pervasive computing environments. We present our proposed three-layer HCI model and introduce the group formation algorithm, which is predicated on a dynamic sharing niche technology. Finally, we explore the cooperative reinforcement learning (...)
  19.
    Learning reward machines: A study in partially observable reinforcement learning.Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Ethan Waldie & Sheila A. McIlraith - 2023 - Artificial Intelligence 323 (C):103989.
  20.
    Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance.W. Bradley Knox & Peter Stone - 2015 - Artificial Intelligence 225 (C):24-50.
  21.
    Passively learned spatial navigation cues evoke reinforcement learning reward signals.Thomas D. Ferguson, Chad C. Williams, Ronald W. Skelton & Olave E. Krigolson - 2019 - Cognition 189 (C):65-75.
  22.
    Optimization of the Rapid Design System for Arts and Crafts Based on Big Data and 3D Technology.Haihan Zhou - 2021 - Complexity 2021:1-10.
    In this paper, to solve the problem of slow design of arts and crafts and to improve design efficiency and aesthetics, existing big data and 3D technology are used to conduct an in-depth analysis of the optimization of the rapid design system for arts and crafts. In the system requirement analysis, the system is divided into nine functional modules, such as the design terminology management system and external information import (...)
  23.
    Learning a Rational Policy for Avoiding Penalties.Sogo Tsuboi & Kazuteru Miyazaki - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (2):185-192.
    Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using rewards as a clue. In general, the purpose of a reinforcement learning system is to acquire an optimum policy that maximizes the expected reward per action. However, this is not always what matters in every environment. In particular, if we apply a reinforcement learning system to engineering environments, we expect the agent to avoid all penalties. In Markov Decision Processes, a (...)
  24. When, What, and How Much to Reward in Reinforcement Learning-Based Models of Cognition.Christian P. Janssen & Wayne D. Gray - 2012 - Cognitive Science 36 (2):333-358.
    Reinforcement learning approaches to cognitive modeling represent task acquisition as learning to choose the sequence of steps that accomplishes the task while maximizing a reward. However, an apparently unrecognized problem for modelers is choosing when, what, and how much to reward; that is, when (the moment: end of trial, subtask, or some other interval of task performance), what (the objective function: e.g., performance time or performance accuracy), and how much (the magnitude: with binary, categorical, or continuous values). (...)
  25.
    CortexVR: Immersive analysis and training of cognitive executive functions of soccer players using virtual reality and machine learning.Christian Krupitzer, Jens Naber, Jan-Philipp Stauffert, Jan Mayer, Jan Spielmann, Paul Ehmann, Noel Boci, Maurice Bürkle, André Ho, Clemens Komorek, Felix Heinickel, Samuel Kounev, Christian Becker & Marc Erich Latoschik - 2022 - Frontiers in Psychology 13.
    Goal: This paper presents an immersive Virtual Reality system to analyze and train Executive Functions of soccer players. EFs are important cognitive functions for athletes. They are a relevant quality that distinguishes amateurs from professionals. Method: The system is based on immersive technology, hence, the user interacts naturally and experiences a training session in a virtual world. The proposed system has a modular design supporting the extension of various so-called game modes. Game modes combine selected game mechanics with specific simulation content to (...)
  26.
    Learning to design systems.Gary Metcalf - 2003 - World Futures 59 (1):21 – 36.
    This article describes a brief overview of systems design concepts, and provides an example of the use of one very simple framework for utilizing systems design. Its purpose is to demonstrate the value of even the simplest of systems design models in clarifying the issues behind what are often perceived to be organizational conflicts. The example provided is that of a medical function within an industrial organization, but the implications apply to almost any support function or department (...)
  27. Bidding in Reinforcement Learning: A Paradigm for Multi-Agent Systems.Chad Sessions - unknown
    The paper presents an approach for developing multi-agent reinforcement learning systems that are made up of a coalition of modular agents. We focus on learning to segment sequences (sequential decision tasks) to create modular structures, through a bidding process that is based on reinforcements received during task execution. The approach segments sequences (and divides them up among agents) to facilitate the learning of the overall task. Notably, our approach does not rely on a priori knowledge or a priori structures. (...)
     
  28.
    Model-based average reward reinforcement learning.Prasad Tadepalli & DoKyeong Ok - 1998 - Artificial Intelligence 100 (1-2):177-224.
  29.
    Emotional State and Feedback-Related Negativity Induced by Positive, Negative, and Combined Reinforcement.Shuyuan Xu, Yuyan Sun, Min Huang, Yanhong Huang, Jing Han, Xuemei Tang & Wei Ren - 2021 - Frontiers in Psychology 12:647263.
    Reinforcement learning relies on the reward prediction error (RPE) signals conveyed by the midbrain dopamine system. Previous studies showed that dopamine plays an important role in both positive and negative reinforcement. However, whether various reinforcement processes will induce distinct learning signals is still unclear. In a probabilistic learning task, we examined RPE signals in different reinforcement types using an electrophysiology index, namely, the feedback-related negativity (FRN). Ninety-four participants were randomly assigned into four groups: base (no (...)
  30.
    Editor's Note.Jessica Heybach - 2023 - Education and Culture 38 (1):1-3.
    In lieu of an abstract, here is a brief excerpt of the content:Editor’s NoteJessica HeybachThis final installation of Education and Culture’s special theme issue on Dewey, Data, and Technology coincides with what feels like a technological paradigm shift. As I sat down to write this editor’s note, a former student forwarded me Stephen Marche’s December 6, 2022 piece in The Atlantic titled “The College Essay is Dead” wherein he offers a critique of the humanities as dependent on traditional forms of (...)
  31.
    Reinforcement learning and artificial agency.Patrick Butlin - 2024 - Mind and Language 39 (1):22-38.
    There is an apparent connection between reinforcement learning and agency. Artificial entities controlled by reinforcement learning algorithms are standardly referred to as agents, and the mainstream view in the psychology and neuroscience of agency is that humans and other animals are reinforcement learners. This article examines this connection, focusing on artificial reinforcement learning systems and assuming that there are various forms of agency. Artificial reinforcement learning systems satisfy plausible conditions for minimal agency, and those which (...)
  32.
    Reinforcement Learning and Counterfactual Reasoning Explain Adaptive Behavior in a Changing Environment.Yunfeng Zhang, Jaehyon Paik & Peter Pirolli - 2015 - Topics in Cognitive Science 7 (2):368-381.
    Animals routinely adapt to changes in the environment in order to survive. Though reinforcement learning may play a role in such adaptation, it is not clear that it is the only mechanism involved, as it is not well suited to producing rapid, relatively immediate changes in strategies in response to environmental changes. This research proposes that counterfactual reasoning might be an additional mechanism that facilitates change detection. An experiment is conducted in which a task state changes over time and (...)
  33.
    Current cases of AI misalignment and their implications for future risks.Leonard Dung - 2023 - Synthese 202 (5):1-23.
    How can one build AI systems such that they pursue the goals their designers want them to pursue? This is the alignment problem. Numerous authors have raised concerns that, as research advances and systems become more powerful over time, misalignment might lead to catastrophic outcomes, perhaps even to the extinction or permanent disempowerment of humanity. In this paper, I analyze the severity of this risk based on current instances of misalignment. More specifically, I argue that contemporary large language models and (...)
  34.
    Reinforcement learning: A brief guide for philosophers of mind.Julia Haas - 2022 - Philosophy Compass 17 (9):e12865.
    In this opinionated review, I draw attention to some of the contributions reinforcement learning can make to questions in the philosophy of mind. In particular, I highlight reinforcement learning's foundational emphasis on the role of reward in agent learning, and canvass two ways in which the framework may advance our understanding of perception and motivation.
  35.
    Leader-Following Consensus for Second-Order Nonlinear Multiagent Systems with Input Saturation via Distributed Adaptive Neural Network Iterative Learning Control.Xiongfeng Deng, Xiuxia Sun, Shuguang Liu & Boyang Zhang - 2019 - Complexity 2019:1-13.
  36.
    Distributed Coordination for a Class of High-Order Multiagent Systems Subject to Actuator Saturations by Iterative Learning Control.Nana Yang & Suoping Li - 2022 - Complexity 2022:1-18.
    This paper investigates a distributed coordination control for a class of high-order uncertain multiagent systems. Under the framework of iterative learning control, a novel fully distributed learning protocol is devised for the coordination problem of MASs including time-varying parameter uncertainties as well as actuator saturations. Meanwhile, the learning updating laws of various parameters are proposed. Utilizing Lyapunov theory and combining with Graph theory, the proposed algorithm can make each follower track a leader completely over a limited time interval even (...)
  37. Integrating reinforcement learning, bidding and genetic algorithms.Ron Sun - unknown
    This paper presents a GA-based multi-agent reinforcement learning bidding approach (GMARLB) for performing multi-agent reinforcement learning. GMARLB integrates reinforcement learning, bidding and genetic algorithms. The general idea of our multi-agent systems is as follows: There are a number of individual agents in a team; each agent of the team has two modules: a Q module and a CQ module. Each agent can select actions to be performed at each step, which are done by the Q module. (...)
     
  38.
    Iterative Learning Consensus Control for Nonlinear Partial Difference Multiagent Systems with Time Delay.Cun Wang, Xisheng Dai, Kene Li & Zupeng Zhou - 2021 - Complexity 2021:1-15.
    This paper considers the consensus control problem of nonlinear spatial-temporal hyperbolic partial difference multiagent systems and parabolic partial difference multiagent systems with time delay. Based on the system’s own fixed topology and the method of generating the desired trajectory by introducing virtual leader, using the consensus tracking error between the agent and the virtual leader agent and neighbor agents in the last iteration, an iterative learning algorithm is proposed. The sufficient condition for the system consensus error to converge (...)
  39.
    Evolutionary Reinforcement Learning for Adaptively Detecting Database Intrusions.Seul-Gi Choi & Sung-Bae Cho - 2020 - Logic Journal of the IGPL 28 (4):449-460.
    Relational database management system is the most popular database system. It is important to maintain data security from information leakage and data corruption. RDBMS can be attacked by an outsider or an insider. It is difficult to detect an insider attack because its patterns are constantly changing and evolving. In this paper, we propose an adaptive database intrusion detection system that can be resistant to potential insider misuse using evolutionary reinforcement learning, which combines reinforcement learning and evolutionary learning. (...)
  40.
    Predictive Movements and Human Reinforcement Learning of Sequential Action.Roy Kleijn, George Kachergis & Bernhard Hommel - 2018 - Cognitive Science 42 (S3):783-808.
    Sequential action makes up the bulk of human daily activity, and yet much remains unknown about how people learn such actions. In one motor learning paradigm, the serial reaction time (SRT) task, people are taught a consistent sequence of button presses by cueing them with the next target response. However, the SRT task only records keypress response times to a cued target, and thus it cannot reveal the full time‐course of motion, including predictive movements. This paper describes a mouse movement (...)
  41. The Archimedean trap: Why traditional reinforcement learning will probably not yield AGI.Samuel Allen Alexander - 2020 - Journal of Artificial General Intelligence 11 (1):70-85.
    After generalizing the Archimedean property of real numbers in such a way as to make it adaptable to non-numeric structures, we demonstrate that the real numbers cannot be used to accurately measure non-Archimedean structures. We argue that, since an agent with Artificial General Intelligence (AGI) should have no problem engaging in tasks that inherently involve non-Archimedean rewards, and since traditional reinforcement learning rewards are real numbers, therefore traditional reinforcement learning probably will not lead to AGI. We indicate two (...)
  42.
    Reward-respecting subtasks for model-based reinforcement learning.Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner & Adam White - 2023 - Artificial Intelligence 324 (C):104001.
  43.
    Object‐Label‐Order Effect When Learning From an Inconsistent Source.Timmy Ma & Natalia L. Komarova - 2019 - Cognitive Science 43 (8):e12737.
    Learning in natural environments is often characterized by a degree of inconsistency from an input. These inconsistencies occur, for example, when learning from more than one source, or when the presence of environmental noise distorts incoming information; as a result, the task faced by the learner becomes ambiguous. In this study, we investigate how learners handle such situations. We focus on the setting where a learner receives and processes a sequence of utterances to master associations between objects and their labels, (...)
  44.
    Predictive Movements and Human Reinforcement Learning of Sequential Action.Roy de Kleijn, George Kachergis & Bernhard Hommel - 2018 - Cognitive Science 42 (S3):783-808.
    Sequential action makes up the bulk of human daily activity, and yet much remains unknown about how people learn such actions. In one motor learning paradigm, the serial reaction time (SRT) task, people are taught a consistent sequence of button presses by cueing them with the next target response. However, the SRT task only records keypress response times to a cued target, and thus it cannot reveal the full time‐course of motion, including predictive movements. This paper describes a mouse movement (...)
  45.  17
    Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example.Naohiro Koshiishi & Kengo Katayama - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.
    Reinforcement Learning (RL) is a promising technique for creating agents that can be applied to real-world problems. The most important features of RL are trial-and-error search and delayed reward. Thus, agents act randomly in the early learning stage. However, such random actions are impractical for real-world problems. This paper presents a novel model of RL agents. A feature of our learning agent model is the integration of the Analytic Hierarchy Process (AHP) into the standard RL agent model, which consists (...)
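    The abstract's idea of tempering early random exploration with a hierarchical decision method can be sketched roughly as below. The function name, its parameters, and the use of externally supplied priority weights are illustrative assumptions, not the paper's actual algorithm:

    ```python
    import random

    def ahp_biased_action(q_values, priorities, epsilon=0.3, rng=random):
        # Epsilon-greedy selection whose exploratory branch samples actions in
        # proportion to externally supplied priority weights (for instance,
        # weights obtained from an Analytic Hierarchy Process pairwise
        # comparison) rather than uniformly at random.
        actions = list(q_values)
        if rng.random() < epsilon:
            return rng.choices(actions, weights=[priorities[a] for a in actions])[0]
        return max(q_values, key=q_values.get)  # exploit: highest estimated value
    ```

    With all priority mass on one action, even the exploratory branch stays practical, which is the motivation the abstract gives for avoiding purely random early actions.
    
    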
  46.  6
    Can Model-Free Learning Explain Deontological Moral Judgments?Alisabeth Ayars - 2016 - Cognition 150 (C):232-242.
    Dual-systems frameworks propose that moral judgments are derived both from an immediate emotional response and from controlled/rational cognition. Recently, Cushman (2013) proposed a new dual-system theory based on model-free and model-based reinforcement learning. Model-free learning attaches values to actions based on their history of reward and punishment, and it explains some deontological, non-utilitarian judgments. Model-based learning involves the construction of a causal model of the world and allows for far-sighted planning; this form of learning fits well with utilitarian considerations that (...)
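    The model-free mechanism the abstract describes, values attached to actions purely from their history of reward and punishment, can be illustrated with a minimal sketch (a generic incremental value update on a two-armed bandit; the arm names and pay-off probabilities are invented for illustration):

    ```python
    import random

    def run_bandit(steps=2000, alpha=0.1, epsilon=0.1, seed=0):
        # Model-free value learning on a two-armed bandit: each estimate q[arm]
        # is driven only by that arm's observed rewards, with no world model.
        rng = random.Random(seed)
        probs = {'a': 0.8, 'b': 0.2}   # hypothetical pay-off probabilities
        q = {'a': 0.0, 'b': 0.0}
        for _ in range(steps):
            # epsilon-greedy: mostly exploit the current estimates
            if rng.random() < epsilon:
                arm = rng.choice(['a', 'b'])
            else:
                arm = max(q, key=q.get)
            reward = 1.0 if rng.random() < probs[arm] else 0.0
            q[arm] += alpha * (reward - q[arm])  # incremental value update
        return q
    ```

    With more steps or a smaller alpha, the estimates concentrate around the true pay-off probabilities; model-based learning would instead estimate `probs` itself and plan with it.
    
    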
  47.  7
    Consensus of Time-Varying Interval Uncertain Multiagent Systems via Reduced-Order Neighborhood Interval Observer.Hui Luo, Jin Zhao & Quan Yin - 2022 - Complexity 2022:1-14.
    This work focuses on a multiagent system with time-varying interval uncertainty in the system matrix, where multiple agents interact through an undirected topology graph and only the bounding matrices of the uncertainty in the system matrix are known. A reduced-order interval observer (IO), named the reduced-order neighborhood interval observer, is designed to estimate the relative state of each agent and those of its neighbors. It is shown that the reduced-order IO can guarantee consensus of the uncertain (...) system. Finally, simulation examples are given to verify the theoretical findings.
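    For background, the basic linear consensus protocol underlying work of this kind (without the paper's interval observers or uncertainty handling) moves each agent toward its neighbors. This sketch assumes a fixed undirected graph and a step size below 1/(maximum degree):

    ```python
    def consensus_step(states, neighbors, eps=0.2):
        # One synchronous step of the standard linear consensus protocol on an
        # undirected graph: x_i <- x_i + eps * sum_{j in N(i)} (x_j - x_i).
        return [x + eps * sum(states[j] - x for j in neighbors[i])
                for i, x in enumerate(states)]

    # Path graph 0 - 1 - 2; states converge to the initial average (1.0 here).
    neighbors = [[1], [0, 2], [1]]
    x = [0.0, 1.0, 2.0]
    for _ in range(100):
        x = consensus_step(x, neighbors)
    ```

    The observer-based schemes in papers like this one replace the exact neighbor states with estimated (here, interval-bounded) quantities while preserving this averaging structure.
    
    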
  48.  16
    A system for the use of answer set programming in reinforcement learning.Matthias Nickles - 2012 - In Luis Farinas del Cerro, Andreas Herzig & Jerome Mengin (eds.), Logics in Artificial Intelligence. Springer. pp. 488--491.
  49. Authors' Response: What to Do Next: Applying Flexible Learning Algorithms to Develop Constructivist Communication.B. Porr & P. Di Prodi - 2014 - Constructivist Foundations 9 (2):218-222.
    Upshot: We acknowledge that our model can be implemented with different reinforcement learning algorithms. Subsystem formation has been successfully demonstrated on the basal level, and in order to show full subsystem formation in the communication system at least both intentional utterances and acceptance/rejection need to be implemented. The comments about intrinsic vs extrinsic rewards made clear that this distinction is not helpful in the context of the constructivist paradigm but rather needs to be replaced by a critical reflection on (...)
     
  50.  26
    The Role of Frontostriatal Systems in Instructed Reinforcement Learning: Evidence From Genetic and Experimentally-Induced Variation.Nathan Tardiff, Kathryn N. Graves & Sharon L. Thompson-Schill - 2018 - Frontiers in Human Neuroscience 12.
1 — 50 / 993