Results for 'profit sharing, analytic hierarchy process, multi-agent reinforcement learning, pursuit problem'

999 found
  1.
    Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example.輿石 尚宏 & 謙吾 片山 - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.
    Reinforcement Learning is a promising technique for creating agents that can be applied to real world problems. The most important features of RL are trial-and-error search and delayed reward. Thus, agents randomly act in the early learning stage. However, such random actions are impractical for real world problems. This paper presents a novel model of RL agents. A feature of our learning agent model is to integrate the Analytic Hierarchy Process into the standard RL agent (...)
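    The abstract above does not spell out how the AHP priorities enter the agent's choice, so the sketch below shows only one plausible reading: a pairwise-comparison matrix over a pursuit agent's candidate actions yields prior weights that are blended with learned Q-values, so that the early, otherwise near-random stage of learning is guided by the prior. The action names, the Saaty-scale judgments, and the blend parameter are all invented for illustration, not taken from the paper.

      import numpy as np

      # Hypothetical Saaty-scale judgments over four pursuit actions.
      actions = ["chase", "intercept", "wait", "retreat"]
      A = np.array([
          [1.0, 2.0, 5.0, 7.0],
          [1 / 2, 1.0, 4.0, 6.0],
          [1 / 5, 1 / 4, 1.0, 3.0],
          [1 / 7, 1 / 6, 1 / 3, 1.0],
      ])

      def ahp_priorities(M, iters=100):
          """Approximate the principal eigenvector of a pairwise-comparison matrix."""
          w = np.ones(M.shape[0]) / M.shape[0]
          for _ in range(iters):
              w = M @ w
              w /= w.sum()
          return w

      def select_action(q_values, priorities, blend=0.5, rng=None):
          """Blend normalised Q-values with AHP priorities and pick the best action."""
          rng = rng or np.random.default_rng()
          q = np.asarray(q_values, dtype=float)
          q = (q - q.min()) / (q.max() - q.min() + 1e-9)   # scale to [0, 1]
          score = blend * q + (1.0 - blend) * priorities
          best = np.flatnonzero(score == score.max())
          return int(rng.choice(best))

      weights = ahp_priorities(A)
      print(dict(zip(actions, np.round(weights, 3))))
      # With all Q-values still zero, the AHP prior alone decides the move.
      print(actions[select_action([0.0, 0.0, 0.0, 0.0], weights)])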
  2.
    A Study of the Reinforcement Function in the Profit Sharing Method.Tatsumi Shoji & Uemura Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.
    In this paper, we consider profit sharing that is one of the reinforcement learning methods. An agent learns a candidate solution of a problem from the reward that is received from the environment if and only if it reaches the destination state. A function that distributes the received reward to each action of the candidate solution is called the reinforcement function. On this learning system, the agent can reinforce the set of selected actions when (...)
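    A minimal sketch of the credit-assignment step this abstract refers to, assuming a geometrically decaying reinforcement function; the decay rate, the toy episode, and the reward value are invented, and geometric decay is only one common choice discussed in the Profit Sharing literature.

      from collections import defaultdict

      def profit_sharing_update(q, episode, reward, decay=0.5):
          """Distribute a terminal reward over the episode's (state, action) pairs.

          The share credited to a pair decays geometrically with its distance
          from the goal, so actions near the goal are reinforced most.
          """
          share = reward
          for state, action in reversed(episode):
              q[(state, action)] += share
              share *= decay
          return q

      # Toy run: a three-step episode that reached the goal and earned reward 10.
      q_table = defaultdict(float)
      trajectory = [("s0", "right"), ("s1", "right"), ("s2", "up")]
      profit_sharing_update(q_table, trajectory, reward=10.0)
      for key, value in sorted(q_table.items()):
          print(key, round(value, 3))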
  3.
    A Profit Sharing Method That Does Not Cling to Past Experience.Ueno Atsushi & Uemura Wataru - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21:81-93.
    Profit Sharing is one of the reinforcement learning methods. An agent, as a learner, selects actions according to state-action values and receives rewards when it reaches a goal state. It then distributes the received rewards over the state-action values. This paper discusses how to set the initial value of a state-action value. The distribution function f(x) is called the reinforcement function. In Profit Sharing, an agent learns a policy by distributing rewards with (...)
  4.
    Instilling moral value alignment by means of multi-objective reinforcement learning.Juan Antonio Rodriguez-Aguilar, Maite Lopez-Sanchez, Marc Serramia & Manel Rodriguez-Soto - 2022 - Ethics and Information Technology 24 (1).
    AI research is being challenged with ensuring that autonomous agents learn to behave ethically, namely in alignment with moral values. Here, we propose a novel way of tackling the value alignment problem as a two-step process. The first step consists in formalising moral values and value-aligned behaviour based on philosophical foundations. Our formalisation is compatible with the framework of (Multi-Objective) Reinforcement Learning, to ease the handling of an agent’s individual and ethical objectives. The second step (...)
  5.
    Profit Sharing Incorporating a Perceptual Aliasing Detection Method.Masuda Shiro & Saito Ken - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:379-388.
    To apply reinforcement learning to difficult classes such as real-environment learning, we need to use a method robust to the perceptual aliasing problem. Exploitation-oriented methods such as Profit Sharing can deal with the perceptual aliasing problem to a certain extent. However, when the agent needs to select different actions for the same sensory input, learning efficiency worsens. To overcome this problem, several state partition methods using history information of state-action pairs have been proposed. These (...)
  6.
    Too Many Cooks: Bayesian Inference for Coordinating Multi-Agent Collaboration.Sarah A. Wu, Rose E. Wang, James A. Evans, Joshua B. Tenenbaum, David C. Parkes & Max Kleiman-Weiner - 2021 - Topics in Cognitive Science 13 (2):414-432.
    Collaboration requires agents to coordinate their behavior on the fly, sometimes cooperating to solve a single task together and other times dividing it up into sub‐tasks to work on in parallel. Underlying the human ability to collaborate is theory‐of‐mind (ToM), the ability to infer the hidden mental states that drive others to act. Here, we develop Bayesian Delegation, a decentralized multiagent learning mechanism with these abilities. Bayesian Delegation enables agents to rapidly infer the hidden intentions of others by (...)
  7. Bidding in Reinforcement Learning: A Paradigm for Multi-Agent Systems.Chad Sessions - unknown
    The paper presents an approach for developing multi-agent reinforcement learning systems that are made up of a coalition of modular agents. We focus on learning to segment sequences (sequential decision tasks) to create modular structures, through a bidding process that is based on reinforcements received during task execution. The approach segments sequences (and divides them up among agents) to facilitate the learning of the overall task. Notably, our approach does not rely on a priori knowledge or a (...)
     
  8.
    Multi-agent reinforcement learning based algorithm detection of malware-infected nodes in IoT networks.Marcos Severt, Roberto Casado-Vara, Ángel Martín del Rey, Héctor Quintián & Jose Luis Calvo-Rolle - forthcoming - Logic Journal of the IGPL.
    The Internet of Things (IoT) is a fast-growing technology that connects everyday devices to the Internet, enabling wireless, low-consumption and low-cost communication and data exchange. IoT has revolutionized the way devices interact with each other and the internet. The more devices become connected, the greater the risk of security breaches. There is currently a need for new approaches to algorithms that can detect malware regardless of the size of the network and that can adapt to dynamic changes in the network. (...)
  9. Multi-Agent Reinforcement Learning: Weighting and Partitioning.Ron Sun & Todd Peterson - unknown
    This paper addresses weighting and partitioning in complex reinforcement learning tasks, with the aim of facilitating learning. The paper presents some ideas regarding weighting of multiple agents and extends them into partitioning an input/state space into multiple regions with differential weighting in these regions, to exploit differential characteristics of regions and differential characteristics of agents to reduce the learning complexity of agents (and their function approximators) and thus to facilitate the learning overall. It analyzes, in (...) learning tasks, different ways of partitioning a task and using agents selectively based on partitioning. Based on the analysis, some heuristic methods are described and experimentally tested. We find that some off-line heuristic methods performed the best, significantly better than single-agent models.
     
  10.
    Safe multi-agent reinforcement learning for multi-robot control.Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois Knoll & Yaodong Yang - 2023 - Artificial Intelligence 319 (C):103905.
  11.
    A Novel Modeling Technique for the Forecasting of Multiple-Asset Trading Volumes: Innovative Initial-Value-Problem Differential Equation Algorithms for Reinforcement Machine Learning.Mazin A. M. Al Janabi - 2022 - Complexity 2022:1-16.
    Liquidity risk arises from the inability to unwind or hedge trading positions at the prevailing market prices. The risk of liquidity is a wide and complex topic as it depends on several factors and causes. While much has been written on the subject, there exists no clear-cut mathematical description of the phenomena and typical market risk modeling methods fail to identify the effect of illiquidity risk. In this paper, we do not propose a definitive one either, but we attempt to (...)
  12.
    Massively Multi-Agent Simulations of Religion.William Sims Bainbridge - 2018 - Journal of Cognition and Culture 18 (5):565-586.
    Massively multiplayer online games are not merely electronic communication systems based on computational databases, but also include artificial intelligence that possesses complex, dynamic structure. Each visible action taken by a component of the multi-agent system appears simple, but is supported by vastly more sophisticated invisible processes. A rough outline of the typical hierarchy has four levels: interaction between two individuals, each either human or artificial, conflict between teams of agents who cooperate with fellow team members, enduring social-cultural (...)
  13. Automatic Partitioning for Multi-Agent Reinforcement Learning.Ron Sun - unknown
    This paper addresses automatic partitioning in complex reinforcement learning tasks with multiple agents, without a priori domain knowledge regarding task structures. Partitioning a state/input space into multiple regions helps to exploit the differential characteristics of regions and differential characteristics of agents, thus facilitating learning and reducing the complexity of agents especially when function approximators are used. We develop a method for optimizing the partitioning of the space through experience without the use of a priori domain knowledge. The (...)
     
  14.
    Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices.Pedro Juan Rivera Torres, Carlos Gershenson García, María Fernanda Sánchez Puig & Samir Kanaan Izquierdo - 2022 - Complexity 2022:1-15.
    The area of smart power grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power in a resilient grid, while managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities and novel methodologies to detect, classify, and isolate faults and failures and model and simulate processes with predictive algorithms and analytics. In this paper, we showcase the application of a complex-adaptive, self-organizing (...)
  15.
    Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.
    Can humans get arbitrarily capable reinforcement learning agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design principles that (...)
  16.
    Deep Reinforcement Learning for UAV Intelligent Mission Planning.Longfei Yue, Rennong Yang, Ying Zhang, Lixin Yu & Zhuangzhuang Wang - 2022 - Complexity 2022:1-13.
    Rapid and precise air operation mission planning is a key technology in unmanned aerial vehicles autonomous combat in battles. In this paper, an end-to-end UAV intelligent mission planning method based on deep reinforcement learning is proposed to solve the shortcomings of the traditional intelligent optimization algorithm, such as relying on simple, static, low-dimensional scenarios, and poor scalability. Specifically, the suppression of enemy air defense mission planning is described as a sequential decision-making problem and formalized as a Markov decision (...)
  17.
    Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning.Rui Wang, Xianghua Gan, Qing Li & Xiao Yan - 2021 - Complexity 2021:1-17.
    We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic-review system. Unlike most studies considering a continuous density function of demand, in our paper the customer demand depends on the price of current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneously ordering and pricing policy to maximize the expected discounted profit over the (...)
  18.
    Learning with neighbours: Emergence of convention in a society of learning agents.Roland Mühlenbernd - 2011 - Synthese 183 (S1):87-109.
    I present a game-theoretical multi-agent system to simulate the evolutionary process responsible for the pragmatic phenomenon division of pragmatic labour (DOPL), a linguistic convention emerging from evolutionary forces. Each agent is positioned on a toroid lattice and communicates via signaling games , where the choice of an interlocutor depends on the Manhattan distance between them. In this framework I compare two learning dynamics: reinforcement learning (RL) and belief learning (BL). An agent’s experiences from previous plays (...)
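    The reinforcement-learning dynamic compared in this abstract is commonly modelled with urn-style (Roth-Erev) weights; the sketch below runs that dynamic in a generic two-state Lewis signaling game rather than the paper's lattice setup, and every parameter is illustrative.

      import random

      def play_signaling_game(rounds=20000, seed=0):
          """Urn-style reinforcement in a 2-state, 2-signal, 2-act signaling game.

          Sender urns map states to signals, receiver urns map signals to acts;
          a successful round adds one ball to each urn entry that was used.
          """
          rng = random.Random(seed)
          sender = {s: {"a": 1.0, "b": 1.0} for s in (0, 1)}
          receiver = {m: {0: 1.0, 1: 1.0} for m in ("a", "b")}
          successes = 0
          for _ in range(rounds):
              state = rng.randrange(2)
              signal = rng.choices(list(sender[state]), weights=list(sender[state].values()))[0]
              act = rng.choices(list(receiver[signal]), weights=list(receiver[signal].values()))[0]
              if act == state:
                  sender[state][signal] += 1.0
                  receiver[signal][act] += 1.0
                  successes += 1
          return successes / rounds, sender, receiver

      rate, sender, receiver = play_signaling_game()
      print(f"overall success rate: {rate:.3f}")
      print(sender)
      print(receiver)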
  19.
    Cognitive prediction of obstacle's movement for reinforcement learning pedestrian interacting model.Masaomi Kimura & Thanh-Trung Trinh - 2022 - Journal of Intelligent Systems 31 (1):127-147.
    Recent studies in pedestrian simulation have been able to construct a highly realistic navigation behaviour in many circumstances. However, when replicating the close interactions between pedestrians, the replicated behaviour is often unnatural and lacks human likeness. One of the possible reasons is that the current models often ignore the cognitive factors in the human thinking process. Another reason is that many models try to approach the problem by optimising certain objectives. On the other hand, in real life, humans do (...)
  20.
    Re-Imagining Business Agency through Multi-Agent Cross-Sector Coalitions: Integrating CSR Frameworks.David Lal & Philipp Dorstewitz - 2021 - Philosophy of Management 21 (1):87-103.
    This theoretical paper takes an agency-theoretic approach to questions of corporate social responsibility (CSR). A comparison of various extant frameworks focusses on how CSR agency emerges in complex multi-agent and multi-sector stakeholder networks. The discussion considers the respective capabilities and relevance of these frameworks – culminating in an integrative CSR practice model. A short literature review of the evolution of CSR since the 1950’s provides the backdrop for understanding multi-agent cross-sectoral stakeholder coalitions as a strategic (...)
  21.
    Using the analytical hierarchy process (ahp) to construct a measure of the magnitude of consequences component of moral intensity.Eric W. Stein & Norita Ahmad - 2009 - Journal of Business Ethics 89 (3):391 - 407.
    The purpose of this work is to elaborate an empirically grounded mathematical model of the magnitude of consequences component of “moral intensity” (Jones, Academy of Management Review 16(2), 366, 1991) that can be used to evaluate different ethical situations. The model is built using the analytical hierarchy process (AHP) (Saaty, The Analytic Hierarchy Process, 1980) and empirical data from the legal profession. One contribution of our work is that it illustrates how AHP can be applied in (...)
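    For readers unfamiliar with the AHP step these authors rely on, the sketch below derives criterion weights from a pairwise-comparison matrix and computes Saaty's consistency ratio; the three sub-criteria and the judgment values are invented for illustration and are not taken from the paper.

      import numpy as np

      # Hypothetical Saaty-scale judgments over three invented sub-criteria.
      criteria = ["severity", "number affected", "duration"]
      A = np.array([
          [1.0, 3.0, 5.0],
          [1 / 3, 1.0, 2.0],
          [1 / 5, 1 / 2, 1.0],
      ])

      # Priority weights: normalised principal eigenvector of the matrix.
      eigvals, eigvecs = np.linalg.eig(A)
      k = np.argmax(eigvals.real)
      weights = np.abs(eigvecs[:, k].real)
      weights /= weights.sum()

      # Saaty's consistency index CI = (lambda_max - n) / (n - 1); CR = CI / RI,
      # with RI the random index for an n x n matrix (0.58 for n = 3).
      n = A.shape[0]
      ci = (eigvals.real.max() - n) / (n - 1)
      cr = ci / 0.58
      print(dict(zip(criteria, np.round(weights, 3))))
      print(f"consistency ratio: {cr:.3f} (values below 0.10 are conventionally acceptable)")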
  22.
    Using the Analytical Hierarchy Process to Construct a Measure of the Magnitude of Consequences Component of Moral Intensity.Eric W. Stein & Norita Ahmad - 2009 - Journal of Business Ethics 89 (3):391-407.
    The purpose of this work is to elaborate an empirically grounded mathematical model of the magnitude of consequences component of "moral intensity" (Jones, 1991) that can be used to evaluate different ethical situations. The model is built using the analytical hierarchy process and empirical data from the legal profession. One contribution of our work is that it illustrates how AHP can be applied in the field of ethics. Following a review of the literature, we discuss the development of the (...)
  23.
    Multi-stakeholder Partnerships for Sustainability: Designing Decision-Making Processes for Partnership Capacity.Adriane MacDonald, Amelia Clarke & Lei Huang - 2019 - Journal of Business Ethics 160 (2):409-426.
    To address the prevalence and complexities of sustainable development challenges around the world, organizations in the business, government, and non-profit sectors are increasingly collaborating via multi-stakeholder partnerships. Because complex problems can be neither understood nor addressed by a single organization, it is necessary to bring together the knowledge and resources of many stakeholders. Yet, how these partnerships coordinate their collaborative activities to achieve mutual and organization-specific goals is not well understood. This study takes an organization design perspective of (...)
  24. Artificial virtuous agents in a multi-agent tragedy of the commons.Jakob Stenseke - 2022 - AI and Society:1-18.
    Although virtue ethics has repeatedly been proposed as a suitable framework for the development of artificial moral agents, it has been proven difficult to approach from a computational perspective. In this work, we present the first technical implementation of artificial virtuous agents in moral simulations. First, we review previous conceptual and technical work in artificial virtue ethics and describe a functionalistic path to AVAs based on dispositional virtues, bottom-up learning, and top-down eudaimonic reward. We then provide the details of a (...)
  25.
    Extending the Rational Policy Making Algorithm to Continuous-Valued Inputs.木村 元 & 宮崎 和光 - 2007 - Transactions of the Japanese Society for Artificial Intelligence 22 (3):332-341.
    Reinforcement Learning is a kind of machine learning. Profit Sharing, the Rational Policy Making algorithm, the Penalty Avoiding Rational Policy Making algorithm, and PS-r* are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes. However, they cannot treat continuous state spaces. In this paper, we present a way to adapt them to continuous state spaces. We give RPM a mechanism for handling continuous state spaces in the environment that has the same (...)
  26.
    An Application of Fuzzy Analytic Hierarchy Process in Risk Evaluation Model.Geng Peng, Lu Han, Zeyan Liu, Yanyang Guo, Junai Yan & Xinyu Jia - 2021 - Frontiers in Psychology 12.
    Conflicts in land exploration are incisive social problems which have been the subject in many studies. Risk assessment of land conflicts is effective to resolve such problems. Specifically, fuzzy mathematics and the analytic hierarchy process were combined together to evaluate risk in land conflicts in our work, which is proved useful to solve uncertainty and imprecision problems. Based on the analysis of the principles for the risk assessment of a land conflicts index system, a set of risk assessment (...)
  27.
    A Reinforcement Learning Method Using a Temperature Distribution Based on Likelihood Information.鈴木 健嗣 & 小堀 訓成 - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:297-305.
    In existing Reinforcement Learning, it is difficult and time-consuming to find appropriate meta-parameters such as the learning rate, eligibility traces, and the temperature for exploration. In particular, on complicated and large-scale problems the delayed reward often causes difficulty in solving the problem. In this paper, we propose a novel method that introduces a temperature distribution for reinforcement learning. In addition to the acquisition of a policy based on profit sharing, the temperature is (...)
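    The role the temperature plays in the method summarised above can be seen in ordinary Boltzmann (softmax) action selection; the snippet shows only that standard exploration rule with a fixed temperature per call, not the paper's likelihood-based temperature distribution, and the example values are made up.

      import numpy as np

      def softmax_action(q_values, temperature, rng=None):
          """Boltzmann exploration: a higher temperature flattens the distribution."""
          rng = rng or np.random.default_rng()
          q = np.asarray(q_values, dtype=float)
          logits = (q - q.max()) / temperature      # subtract the max for stability
          probs = np.exp(logits)
          probs /= probs.sum()
          return int(rng.choice(len(q), p=probs)), probs

      q = [1.0, 0.5, 0.2]
      for t in (2.0, 0.5, 0.05):
          _, probs = softmax_action(q, t)
          print(f"T = {t:<4} -> action probabilities {np.round(probs, 3)}")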
  28.
    Practical Research on the Application of Sponge City Reconstruction in Pocket Parks Based on the Analytic Hierarchy Process.Kun Ding & Yuan Zhang - 2021 - Complexity 2021:1-10.
    The rainwater system is an important part of the urban infrastructure as well as a key hub for maintaining the dynamic operation of the city and a clear indicator of the level of urban development. With the rapid development of urbanization, the hardened area of roads and residential areas has increased, and the construction of rainwater systems is so far insufficient, causing the urban waterlogging and water pollution problems to become increasingly serious. Accordingly, combined with the “sponge city” construction concept (...)
  29.
    Reinforcement Learning with a GA Using Importance Sampling.Kimura Hajime & Tsuchiya Chikao - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:1-10.
    Reinforcement Learning (RL) handles policy search problems: searching a mapping from state space to action space. However RL is based on gradient methods and as such, cannot deal with problems with multimodal landscape. In contrast, though Genetic Algorithm (GA) is promising to deal with them, it seems to be unsuitable for policy search problems from the viewpoint of the cost of evaluation. Minimal Generation Gap (MGG), used as a generation-alternation model in GA, generates many offspring from two or more (...)
  30.
    Constructing a Learning Agent That Manipulates Its Own Rewards According to the Environmental Situation.沼尾 正行 & 森山 甲一 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:676-683.
    The authors aim at constructing an agent which learns appropriate actions in a Multi-Agent environment with and without social dilemmas. For this aim, the agent must have nonrationality that makes it give up its own profit when it should do that. Since there are many studies on rational learning that brings more and more profit, it is desirable to utilize them for constructing the agent. Therefore, we use a reward-handling manner that makes internal (...)
  31. Integrating reinforcement learning, bidding and genetic algorithms.Ron Sun - unknown
    This paper presents a GA-based multi-agent reinforce- ment learning bidding approach (GMARLB) for perform- ing multi-agent reinforcement learning. GMARLB inte- grates reinforcement learning, bidding and genetic algo- rithms. The general idea of our multi-agent systems is as follows: There are a number of individual agents in a team, each agent of the team has two modules: Q module and CQ module. Each agent can select actions to be performed at each (...)
     
  32. HCI Model with Learning Mechanism for Cooperative Design in Pervasive Computing Environment.Hong Liu, Bin Hu & Philip Moore - 2015 - Journal of Internet Technology 16.
    This paper presents a human-computer interaction model with a three layers learning mechanism in a pervasive environment. We begin with a discussion around a number of important issues related to human-computer interaction followed by a description of the architecture for a multi-agent cooperative design system for pervasive computing environment. We present our proposed three- layer HCI model and introduce the group formation algorithm, which is predicated on a dynamic sharing niche technology. Finally, we explore the cooperative reinforcement (...)
  33.
    Humean learning (how to learn).Jeffrey A. Barrett - forthcoming - Philosophical Studies:1-17.
    David Hume’s skeptical solution to the problem of induction was grounded in his belief that we learn by means of custom. We consider here how a form of reinforcement learning like custom may allow an agent to learn how to learn in other ways as well. Specifically, an agent may learn by simple reinforcement to adopt new forms of learning that work better than simple reinforcement in the context of specific tasks. We will consider (...)
  34.
    Extending Profit Sharing to Partially Observable Environments: Proposal and Evaluation of PS-r*.Kobayashi Shigenobu & Miyazaki Kazuteru - 2003 - Transactions of the Japanese Society for Artificial Intelligence 18:286-296.
  35.
    A Theoretical Framework for How We Learn Aesthetic Values.Hassan Aleem, Ivan Correa-Herran & Norberto M. Grzywacz - 2020 - Frontiers in Human Neuroscience 14:565629.
    How do we come to like the things that we do? Each one of us starts from a relatively similar state at birth, yet we end up with vastly different sets of aesthetic preferences. These preferences go on to define us both as individuals and as members of our cultures. Therefore, it is important to understand how aesthetic preferences form over our lifetimes. This poses a challenging problem: to understand this process, one must account for the many factors at (...)
  36.
    Evaluating Run-Time Search Reduction in a Problem Solver Based on Cognitive Distance Learning, and Analyzing Its Learning Process.宮本 裕司 & 山川 宏 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:1-13.
    Our proposed cognitive distance learning problem solver generates sequences of actions from an initial state to goal states in a problem state space. The problem solver learns the cognitive distance between arbitrary pairs of states. Action generation at each state is the selection of the next state with minimum cognitive distance to the goal, as in a Q-learning agent. In this paper, first, we show that our proposed method reduces search cost compared with a conventional search method by analytical simulation in spherical state (...)
  37.
    Reinforcement Learning for Production‐Based Cognitive Models.Adrian Brasoveanu & Jakub Dotlačil - 2021 - Topics in Cognitive Science 13 (3):467-487.
    We investigate how Reinforcement Learning methods can be used to solve the production selection and production ordering problem in ACT‐R. We focus on four algorithms from the Q learning family, tabular Q and three versions of Deep Q Networks, as well as the ACT‐R utility learning algorithm, which provides a baseline for the Q algorithms. We compare the performance of these five algorithms in a range of lexical decision tasks framed as sequential decision problems.
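    Since the abstract above frames production selection as a sequential decision problem handled by the Q-learning family, here is the plain tabular Q-learning update on a toy chain task; the environment, step sizes, and reward are placeholders and have nothing to do with ACT-R itself.

      import random
      from collections import defaultdict

      def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.95,
                           epsilon=0.1, seed=0):
          """Tabular Q-learning on a chain: step left or right, reward 1 at the right end."""
          rng = random.Random(seed)
          q = defaultdict(float)                       # (state, action) -> value
          actions = (-1, +1)
          for _ in range(episodes):
              state = 0
              for _ in range(200):                     # cap the episode length
                  if state == n_states - 1:            # terminal state reached
                      break
                  values = [q[(state, a)] for a in actions]
                  greedy = [a for a, v in zip(actions, values) if v == max(values)]
                  action = rng.choice(actions) if rng.random() < epsilon else rng.choice(greedy)
                  next_state = min(max(state + action, 0), n_states - 1)
                  reward = 1.0 if next_state == n_states - 1 else 0.0
                  target = reward + gamma * max(q[(next_state, a)] for a in actions)
                  q[(state, action)] += alpha * (target - q[(state, action)])
                  state = next_state
          return q

      learned = q_learning_chain()
      print({k: round(v, 2) for k, v in sorted(learned.items())})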
  38.
    Modelling dynamic behaviour of agents in a multiagent world: Logical analysis of Wh-questions and answers.Martina Číhalová & Marie Duží - 2023 - Logic Journal of the IGPL 31 (1):140-171.
    In a multiagent and multi-cultural world, the fine-grained analysis of agents’ dynamic behaviour, i.e. of their activities, is essential. Dynamic activities are actions that are characterized by an agent who executes the action and by other participants of the action. Wh-questions on the participants of the actions pose a difficult particular challenge because the variability of the types of possible answers to such questions is huge. To deal with the problem, we propose the analysis and classification of (...)
  39.
    An Efficient Exploration Method for MDP Environments Using the k-Certainty Exploration Method and Dynamic Programming.Kawada Seiichi & Tateyama Takeshi - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:11-19.
    One of the most common problems in reinforcement learning systems (e.g. Q-learning) is reducing the number of trials needed to converge to an optimal policy. As one solution to this problem, the k-certainty exploration method was proposed. Miyazaki reported that this method could determine an optimal policy faster than Q-learning in Markov decision processes (MDPs). It is a very efficient learning method, but we propose an improvement that makes it more efficient. In the k-certainty exploration method, in (...)
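    The k-certainty exploration method itself is not reproduced here; the snippet below is only a generic sketch in the same spirit, combining visit-count-driven exploration with value iteration on the empirical model, and every state, action, and parameter is invented.

      import random
      from collections import defaultdict

      def explore_then_plan(transitions, rewards, k=3, gamma=0.95,
                            episodes=200, horizon=50, seed=0):
          """Count-based exploration followed by value iteration on the learned model.

          A (state, action) pair counts as 'certain' once tried at least k times;
          until then the least-tried action is preferred. Only certain pairs
          contribute estimated transition probabilities to the planning step.
          """
          rng = random.Random(seed)
          states = sorted(transitions)
          actions = sorted({a for s in transitions for a in transitions[s]})
          counts = defaultdict(int)
          model = defaultdict(lambda: defaultdict(int))   # (s, a) -> {next state: visits}

          for _ in range(episodes):
              state = states[0]
              for _ in range(horizon):
                  action = min(actions, key=lambda a: counts[(state, a)])
                  next_state = rng.choice(transitions[state][action])
                  counts[(state, action)] += 1
                  model[(state, action)][next_state] += 1
                  state = next_state

          values = {s: 0.0 for s in states}
          for _ in range(200):                            # value-iteration sweeps
              for s in states:
                  candidates = []
                  for a in actions:
                      n_sa = counts[(s, a)]
                      if n_sa < k:                        # not yet k-certain: ignore
                          candidates.append(0.0)
                          continue
                      expected = sum(n / n_sa * values[s2]
                                     for s2, n in model[(s, a)].items())
                      candidates.append(rewards.get((s, a), 0.0) + gamma * expected)
                  values[s] = max(candidates)
          return values

      # Invented two-state toy MDP: from s0, "go" sometimes reaches s1, where
      # "stay" pays a reward of 1 on every step.
      transitions = {"s0": {"stay": ["s0"], "go": ["s0", "s1"]},
                     "s1": {"stay": ["s1"], "go": ["s0"]}}
      rewards = {("s1", "stay"): 1.0}
      print({s: round(v, 2) for s, v in explore_then_plan(transitions, rewards).items()})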
  40.
    Load Balancing Selection Method and Simulation in Network Communication Based on AHP-DS Heterogeneous Network Selection Algorithm.Weiwei Xiao - 2021 - Complexity 2021:1-12.
    This article proposes an Analytic Hierarchy Process Dempster-Shafer and similarity-based network selection algorithm for the scenario of dynamic changes in user requirements and network environment; combines machine learning with network selection and proposes a decision tree-based network selection algorithm; combines multiattribute decision-making and genetic algorithm to propose a weighted Gray Relation Analysis and genetic algorithm-based network access decision algorithm. Firstly, the training data is obtained from the collaborative algorithm, and it is used as the training set, and the (...)
  41.
    Predictive maintenance of vehicle fleets through hybrid deep learning-based ensemble methods for industrial IoT datasets.Arindam Chaudhuri & Soumya K. Ghosh - forthcoming - Logic Journal of the IGPL.
    Connected vehicle fleets have formed significant component of industrial internet of things scenarios as part of Industry 4.0 worldwide. The number of vehicles in these fleets has grown at a steady pace. The vehicles monitoring with machine learning algorithms has significantly improved maintenance activities. Predictive maintenance potential has increased where machines are controlled through networked smart devices. Here, benefits are accrued considering uptimes optimization. This has resulted in reduction of associated time and labor costs. It has also provided significant increase (...)
  42.
    Achieving Sustainable Development Goals Through Collaborative Innovation: Evidence from Four European Initiatives.Laura Mariani, Benedetta Trivellato, Mattia Martini & Elisabetta Marafioti - 2022 - Journal of Business Ethics 180 (4):1075-1095.
    The role to be played by multi-stakeholder partnerships in addressing the ‘wicked problems’ of sustainable development is made explicit by the seventeenth Sustainable Development Goal. But how do these partnerships really work? Based on the analysis of four sustainability-oriented innovation initiatives implemented in Belgium, Italy, Germany, and France, this study explores the roles and mechanisms that collaborating actors may enact to facilitate the pursuit of sustainable development, with a particular focus on non-profit organizations. The results suggest that (...)
  43.
    Reinforcement Learning-Based Collision Avoidance Guidance Algorithm for Fixed-Wing UAVs.Yu Zhao, Jifeng Guo, Chengchao Bai & Hongxing Zheng - 2021 - Complexity 2021:1-12.
    A deep reinforcement learning-based computational guidance method is presented, which is used to identify and resolve the problem of collision avoidance for a variable number of fixed-wing UAVs in limited airspace. The cooperative guidance process is first analyzed for multiple aircraft by formulating flight scenarios using multiagent Markov game theory and solving it by machine learning algorithm. Furthermore, a self-learning framework is established by using the actor-critic model, which is proposed to train collision avoidance decision-making neural networks. To (...)
  44.
    Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning.Chenjia Bai, Lingxiao Wang, Jianye Hao, Zhuoran Yang, Bin Zhao, Zhen Wang & Xuelong Li - 2024 - Artificial Intelligence 326 (C):104048.
  45. Collective intentionality and social agents.Raimo Tuomela - 2001
    In this paper I will discuss a certain philosophical and conceptual program -- that I have called philosophy of social action writ large -- and also show in detail how parts of the program have been, and is currently being carried out. In current philosophical research the philosophy of social action can be understood in a broad sense to encompass such central research topics as action occurring in a social context (this includes multi-agent action); shared we-attitudes (such as (...)
     
  46.
    Critical Pedagogy in the New Normal.Christopher Ryan Maboloc - 2020 - Voices in Bioethics 6.
    The coronavirus pandemic is a challenge to educators, policy makers, and ordinary people. In facing the threat from COVID-19, school systems and global institutions need “to address the essential matter of each human being and how they are interacting with, and affected by, a much wider set of biological and technical conditions.”[1] Educators must grapple with the societal issues that come with the intent of ensuring the safety of the public. To some, “these (...)
  47.
    Toward a Psychology of Deep Reinforcement Learning Agents Using a Cognitive Architecture.Konstantinos Mitsopoulos, Sterling Somers, Joel Schooler, Christian Lebiere, Peter Pirolli & Robert Thomson - 2022 - Topics in Cognitive Science 14 (4):756-779.
    We argue that cognitive models can provide a common ground between human users and deep reinforcement learning (Deep RL) algorithms for purposes of explainable artificial intelligence (AI). Casting both the human and learner as cognitive models provides common mechanisms to compare and understand their underlying decision-making processes. This common grounding allows us to identify divergences and explain the learner's behavior in human understandable terms. We present novel salience techniques that highlight the most relevant features in each model's decision-making, as (...)
  48.
    Societies Learn and yet the World is Hard to Change.Klaus Eder - 1999 - European Journal of Social Theory 2 (2):195-215.
    Evolution and learning are two analytically distinct concepts. People learn yet evolution (`change') does not necessarily take place. To clarify this problem the concept of learning is explicated. The first problem addressed is the question of who is learning. Here a shift from the single actor perspective to an interaction perspective is proposed (using Habermas and Luhmann as theoretical arguments for such a shift). Both, however, idealize the preconditions that interactants share while learning collectively. Against rationalist assumptions it (...)
  49. Using Reinforcement Learning to Examine Dynamic Attention Allocation During Reading.Yanping Liu, Erik D. Reichle & Ding-Guo Gao - 2013 - Cognitive Science 37 (8):1507-1540.
    A fundamental question in reading research concerns whether attention is allocated strictly serially, supporting lexical processing of one word at a time, or in parallel, supporting concurrent lexical processing of two or more words (Reichle, Liversedge, Pollatsek, & Rayner, 2009). The origins of this debate are reviewed. We then report three simulations to address this question using artificial reading agents (Liu & Reichle, 2010; Reichle & Laurent, 2006) that learn to dynamically allocate attention to 1–4 words to “read” as efficiently (...)
  50.
    Processing speed enhances model-based over model-free reinforcement learning in the presence of high working memory functioning.Daniel J. Schad, Elisabeth Jünger, Miriam Sebold, Maria Garbusow, Nadine Bernhardt, Amir-Homayoun Javadi, Ulrich S. Zimmermann, Michael N. Smolka, Andreas Heinz, Michael A. Rapp & Quentin J. M. Huys - 2014 - Frontiers in Psychology 5:117016.
    Theories of decision-making and its neural substrates have long assumed the existence of two distinct and competing valuation systems, variously described as goal-directed vs. habitual, or, more recently and based on statistical arguments, as model-free vs. model-based reinforcement-learning. Though both have been shown to control choices, the cognitive abilities associated with these systems are under ongoing investigation. Here we examine the link to cognitive abilities, and find that individual differences in processing speed covary with a shift from model-free to (...)
1 — 50 / 999