Results for 'reinforcement learning, profit sharing, rational policy making algorithm, POMDPs, theorem'

1000+ found
  1.
    Extension of Profit Sharing to Partially Observable Environments: Proposal and Evaluation of PS-r*. Kobayashi Shigenobu & Miyazaki Kazuteru - 2003 - Transactions of the Japanese Society for Artificial Intelligence 18:286-296.
    2 citations
  2.
    Extension of the Rational Policy Making Algorithm to Continuous-Valued Input. 木村 元 & 宮崎 和光 - 2007 - Transactions of the Japanese Society for Artificial Intelligence 22 (3):332-341.
    Reinforcement Learning is a kind of machine learning. Profit Sharing, the Rational Policy Making (RPM) algorithm, the Penalty Avoiding Rational Policy Making algorithm, and PS-r* are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes. However, they cannot handle continuous state spaces. In this paper, we present a solution that adapts them to continuous state spaces. We give RPM a mechanism to treat continuous state spaces in (...)
  3.
    Profit Sharing with a Method for Detecting Perceptual Aliasing. Masuda Shiro & Saito Ken - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:379-388.
    To apply reinforcement learning to difficult classes of problems such as real-environment learning, we need a method robust to the perceptual aliasing problem. Exploitation-oriented methods such as Profit Sharing can deal with perceptual aliasing to a certain extent. However, when the agent needs to select different actions at the same sensory input, learning efficiency worsens. To overcome this problem, several state partition methods using history information of state-action pairs have been proposed. These methods try to convert (...)
    1 citation
  4.
    A Profit Sharing Method That Does Not Cling to Past Experience. Ueno Atsushi & Uemura Wataru - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21:81-93.
    Profit Sharing is one of the reinforcement learning methods. An agent, as a learner, selects an action according to its state-action values and receives a reward when it reaches a goal state. It then distributes the received reward among the state-action values along its trajectory. This paper discusses how to set the initial value of a state-action value. The distribution function f(x) is called the reinforcement function. In Profit Sharing, an agent learns a policy by distributing rewards with the (...)
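Several of the entries above (items 1, 2, 4) revolve around the Profit Sharing credit-assignment scheme: a reward earned only at the goal is shared backwards over the episode's state-action pairs through a reinforcement function f(x). A minimal runnable sketch, assuming an illustrative corridor task and a geometric f(x) = R·c^x with c = 0.3 (the environment, constants, and episode budget are our assumptions, not taken from the cited papers):

```python
import random

N_STATES = 5     # toy corridor 0..4; the goal sits at state 4 (illustrative environment)
ACTIONS = (0, 1) # 0 = left, 1 = right
REWARD = 100.0   # reward is received only on reaching the goal state
DECAY = 0.3      # geometric reinforcement function f(x) = REWARD * DECAY**x

# state-action weights ("rule values") used for both selection and credit assignment
w = {(s, a): 1.0 for s in range(N_STATES) for a in ACTIONS}

def select_action(state, epsilon=0.4):
    """Epsilon-greedy selection on the current weights."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: w[(state, a)])

def run_episode(max_steps=100):
    """Play one episode; on success, distribute the reward backwards along the
    trace with geometrically decaying shares (the Profit Sharing credit step).
    Because DECAY/(1-DECAY) <= 1, a single share f(x) outweighs the whole tail
    of later shares, which is what suppresses detour ("ineffective") rules."""
    state, trace = 0, []
    for _ in range(max_steps):
        action = select_action(state)
        trace.append((state, action))
        state = max(0, state + (1 if action == 1 else -1))
        if state == N_STATES - 1:
            share = REWARD
            for s, a in reversed(trace):   # f(x) for x = 0, 1, 2, ... steps from the goal
                w[(s, a)] += share
                share *= DECAY
            return True
    return False                           # no reward reached: nothing is reinforced

random.seed(0)
successes = sum(run_episode() for _ in range(600))

# the learned greedy policy should now point toward the goal in every state
policy = [max(ACTIONS, key=lambda a: w[(s, a)]) for s in range(N_STATES - 1)]
print(successes, policy)
```

Note the design point: since every successful trace must leave each visited state rightward after its last leftward visit, the geometric decay guarantees the "right" weight grows strictly faster than the "left" weight at every state.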
  5.
    Deep Reinforcement Learning for UAV Intelligent Mission Planning. Longfei Yue, Rennong Yang, Ying Zhang, Lixin Yu & Zhuangzhuang Wang - 2022 - Complexity 2022:1-13.
    Rapid and precise air-operation mission planning is a key technology for autonomous combat by unmanned aerial vehicles. In this paper, an end-to-end UAV intelligent mission planning method based on deep reinforcement learning is proposed to overcome the shortcomings of traditional intelligent optimization algorithms, such as reliance on simple, static, low-dimensional scenarios and poor scalability. Specifically, the suppression of enemy air defenses mission planning task is described as a sequential decision-making problem and formalized as a Markov decision (...)
  6.
    Improvement of the Penalty Avoiding Rational Policy Making Algorithm and Its Application to the Othello Game. 坪井 創吾 & 宮崎 和光 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:548-556.
    In general, the purpose of reinforcement learning is to learn an optimal policy. However, in two-player games such as Othello, it is important to acquire a penalty avoiding policy. In this paper, we focus on forming a penalty avoiding policy based on the Penalty Avoiding Rational Policy Making algorithm [Miyazaki 01]. In applying it to large-scale problems, we are confronted with the curse of dimensionality. We introduce several ideas and heuristics (...)
  7.
    A Study of Reinforcement Functions in the Profit Sharing Method. Tatsumi Shoji & Uemura Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.
    In this paper, we consider Profit Sharing, one of the reinforcement learning methods. An agent learns a candidate solution to a problem from the reward, which it receives from the environment if and only if it reaches the destination state. The function that distributes the received reward over the actions of the candidate solution is called the reinforcement function. In this learning system, the agent can reinforce the set of selected actions when it gets the (...)
    1 citation
  8.
    Learning Rational Policies That Avoid Penalties. 坪井 創吾 & 宮崎 和光 - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16 (2):185-192.
    Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment with rewards as a clue. In general, the purpose of a reinforcement learning system is to acquire an optimal policy that maximizes the expected reward per action. However, this is not always what matters in every environment. In particular, if we apply a reinforcement learning system to engineering environments, we expect the agent to avoid all penalties. In Markov Decision Processes, a (...)
    1 citation
  9.
    Enforcing ethical goals over reinforcement-learning policies. Guido Governatori, Agata Ciabattoni, Ezio Bartocci & Emery A. Neufeld - 2022 - Ethics and Information Technology 24 (4):1-19.
    Recent years have yielded many discussions on how to endow autonomous agents with the ability to make ethical decisions, and the need for explicit ethical reasoning and transparency is a persistent theme in this literature. We present a modular and transparent approach to equip autonomous agents with the ability to comply with ethical prescriptions, while still enacting pre-learned optimal behaviour. Our approach relies on a normative supervisor module, that integrates a theorem prover for defeasible deontic logic within the control (...)
    1 citation
  10.
    Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning. Rui Wang, Xianghua Gan, Qing Li & Xiao Yan - 2021 - Complexity 2021:1-17.
    We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which assume a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneous ordering and pricing policy that maximizes the expected discounted profit over the (...)
  11.
    Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example. 輿石 尚宏 & 片山 謙吾 - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.
    Reinforcement Learning is a promising technique for creating agents that can be applied to real-world problems. The most important features of RL are trial-and-error search and delayed reward. Thus, agents act randomly in the early learning stage. However, such random actions are impractical for real-world problems. This paper presents a novel model of RL agents. A feature of our learning agent model is to integrate the Analytic Hierarchy Process into the standard RL agent model, which consists of (...)
  12.
    A Reinforcement Learning Method Using a Temperature Distribution Based on Likelihood Information. 鈴木 健嗣 & 小堀 訓成 - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:297-305.
    In existing reinforcement learning it is difficult and time-consuming to find appropriate meta-parameters such as the learning rate, eligibility traces, and the temperature for exploration. In particular, in complicated, large-scale problems, delayed rewards often occur and make the problem hard to solve. In this paper, we propose a novel method that introduces a temperature distribution for reinforcement learning. In addition to the acquisition of a policy based on profit sharing, the temperature is given (...)
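The exploration temperature that this entry tunes is the T of Boltzmann (softmax) action selection, P(a) ∝ exp(Q(a)/T). A minimal sketch of that standard rule, for reference only; the paper's likelihood-based temperature distribution itself is not reproduced here, and the Q-values below are illustrative:

```python
import math
import random

def softmax_probs(values, temperature):
    """Boltzmann distribution over action values: P(a) is proportional to exp(Q(a) / T)."""
    scaled = [v / temperature for v in values]
    m = max(scaled)                           # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def select_action(q_values, temperature):
    """Sample an action index from the Boltzmann distribution."""
    weights = softmax_probs(q_values, temperature)
    return random.choices(range(len(q_values)), weights=weights)[0]

q = [1.0, 2.0, 0.5]
print([round(p, 3) for p in softmax_probs(q, 10.0)])  # high temperature: near-uniform exploration
print([round(p, 3) for p in softmax_probs(q, 0.1)])   # low temperature: mass concentrates on the best action
```

Annealing the temperature from high to low moves the agent smoothly from exploration to exploitation, which is why the temperature schedule is such a sensitive meta-parameter.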
  13.
    Inter‐temporal rationality without temporal representation. Simon A. B. Brown - 2023 - Mind and Language 38 (2):495-514.
    Recent influential accounts of temporal representation—the use of mental representations with explicit temporal contents, such as before and after relations and durations—sharply distinguish representation from mere sensitivity. A common, important picture of inter-temporal rationality is that it consists in maximizing total expected discounted utility across time. By analyzing reinforcement learning algorithms, this article shows that, given such notions of temporal representation and inter-temporal rationality, it would be possible for an agent to achieve inter-temporal rationality without temporal representation. It then (...)
  14.
    Rationing Care through Collaboration and Shared Values. James E. Sabin - 2018 - Hastings Center Report 48 (1):22-24.
    Although “rationing” continues to be a dirty word for the public in health policy discourse, Nir Eyal and colleagues handle the concept exactly right in their article in this issue of the Hastings Center Report. They correctly characterize rationing as an ethical requirement, not a moral abomination. They identify the key health policy question as how rationing can best be done, not whether it should be done at all. They make a cogent defense of what they call “rationing (...)
    1 citation
  15.
    Reinforcement Learning-Based Collision Avoidance Guidance Algorithm for Fixed-Wing UAVs. Yu Zhao, Jifeng Guo, Chengchao Bai & Hongxing Zheng - 2021 - Complexity 2021:1-12.
    A deep reinforcement learning-based computational guidance method is presented, which is used to identify and resolve the problem of collision avoidance for a variable number of fixed-wing UAVs in limited airspace. The cooperative guidance process is first analyzed for multiple aircraft by formulating flight scenarios using multiagent Markov game theory and solving them with a machine learning algorithm. Furthermore, a self-learning framework is established by using the actor-critic model, which is proposed to train collision avoidance decision-making neural networks. To (...)
  16. Detecting racial bias in algorithms and machine learning. Nicol Turner Lee - 2018 - Journal of Information, Communication and Ethics in Society 16 (3):252-260.
    Purpose The online economy has not resolved the issue of racial bias in its applications. While algorithms are procedures that facilitate automated decision-making, or a sequence of unambiguous instructions, bias is a byproduct of these computations, bringing harm to historically disadvantaged populations. This paper argues that algorithmic biases explicitly and implicitly harm racial groups and lead to forms of discrimination. Relying upon sociological and technical research, the paper offers commentary on the need for more workplace diversity within high-tech industries (...)
    14 citations
  17.
    Is educational policy making rational — and what would that mean, anyway? Eric Bredo - 2009 - Educational Theory 59 (5):533-547.
    In Moderating the Debate: Rationality and the Promise of American Education, Michael Feuer raises concerns about the consequences of basing educational policy on the model of rational choice drawn from economics. Policy making would be better and more realistic, he suggests, if it were based on a newer procedural model drawn from cognitive science. In this essay Eric Bredo builds on Feuer's analysis by offering a more systematic critique of the traditional model of rationality that Feuer (...)
    1 citation
  18.
    Novelty and Inductive Generalization in Human Reinforcement Learning. Samuel J. Gershman & Yael Niv - 2015 - Topics in Cognitive Science 7 (3):391-415.
    In reinforcement learning, a decision maker searching for the most rewarding option is often faced with the question: What is the value of an option that has never been tried before? One way to frame this question is as an inductive problem: How can I generalize my previous experience with one set of options to a novel option? We show how hierarchical Bayesian inference can be used to solve this problem, and we describe an equivalence between the Bayesian model (...)
    2 citations
  19.
    Education as interaction. Martin Wenham - 1991 - Journal of Philosophy of Education 25 (2):235–246.
    The teaching-learning process is of central importance in education. By developing a concept of effective teaching and a corresponding model of the teaching-learning process, it is argued that unless the needs of pupils are to be disregarded, teachers must become co-learners and responsibility for the quality of education must be shared. Education is seen as an interactive process in which teachers and pupils participate co-operatively. It is shown that this concept, already implicit in much educational thought and practice, can contribute (...)
  20.
    Toward a Psychology of Deep Reinforcement Learning Agents Using a Cognitive Architecture. Konstantinos Mitsopoulos, Sterling Somers, Joel Schooler, Christian Lebiere, Peter Pirolli & Robert Thomson - 2022 - Topics in Cognitive Science 14 (4):756-779.
    We argue that cognitive models can provide a common ground between human users and deep reinforcement learning (Deep RL) algorithms for purposes of explainable artificial intelligence (AI). Casting both the human and learner as cognitive models provides common mechanisms to compare and understand their underlying decision-making processes. This common grounding allows us to identify divergences and explain the learner's behavior in human understandable terms. We present novel salience techniques that highlight the most relevant features in each model's decision- (...), as well as examples of this technique in common training environments such as Starcraft II and an OpenAI gridworld.
  21.
    Processing speed enhances model-based over model-free reinforcement learning in the presence of high working memory functioning. Daniel J. Schad, Elisabeth Jünger, Miriam Sebold, Maria Garbusow, Nadine Bernhardt, Amir-Homayoun Javadi, Ulrich S. Zimmermann, Michael N. Smolka, Andreas Heinz, Michael A. Rapp & Quentin J. M. Huys - 2014 - Frontiers in Psychology 5:117016.
    Theories of decision-making and its neural substrates have long assumed the existence of two distinct and competing valuation systems, variously described as goal-directed vs. habitual, or, more recently and based on statistical arguments, as model-free vs. model-based reinforcement-learning. Though both have been shown to control choices, the cognitive abilities associated with these systems are under ongoing investigation. Here we examine the link to cognitive abilities, and find that individual differences in processing speed covary with a shift from model-free (...)
    3 citations
  22.
    Averaged Soft Actor-Critic for Deep Reinforcement Learning. Feng Ding, Guanfeng Ma, Zhikui Chen, Jing Gao & Peng Li - 2021 - Complexity 2021:1-16.
    With the advent of the era of artificial intelligence, deep reinforcement learning (DRL) has achieved unprecedented success in high-dimensional and large-scale artificial intelligence tasks. However, the insecurity and instability of DRL algorithms have an important impact on their performance. The Soft Actor-Critic (SAC) algorithm uses advanced functions to update the policy and value networks to alleviate some of these problems. However, SAC still has some problems. In order to reduce the error caused by the overestimation of SAC, we propose (...)
  23.
    Action control, forward models and expected rewards: representations in reinforcement learning. Jami Pekkanen, Jesse Kuokkanen, Otto Lappi & Anna-Mari Rusanen - 2021 - Synthese 199 (5-6):14017-14033.
    The fundamental cognitive problem for active organisms is to decide what to do next in a changing environment. In this article, we analyze motor and action control in computational models that utilize reinforcement learning (RL) algorithms. In reinforcement learning, action control is governed by an action selection policy that maximizes the expected future reward in light of a predictive world model. In this paper we argue that RL provides a way to explicate the so-called action-oriented views of (...)
  24.
    Implicit learning of (boundedly) rational behaviour. Daniel John Zizzo - 2000 - Behavioral and Brain Sciences 23 (5):700-701.
    Stanovich & West's target article undervalues the power of implicit learning (particularly reinforcement learning). Implicit learning may allow the learning of more rational responses, and sometimes even generalisation of knowledge, in contexts where explicit, abstract knowledge proves of only limited value, such as for economic decision-making. Four other comments are made.
    1 citation
  25.
    Complying with norms. A neurocomputational exploration. Matteo Colombo - 2012 - Dissertation, University of Edinburgh
    The subject matter of this thesis can be summarized by a triplet of questions and answers. Showing what these questions and answers mean is, in essence, the goal of my project. The triplet goes like this: Q: How can we make progress in our understanding of social norms and norm compliance? A: Adopting a neurocomputational framework is one effective way to make progress in our understanding of social norms and norm compliance. Q: What could the neurocomputational mechanism of social norm (...)
  26. Shared decision-making and maternity care in the deep learning age: Acknowledging and overcoming inherited defeaters. Keith Begley, Cecily Begley & Valerie Smith - 2021 - Journal of Evaluation in Clinical Practice 27 (3):497–503.
    In recent years there has been an explosion of interest in Artificial Intelligence (AI) both in health care and academic philosophy. This has been due mainly to the rise of effective machine learning and deep learning algorithms, together with increases in data collection and processing power, which have made rapid progress in many areas. However, use of this technology has brought with it philosophical issues and practical problems, in particular, epistemic and ethical. In this paper the authors, with backgrounds in (...)
  27.
    Deep Learning Image Feature Recognition Algorithm for Judgment on the Rationality of Landscape Planning and Design. Bin Hu - 2021 - Complexity 2021:1-15.
    This paper uses an improved deep learning algorithm to judge the rationality of the design of landscape image feature recognition. The preprocessing of the image is proposed to enhance the data. The deficiencies in landscape feature extraction are further addressed based on the new model. Then, the two-stage training method of the model is used to solve the problems of long training time and convergence difficulties in deep learning. Innovative methods for zoning and segmentation training of landscape pattern features are (...)
  28. Superintelligence: Fears, Promises and Potentials. Ben Goertzel - 2015 - Journal of Evolution and Technology 25 (2):55-87.
    Oxford philosopher Nick Bostrom, in his recent and celebrated book Superintelligence, argues that advanced AI poses a potentially major existential risk to humanity, and that advanced AI development should be heavily regulated and perhaps even restricted to a small set of government-approved researchers. Bostrom's ideas and arguments are reviewed and explored in detail, and compared with the thinking of three other current thinkers on the nature and implications of AI: Eliezer Yudkowsky of the Machine Intelligence Research Institute; and David (...)
    4 citations
  29.
    Constructing a Learning Agent That Manipulates Its Own Rewards According to Environmental Conditions. 沼尾 正行 & 森山 甲一 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:676-683.
    The authors aim at constructing an agent that learns appropriate actions in a multi-agent environment with and without social dilemmas. For this aim, the agent must have a form of nonrationality that makes it give up its own profit when it should do so. Since there are many studies on rational learning that bring more and more profit, it is desirable to utilize them in constructing the agent. Therefore, we use a reward-handling manner that makes internal evaluation from the agent's (...)
  30. Collective and Individual Rationality: Some Episodes in the History of Economic Thought. Andy Denis - 2002 - Dissertation, City, University of London
    This thesis argues for the fundamental importance of the opposition between holistic and reductionistic world-views in economics. Both reductionism and holism may nevertheless underpin laissez-faire policy prescriptions. Scrutiny of the nature of the articulation between micro and macro levels in the writings of economists suggests that invisible hand theories play a key role in reconciling reductionist policy prescriptions with a holistic world. An examination of the prisoners' dilemma in game theory and Arrow's impossibility theorem in social choice (...)
  31.
    Algorithmic Accountability In the Making. Deborah G. Johnson - 2021 - Social Philosophy and Policy 38 (2):111-127.
    Algorithms are now routinely used in decision-making; they are potent components in decisions that affect the lives of individuals and the activities of public and private institutions. Although use of algorithms has many benefits, a number of problems have been identified with their use in certain domains, most notably in domains where safety and fairness are important. Awareness of these problems has generated public discourse calling for algorithmic accountability. However, the current discourse focuses largely on algorithms and their opacity. (...)
    1 citation
  32.
    Applying interdisciplinary models to design, planning, and policy-making. Julie Thompson Klein - 1990 - Knowledge, Technology & Policy 3 (4):29-55.
    The difficulty of handling complex problems has spawned challenges to the traditional paradigm of technical rationality in design, planning, and policy making. One of the most frequently proposed solutions is an interdisciplinary approach, though few writers have described the operational dynamics of such an approach. A global model of interdisciplinary problem-solving is presented based on the premise that the unity of the interdisciplinary approach derives from the creation of an intermediary process that relies on common language, shared information, (...)
  33. Algorithmic Profiling as a Source of Hermeneutical Injustice. Silvia Milano & Carina Prunkl - forthcoming - Philosophical Studies:1-19.
    It is well-established that algorithms can be instruments of injustice. It is less frequently discussed, however, how current modes of AI deployment often make the very discovery of injustice difficult, if not impossible. In this article, we focus on the effects of algorithmic profiling on epistemic agency. We show how algorithmic profiling can give rise to epistemic injustice through the depletion of epistemic resources that are needed to interpret and evaluate certain experiences. By doing so, we not only demonstrate how (...)
    1 citation
  34.
    Rationalizable Irrationalities of Choice. Peter Dayan - 2014 - Topics in Cognitive Science 6 (2):204-228.
    Although seemingly irrational choice abounds, the rules governing these mis‐steps that might provide hints about the factors limiting normative behavior are unclear. We consider three experimental tasks, which probe different aspects of non‐normative choice under uncertainty. We argue for systematic statistical, algorithmic, and implementational sources of irrationality, including incomplete evaluation of long‐run future utilities, Pavlovian actions, and habits, together with computational and statistical noise and uncertainty. We suggest structural and functional adaptations that minimize their maladaptive effects.
    4 citations
  35.
    A Novel Modeling Technique for the Forecasting of Multiple-Asset Trading Volumes: Innovative Initial-Value-Problem Differential Equation Algorithms for Reinforcement Machine Learning. Mazin A. M. Al Janabi - 2022 - Complexity 2022:1-16.
    Liquidity risk arises from the inability to unwind or hedge trading positions at the prevailing market prices. The risk of liquidity is a wide and complex topic as it depends on several factors and causes. While much has been written on the subject, there exists no clear-cut mathematical description of the phenomena and typical market risk modeling methods fail to identify the effect of illiquidity risk. In this paper, we do not propose a definitive one either, but we attempt to (...)
  36. A lineage explanation of human normative guidance: the coadaptive model of instrumental rationality and shared intentionality. Ivan Gonzalez-Cabrera - 2022 - Synthese 200 (6):1-32.
    This paper aims to contribute to the existing literature on normative cognition by providing a lineage explanation of human social norm psychology. This approach builds upon theories of goal-directed behavioral control in the reinforcement learning and control literature, arguing that this form of control defines an important class of intentional normative mental states that are instrumental in nature. I defend the view that great ape capacities for instrumental reasoning and our capacity (or family of capacities) for shared intentionality coadapted (...)
    1 citation
  37.
    Universal Algorithmic Intelligence: A Mathematical Top-Down Approach. Marcus Hutter - 2006 - In Ben Goertzel & Cassio Pennachin (eds.), Artificial General Intelligence. Springer Verlag. pp. 227-290.
    Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline how the AIXI model can formally solve a number of problem (...)
    2 citations
  38.
    Reinforcement Learning by a GA Using Importance Sampling. Kimura Hajime & Tsuchiya Chikao - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:1-10.
    Reinforcement Learning (RL) handles policy search problems: searching for a mapping from state space to action space. However, RL is based on gradient methods and, as such, cannot deal with problems that have a multimodal landscape. In contrast, although the Genetic Algorithm (GA) is promising for such problems, it seems unsuitable for policy search from the viewpoint of evaluation cost. Minimal Generation Gap (MGG), used as a generation-alternation model in GA, generates many offspring from two (...)
  39.
    Stop making sense of Bell’s theorem and nonlocality? Federico Laudisa - 2018 - European Journal for Philosophy of Science 8 (2):293-306.
    In a recent paper in Foundations of Physics, Stephen Boughn reinforces a view that is more widely shared in the foundations of quantum mechanics than it deserves, a view according to which quantum mechanics does not require nonlocality of any kind and the common interpretation of Bell’s theorem as a nonlocality result is based on a misunderstanding. In the present paper I argue that this view is based on an incorrect reading of the presuppositions of the (...)
    2 citations
  40.
    Rationalization is rational. Fiery Cushman - 2020 - Behavioral and Brain Sciences 43:1-69.
    Rationalization occurs when a person has performed an action and then concocts the beliefs and desires that would have made it rational. Then, people often adjust their own beliefs and desires to match the concocted ones. While many studies demonstrate rationalization, and a few theories describe its underlying cognitive mechanisms, we have little understanding of its function. Why is the mind designed to construct post hoc rationalizations of its behavior, and then to adopt them? This may accomplish an important (...)
    28 citations
  41.
    Occluded algorithms. Adam Burke - 2019 - Big Data and Society 6 (2).
    Two definitions of algorithm, their uses, and their implied models of computing in society, are reviewed. The first, termed the structural programming definition, aligns more with usage in computer science, and as the name suggests, the intellectual project of structured programming. The second, termed the systemic definition, is more informal and emerges from ethnographic observations of discussions of software in both professional and everyday settings. Specific examples of locating algorithms within modern codebases are shared, as well as code directly impacting (...)
    1 citation
  42.
    Evidence-Based Medicine as an Instrument for Rational Health Policy.Nikola Biller-Andorno, Reidar K. Lie & Ruud Ter Meulen - 2002 - Health Care Analysis 10 (3):261-275.
    This article tries to present a broad view of the values and ethical issues that are at stake in efforts to rationalize health policy on the basis of economic evaluations (like cost-effectiveness analysis) and randomized controlled clinical trials. Though such a rationalization is generally seen as an objective and `value free' process, moral values often play a hidden role, not only in the production of `evidence', but also in the way this evidence is used in policy making. For example, the definition of effectiveness of (...)
  43.  21
    Q-Learning with Dynamic Generation of the Search Space by GA (GA により探索空間の動的生成を行う Q 学習).Matsuno Fumitoshi & Ito Kazuyuki - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:510-520.
    Reinforcement learning has recently received much attention as a learning method for complicated systems, e.g., robot systems. It needs no prior knowledge and has a high capability for reactive and adaptive behaviors. However, an increase in the dimensionality of the action-state space makes learning difficult to accomplish. The existing reinforcement learning algorithms are effective only for simple tasks with relatively small action-state spaces. In this paper, we propose a new reinforcement learning algorithm: “Q-learning with Dynamic Structuring (...)
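    The abstract above builds on standard tabular Q-learning; the paper's contribution, GA-driven dynamic structuring of the action-state space, is not reproduced here. A minimal sketch of the baseline update the paper extends (the `corridor` toy environment and all parameter values are illustrative assumptions, not taken from the paper):

    ```python
    import random

    def q_learning(env_step, n_states, n_actions, episodes=2000,
                   alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=100):
        """Plain tabular Q-learning; the GA-based dynamic structuring
        described in the abstract is not reproduced here."""
        Q = [[0.0] * n_actions for _ in range(n_states)]
        for _ in range(episodes):
            s = 0
            for _ in range(max_steps):
                # epsilon-greedy exploration
                if random.random() < epsilon:
                    a = random.randrange(n_actions)
                else:
                    a = max(range(n_actions), key=lambda x: Q[s][x])
                s2, r, done = env_step(s, a)
                # one-step temporal-difference update
                Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
                s = s2
                if done:
                    break
        return Q

    # Illustrative 5-state corridor: action 1 moves right, 0 moves left;
    # entering the rightmost state gives reward 1 and ends the episode.
    def corridor(s, a):
        s2 = min(s + 1, 4) if a == 1 else max(s - 1, 0)
        return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

    random.seed(0)
    Q = q_learning(corridor, n_states=5, n_actions=2)
    greedy = [max(range(2), key=lambda a: Q[s][a]) for s in range(4)]
    ```

    On this toy problem the greedy policy learned for states 0–3 is to move right; the GA's role in the paper is to grow and restructure the state table itself when the space is too large to enumerate.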
  44. Fair, Transparent, and Accountable Algorithmic Decision-making Processes: The Premise, the Proposed Solutions, and the Open Challenges.Bruno Lepri, Nuria Oliver, Emmanuel Letouzé, Alex Pentland & Patrick Vinck - 2018 - Philosophy and Technology 31 (4):611-627.
    The combination of increased availability of large amounts of fine-grained human behavioral data and advances in machine learning is presiding over a growing reliance on algorithms to address complex societal problems. Algorithmic decision-making processes might lead to more objective and thus potentially fairer decisions than those made by humans who may be influenced by greed, prejudice, fatigue, or hunger. However, algorithmic decision-making has been criticized for its potential to enhance discrimination, information and power asymmetry, and opacity. In this (...)
  45.  46
    The rationality of different kinds of intuitive decision processes.Marc Jekel, Andreas Glöckner, Susann Fiedler & Arndt Bröder - 2012 - Synthese 189 (S1):147-160.
    Whereas classic work in judgment and decision making has focused on the deviation of intuition from rationality, more recent research has focused on the performance of intuition in real-world environments. Borrowing from both approaches, we investigate to which extent competing models of intuitive probabilistic decision making overlap with choices according to the axioms of probability theory and how accurate those models can be expected to perform in real-world environments. Specifically, we assessed to which extent heuristics, models implementing weighted (...)
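    Two of the model classes the abstract compares, non-compensatory heuristics and weighted-additive cue integration, can be sketched as follows (the cue values and validities are invented for illustration and are not the paper's actual model specifications):

    ```python
    def take_the_best(cues_a, cues_b, validities):
        """Non-compensatory heuristic: decide on the single most valid
        cue that discriminates between the two options."""
        for i in sorted(range(len(validities)), key=lambda i: -validities[i]):
            if cues_a[i] != cues_b[i]:
                return "A" if cues_a[i] > cues_b[i] else "B"
        return "tie"

    def weighted_additive(cues_a, cues_b, weights):
        """Compensatory model: weighted sum over all cue values."""
        score_a = sum(w * c for w, c in zip(weights, cues_a))
        score_b = sum(w * c for w, c in zip(weights, cues_b))
        return "A" if score_a > score_b else "B" if score_b > score_a else "tie"

    # Hypothetical cue pattern on which the two models disagree:
    # the most valid cue favours A, but the remaining cues jointly favour B.
    v = [0.8, 0.7, 0.6]            # cue validities, doubling as weights
    a, b = [1, 0, 0], [0, 1, 1]
    print(take_the_best(a, b, v))      # prints "A"
    print(weighted_additive(a, b, v))  # prints "B"
    ```

    Cases like this, where the heuristic and the compensatory model diverge, are exactly where their relative accuracy in real-world environments becomes an empirical question.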
  46. The Convergence of National Rational Self-Interest and Justice in Space Policy.Duncan Macintosh - 2023 - International Journal of Applied Philosophy 37 (1):87-106.
    How may nations protect their interests in space if its fragility makes military operations there self-defeating? This essay claims nations are in Prisoner's Dilemmas on the matter, and applies David Gauthier's theories about how it is rational to behave morally, that is, cooperatively, in such dilemmas. Currently space-faring nations should i) enter into cooperative space-sharing arrangements with other rational nations, ii) exclude (militarily, but with only terrestrial force) nations irrational or existentially opposed to other nations being in space, and iii) incentivize all nations (...)
  47.  11
    The combine will tell the truth: On precision agriculture and algorithmic rationality.Christopher Miles - 2019 - Big Data and Society 6 (1).
    Recent technological and methodological changes in farming have led to an emerging set of claims about the role of digital technology in food production. Known as precision agriculture, the integration of digital management and surveillance technologies in farming is normatively presented as a revolutionary transformation. Proponents contend that machine learning, Big Data, and automation will create more accurate, efficient, transparent, and environmentally friendly food production, staving off both food insecurity and ecological ruin. This article contributes a critique of these rhetorical (...)
  48.  8
    Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs.Finale Doshi-Velez, Joelle Pineau & Nicholas Roy - 2012 - Artificial Intelligence 187-188 (C):115-132.
  49. Learning to plan probabilistically from neural networks.R. Sun - unknown
    Different from existing reinforcement learning algorithms, which generate only reactive policies, and existing probabilistic planning algorithms, which require a substantial amount of a priori knowledge in order to plan, we devise a two-stage bottom-up learning-to-plan process, in which first reinforcement learning/dynamic programming is applied, without the use of a priori domain-specific knowledge, to acquire a reactive policy, and then explicit plans are extracted from the learned reactive policy (...)
     
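    The two-stage idea, first obtain a reactive policy and then read an explicit plan off it, can be sketched on a toy MDP. This sketch substitutes exact value iteration on a small known model for the paper's model-free reinforcement learning, and the 4-state chain is an invented example:

    ```python
    def value_iteration(P, R, gamma=0.9, tol=1e-8):
        """Stage 1 (stand-in): dynamic programming on a small known MDP
        yields a value function and hence a reactive greedy policy."""
        n_s, n_a = len(P), len(P[0])
        V = [0.0] * n_s
        while True:
            V2 = [max(R[s][a] + gamma * sum(p * V[s2]
                      for s2, p in enumerate(P[s][a]))
                      for a in range(n_a))
                  for s in range(n_s)]
            if max(abs(x - y) for x, y in zip(V, V2)) < tol:
                return V2
            V = V2

    def extract_plan(P, R, V, start, goal, gamma=0.9, limit=50):
        """Stage 2: unroll the greedy policy from `start` into an
        explicit open-loop action sequence, i.e. a plan."""
        plan, s = [], start
        while s != goal and len(plan) < limit:
            a = max(range(len(P[s])),
                    key=lambda a: R[s][a] + gamma *
                    sum(p * V[s2] for s2, p in enumerate(P[s][a])))
            plan.append(a)
            # follow the most likely successor state
            s = max(range(len(P[s][a])), key=lambda s2: P[s][a][s2])
        return plan

    # Invented 4-state chain: action 0 steps left, action 1 steps right;
    # only the transition from state 2 into the goal state 3 is rewarded.
    P = [[[0.0] * 4 for _ in range(2)] for _ in range(4)]
    for s in range(4):
        P[s][0][max(s - 1, 0)] = 1.0
        P[s][1][min(s + 1, 3)] = 1.0
    R = [[0.0, 0.0] for _ in range(4)]
    R[2][1] = 1.0

    V = value_iteration(P, R)
    plan = extract_plan(P, R, V, start=0, goal=3)  # [1, 1, 1]
    ```

    The extracted sequence is the "explicit plan" the abstract refers to: an open-loop artifact recovered from a purely reactive policy, here simply by following greedy actions through the most likely successors.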
  50. HCI Model with Learning Mechanism for Cooperative Design in Pervasive Computing Environment.Hong Liu, Bin Hu & Philip Moore - 2015 - Journal of Internet Technology 16.
    This paper presents a human-computer interaction model with a three-layer learning mechanism in a pervasive environment. We begin with a discussion of a number of important issues related to human-computer interaction, followed by a description of the architecture of a multi-agent cooperative design system for a pervasive computing environment. We present our proposed three-layer HCI model and introduce the group formation algorithm, which is predicated on a dynamic sharing niche technology. Finally, we explore the cooperative reinforcement learning and (...)