Improvement of the Penalty Avoiding Rational Policy Making Algorithm and Its Application to the Othello Game

坪井 創吾 宮崎 和光


Transactions of the Japanese Society for Artificial Intelligence 17:548-556 (2002)

Abstract

In general, the purpose of reinforcement learning is to learn an optimal policy. However, in two-player games such as Othello, it is important to acquire a penalty-avoiding policy. In this paper, we focus on forming a penalty-avoiding policy based on the Penalty Avoiding Rational Policy Making algorithm [Miyazaki 01]. Applying it to large-scale problems runs into the curse of dimensionality. We introduce several ideas and heuristics to overcome the combinatorial explosion in large-scale problems. First, we propose an algorithm that saves memory by calculating state transitions. Second, we describe how to restrict exploration using two types of knowledge: a KIFU database and an evaluation function. We show that our learning player can always defeat the well-known Othello program KITTY.
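
As a rough illustration of what forming a penalty-avoiding policy can involve, the sketch below marks "penalty rules" and "penalty states" on a toy game graph. It is not the paper's implementation. The propagation rule used here (a state-action rule is a penalty rule if it can lead to a penalty state, and a state becomes a penalty state once all of its rules are penalty rules) is an assumed reading of the approach, and the names `find_penalty_rules`, `legal_actions`, `successors`, and the four-state toy game are hypothetical. Successor states are computed by a function instead of being stored in a table, loosely mirroring the abstract's idea of saving memory by calculating state transitions.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple

State = Hashable
Action = Hashable


def find_penalty_rules(
    states: List[State],
    legal_actions: Callable[[State], Iterable[Action]],
    successors: Callable[[State, Action], Iterable[State]],
    is_penalty: Callable[[State], bool],
) -> Tuple[Set[Tuple[State, Action]], Set[State]]:
    """Propagate penalties backwards until nothing changes (assumed rule)."""
    penalty_states: Set[State] = {s for s in states if is_penalty(s)}
    penalty_rules: Set[Tuple[State, Action]] = set()
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in penalty_states:
                continue
            acts = list(legal_actions(s))
            for a in acts:
                if (s, a) in penalty_rules:
                    continue
                # Transitions are computed on demand, not stored.
                if any(n in penalty_states for n in successors(s, a)):
                    penalty_rules.add((s, a))
                    changed = True
            # A state with at least one rule, all of them penalty rules,
            # is itself treated as a penalty state.
            if acts and all((s, a) in penalty_rules for a in acts):
                penalty_states.add(s)
                changed = True
    return penalty_rules, penalty_states


# Hypothetical toy game: from "mid" the opponent may force "lose".
TOY_ACTIONS = {"start": ["a", "b"], "mid": ["c"], "safe": [], "lose": []}
TOY_NEXT = {
    ("start", "a"): ["mid"],
    ("start", "b"): ["safe"],
    ("mid", "c"): ["lose", "safe"],
}

rules, bad_states = find_penalty_rules(
    states=["start", "mid", "safe", "lose"],
    legal_actions=lambda s: TOY_ACTIONS[s],
    successors=lambda s, a: TOY_NEXT[(s, a)],
    is_penalty=lambda s: s == "lose",
)

# A penalty-avoiding choice at "start" is any action not marked as a penalty rule.
safe_moves = [a for a in TOY_ACTIONS["start"] if ("start", a) not in rules]
```

In this toy game the only move left at `start` is `b`, since `a` leads to a state from which a loss cannot be ruled out.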


Keywords

reinforcement learning, reward and penalty, penalty avoiding rational policy making, the othello game, KITTY

DOI

10.1527/tjsai.17.548
