Transactions of the Japanese Society for Artificial Intelligence (人工知能学会論文誌)
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Short Note
A Consideration of the Reinforcement Function in the Profit Sharing Method
Wataru Uemura (植村 渉), Shoji Tatsumi (辰巳 昭治)
Journal, free access

2004, Volume 19, Issue 4, pp. 197-203

Abstract

In this paper, we consider profit sharing, a reinforcement learning method. An agent learns a candidate solution to a problem from a reward that it receives from the environment if and only if it reaches the destination state. The function that distributes the received reward among the actions of the candidate solution is called the reinforcement function. In this learning system, the agent reinforces the set of selected actions when it receives the reward, but it should not reinforce detour actions. First, we propose a new constraint equation on reinforcement functions so that the reinforcement values are distributed to the non-detour actions. With a reinforcement function that satisfies this constraint equation, the agent can select the non-detour actions that lead to the destination state. Next, we show that the reinforcement function can be constant after the learning process that suppresses the selection of detour actions. Lastly, computer simulations of maze problems show that the learning performance of the agents does not depend on the size of the environment.
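
The credit assignment described above can be illustrated with a minimal Python sketch. It is not the constraint equation proposed in the paper, which the abstract does not state. It assumes a geometrically decreasing reinforcement function, a common choice in the profit sharing literature, and the names `geometric_reinforcement`, `reinforce_episode`, the `decay` value, and the toy episode are all hypothetical.

```python
from collections import defaultdict


def geometric_reinforcement(reward, steps_before_goal, decay=0.25):
    """Hypothetical reinforcement function (an assumption, not the paper's):
    credit shrinks geometrically with the number of steps between an
    action and the rewarded final action."""
    return reward * (decay ** steps_before_goal)


def reinforce_episode(weights, episode, reward, f=geometric_reinforcement):
    """Distribute `reward` over the (state, action) pairs of one episode.

    `episode` is ordered from the first step to the step that reached
    the destination state; reward is given only at that final step.
    """
    last = len(episode) - 1
    for t, (state, action) in enumerate(episode):
        weights[(state, action)] += f(reward, last - t)
    return weights


# Toy episode with a detour: from s1 the agent goes up to s2, comes back
# down to s1, and only then moves right into the destination state.
weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "up"), ("s2", "down"), ("s1", "right")]
reinforce_episode(weights, episode, reward=1.0)

# In state s1, the non-detour action ends up with the larger weight.
print(weights[("s1", "right")], weights[("s1", "up")])
```

In this sketch the detour action `up` in state `s1` still receives some credit, but less than the non-detour action `right`, so a weight-proportional or greedy action selection in `s1` would favor the direct move. Whether a given reinforcement function keeps detour actions suppressed in general is exactly what a constraint on the function has to guarantee.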

© 2004 JSAI (The Japanese Society for Artificial Intelligence)