Reinforcement Learning Algorithm - AI


jhebda
11-06-2002, 01:58 AM
I am working on coding a reinforcement learning algorithm and attempting to compare it with an optimal Q-learning algorithm using an epsilon-greedy approach. The first major problem I'm seeing, though, is that I can't find information online or in reference books about optimizing the alpha value in the following equation for each state.

In a world of many states, each state has a number of Q values associated with it, one per action. Assuming initial state i and next state j, the equation to update Q is:

Q[a,i] = Q[a,i](old) + alpha * (Reward received in state i + max over a' of Q[a',j] - Q[a,i](old))

This basically says that the new value of Q (where Q[a,i] is the utility of performing action a in state i) is equal to the old value plus a correction term, the difference between the new estimate and the old value, scaled by alpha.
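
To make the update concrete, here is a minimal Python sketch of a single step. The function name q_update, the dictionary keyed by (action, state), and the example states and actions are all made up for illustration. It follows the equation exactly as written above, so there is no discount factor on the max term.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha):
    """Apply one update of the rule above (no discount factor, as in the post)."""
    old = Q[(action, state)]
    # Best value obtainable from the next state, over all available actions.
    best_next = max(Q[(a, next_state)] for a in actions)
    Q[(action, state)] = old + alpha * (reward + best_next - old)
    return Q[(action, state)]


# Hypothetical two-action world; unseen Q values default to 0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("right", "j")] = 2.0

new_value = q_update(Q, "i", "left", 1.0, "j", actions, alpha=0.5)
print(new_value)  # 0 + 0.5 * (1 + 2 - 0) = 1.5
```

With alpha = 1 the old value is thrown away entirely, and with alpha = 0 nothing is learned, which is why the choice of alpha matters so much.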

Alpha should ideally start high, since we know nothing about the world of states at the beginning, and then decrease. However, I cannot find information on an optimal decrease rate for alpha. Has anybody worked with reinforcement learning and dealt with a similar problem before who could help me out?
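
For reference, one schedule that fits the "start high, then decrease" description is alpha = 1 / (number of times this action has been tried in this state), tracked separately for each state-action pair. The sketch below is only an illustration of that idea: the names visits and alpha_for are made up, and this is not claimed to be the optimal rate. It does satisfy the usual step-size conditions for tabular Q-learning to converge (the alphas sum to infinity while their squares sum to a finite value), though slower decays are often preferred in practice.

```python
from collections import defaultdict

# Hypothetical visit counter, keyed by (action, state) like the Q table.
visits = defaultdict(int)


def alpha_for(state, action):
    """Return 1/n, where n counts how often this pair has been updated."""
    visits[(action, state)] += 1
    return 1.0 / visits[(action, state)]


# Alpha for one pair starts at 1.0 and shrinks on each visit.
alphas = [alpha_for("i", "left") for _ in range(4)]
print(alphas)

# A pair that has never been visited still starts at 1.0.
other = alpha_for("j", "left")
print(other)
```

Because the count is per pair, rarely visited states keep a high alpha while frequently visited ones settle down.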

kmj
11-06-2002, 10:30 AM
Hmm. I know there was an article about Q-Learning online that I was looking at back when I took Distributed AI; I don't remember if it talked much about choosing an optimal alpha though...

Google's got a lot of results on Q-learning, but I didn't see the one that I had used. If I remember to, I'll take a look at my DAI book at home to see if it says anything more about it... I'm not too optimistic though.