From: Thorsten Kiefer on 9 Aug 2010 15:10

Hi folks,

I wrote this little applet: http://tokis-edv-service.de/index.php/beispiele/pole-balancing

My program uses Q(0) learning with Dynamic Programming. The state space is continuous, so I partitioned it to make Dynamic Programming possible. Unfortunately, it never learns to balance the pole. I think I implemented the Q(0) algorithm correctly. If anyone could have a look at my code and tell me what is wrong, I'd be very grateful.

Best wishes,
Thorsten
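For readers following along, here is a minimal sketch of what "partitioning a continuous state space" for one-step Q-learning can look like. It is not the applet's code: the bin edges in `BINS`, the function names `discretize`, `q_update`, and `choose_action`, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import bisect
import random
from collections import defaultdict

# Hypothetical bin edges for (x, x_dot, theta, theta_dot); not taken from the applet.
BINS = [
    [-0.8, 0.8],                      # cart position
    [-0.5, 0.5],                      # cart velocity
    [-0.10, -0.02, 0.0, 0.02, 0.10],  # pole angle (radians)
    [-0.87, 0.87],                    # pole angular velocity
]
N_ACTIONS = 2  # push left, push right


def discretize(state):
    """Map a continuous state to a tuple of bin indices (one cell of the partition)."""
    return tuple(bisect.bisect_right(edges, value) for edges, value in zip(BINS, state))


def new_q_table():
    """Q-table keyed by partition cell; unseen cells start at zero for every action."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def q_update(q, s, a, reward, s_next, done, alpha=0.5, gamma=0.95):
    """One-step Q-learning update: move Q(s, a) toward reward + gamma * max Q(s_next, .)."""
    target = reward if done else reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def choose_action(q, s, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q[s]
    return max(range(N_ACTIONS), key=values.__getitem__)


# Usage: one learning step after the pole falls (terminal transition, reward -1).
q = new_q_table()
s = discretize((0.0, 0.0, 0.01, 0.0))
q_update(q, s, 1, -1.0, s, done=True)
```

With a table like this, a partition that is too coarse or a badly chosen learning rate or discount factor can keep the agent from ever balancing the pole, so both the bin edges and the parameters are worth experimenting with.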
From: Thorsten Kiefer on 11 Aug 2010 19:05

OK, I fixed it. I found some minor bugs and played with the parameters. Please visit my applet again, run the training for about 5 minutes, and see the result. Please leave me a comment on how you like it.

Thanks and best wishes,
Thorsten
From: Thorsten Kiefer on 11 Aug 2010 19:08

I'd like to start a discussion about why I chose these parameters and how I found them. I'll also add some gimmicks to the applet.