Details

Participants: 1-2 Student(s)
Keywords: Monte-Carlo Tree Simulation, UCB1, Kalman Filtering, Strategy Game Algorithms, Reinforcement Learning

Task Description & Requirements

Investigate whether the Kalman filter can be employed for making Monte Carlo Tree Simulation more efficient. The idea is: The essential problem of complex strategic problems is the dimensionality of the search tree (branching factor). This problem is known as exploitation vs. exploration (EE), i.e. the question whether promising behavior should be investigated even deeper or whether unexplored areas of the search tree should be prioritized. The currently accepted best solution is the so-called UCB1 heuristic. However, the Kalman filter has properties that are similar to the EE situation: The next estimate is a combination of past & current findings (exploitation) as well as (mis-)trust in the findings (exploration). In the present task, the student(s) should investigate whether there is a practical application of this idea and, if yes, how good it performs w.r.t. to the UCB1 heuristic. For that, literature work is requires, but the main task is to implement the idea for a game with sufficient complexity and, ideally, partial information. Such environments are take from the course Strategy Game Programming (SGP) and provided by the supervisor. Requires good knowledge of the Java programming language and a general interest in strategic optimization as well as reinforcement learning. Students of the SGP course are preferred as they should have acquired all necessary skills already. If performed as a diploma thesis, must include an extensive quantitative evaluation.

Contact

For more information please contact Horst Eidenberger.