Taku YOSHIOKA Shin ISHII Minoru ITO
This article discusses automatic strategy acquisition for the game "Othello" based on a reinforcement learning scheme. In our approach, a computer player that initially knows only the rules of the game becomes stronger after playing several thousand games against another player. In each game, the computer player refines its evaluation function for the game state by min-max reinforcement learning (MMRL), a simple learning scheme that uses the min-max strategy. Since the state space of Othello is huge, we employ a normalized Gaussian network (NGnet) to represent the evaluation function. As a result, the computer player becomes strong enough to beat a player employing a heuristic strategy. Experiments on our Othello task show that MMRL outperforms TD(0) and that the NGnet outperforms a multi-layered perceptron.
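To make the role of the NGnet concrete, the following is a minimal sketch of how a normalized Gaussian network can map a state vector to a scalar evaluation. It assumes unnormalized isotropic Gaussian units with local linear outputs; the function name `ngnet_predict`, the parameter layout, and the toy inputs are illustrative assumptions and are not taken from the article, which may use a different parameterization and feature encoding of the board.

```python
import numpy as np

def ngnet_predict(x, centers, widths, weights, biases):
    """Evaluate a normalized Gaussian network at a single input x.

    Assumed (hypothetical) parameter layout:
      centers: (M, D) unit centres, widths: (M,) isotropic widths,
      weights: (M, D) and biases: (M,) define each unit's local linear output.
    Returns the scalar output and the normalized activations.
    """
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    log_g = -0.5 * sq_dist / widths ** 2
    log_g -= log_g.max()              # subtract the max for numerical stability
    g = np.exp(log_g)
    n = g / g.sum()                   # normalized activations, sum to 1
    local = weights @ x + biases      # local linear model of each unit
    return float(n @ local), n

# Toy usage with two units in a 2-D input space.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
widths = np.array([1.0, 1.0])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
biases = np.array([0.5, -0.5])
value, activations = ngnet_predict(np.array([5.0, 5.0]), centers, widths, weights, biases)
```

Because the activations are normalized, each unit's local linear model dominates near its own centre and the output blends smoothly between units elsewhere.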
This paper reviews the recent literature on occlusion handling in online visual tracking. The discussion first surveys the field of visual tracking and establishes the need for dedicated attention to the occlusion problem. The findings suggest that although occlusion detection improves tracking considerably, it has been largely ignored. The literature further shows that most of the research concentrates on human tracking and crowd analysis. This is followed by a novel taxonomy of the types of occlusion and the challenges that arise during and after an occlusion. The discussion then investigates approaches to handling occlusion on a frame-by-frame basis. The literature analysis reveals that researchers have examined every aspect of tracker design hypothesized to benefit robust tracking under occlusion. State-of-the-art solutions identified in the literature involve various camera settings, simplifying assumptions, appearance and motion models, target state representations, and observation models. The identified clusters are then analyzed and discussed, and their merits and demerits are explained. Finally, areas of potential for future research are presented.
Shigeyuki OBA Masa-aki SATO Shin ISHII
We propose two modifications of Gaussian processes for dealing with dynamic environments. One is a weight decay method that gradually forgets old data; the other is a time stamp method that regards the time course of the data as a Gaussian process. We present experimental results of applying these modifications to regression problems in dynamic environments. The weight decay method is found to follow environmental changes by automatically ignoring past data, and the time stamp method is found to predict linear changes.
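The idea of gradually forgetting old data can be illustrated with a small sketch. The version below is only one possible realization and is not necessarily the authors' formulation: it assumes an RBF kernel and inflates the observation noise of each sample by a factor of `1 / decay ** age`, so that older samples influence the posterior mean less. All names and default values (`gp_predict_with_forgetting`, `decay`, `noise_var`, `length_scale`) are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict_with_forgetting(x_train, y_train, ages, x_test,
                               noise_var=0.01, decay=0.9, length_scale=1.0):
    """GP posterior mean where older samples receive larger noise variance.

    Assumption for illustration: forgetting is modelled by dividing the
    noise variance by decay**age, so decay=1.0 recovers ordinary GP regression.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    ages = np.asarray(ages, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    K = rbf_kernel(x_train, x_train, length_scale)
    inflated_noise = noise_var / decay ** ages
    alpha = np.linalg.solve(K + np.diag(inflated_noise), y_train)
    return rbf_kernel(x_test, x_train, length_scale) @ alpha

# Two conflicting observations at the same input: an old one (y=0) and a new one (y=1).
pred_forget = gp_predict_with_forgetting([0.0, 0.0], [0.0, 1.0], [50, 0], [0.0], decay=0.9)
pred_plain = gp_predict_with_forgetting([0.0, 0.0], [0.0, 1.0], [50, 0], [0.0], decay=1.0)
```

With forgetting enabled the prediction follows the recent observation, whereas with `decay=1.0` the two conflicting observations are averaged.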
This article derives the ensemble average and the ensemble variance of the membrane potential of an integrate-and-fire neuron that receives random spikes from other neurons. The model assumes that EPSPs rise and fall continuously. Our theoretical results show good agreement with a numerical simulation.
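As a rough illustration of what such ensemble statistics look like, the sketch below applies Campbell's theorem to a simplified setting that is an assumption here, not the article's model: Poisson input spikes at a constant rate, an alpha-function EPSP that rises and falls continuously, linear summation, and no firing threshold or reset. Under these assumptions the stationary mean is the rate times the integral of the EPSP, and the variance is the rate times the integral of the squared EPSP. The function names and parameter values are hypothetical.

```python
import numpy as np

def alpha_epsp(t, amplitude=1.0, tau=5.0):
    """Alpha-function EPSP (assumed shape): zero at t=0, peak `amplitude` at t=tau."""
    return amplitude * (t / tau) * np.exp(1.0 - t / tau)

def campbell_moments(rate, amplitude=1.0, tau=5.0, t_max=200.0, dt=0.01):
    """Stationary mean and variance of summed EPSPs driven by Poisson spikes.

    Campbell's theorem: mean = rate * integral of h, variance = rate * integral of h**2.
    Integrals are approximated by a Riemann sum; rate is in spikes per ms, tau in ms.
    Threshold and reset of the integrate-and-fire neuron are ignored here.
    """
    t = np.arange(0.0, t_max, dt)
    h = alpha_epsp(t, amplitude, tau)
    mean = rate * np.sum(h) * dt
    var = rate * np.sum(h ** 2) * dt
    return mean, var

mean_v, var_v = campbell_moments(rate=0.5)

# Closed forms for the alpha function, for comparison:
# integral of h = amplitude * e * tau, integral of h**2 = amplitude**2 * e**2 * tau / 4.
mean_exact = 0.5 * np.e * 5.0
var_exact = 0.5 * np.e ** 2 * 5.0 / 4.0
```

The numerical integrals can be compared against the closed-form values in the last two lines as a sanity check on the kernel and step size.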