Jae Won LEE, Sung-Dong KIM, Jongwoo LEE, Jinseok CHAE
This paper describes a stock trading system based on reinforcement learning, regarding the process of stock price changes as a Markov decision process (MDP). The system adopts two popular reinforcement learning algorithms, temporal-difference (TD) learning and Q-learning, for selecting stocks and optimizing trading parameters, respectively. Input features of the system are devised using technical analysis, and value functions are approximated by feedforward neural networks. Multiple cooperative agents are used in Q-learning to efficiently integrate global trend prediction with local trading strategy. Agents communicate with one another by sharing training episodes and learned policies, while keeping the overall scheme of conventional Q-learning. Experimental results on the Korean stock market show that our trading system outperforms the market average and makes appreciable profits. Furthermore, our system is superior to a system trained by supervised learning in terms of risk management.
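To make the method concrete, the following is a minimal sketch of Q-learning with a feedforward value-function approximator, in the spirit of the abstract. The state features (a normalized price window), the buy/hold/sell action set, the reward definition, and the network sizes are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8          # e.g., technical-analysis indicators (assumed)
ACTIONS = 3             # buy / hold / sell (assumed action set)
HIDDEN = 16
ALPHA, GAMMA, EPS = 0.01, 0.95, 0.1

# One-hidden-layer network: Q(s) = W2 @ tanh(W1 @ s + b1) + b2
W1 = rng.normal(scale=0.1, size=(HIDDEN, N_FEATURES))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(ACTIONS, HIDDEN))
b2 = np.zeros(ACTIONS)

def q_values(s):
    h = np.tanh(W1 @ s + b1)
    return W2 @ h + b2, h

def td_update(s, a, r, s_next):
    """One Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    err = (r + GAMMA * np.max(q_next)) - q[a]        # TD error
    # Gradient of q[a] w.r.t. the parameters (backprop through the tanh layer)
    dh = err * W2[a] * (1.0 - h ** 2)
    W2[a] += ALPHA * err * h
    b2[a] += ALPHA * err
    W1 += ALPHA * np.outer(dh, s)
    b1 += ALPHA * dh

# Toy episode over a synthetic price series; position-weighted return is the reward.
prices = np.cumsum(rng.normal(size=200)) + 100.0
for t in range(N_FEATURES, len(prices) - 1):
    s = prices[t - N_FEATURES:t] / prices[t] - 1.0            # assumed feature window
    s_next = prices[t + 1 - N_FEATURES:t + 1] / prices[t + 1] - 1.0
    q, _ = q_values(s)
    a = rng.integers(ACTIONS) if rng.random() < EPS else int(np.argmax(q))
    position = a - 1                                          # map action to {-1, 0, +1}
    r = position * (prices[t + 1] / prices[t] - 1.0)          # one-step trading return
    td_update(s, a, r, s_next)
```

In the paper's multi-agent setting, several such learners would share training episodes and learned policies; here only a single agent is shown.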
It is known that the schedulability of a non-preemptive task set with fixed priority can be determined in pseudo-polynomial time. However, since Rate Monotonic scheduling is not optimal for non-preemptive scheduling, the applicability of existing polynomial-time tests that provide sufficient schedulability conditions, such as Liu and Layland's bound, is limited. This letter proposes a new sufficient schedulability condition for non-preemptive fixed-priority scheduling that can be used with any fixed-priority assignment scheme. It is also shown that the proposed test has a tighter utilization bound than existing test methods.
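For context, the sketch below shows one classical utilization-style sufficient test of the kind the letter improves on: Liu and Layland's bound augmented with a blocking term for the longest lower-priority execution, reflecting that a running non-preemptive job cannot be preempted. This is a generic illustration, not the letter's proposed condition; the `Task` fields and priority ordering are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    c: float  # worst-case execution time
    t: float  # period (implicit deadline == period assumed)

def sufficient_test(tasks: list[Task]) -> bool:
    """Tasks must be sorted by priority, highest first (e.g., Rate Monotonic)."""
    for i, task in enumerate(tasks, start=1):
        # Blocking: a lower-priority job already running can delay task i
        # by up to its full execution time, since it cannot be preempted.
        blocking = max((lo.c for lo in tasks[i:]), default=0.0)
        util = sum(hi.c / hi.t for hi in tasks[:i]) + blocking / task.t
        if util > i * (2 ** (1 / i) - 1):   # Liu and Layland bound for i tasks
            return False                     # test inconclusive, not proven schedulable
    return True                              # sufficient condition met for all tasks

# Example: three tasks in Rate Monotonic priority order.
print(sufficient_test([Task(1, 10), Task(2, 20), Task(3, 50)]))  # True
```

A failed check here means only that this sufficient test is inconclusive; the task set may still pass an exact pseudo-polynomial analysis, which is the pessimism gap a tighter bound aims to shrink.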