An orthonormal basis adaptation method for function approximation was developed and applied to reinforcement learning with multi-dimensional continuous state space. First, a basis used for linear function approximation of a control function is set to an orthonormal basis. Next, basis elements with small activities are replaced with other candidate elements as learning progresses. As this replacement is repeated, the number of basis elements with large activities increases. Example chaos control problems for multiple logistic maps were solved, demonstrating that the method for adapting an orthonormal basis can modify a basis while holding the orthonormality in accordance with changes in the environment to improve the performance of reinforcement learning and to eliminate the adverse effects of redundant noisy states.
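The replacement step described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the cosine basis, the function names, and the definition of "activity" as the magnitude of a projection coefficient are all assumptions made for illustration. The point it shows is that swapping one element of an orthonormal family for another element of the same family keeps the basis orthonormal.

```python
import math

def cosine_basis(k):
    """k-th element of an orthonormal cosine basis on [0, 1] (assumed basis)."""
    if k == 0:
        return lambda x: 1.0
    return lambda x: math.sqrt(2.0) * math.cos(k * math.pi * x)

def inner(f, g, n=2000):
    """Midpoint-rule approximation of the L2 inner product on [0, 1]."""
    return sum(f((i + 0.5) / n) * g((i + 0.5) / n) for i in range(n)) / n

def adapt(indices, candidates, target, threshold):
    """Replace low-activity basis indices with unused candidate indices.

    'Activity' is taken here to be |<target, phi_k>|, a hypothetical
    stand-in for whatever activity measure the paper defines.
    """
    pool = [c for c in candidates if c not in indices]
    result = []
    for k in indices:
        activity = abs(inner(target, cosine_basis(k)))
        if activity < threshold and pool:
            result.append(pool.pop(0))  # swap in the next candidate
        else:
            result.append(k)            # keep an active element
    return result

# Illustrative target that only excites elements 5 and 7.
target = lambda x: math.cos(5 * math.pi * x) + 0.5 * math.cos(7 * math.pi * x)
basis = adapt([0, 1, 2, 5], candidates=[6, 7, 8], target=target, threshold=1e-3)
```

In this toy run only element 5 is active, so the three inactive elements are replaced by the candidates in a single pass; repeating the pass would keep 5 and 7 and continue cycling out the inactive ones.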
Masahiro ISHIBA Hideki SATOH Takehiko KOBAYASHI
To obtain a high throughput for transmission control protocol (TCP) connections over the wireless links, we previously proposed a novel transmission power control method for code division multiple access (CDMA) packet communication systems. By using this transmission power control method, we developed a transmission power control method and a packet multiplexing method to transmit constant bit rate (CBR) and TCP packets over CDMA wireless systems. Our methods can guarantee the quality of service (QoS) for CBR connections and utilize bandwidth effectively without modifying the TCP protocol or using slot assignments. Evaluation of our methods by computer simulation showed that the proposed methods provide a near-maximum throughput and guarantee the packet loss ratio of CBR connections regardless of the number of connections.
Takeshi ONOMI Yoshinao MIZUGAKI Hideki SATOH Tsutomu YAMASHITA Koji NAKAJIMA
We present two types of ICF (INHIBIT Controlled by Fluxon) gates as the basic circuits of the phase-mode logic family, and fabricate an adder circuit. The experimental result demonstrates that the carry operation followed input pulses at up to 99 GHz. The performance of Josephson devices is improved by the use of junctions with high current density (Jc). We may use the high-Jc junctions without an external resistive shunt in the phase-mode logic circuits because of the reduction of the junction hysteresis. One of the ways to overcome the large area occupancy for geometric inductance is to utilize the effective inductance of a Josephson junction itself. We investigate a circuit construction with high-Jc inductor junctions, intrinsically overdamped junctions, and junction-type resistors for the compactness of circuit integration, and discuss various aspects of the circuit construction.
A state space compression method based on multivariate analysis was developed and applied to reinforcement learning for high-dimensional continuous state spaces. First, useful components in the state variables of an environment are extracted and meaningless ones are removed by using multiple regression analysis. Next, the state space of the environment is compressed by using principal component analysis so that only a few principal components can express the dynamics of the environment. Then, a basis of a feature space for function approximation is constructed based on orthonormal bases of the important principal components. A feature space is thus autonomously constructed without preliminary knowledge of the environment, and the environment is effectively expressed in the feature space. An example synchronization problem for multiple logistic maps was solved using this method, demonstrating that it solves the curse of dimensionality and exhibits high performance without suffering from disturbance states.
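The principal component step alone can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it omits the multiple regression stage, uses plain power iteration to find only the leading component, and the toy states and all function names are hypothetical.

```python
import math

def mean_vector(data):
    n, d = len(data), len(data[0])
    return [sum(row[j] for row in data) / n for j in range(d)]

def covariance(data):
    """Sample covariance matrix (divides by n) of a list of state vectors."""
    n, d = len(data), len(data[0])
    mu = mean_vector(data)
    centered = [[row[j] - mu[j] for j in range(d)] for row in data]
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]

def top_component(cov, iters=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def compress(data, v):
    """Project each mean-centred state onto the direction v."""
    mu = mean_vector(data)
    return [sum((row[j] - mu[j]) * v[j] for j in range(len(v))) for row in data]

# Toy states: two correlated variables plus one small disturbance variable.
states = [(float(i), 2.0 * i, 0.01 * (-1) ** i) for i in range(20)]
pc1 = top_component(covariance(states))
scores = compress(states, pc1)  # one number per state instead of three
```

Here the leading component lies almost entirely in the plane of the two correlated variables, so the disturbance variable contributes almost nothing to the compressed state.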
A robust routing algorithm was developed based on reinforcement learning that uses (1) reward-weighted principal component analysis, which compresses the state space of a network with a large number of nodes and eliminates the adverse effects of various types of attacks or disturbance noises, (2) activity-oriented index allocation, which adaptively constructs a basis that is used for approximating routing probabilities, and (3) newly developed space compression based on a potential model that reduces the space for routing probabilities. This algorithm takes all the network states into account and reduces the adverse effects of disturbance noises. The algorithm thus works well, and the frequencies of causing routing loops and falling to a local optimum are reduced even if the routing information is disturbed.
To accommodate best-effort multimedia Internet protocol (IP) connections in mobile environments, we introduced new criteria for TCP-friendliness and developed a control algorithm for the transient response and stability in the packet transmission rate. We improved the maximum throughput and QoS guaranteed congestion control algorithm (MAQS) by using these two solutions, and solved the following problems that Reno and conventional congestion control algorithms have: (1) network congestion cannot be avoided when the round-trip time (RTT) is short and the holding time is long, (2) the packet transmission rate of a long-RTT connection is small when it is multiplexed with short-RTT connections, (3) the packet transmission rate cannot be adjusted quickly when the channel capacity changes according to hand-off.
A function approximation based on an orthonormal wave function expansion in a complex space is derived. Although a probability density function (PDF) cannot always be expanded in an orthogonal series in a real space because a PDF is a positive real function, the function approximation can approximate an arbitrary PDF with high accuracy. It is applied to an actor-critic method of reinforcement learning to derive an optimal policy expressed by an arbitrary PDF in a continuous-action continuous-state environment. A chaos control problem and a PDF approximation problem are solved using the actor-critic method with the function approximation, and it is shown that the function approximation can approximate a PDF well and that the actor-critic method with the function approximation exhibits high performance.
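One way to read the idea above is that the PDF is represented as the squared modulus of a complex wave function expanded in an orthonormal basis, which keeps it non-negative and normalized by construction. The sketch below assumes exactly that reading; the Fourier basis on [0, 1], the coefficient values, and the function names are illustrative choices and are not taken from the paper.

```python
import cmath
import math

def normalize(coeffs):
    """Scale the coefficients so that sum |c_k|^2 = 1."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs]

def wave(coeffs, x):
    """psi(x) = sum_k c_k exp(2*pi*i*k*x), an orthonormal basis on [0, 1]."""
    return sum(c * cmath.exp(2j * math.pi * k * x) for k, c in enumerate(coeffs))

def pdf(coeffs, x):
    """Assumed representation: p(x) = |psi(x)|^2, non-negative by construction."""
    return abs(wave(coeffs, x)) ** 2

coeffs = normalize([1.0, 0.5 + 0.5j, -0.25j])
n = 1000
grid = [(i + 0.5) / n for i in range(n)]
total = sum(pdf(coeffs, x) for x in grid) / n  # midpoint-rule integral of p
```

Because the basis is orthonormal, the integral of p equals the sum of the squared coefficient magnitudes, so normalizing the coefficients is enough to normalize the PDF.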
We developed a distributed control algorithm to solve the problem of a trade-off between transient response and stability. We applied it to a congestion control algorithm for transmitting best-effort packets such as transmission control protocol (TCP) packets over the Internet. A new transmission power control algorithm suitable for transmitting best-effort packets over the wireless Internet was also developed using the distributed control algorithm. We showed that in a steady state, TCP connections can use the bandwidth efficiently over both wired and wireless Internet when the proposed control algorithms are used. The transient response was also evaluated and it was found that the packet transmission rate and the transmission power adjusted by the proposed control algorithms converge to a steady state faster than when adjusted by conventional control algorithms while maintaining the stability of network systems.
Multi-dimensional chaotic systems were reduced to low-dimensional space embedded equations (SEEs), and their macroscopic and statistical properties were investigated using eigen analysis of the moment vector equation (MVE) of the SEE. First, the state space of the target system was discretized into a finite discrete space. Next, an embedding from the discrete space to a low-dimensional discrete space was defined. The SEE of the target system was derived using the embedding. Finally, eigen analysis was applied to the MVE of the SEE to derive the properties of the target system. The geometric increase in the dimension of the MVE with the dimension of the target system was avoided by using the SEE. The pdfs of arbitrary elements in the target nonlinear system were derived without a reduction in accuracy due to dimension reduction. Moreover, since the dynamics of the system were expressed by the eigenvalues of the MVE, it was possible to identify multiple steady states, which cannot be done using numerical simulation. This approach can thus be used to analyze the macroscopic and statistical properties of multi-dimensional chaotic systems.
The author proposes a flow control scheme which derives the optimal packet transmission rate from the ACKs of the sent packets. The optimization is based on mathematical programming such as the extremal method and the least-squares method. The author proves that the proposed method is fair when the RTT and the packet length of each sender are the same. It is also shown that the sufficient condition for the proposed method to be optimal and stable generally holds true in packet networks. The performance is examined by computer simulations, and it is found that high throughput is obtained regardless of the network structure.
Hideki SATOH Masahiro ISHIBA Takehiko KOBAYASHI
We previously developed a novel transmission power control method for code-division multiple access (CDMA) wireless systems that is suitable for the transmission control protocol (TCP) and constant bit rate (CBR) connections. It allows each mobile terminal to send packets to arbitrary slots without negotiation or the use of the ALOHA protocol. It results in high bandwidth utilization for TCP connections without the need to modify the TCP protocol or use a snoop agent. In this paper, we improve our previously developed power control method so as to adapt itself to distance variations and instantaneous fluctuations in the received power due to fading. We show that the developed method enables efficient bandwidth utilization compared with the conventional power control technique under various conditions.
A method was developed for deriving the control input for a multi-dimensional discrete-time nonlinear system so that a performance index is approximately minimized. First, a moment vector equation (MVE) is derived; it is a multi-dimensional linear equation that approximates a nonlinear system in the whole domain of the system state and control input. Next, the performance index is approximated by using a quadratic form with respect to the moment vector. On the basis of the MVE and the quadratic form, an approximate optimal controller is derived by solving the linear quadratic optimal control problem. A bilinear optimal control problem and a mountain-car problem were solved using this method, and the solutions were nearly optimal.
We developed a priority queueing method that can keep the loss ratio of high-priority packets at a target value. It regulates the number of input low-priority packets when the queue length exceeds the threshold that is adjusted so as to minimize the Lyapunov function of the system. We showed that the number of calculations for the optimum threshold is sufficiently small and derived a sufficient condition for the proposed method to be robust against unknown load fluctuations. From the viewpoint of guaranteeing the loss ratio of high-priority packets, we showed, by computer simulation and theoretical analysis, that the proposed method is superior to conventional methods such as the fixed-threshold method and the pushout method.
We derived the design requirements that wireless systems and congestion control algorithms must satisfy to transmit best-effort Internet protocol (IP) packets over wireless systems. We proved that, if these requirements are satisfied, congestion control algorithms are robust against unfairness in the systems and can provide near-maximum throughputs in various environments. From the viewpoint of the design requirements, we investigated the effect of automatic repeat request (ARQ) on the throughputs of best-effort IP connections, and showed why ARQ can improve the throughputs while too large a number of retransmissions degrades them. We also investigated the effect of variance in packet transmission rates and clarified what kind of congestion control algorithm degrades the throughputs.
A method has been developed for deriving the approximate global optimum of a nonlinear objective function. First, the objective function is expanded into a linear equation for a moment vector, and the optimization problem is reduced to an eigen analysis problem in the wave coefficient space. Next, the process of the optimization is expressed using a Schrödinger-type equation, so global optimization is equivalent to eigen analysis of the Hamiltonian of a Schrödinger-type equation. Computer simulation of this method demonstrated that it produces a good approximation of the global optimum. An example optimization problem was solved using a Hamiltonian constructed by combining Hamiltonians for other optimization problems, demonstrating that various types of applications can be solved by combining simple Hamiltonians.
Moment matrix analysis (MMA) that can derive statistical properties of non-linear equations is presented in this paper. First, non-linear stochastic differential, or difference, equations are approximately expressed by simultaneous linear equations of moments defined at discrete events. Next, by eliminating higher order moments, the simultaneous linear equations are reduced to a linear vector equation of their coefficient matrix and a moment vector comprised of the moments of the system state. By computing the eigenvectors and eigenvalues of the coefficient matrix, we can analyze the moments, transient response, and spectrum of the system state. The behavior of Internet traffic was evaluated by using MMA and computer simulation, and it is shown that MMA is effective for evaluating simultaneous non-linear stochastic differential equations.
A moment matrix analysis (MMA) method can derive macroscopic statistical properties such as moments, response time, and power spectra of non-linear equations without solving the equations. MMA expands a non-linear equation into simultaneous linear equations of moments, and reduces it to a linear equation of their coefficient matrix and a moment vector. We can analyze the statistical properties from the eigenvalues and eigenvectors of the coefficient matrix. This paper presents (1) a systematic procedure to linearize non-linear equations and (2) an expansion of the previous work of MMA to derive the statistical properties of various non-linear equations. The statistical properties of the logistic map were evaluated by using MMA and computer simulation, and it is shown that the proposed systematic procedure was effective and that MMA could accurately approximate the statistical properties of the logistic map even though such a map had strong non-linearity.
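The expansion of a non-linear equation into linear equations of moments can be illustrated for the logistic map x_{n+1} = a x_n (1 - x_n). Expanding (1 - x)^k with the binomial theorem gives E[x_{n+1}^k] = a^k sum_j C(k, j) (-1)^j E[x_n^{k+j}], which is linear in the moments. The sketch below builds only that one row of coefficients; the function name is hypothetical, and the elimination of higher-order moments that closes the system is not shown because the abstract does not specify it.

```python
from math import comb

def moment_row(k, a):
    """Coefficients c_j with E[x_{n+1}^k] = sum_j c_j * E[x_n^j].

    Derived from x_{n+1}^k = a^k x^k (1 - x)^k for the logistic map;
    the row has 2k + 1 entries because moments up to order 2k appear.
    """
    row = [0.0] * (2 * k + 1)
    for j in range(k + 1):
        row[k + j] = (a ** k) * comb(k, j) * (-1) ** j
    return row

# Sanity check with a point mass at x, whose j-th moment is simply x**j.
a, x, k = 3.9, 0.3, 3
row = moment_row(k, a)
predicted = sum(c * x ** j for j, c in enumerate(row))
exact = (a * x * (1.0 - x)) ** k
```

The row for order k involves moments up to order 2k, which is why the hierarchy never closes on its own and some truncation of higher-order moments is needed before eigen analysis can be applied.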
Hideki SATOH Takehiko KOBAYASHI
We propose a novel control method for an unknown distributed system, and apply it to transmission power control in a code-division multiple access (CDMA) wireless system. Our proposed distributed control contains conventional transmission power control and packet transmission rate control for constant bit rate (CBR) and transmission control protocol (TCP) connections. Using theoretical analysis and computer simulations we show that our method for transmission power control allows high bandwidth utilization for both CBR and TCP connections, and that conventional power control, by contrast, does not make efficient use of bandwidth in TCP connections.
A macroscopic structure was analyzed for a system comprising multiple elements in which the dynamics is affected by their distribution. First, a nonlinear Boltzmann equation, which has an integration term with respect to the distribution of the elements, was derived. Next, the moment vector equation (MVE) for the Boltzmann equation was derived. The average probability density function (pdf) in a steady state was derived using eigen analysis of the coefficient matrix of the MVE. The macroscopic structure of the system and the mechanism that provides the average pdf and the transient response were then analyzed using eigen analysis. Evaluation of the average pdf and transient response showed that using eigen analysis is effective for analyzing not only the transient and stationary properties of the system but also the macroscopic structure and the mechanism providing the properties.
A method was developed for deriving the approximate global optimum of a nonlinear objective function with multiple local optima. The objective function is expanded into a linear wave coefficient equation, so the problem of maximizing the objective function is reduced to that of maximizing a quadratic function with respect to the wave coefficients. Because a wave function expressed by the wave coefficients is used in the algorithm for maximizing the quadratic function, the algorithm is equivalent to a full search algorithm, i.e., one that searches in parallel for the global optimum in the whole domain of definition. Therefore, the global optimum is always derived. The method was evaluated for various objective functions, and computer simulation showed that a good approximation of the global optimum for each objective function can always be obtained.