Baohang ZHANG Haichuan YANG Tao ZHENG Rong-Long WANG Shangce GAO
The equilibrium optimizer (EO) is a novel physics-based meta-heuristic optimization algorithm inspired by estimating dynamics and equilibrium states in controlled-volume mass balance models. As a stochastic optimization algorithm, EO inevitably produces duplicated solutions, which wastes valuable evaluation opportunities. In addition, an excessive number of duplicated solutions can increase the risk of the algorithm getting trapped in local optima. In this paper, an improved EO algorithm with a bis-population-based non-revisiting (BNR) mechanism, namely BEO, is proposed. It aims to eliminate duplicate solutions generated by the population during iterations, thus avoiding wasted evaluation opportunities. Furthermore, when a revisited solution is detected, the BNR mechanism activates its archive population learning mechanism, which uses the excellent genes in the historical information to help the algorithm generate a high-quality solution. This not only improves the population diversity of the algorithm but also helps it escape from local optima. Experimental findings on the IEEE CEC2017 benchmark demonstrate that the proposed BEO algorithm outperforms seven other representative meta-heuristic optimization techniques, including the original EO algorithm.
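As a rough illustration of the non-revisiting idea, the sketch below keeps an archive of already-evaluated solutions and replaces a revisited candidate before it consumes an evaluation. The `NonRevisitingArchive` class, the rounding precision used to decide that two solutions are the same, and the repair rule (Gaussian sampling around the best archived solution) are assumptions made for illustration; they are not the BNR mechanism of the paper.

```python
import random


class NonRevisitingArchive:
    """Remembers evaluated solutions so duplicates are not evaluated twice."""

    def __init__(self, decimals=6):
        self.decimals = decimals  # assumed precision for "same solution"
        self.visited = {}         # rounded solution -> fitness

    def key(self, x):
        return tuple(round(v, self.decimals) for v in x)

    def is_revisit(self, x):
        return self.key(x) in self.visited

    def record(self, x, fitness):
        self.visited[self.key(x)] = fitness

    def learn_from_archive(self, rng, step=0.1):
        """Hypothetical repair: sample near the best archived solution."""
        best = min(self.visited, key=self.visited.get)
        return [b + rng.gauss(0.0, step) for b in best]


def evaluate_without_revisits(candidates, objective, archive, rng):
    """Evaluate each candidate once; replace revisited ones before evaluating."""
    evaluations = 0
    for x in candidates:
        while archive.is_revisit(x):
            x = archive.learn_from_archive(rng)
        archive.record(x, objective(x))
        evaluations += 1
    return evaluations


def sphere(x):
    return sum(v * v for v in x)


rng = random.Random(0)
archive = NonRevisitingArchive()
population = [[1.0, 2.0], [1.0, 2.0], [0.5, -0.5]]  # second entry is a duplicate
n_evals = evaluate_without_revisits(population, sphere, archive, rng)
```

In this toy run the duplicated second candidate is replaced before evaluation, so all three evaluations are spent on distinct solutions.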
Yoshiharu YAMAGISHI Tatsuya KANEKO Megumi AKAI-KASAYA Tetsuya ASAI
Edge computing, which has been gaining attention in recent years, has many advantages, such as reducing the load on the cloud, not being affected by the communication environment, and providing excellent security. Therefore, many researchers have attempted to implement neural networks, a representative machine learning technique, in edge computing. Neural network processing can be divided into inference and learning; however, in contrast to inference, there has been little research on implementing learning in edge computing. This is because learning requires more memory and computation than inference, easily exceeding the resources available for edge computing. To overcome this problem, this research focuses on the optimizer, which is the heart of learning. In this paper, we introduce our new optimizer, hardware-oriented logarithmic momentum estimation (Holmes), which incorporates perspectives not found in existing optimizers by taking the characteristics and strengths of hardware into account. The performance of Holmes was evaluated by comparing it with other optimizers with respect to learning progress and convergence speed. Important aspects of hardware implementation, such as memory and operation requirements, are also discussed. The results show that Holmes is a good match for edge computing, with relatively low resource requirements and fast learning convergence. Holmes will help create an era in which advanced machine learning can be realized on edge computing.
Koji KAMMA Sarimu INOUE Toshikazu WADA
Pruning is an effective technique for reducing the computational complexity of Convolutional Neural Networks (CNNs) by removing redundant neurons (or weights). There are two types of pruning methods: holistic pruning and layer-wise pruning. The former selects the least important neuron from the entire model and prunes it. The latter conducts pruning layer by layer. Recently, some layer-wise methods have turned out to be effective for reducing the computational complexity of pruned models while preserving their accuracy. The difficulty of layer-wise pruning is how to adjust the pruning ratio (the ratio of neurons to be pruned) in each layer. Because CNNs typically have many layers composed of many neurons, it is inefficient to tune pruning ratios by hand. In this paper, we present Pruning Ratio Optimizer (PRO), a method that can be combined with layer-wise pruning methods for optimizing pruning ratios. The idea of PRO is to adjust pruning ratios based on how much impact pruning in each layer has on the outputs of the final layer. The experiments verify the effectiveness of PRO.
Yi LIU Wei QIN Jinhui ZHANG Mengmeng LI Qibin ZHENG Jichuan WANG
Multi-objective evolutionary algorithms are widely used in many engineering optimization problems and artificial intelligence applications. Ant lion optimizer is an outstanding evolutionary method, but two issues need to be solved to extend it to the multi-objective optimization field: one is how to update the Pareto archive, and the other is how to choose the elite and ant lions from the archive. We develop a novel multi-objective variant of ant lion optimizer in this paper. A new measure combining the Pareto dominance relation and distance information of individuals is put forward and used to tackle the first issue. The concept of time weight is developed to handle the second problem. In addition, a mutation operation is applied to solutions in the middle part of the archive to further improve its performance. Eleven functions, four other algorithms and four indicators are used to evaluate the new method. The results show that the proposed algorithm has better performance and lower time complexity.
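For readers unfamiliar with Pareto archives, the following minimal sketch shows an archive update based on the plain Pareto dominance relation for minimization. It is only an assumption-laden baseline: the measure proposed in the paper also uses distance information between individuals, which is not reproduced here, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, candidate):
    """Return a new archive that contains only mutually non-dominated vectors."""
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return list(archive)  # candidate is dominated or already present
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept


archive = []
for point in [(3.0, 3.0), (1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (2.5, 2.5)]:
    archive = update_archive(archive, point)
```

Here `(3.0, 3.0)` is evicted once `(2.0, 2.0)` arrives, and `(2.5, 2.5)` is rejected because an archive member already dominates it.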
Mengmeng LI Xiaoguang REN Yanzhen WANG Wei QIN Yi LIU
Feature selection is important for learning algorithms, and it is still an open problem. Antlion optimizer is an excellent nature-inspired method, but it does not work well for feature selection. This paper proposes a hybrid approach called Ant-Antlion Optimizer, which combines the smart behavior of antlions in the antlion optimizer with the powerful searching movement of ants in ant colony optimization. A mutation operator is also adopted to strengthen exploration ability. Comprehensive experiments on binary classification problems show that the proposed algorithm is superior to other state-of-the-art methods on four performance indicators.
Tomoyuki SASAKI Hidehiro NAKANO Arata MIYAUCHI Akira TAGUCHI
In this paper, we propose a new paradigm of deterministic PSO, named piecewise-linear particle swarm optimizer (PPSO). In PPSO, each particle has two search dynamics, a convergence mode and a divergence mode. The trajectory of each particle is switched between the two dynamics and is controlled by parameters. We analyze the convergence condition of each particle and investigate, through numerical experiments, the parameter conditions that allow particles to converge to an equilibrium point. We further compare the solving performance of PPSO with that of PSO. The results show that the solving performance of PPSO is substantially the same as or superior to that of PSO.
Tomoyuki SASAKI Hidehiro NAKANO Arata MIYAUCHI Akira TAGUCHI
Particle swarm optimizer network (PSON) is one of the multi-swarm PSOs. In PSON, a population is divided into multiple sub-PSOs, each of which searches the solution space independently. Although PSON has good solving performance, it may become trapped in a local optimum. In this paper, we introduce into PSON a dynamic stochastic network topology called “PSON with stochastic connection” (PSON-SC). In PSON-SC, each sub-PSO can be connected to the global best (gbest) information memory and refer to gbest stochastically. We show that the diversity of PSON-SC is higher than that of PSON, and confirm the effectiveness of PSON-SC through many numerical simulations.
Kosuke KATAYAMA Mizuki MOTOYOSHI Kyoya TAKANO Chen Yang LI Shuhei AMAKAWA Minoru FUJISHIMA
E-band communication is allocated to the frequency bands of 71-76 and 81-86GHz. Radio-frequency (RF) front-end components for E-band communication have been realized using compound semiconductor technology. To realize a CMOS LNA for E-band communication, we propose a gain-boosted cascode amplifier (GBCA) stage that simultaneously provides high gain and stability. Designing an LNA from scratch requires considerable time because the tuning of matching networks with consideration of the parasitic elements is complicated. In this paper, we model the characteristics of devices including the effects of their parasitic elements. Using these models, an optimizer can estimate the characteristic of a designed LNA precisely without electromagnetic simulations and gives us the design values of an LNA when the layout constraint is ignored. Starting from the values, a four-stage LNA with a GBCA stage is designed very easily even though the layout constraint is considered and fabricated by a 65nm LP CMOS process. The fabricated LNA is measured, and it is confirmed that it achieves 18.5GHz bandwidth and over 24.3dB gain with 50.6mW power consumption. This is the first LNA to achieve a gain bandwidth of over 300GHz in the E-band among the LNAs utilizing any kind of semiconductor technologies. In this paper, we have proved that CMOS technology, which is suitable for baseband and digital circuitry, is applicable to a communication system covering the entire E-band.
Xin MAN Takashi HORIYAMA Shinji KIMURA
Clock gating is supported by commercial tools as a power optimization feature based on the guard signal described in HDL (structural method). However, identifying control signals for gated registers is hard and designer-intensive work. In addition, since the clock gating cells also consume power, it is imperative to minimize the number of inserted clock gating cells and their switching activities for power optimization. In this paper, we propose an automatic multi-stage clock gating algorithm with an ILP (Integer Linear Programming) formulation, including clock gating control candidate extraction, constraint construction and optimum control signal selection. With multi-stage clock gating, unnecessary clock pulses to clock gating cells can be blocked by other clock gating cells, so that the switching activity of clock gating cells can be reduced. We find that any multi-stage control signals are also single-stage control signals, and any combination of signals can be selected from single-stage candidates. The proposed method can be applied to 3 or more cascaded stages. The multi-stage clock gating optimization problem is formulated as constraints in LP format for the selection of the cascaded clock-gating order of multi-stage candidate combinations, and a commercial ILP solver (IBM CPLEX) is applied to obtain the control signals for each register with minimum switching activity. Those signals are used to generate a gate-level description with guarded registers from the original design, and commercial synthesis and layout tools are applied to obtain the circuit with multi-stage clock gating. The proposed method is applied to a set of benchmark circuits and a Low Density Parity Check (LDPC) decoder (6.6k gates, 212 F.F.s), and actual power consumption is estimated using Synopsys NanoSim after layout. On average, a 31% actual power reduction has been obtained compared with the original designs with structural clock gating, and more than 10% improvement has been achieved for some circuits compared with the single-stage optimization method. CPU time for optimum multi-stage control selection is several seconds for up to 25k variables in LP format. By applying the proposed clock gating, area can also be reduced since the multiplexors controlling register inputs are eliminated.
Junqi ZHANG Lina NI Jing YAO Wei WANG Zheng TANG
Kennedy has proposed the bare bones particle swarm (BBPS), which eliminates the velocity formula and replaces it with a Gaussian sampling strategy that requires no parameter tuning. However, a delicate balance between exploitation and exploration is the key to the success of an optimizer. This paper first analyzes the sampling distribution in BBPS, based on which we propose an adaptive BBPS inspired by the cloud model (ACM-BBPS). The cloud model adaptively produces a different standard deviation of the Gaussian sampling for each particle according to the evolutionary state in the swarm, which provides an adaptive balance between exploitation and exploration on different objective functions. Meanwhile, the diversity of the swarm is further enhanced by the randomness of the cloud model itself. Experimental results show that the proposed ACM-BBPS achieves faster convergence speed and more accurate solutions than five other contenders on twenty-five unimodal, basic multimodal, extended multimodal and hybrid composition benchmark functions. The diversity enhancement by the randomness in the cloud model itself is also illustrated.
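The Gaussian sampling step that replaces the velocity formula can be sketched as follows, assuming the commonly cited form of bare bones PSO in which each dimension is drawn from a normal distribution centered midway between the personal best and the global best, with their distance as the standard deviation. The function name and the use of a single global best are assumptions for illustration; the cloud-model adaptation of ACM-BBPS is not shown.

```python
import random


def bare_bones_step(pbest, gbest, rng):
    """Sample a new position dimension by dimension, with no velocity term."""
    return [
        rng.gauss((p + g) / 2.0, abs(p - g))  # mean: midpoint, std: distance
        for p, g in zip(pbest, gbest)
    ]


rng = random.Random(1)
pbest = [2.0, -1.0, 0.5]
gbest = [0.0, -1.0, 0.5]
position = bare_bones_step(pbest, gbest, rng)
```

In dimensions where the personal best already equals the global best, the standard deviation is zero and the sample collapses onto that value. This fixed coupling between spread and distance is the kind of behavior that an adaptive standard deviation is meant to relax.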
Junqi ZHANG Lina NI Chen XIE Ying TAN Zheng TANG
This paper presents an adaptive magnification transformation based particle swarm optimizer (AMT-PSO) that provides an adaptive search strategy for each particle throughout the search process. Magnification transformation is a simple but very powerful mechanism, inspired by the use of a convex lens to see things more clearly. The essence of this transformation is to set a magnifier around an area of interest, so that the area can be inspected more carefully and precisely. An evolutionary factor, which utilizes the information of the population distribution in the particle swarm, is used as an index to adaptively tune the magnification scale factor for each particle in each dimension. Furthermore, a perturbation-based elitist learning strategy is utilized to help the swarm's best particle escape from local optima and explore potentially better regions. The AMT-PSO is evaluated on 15 unimodal and multimodal benchmark functions. The effects of the adaptive magnification transformation mechanism and the elitist learning strategy in AMT-PSO are studied. Results show that the adaptive magnification transformation mechanism provides the main contribution to the proposed AMT-PSO in terms of convergence speed and solution accuracy on four categories of benchmark test functions.
Katsuma ONO Kenya JIN'NO Toshimichi SAITO
This letter studies the application of the growing PSO to the design of DC-AC inverters. In this application, each particle corresponds to a set of circuit parameters and moves to solve a multi-objective problem involving the total harmonic distortion and the desired average power. The problem is described by a hybrid fitness consisting of an analog objective function, a criterion and digital logic. The PSO has a growing structure and dynamic acceleration parameters. Basic numerical experiments confirm the efficiency of the algorithm.
Junqi ZHANG Ying TAN Lina NI Chen XIE Zheng TANG
Particle swarm optimizer (PSO) is a stochastic global optimization technique based on a social interaction metaphor. Because of the complexity, dynamics and randomness involved in PSO, it is hard to theoretically analyze the mechanism on which PSO depends. Statistical results have shown that the probability distribution of PSO is a truncated triangle, with uniform probability across the middle that decreases on the sides. The "truncated triangle" is also called the "Maya pyramid" by Kennedy. However, very little is known regarding the sampling distribution of PSO in itself. In this paper, we theoretically analyze the "Maya pyramid" without any assumption and derive its computational formula, which is actually a hybrid uniform distribution that looks like a trapezoid and conforms with the statistical results. Based on the derived density function of the hybrid uniform distribution, the search strategy of PSO is defined and quantified to characterize the mechanism of the search strategy in PSO. In order to show the significance of these definitions based on the derived hybrid uniform distribution, the comparison between the defined search strategies of the classical linear decreasing weight based PSO and the canonical constricted PSO suggested by Clerc is illustrated and elaborated.
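To make the trapezoid shape concrete, the sketch below evaluates the density of a sum of two independently scaled uniform random numbers, `a*r1 + b*r2`. This is the standard result for such a sum and is offered only as a plausible reading of where the shape comes from: identifying `a` and `b` with the cognitive and social step widths of the PSO update (for example `c1*(pbest - x)` and `c2*(gbest - x)`, both assumed positive here) is an assumption, and the formula derived in the paper is not reproduced.

```python
def trapezoid_density(z, a, b):
    """Density of a*r1 + b*r2 for independent r1, r2 ~ U(0, 1), with a, b > 0."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch assumes positive widths a and b")
    lo, hi = min(a, b), max(a, b)
    if z < 0 or z > a + b:
        return 0.0
    if z < lo:
        return z / (a * b)        # rising edge
    if z <= hi:
        return 1.0 / hi           # flat top: uniform probability in the middle
    return (a + b - z) / (a * b)  # falling edge


# Sanity check: integrate the density with the midpoint rule.
a, b = 1.0, 3.0
n = 4000
width = (a + b) / n
total = sum(trapezoid_density((i + 0.5) * width, a, b) for i in range(n)) * width
```

The flat top has height `1 / max(a, b)` and the two sloped sides each span `min(a, b)`, which matches the description of uniform probability across the middle that decreases on the sides.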
Tzer-Shyong CHEN Feipei LAI Shu-Lin HWANG Rung-Ji SHANG
Abstract machine modelling is a technique used frequently in developing retargetable compilers. By translating abstract machine operations into target machine instructions, we can construct retargetable compilers. However, such a technique causes two problems. First, the code produced by the compilers is inefficient. Second, emitting efficient code makes the compilation time too long. In view of these two disadvantages, we apply a peephole optimizer (PO) in our retargetable compilers to perform code optimization. The peephole optimizer searches for adjacent instruction candidates in the intermediate code and replaces them with equivalent instructions of lower cost. Furthermore, the peephole description table consists of simple tree-rewriting rules, which are easily retargeted to different machines. We have also proposed a simple peephole pattern matching algorithm to reduce the peephole pattern matching time. The experiment indicates that the machine code generated by our compiler runs faster than that generated by the GNU C compiler (gcc).
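The idea of replacing adjacent instructions with cheaper equivalents can be sketched as a small two-instruction window rewriter. The instruction names, the two rules and the function names below are hypothetical; the paper's description table uses tree-rewriting rules and its own pattern matching algorithm, neither of which is reproduced here.

```python
def rule_push_pop(window):
    """push r; pop r  ->  (nothing)"""
    (op1, *args1), (op2, *args2) = window
    if op1 == "push" and op2 == "pop" and args1 == args2:
        return []
    return None


def rule_store_load(window):
    """store r, addr; load r, addr  ->  store r, addr"""
    first, second = window
    if first[0] == "store" and second[0] == "load" and first[1:] == second[1:]:
        return [first]
    return None


RULES = [rule_push_pop, rule_store_load]


def peephole(code):
    """Apply two-instruction rewrite rules until no rule matches."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        i = 0
        while i + 1 < len(code):
            for rule in RULES:
                replacement = rule(code[i:i + 2])
                if replacement is not None:
                    code[i:i + 2] = replacement  # rewrite in place, recheck here
                    changed = True
                    break
            else:
                i += 1  # no rule matched at this position
    return code


program = [
    ("store", "r1", "x"),
    ("load", "r1", "x"),
    ("push", "r2"),
    ("pop", "r2"),
    ("add", "r1", "r3"),
]
optimized = peephole(program)
```

Because every rule in this sketch shortens the code, the loop is guaranteed to terminate; a table with length-preserving rules would need an explicit cost check instead.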