IEICE global.ieice.org Site

Author Search Result

[Author] Kazushi KAWAMURA(8hit)

1-8hit

Solving Constrained Slot Placement Problems Using an Ising Machine and Its Evaluations
Sho KANAMARU Kazushi KAWAMURA Shu TANAKA Yoshinori TOMITA Nozomu TOGAWA

PAPER-Fundamentals of Information Systems

Pubricized:
2020/11/09
Vol:
E104-D No:2
Page(s):
226-236
Ising machines have attracted attention, which is expected to obtain better solutions of various combinatorial optimization problems at high speed by mapping the problems to natural phenomena. A slot-placement problem is one of the combinatorial optimization problems, regarded as a quadratic assignment problem, which relates to the optimal logic-block placement in a digital circuit as well as optimal delivery planning. Here, we propose a mapping to the Ising model for solving a slot-placement problem with additional constraints, called a constrained slot-placement problem, where several item pairs must be placed within a given distance. Since the behavior of Ising machines is stochastic and we map the problem to the Ising model which uses the penalty method, the obtained solution does not always satisfy the slot-placement constraint, which is different from the conventional methods such as the conventional simulated annealing. To resolve the problem, we propose an interpretation method in which a feasible solution is generated by post-processing procedures. We measured the execution time of an Ising machine and compared the execution time of the simulated annealing in which solutions with almost the same accuracy are obtained. As a result, we found that the Ising machine is faster than the simulated annealing that we implemented.
A Floorplan-Driven High-Level Synthesis Algorithm for Multiplexer Reduction Targeting FPGA Designs
Koichi FUJIWARA Kazushi KAWAMURA Shin-ya ABE Masao YANAGISAWA Nozomu TOGAWA

PAPER

Vol:
E98-A No:7
Page(s):
1392-1405
Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications such as computerized stock tradings and reconfigurable network processings. In HLS for FPGA designs, we need to consider module floorplan and reduce multiplexer's cost concurrently. In this paper, we propose a floorplan-driven HLS algorithm for multiplexer reduction targeting FPGA designs. By utilizing distributed-register architectures called HDR, we can easily consider module floorplan in HLS. In order to reduce multiplexer's cost, we propose two novel binding methods called datapath-oriented scheduling/FU binding and datapath-oriented register binding. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the number of slices by up to 47% and latency by up to 22% compared with conventional approaches while the number of required control steps is almost the same.
A Hybrid Integer Encoding Method for Obtaining High-Quality Solutions of Quadratic Knapsack Problems on Solid-State Annealers
Satoru JIMBO Daiki OKONOGI Kota ANDO Thiem Van CHU Jaehoon YU Masato MOTOMURA Kazushi KAWAMURA

PAPER

Pubricized:
2022/05/26
Vol:
E105-D No:12
Page(s):
2019-2031
For formulating Quadratic Knapsack Problems (QKPs) into the form of Quadratic Unconstrained Binary Optimization (QUBO), it is necessary to introduce an integer variable, which converts and incorporates the knapsack capacity constraint into the overall energy function. In QUBO, this integer variable is encoded with auxiliary binary variables, and the encoding method used for it affects the behavior of Simulated Annealing (SA) significantly. For improving the efficiency of SA for QKP instances, this paper first visualized and analyzed their annealing processes encoded by conventional binary and unary encoding methods. Based on this analysis, we proposed a novel hybrid encoding (HE), getting the best of both worlds. The proposed HE obtained feasible solutions in the evaluation, outperforming the others in small- and medium-scale models.
Bi-Partitioning Based Multiplexer Network for Field-Data Extractors
Koki ITO Kazushi KAWAMURA Yutaka TAMIYA Masao YANAGISAWA Nozomu TOGAWA

LETTER

Vol:
E99-A No:7
Page(s):
1410-1414
An (M,N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using a multiplexer (MUX) network. It is used in packet analysis and/or stream data processing for video/audio data. In this letter, we propose an efficient MUX network for an (M,N)-field-data extractor. By bi-partitioning a simple MUX network into an upper one and a lower one, we can theoretically reduce the number of required MUXs without increasing the MUX network depth. Experimental results show that we can reduce the gate count by up to 92% compared to a naive approach.
A Fully-Parallel Annealing Algorithm with Autonomous Pinning Effect Control for Various Combinatorial Optimization Problems
Daiki OKONOGI Satoru JIMBO Kota ANDO Thiem Van CHU Jaehoon YU Masato MOTOMURA Kazushi KAWAMURA

PAPER

Pubricized:
2023/09/19
Vol:
E106-D No:12
Page(s):
1969-1978
Annealing computation has recently attracted attention as it can efficiently solve combinatorial optimization problems using an Ising spin-glass model. Stochastic cellular automata annealing (SCA) is a promising algorithm that can realize fast spin-update by utilizing its parallel computing capability. However, in SCA, pinning effect control to suppress the spin-flip probability is essential, making escaping from local minima more difficult than serial spin-update algorithms, depending on the problem. This paper proposes a novel approach called APC-SCA (Autonomous Pinning effect Control SCA), where the pinning effect can be controlled autonomously by focusing on individual spin-flip. The evaluation results using max-cut, N-queen, and traveling salesman problems demonstrate that APC-SCA can obtain better solutions than the original SCA that uses pinning effect control pre-optimized by a grid search. Especially in solving traveling salesman problems, we confirm that the tour distance obtained by APC-SCA is up to 56.3% closer to the best-known compared to the conventional approach.
Interconnection-Delay and Clock-Skew Estimate Modelings for Floorplan-Driven High-Level Synthesis Targeting FPGA Designs
Koichi FUJIWARA Kazushi KAWAMURA Masao YANAGISAWA Nozomu TOGAWA

PAPER

Vol:
E99-A No:7
Page(s):
1294-1310
Recently, high-level synthesis techniques for FPGA designs (FPGA-HLS techniques) are strongly required in various applications. Both interconnection delays and clock skews have a large impact on circuit performance implemented onto FPGA, which indicates the need for floorplan-driven FPGA-HLS algorithms considering them. To appropriately estimate interconnection delays and clock skews at HLS phase, a reasonable model to estimate them becomes essential. In this paper, we demonstrate several experiments to characterize interconnection delays and clock skews in FPGA and propose novel estimate models called “IDEF” and “CSEF”. In order to evaluate our models, we integrate them into a conventional floorplan-driven FPGA-HLS algorithm. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the latency by up to 22% compared with conventional approaches.
Efficient Multiplexer Networks for Field-Data Extractors and Their Evaluations
Koki ITO Kazushi KAWAMURA Yutaka TAMIYA Masao YANAGISAWA Nozomu TOGAWA

PAPER-VLSI Design Technology and CAD

Vol:
E100-A No:4
Page(s):
1015-1028
As seen in stream data processing, it is necessary to extract a particular data field from bulk data, where we can use a field-data extractor. Particularly, an (M,N)-field-data extractor reads out any consecutive N bytes from an M-byte register by connecting its input/output using multiplexers (MUXs). However, the number of required MUXs increases too much as the input/output byte widths increase. It is known that partitioning a MUX network leads to reducing the number of MUXs. In this paper, we firstly pick up a multi-layered MUX network, which is generated by repeatedly partitioning a MUX network into a collection of single-layered MUX networks. We show that the multi-layered MUX network is equivalent to the barrel shifter from which redundant MUXs and wires are removed, and we prove that the number of required MUXs becomes the smallest among MUX-network-partitioning based field-data extractors. Next, we propose a rotator-based MUX network for a field-data extractor, which is based on reading out a particular data in an input register to a rotator. The byte width of the rotator is the same as its output register and hence we no longer require any extra wires nor MUXs. By rotating the input data appropriately, we can finally have a right-ordered data into an output register. Experimental results show that a multi-layered MUX network reduces the number of required gates to construct a field-data extractor by up to 97.0% compared with the one using a naive approach and its delay becomes 1.8ns-2.3ns. A rotator-based MUX network with a control circuit also reduces the number of required gates to construct a field-data extractor by up to 97.3% compared with the one using a naive approach and its delay becomes 2.1ns-2.9ns.
A Thermal-Aware High-Level Synthesis Algorithm for RDR Architectures through Binding and Allocation
Kazushi KAWAMURA Masao YANAGISAWA Nozomu TOGAWA

PAPER-VLSI Design Technology and CAD

Vol:
E96-A No:1
Page(s):
312-321
With process technology scaling, a heat problem in ICs is becoming a serious issue. Since high temperature adversely impacts on reliability, design costs, and leakage power, it is necessary to incorporate thermal-aware synthesis into IC design flows. In particular, hot spots are serious concerns where a chip is locally too much heated and reducing the peak temperature inside a chip is very important. On the other hand, increasing the average interconnect delays is also becoming a serious issue. By using RDR architectures (Regular-Distributed-Register architectures), the interconnect delays can be easily estimated and their influence can be much reduced even in high-level synthesis. In this paper, we propose a thermal-aware high-level synthesis algorithm for RDR architectures. The RDR architecture divides the entire chip into islands and each island has uniform area. Our algorithm balances the energy consumption among islands through re-binding to functional units. By allocating some new additional functional units to vacant areas on islands, our algorithm further balances the energy consumption among islands and thus reduces the peak temperature. Experimental results demonstrate that our algorithm reduces the peak temperature by up to 9.1% compared with the conventional approach.

Author Search Result

[Author] Kazushi KAWAMURA(8hit)

Solving Constrained Slot Placement Problems Using an Ising Machine and Its Evaluations

A Floorplan-Driven High-Level Synthesis Algorithm for Multiplexer Reduction Targeting FPGA Designs

A Hybrid Integer Encoding Method for Obtaining High-Quality Solutions of Quadratic Knapsack Problems on Solid-State Annealers

Bi-Partitioning Based Multiplexer Network for Field-Data Extractors

A Fully-Parallel Annealing Algorithm with Autonomous Pinning Effect Control for Various Combinatorial Optimization Problems

Interconnection-Delay and Clock-Skew Estimate Modelings for Floorplan-Driven High-Level Synthesis Targeting FPGA Designs

Efficient Multiplexer Networks for Field-Data Extractors and Their Evaluations

A Thermal-Aware High-Level Synthesis Algorithm for RDR Architectures through Binding and Allocation

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles