Kai IKUTA Jinya NAKAMURA Moriya NAKAMURA
In this paper, we investigated the overfitting characteristics of nonlinear equalizers based on an artificial neural network (ANN) and the Volterra series transfer function (VSTF), which were designed to compensate for optical nonlinear waveform distortion in optical fiber communication systems. Linear waveform distortion caused by, e.g., chromatic dispersion (CD) is commonly compensated by linear equalizers using digital signal processing (DSP) in digital coherent receivers. However, mitigation of nonlinear waveform distortion is considered to be one of the next important issues. An ANN-based nonlinear equalizer is one possible candidate for solving this problem. However, the risk of overfitting of ANNs is one obstacle in using the technology in practical applications. We evaluated and compared the overfitting of ANN- and conventional VSTF-based nonlinear equalizers used to compensate for optical nonlinear distortion. The equalizers were trained on repeated random bit sequences (RRBSs), while varying the length of the bit sequences. When the number of hidden-layer units of the ANN was as large as 100 or 1000, the overfitting characteristics were comparable to those of the VSTF. However, when the number of hidden-layer units was 10, which is usually enough to compensate for optical nonlinear distortion, the overfitting was weaker than that of the VSTF. Furthermore, we confirmed that even commonly used finite impulse response (FIR) filters showed overfitting to the RRBS when the length of the RRBS was equal to or shorter than the length of the tapped delay line of the filters. Conversely, when the RRBS used for the training was sufficiently longer than the tapped delay line, the overfitting could be suppressed, even when using an ANN-based nonlinear equalizer with 10 hidden-layer units.
Katsuya KOSUKEGAWA Kazuhiko KAWAMOTO
We considered the problem of forecasting the degradation recovery process of civil structures for prognosis and health management. In this process, structural health degrades over time but recovers when a maintenance intervention is performed. Maintenance interventions are typically recorded in terms of date and type. Such records can be represented as binary time series. Using binary maintenance intervention records, we forecast the process by using Long Short-Term Memory (LSTM). In this study, we experimentally examined how to feed binary time series data into LSTM. To this end, we compared the concatenation and reinitialization methods. The former is used to concatenate maintenance intervention records and health data and feed them into LSTM. The latter is used to reinitialize the LSTM internal memory when maintenance intervention is performed. The experimental results with the synthetic data revealed that the concatenation method outperformed the reinitialization method.
Noboru HAYASAKA Riku KASAI Takuya FUTAGAMI
In this paper, we propose a noise-robust scream detection method with the aim of expanding the scream detection system, a sound-based security system. The proposed method uses enhanced screams using Wave-U-Net, which was effective as a noise reduction method for noisy screams. However, the enhanced screams showed different frequency components from clean screams and erroneously emphasized frequency components similar to scream in noise. Therefore, Wave-U-Net was applied even in the process of training Gaussian mixture models, which are discriminators. We conducted detection experiments using the proposed method in various noise environments and determined that the false acceptance rate was reduced by an average of 2.1% or more compared with the conventional method.
Ryu ISHII Kyosuke YAMASHITA Zihao SONG Yusuke SAKAI Tadanori TERUYA Takahiro MATSUDA Goichiro HANAOKA Kanta MATSUURA Tsutomu MATSUMOTO
Fault-tolerant aggregate signature (FT-AS) is a special type of aggregate signature that is equipped with the functionality for tracing signers who generated invalid signatures in the case an aggregate signature is detected as invalid. In existing FT-AS schemes (whose tracing functionality requires multi-rounds), a verifier needs to send a feedback to an aggregator for efficiently tracing the invalid signer(s). However, in practice, if this feedback is not responded to the aggregator in a sufficiently fast and timely manner, the tracing process will fail. Therefore, it is important to estimate whether this feedback can be responded and received in time on a real system. In this work, we measure the total processing time required for the feedback by implementing an existing FT-AS scheme, and evaluate whether the scheme works without problems in real systems. Our experimental results show that the time required for the feedback is 605.3 ms for a typical parameter setting, which indicates that if the acceptable feedback time is significantly larger than a few hundred ms, the existing FT-AS scheme would effectively work in such systems. However, there are situations where such feedback time is not acceptable, in which case the existing FT-AS scheme cannot be used. Therefore, we further propose a novel FT-AS scheme that does not require any feedback. We also implement our new scheme and show that a feedback in this scheme is completely eliminated but the size of its aggregate signature (affecting the communication cost from the aggregator to the verifier) is 144.9 times larger than that of the existing FT-AS scheme (with feedbacks) for a typical parameter setting, and thus has a trade-off between the feedback waiting time and the communication cost from the verifier to the aggregator with the existing FT-AS scheme.
Adiabatic logic circuits are regarded as one of the most attractive solutions for low-power circuit design. This study is dedicated to optimizing the design of the Two-Level Adiabatic Logic (2LAL) circuit, which boasts a relatively simple structure and superior low-power performance among many asymptotically adiabatic or quasi-adiabatic logic families, but suffers from a large number of timing buffers for “decompute”. Our focus is on the “early decompute” technique for fully pipelined 2LAL, and we propose two ILP approaches for minimizing hardware cost through optimization of early decompute. In the first approach, the problem is formulated as a kind of scheduling problem, while it is reformulated as node selection problem (stable set problem). The performance of the proposed methods are evaluated using several benchmark circuits from ISCAS-85, and the maximum 70% hardware reduction is observed compared with an existing method.
Tomohiro NISHIGUCHI Nobutaka KUROKI Masahiro NUMA
This paper proposes multi-gate reconfigurable (RECON) cells and a technology remapping approach using them as spare cells for post-mask functional engineering change orders (ECOs). With the rapid increase in circuit complexity, ECOs often occur in the post-mask stage of LSI designs. To deal with post-mask ECOs at a low cost, only the metal layers are redesigned by making functional changes using spare cells. For this purpose, 2T/4T/6T-RECON cells were proposed as reconfigurable spare cells. However, conventional RECON cells are used to implement single functions, which may result in unused transistors in the cells. In addition, the number of 2T/4T/6T-RECON spare cells used for post-mask ECOs varies greatly depending on the circuit to be implemented and the type of ECO that occurs. Therefore, functional ECOs may fail due to a lack of certain types of RECON cells, even if other types of RECON cells remain. To solve this problem, we propose multi-gate RECON cells that implement multiple functions in a single RECON cell while retaining the layouts of conventional 4T/6T-RECON base cells, and a technology remapping approach using them. The proposed approach not only reduces the number of used spare cells for modifications but also allows the flexible use of spare cells to fix them with less increase in wire length and delay. Experimental results have confirmed that the functional ECO success ratio is increased by 4.8pt on average and the total number of used spare cells is reduced by 5.6% on average. It has also been confirmed that the increase in wire length is reduced by 17.4% on average and the decrease in slack is suppressed by 21.6% on average.
Masayoshi YOSHIMURA Atsuya TSUJIKAWA Toshinori HOSOKAWA
In recent years, to meet strict time-to-market constraints, it has become difficult for only one semiconductor design company to design a VLSI. Thus, design companies purchase IP cores from third-party IP vendors and design only the necessary parts. On the other hand, since IP cores have the disadvantage that copyright infringement can be easily performed, logic locking has to be applied to them. Functional logic locking methods using TTLock are resilient to SAT attacks however vulnerable to FALL attacks. Additionally, it is difficult to design logic locking based on TTLock at the gate level. This paper proposes a logic locking method, CRLock, based on SAT attack and FALL attack resistance at the register transfer level. The CRLock is a logic locking method for controllers at RTL in which the designer selects a protected input pattern and modifies the controller based on the protection input pattern. In experimental results, we applied CRLock to MCNC'91 benchmark circuits and showed that all circuits are resistant to SAT and FALL attacks.
In this work, template attacks that aimed to leak the nonce were performed on 256-bit ECDSA hardware to evaluate the resistance against side-channel attacks. The target hardware was an ASIC and was revealed to be vulnerable to the combination of template attacks and lattice attacks. Furthermore, the attack result indicated it was not enough to fix the MSB of the nonce to 1 which is a common countermeasure. Also, the success rate of template attacks was estimated by simulation. This estimation does not require actual hardware and enables us to test the security of the implementation in the design phase. To clarify the acceptable amount of the nonce leakage, the computational cost of lattice attacks was compared to that of ρ method which is a cryptanalysis method. As a result, the success rate of 2-bit leakage of the nonce must be under 62% in the case of 256-bit ECDSA. In other words, SNR must be under 2-4 in our simulation model.
Ryosuke MATSUO Shin-ichi MINATO
Logic circuits based on a photonic integrated circuit (PIC) have attracted significant interest due to their ultra-high-speed operation. However, they have a fundamental disadvantage that a large amount of the optical signal power is discarded in the path from the optical source to the optical output, which results in significant power consumption. This optical signal power loss is called a garbage output. To address this issue, this paper considers a circuit design without garbage outputs. Although a method for synthesizing an optical logic circuit without garbage outputs is proposed, this synthesis method can not obtain the optimal solution, such as a circuit with the minimum number of gates. This paper proposes a cross-bar gate logic (CBGL) as a new logic structure for optical logic circuits without garbage outputs, moreover enumerates the CBGLs with the minimum number of gates for all three input logic functions by an exhaustive search. Since the search space is vast, our enumeration algorithm incorporates a technique to prune it efficiently. Experimental results for all three-input logic functions demonstrate that the maximum number of gates required to implement the target function is five. In the best case, the number of gates in enumerated CBGLs is one-half compared to the existing method for optical logic circuits without garbage outputs.
We have realized a design automation platform of hardware accelerator for pairing operation over multiple elliptic curve parameters. Pairing operation is one of the fundamental operations to realize functional encryption. However, known as a computational complexity-heavy algorithm. Also because there have been not yet identified standard parameters, we need to choose curve parameters based on the required security level and affordable hardware resources. To explore this design optimization for each curve parameter is essential. In this research, we have realized an automated design platform for pairing hardware for such purposes. Optimization results show almost equivalent to those prior-art designs by hand.
Jiaxuan LU Yutaka MASUDA Tohru ISHIHARA
Approximate computing (AC) saves energy and improves performance by introducing approximation into computation in error-torrent applications. This work focuses on an AC strategy that accurately performs important computations and approximates others. In order to make AC circuits practical, we need to determine which computation is how important carefully, and thus need to appropriately approximate the redundant computation for maintaining the required computational quality. In this paper, we focus on the importance of computations at the flip-flop (FF) level and propose a novel importance evaluation methodology. The key idea of the proposed methodology is a two-step fault injection algorithm to extract the near-optimal set of redundant FFs in the circuit. In the first step, the proposed methodology performs the FI simulation for each FF and extracts the candidates of redundant FFs. Then, in the second step, the proposed methodology extracts the set of redundant FFs in a binary search manner. Thanks to the two-step strategy, the proposed algorithm reduces the complexity of architecture exploration from an exponential order to a linear order without understanding the functionality and behavior of the target application program. Experimental results show that the proposed methodology identifies the candidates of redundant FFs depending on the given constraints. In a case study of an image processing accelerator, the truncation for identified redundant FFs reduces the circuit area by 29.6% and saves power dissipation by 44.8% under the ASIC implementation while satisfying the PSNR constraint. Similarly, the dynamic power dissipation is saved by 47.2% under the FPGA implementation.
Minami SATO Sosuke MINAMOTO Ryuichi SAKAI Yasuyuki MURAKAMI
It is proven that many public-key cryptosystems would be broken by the quantum computer. The knapsack cryptosystem which is based on the subset sum problem has the potential to be a quantum-resistant cryptosystem. Murakami and Kasahara proposed a SOSI trapdoor sequence which is made by combining shifted-odd (SO) and super-increasing (SI) sequence in the modular knapsack cryptosystem. This paper firstly show that the key generation method could not achieve a secure density against the low-density attack. Second, we propose a high-density key generation method and confirmed that the proposed scheme is secure against the low-density attack.
In this work we propose a Bayesian version of the Nagaoka-Hayashi bound when estimating a parametric family of quantum states. This lower bound is a generalization of a recently proposed bound for point estimation to Bayesian estimation. We then show that the proposed lower bound can be efficiently computed as a semidefinite programming problem. As a lower bound, we also derive a Bayesian version of the Holevo-type bound from the Bayesian Nagaoka-Hayashi bound. Lastly, we prove that the new lower bound is tighter than the Bayesian quantum logarithmic derivative bounds.
Information-theoretic lower bounds of the Bayes risk have been investigated for a problem of parameter estimation in a Bayesian setting. Previous studies have proven the lower bound of the Bayes risk in a different manner and characterized the lower bound via different quantities such as mutual information, Sibson's α-mutual information, f-divergence, and Csiszár's f-informativity. In this paper, we introduce an inequality called a “meta-bound for lower bounds of the Bayes risk” and show that the previous results can be derived from this inequality.
In this paper, we introduce a framework of distributed orthogonal approximate message passing for recovering sparse vector based on sensing by multiple nodes. The iterative recovery process consists of local computation at each node, and global computation performed either by a particular node or joint computation on the overall network by exchanging messages. We then propose a method to reduce the communication cost between the nodes while maintaining the recovery performance.
Asahi MIZUKOSHI Ayano NAKAI-KASAI Tadashi WADAYAMA
This paper proposes the periodical successive over-relaxation (PSOR)-Jacobi algorithm for minimum mean squared error (MMSE) detection of multiple-input multiple-output (MIMO) signals. The proposed algorithm has the advantages of two conventional methods. One is the Jacobi method, which is an iterative method for solving linear equations and is suitable for parallel implementation. The Jacobi method is thus a promising candidate for high-speed simultaneous linear equation solvers for the MMSE detector. The other is the Chebyshev PSOR method, which has recently been shown to accelerate the convergence speed of linear fixed-point iterations. We compare the convergence performance of the PSOR-Jacobi algorithm with that of conventional algorithms via computer simulation. The results show that the PSOR-Jacobi algorithm achieves faster convergence without increasing computational complexity, and higher detection performance for a fixed number of iterations. This paper also proposes an efficient computation method of inverse matrices using the PSOR-Jacobi algorithm. The results of computer simulation show that the PSOR-Jacobi algorithm also accelerates the computation of inverse matrix.
Daisuke HIBINO Tomoharu SHIBUYA
Distributed computing is one of the powerful solutions for computational tasks that need the massive size of dataset. Lagrange coded computing (LCC), proposed by Yu et al. [15], realizes private and secure distributed computing under the existence of stragglers, malicious workers, and colluding workers by using an encoding polynomial. Since the encoding polynomial depends on a dataset, it must be updated every arrival of new dataset. Therefore, it is necessary to employ efficient algorithm to construct the encoding polynomial. In this paper, we propose Newton coded computing (NCC) which is based on Newton interpolation to construct the encoding polynomial. Let K, L, and T be the number of data, the length of each data, and the number of colluding workers, respectively. Then, the computational complexity for construction of an encoding polynomial is improved from O(L(K+T)log 2(K+T)log log (K+T)) for LCC to O(L(K+T)log (K+T)) for the proposed method. Furthermore, by applying the proposed method, the computational complexity for updating the encoding polynomial is improved from O(L(K+T)log 2(K+T)log log (K+T)) for LCC to O(L) for the proposed method.
A channel coding problem with cost constraint for general channels is considered. Verdú and Han derived ϵ-capacity for general channels. Following the same lines of its proof, we can also derive ϵ-capacity with cost constraint. In this paper, we derive a formula for ϵ-capacity with cost constraint allowing overrun. In order to prove this theorem, a new variation of Feinstein's lemma is applied to select codewords satisfying cost constraint and codewords not satisfying cost constraint.
Koshi SHIMADA Shota SAITO Toshiyasu MATSUSHIMA
The context tree model has the property that the occurrence probability of symbols is determined from a finite past sequence and is a broader class of sources that includes i.i.d. or Markov sources. This paper proposes a non-stationary source with context tree models that change from interval to interval. The Bayes code for this source requires weighting of the posterior probabilities of the context tree models and change points, so the computational complexity of it usually increases to exponential order. Therefore, the challenge is how to reduce the computational complexity. In this paper, we propose a special class of prior probability distribution of context tree models and change points and develop an efficient Bayes coding algorithm by combining two existing Bayes coding algorithms. The algorithm minimizes the Bayes risk function of the proposed source in this paper, and the computational complexity of the proposed algorithm is polynomial order. We investigate the behavior and performance of the proposed algorithm by conducting experiments.
The achievability part of the rate-distortion theorem is proved by showing existence of good codes. For i.i.d. sources, two methods showing existence are known; random coding and non-random coding. For general sources, however, no proof in which good codes are constructed with non-random coding is found. In this paper, with a non-random method of code construction, we prove the achievability part of the rate-distortion theorem for general sources. Moreover, we also prove a stochastic variation of the rate-distortion theorem with the same method.