Sohei SHIMOMAI Kei UEDA Shinji KIMURA
Recently, Quantum Annealing (QA) has attracted attention as an efficient algorithm for combinatorial optimization problems. In QA, the input data size becomes large and its reduction is important for accelerating by the hardware emulation since the usable memory size and its bandwidth are limited. The paper proposes the compression method of input sparse matrices for QA emulator. The proposed method uses the sparseness of the coefficient matrix and the reappearance of the same values. An independent table is introduced and data are compressed by the search and registration method of two consecutive data in the value table. The proposed method is applied to Traveling Salesman Problem (TSP) with 32, 64 and 96 cities and Nurse Scheduling Problem (NSP). The proposed method could reduce the amount of data by 1/40 for 96 city TSP and could manage 96 city TSP on the hardware emulator. When applied to NSP, we confirmed the effectiveness of the proposed method by the compression ratio ranging from 1/4 to 1/11.8. The data reduction is also useful for the simulation/emulation performance when using the compressed data directly and 1.9 times faster speed can be found on 96 city TSP than the CSR-based method.
We have realized a design automation platform of hardware accelerator for pairing operation over multiple elliptic curve parameters. Pairing operation is one of the fundamental operations to realize functional encryption. However, known as a computational complexity-heavy algorithm. Also because there have been not yet identified standard parameters, we need to choose curve parameters based on the required security level and affordable hardware resources. To explore this design optimization for each curve parameter is essential. In this research, we have realized an automated design platform for pairing hardware for such purposes. Optimization results show almost equivalent to those prior-art designs by hand.
Jiaxuan LU Yutaka MASUDA Tohru ISHIHARA
Approximate computing (AC) saves energy and improves performance by introducing approximation into computation in error-torrent applications. This work focuses on an AC strategy that accurately performs important computations and approximates others. In order to make AC circuits practical, we need to determine which computation is how important carefully, and thus need to appropriately approximate the redundant computation for maintaining the required computational quality. In this paper, we focus on the importance of computations at the flip-flop (FF) level and propose a novel importance evaluation methodology. The key idea of the proposed methodology is a two-step fault injection algorithm to extract the near-optimal set of redundant FFs in the circuit. In the first step, the proposed methodology performs the FI simulation for each FF and extracts the candidates of redundant FFs. Then, in the second step, the proposed methodology extracts the set of redundant FFs in a binary search manner. Thanks to the two-step strategy, the proposed algorithm reduces the complexity of architecture exploration from an exponential order to a linear order without understanding the functionality and behavior of the target application program. Experimental results show that the proposed methodology identifies the candidates of redundant FFs depending on the given constraints. In a case study of an image processing accelerator, the truncation for identified redundant FFs reduces the circuit area by 29.6% and saves power dissipation by 44.8% under the ASIC implementation while satisfying the PSNR constraint. Similarly, the dynamic power dissipation is saved by 47.2% under the FPGA implementation.
Shogo CHIWAKI Ryutaroh MATSUMOTO
Stabilizer-based quantum secret sharing has two methods to reconstruct a quantum secret: The erasure correcting procedure and the unitary procedure. It is known that the unitary procedure has a smaller circuit width. On the other hand, it is unknown which method has smaller depth and fewer circuit gates. In this letter, it is shown that the unitary procedure has smaller depth and fewer circuit gates than the erasure correcting procedure which follows a standard framework performing measurements and unitary operators according to the measurements outcomes, when the circuits are designed for quantum secret sharing using the [[5, 1, 3]] binary stabilizer code. The evaluation can be reversed if one discovers a better circuit for the erasure correcting procedure which does not follow the standard framework.
Yuta NAKAHARA Toshiyasu MATSUSHIMA
Previously, we proposed a probabilistic data generation model represented by an unobservable tree and a sequential updating method to calculate a posterior distribution over a set of trees. The set is called a meta-tree. In this paper, we propose a more efficient batch updating method.
In this work we propose a Bayesian version of the Nagaoka-Hayashi bound when estimating a parametric family of quantum states. This lower bound is a generalization of a recently proposed bound for point estimation to Bayesian estimation. We then show that the proposed lower bound can be efficiently computed as a semidefinite programming problem. As a lower bound, we also derive a Bayesian version of the Holevo-type bound from the Bayesian Nagaoka-Hayashi bound. Lastly, we prove that the new lower bound is tighter than the Bayesian quantum logarithmic derivative bounds.
Information-theoretic lower bounds of the Bayes risk have been investigated for a problem of parameter estimation in a Bayesian setting. Previous studies have proven the lower bound of the Bayes risk in a different manner and characterized the lower bound via different quantities such as mutual information, Sibson's α-mutual information, f-divergence, and Csiszár's f-informativity. In this paper, we introduce an inequality called a “meta-bound for lower bounds of the Bayes risk” and show that the previous results can be derived from this inequality.
In this paper, we introduce a framework of distributed orthogonal approximate message passing for recovering sparse vector based on sensing by multiple nodes. The iterative recovery process consists of local computation at each node, and global computation performed either by a particular node or joint computation on the overall network by exchanging messages. We then propose a method to reduce the communication cost between the nodes while maintaining the recovery performance.
Daisuke HIBINO Tomoharu SHIBUYA
Distributed computing is one of the powerful solutions for computational tasks that need the massive size of dataset. Lagrange coded computing (LCC), proposed by Yu et al. [15], realizes private and secure distributed computing under the existence of stragglers, malicious workers, and colluding workers by using an encoding polynomial. Since the encoding polynomial depends on a dataset, it must be updated every arrival of new dataset. Therefore, it is necessary to employ efficient algorithm to construct the encoding polynomial. In this paper, we propose Newton coded computing (NCC) which is based on Newton interpolation to construct the encoding polynomial. Let K, L, and T be the number of data, the length of each data, and the number of colluding workers, respectively. Then, the computational complexity for construction of an encoding polynomial is improved from O(L(K+T)log 2(K+T)log log (K+T)) for LCC to O(L(K+T)log (K+T)) for the proposed method. Furthermore, by applying the proposed method, the computational complexity for updating the encoding polynomial is improved from O(L(K+T)log 2(K+T)log log (K+T)) for LCC to O(L) for the proposed method.
Koshi SHIMADA Shota SAITO Toshiyasu MATSUSHIMA
The context tree model has the property that the occurrence probability of symbols is determined from a finite past sequence and is a broader class of sources that includes i.i.d. or Markov sources. This paper proposes a non-stationary source with context tree models that change from interval to interval. The Bayes code for this source requires weighting of the posterior probabilities of the context tree models and change points, so the computational complexity of it usually increases to exponential order. Therefore, the challenge is how to reduce the computational complexity. In this paper, we propose a special class of prior probability distribution of context tree models and change points and develop an efficient Bayes coding algorithm by combining two existing Bayes coding algorithms. The algorithm minimizes the Bayes risk function of the proposed source in this paper, and the computational complexity of the proposed algorithm is polynomial order. We investigate the behavior and performance of the proposed algorithm by conducting experiments.
Tomohiko UYEMATSU Tetsunao MATSUTA
This paper proposes three new information measures for individual sequences and clarifies their properties. Our new information measures are called as the non-overlapping max-entropy, the overlapping smooth max-entropy, and the non-overlapping smooth max-entropy, respectively. These measures are related to the fixed-length coding of individual sequences. We investigate these measures, and show the following three properties: (1) The non-overlapping max-entropy coincides with the topological entropy. (2) The overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the Ziv-entropy. (3) When an individual sequence is drawn from an ergodic source, the overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the entropy rate of the source. Further, we apply these information measures to the fixed-length coding of individual sequences, and propose some new universal coding schemes which are asymptotically optimum.
In this study, we consider the data compression with side information available at both the encoder and the decoder. The information source is assigned to a variable-length code that does not have to satisfy the prefix-free constraints. We define several classes of codes whose codeword lengths and error probabilities satisfy worse-case criteria in terms of side-information. As a main result, we investigate the exact first-order asymptotics with second-order bounds scaled as Θ(√n) as blocklength n increases under the regime of nonvanishing error probabilities. To get this result, we also derive its one-shot bounds by employing the cutoff operation.
Information-theoretic security and computational security are fundamental paradigms of security in the theory of cryptography. The two paradigms interact with each other but have shown different progress, which motivates us to explore the intersection between them. In this paper, we focus on Multi-Party Computation (MPC) because the security of MPC is formulated by simulation-based security, which originates from computational security, even if it requires information-theoretic security. We provide several equivalent formalizations of the security of MPC under a semi-honest model from the viewpoints of information theory and statistics. The interpretations of these variants are so natural that they support the other aspects of simulation-based security. Specifically, the variants based on conditional mutual information and sufficient statistics are interesting because security proofs for those variants can be given by information measures and factorization theorem, respectively. To exemplify this, we show several security proofs of BGW (Ben-Or, Goldwasser, Wigderson) protocols, which are basically proved by constructing a simulator.
In the field of machine learning security, as one of the attack surfaces especially for edge devices, the application of side-channel analysis such as correlation power/electromagnetic analysis (CPA/CEMA) is expanding. Aiming to evaluate the leakage resistance of neural network (NN) model parameters, i.e. weights and biases, we conducted a feasibility study of CPA/CEMA on floating-point (FP) operations, which are the basic operations of NNs. This paper proposes approaches to recover weights and biases using CPA/CEMA on multiplication and addition operations, respectively. It is essential to take into account the characteristics of the IEEE 754 representation in order to realize the recovery with high precision and efficiency. We show that CPA/CEMA on FP operations requires different approaches than traditional CPA/CEMA on cryptographic implementations such as the AES.
Ren TAKEUCHI Rikima MITSUHASHI Masakatsu NISHIGAKI Tetsushi OHKI
The war between cyber attackers and security analysts is gradually intensifying. Owing to the ease of obtaining and creating support tools, recent malware continues to diversify into variants and new species. This increases the burden on security analysts and hinders quick analysis. Identifying malware families is crucial for efficiently analyzing diversified malware; thus, numerous low-cost, general-purpose, deep-learning-based classification techniques have been proposed in recent years. Among these methods, malware images that represent binary features as images are often used. However, no models or architectures specific to malware classification have been proposed in previous studies. Herein, we conduct a detailed analysis of the behavior and structure of malware and focus on PE sections that capture the unique characteristics of malware. First, we validate the features of each PE section that can distinguish malware families. Then, we identify PE sections that contain adequate features to classify families. Further, we propose an ensemble learning-based classification method that combines features of highly discriminative PE sections to improve classification accuracy. The validation of two datasets confirms that the proposed method improves accuracy over the baseline, thereby emphasizing its importance.
Daisuke MAEDA Koki MORIMURA Shintaro NARISADA Kazuhide FUKUSHIMA Takashi NISHIDE
We propose how to homomorphically evaluate arbitrary univariate and bivariate integer functions such as division. A prior work proposed by Okada et al. (WISTP'18) uses polynomial evaluations such that the scheme is still compatible with the SIMD operations in BFV and BGV schemes, and is implemented with the input domain ℤ257. However, the scheme of Okada et al. requires the quadratic numbers of plaintext-ciphertext multiplications and ciphertext-ciphertext additions in the input domain size, and although these operations are more lightweight than the ciphertext-ciphertext multiplication, the quadratic complexity makes handling larger inputs quite inefficient. In this work, first we improve the prior work and also propose a new approach that exploits the packing method to handle the larger input domain size instead of enabling the SIMD operation, thus making it possible to work with the larger input domain size, e.g., ℤ215 in a reasonably efficient way. In addition, we show how to slightly extend the input domain size to ℤ216 with a relatively moderate overhead. Further we show another approach to handling the larger input domain size by using two ciphertexts to encrypt one integer plaintext and applying our techniques for uni/bivariate function evaluation. We implement the prior work of Okada et al., our improved version of Okada et al., and our new scheme in PALISADE with the input domain ℤ215, and confirm that the estimated run-times of the prior work and our improved version of the prior work are still about 117 days and 59 days respectively while our new scheme can be computed in 307 seconds.
Secure two-party computation is a cryptographic tool that enables two parties to compute a function jointly without revealing their inputs. It is known that any function can be realized in the correlated randomness (CR) model, where a trusted dealer distributes input-independent CR to the parties beforehand. Sometimes we can construct more efficient secure two-party protocol for a function g than that for a function f, where g is a restriction of f. However, it is not known in which case we can construct more efficient protocol for domain-restricted function. In this paper, we focus on the size of CR. We prove that we can construct more efficient protocol for a domain-restricted function when there is a “good” structure in CR space of a protocol for the original function, and show a unified way to construct a more efficient protocol in such case. In addition, we show two applications of the above result: The first application shows that some known techniques of reducing CR size for domain-restricted function can be derived in a unified way, and the second application shows that we can construct more efficient protocol than an existing one using our result.
Rikuhiro KOJIMA Jacob C. N. SCHULDT Goichiro HANAOKA
Multi-signatures have seen renewed interest due to their application to blockchains, e.g., BIP 340 (one of the Bitcoin improvement proposals), which has triggered the proposals of several new schemes with improved efficiency. However, many previous works have a “loose” security reduction (a large gap between the difficulty of the security assumption and breaking the scheme) or depend on strong idealized assumptions such as the algebraic group model (AGM). This makes the achieved level of security uncertain when instantiated in groups typically used in practice, and it becomes unclear for developers how secure a given scheme is for a given choice of security parameters. Thus, this leads to the question “what kind of schemes can we construct that achieves tight security based on standard assumptions?”. In this paper, we show a simple two-round tightly-secure pairing-based multi-signature scheme based on the computation Diffie-Hellman problem in the random oracle model. This proposal is the first two-round multi-signature scheme that achieves tight security based on a computational assumption and supports key aggregation. Furthermore, our scheme reduce the signature bit size by 19% compared with the shortest existing tightly-secure DDH-based multi-signature scheme. Moreover, we implemented our scheme in C++ and confirmed that it is efficient in practice; to complete the verification takes less than 1[ms] with a total (computational) signing time of 13[ms] for under 100 signers. The source code of the implementation is published as OSS.
Wang XU Yongliang MA Kehai CHEN Ming ZHOU Muyun YANG Tiejun ZHAO
Non-autoregressive generation has attracted more and more attention due to its fast decoding speed. Latent alignment objectives, such as CTC, are designed to capture the monotonic alignments between the predicted and output tokens, which have been used for machine translation and sentence summarization. However, our preliminary experiments revealed that CTC performs poorly on document abstractive summarization, where a high compression ratio between the input and output is involved. To address this issue, we conduct a theoretical analysis and propose Hierarchical Latent Alignment (HLA). The basic idea is a two-step alignment process: we first align the sentences in the input and output, and subsequently derive token-level alignment using CTC based on aligned sentences. We evaluate the effectiveness of our proposed approach on two widely used datasets XSUM and CNNDM. The results indicate that our proposed method exhibits remarkable scalability even when dealing with high compression ratios.
Yang YU Longlong LIU Ye ZHU Shixin CEN Yang LI
Pedestrian attribute recognition (PAR) aims to recognize a series of a person's semantic attributes, e.g., age, gender, which plays an important role in video surveillance. This paper proposes a multi-correlation graph convolutional network named MCGCN for PAR, which includes a semantic graph, visual graph, and synthesis graph. We construct a semantic graph by using attribute features with semantic constraints. A graph convolution is employed, based on prior knowledge of the dataset, to learn the semantic correlation. 2D features are projected onto visual graph nodes and each node corresponds to the feature region of each attribute group. Graph convolution is then utilized to learn regional correlation. The visual graph nodes are connected to the semantic graph nodes to form a synthesis graph. In the synthesis graph, regional and semantic correlation are embedded into each other through inter-graph edges, to guide each other's learning and to update the visual and semantic graph, thereby constructing semantic and regional correlation. On this basis, we use a better loss weighting strategy, the suit_polyloss, to address the imbalance of pedestrian attribute datasets. Experiments on three benchmark datasets show that the proposed approach achieves superior recognition performance compared to existing technologies, and achieves state-of-the-art performance.