Takayuki YATO Takahiro SETA Tsuyoshi ITO
Generalized Tsume-Shogi (GTS) is Tsume-Shogi on the board of size n n for arbitrary n. The problem to decide the existence of a winning sequence of moves (where the attacker must always check) on an instance of GTS was proved to be exptime-complete by Yokota et al. (2000). This paper considers the complexity of yozume problem of GTS, which is, roughly speaking, the problem whether a given position of GTS has a winning sequence other than given sequences (though the actual rule of yozume is more complicated). The detection of yozume is an important issue in designing Tsume-Shogi problems, since the modern designing rule strongly prohibits it. We define a function problem of GTS appropriately to formulate yozume problem as its Another Solution Problem (ASP; the problem to decide the existence of solutions other than given ones). Moreover, we extend the existing framework for investigating ASPs so that it can be applied to exptime-complete problems. In particular, since the decision of correctness of given winning sequences is not easy, we establish a framework to treat ASP of function problems with promises. On the basis of these results, we prove that the decision version of yozume problem of GTS is exptime-complete as a promise problem using the existing reduction which was constructed by Yokota et al. to prove the exptime-completeness of GTS.
An efficient computational Zero-Knowledge Proof of Knowledge whose security relies on the NP-completeness of the Independent Set Problem is presented here. The proposed algorithm is constructed from a bit commitment scheme based on the hardness of the Discrete Logarithm Problem, which guarantees the fulfillment of soundness, completeness and computational zero-knowledge properties, and allows avoiding the use of the Graph Isomorphism Problem, which is present in every known Zero-Knowledge Proofs for the Independent Set Problem.
The aim of this work is to investigate the possibility of designing zero-knowledge identification schemes based on hard-on-average problems. It includes a new two-party identification protocol whose security relies on a discrete mathematics problem classified as DistNP-Complete under the average-case analysis, the so-called Distributional Matrix Representability Problem. Thanks to the use of the search version of the mentioned problem, the zero-knowledge property is formally proved by black-box simulation, and consequently the security of the proposed scheme is actually guaranteed. Furthermore, with the proposal of a new zero-knowledge proof based on a problem never used before for this purpose, the set of tools for designing cryptographic applications is enlarged.
Hun-Chen CHEN Tian-Sheuan CHANG Jiun-In GUO Chein-Wei JEN
This paper presents a long length discrete Hartley transform (DHT) design with a new hardware efficient distributed arithmetic (DA) approach. The new DA design approach not only explores the constant property of coefficients as the conventional DA, but also exploits its cyclic property. To efficiently apply this approach to long length DHT, we first decompose the long length DHT algorithm to short ones using the prime factor algorithm (PFA), and further reformulate it by using Agarwal-Cooley algorithm such that all the partitioned short DHT still consists of the cyclic property. Besides, we also exploit the scheme of computation sharing on the content of ROM to reduce the hardware cost with the trade-off in slowing down the computing speeds. Comparing with the existing designs shows that the proposed design can reduce the area-delay product from 52% to 91% according to a 0.35 µm CMOS cell library.
This paper presents a novel method to speed up neural network (NN) based face detection systems. NN-based face detection can be viewed as a classification and search problem. The proposed method formulates the face search problem as an integer nonlinear optimization problem (INLP) and expands the basic particle swarm optimization (PSO) to handle it. PSO works with a population of particles, each representing a subwindow in an input image. The subwindows are evaluated by how well they match a NN based face filter. A face is indicated when the filter response of the best particle is above a given threshold. Experiments on a set of 42 test images show the effectiveness of the proposed approach. Moreover, the effect of PSO parameter settings on the search performance was investigated.
Satoshi INOUE Katsushi INOUE Akira ITO Yue WANG
For each positive integer r 1, a nondeterministic machine M is r path-bounded if for any input word x, there are r computation paths of M on x. This paper investigates the accepting powers of path-bounded one-way (simple) multihead nondeterministic finite automata. It is shown that for each k 2 and r 1, there is a language accepted by an (r + 1) path-bounded one-way nondeterministic k head finite automaton, but not accepted by any r path-bounded one-way nondeterministic k head finite automaton whether or not simple.
Huey-Min SUN Chia-Mei CHEN LihChyun SHU
In this study, we propose an object-based multimedia model for specifying the QoS (quality of service) requirements, such as the maximum data-dropping rate or the maximum data-delay rate. We also present a resource allocation model, called the net-profit model, in which the satisfaction of user's QoS requirements is measured by the benefit earned by the system. Based on the net-profit model, the system is rewarded if it can allocate enough resources to a multimedia delivery request and fulfill the QoS requirements specified by the user. At the same time, the system is penalized if it cannot allocate enough resources to a multimedia delivery request. We first investigate the problem of how to allocate resources efficiently, so that the QoS satisfaction is maximized. However, the net-profit may be distributed unevenly among the multimedia delivery requests. Thus, the second problem discusses how to allocate the resource efficiently so that the net-profit difference is minimized between any two multimedia requests. A dynamic programming based algorithm is proposed to find such an optimal solution with the minimum net-profit differences.
Ching-Lung CHR Szu-Lin SU Shao-Wei WU
A low-complexity step-by-step decoding algorithm for t-error-correcting binary Bose-Chaudhuri-Hocquenghem (BCH) codes is proposed. Using logical analysis, we obtained a simple rule which can directly determine whether a bit in the received word is correct. The computational complexity of this decoder is less than the conventional step-by-step decoding algorithm, since it reduces at least half of the matrix computations and the most complex element in the conventional step-by-step decoder is the "matrix-computing" element.
Herng-Jer LEE Ming-Hong LAI Chia-Chi CHU Wu-Shiung FENG
A new moment computation technique for general lumped R(L)C interconnect circuits with multiple resistor loops is proposed. Using the concept of tearing, a lumped R(L)C network can be partitioned into a spanning tree and several resistor links. The contributions of network moments from each tree and the corresponding links can be determined independently. By combining the conventional moment computation algorithms and the reduced ordered binary decision diagram (ROBDD), the proposed method can compute system moments efficiently. Experimental results have demonstrate that the proposed method can indeed obtain accurate moments and is more efficient than the conventional approach.
Nari TANABE Toshihiro FURUKAWA Kohichi SAKANIWA Shigeo TSUJII
We propose a practical blind channel identification algorithm based on the principal component analysis. The algorithm estimates (1) the channel order, (2) the noise variance, and then identifies (3) the channel impulse response, from the autocorrelation of the channel output signal without using the eigenvalue and singular-value decomposition. The special features of the proposed algorithm are (1) practical method to find the channel order and (2) reduction of computational complexity. Numerical examples show the effectiveness of the proposed algorithm.
A reaction-diffusion computer is a large-scale array of elementary processors, micro-volumes of chemical medium, which act, change their states determined by chemical reactions, concurrently and interact locally, via local diffusion of chemical species; it transforms data to results, both represented by concentration profiles of chemical species, by traveling and colliding waves in spatially extended chemical media. We show that reaction-diffusion processors, simulated or experimental, can solve a variety of tasks, including computational geometry, robot navigation, logics and arithmetics.
Efficient general secure multiparty computation (MPC) protocols were previously proposed, and the combination with the efficient auction circuits achieves the efficient sealed-bid auctions with the full privacy and correctness. However, the combination requires that each bidder submits ciphertexts of bits representing his bid, and their zero-knowledge proofs. This cost amounts to about 80 multi-exponentiations in usual case that the bid size is 20 bits (i.e. about 1,000,000 bid prices). This paper proposes sealed-bid auction protocols based on the efficient MPC protocols, where a bidder can submit only a single ciphertext. The bidder's cost is a few multi-exponentiations, and thus the proposed protocols are suitable for mobile bidders. A novel technique for the realization is a bit-slicing conversion by multiple servers, where a single ciphertext for a bid is securely converted into ciphertexts of bits representing the bid.
Elizabeth H. BLESZYNSKI Marek K. BLESZYNSKI Thomas JAROSZEWICZ
We describe elements of a fast integral equation solver for large periodic and partly periodic finite array systems. A key element of the algorithm is utilization (in a rigorous way) of a block-Toeplitz structure of the impedance matrix in conjunction with either conventional Method of Moments (MoM), Fast Multipole Method (FMM), or Fast Fourier Transform (FFT)-based Adaptive Integral Method (AIM) compression techniques. We refer to the resulting algorithms as the (block-)Toeplitz-MoM, (block-)Toeplitz-AIM, or (block-)Toeplitz-FMM algorithms. While the computational complexity of the Toeplitz-AIM and Toeplitz-FMM algorithms is comparable to that of their non-Toeplitz counterparts, they offer a very significant (about two orders of magnitude for problems of the order of five million unknowns) storage reduction. In particular, our comparisons demonstrate, that the Toeplitz-AIM algorithm offers significant advantages in problems of practical interest involving arrays with complex antenna elements. This result follows from the more favorable scaling of the Toeplitz-AIM algorithm for arrays characterized by large number of unknowns in a single array element and applicability of the AIM algorithm to problems requiring strongly sub-wavelength resolution.
Hochong PARK Younhee KIM Jisang YOO
The AMR wideband speech codec was recently developed for high-quality wideband speech communications. Although it has an excellent performance due to expanded bandwidth of speech signal, it requires a huge amount of computation especially in codebook search. To solve this problem, this paper proposes an efficient codebook search method for AMR wideband codec. Starting from a poorly performing initial codevector, the proposed method enhances the performance of the codevector iteratively by exchanging the worst pulse in the codevector with a better one after evaluating the role of each pulse. Simulations show that the AMR wideband codec adopting the proposed codebook search method provides better performance with much less computational load than that using the standard method.
Huabing ZHU Tony K.Y. CHAN Lizhe WANG Reginald C. JEGATHESE
This paper presents a prototype of a distributed 3D rendering system in a hierarchical Grid environment. 3D rendering with massive data sets is a computationally intensive task. In order to make full use of computational resources on Grids, a hierarchical system architecture is designed to run over multiple clusters. This architecture involves both sort-first and sort-last parallel rendering algorithms to achieve excellent scalability, rendering performance and load balance.
In this paper we apply a parallel adaptive solution algorithm to simulate nanoscale double-gate metal-oxide-semiconductor field effect transistors (MOSFETs) on a personal computer (PC)-based Linux cluster with the message passing interface (MPI) libraries. Based on a posteriori error estimation, the triangular mesh generation, the adaptive finite volume method, the monotone iterative method, and the parallel domain decomposition algorithm, a set of two-dimensional quantum correction hydrodynamic (HD) equations is solved numerically on our constructed cluster system. This parallel adaptive simulation methodology with 1-irregular mesh was successfully developed and applied to deep-submicron semiconductor device simulation in our recent work. A 10 nm n-type double-gate MOSFET is simulated with the developed parallel adaptive simulator. In terms of physical quantities and refined adaptive mesh, simulation results demonstrate very good accuracy and computational efficiency. Benchmark results, such as load-balancing, speedup, and parallel efficiency are achieved and exhibit excellent parallel performance. On a 16 nodes PC-based Linux cluster, the maximum difference among CPUs is less than 6%. A 12.8 times speedup and 80% parallel efficiency are simultaneously attained with respect to different simulation cases.
Taisuke SHIMAMOTO Tetsuo ASANO
This paper addresses the problem of arranging fewest possible probes to detect a hidden object in a specified region and presents a reasonable scheme for the purpose. Of special interest is the case where an object is a double-sided conic cylinder which represents the shape of the energy distribution of laser light used in the optical network. The performance of our scheme is evaluated by comparing the number of probes to that of an existing scheme, and our scheme shows a potential for reducing the number of probes. In other words, the time for detection is significantly reduced from a realistic point of view.
Takeshi OKAMOTO Hirofumi KATSUNO Eiji OKAMOTO
In this paper, we propose a fast signature scheme which realizes short transmissions and minimal on-line computation. Our scheme requires a modular exponentiation as preprocessing (i.e., off-line computation). However, we need to acknowledge the existance of the following remarkable properties: neither multiplication nor modular reduction is used in the actual signature generation (i.e., on-line computation). Our scheme requires only two operations: hashing and addition. Although some fast signature schemes with small on-line computation have been proposed so far, those schemes require multiplication or modular reduction in the on-line phase. This leads to a large amount of work compared to that of addition. As far as we know, this is the first approach to obtain the fast signature without those two calculus methods.
Yasuharu MIZUTANI Fumihiko INO Kenichi HAGIHARA
This paper describes the design and implementation of a testbed for predicting master/slave (M/S) programs written using Message Passing Interface (MPI) programs. The testbed, named M/S Emulator (MSE), aims at assisting developers in evaluating the performance of M/S programs and dynamic load-balancing strategies on clusters of PCs. In order to realize this, MSE predicts the communication time by using a realistic parallel computational model, an extension of the LogGPS model. This extended model improves the prediction accuracy on a large number of processors, because it captures the master's bottleneck: the overhead required for retrieving arrival messages from the slaves. Current MSE also employs a best effort emulation method for predicting the calculation time. In our experiments, MSE demonstrated an accurate prediction on clusters, especially on a larger number of nodes. Therefore, we believe that our extended model enables us to analyze the scalability of the M/S program performance.
Nageswara S.V. RAO William C. GRIMMELL Young-Cheol BANG Sridhar RADHAKRISHNAN
In the emerging networks, routing may be performed at various levels of the TCP/IP stack, such as datagram, TCP stream or application level, with possibly different message forwarding modes. We formulate an abstract quickest path problem for the transmission of a message of size σ from a source to a destination with the minimum end-to-end delay over a network with bandwidth and delay constraints on the links. We consider six modes for the message forwarding at the nodes reflecting the mechanisms such as circuit switching, store and forward, and their combinations. For each of first five modes, we present O( m2 + mn log n ) time algorithms to compute the quickest path for a given message size σ. For the last mode, the quickest path can be computed in O(m + n log n ) time.