The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] PU(3318hit)

881-900hit(3318hit)

  • Extended Algorithm for Solving Underdefined Multivariate Quadratic Equations

    Hiroyuki MIURA  Yasufumi HASHIMOTO  Tsuyoshi TAKAGI  

     
    PAPER-Cryptography and Information Security

      Vol:
    E97-A No:6
      Page(s):
    1418-1425

    It is well known that solving randomly chosen Multivariate Quadratic equations over a finite field (MQ-Problem) is NP-hard, and the security of Multivariate Public Key Cryptosystems (MPKCs) is based on the MQ-Problem. However, this problem can be solved efficiently when the number of unknowns n is sufficiently greater than that of equations m (This is called “Underdefined”). Indeed, the algorithm by Kipnis et al. (Eurocrypt'99) can solve the MQ-Problem over a finite field of even characteristic in a polynomial-time of n when n ≥ m(m+1). Therefore, it is important to estimate the hardness of the MQ-Problem to evaluate the security of Multivariate Public Key Cryptosystems. We propose an algorithm in this paper that can solve the MQ-Problem in a polynomial-time of n when n ≥ m(m+3)/2, which has a wider applicable range than that by Kipnis et al. We will also compare our proposed algorithm with other known algorithms. Moreover, we implemented this algorithm with Magma and solved the MQ-Problem of m=28 and n=504, and it takes 78.7 seconds on a common PC.

  • A Low-Cost Stimulus Design for Linearity Test in SAR ADCs

    An-Sheng CHAO  Cheng-Wu LIN  Hsin-Wen TING  Soon-Jyh CHANG  

     
    PAPER

      Vol:
    E97-C No:6
      Page(s):
    538-545

    The proposed stimulus design for linearity test is embedded in a differential successive approximation register analog-to-digital converter (SAR ADC), i.e. a design for testability (DFT). The proposed DFT is compatible to the pattern generator (PG) and output response analyzer (ORA) with the cost of 12.4-% area of the SAR ADC. The 10-bit SAR ADC prototype is verified in a 0.18-µm CMOS technology and the measured differential nonlinearity (DNL) error is between -0.386 and 0.281 LSB at 1-MS/s.

  • Throughput and Power Efficiency Evaluation of Block Ciphers on Kepler and GCN GPUs Using Micro-Benchmark Analysis

    Naoki NISHIKAWA  Keisuke IWAI  Hidema TANAKA  Takakazu KUROKAWA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E97-D No:6
      Page(s):
    1506-1515

    Computer systems with GPUs are expected to become a strong methodology for high-speed encryption processing. Moreover, power consumption has remained a primary deterrent for such processing on devices of all sizes. However, GPU vendors are currently announcing their future roadmaps of GPU architecture development: Nvidia Corp. promotes the Kepler architecture and AMD Corp. emphasizes the GCN architecture. Therefore, we evaluated throughput and power efficiency of three 128-bit block ciphers on GPUs with recent Nvidia Kepler and AMD GCN architectures. From our experiments, whereas the throughput and per-watt throughput of AES-128 on Radeon HD 7970 (2048 cores) with GCN architecture are 205.0Gbps and 1.3Gbps/Watt respectively, those on Geforce GTX 680 (1536 cores) with Kepler architecture are, respectively, 63.9Gbps and 0.43Gbps/W; an approximately 3.2 times throughput difference occurs between AES-128 on the two GPUs. Next, we investigate the reasons for the throughput difference using our micro-benchmark suites. According to the results, we speculate that to ameliorate Kepler GPUs as co-processor of block ciphers, the arithmetic and logical instructions must be improved in terms of software and hardware.

  • #P-hardness of Computing High Order Derivative and Its Logarithm

    Ei ANDO  

     
    LETTER

      Vol:
    E97-A No:6
      Page(s):
    1382-1384

    In this paper, we show a connection between #P and computing the (real) value of the high order derivative at the origin. Consider, as a problem instance, an integer b and a sufficiently often differentiable function F(x) that is given as a string. Then we consider computing the value F(b)(0) of the b-th derivative of F(x) at the origin. By showing a polynomial as an example, we show that we have FP = #P if we can compute log 2F(b)(0) up to certain precision. The previous statement holds even if F(x) is limited to a function that is analytic at any x ∈ R. It implies the hardness of computing the b-th value of a number sequence from the closed form of its generating function.

  • Expressing Algorithms as Concise as Possible via Computability Logic

    Keehang KWON  

     
    LETTER

      Vol:
    E97-A No:6
      Page(s):
    1385-1387

    This paper proposes a new approach to defining and expressing algorithms: the notion of task logical algorithms. This notion allows the user to define an algorithm for a task T as a set of agents who can collectively perform T. This notion considerably simplifies the algorithm development process and can be seen as an integration of the sequential pseudocode and logical algorithms. This observation requires some changes to algorithm development process. We propose a two-step approach: the first step is to define an algorithm for a task T via a set of agents that can collectively perform T. The second step is to translate these agents into (higher-order) computability logic.

  • An Effective Model of the Overshooting Effect for Multiple-Input Gates in Nanometer Technologies

    Li DING  Zhangcai HUANG  Atsushi KUROKAWA  Jing WANG  Yasuaki INOUE  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:5
      Page(s):
    1059-1074

    With the scaling of CMOS technology into the nanometer regime, the overshooting effect is more and more obvious and has a significant influence to gate delay and power consumption. Recently, researchers have already proposed the overshooting effect models for an inverter. However, the accurate overshooting effect model for multiple-input gates is seldom presented and the existing technology to reduce a multiple-input gate to an inverter is not useful when modeling the overshooting effect for multiple-input gates. Therefore, modeling the overshooting effect for multiple-input gates is proposed in this paper. Firstly, a formula-based model is presented for the overshooting time of 2-input NOR gate. Then, more complicated methods are given to calculate the overshooting time of 3-input NOR gate and other multiple-input gates. The proposed model is verified to have a good agreement, within 3.4% error margin, compared with SPICE simulation results using CMOS 32nm PTM model.

  • A High Output Resistance 1.2-V VDD Current Mirror with Deep Submicron Vertical MOSFETs

    Satoru TANOI  Tetsuo ENDOH  

     
    PAPER

      Vol:
    E97-C No:5
      Page(s):
    423-430

    A low VDD current mirror with deep sub-micron vertical MOSFETs is presented. The keys are new bias circuits to reduce both the minimum VDD for the operation and the sensitivity of the output current on VDD. In the simulation, our circuits reduce the minimum VDD by about 17% and the VDD sensitivity by one order both from those of the conventional. In the simulation with 90nm φ vertical MOSFET approximate models, our circuit shows about 4MΩ output resistance at 1.2-V VDD with the small temperature dependence, which is about six times as large as that with planar MOSFETs.

  • Influence of Si Surface Roughness on Electrical Characteristics of MOSFET with HfON Gate Insulator Formed by ECR Plasma Sputtering

    Dae-Hee HAN  Shun-ichiro OHMI  Tomoyuki SUWA  Philippe GAUBERT  Tadahiro OHMI  

     
    PAPER

      Vol:
    E97-C No:5
      Page(s):
    413-418

    To improve metal oxide semiconductor field effect transistors (MOSFET) performance, flat interface between gate insulator and silicon (Si) should be realized. In this paper, the influence of Si surface roughness on electrical characteristics of MOSFET with hafnium oxynitride (HfON) gate insulator formed by electron cyclotron resonance (ECR) plasma sputtering was investigated for the first time. The surface roughness of Si substrate was reduced by Ar/4.9%H2 annealing utilizing conventional rapid thermal annealing (RTA) system. The obtained root-mean-square (RMS) roughness was 0.07nm (without annealed: 0.18nm). The HfON was formed by 2nm-thick HfN deposition followed by the Ar/O2 plasma oxidation. The electrical properties of HfON gate insulator were improved by reducing Si surface roughness. It was found that the current drivability of fabricated nMOSFETs was remarkably increased by reducing Si surface roughness. Furthermore, the reduction of Si surface roughness also leads to decrease of the 1/f noise.

  • A Formal Verification of a Subset of Information-Based Access Control Based on Extended Weighted Pushdown System

    Pablo LAMILLA ALVAREZ  Yoshiaki TAKATA  

     
    PAPER-Formal Verification

      Vol:
    E97-D No:5
      Page(s):
    1149-1159

    Information-Based Access Control (IBAC) has been proposed as an improvement to History-Based Access Control (HBAC) model. In modern component-based systems, these access control models verify that all the code responsible for a security-sensitive operation is sufficiently authorized to execute that operation. The HBAC model, although safe, may incorrectly prevent the execution of operations that should be executed. The IBAC has been shown to be more precise than HBAC maintaining its safety level while allowing sufficiently authorized operations to be executed. However the verification problem of IBAC program has not been discussed. This paper presents a formal model for IBAC programs based on extended weighted pushdown systems (EWPDS). The mapping process between the IBAC original semantics and the EWPDS structure is described. Moreover, the verification problem for IBAC programs is discussed and several typical IBAC program examples using our model are implemented.

  • A Triple-Push Voltage Controlled Oscillator in 0.13-µm RFCMOS Technology Operating Near 177GHz

    Namhyung KIM  Kyungmin KIM  Jae-Sung RIEH  

     
    BRIEF PAPER

      Vol:
    E97-C No:5
      Page(s):
    444-447

    This paper presents a G-band triple-push voltage controlled oscillator (VCO) operating around 177GHz. The VCO, implemented in a commercial 0.13-µm RFCMOS technology, adopts a triple-push topology that is composed of 3 symmetrically coupled identical Colpitts sub-oscillators. Oscillation frequency can be tuned from 175.9GHz to 178.4GHz with varactor tuning voltage swept from 0 to 1.2V. The calibrated output power ranged from -19.7dBm to -16.6dBm depending on the oscillation frequency. The measured phase noise of the VCO is -80.2dBc/Hz at 1MHz offset. The results clearly demonstrate the possibility of applying triple-push topology for VCOs operating beyond 100GHz, enabling various high frequency applications that require tunable frequency sources.

  • A Novel Method of Deinterleaving Pulse Repetition Interval Modulated Sparse Sequences in Noisy Environments

    Mahmoud KESHAVARZI  Delaram AMIRI  Amir Mansour PEZESHK  Forouhar FARZANEH  

     
    LETTER-Digital Signal Processing

      Vol:
    E97-A No:5
      Page(s):
    1136-1139

    This letter presents a novel method based on sparsity, to solve the problem of deinterleaving pulse trains. The proposed method models the problem of deinterleaving pulse trains as an underdetermined system of linear equations. After determining the mixing matrix, we find sparsest solution of an underdetermined system of linear equations using basis pursuit denoising. This method is superior to previous ones in a number of aspects. First, spurious and missing pulses would not cause any performance reduction in the algorithm. Second, the algorithm works well despite the type of pulse repetition interval modulation that is used. Third, the proposed method is able to separate similar sources.

  • An Efficient Parallel SOVA-Based Turbo Decoder for Software Defined Radio on GPU

    Rongchun LI  Yong DOU  Jiaqing XU  Xin NIU  Shice NI  

     
    PAPER-Digital Signal Processing

      Vol:
    E97-A No:5
      Page(s):
    1027-1036

    In this paper, we propose a fully parallel Turbo decoder for Software-Defined Radio (SDR) on the Graphics Processing Unit (GPU) platform. Soft Output Viterbi algorithm (SOVA) is chosen for its low complexity and high throughput. The parallelism of SOVA is fully analyzed and the whole codeword is divided into multiple sub-codewords, where the turbo-pass decoding procedures are performed in parallel by independent sub-decoders. In each sub-decoder, an efficient initialization method is exploited to assure the bit error ratio (BER) performance. The sub-decoders are mapped to numerous blocks on the GPU. Several optimization methods are employed to enhance the throughput, such as the memory optimization, codeword packing scheme, and asynchronous data transfer. The experiment shows that our decoder has BER performance close to Max-Log-MAP and the peak throughput is 127.84Mbps, which is about two orders of magnitude faster than that of central processing unit (CPU) implementation, which is comparable to application-specific integrated circuit (ASIC) solutions. The presented decoder can achieve higher throughput than that of the existing fastest GPU-based implementation.

  • DOA and DOD Estimation Using Orthogonal Projection Approach for Bistatic MIMO Radars

    Ann-Chen CHANG  Chih-Chang SHEN  Kai-Shiang CHANG  

     
    LETTER-Digital Signal Processing

      Vol:
    E97-A No:5
      Page(s):
    1121-1124

    In this letter, the orthogonal projection (OP) estimation of the direction of arrival (DOA) and direction of departure (DOD) of multiple targets for bistatic multiple-input multiple-output radars is addressed. First, a two-dimensional direction finding estimator based on OP technique with automatic pairing is developed. Second, this letter also presents a modified reduced-dimension estimator by utilizing the characteristic of Kronecker product, which only performs two one-dimensional angle estimates. Furthermore, the DOA and DOD pairing is given automatically. Finally, simulation results are presented to verify the efficiency of the proposed estimators.

  • Accelerating Extended Hamming Code Decoders on Graphic Processing Units for High Speed Communication

    Md Shohidul ISLAM  Jong-Myon KIM  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E97-B No:5
      Page(s):
    1050-1058

    Emerging networks characterized by growing speed and data insensitivity demand faster and scalable error handling. Prevalent decoders are based on dedicated hardware, offering considerable processing speed, but limited flexibility, programmability and scalability. This paper proposes an efficient approach to accelerate the extended-Hamming code decoder using a graphics processing unit (GPU), chosen for its low cost and extremely high-throughput parallel-computing capability. This paper compares the performance of the GPU-based approach with the equivalent sequential approaches that are performed on a central processing unit (CPU) and Texas Instruments TMS320C6742 digital signal processor (DSP) with varying packet sizes and error tolerances. Experimental results demonstrate that the proposed GPU-based approach outperforms the sequential approaches in terms of execution time and energy consumption.

  • Computation of the Total Autocorrelation over Shared Binary Decision Diagrams

    Miloš RADMANOVIC  Radomir S. STANKOVIC  Claudio MORAGA  

     
    LETTER-VLSI Design Technology and CAD

      Vol:
    E97-A No:5
      Page(s):
    1140-1143

    This paper describes a method for the efficient computation of the total autocorrelation for large multiple-output Boolean functions over a Shared Binary Decision Diagram (SBDD). The existing methods for computing the total autocorrelation over decision diagrams are restricted to single output functions and in the case of multiple-output functions require repeating the procedure k times where k is the number of outputs. The proposed method permits to perform the computation in a single traversal of SBDD. In that order, compared to standard BDD packages, we modified the way of traversing sub-diagrams in SBDD and introduced an additional memory function kept in the hash table for storing results of the computation of the autocorrelation between two subdiagrams in the SBDD. Due to that, the total amount of computations is reduced which makes the method feasible in practical applications. Experimental results over standard benchmarks confirm the efficiency of the method.

  • Reconfigurable Out-of-Order System for Fluid Dynamics Computation Using Unstructured Mesh

    Takayuki AKAMINE  Mohamad Sofian ABU TALIP  Yasunori OSANA  Naoyuki FUJITA  Hideharu AMANO  

     
    PAPER-Computer System

      Vol:
    E97-D No:5
      Page(s):
    1225-1234

    Computational fluid dynamics (CFD) is an important tool for designing aircraft components. FaSTAR (Fast Aerodynamics Routines) is one of the most recent CFD packages and has various subroutines. However, its irregular and complicated data structure makes it difficult to execute FaSTAR on parallel machines due to memory access problem. The use of a reconfigurable platform based on field programmable gate arrays (FPGAs) is a promising approach to accelerating memory-bottlenecked applications like FaSTAR. However, even with hardware execution, a large number of pipeline stalls can occur due to read-after-write (RAW) data hazards. Moreover, it is difficult to predict when such stalls will occur because of the unstructured mesh used in FaSTAR. To eliminate this problem, we developed an out-of-order mechanism for permuting the data order so as to prevent RAW hazards. It uses an execution monitor and a wait buffer. The former identifies the state of the computation units, and the latter temporarily stores data to be processed in the computation units. This out-of-order mechanism can be applied to various types of computations with data dependency by changing the number of execution monitors and wait buffers in accordance with the equations used in the target computation. An out-of-order system can be reconfigured by automatic changing of the parameters. Application of the proposed mechanism to five subroutines in FaSTAR showed that its use reduces the number of stalls to less than 1% compared to without the mechanism. In-order execution was speeded up 2.6-fold and software execution was speeded up 2.9-fold using an Intel Core 2 Duo processor with a reasonable amount of overhead.

  • An Efficient Strategy for Bit-Quad-Based Euler Number Computing Algorithm

    Bin YAO  Hua WU  Yun YANG  Yuyan CHAO  Atsushi OHTA  Haruki KAWANAKA  Lifeng HE  

     
    LETTER-Pattern Recognition

      Vol:
    E97-D No:5
      Page(s):
    1374-1378

    The Euler number of a binary image is an important topological property for pattern recognition, and can be calculated by counting certain bit-quads in the image. This paper proposes an efficient strategy for improving the bit-quad-based Euler number computing algorithm. By use of the information obtained when processing the previous bit quad, the number of times that pixels must be checked in processing a bit quad decreases from 4 to 2. Experiments demonstrate that an algorithm with our strategy significantly outperforms conventional Euler number computing algorithms.

  • Complexity Reduction in Joint Decoding of Block Coded Signals in Overloaded MIMO-OFDM System

    Yoshihito DOI  Mamiko INAMORI  Yukitoshi SANADA  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E97-B No:4
      Page(s):
    905-914

    This paper presents a low complexity joint decoding scheme of block coded signals in an overloaded multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system. In previous literature, a joint maximum likelihood decoding scheme of block coded signals has been evaluated through theoretical analysis. The diversity gain with block coding prevents the performance degradation induced by signal multiplexing. However, the computational complexity of the joint decoding scheme increases exponentially with the number of multiplexed signal streams. Thus, this paper proposes a two step joint decoding scheme for block coded signals. The first step of the proposed scheme calculates metrics to reduce the number of the candidate codewords using decoding based on joint maximum likelihood symbol detection. The second step of the proposed scheme carries out joint decoding on the reduced candidate codewords. It is shown that the proposed scheme reduces the complexity by about 1/174 for 4 signal stream transmission.

  • Digital Controller for Single-Phase DCM Boost PFC Converter with High Power Factor over Wide Input Voltage and Load Range

    Daying SUN  Weifeng SUN  Qing WANG  Miao YANG  Shen XU  Shengli LU  

     
    PAPER-Electronic Circuits

      Vol:
    E97-C No:4
      Page(s):
    377-385

    A new digital controller for a single-phase boost power factor correction (PFC) converter operating at a discontinuous conduction mode (DCM), is presented to achieve high input power factor over wide input voltage and load range. A method of duty cycle modulation is proposed to reduce the line harmonic distortion and improve the power factor. The loop regulation scheme is adopted to further improve the system stability and the power factor simultaneously. Meanwhile, a novel digital pulse width modulator (DPWM) based on the delay lock loop technique, is realized to improve the regulation linearity of duty cycle and reduce the regulation deviation. The single-phase DCM boost PFC converter with the proposed digital controller based on the field programmable gate array (FPGA) has been implemented. Experimental results indicate that the proposed digital controller can achieve high power factor more than 0.99 over wide input voltage and load range, the output voltage deviation is less than 3V, and the peak conversion efficiency is 96.2% in the case of a full load.

  • A New Hybrid Approach for Privacy Preserving Distributed Data Mining

    Chongjing SUN  Hui GAO  Junlin ZHOU  Yan FU  Li SHE  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E97-D No:4
      Page(s):
    876-883

    With the distributed data mining technique having been widely used in a variety of fields, the privacy preserving issue of sensitive data has attracted more and more attention in recent years. Our major concern over privacy preserving in distributed data mining is the accuracy of the data mining results while privacy preserving is ensured. Corresponding to the horizontally partitioned data, this paper presents a new hybrid algorithm for privacy preserving distributed data mining. The main idea of the algorithm is to combine the method of random orthogonal matrix transformation with the proposed secure multi-party protocol of matrix product to achieve zero loss of accuracy in most data mining implementations.

881-900hit(3318hit)