Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Takao HINAMOTO Toshimasa WATANABE
Kazunori HAYASHI Hideaki SAKAI
This paper proposes per-tone equalization methods for single carrier block transmission with cyclic prefix (SC-CP) systems. Minimum mean-square-error (MMSE) based optimum weights of the per-tone equalizers are derived for SISO (single-input single-output), SIMO (single-input multiple-output), and MIMO (multiple-input multiple-output) SC-CP systems. Unlike conventional frequency domain equalization methods, which employ the discrete Fourier transform (DFT), the per-tone equalizers utilize a sliding DFT, which makes it possible to achieve good performance even when the guard interval is shorter than the channel order. Computer simulation results show that the proposed equalizers can significantly improve the bit error rate (BER) performance of SISO, SIMO, and MIMO SC-CP systems with an insufficient guard interval.
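The sliding DFT mentioned in this abstract can be illustrated with a minimal sketch. It shows only the standard recurrence that updates one DFT bin as the window advances by one sample; it is not the per-tone equalizer or the MMSE weight derivation of the paper. The function names and the test signal are assumptions made for illustration.

```python
import cmath

def dft_bin(window, k):
    """Direct DFT of bin k over a length-N window (reference computation)."""
    n_points = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * n / n_points)
               for n, x in enumerate(window))

def sliding_dft(samples, n_points, k):
    """Yield bin k of the N-point DFT for every window start position.

    Recurrence: X_k(m+1) = (X_k(m) - x[m] + x[m+N]) * exp(+j*2*pi*k/N),
    so each new window costs one subtraction, one addition and one multiply.
    """
    twiddle = cmath.exp(2j * cmath.pi * k / n_points)
    current = dft_bin(samples[:n_points], k)  # first window computed directly
    yield current
    for m in range(len(samples) - n_points):
        current = (current - samples[m] + samples[m + n_points]) * twiddle
        yield current

# Hypothetical signal, used only to exercise the recurrence.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 5.0, -2.0, 4.0]
bins = list(sliding_dft(signal, 4, 1))
```

Each value in `bins` matches a direct DFT of the corresponding 4-sample window up to rounding error, which is what lets a receiver examine several window offsets cheaply.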
Gordana Jovanovic DOLECEK Sanjit K. MITRA
This paper presents a new multistage comb-rotated sinc (RS) decimator with a sharpened magnitude response. The novelty of this paper is that the multistage structure has more design parameters, which provide additional flexibility in the design procedure. It uses different sharpening polynomials and different cascaded comb filters at different stages. As the comb filters at the later stages are of lower order than the original comb filter, more complex sharpening polynomials can be used at the later stages. This improves the frequency characteristic without a significant increase in the complexity of the overall filter. The comb filter of the first stage is realized in a non-recursive form and can be implemented efficiently by using the polyphase decomposition of the transfer function, in which the subfilters operate at a lower rate that depends on the down-sampling factor employed in the first stage. In addition, both multipliers of the rotated sinc (RS) filter of the second stage work at a lower rate.
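As a rough illustration of what sharpening does to a comb filter, the sketch below evaluates the magnitude response of a length-M comb and applies the classic polynomial 3H^2 - 2H^3. The abstract does not say which sharpening polynomials the paper uses, so this choice is an assumption, as are the function names and the values of M and the test frequencies.

```python
import math

def comb_magnitude(omega, m, k=1):
    """Magnitude of K cascaded length-M comb (moving-average) filters.

    |H(e^{jw})| = |sin(M*w/2) / (M*sin(w/2))|**K, normalized to unity DC gain.
    """
    denominator = m * math.sin(omega / 2)
    if abs(denominator) < 1e-12:      # w = 0 (mod 2*pi): the gain is 1 by definition
        return 1.0
    return abs(math.sin(m * omega / 2) / denominator) ** k

def sharpened(h):
    """Assumed sharpening polynomial 3H^2 - 2H^3 applied to a magnitude value."""
    return 3 * h**2 - 2 * h**3

M = 8                                  # hypothetical decimation factor
passband = comb_magnitude(0.05 * math.pi, M)
stopband = comb_magnitude(3 * math.pi / M, M)
print(passband, sharpened(passband))   # sharpening lifts the drooping passband
print(stopband, sharpened(stopband))   # and pushes the stopband lobe down
```

For magnitudes between 0.5 and 1 the polynomial moves the value toward 1, and for magnitudes below 0.5 it moves the value toward 0, which is why both passband droop and stopband attenuation improve.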
Nozomu TOGAWA Koichi TACHIKAKE Yuichiro MIYAOKA Masao YANAGISAWA Tatsuo OHTSUKI
This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing or decomposing SIMD operations, we finally obtain a processor core with a small area under the given timing constraint. Promising experimental results are also shown.
The main contribution of this work is to present several hardware implementations of an "n choose k" counter (C(n,k) counter for short), which lists all n-bit numbers with (n-k) 0's and k 1's, and to show their applications. We first present concepts of C(n,k) counters and their efficient implementations on an FPGA. We then go on to evaluate their performance in terms of the number of used slices and the clock frequency for the Xilinx VirtexII family FPGA XC2V3000-4. As one of the real life applications, we use a C(n,k) counter to accelerate a digital halftoning method that generates a binary image reproducing an original gray-scale image. This method repeatedly replaces an image pattern in small square regions of a binary image by the best one. By the partial exhaustive search using a C(n,k) counter we succeeded in accelerating the task of finding the best image pattern and achieved a speedup factor of more than 2.5 over the simple exhaustive search.
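The sequence that a C(n,k) counter produces can be sketched in software. The generator below uses a well-known bit trick (often called Gosper's hack) to step from one n-bit word with k ones to the next larger one. It illustrates the output sequence only and says nothing about the FPGA implementations evaluated in the paper; the function name is an assumption.

```python
def combinations_counter(n, k):
    """Yield every n-bit integer with exactly k ones, in increasing order."""
    if k < 0 or k > n:
        return
    if k == 0:
        yield 0
        return
    value = (1 << k) - 1               # smallest word: k ones in the low bits
    limit = 1 << n
    while value < limit:
        yield value
        lowest = value & -value        # lowest set bit
        ripple = value + lowest        # carry moves the lowest run of ones up
        # Re-insert the remaining ones of that run at the bottom of the word.
        value = ripple | (((value ^ ripple) >> 2) // lowest)

for word in combinations_counter(4, 2):
    print(format(word, "04b"))
# prints 0011, 0101, 0110, 1001, 1010, 1100 (one per line)
```

A partial exhaustive search such as the halftoning application described above would iterate over these words instead of over all 2^n patterns.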
Jin-Tai YAN Yen-Hsiang CHEN Chia-Fang LEE
As the complexity of VLSI circuits increases, the routability problem becomes more and more important in modern VLSI design. In general, improving the flexibility of the edges in a routing tree has been exploited to relieve routing congestion and increase routability in the routing stage. A given initial rectilinear Steiner tree can be transformed into a Steiner routing tree by deleting all of its corner points. Based on the definition of the routing flexibility in a Steiner routing tree and the timing-constrained location flexibility of the Steiner point in any Y-type wire, a simulated-annealing-based approach is proposed to construct a better timing-constrained flexibility-driven Steiner routing tree by reassigning the feasible locations of the Steiner points in all the Y-type wires. The experimental results show that our proposed algorithm, STFSRT, improves routing flexibility by about 43-173% at the cost of a wire length increase of about 0.005-0.020% for the tested benchmark circuits.
Konstantinos SIOZIOS George KOUTROUMPEZIS Konstantinos TATAS Nikolaos VASSILIADIS Vasilios KALENTERIDIS Haroula POURNARA Ilias PAPPAS Dimitrios SOUDRIS Antonios THANAILAKIS Spiridon NIKOLAIDIS Stilianos SISKOS
A complete system for the implementation of digital logic in a Field-Programmable Gate Array (FPGA) platform is introduced. The novel power-efficient FPGA architecture was designed and simulated in STM 0.18 µm CMOS technology. The detailed design and circuit characteristics of the Configurable Logic Block, the interconnection network, the switch box and the connection box were determined and evaluated in terms of energy, delay and area. A number of circuit-level low-power techniques were employed because power consumption was the primary concern. Additionally, a complete tool framework for the implementation of digital logic circuits in FPGA platforms is introduced. Taking a VHDL description of an application as input, the framework derives the reconfiguration bitstream of the FPGA. The framework consists of: i) non-modified academic tools, ii) modified academic tools and iii) new tools. Furthermore, the framework can support a variety of FPGA architectures. Qualitative and quantitative comparisons with existing academic and commercial architectures and tools are provided, yielding promising results.
Fatih KOCAN Mehmet H. GUNES Atakan KURT
Zero-suppressed BDDs (ZBDDs) have been used in the nonenumerative path delay fault (PDF) grading of VLSI circuits. One basic and one cut-based grading algorithm have been proposed to grade circuits with polynomial and exponential numbers of PDFs, respectively. In this article, we present a new ZBDD-based basic PDF grading algorithm that enables grading of some circuits with an exponential number of PDFs without using the cut-based algorithm. The algorithm overcomes memory overflow problems by dynamically pruning the ZBDD at run-time. This new algorithm may give exact or pessimistic coverage depending on the statuses of the pruned nodes. Furthermore, we re-assess the performance of the static variable ordering heuristics in ZBDDs for PDF coverage calculation. The proposed algorithm combined with efficient static variable ordering heuristics can avoid ZBDD size explosion in many circuits. Experimental results for ISCAS85 benchmarks show that the proposed algorithm grades circuits efficiently.
Chikaaki KODAMA Kunihiro FUJIYOSHI
This paper discusses how to minimize the number of dissection lines regarded as wiring channels on a floorplan corresponding to a placement of n modules. In a floorplan (rectangular dissection), the number of dissection lines exceeds the number of rooms exactly by three. Since a floorplan obtained from a given module placement may have many empty rooms where no module is assigned, redundant wiring channels and wire bends may also be generated. Hence, in order to reduce redundant channels and wire bends, removal of empty rooms is required. For this purpose, we formulate a problem of obtaining a floorplan with the minimum possible empty rooms based on a given module placement. Then, we propose a method of removing as many redundant empty rooms as possible by merging dissection lines on a floorplan in O(n) time. The number of empty rooms in the resultant floorplan is reduced to n-
This paper proposes a novel boundary scan test scheme for intellectual property (IP) core identification via watermarking. The core concept is to embed a watermark identification circuit (WIC) and a test circuit into the IP core at the behavioral design level. The procedure fits the current IP-based design flow. This scheme can identify the IP provider without the need to examine a microphotograph after the chip has been manufactured and packaged. The watermark survives synthesis, placement, and routing, so the IP core can be identified at various design levels. Experimental results have demonstrated that the proposed approach has the potential to solve the IP identification problem.
Yasuaki INOUE Yu IMAI Kiyotaka YAMAMURA
Finding DC operating points of transistor circuits is a very important and difficult task. The Newton-Raphson method employed in SPICE-like simulators often fails to converge to a solution. To overcome this convergence problem, homotopy methods have been studied from various viewpoints. For homotopy methods to be efficient, it is important to construct an appropriate homotopy function. In conventional homotopy methods, linear auxiliary functions have been commonly used. In this paper, a homotopy method for solving transistor circuits using a nonlinear auxiliary function is proposed. The proposed method utilizes a nonlinear function closely related to the circuit equations to be solved, so that it efficiently finds DC operating points of practical transistor circuits. Numerical examples show that the proposed method is several times more efficient than three conventional homotopy methods.
Jessi E. JOHNSON Andrew SILVA George R. BRANNER
For a highly nonlinear circuit design such as an active frequency multiplier, performing an input impedance "match" is not a straightforward problem. In this work, an analysis of nonlinear input impedance matching in active microwave frequency multipliers is presented. By utilizing harmonic balance simulation of an idealized device model, fundamental aspects of performing an input "match" are explored for classical frequency doubler and frequency tripler configurations. The analysis is then repeated using a realistic device model, verifying the efficacy of using nonlinear input impedance matching to improve the output power and return loss characteristics of a multiplier.
Minh-Tuan LE Van-Su PHAM Linh MAI Giwan YOON
Orthogonal space-time block codes (STBCs) are an attractive means of enhancing reception quality in quasi-static MIMO channels due to their full diversity and, especially, their simple maximum-likelihood (ML) decoders. However, full-rate full-diversity orthogonal STBCs do not exist for more than two transmit antennas. The vertical layered space-time architecture (the so-called V-BLAST) with a nulling- and cancelling-based detection algorithm, in contrast, can achieve high transmission rates at the cost of very low diversity gain, an undesirable consequence of the interference nulling and cancelling processes. The uncoded V-BLAST system is able to reach its ML performance with the aid of the sphere decoder algorithm at the expense of higher detection complexity. The tradeoff between transmission rate, diversity, and complexity is therefore inherent in designing space-time codes. This paper investigates a method to increase the "nulling diversity gains" for a general high-rate space-time code and introduces a new design strategy for high-rate space-time codes detected by interference nulling and cancelling processes, based on which two high-rate quasi-orthogonal space-time codes for MIMO applications are proposed. We show that when nT transmit and nR=nT receive antennas are deployed, the first code offers a transmission rate of (nT-1) with a minimum nulling diversity order of 3, whereas the second one offers a transmission rate of (nT-2) with a minimum nulling diversity order of 5. Therefore, the proposed codes significantly outperform the V-BLAST when nR=nT. Simulation results and discussions on the performance of the proposed codes are provided.
Minseok KIM Aiko KIYONO Koichi ICHIGE Hiroyuki ARAI
When phase modulated signals are undersampled (bandpass sampled) directly at a high frequency band, the harmful effects of the aperture jitter characteristics of ADCs (analog-to-digital converters) and the sampling clock instability of the system cannot be ignored. In communication systems, the sampling jitter adds phase noise to the constellation pattern besides thermal noise, and thus the BER (bit error rate) performance is degraded. This paper examines the relationship between the input frequency to the ADC and the sampling jitter in digital IF (intermediate frequency) downconversion receivers with an undersampling scheme. This paper presents measurement results obtained with a real hardware prototype system as well as computer simulation results obtained with a theoretically modeled IF sampling receiver. We evaluated the EVM (error vector magnitude) in various clock jitter configurations with commonly used, reasonably priced ADCs whose sampling rate was 40 MHz. According to the results, the IF input frequencies of QPSK (16 QAM) signals were limited to below around 290 (210) MHz for the wireless LAN standard, and 730 (450) MHz for the W-CDMA standard, in our best configuration.
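Two textbook relations help explain why the usable IF input frequency is bounded in an undersampling receiver: the folded (aliased) frequency seen after sampling, and the jitter-limited SNR of a full-scale sine wave, SNR = -20 log10(2π f σ). The sketch below evaluates both. It is not the EVM model or the measurement setup of the paper, and the 1 ps rms jitter is a hypothetical value chosen for illustration.

```python
import math

def alias_frequency(f_in_hz, fs_hz):
    """Frequency at which an input tone appears after sampling at fs (0..fs/2)."""
    remainder = f_in_hz % fs_hz
    return min(remainder, fs_hz - remainder)

def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """Upper bound on SNR of a full-scale sine wave set by rms sampling jitter."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)

FS = 40e6             # sampling rate quoted in the abstract
JITTER = 1e-12        # hypothetical 1 ps rms jitter, not a value from the paper
for f_in in (210e6, 290e6, 450e6, 730e6):
    print(f_in, alias_frequency(f_in, FS), jitter_limited_snr_db(f_in, JITTER))
```

The bound drops by about 6 dB each time the input frequency doubles, which is consistent with higher-order modulation tolerating a lower maximum IF frequency.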
Eun-Gu JUNG Jeong-Gun LEE Kyoung-Sun JHANG Dong-Soo HAR
Since the inception of Globally Asynchronous Locally Synchronous (GALS) VLSI design, GALS has been considered a promising design technique for multi-clock-domain System-on-Chip (SoC). Among the handshake protocols available for SoC design, the delay insensitive (DI) handshake protocol is becoming a core technology, since it facilitates robust data transfer regardless of wire delay variation. In this paper, a new data encoding scheme, Differential Value Encoding (DVE), is proposed for the two-phase 1-of-N DI handshake protocol. Compared with the conventional data encoding method, the proposed scheme effectively reduces the crosstalk effect on wires sending sequentially increasing data patterns, resulting in a reduction of the data transfer time. Simulation results with SPEC CPU 2000 benchmarks and sequentially increasing data patterns reveal that the DVE scheme can reduce the crosstalk effect by tens of percent and significantly decrease the data transfer time.
This paper proposes a new theory and design method for a class of recombination nonuniform filter banks (RNFBs) with linear phase (LP) filters. In a uniform filter bank (FB), consecutive channels are merged by sets of transmultiplexers (TMUXs) to realize a nonuniform FB. RNFBs with LP analysis/synthesis filters are of great interest because the analysis filters for the partially reconstructed signals, obtained through merging, are LP and hence less phase distortion is introduced to the desired signals. We analyze the spectrum supports of the analysis filters of these LP RNFBs. The conditions on the uniform FB and recombination TMUXs of an LP RNFB with good frequency characteristics are determined. These conditions are relatively simple to satisfy, and the uniform FB and recombination TMUXs can be designed separately without much degradation in performance. This allows dynamic recombination of different numbers of channels in the original uniform FB to give a flexible and time-varying frequency partitioning. Using these results, a method for designing a class of near-perfect-reconstruction (NPR) LP RNFBs with a cosine roll-off transition band using the Remez algorithm is proposed. A design example is given to show that LP RNFBs with good frequency responses and reasonably low reconstruction errors can be achieved.
Toshihiko FUKUE Atsushi FUJITA Nozomu HAMADA
In this paper we propose a stepped-FM array radar system that can precisely estimate the target position by combining S- and T-MUSIC and adaptive beamforming. By adopting the adaptive beamformer as a preprocessor of T-MUSIC, the proposed system can uniquely determine the direction and distance of targets. In addition, introducing the beamformer improves the precision of the distance estimation.
Chen LIU Zhenyang WU Hua-An ZHAO
This paper proposes a new family of space-time block codes whose transmission rate is 1 symbol per channel use. The proposed space-time codes can achieve full transmit diversity with larger coding gain for constellations carved from the scaled complex integer ring κZ[i]. Our simulation results confirm that the proposed space-time codes outperform existing space-time block codes.
Suk-Jin KIM Jeong-Gun LEE Kiseon KIM
This letter presents a synchronizer and its handshake interface for bridging clock domains in SoC. The proposed scheme uses a pair of two-flop synchronizers, each operated at a different clock edge, based on a two-phase handshake protocol. Performance analysis shows that the proposed design reduces latency by up to one clock cycle while keeping its safety at a tolerable level.
Tso-Bing JUANG Shen-Fu HSIAO Ming-Yu TSAI Jenq-Shiun JAN
In this paper, a cell-driven multiplier generator is developed that can produce high-performance gate-level netlists for multiplier-related arithmetic functional units, including multipliers, multiply-accumulate (MAC) units, and dot product calculators. The generator optimizes the speed/area performance both in the partial product compression stage and in the final addition stage for the specified process technology. In addition to the conventional CMOS full adder cells, we have also designed fast compression elements based on pass-transistor logic for further performance improvement of the generated multipliers. Simulation results show that our proposed generator produces better multiplier-related functional units than those generated using the Synopsys DesignWare library or other previously proposed approaches.
In this paper, the 1-D real-valued discrete Gabor transform (RDGT) proposed in our previous work and its relationship with the complex-valued discrete Gabor transform (CDGT) are briefly reviewed. Block time-recursive RDGT algorithms for the efficient and fast computation of the 1-D RDGT coefficients and for the fast reconstruction of the original signal from the coefficients are then developed in both the critical sampling case and the oversampling case. Unified parallel lattice structures for the implementation of the algorithms are studied. The computational complexity analysis and comparison show that the proposed algorithms provide a more efficient and faster approach to the computation of the discrete Gabor transforms.
Jianping HU Tiefeng XU Hong LI
This paper presents a novel low-power register file based on adiabatic logic. The register file consists of a storage-cell array, address decoders, read/write control circuits, sense amplifiers, and read/write drivers. The storage-cell array is based on the conventional memory cell. All the circuits except the storage-cell array employ CPAL (complementary pass-transistor adiabatic logic) to recover the charge of the large node capacitances on address decoders, bit-lines and word-lines in a fully adiabatic manner. The minimization of energy consumption was investigated by choosing the optimal size of CPAL circuits for large load capacitances. The power consumption of the proposed adiabatic register file is significantly reduced because the energy transferred to the large capacitance buses is mostly recovered. The energy and functional simulations are performed using the net-list extracted from the layout. HSPICE simulation results indicate that the proposed register file attains energy savings of 65% to 85% compared to the conventional CMOS implementation for clock rates ranging from 25 to 200 MHz.
Masanori HARIYAMA Haruka SASAKI Michitaka KAMEYAMA
This paper presents a VLSI processor for high-speed and reliable stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce the computational complexity, SADs are computed using multi-resolution images. Parallel memory access is essential for highly parallel image processing. To support it, this paper also presents an optimal memory allocation that minimizes the amount of hardware under the condition of parallel memory access at specified resolutions.
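For readers unfamiliar with SAD-based stereo matching, a minimal sketch follows. It uses a fixed square window and a single resolution, so it does not reproduce the adaptive window-size control, the multi-resolution scheme, or the memory allocation of the paper. The function names and the synthetic image pair are assumptions.

```python
def sad(left, right, row, col, disparity, half):
    """Sum of absolute differences over a (2*half+1)^2 window.

    The window centred at (row, col) in the left image is compared with the
    window centred at (row, col - disparity) in the right image.
    """
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col + dc]
                         - right[row + dr][col + dc - disparity])
    return total

def best_disparity(left, right, row, col, max_disparity, half):
    """Return the disparity in 0..max_disparity with the smallest SAD."""
    valid = [d for d in range(max_disparity + 1) if col - half - d >= 0]
    return min(valid, key=lambda d: sad(left, right, row, col, d, half))

# Synthetic pair: the left image is the right image shifted by two pixels.
right_img = [[c * c + r for c in range(10)] for r in range(5)]
left_img = [[right_img[r][c - 2] if c >= 2 else 0 for c in range(10)]
            for r in range(5)]
print(best_disparity(left_img, right_img, 2, 6, 3, 1))  # → 2
```

A larger window makes the match more reliable in low-texture regions but blurs depth boundaries, which is the tradeoff that adaptive window-size control addresses.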
Debatosh DEBNATH Tsutomu SASAO
This paper presents a design method for three-level programmable logic arrays (PLAs), which have input decoders and two-input EXOR gates at the outputs. The PLA realizes an EXOR of two sum-of-products expressions (EX-SOP) for multiple-valued input two-valued output functions. We developed an output phase optimization method for EX-SOPs in which some outputs of the function are minimized in the complemented form, and we present techniques to minimize EX-SOPs for adders by using an extension of Dubrova-Miller-Muzio's AOXMIN algorithm. For large adders with two-valued inputs, the proposed algorithm produces solutions with half as many products as an AOXMIN-like algorithm in 250 times less time. We also proved that an n-bit adder with two-valued inputs requires at most 3
Mohammed A. ELGAMEL Md Ibrahim FAISAL Magdy A. BAYOUMI
About 20-45% of the total power in any VLSI circuit is consumed by the clocking system, and 90% of this power is consumed by flip-flops. Wider datapaths, deeper pipelines, and the increasing number of registers in modern processors have underscored the importance of flip-flops. As a result, flip-flop performance metrics such as power, delay, and power delay product will become crucial factors in the overall performance of processors. As technology moves into the deep submicron level, the noise immunity of, and the noise generated by, any component in a digital device are also becoming vital factors in circuit design. This paper studies various flip-flop designs in terms of their noise immunity and noise generation metrics. It categorizes the flip-flops and reports extensive simulation results for the best representative examples of each group, including a newly proposed one (a patent has been filed for this flip-flop). It compares power, delay, power delay product, number of transistors, number of clocked transistors, noise immunity, and noise generation for flip-flops that are reported in the literature as having the best performance.
Jeong-Gun LEE Jeong-A LEE Suk-Jin KIM Kiseon KIM
A mutated adder architecture utilizing a mixture of carry propagation schemes is proposed to design delay-area efficient adders that are not available in an ordinary design space. Further, we develop an optimization method based on integer linear programming to search the expanded design space of the mutated adder.
Aranzazu OTIN Santiago CELMA Concepcion ALDEA
In this paper we report a 3rd-order Gm-C filter based on pseudo-differential continuous-time transconductors for applications in low-voltage systems over VHF range. By using a 0.18 µm pure digital CMOS process, a prototype low pass filter with -3 dB frequency programmable from 38 MHz to 213 MHz confirms the feasibility of the proposed filter in applications such as data storage systems.
This paper presents an efficient algorithm for reducing the noise in Nuclear Magnetic Resonance Free Induction Decay (NMR FID) signals via the oversampled real-valued discrete Gabor transform using the Gaussian synthesis window. An NMR FID signal in the Gabor transform domain (i.e., a joint time-frequency domain) is concentrated in a small number of Gabor transform coefficients, while the noise is distributed fairly evenly among all the coefficients. Therefore, the NMR FID signal can be significantly enhanced by thresholding the coefficients in the transform domain. Theoretical analysis and simulation experiments in this paper show that the oversampled Gabor transform using the Gaussian synthesis window is more suitable for NMR FID signal enhancement than the critically sampled one using the exponential synthesis window, because both the Gaussian synthesis window and its corresponding analysis window in the oversampling case can have better localization in the frequency domain than the exponential synthesis window and its corresponding analysis window in the critical sampling case. Moreover, to speed up the transform, the real-valued discrete Gabor transform presented in our previous work is adopted in the proposed algorithm instead of the commonly used complex-valued discrete Gabor transform.
Akira IKUTA Hisako MASUIKE Mitsuo OHTA
The actual sound environment system exhibits various types of linear and non-linear characteristics, and it often contains an unknown structure. Furthermore, the observations in the sound environment are often in the level-quantized form. In this paper, a method for estimating the specific signal for stochastic systems with unknown structure and the quantized observation is proposed by introducing a system model of the conditional probability type. The effectiveness of the proposed theoretical method is confirmed by applying it to the actual problem of psychological evaluation for the sound environment.
Kazunori SHIMIZU Nozomu TOGAWA Takeshi IKENAGA Satoshi GOTO
This paper proposes a reconfigurable adaptive FEC system based on Reed-Solomon (RS) code with interleaving. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. For a given error correction capability t, we can implement an optimal RS decoder composed of the minimum hardware units for that t. If the hardware units of the RS decoder can be reduced for a given error correction capability t, we can embed as large a deinterleaver as possible into the RS decoder for each t. Dynamically reconfiguring the RS decoder with the embedded, expanded deinterleaver for each error correction capability t allows us to decode larger interleaved codes, which are more robust against burst errors. With a reliable transport protocol, experimental results show that our system achieves up to 65% lower packet error rate and 5.9% higher data transmission throughput compared to the adaptive FEC scheme on a conventional fixed hardware system. With an unreliable transport protocol, our system achieves up to 76% better bit error performance with a higher code rate compared to the adaptive FEC scheme on a conventional fixed hardware system.
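The benefit of a deeper interleaver against burst errors can be shown with a small sketch. It models a plain block interleaver in which symbols of several codewords are transmitted column by column, and counts how many symbols of each codeword a burst corrupts. No RS encoding or decoding is performed, and the function names, codeword length, and burst parameters are assumptions, not values from the paper.

```python
def interleave(codewords):
    """Send symbol j of every codeword, then symbol j+1 of every codeword, ..."""
    return [cw[j] for j in range(len(codewords[0])) for cw in codewords]

def deinterleave(stream, depth):
    """Inverse of interleave: rebuild `depth` codewords from the symbol stream."""
    return [stream[i::depth] for i in range(depth)]

def errors_per_codeword(burst_start, burst_length, depth, codeword_length):
    """Count corrupted symbols per codeword for one burst in the stream."""
    counts = [0] * depth
    end = min(burst_start + burst_length, depth * codeword_length)
    for position in range(burst_start, end):
        counts[position % depth] += 1   # stream position p belongs to codeword p % depth
    return counts

# Hypothetical example: 3 codewords of 5 symbols and a burst of 6 symbols.
print(errors_per_codeword(4, 6, 3, 5))  # → [2, 2, 2]
```

Without interleaving, the same burst would put all 6 errors into one or two codewords. With depth 3, each codeword sees only 2 errors, so a decoder with t = 2 can correct them, which is why a larger deinterleaver makes the code more robust to bursts.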
Luca FANUCCI Sergio SAPONARA Massimiliano MELANI Pierangelo TERRENI
With reference to video motion estimation in the framework of the new H.264/AVC video coding standard, this paper presents algorithmic and architectural solutions for the implementation of context-aware coprocessors in real-time, low-power embedded systems. A low-complexity context-aware controller is added to a conventional Full Search (FS) motion estimation engine. While the FS coprocessor is working, the context-aware controller extracts information related to the input signal statistics from the intermediate processing results in order to automatically configure the coprocessor itself in terms of search area size and number of reference frames; thus unnecessary computations and memory accesses can be avoided. The achieved complexity saving factor ranges from 2.2 to 25 depending on the input signal, while motion estimation accuracy is kept unaltered. The increased efficiency is exploited for both (i) processing time reduction in the case of software implementation on a programmable platform and (ii) power consumption reduction in the case of dedicated hardware implementation in CMOS technology.
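To make the cost of Full Search concrete, the sketch below performs SAD-based block matching over a configurable search range and reports how many candidate positions were evaluated. It covers only a plain full search on a single reference frame; the context-aware controller that chooses the search area and number of reference frames is not modeled. The function names and the synthetic frames are assumptions.

```python
def block_sad(cur, ref, top, left, dy, dx, size):
    """SAD between the current block and the reference block displaced by (dy, dx)."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))

def full_search(cur, ref, top, left, size, search_range):
    """Return (dy, dx, sad, evaluated) for the best match within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best = None
    evaluated = 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue                      # skip candidates outside the frame
            cost = block_sad(cur, ref, top, left, dy, dx, size)
            evaluated += 1
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0], evaluated

# Synthetic frames: the current frame is the reference displaced by (1, 2).
ref_frame = [[3 * r * r + c * c for c in range(12)] for r in range(12)]
cur_frame = [[ref_frame[r + 1][c + 2] if r + 1 < 12 and c + 2 < 12 else 0
              for c in range(12)] for r in range(12)]
print(full_search(cur_frame, ref_frame, 4, 4, 4, 2))  # → (1, 2, 0, 25)
```

The number of evaluated candidates grows as (2R+1)^2 with the search range R, so shrinking the search area when the content allows it directly reduces computation and memory accesses.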
Mohammad Shorif UDDIN Tadayoshi SHIOYAMA
A new and simple image processing approach for measuring the length of pedestrian crossings, with a view to developing a travel aid for blind people, is described. In a crossing, the usual black road surface is painted with periodic white bands of constant width. The crossing length is estimated using vector geometry from the left and right border lines and the first, second, and end edge lines of the crossing region. Image processing techniques are applied to the crossing image to find these lines. Experimental results using real road scenes with pedestrian crossings confirm the effectiveness of the proposed method.
Hariadi MOCHAMAD Hui Chien LOY Takafumi AOKI
This paper presents a semi-automatic algorithm for video object segmentation. Our algorithm assumes the use of multiple key video frames in which a semantic object of interest is defined in advance with human assistance. For video frames between every two key frames, the specified video object is tracked and segmented automatically using Learning Vector Quantization (LVQ). Each pixel of a video frame is represented by a 5-dimensional feature vector integrating spatial and color information. We introduce a parameter K to adjust the balance of spatial and color information. Experimental results demonstrate that the algorithm can segment the video object consistently with less than 2% average error when the object is moving at a moderate speed.
Shen LI Yong JIANG Takeshi IKENAGA Satoshi GOTO
In adaptive motion estimation, spatial-temporal correlation based motion type inference has been recognized as an effective way to guide the adjustment of the motion estimation strategy according to video content. However, the complexity and reliability of such methods remain two crucial problems. In this paper, a motion vector field model is introduced as the basis for a new spatial-temporal correlation based motion type inference method. For each block, Full Search with Adaptive Search Window (ASW) and Three Step Search (TSS), as two search strategy candidates, can be employed alternatively. Simulation results show that the proposed method can consistently reduce the dynamic computational cost to as low as 3% to 4% of that of Full Search (FS), while remaining a closer approximation to FS in terms of visual quality than other fast algorithms for various video sequences. Owing to its efficiency and reliability, this method is expected to be a favorable contribution to mobile video communication, where low-power real-time video coding is necessary.
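For readers unfamiliar with one of the two candidate strategies, a minimal sketch of the classic Three Step Search follows. It takes a caller-supplied matching cost `cost(dx, dy)` (for example a block SAD), which is an interface chosen for this example; the motion type inference that selects between TSS and Full Search with ASW is not modelled.

```python
def three_step_search(cost, start_step=4):
    """Classic three step search over a +/-7 window.

    `cost(dx, dy)` returns the matching cost of displacement (dx, dy).
    Returns (dx, dy, best_cost, evaluations).
    """
    cx, cy = 0, 0
    best_cost = cost(cx, cy)
    evaluations = 1
    step = start_step
    while step >= 1:
        best_x, best_y = cx, cy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # the centre was already evaluated
                c = cost(cx + dx, cy + dy)
                evaluations += 1
                if c < best_cost:
                    best_cost, best_x, best_y = c, cx + dx, cy + dy
        cx, cy = best_x, best_y  # recentre on the best point found
        step //= 2               # 4 -> 2 -> 1 -> stop
    return cx, cy, best_cost, evaluations
```

With the default step sizes the search evaluates 25 points, whereas an exhaustive search of the same +/-7 window evaluates 225; the price is that TSS can settle on a local minimum when the cost surface is not well behaved.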
Recent microprocessors have included SIMD (single instruction multiple data) extensions into their instruction set architecture to improve the performance of multimedia applications. SIMD instructions speed up the execution of programs but pose lots of challenges to software developers. An efficient matrix-based splitter (or merger), which can split an N
Wen-Huang CHENG Wei-Ta CHU Ja-Ling WU
This paper presents a framework for automatic video region-of-interest determination based on visual attention model. We view this work as a preliminary step towards the solution of high-level semantic video analysis. Facing such a challenging issue, in this work, a set of attempts on using video attention features and knowledge of computational media aesthetics are made. The three types of visual attention features we used are intensity, color, and motion. Referring to aesthetic principles, these features are combined according to camera motion types on the basis of a new proposed video analysis unit, frame-segment. We conduct subjective experiments on several kinds of video data and demonstrate the effectiveness of the proposed framework.
When video data are transmitted over a network, the video quality must be chosen to be the best possible under the condition that the transmission does not interfere with other Internet services. For ease of implementation, providers often select the video quality using the simulcast approach, in which an independent stream is stored and transmitted for each quality level. We have previously proposed a scalable structure consisting of base and enhancement data, in which these data are combined using transcoding methods when high-quality video is required. In this paper, we propose video content delivery methods with scalable transcoding, in which users can upgrade the quality of video data even after transmission by using the base data and differential data. To reduce the total time, including not only users' access time but also watching time, we compare the simulcast method with the proposed methods in terms of total content utilization time using a video content access model, and evaluate the transcoding time required to reduce users' waiting time.
Somchart CHOKCHAITAM Masahiro IWAHASHI Somchai JITAPUNKUL
In this paper, we propose a new one-dimensional (1D) integer discrete cosine transform (Int-DCT) for unified lossless/lossy image compression. The proposed 1D Int-DCT is designed to reduce rounding effects by minimizing the number of rounding operations. It can be used not only for lossless coding, which yields a high-quality decoded image, but also for lossy coding that is compatible with conventional DCT-based coding systems. Both theoretical analysis and simulation results confirm the effectiveness of the proposed Int-DCT.
This letter proposes a run-length code based test data compression technique capable of efficient compression. The proposed test compression method is based on a hybrid run-length encoding, which greatly reduces test data storage on the tester. The code words are carefully selected so as to increase the compression ratio for the test data. Also, a heuristic mapping algorithm and a scan latch reordering method for don't care values in the test cubes increase the compression ratio. Results indicate that the proposed code and heuristic mapping schemes are very efficient in reducing test data. Reduced test data results in less test storage and test time.
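The general idea behind run-length coding of test cubes can be sketched as follows. This is a generic zero-run encoder with don't-care bits filled to lengthen the runs, offered as an illustration only; the hybrid code words, the heuristic mapping algorithm and the scan latch reordering of the letter are not reproduced, and all names are assumptions of this example.

```python
def fill_dont_cares(cube, fill="0"):
    """Replace don't-care symbols (X) so that runs of `fill` become longer."""
    return cube.replace("X", fill)


def run_lengths(bits):
    """Encode a bit string as the lengths of its 0-runs, each terminated
    by a 1. A trailing run of 0s without a closing 1 is returned separately."""
    runs, count = [], 0
    for b in bits:
        if b == "0":
            count += 1
        else:
            runs.append(count)
            count = 0
    return runs, count


def decode(runs, tail):
    """Rebuild the bit string from the run lengths and the trailing run."""
    return "".join("0" * r + "1" for r in runs) + "0" * tail
```

In a real scheme each run length would then be mapped to a variable-length code word, and the choice of code words determines the compression ratio.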
Zhijun LU Yamu HU Mohamad SAWAN
In this paper, a low-voltage low-power sigma-delta modulator dedicated to implantable sensing devices is presented. This second-order single-loop sigma-delta modulator is implemented with half-delay integrators. These integrators are based on new fully-differential CMOS class AB switched-Operational Transconductance Amplifier (switched-OTA). An on-chip voltage doubler is introduced to locally boost a supply voltage at the input stage of a conventional OTA in order to allow rail-to-rail signal swing. Experimental results of the modulator fabricated in CMOS 0.18 µm technology confirm its expected features of a peak signal-to-noise ratio (SNR) of 72 dB, a signal-to-noise distortion ratio (SNDR) of 62 dB in a 5 kHz signal bandwidth, and a power consumption lower than 66 µW with a 900 mV voltage supply.
Gualberto AGUILAR Mariko NAKANO-MIYATAKE Hector PEREZ-MEANA
An alaryngeal speech enhancement system is proposed to improve the intelligibility and quality of speech signals generated by an artificial larynx transducer (ALT). The proposed system identifies the voiced segments of the alaryngeal speech signal using pattern recognition methods and replaces them with the equivalent voiced segments of normal speech. Evaluation results show that the proposed system provides a fairly good improvement in the quality and intelligibility of ALT-generated speech.
Lingfeng LI Satoshi GOTO Takeshi IKENAGA
This paper presents a highly parallel architecture for deblocking filter in H.264/AVC. We adopt various parallel schemes in memory sub-system and datapath. A 2-dimensional parallel memory scheme is employed to support efficient parallel access in both horizontal and vertical directions in order to speed up the whole filtering process. This parallel memory also eliminates the need for a transpose circuit. In the datapath, an algorithm optimization is performed to implement parallel filtering with hardware reuse. Pipeline techniques are also adopted to improve the throughput of filtering operations. Our design is implemented under TSMC 0.18 µm technology. Results show that the core size is 0.82
Hassan KHORASHADI-ZADEH Mohammad Reza AGHAEBRAHIMI
This paper presents the design of a novel method for improving the operation of distance relays during capacitive voltage transformer (CVT) transients using an artificial neural network. The proposed module uses voltage and current signals to learn the hidden relationship existing in the input patterns. Simulation studies are performed and the influence of changing system parameters, such as fault resistance and source impedance, is studied. Details of the design procedure and the results of performance studies with the proposed relay are given in the paper. The performance study results show that the proposed algorithm decreases the effects of CVT transients and is fast and accurate.
A scalable multicast session announcement system is a key component of a group communication framework over the Internet. It enables the announcement of session parameters (like the {source address; group address} pair) to a potentially large number of users, according to each site administrator's policy. This system should accommodate any flavor of group communication system, like the Any-Source Multicast (ASM) and Source-Specific Multicast (SSM) schemes. In this paper we first highlight the limitations of the current Session Announcement Protocol (SAP) and study several other information distribution protocols. This critical analysis leads us to formulate the requirements of an ideal multicast session announcement system. We then introduce a new session announcement system called "Channel Reflector". It appears as a hierarchical directory system and offers an effective policy and scope control technique. We finally mention some design aspects, like the protocol messages and configuration structures the Channel Reflector uses.
Denduang PRADUBSUWUN Tomohiro YONEDA Chris MYERS
This paper proposes a partial order reduction algorithm for timed trace theoretic verification in order to detect both safety failures and timing failures of timed circuits efficiently. This algorithm is based on the framework of timed trace theoretic verification according to the original untimed trace theory. Consequently, its conformance checking supports hierarchical structure when verifying timed circuits. Experimenting with the STARI and DME circuits, the proposed approach shows its effectiveness.
In this paper, a complete X-tolerant test data compression solution is proposed for system-on-a-chip (SOC) testing. The solution achieves low-cost testing by employing not only selective Huffman vertical coding (SHVC) for test stimulus compression but also MISR-based time compactor for test response compaction. Moreover, the solution is non-intrusive, since it can tolerate any number of unknown states (also called X state) in test responses such that it does not require modifying the logic of core to eliminate or block the sources of unknown states. Furthermore, the solution achieves enhanced diagnosis capability over conventional MISR. The enhanced diagnosis requires the least hardware overhead by reusing the existing masking logic and achieves significant saving in diagnostic time. Experimental results for ISCAS 89 benchmarks as well as the evaluation of hardware implementation have proven the efficiency of the proposed test solution.
Masayasu FUKUNAGA Seiji KAJIHARA Sadami TAKEOKA
We propose a method to estimate the fault efficiency of test patterns for path delay faults. In path delay fault testing, the fault coverage of test patterns is usually very low, because circuits have not only a lot of paths but also a lot of untestable paths. Although fault efficiency would be a better metric than fault coverage for evaluating test patterns, it is too difficult to compute exactly unless the total number of untestable paths is computed exactly. The proposed method samples a subset of paths after untestable path analysis and estimates fault efficiency based on the percentage of untestable paths among the sampled paths. Through experimental results, we show that the proposed method can accurately estimate the fault efficiency of test patterns in a reasonable time. Also, since the accuracy of the fault efficiency estimated with the proposed method depends on how the paths are sampled, we examine the influence of path sampling methods on the accuracy in the experiments.
A rotator graph was proposed as a topology for interconnection networks of parallel computers, and it is promising because of its small diameter and small degree. However, a rotator graph is a directed graph, which is sometimes a disadvantage when it is applied to actual problems. A bi-rotator graph is obtained by making each edge of a rotator graph bi-directional. In a bi-rotator graph, the average distance is improved compared with a rotator graph with the same number of nodes. In this paper, we give an algorithm for the container problem in bi-rotator graphs together with its evaluation results. The solution provides some fault tolerance, for example for file distribution based on the information dispersal technique. The algorithm is of polynomial order of n for an n-bi-rotator graph. It is based on recursion and divided into two cases according to the position of the destination node. The time complexity of the algorithm and the maximum length of the paths obtained are estimated to be O(n^3) and 4n-5, respectively. The average performance of the algorithm is also evaluated by computer experiments.
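To make the two topologies concrete, the sketch below generates neighbours under the usual definition of a rotator graph (nodes are permutations of n symbols, and an edge rotates the first i symbols one position to the left for 2 <= i <= n) and measures distances by breadth-first search. It illustrates the graph definitions only and is not the container algorithm of the paper; the function names are assumptions of this example.

```python
from collections import deque


def rotator_neighbors(node):
    """Directed rotator edges: rotate the first i symbols left, 2 <= i <= n."""
    n = len(node)
    return [node[1:i] + node[:1] + node[i:] for i in range(2, n + 1)]


def bi_rotator_neighbors(node):
    """Bi-rotator edges: the left rotations plus their inverses."""
    n = len(node)
    right = [node[i - 1:i] + node[:i - 1] + node[i:] for i in range(2, n + 1)]
    # For i = 2 both directions give the same neighbour, so deduplicate.
    return list(dict.fromkeys(rotator_neighbors(node) + right))


def distance(src, dst, neighbors):
    """Breadth-first search distance from src to dst, or None if unreachable."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return seen[u]
        for v in neighbors(u):
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    return None
```

Nodes are passed as tuples, for example `(1, 2, 3, 4)`. From that node, `(4, 1, 2, 3)` is three hops away along directed rotator edges but a single hop in the bi-rotator graph, which shows how the reverse edges shorten distances.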
Shigeta KUNINOBU Yoshiaki TAKATA Naoya NITTA Hiroyuki SEKI
A policy is an execution rule (or constraint) for objects in a system to retain security and integrity of the system. We introduce a simple policy specification language and define its operational semantics. A new NFA construction algorithm that works in linear time is proposed and a model checking method for policy controlled system (PCS) is presented. We conducted verification of a sample PCS for hotel reservation by our automatic verification tool and the experimental results showed the efficiency of the proposed method.
As a special kind of signature, the nominative proxy signature scheme is a method in which the designated proxy signer, instead of the original signer, generates a nominative signature and transmits it to a verifier. Recently, Seo et al. proposed a nominative proxy signature scheme for mobile communication and claimed that the scheme has non-repudiation. However, after analyzing the scheme, we show that it is insecure and cannot provide non-repudiation: a malicious original signer can forge the proxy signer's signature on any message. Finally, we also present a modified version of the scheme to repair the security flaw.
Ryo NAGATA Tatsuya IGUCHI Fumito MASUI Atsuo KAWAI Naoki ISU
In this paper, we propose a statistical model for detecting article errors, which Japanese learners of English often make in English writing. It is based on the three head words--the verb head, the preposition, and the noun head. To overcome the data sparseness problem, we apply the backed-off estimate to it. Experiments show that its performance (F-measure=0.70) is better than that of other methods. Apart from the performance, it has two advantages: (i) Rules for detecting article errors are automatically generated as conditional probabilities once a corpus is given; (ii) Its recall and precision rates are adjustable.
In the H.264 video coding standard, 7 modes {16
Bing-Fei WU Yen-Lin CHEN Chung-Cheng CHIU
In this study, we have proposed an efficient automatic multilevel thresholding method for image segmentation. An effective criterion for measuring the separability of the homogenous objects in the image, based on discriminant analysis, has been introduced to automatically determine the number of thresholding levels to be performed. Then, by applying this discriminant criterion, the object regions with homogeneous illuminations in the image can be recursively and automatically thresholded into separate segmented images. The proposed method is fast and effective in analyzing and thresholding the histogram of the image. In order to conduct an equitable comparative performance evaluation of the proposed method with other thresholding methods, a combinatorial scheme is also introduced to properly reduce the computational complexity of performing multilevel thresholding. The experimental results demonstrated that the proposed method is feasible and computationally efficient in automatic multilevel thresholding for image segmentation.
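A discriminant-analysis criterion of this kind can be illustrated with the standard Otsu split of a histogram, sketched below together with a separability ratio (between-class variance over total variance). This is the textbook single-threshold method, not the authors' recursive multilevel procedure; using the ratio to decide whether to split further is only one plausible reading of the abstract.

```python
def otsu_threshold(hist):
    """Return (threshold, separability) maximising the between-class variance.

    Levels <= threshold form class 0. Separability is the between-class
    variance divided by the total variance, a value in [0, 1].
    """
    total = sum(hist)
    mean_total = sum(i * h for i, h in enumerate(hist)) / total
    var_total = sum(h * (i - mean_total) ** 2
                    for i, h in enumerate(hist)) / total
    best_t, best_between = 0, 0.0
    w0 = 0       # pixel count in class 0
    sum0 = 0.0   # sum of grey levels in class 0
    for t in range(len(hist) - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split here
        m0 = sum0 / w0
        m1 = (mean_total * total - sum0) / w1
        between = (w0 / total) * (w1 / total) * (m0 - m1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    separability = best_between / var_total if var_total > 0 else 0.0
    return best_t, separability
```

A separability close to 1 indicates that the two classes are already nearly homogeneous, while a low value suggests that at least one class still mixes several intensity populations.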
When a dependency parser analyzes long sentences with fewer subjects than predicates, it is difficult for it to recognize which predicate governs which subject. To handle such syntactic ambiguity between subjects and predicates, we define a "subject clause (S-clause)" as a group of words containing several predicates and their common subject. This paper proposes a two-phase method for S-clause segmentation. The first phase reduces the number of candidates for S-clause boundaries, and the second performs S-clause segmentation using decision trees. In experimental evaluation, the S-clause information turned out to be effective for determining the governor of a subject and that of a predicate in dependency parsing. Further syntactic analysis using S-clauses achieved an improvement in precision of 5 percent.
Machine transliteration is an automatic method to generate characters or words in one alphabetical system for the corresponding characters in another alphabetical system. Machine transliteration can play an important role in natural language application such as information retrieval and machine translation, especially for handling proper nouns and technical terms. The previous works focus on either a grapheme-based or phoneme-based method. However, transliteration is an orthographical and phonetic converting process. Therefore, both grapheme and phoneme information should be considered in machine transliteration. In this paper, we propose a grapheme and phoneme-based transliteration model and compare it with previous grapheme-based and phoneme-based models using several machine learning techniques. Our method shows about 13
In the paper, we introduce TLM methodology focusing on IEEE 802.11 WLAN as a derivative system. Decomposing the entire system into several computation components, we analyzed the property of each transaction, resulting in the TLM. In the case of shared bus, the simulation results show the effect of communication architecture such as bus protocol and bus parameters on the system performance.
Jong Wook KWAK Ju-Hwan KIM Chu Shik JHON
Most branch predictors use the PC information of the branch instruction and its dynamic Global Branch History (GBH). In this letter, we suggest a Branch Direction History (BDH) as the third component of branch prediction and analyze its impact on the prediction accuracy. Additionally, we propose a new branch predictor, the direction-gshare predictor, which utilizes the BDH combined with the GBH. First, we model a neural network with (PC, GBH, and BDH) and analyze their actual impact on the branch prediction accuracy, and then we simulate our new predictor, the direction-gshare predictor. The simulation results show that the aliasing in the Pattern History Table (PHT) is significantly reduced by the additional use of BDH information. The direction-gshare predictor outperforms the bimodal predictor, the two-level adaptive predictor and the gshare predictor by up to 15.32%, 5.41% and 5.74%, respectively, without additional hardware costs.
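For reference, the conventional gshare baseline that the letter compares against can be sketched as below: the PHT of 2-bit saturating counters is indexed by the PC XORed with the global history. How the BDH is folded into the index of the direction-gshare predictor is not described in the abstract, so it is deliberately left out; the class name and parameters are assumptions of this example.

```python
class Gshare:
    """Baseline gshare predictor (no BDH), with 2-bit saturating counters."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.pht = [1] * (1 << index_bits)  # start weakly not-taken
        self.ghr = 0                        # global branch history register

    def _index(self, pc):
        # Hash the branch address with the global history.
        return (pc ^ self.ghr) & self.mask

    def predict(self, pc):
        """Return True if the branch is predicted taken."""
        return self.pht[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the selected counter, then shift the outcome into the history."""
        i = self._index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask
```

Two branches whose `pc ^ ghr` values collide share one counter; this is the PHT aliasing that the additional BDH information is reported to reduce.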
As the number of cluster system users increases, it is important to maintain stable operation. Although hardware preventive maintenance is important for sustaining smooth operation, hardware testing tools for cluster systems have received little attention. In this paper, we propose a memory testing tool for Linux cluster systems.
Achmad ARIFIN Takashi WATANABE Nozomu HOSHIMIYA
We proposed a fuzzy control scheme to implement the cycle-to-cycle control for restoring swing phase of gait using functional electrical stimulation (FES). We designed two fuzzy controllers for the biceps femoris short head (BFS) and the vastus muscles to control flexion and extension of the knee joint during the swing phase. Control capabilities of the designed fuzzy controllers were tested and compared to proportional-integral-derivative (PID) and adaptive PID controllers in automatic generation of stimulation burst duration and compensation of muscle fatigue through computer simulations using a musculo-skeletal model. Parameter adaptations in the adaptive PID controllers did not significantly improve the control performance of the PID controllers. The fuzzy controllers were superior to the PID and adaptive PID controllers under several subject conditions and different fatigue levels. These results showed the fuzzy controller would be suitable to implement the cycle-to-cycle control of FES-induced gait.
Jan ANGUITA Javier HERNANDO Alberto ABAD
Jacobian Adaptation (JA) has been successfully used in Automatic Speech Recognition (ASR) systems to adapt the acoustic models from the training to the testing noise conditions. In this work we present an improvement of JA for speaker verification, where a specific training noise reference is estimated for each speaker model. The new proposal, which will be referred to as Model-dependent Noise Reference Jacobian Adaptation (MNRJA), has consistently outperformed JA in our speaker verification experiments.