Halftoning is an important process to convert a gray scale image into a binary image with black and white pixels. The Direct Binary Search (DBS) is one of the well-known halftoning methods that can generate high quality binary images for middle tone of original gray scale images. However, binary images generated by the DBS have clippings, that is, have no tone in highlights and shadows of original gray scale images. The first contribution of this paper is to show the reason why the DBS generates binary images with clippings, to clarify the range of tone in original images that may have clipping, and to present a clipping-free DBS-based halftoning algorithm. The key idea is to apply the ordered dither using a threshold array generated by DBS-based method, to highlights and shadows, and then use the DBS. The second contribution is to extend the DBS to generate L-level multitone images with each pixel taking one of the intensity levels , , ..., . However, clippings appear in highlights, middle tone, and shadows of generated L-level multitone images. The third contribution of this paper is to modify the multitone version of the DBS to generate a clipping-free L-level multitone images. The resulting multitone images are so good that they reproduce the tones and the details of the original gray scale images very well.
Kenji KUMAKI Ikuo NAKAGAWA Kenichi NAGAMI Tomohiko OGISHI Shigehiro ANO
This paper proposes a point-to-multipoint (P2MP) Multi-Protocol Label Switching (MPLS) based hierarchical service management system. Traditionally, general management systems deployed in some service providers control MPLS Label Switched Paths (LSPs) (e.g., RSVP-TE and LDP) and services (e.g., L2VPN, L3VPN and IP) separately. In order for dedicated management systems for MPLS LSPs and services to cooperate with each other automatically, a hierarchical service management system has been proposed with the main focus on point-to-point (P2P) TE LSPs in MPLS path management. In the case where P2MP TE LSPs and services are deployed in MPLS networks, the dedicated management systems for P2MP TE LSPs and services must work together automatically. Therefore, this paper proposes a new algorithm that uses a correlation between P2MP TE LSPs and multicast VPN services based on a P2MP MPLS-based hierarchical service management architecture. Also, the capacity and performance of the proposed algorithm are evaluated by simulations, which are actually based on certain real MPLS production networks, and are compared to that of the algorithm for P2P TE LSPs. Results show this system is very scalable within real MPLS production networks. This system, with the automatic correlation, appears to be deployable in real MPLS production networks.
Rameswar DEBNATH Masakazu MURAMATSU Haruhisa TAKAHASHI
The core of the support vector machine (SVM) problem is a quadratic programming problem with a linear constraint and bounded variables. This problem can be transformed into the second order cone programming (SOCP) problems. An interior-point-method (IPM) can be designed for the SOCP problems in terms of storage requirements as well as computational complexity if the kernel matrix has low-rank. If the kernel matrix is not a low-rank matrix, it can be approximated by a low-rank positive semi-definite matrix, which in turn will be fed into the optimizer. In this paper we present two SOCP formulations for each SVM classification and regression problem. There are several search direction methods for implementing SOCPs. Our main goal is to find a better search direction for implementing the SOCP formulations of the SVM problems. Two popular search direction methods: HKM and AHO are tested analytically for the SVM problems, and efficiently implemented. The computational costs of each iteration of the HKM and AHO search direction methods are shown to be the same for the SVM problems. Thus, the training time depends on the number of IPM iterations. Our experimental results show that the HKM method converges faster than the AHO method. We also compare our results with the method proposed in Fine and Scheinberg (2001) that also exploits the low-rank of the kernel matrix, the state-of-the-art SVM optimization softwares SVMTorch and SVMlight. The proposed methods are also compared with Joachims 'Linear SVM' method on linear kernel.
Xianghui WEI Takeshi IKENAGA Satoshi GOTO
Motion estimation (ME) is a computation and data intensive module in video coding system. The search window reuse methods play a critical role in bandwidth reduction by exploiting the data locality in video coding system. In this paper, a search window reuse method (Level C+) is proposed for MPEG-2 to H.264/AVC transcoding. The proposed method is designed for ultra-low bandwidth application, while the on-chip memory is not a main constraining factor. By loading search window for the motion estimation unit (MEU) and applying motion vector clipping processing, each MB in MEU can utilize both horizontal and vertical search reuse. A very low bandwidth level (Rα<2) can be achieved with an acceptable on-chip memory.
Woo Joo KIM Sung Hee LEE Sun Young HWANG
This paper presents a hierarchical NoC architecture to support GT (Guaranteed Throughput) signals to process multimedia data in embedded systems. The architecture provides a communication environment that meets the diverse conditions of communication constraints among IPs in power and area. With a system based on packet switching, which requires storage/control circuits to support GT signals, it is hard to satisfy design constraints in area, scalability and power consumption. This paper proposes a hierarchical 444 mesh-type NoC architecture based on circuit switching, which is capable of processing GT signals requiring high throughput. The proposed NoC architecture shows reduction in area by 50.2% and in power consumption by 57.4% compared with the conventional NoC architecture based on circuit switching. These figures amount to by 72.4% and by 86.1%, when compared with an NoC architecture based on packet switching. The proposed NoC architecture operates in the maximum throughput of 19.2 Gb/s.
Kaikai CHI Xiaohong JIANG Susumu HORIGUCHI
Recently, a promising packet forwarding architecture COPE was proposed to essentially improve the throughput of multihop wireless networks, where each network node can intelligently encode multiple packets together and forward them in a single transmission. However, COPE is still in its infancy and has the following limitations: (1) COPE adopts the FIFO packet scheduling and thus does not provide different priorities for different types of packets. (2) COPE simply classifies all packets destined to the same nexthop into small-size or large-size virtual queues and examines only the head packet of each virtual queue to find coding solutions. Such a queueing structure will lose some potential coding opportunities, because among packets destined to the same nexthop at most two packets (the head packets of small-size and large-size queues) will be examined in the coding process, regardless of the number of flows. (3) The coding algorithm adopted in COPE is fast but cannot always find good solutions. In order to address the above limitations, in this paper we first present a new queueing structure for COPE, which can provide more potential coding opportunities, and then propose a new packet scheduling algorithm for this queueing structure to assign different priorities to different types of packets. Finally, we propose an efficient coding algorithm to find appropriate packets for coding. Simulation results demonstrate that this new COPE architecture can further greatly improve the node transmission efficiency.
Yu WU Taisuke IZUMI Fukuhito OOSHITA Hirotsugu KAKUGAWA Toshimitsu MASUZAWA
Resource search is a fundamental problem in large-scale and highly dynamic Peer-to-Peer (P2P) systems. Unstructured search approaches are widely used because of their flexibility and robustness. However, such approaches incur high communication cost. The index-dissemination-based search is a kind of efficient unstructured search approach. We investigate such approaches with respect to minimize the system communication cost. Based on a dynamic system model that peers continuously leave and join, we solve two problems. One problem is how to efficiently disseminate and maintain a given number of indices. Another is to determine the optimal number of indices for each resource object of a given popularity. Finally, we propose an optimized index dissemination scheme which is fully decentralized and self-adaptive. A remarkable advantage is that the scheme yields no additional communication cost to achieve the self-adaptive feature.
Yasuhide KURAMOCHI Akira MATSUZAWA Masayuki KAWABATA
We present a 10-bit 1-MS/s successive approximation analog-to-digital converter core including a charge redistribution digital-to-analog converter and a comparator. A new linearity calibration technique enables use of a nearly minimum capacitor limited by kT/C noise. The ADC core without digital control blocks has been fabricated in a 0.18-µm CMOS process and consumes 118 µW at 1.8 V power supply. Also, the active area of ADC core is realized to be 0.027 mm2. The calibration improves the SNDR by 13.4 dB and the SFDR by 21.0 dB. The measured SNDR and SFDR at 1 kHz input are 55.2 dB and 73.2 dB respectively.
Nam-Geun KIM Youngsu PARK Jong-Wook KIM Eunsu KIM Sang Woo KIM
In this paper, we present a recently developed pattern search method called Genetic Pattern Search algorithm (GPSA) for the global optimization of cost function subject to simple bounds. GPSA is a combined global optimization method using genetic algorithm (GA) and Digital Pattern Search (DPS) method, which has the digital structure represented by binary strings and guarantees convergence to stationary points from arbitrary starting points. The performance of GPSA is validated through extensive numerical experiments on a number of well known functions and on robot walking application. The optimization results confirm that GPSA is a robust and efficient global optimization method.
Shangce GAO Qiping CAO Catherine VAIRAPPAN Jianchen ZHANG Zheng TANG
This paper describes an improved local search method for synthesizing arbitrary Multiple-Valued Logic (MVL) function. In our approach, the MVL function is mapped from its algebraic presentation (sum-of-products form) on a multiple-layered network based on the functional completeness property. The output of the network is evaluated based on two metrics of correctness and optimality. A local search embedded with chaotic dynamics is utilized to train the network in order to minimize the MVL functions. With the characteristics of pseudo-randomness, ergodicity and irregularity, both the search sequence and solution neighbourhood generated by chaotic variables enables the system to avoid local minimum settling and improves the solution quality. Simulation results based on 2-variable 4-valued MVL functions and some other large instances also show that the improved local search learning algorithm outperforms the traditional methods in terms of the correctness and the average number of product terms required to realize a given MVL function.
The deployment of historical trajectories of moving objects has greatly increased for various applications in road networks. For instance, similar patterns of moving-object trajectories are very useful for designing the transportation network of a new city. In this paper, we define a spatio-temporal similarity measure based on a road network distance, rather than a Euclidean distance. We also propose a new similar trajectory search algorithm based on the spatio-temporal measure by using an efficient pruning mechanism. Finally, we show the efficiency of our algorithm, both in terms of retrieval accuracy and retrieval efficiency.
Kyo TAKAHASHI Shingo SATO Tadamichi KUDO Yoshitaka TSUNEKAWA
In this report, we propose a high-performance pipelined VLSI architecture of the LMS adaptive filter derived by a cut-set retiming technique. The proposed architecture has a peculiar pipelined form with 3 adaptation delays, and the FIR filter portion has a peculiar class of the transposed form providing a minimum output latency and coefficient delay. Both the delays, the adaptation delay and coefficient delay, are compensated by a look-ahead conversion. A new high-speed 4-input and 2-output CSA type adder with a small hardware is employed. The proposed architecture can achieve a good convergence property, high-sampling rate, minimum output latency, small hardware, and lower power dissipation, simultaneously, and is very suitable to implement on the VLSI.
Jun YAJIMA Terutoshi IWASAKI Yusuke NAITO Yu SASAKI Takeshi SHIMOYAMA Thomas PEYRIN Noboru KUNIHIRO Kazuo OHTA
This paper proposes a new algorithm for evaluating the number of chaining variable conditions (CVCs) in the selecting step of a disturbance vector (DV) for the analysis of SHA-1 collision search. The algorithm is constructed by combining four strategies, that can evaluate the number of CVCs more strictly compared with the previous approach. By using our method, we found some DVs that have 57 (or 59) essential CVCs for 1st (or 2nd) block in the case if we assume that we can modify messages up to step 25, which we have not confirmed the practicability of the assumption.
Jeong Ki KIM Hyunseuk YOO Moon Ho LEE
The weakness of implementation for LDPC encoder is that conventional binary Matrix Vector Multiplier has many clock cycles which lead to limited throughput. In this letter in order to construct efficient architecture, we target on IEEE 802.16e LDPC encoders. Over the standard H matrices with Circulant Permutation Matrices, we propose semi-parallel architecture by using cyclic right shift registers and exclusive-OR instead of complex Matrix Vector Multipliers. Proposed efficient encoder for IEEE 802.16e LDPC satisfies compact size and high throughput.
Kumiko MAEBASHI Nobuo SUEMATSU Akira HAYASHI
The mixture modeling framework is widely used in many applications. In this paper, we propose a component reduction technique, that collapses a Gaussian mixture model into a Gaussian mixture with fewer components. The EM (Expectation-Maximization) algorithm is usually used to fit a mixture model to data. Our algorithm is derived by extending mixture model learning using the EM-algorithm. In this extension, a difficulty arises from the fact that some crucial quantities cannot be evaluated analytically. We overcome this difficulty by introducing an effective approximation. The effectiveness of our algorithm is demonstrated by applying it to a simple synthetic component reduction task and a phoneme clustering problem.
Yong-Chun PIAO Jinwoo CHOE Wonjin SUNG Dong-Joon SHIN
In this letter, we propose combinatorial and search construction methods of 2-D multi-weight optical orthogonal codes (OOCs) with autocorrelation 0 and crosscorrelation 1, called multi-weight single or no pulse per row (MSNPR) codes. An upper bound on the size of MSNPR codes is derived and the performance of MSNPR codes is compared to those of other OOCs in terms of the bit error rate (BER) and evaluated using blocking probability. It is also demonstrated that the MSNPR codes can be flexibly constructed for different applications, providing the scalability to optical CDMA networks.
Tianruo ZHANG Guifen TIAN Takeshi IKENAGA Satoshi GOTO
Intra coding in H.264/AVC has significantly enhanced video compression efficiency. However, computation complexity increases by the rate-distortion (RD) based mode decision. This paper proposes a novel fast mode decision algorithm in H.264/AVC intra prediction and its VLSI architecture. A novel edge-detection pattern is proposed and both edge-detection technique and spatial mode prediction technique are combined together to reduce the number of intra 44 candidate modes from 9 to an average of 2.50. VLSI architecture of intra mode decision module is designed with TSMC 0.18 µm CMOS technology. The maximum frequency of 285 MHz is achieved and 13.1k NAND gates are required. High frequency, efficient processing cycle reduction and small area make this design to be an excellent accelerator for HDTV 1080p@30 fps real time encoder.
Takeshi ITO Hiroyuki OHSAKI Makoto IMASE
In this paper, we propose an extension to GridFTP that optimizes its performance by dynamically adjusting the number of parallel TCP connections. GridFTP has been used as a data transfer protocol to effectively transfer a large volume of data in Grid computing. GridFTP supports a feature called parallel data transfer that improves throughput by establishing multiple TCP connections in parallel. However, for achieving high GridFTP throughput, the number of TCP connections should be optimized based on the network status. In this paper, we propose an automatic parallelism tuning mechanism called GridFTP-APT (GridFTP with Automatic Parallelism Tuning) that adjusts the number of parallel TCP connections according to information available to the Grid middleware. Through simulations, we demonstrate that GridFTP-APT significantly improves the performance of GridFTP in various network environments.
Mariko MATSUMOTO Takashi MOCHIZUKI
This letter proposes a fast carrier search method, the Carrier Search Step method (CSSM), to quickly detect the carrier frequency even when mobile stations have no knowledge of the carrier frequencies used [1]. CSSM consists of two operations: 1) mobile stations use the coarse-to-fine search to detect the synchronization channel (SCH), and 2) the center frequency of SCH is shifted within the channel bandwidth so that mobile stations can detect the SCH in an early step of the coarse-to-fine search. Compared with conventional methods, CSSM can reduce carrier search time by 90% when SCH bandwidth is 1.08 MHz and the channel bandwidth is 5 MHz. The reduction in carrier search time strengthens as the channel bandwidth increases.
Xin CHEN Jun YANG Long-xing SHI
A novel fast lock-in digitally controlled phase-locked loop (DCPLL) is proposed in this letter. This DCPLL adopts a novel frequency search algorithm to reduce the lock-in time. Furthermore, to reduce the power consumption, the frequency divider is reused as a frequency detector during the frequency acquisition, and reused as a time-to-digital converter module during the phase acquisition. To verify the proposed algorithm and architecture, a DCPLL design is implemented by SMIC 0.18 µm 1P6M CMOS technology. The Spice simulation results show that the DCPLL can achieve frequency acquisition in 3 reference cycles and complete phase acquisition in 11 reference cycles when locking to 200 MHz. The corresponding power consumption of DCPLL is 3.71 mW.