Kunihiko YANAGIBASHI Yasuhiro TAKASHIMA Yuichi NAKAMURA
In this paper, we propose a novel migration method. In this method, the resultant placement retains the structure of the original placement, called model placement, as much as possible. For this purpose, we minimize the sum of the difference in area between the model placement and the relocated one and the total amount of displacement between them. Moreover, to achieve a short runtime, we limit the solution space and change the packing origin in the optimization process. We construct the system on Sequence-Pair. Experimental results show that our approach preserves the chip area and the overall circuit structure with 98% less runtime than that realized by naive simulated annealing.
We propose an accurate, distributed localization method that uses the connectivity measure to localize nodes in a wireless sensor network. The proposed method is based on a self-organizing isometric embedding algorithm that adaptively emphasizes the most accurate range of measurements and naturally accounts for communication constraints within the sensor network. Each node adaptively chooses a neighborhood of sensors and updates its estimate of position by minimizing a local cost function and then passes this update to the neighboring sensors. Simulations demonstrate that the proposed method is more robust to measurement error than previous methods and it can achieve comparable results using much fewer anchor nodes than previous methods.
This paper proposes a method (Group-Linking Method) that has control over the complexity of the sequential function to construct Finite Memory Machines with minimal order--the machines have the largest number of states based on their memory taps. Finding a machine with maximum number of states is a nontrivial problem because the total number of machines with memory order k is (256)2k-2, a pretty large number. Based on the analysis of Group-Linking Method, it is shown that the amount of data necessary to reconstruct an FMM is the set of strings not longer than the depth of the machine plus one, which is significantly less than that required for traditional greedy-based machine learning algorithm. Group-Linking Method provides a useful systematic way of generating unified benchmarks to evaluate the capability of machine learning techniques. One example is to test the learning capability of recurrent neural networks. The problem of encoding finite state machines with recurrent neural networks has been extensively explored. However, the great representation power of those networks does not guarantee the solution in terms of learning exists. Previous learning benchmarks are shown to be not rich enough structurally in term of solutions in weight space. This set of benchmarks with great expressive power can serve as a convenient framework in which to study the learning and computation capabilities of various network models. A fundamental understanding of the capabilities of these networks will allow users to be able to select the most appropriate model for a given application.
Kazunori KOBAYASHI Ken'ichi FURUYA Yoichi HANEDA Akitoshi KATAOKA
We previously proposed a method of sound source and microphone localization. The method estimates the locations of sound sources and microphones from only time differences of arrival between signals picked up by microphones even if all their locations are unknown. However, there is a problem that some estimation results converge to local minimum solutions because this method estimates locations iteratively and the error function has multiple minima. In this paper, we present a new iterative method to solve the local minimum problem. This method achieves accurate estimation by selecting effective initial locations from many random initial locations. The computer simulation and experimental results demonstrate that the presented method eliminates most local minimum solutions. Furthermore, the computational complexity of the presented method is similar to that of the previous method.
The fixed charge transportation problem (FCTP) is a classic challenge for combinatorial optimization; it is based on the well-known transportation problem (TP), and is one of the prime examples of an NP-complete variant of the TP, of general importance in a wide range of transportation network design problems. Many techniques have been applied to this problem, and the most effective so far (in terms of near-optimal results in reasonable time on large instances) are evolutionary algorithm based approaches. In particular, an EA proposed by Eckert and Gottlieb has produced the best performance so far on a set of specific benchmark instances. We introduce a new scheme, which has more general applicability, but which we test here on the FCTP. The proposed scheme applies an adaptive mutation process immediately following the evaluation of a phenotype. It thereby adapts automatically to learned information encoded in the chromosome. The underlying encoding approach is to encode an ordering of elements for interpretation by a constructive algorithm (such as with the Link and Node Biased encoding for spanning trees, and the Random Keys encoding which has been applied to both scheduling and graph problems), however the main adaptive process rewards links in such a way that genes effectively encode a measure of the number of times their associated link has appeared in selected solutions. Tests are done which compare our approach with Eckert and Gottlieb's results on benchmark FCTP instances, and other approaches.
This paper proposes a flexible VLSI architecture design for motion estimation in H.264/AVC using a high-efficiency variable block-size decision (VBSD) approach. The proposed VBSD approach can perform a full motion search on integrating the 44 block sizes into 48, 84, 88, 816, 168, or 1616 block sizes and then appropriately select the optimal modes for motion compensation operating. In other words, the proposed architecture based on the VBSD approach can effectively reduce the encoding time of the motion estimation by dealing with different block sizes under 1616 searching range. Using the TSMC 0.18-µm CMOS technology, the proposed architecture has been successfully realized. Simulation and verification results show that the proposed architecture has significant bit-rate reduction and small PSNR degradation. Also, the physical chip design revealed that the maximum frame rate of this work can process 704 fps with QCIF (176144), 176 fps with CIF (352288) and 44 fps with 4CIF (704576) video resolutions under lower gate counts and higher working frequency.
Jinhwan KIM Jeonghun CHO Tag Gon KIM
In these days, many dynamically reconfigurable architectures have been introduced to fill the gap between ASICs and software-programmed processors such as GPPs and DSPs. These reconfigurable architectures have shown to achieve higher performance compared to software-programmed processors. However, reconfigurable architectures suffer from a significant reconfiguration overhead and a speedup limitation. By reducing the reconfiguration overhead, the overall performance of reconfigurable architectures can be improved. Therefore, we will describe temporal partitioning, which are able to amortize the reconfiguration overhead at synthesis phase or compilation time. Our temporal partitioning methodology splits a configuration context into temporal partitions to amortize reconfiguration overhead. And then, we will present benchmark results to demonstrate the effectiveness of our methodology.
Phanumas KHUMSAT Apisak WORAPISHET
A compact OTA suitable for low-voltage active-RC and MOSFET-C filters is presented. The input stage of the OTA utilises the NMOS pseudo-differential amplifier with PMOS active load. The output stage relies upon the dual-mode feed-forward class-AB technique (based on an inverter-type transconductor) with common-mode rejection capability that incurs no penalty on transconductance/bias-current efficiency. Simulation results of a 0.5-V 100-kHz 5th-order Chebyshev filter based on the proposed OTA in a 0.18 µm CMOS process indicate SNR and SFDR of 68 dB and 63 dB (at 50 kHz+55 kHz) respectively. The filter consumes total power consumption of 60 µW.
Kyeongyeon KIM Jaesang HAM Chungyong LEE
Owing to frequency diversity gain and simplicity, a chip interleaved multi-carrier code division multiple access (MC-CDMA) system has been considered in a multi-cell environment, and combining it with multiple input multiple output (MIMO) schemes offers high spectral efficiency. In spite of the advantages, the system is highly affected by inter code interferences. To control them, this letter analyzes the asymptotic performance of a MIMO MC-CDMA system with a symbol level MMSE receiver in a multi-cell environment and presents an appropriate multiplexed code ratio satisfying a required error rate.
Daisuke KOSAKA Makoto NAGATA Yoshitaka MURASAKA Atsushi IWATA
Chip-level substrate coupling analysis uses F-matrix computation with slice-and-stack execution to include highly concentrated substrate resistivity gradient. The technique that has been applied to evaluation of device-level isolation structures against substrate coupling is now developed into chip-level substrate noise analysis. A time-series divided parasitic capacitance (TSDPC) model is equivalent to a transition controllable noise source (TCNS) circuit that captures noise generation in a CMOS digital circuit. A reference structure incorporating TCNS circuits and an array of on-chip high precision substrate noise monitors provides a basis for the verification of chip-level analysis of substrate coupling in a given technology. Test chips fabricated in two different wafer processings of 0.30-µm and 0.18-µm CMOS technologies demonstrate the universal availability of the proposed analysis techniques. Substrate noise simulation achieves no more than 3 dB discrepancy in peak amplitude compared to measurements with 100-ps/100-µV resolution, enabling precise evaluation of the impacts of the distant placements of sensitive devices from sources of noise as well as application of guard ring/band structures.
This paper presents a self-reconfigurable adaptive FIR filter system design using dynamic partial reconfiguration, which has flexibility, power efficiency, advantages of configuration time allowing dynamically inserting or removing adaptive FIR filter modules. This self-reconfigurable adaptive FIR filter is responsible for providing the best solution for realization and autonomous adaptation of FIR filters, and processes the optimal digital signal processing algorithms, which are the low-pass, band-pass and high-pass filter algorithms with various frequencies, for noise removal operations. The proposed stand-alone self-reconfigurable system using Xilinx Virtex4 FPGA and Compact-Flash memory shows the improvement of configuration time and flexibility by using the dynamic partial reconfiguration techniques.
Takefumi MIYOSHI Nobuhiko SUGINO
For a coarse grain dynamic reconfigurable processing unit cooperating with a general purpose processor, a context selection method, which can reduce total execution cycles of a given program, is proposed. The method evaluates context candidates from a given program, in terms of reduction in cycles by exploiting parallel and pipeline execution of the reconfigurable processor. According to this evaluation measure, the method selects appropriate contexts for the dynamic reconfigurable processing unit. The proposed method is implemented on the framework of COINS project. For several example programs, the generated codes are evaluated by a software simulator in terms of execution cycles, and these results prove the effectiveness of the proposed method.
In this paper we propose a discrete program-size dependent software reliability growth model flexibly describing the software failure-occurrence phenomenon based on a discrete Weibull distribution. We also conduct model comparisons of our discrete SRGM with existing discrete SRGMs by using actual data sets. The program size is one of the important metrics of software complexity. It is known that flexible discrete software reliability growth modeling is difficult due to the mathematical manipulation under a conventional modeling-framework in which the time-dependent behavior of the cumulative number of detected faults is formulated by a difference equation. Our discrete SRGM is developed under an existing unified modeling-framework based on the concept of general order-statistics, and can incorporate the effect of the program size into software reliability assessment. Further, we discuss the method of parameter estimation, and derive software reliability assessment measures of our discrete SRGM. Finally, we show numerical examples of discrete software reliability analysis based on our discrete SRGM by using actual data.
Dong-Sun KIM Taeo HWANG Seung-Yerl LEE Kwang-Ho WON Byung-Soo KIM Seong-Dong KIM Duck-Jin CHUNG
The Ubiquitous sensor network (USN) node is required to operate for several months with limited system resources such as memory and power. The typical USN node is in the active state for less than 1% of its several month lifetime and waits in the inactive state for the remaining 99% of its lifetime. This paper suggests a power adjustment dual priority scheduler (PA-DPS) that offers low power consumption while meeting the USN requirements by estimating power consumption in the USN node. PA-DPS has been designed based on the event-driven approach and the dual-priority scheduling structure, which has been conventionally suggested in the real-time system field. From experimental results, PA-DPS reduced the inactive mode current up to 40% under the 1% duty cycle.
A simplified equalization method based on the band structure of the frequency domain channel matrix is proposed for the single carrier systems employing cyclic prefix (SC-CP) over time-varying wireless channels. Using both theoretical analysis and computer simulation, it is shown that the complexity of this method is proportional to the number of symbols in one SC-CP block and is less than that of traditional block equalization methods. We also show that they have similar performance.
Daihan WANG Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO
A temporal correlation based port combination algorithm that customizes the router design in Network-on-Chip (NoC) is proposed for reconfigurable systems in order to minimize required hardware amount. Given the traffic characteristics of the target application and the expected hardware amount reduction rate, the algorithm automatically makes the port combination plan for the networks. Since the port combination technique has the advantage of almost keeping the topology including two-surface layout, it does not affect the design of the other layer, such as task mapping and scheduling. The algorithm shows much better efficiency than the algorithm without temporal correlation. For the multimedia stream processing application, the algorithm can save 55% of the hardware amount without performance degradation, while the none temporal correlation algorithm suffers from 30% performance loss.
Akihide HORITA Kenji NAKAYAMA Akihiro HIRANO
FeedForward (FF-) Blind Source Separation (BSS) systems have some degree of freedom in the solution space. Therefore, signal distortion is likely to occur. First, a criterion for the signal distortion is discussed. Properties of conventional methods proposed to suppress the signal distortion are analyzed. Next, a general condition for complete separation and distortion-free is derived for multi-channel FF-BSS systems. This condition is incorporated in learning algorithms as a distortion-free constraint. Computer simulations using speech signals and stationary colored signals are performed for the conventional methods and for the new learning algorithms employing the proposed distortion-free constraint. The proposed method can well suppress signal distortion, while maintaining a high source separation performance.
Our purpose is to estimate conditional probabilities of output labels in multiclass classification problems. Adaboost provides highly accurate classifiers and has potential to estimate conditional probabilities. However, the conditional probability estimated by Adaboost tends to overfit to training samples. We propose loss functions for boosting that provide shrinkage estimator. The effect of regularization is realized by shrinkage of probabilities toward the uniform distribution. Numerical experiments indicate that boosting algorithms based on proposed loss functions show significantly better results than existing boosting algorithms for estimation of conditional probabilities.
Ikuo AWAI Yangjun ZHANG Tetsuya ISHIDA Tsuyoshi SUZUKI
A new unified method is proposed to calculate the basic resonator parameters, i.e., the resonant frequency, external Q, unloaded Q and coupling coefficient in the time domain. By exciting the resonator from a weakly coupled external circuit, one can inject only a narrow resonant spectrum from the broad spectrum of the excitation pulse. The resonant frequency is easily counted by the number of zero crossings of the internal field intensity, whereas the Q's are calculated by the decay rate of the field amplitude. The coupling coefficient computed by the energy exchange rate between two resonators completes the new time domain algorithm.
Jianping QIAO Ju LIU Yen-Wei CHEN
Most learning-based super-resolution methods neglect the illumination problem. In this paper we propose a novel method to combine blind single-frame super-resolution and shadow removal into a single operation. Firstly, from the pattern recognition viewpoint, blur identification is considered as a classification problem. We describe three methods which are respectively based on Vector Quantization (VQ), Hidden Markov Model (HMM) and Support Vector Machines (SVM) to identify the blur parameter of the acquisition system from the compressed/uncompressed low-resolution image. Secondly, after blur identification, a super-resolution image is reconstructed by a learning-based method. In this method, Logarithmic-wavelet transform is defined for illumination-free feature extraction. Then an initial estimation is obtained based on the assumption that small patches in low-resolution space and patches in high-resolution space share a similar local manifold structure. The unknown high-resolution image is reconstructed by projecting the intermediate result into general reconstruction constraints. The proposed method simultaneously achieves blind single-frame super-resolution and image enhancement especially shadow removal. Experimental results demonstrate the effectiveness and robustness of our method.