Wen CHEN Jie CHEN Shuichi ITOH
Following our former works on regular sampling in wavelet subspaces, the paper provides two algorithms to estimate the truncation error and aliasing error respectively when the theorem is applied to calculate concrete signals. Furthermore the shift sampling case is also discussed. Finally some important examples are calculated to show the algorithm.
Jie CHEN Shuichi ITOH Takeshi HASHIMOTO
A new method for the compression of electrocardiographic (ECG) data is presented. The method is based on the orthonormal wavelet analysis recently developed in applied mathematics. By using wavelet transform, the original signal is decomposed into a set of sub-signals with different frequency channels corresponding to the different physical features of the signal. By utilizing the optimum bit allocation scheme, each decomposed sub-signal is treated according to its contribution to the total reconstruction distortion and to the bit rate. In our experiments, compression ratios (CR) from 13.5: 1 to 22.9: 1 with the corresponding percent rms difference (PRD) between 5.5% and 13.3% have been obtained at a clinically acceptable signal quality. Experimental results show that the proposed method seems suitable for the compression of ECG data in the sense of high compression ratio and high speed.
Trong-Yen LEE Pao-Ann HSIUNG Sao-Jie CHEN
A novel Multi-Level Partitioning (MLP) technique taking into account real-world constraints for hardware-software partitioning in Distributed Embedded Multiprocessor Systems (DEMS) is proposed. This MLP algorithm uses a gradient metric based on hardware-software cost and performance as the core metric for selection of optimal partitions and consists of three nested levels. The innermost level is a simple binary search that allows quick evaluations of a large number of possible partitions. The middle level iterates over different possible allocations of processors (that execute software) to subsystems. The outermost level iterates over the number of processors and the hardware cost range. Heuristics are applied to each level to avoid the expensive exhaustive search. The application of MLP as a recently purposed Distributed Embedded System Codesign (DESC) methodology shows its feasibility. Comparisons between real-world examples partitioned using MLP and using other existing techniques demonstrate contrasting strengths of MLP. Sharing, clustering, and hierarchical system model are some important features of MLP, which contribute towards producing more optimal partition results.
Po-Hsun CHENG Sao-Jie CHEN Jin-Shin LAI
This work elucidates the evolution of three generations of the laboratory information system in the National Taiwan University Hospital, which were respectively implemented in an IBM Series/1 minicomputer, a client/server and a plug-and-play HL7 interface engine environment respectively. The experience of using the HL7 healthcare information exchange in the hospital information system, laboratory information system, and automatic medical instruments over the past two decades are illustrated and discussed. The latest design challenge in developing intelligent laboratory information services is to organize effectively distributed and heterogeneous medical instruments through the message gateways. Such experiences had spread to some governmental information systems for different purposes in Taiwan; besides, the healthcare information exchange standard, software reuse mechanism, and application service provider adopted in developing the plug-and-play laboratory information system are also illustrated.
Pao-Ann HSIUNG Trong-Yen LEE Sao-Jie CHEN
A formal system-level synthesis model for the concurrent object-oriented design of parallel computer systems, called Multi-token Object-oriented Bi-directional net (MOBnet), is proposed. The MOBnet model extends the standard Petri net by defining (1) multiple tokens to represent different kinds of synthesis control information, (2) object-oriented nodes (places) to denote the system parts under synthesis, and (3) bi-directional arcs to model the design completion check and synthesis rollback operations. In this paper, we first show that MOBnet can serve as a pre-fabrication design methodology analysis tool in ways such as class hierarchy construction, design specification comparison, reachability analysis, and concurrent process management and analysis. We then formally prove MOBnet to be a valid model for concurrent synthesis and give experimental application examples to verify. Finally, solution schemes for the design completion check and synthesis rollback problems are formally validated by analyzing the dynamic behavior of MOBnet, and experimentally illustrated through examples.
Ji LI Chen HE Jie CHEN Dongjian WANG
The recognition vector of the decision-theoretic approach and that of cumulant-based classification are combined to compose a higher dimension hyperspace to get the benefits of both methods. The method proposed in this paper can cover more kinds of signals including signals with order higher than 4 in the AWGN channel even under low SNR values, i.e. those down to -5 dB. The composed vector is input into an RBF neural network to get more reasonable reference points. Eleven kinds of signals, say 2ASK, 4ASK, 8ASK, 2PSK, 4PSK, 8PSK, 2FSK 4FSK, 8FSK, 16QAM and 64QAM, are involved in the discussion.
Yuyu YUAN Chuanyi LIU Jie CHENG Xiaoliang WANG
Execution performance is critical for large-scale and data-intensive workflows. This paper proposes DISWOP, a novel scheduling algorithm for data-intensive workflow optimizations; it consists of three main steps: workflow process generation, task & resource mapping, and task clustering. To evaluate the effectiveness and efficiency of DISWOP, a comparison evaluation of different workflows is conducted a prototype workflow platform. The results show that DISWOP can speed up execution performance by about 1.6-2.3 times depending on the task scale.
Xutao LI Minjie CHEN Hirofumi SHINOHARA Tsutomu YOSHIHARA
Small loop gain and low crossover frequency result in poor dynamic performance of a single-loop output voltage controlled boost converter in continuous conduction mode. Multi-loop current control can improve the dynamic performance, however, the cost, size and weight of the circuit will also be increased. Sensorless multi-loop control solves the problems, however, the difficulty of the closed-loop characteristics evaluation will be severely aggravated, because there are more parameters in the loops, meanwhile, different from the single-loop, the relationships between the loop gains and closed-loop characteristics including audio susceptibility and output impedance are generally indirect for the multi-loop. Therefore, in this paper, a novel robust H∞ synthesis approach in the time-domain is proposed to design a sensorless controller for boost converters, which need not solve any algebraic Riccati equation or linear matrix inequalities, and most importantly, provides an approach to parameterizing the controller by an adjustable parameter. The adjustable parameter behaves like a ‘knob’ on the dynamic performance, consequently, which makes the closed-loop characteristics evaluation straightforward. A boost converter is used to verify the proposed synthesis approach. Simulations show the great convenience of the closed-loop characteristics evaluation. Practical experiments confirm the simulations.
Jiann-Fu LIN Win-Bin SEE Sao-Jie CHEN
This paper investigates the problem of scheduling parallel tasks" with consideration of communication cost on an m-processor system, where processors are assumed to be identical and tasks being scheduled are independent such that they can run on more than one processor simultaneously. Once a task is processed in parallel, its finish time will be speeded up, but communication cost will also be incurred and should be taken into account. To find a schedule with minimum finish time for the parallel tasks scheduling problem is NP-hard. Therefore, in this paper, we will propose a heuristic algorithm for this kind of problem and derive its performance bounds for two different cases of applications, respectively.
Po-Hsun CHENG Sao-Jie CHEN Jin-Shin LAI Feipei LAI
This paper illustrates a feasible health informatics domain knowledge management process which helps gather useful technology information and reduce many knowledge misunderstandings among engineers who have participated in the IBM mainframe rightsizing project at National Taiwan University (NTU) Hospital. We design an asynchronously sharing mechanism to facilitate the knowledge transfer and our health informatics domain knowledge management process can be used to publish and retrieve documents dynamically. It effectively creates an acceptable discussion environment and even lessens the traditional meeting burden among development engineers. An overall description on the current software development status is presented. Then, the knowledge management implementation of health information systems is proposed.
Given an odd prime q and an integer m ≤ q, an array-based parity-check matrix H(m,q) can be constructed for a quasi-cyclic low-density parity-check (LDPC) code C(m,q). For m=4 and q ≥ 11, we prove the stopping distance of H(4,q) is 10, which is equal to the minimum Hamming distance of the associated code C(4,q). In addition, a tighter lower bound on the stopping distance of H(m,q) is also given for m > 4 and q ≥ 11.
Weijie CHEN Ming CAI Xiaojun TAN Bo WEI
Accurate estimation of the state-of-charge is a crucial need for the battery, which is the most important power source in electric vehicles. To achieve better estimation result, an accurate battery model with optimum parameters is required. In this paper, a gradient-free optimization technique, namely tree seed algorithm (TSA), is utilized to identify specific parameters of the battery model. In order to strengthen the search ability of TSA and obtain more quality results, the original algorithm is improved. On one hand, the DE/rand/2/bin mechanism is employed to maintain the colony diversity, by generating mutant individuals in each time step. On the other hand, the control parameter in the algorithm is adaptively updated during the searching process, to achieve a better balance between the exploitation and exploration capabilities. The battery state-of-charge can be estimated simultaneously by regarding it as one of the parameters. Experiments under different dynamic profiles show that the proposed method can provide reliable and accurate estimation results. The performance of conventional algorithms, such as genetic algorithm and extended Kalman filter, are also compared to demonstrate the superiority of the proposed method in terms of accuracy and robustness.
Zhijie CHEN Masaya MIYAHARA Akira MATSUZAWA
This paper proposes an opamp-free solution to implement single-phase-clock controlled noise shaping in a SAR ADC. Unlike a conventional noise shaping SAR ADC, the proposal realizes noise shaping by charge redistribution, which is a passive technique. The passive implementation has high power efficiency. Meanwhile, since the proposal maintains the basic architecture and operation method of a traditional SAR ADC, it retains all the advantages of a SAR ADC. Furthermore, noise shaping helps to improve the performance of SAR ADC and relaxes its non-ideal effects. Designed in a 65-nm CMOS technology, the prototype realizes 58-dB SNDR based on an 8-bit C-DAC at 50-MS/s sampling frequency. It consumes 120.7-µW power from a 0.8-V supply and achieves a FoM of 14.8-fJ per conversion step.
Jie CHEN Guoliang SHOU Changming ZHOU
High-speed low-power matched filter plays an important role in the fast despreading of spread-signals in wideband code division multiple access (W-CDMA) mobile communications. In this paper, we describe the algorithm and the VLSI-architecture of a complex matched filter chip implemented by our proposed digital-controlled analog parallel operational circuits. The complex matched filter VLSI with variable taps from 4 to 128 is developed for despreading QPSK-modulated spread-signals for W-CDMA communications, which is fabricated by a 2-metal 0.8 µm CMOS technology. The dissipation power of the chip is 225 mW and 130 mW when it operates at the chip-rate of 20 MHz with the supply voltages of 3.0 V and 2.5 V, respectively, and it can be furthermore reduced to 62 mW at chip rate of 10 MHz when the supply voltage is lowered to 2.2 V. The 3-dB cut-off frequency of the fabricated chip is higher than 20 MHz for both 3.0 V and 2.5 V supplies. Comparing to pure digital matched filters, the massive and high-speed despreading operations of the spread-signals are directly carried out in analog domain. As a result, two high-speed analog-to-digital (A/D) converters operating at chip rate are omitted, the inner signal paths and the total dissipation power are greatly reduced.
Jie CHEN Guoliang SHOU Changming ZHOU
Weighted summation (W-SUM) operation of multi-input signals plays an important role in signal processing, image compression and communication systems. Conventional digital LSI implementation for the massive high-speed W-SUM operations usually consumes a lot of power, and the power dissipation linearly increases with the operational frequencies. Analog or digital-analog mixed technology may provide a solution to this problem, but the large scale integration for analog circuits especially for digital-analog mixed circuits faces some difficulties in terms of circuit design, mixed-simulation, physical layout and anti-noises. To practically integrate large scale analog or digital-analog mixed circuits, the simplicity of the analog circuits are usually required. In this paper, we present a solution to realize the parallel W-SUM operations of multi-input analog signals based on our developed digital-controlled analog operational circuits. The major features of the proposed circuits include the simplicity in the circuitry architecture and the advantage in the dissipation power, which make it easy to be designed and to be integrated in large scale. To improve the design efficiency, a Top-Down design approach for mixed LSI implementation is proposed. The proposed W-SUM circuits and the Top-Down design approach have been practically used in the LSI implementation for a series of programmable finite impulse response (FIR) filters and matched filters applied in adaptive signal processing and the mobile communication systems based on the wideband code division multiple access (W-CDMA) technology.
Leibo LIU Dong WANG Yingjie CHEN Min ZHU Shouyi YIN Shaojun WEI
This paper presents the design of a multiple-standard 1080 high definition (HD) video decoder on a mixed-grained reconfigurable computing platform integrating coarse-grained reconfigurable processing units (RPUs) and FPGAs. The proposed RPU, including 16×16 multi-functional processing elements (PEs), is used to accelerate compute-intensive tasks in the video decoding. A soft-core-based microprocessor array is implemented on the FPGA and adopted to speed-up the dynamic reconfiguration of the RPU. Furthermore, a mail-box-based communication scheme is utilized to improve the communication efficiency between RPUs and FPGAs. By exploiting dynamic reconfiguration of the RPUs and static reconfiguration of the FPGAs, the proposed platform achieves scalable performances and cost trade-offs to support a variety of video coding standards, including MPEG-2, AVS, H.264, and HEVC. The measured results show that the proposed platform can support H.264 1080 HD video streams at up to 57 frames per second (fps) and HEVC 1080 HD video streams at up to 52fps under 250MHz, at the same time, it achieves a 3.6× performance gain over an industrial coarse-grained reconfigurable processor for H.264 decoding, and a 6.43× performance boosts over a general purpose processor based implementation for HEVC decoding.
Yu HOU Zhijie CHEN Masaya MIYAHARA Akira MATSUZAWA
This paper proposes a SAR-VCO hybrid 1-1 MASH ADC architecture, where a fully-passive 1st-order noise-shaping SAR ADC is implemented in the first stage to eliminate Op-amp. A VCO-based ADC quantizes the residue of the SAR ADC with one additional order of noise shaping in the second stage. The inter-stage gain error can be suppressed by a foreground calibration technique. The proposed ADC architecture is expected to accomplish 2nd-order noise shaping without Op-amp, which makes both high SNDR and low power possible. A prototype ADC is designed in a 65nm CMOS technology to verify the feasibility of the proposed ADC architecture. The transistor-level simulation results show that 75.7dB SNDR is achieved in 5MHz bandwidth at 60MS/s. The power consumption is 748.9µW under 1.0V supply, which results in a FoM of 14.9fJ/conversion-step.
Binhao HE Meiting XUE Shubiao LIU Feng YU Weijie CHEN
The top-K sorting is a variant of sorting used heavily in applications such as database management systems. Recently, the use of field programmable gate arrays (FPGAs) to accelerate sorting operation has attracted the interest of researchers. However, existing hardware top-K sorting algorithms are either resource-intensive or of low throughput. In this paper, we present a resource-efficient top-K sorting architecture that is composed of L cascading sorting units, and each sorting unit is composed of P sorting cells. K=PL largest elements are produced when a variable length input sequence is processed. This architecture can operate at a high frequency while consuming fewer resources. The experimental results show that our architecture achieved a maximum 1.2x throughput-to-resource improvement compared to previous studies.
Jie CHEN Shuichi ITOH Takeshi HASHIMOTO
A complete analysis for the quantization noises and the reconstruction noises of the wavelet pyramid coding system is given. It is shown that in the (orthonormal) wavelet image coding system, there exists a simple and exact formula to compute the reconstruction mean-square-error (MSE) for any kind of quantization errors. Based on the noise analysis, an optimal bit allocation scheme which minimizes the system reconstruction distortion at a given rate is developed. The reconstruction distortion of a wavelet pyramid system is proved to be directly proportional to 2-2, where is a given bit rate. It is shown that, when the optimal bit allocation scheme is adopted, the reconstruction noises can be approximated to white noises. Particularly, it is shown that with only one known quantization MSE of a wavelet decomposition at any layer of the wavelet pyramid, all of the reconstruction MSE's and the quantization MSE's of the coding system can be easily calculated. When uniform quantizers are used, it is shown that at two successive layers of the wavelet pyramid, the optimal quantization step size is a half of its predecessor, which coincides with the resolution version of the wavelet pyramid decomposition. A comparison between wavelet-based image coding and some well-known traditional image coding methods is made by simulations, and the reasons why the wavelet-based image coding is superior to the traditional image coding are explained.
Ran LI Hongbing LIU Jie CHEN Zongliang GAN
The conventional bilateral motion estimation (BME) for motion-compensated frame rate up-conversion (MC-FRUC) can avoid the problem of overlapped areas and holes but usually results in lots of inaccurate motion vectors (MVs) since 1) the MV of an object between the previous and following frames is more likely to have no temporal symmetry with respect to the target block of the interpolated frame and 2) the repetitive patterns existing in video frame lead to the problem of mismatch due to the lack of the interpolated block. In this paper, a new BME algorithm with a low computational complexity is proposed to resolve the above problems. The proposed algorithm incorporates multi-resolution search into BME, since it can easily utilize the MV consistency between two adjacent pyramid levels and spatial neighboring MVs to correct the inaccurate MVs resulting from no temporal symmetry while guaranteeing low computational cost. Besides, the multi-resolution search uses the fast wavelet transform to construct the wavelet pyramid, which not only can guarantee low computational complexity but also can reserve the high-frequency components of image at each level while sub-sampling. The high-frequency components are used to regularize the traditional block matching criterion for reducing the probability of mismatch in BME. Experiments show that the proposed algorithm can significantly improve both the objective and subjective quality of the interpolated frame with low computational complexity, and provide the better performance than the existing BME algorithms.