I propose an acoustic model adaptation method using bases constructed through the sparse principal component analysis (SPCA) of acoustic models trained in a clean environment. I perform experiments on adaptation to a new speaker and noise. The SPCA-based method outperforms the PCA-based method in the presence of babble noise.
This paper presents an efficient algorithm for reporting all intersections among n given segments in the plane using work space of arbitrarily given size. More exactly, given a parameter s which is between Ω(1) and O(n) specifying the size of work space, the algorithm reports all the segment intersections in roughly O(n2/+ K) time using O(s) words of O(log n) bits, where K is the total number of intersecting pairs. The time complexity can be improved to O((n2/s) log s + K) when input segments have only some number of different slopes.
Given a binary image I and a threshold t, the size-thresholded binary image I(t) defined by I and t is the binary image after removing all connected components consisting of at most t pixels. This paper presents space-efficient algorithms for computing a size-thresholded binary image for a binary image of n pixels, assuming that the image is stored in a read-only array with random-access. With regard to the problem, there are two cases depending on how large the threshold t is, namely, Relatively large threshold where t = Ω(), and Relatively small threshold where t = O(). In this paper, a new algorithmic framework for the problem is presented. From an algorithmic point of view, the problem can be solved in O() time and O() work space. We propose new algorithms for both the above cases which compute the size-threshold binary image for any binary image of n pixels in O(nlog n) time using only O() work space.
Cyclic Redundancy Check (CRC) is a well known error detection scheme used to detect corruption of digital content in digital networks and storage devices. Since it is a compute-intensive process which adversely affects performance, hardware acceleration using FPGAs has been tried and satisfactory performance has been achieved. However, recent extended usage of networks and storage systems require various correction capabilities for various CRC standards. Traditional hardware designs based on the LFSR (Linear Feedback Shift Register) tend to have fixed structure without such flexibility. Here, fully-adaptable CRC accelerator based on a table-based algorithm is proposed. The table-based algorithm is a flexible method commonly used in software implementations. It has been rarely implemented with the hardware, since it is believed that the operational speed is not enough. However, by using pipelined structure and efficient use of memory modules in FPGAs, it appeared that the table-based fixed CRC accelerators achieved better performance than traditional implementation. Based on the implementation, fully-adaptable CRC accelerator which eliminate the need for many non-adaptable CRC implementations is proposed. The accelerator has ability to process arbitrary number of input data and generates CRC for any known CRC standard, up to 65 bits of generator polynomial, during run-time. Further, we modify Table generation algorithm in order to decrease its space complexity from O(nm) to O(n). On Xilinx Virtex 6 LX550T board, the fully-adaptable accelerators occupy between 1 to 2% area to produce maximum of 289.8 Gbps at 283.1 MHz if BRAM is deployed, or between 1.6 - 14% of area for 418 Gbps at 408.9 MHz if tables are implemented in logic. Proposed architecture enables further expansion of throughput by increasing a number of input bits M processed at a time.
Yongqing HUO Fan YANG Vincent BROST Bo GU
Due to the growing popularity of High Dynamic Range (HDR) images and HDR displays, a large amount of existing Low Dynamic Range (LDR) images are required to be converted to HDR format to benefit HDR advantages, which give rise to some LDR to HDR algorithms. Most of these algorithms especially tackle overexposed areas during expanding, which is the potential to make the image quality worse than that before processing and introduces artifacts. To dispel these problems, we present a new LDR to HDR approach, unlike the existing techniques, it focuses on avoiding sophisticated treatment to overexposed areas in dynamic range expansion step. Based on a separating principle, firstly, according to the familiar types of overexposure, the overexposed areas are classified into two categories which are removed and corrected respectively by two kinds of techniques. Secondly, for maintaining color consistency, color recovery is carried out to the preprocessed images. Finally, the LDR image is expanded to HDR. Experiments show that the proposed approach performs well and produced images become more favorable and suitable for applications. The image quality metric also illustrates that we can reveal more details without causing artifacts introduced by other algorithms.
Ryuichi HARASAWA Yutaka SUEYOSHI Aichi KUDO
We consider the computation of r-th roots in finite fields. For the computation of square roots (i.e., the case of r=2), there are two typical methods: the Tonelli-Shanks method [7],[10] and the Cipolla-Lehmer method [3],[5]. The former method can be extended to the case of r-th roots with r prime, which is called the Adleman-Manders-Miller method [1]. In this paper, we generalize the Cipolla-Lehmer method to the case of r-th roots in Fq with r prime satisfying r | q-1, and provide an efficient computational procedure of our method. Furthermore, we implement our method and the Adleman-Manders-Miller method, and compare the results.
Etsuji TOMITA Yoichi SUTANI Takanori HIGASHI Mitsuo WAKATSUKI
Many problems can be formulated as maximum clique problems. Hence, it is highly important to develop algorithms that can find a maximum clique very fast in practice. We propose new approximate coloring and other related techniques which markedly improve the run time of the branch-and-bound algorithm MCR (J. Global Optim., 37, pp.95–111, 2007), previously shown to be the fastest maximum-clique-finding algorithm for a large number of graphs. The algorithm obtained by introducing these new techniques in MCR is named MCS. It is shown that MCS is successful in reducing the search space quite efficiently with low overhead. Extensive computational experiments confirm the superiority of MCS over MCR and other existing algorithms. It is faster than the other algorithms by orders of magnitude for several graphs. In particular, it is faster than MCR for difficult graphs of very high density and for very large and sparse graphs, even though MCS is not designed for any particular type of graph. MCS can be faster than MCR by a factor of more than 100,000 for some extremely dense random graphs. This paper demonstrates in detail the effectiveness of each new techniques in MCS, as well as the overall contribution.
Jhin-Fang HUANG Wen-Cheng LAI Kun-Jie HUANG
A 5.6-GHz 1-V balanced LC-tank Colpitts voltage controlled oscillator is designed and implemented with a TSMC 0.18-µm CMOS process. This proposed Colpitts VCO circuit adopts two single-ended complementary LC-tank VCOs coupled by two pairs of varactors. The proposed VCO operates at low power consumption because it has the same dc current path as the np-MOSFETs. The Measured results of the proposed VCO achieve tuning range of 670 MHz from 5.23 to 5.9 GHz while the controlled voltage is tuned from 0 to 1-V, phase noise of -118.8 dBc/Hz at 1 MHz offset frequency from the carrier of 5.6 GHz and output power of -10.97 dBm at the supply voltage of 1 V. The power consumption of the core circuit is 1.79 mW and the chip area including pads is 0.451 (0.55 0.82) mm2.
Sung-Wook JUN Lianghua MIAO Keita YASUTOMI Keiichiro KAGAWA Shoji KAWAHITO
This paper presents a digitally error-corrected pipeline analog-to-digital converter (ADC) using linearization of incomplete settling errors. A pre-charging technique is used for residue amplifiers in order to reduce the incomplete settling error itself and linearize the input signal dependency of the incomplete settling error. A technique with charge redistribution of divided capacitors is proposed for pre-charging capacitors without any additional reference sources. This linearized settling error is corrected by a first-order error approximation in digital domain with feasible complexity and cost. Simulation results show that the ADC achieves SNDR of 70 dB, SFDR of 79 dB at nyquist input frequency in a 65 nm CMOS process under 1.2 V power supply voltage for 1.2 Vp-p input signal swing. The estimated power consumption of the 12b 200 MS/s pipeline ADC using the proposed digital error correction of incomplete settling errors is 7.6 mW with a small FOM of 22 fJ/conv-step.
Min-Chul SUN Sang Wan KIM Garam KIM Hyun Woo KIM Hyungjin KIM Byung-Gook PARK
A novel tunneling field-effect transistor (TFET) featuring the sigma-shape embedded SiGe sources and recessed channel is proposed. The gate facing the source effectively focuses the E-field at the tip of the source and eliminates the gradual turn-on issue of planar TFETs. The fabrication scheme modified from the state-of-the-art 45 nm/32 nm CMOS technology flows provides a unique benefit in the co-integrability and the control of ID-VGS characteristics. The feasibility is verified with TCAD process simulation of the device with 14 nm of the gate dimension. The device simulation shows 5-order change in the drain current with a gate bias change less than 300 mV.
Lu ZHAO Qiao-yan WEN Jie ZHANG
The linear complexity of quaternary sequences plays an important role in cryptology. In this paper, the minimal polynomial of a class of quaternary sequences with low autocorrelation constructed by generalized cyclotomic sequences pairs is determined, and the linear complexity of the sequences is also obtained.
Xu ZHOU Kai LU Xiaoping WANG Wenzhe ZHANG Kai ZHANG Xu LI Gen LI
The nondeterminism of message-passing communication brings challenges to program debugging, testing and fault-tolerance. This paper proposes a novel deterministic message-passing implementation (DMPI) for parallel programs in the distributed environment. DMPI is compatible with the standard MPI in user interface, and it guarantees the reproducibility of message with high performance. The basic idea of DMPI is to use logical time to solve message races and control asynchronous transmissions, and thus we could eliminate the nondeterministic behaviors of the existing message-passing mechanism. We apply a buffering strategy to alleviate the performance slowdown caused by mismatch of logical time and physical time. To avoid deadlocks introduced by deterministic mechanisms, we also integrate DMPI with a lightweight deadlock checker to dynamically detect and solve these deadlocks. We have implemented DMPI and evaluated it using NPB benchmarks. The results show that DMPI could guarantee determinism with incurring modest runtime overhead (14% on average).
Zi-Yi WANG Shi-Ze GUO Zhe-Ming LU Guang-Hua SONG Hui LI
Many deterministic small-world network models have been proposed so far, and they have been proven useful in describing some real-life networks which have fixed interconnections. Search efficiency is an important property to characterize small-world networks. This paper tries to clarify how the search procedure behaves when random walks are performed on small-world networks, including the classic WS small-world network and three deterministic small-world network models: the deterministic small-world network created by edge iterations, the tree-structured deterministic small-world network, and the small-world network derived from the deterministic uniform recursive tree. Detailed experiments are carried out to test the search efficiency of various small-world networks with regard to three different types of random walks. From the results, we conclude that the stochastic model outperforms the deterministic ones in terms of average search steps.
This paper presents a method for learning an overcomplete, nonnegative dictionary and for obtaining the corresponding coefficients so that a group of nonnegative signals can be sparsely represented by them. This is accomplished by posing the learning as a problem of nonnegative matrix factorization (NMF) with maximization of the incoherence of the dictionary and of the sparsity of coefficients. By incorporating a dictionary-incoherence penalty and a sparsity penalty in the NMF formulation and then adopting a hierarchically alternating optimization strategy, we show that the problem can be cast as two sequential optimal problems of quadratic functions. Each optimal problem can be solved explicitly so that the whole problem can be efficiently solved, which leads to the proposed algorithm, i.e., sparse hierarchical alternating least squares (SHALS). The SHALS algorithm is structured by iteratively solving the two optimal problems, corresponding to the learning process of the dictionary and to the estimating process of the coefficients for reconstructing the signals. Numerical experiments demonstrate that the new algorithm performs better than the nonnegative K-SVD (NN-KSVD) algorithm and several other famous algorithms, and its computational cost is remarkably lower than the compared algorithms.
Takahiro IIZUKA Kenji FUKUSHIMA Akihiro TANAKA Hideyuki KIKUCHIHARA Masataka MIYAKE Hans J. MATTAUSCH Mitiko MIURA-MATTAUSCH
The trench-gate type high-voltage (HV) MOSFET is one of the variants of HV-MOSFET, typically with its utility segments lying on a larger power consumption domain, compared to planar HV-MOSFETs. In this work, the HiSIM_HV compact model, originally intended for planar LDMOSFETs, was adequately extended to accommodate trench-gate type HV-MOSFETs. The model formulation focuses on a closed-form description of the current path in the highly resistive drift region, specific to the trench-gate HV-MOSFETs. It is verified that the developed compact expression can capture the conductivity in the drift region, which varies with voltage bias and device technology such as trench width. The notable enhancement of current drivability can be accounted for by the electrostatic control exerted by the trench gate within the framework of this model.
Xiaoping LI Wenping MA Tongjiang YAN Xubo ZHAO
In this letter, we first introduce a new generalized cyclotomic sequence of order two of length pq, then we calculate its linear complexity and minimal polynomial. Our results show that this sequence possesses both high linear complexity and optimal balance on 1 s and 0 s, which may be attractive for use in stream cipher cryptosystems.
Kazumi YAMAWAKI Fumiya NAKANO Hideki NODA Michiharu NIIMI
The application of information hiding to image compression is investigated to improve compression efficiency for JPEG color images. In the proposed method, entropy-coded DCT coefficients of chrominance components are embedded into DCT coefficients of the luminance component. To recover an image in the face of the degradation caused by compression and embedding, an image restoration method is also applied. Experiments show that the use of both information hiding and image restoration is most effective to improve compression efficiency.
Min Woo RYU Sung-Ho KIM Hee Cheol HWANG Kibog PARK Kyung Rok KIM
In this paper, we present the validity and potential capacity of a modeling and simulation environment for the nonresonant plasmonic terahertz (THz) detector based on the silicon (Si) field-effect transistor (FET) with a technology computer-aided design (TCAD) platform. The nonresonant and “overdamped” plasma-wave behaviors have been modeled by introducing a quasi-plasma electron charge box as a two-dimensional electron gas (2DEG) in the channel region only around the source side of Si FETs. Based on the coupled nonresonant plasma-wave physics and continuity equation on the TCAD platform, the alternate-current (AC) signal as an incoming THz wave radiation successfully induced a direct-current (DC) drain-to-source output voltage as a detection signal in a sub-THz frequency regime under the asymmetric boundary conditions with a external capacitance between the gate and drain. The average propagation length and density of a quasi-plasma have been confirmed as around 100 nm and 11019/cm3, respectively, through the transient simulation of Si FETs with the modulated 2DEG at 0.7 THz. We investigated the incoming radiation frequency dependencies on the characteristics of the plasmonic THz detector operating in sub-THz nonresonant regime by using the quasi-plasma modeling on TCAD platform. The simulated dependences of the photoresponse with quasi-plasma 2DEG modeling on the structural parameters such as gate length and dielectric thickness confirmed the operation principle of the nonresonant plasmonic THz detector in the Si FET structure. The proposed methodologies provide the physical design platform for developing novel plasmonic THz detectors operating in the nonresonant detection mode.
Sang Wan KIM Woo Young CHOI Min-Chul SUN Hyun Woo KIM Jong-Ho LEE Hyungcheol SHIN Byung-Gook PARK
In order to implement complementary logic function with L-shaped tunneling field-effect transistors (TFETs), current drivability and subthreshold swing (SS) need to be improved more. For this purpose, high-k material such as hafnium dioxide (HfO2) has been used as gate dielectric rather than silicon dioxide (SiO2). The effects of device parameters on performance have been investigated and the design of L-shaped TFETs has been optimized. Finally, the performance of L-shaped TFET inverters have been compared with that of conventional TFET ones.
Pablo Rosales TEJADA Jae-Yoon JUNG
A variety of ubiquitous computing devices, such as radio frequency identification (RFID) and wireless sensor network (WSN), are generating huge and significant events that should be rapidly processed for business excellence. In this paper, we describe how complex event processing (CEP) technology can be applied to ubiquitous process management based on context-awareness. To address the issue, we propose a method for context-aware event processing using event processing language (EPL) statement. Specifically, the semantics of a situation drive the transformation of EPL statement templates into executable EPL statements. The proposed method is implemented in the domain of ubiquitous cold chain logistics management. With the proposed method, context-aware event processing can be realized to enhance business performance and excellence in ubiquitous computing environments.