Akihiro NAGASE Nami NAKANO Masako ASAMURA Jun SOMEYA Gosuke OHASHI
The authors have evaluated SGRAD, a method for expanding the bit depth of image signals that requires fewer calculations and degrades image sharpness less. When noise is superimposed on image signals, the conventional method for obtaining high bit depth sometimes detects image contours incorrectly and is therefore unable to correct the gradation sufficiently. The conventional method also requires many line memories when the process is applied to vertical gradation. To address these issues, SGRAD improves the detection of contours with transiting gradation, effectively correcting the gradation of image signals on which noise is superimposed. In addition, the use of a prediction algorithm for detecting gradation reduces the scale of the circuit at the cost of slightly less correction of vertical gradation.
Daichi TAKEUCHI Katsunori MAKIHARA Mitsuhisa IKEDA Seiichi MIYAZAKI Hirokazu KAKI Tsukasa HAYASHI
We fabricated highly dense Si nano-columnar structures accompanied by Si nanocrystals on W-coated quartz and characterized their local electrical transport in the thickness direction in a non-contact mode by using a Rh-coated Si cantilever with pulse bias application, in which Vmax, Vmin, and the duty ratio were set at +3.0V, -14V, and 50%, respectively. By applying a pulse bias to the bottom W electrode with respect to a grounded top electrode made of ∼10-nm-thick Au on the sample surface, non-uniform current images correlated with surface morphologies and reflecting electron emission were obtained. The change in the surface potential of the highly dense Si nano-columnar structures accompanied by Si nanocrystals, which was measured at room temperature by using an AFM/Kelvin probe technique, indicated electron injection into and extraction from the Si nanocrystals, depending on the tip bias polarity. This result is attributable to efficient electron emission under pulsed bias application due to electron charging from the top electrode to the Si nanocrystals during the positively biased duration at the bottom electrode and subsequent quasi-ballistic transport through the Si nanocrystals during the negatively biased duration.
Wenming YANG Guoli MA Fei ZHOU Qingmin LIAO
This study proposes a feature-level fusion method that uses finger veins (FVs) and finger dorsal texture (FDT) for personal authentication based on orientation selection (OS). The orientation codes obtained by the filters correspond to different parts of an image (foreground or background) and thus different orientations offer different levels of discrimination performance. We have conducted an orientation component analysis on both FVs and FDT. Based on the analysis, an OS scheme is devised which combines the discriminative orientation features of both modalities. Our experiments demonstrate the effectiveness of the proposed method.
Miloš RADMANOVIC Radomir S. STANKOVIC Claudio MORAGA
This paper describes a method for the efficient computation of the total autocorrelation of large multiple-output Boolean functions over a Shared Binary Decision Diagram (SBDD). The existing methods for computing the total autocorrelation over decision diagrams are restricted to single-output functions; for multiple-output functions, they require repeating the procedure k times, where k is the number of outputs. The proposed method allows the computation to be performed in a single traversal of the SBDD. To this end, compared with standard BDD packages, we modified the way sub-diagrams in the SBDD are traversed and introduced an additional memory function, kept in a hash table, for storing the results of computing the autocorrelation between two sub-diagrams in the SBDD. As a result, the total amount of computation is reduced, which makes the method feasible in practical applications. Experimental results on standard benchmarks confirm the efficiency of the method.
As data volumes explode, data storage costs become a large fraction of total IT costs. We can reduce these costs substantially by using compression. However, it is generally known that database compression is not suitable for write-intensive workloads. In this paper, we provide a comprehensive solution to improve the performance of compressed databases for write-intensive OLTP workloads. We find that storing data too densely in compressed pages incurs many future page splits, which require exclusive locks. To avoid lock contention, we reduce page splits by sacrificing a few percent of space savings. We reserve enough space in each compressed page for future updates of records and prevent page merges that are prone to incur page splits in the near future. The experimental results using the TPC-C benchmark and MySQL/InnoDB show that our method gives 1.5 times higher throughput with 33% space savings compared with the uncompressed counterpart, and 1.8 times higher throughput with only 1% more space compared with the state-of-the-art compression method developed by Facebook.
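As a rough illustration of the reserved-space idea above, the following sketch (not the authors' implementation; the page size and headroom fraction are assumed values) decides whether an insert into a compressed page should trigger a split:

```python
# Illustrative sketch of a reserved-space page-fill policy.
# NOT the authors' implementation: PAGE_SIZE and RESERVED_FRACTION
# are assumed values chosen for illustration.

PAGE_SIZE = 16384          # bytes per compressed page (assumption)
RESERVED_FRACTION = 0.05   # headroom kept free for future record updates (assumption)

def needs_split(used_bytes: int, incoming_bytes: int) -> bool:
    """Trigger a page split only when the insert would eat into the
    headroom reserved for in-place updates of existing records."""
    limit = PAGE_SIZE * (1.0 - RESERVED_FRACTION)
    return used_bytes + incoming_bytes > limit
```

Keeping pages a few percent emptier in this way trades a small amount of space savings for fewer splits, and hence less exclusive-lock contention.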
Narpendyah Wisjnu ARIWARDHANI Masashi KIMURA Yurie IRIBE Kouichi KATSURADA Tsuneo NITTA
In this paper, we propose voice conversion (VC) based on mapping articulatory features (AF) to vocal-tract parameters (VTP). An artificial neural network (ANN) is applied to map AF to VTP and to convert a speaker's voice to a target speaker's voice. The proposed system is not only text-independent VC, in which parallel utterances between the source and target speakers are not needed, but can also be used for an arbitrary source speaker. This means that our approach does not require source-speaker data to build the VC model. We also focus on using a small amount of target-speaker training data. For comparison, a baseline system based on a Gaussian mixture model (GMM) approach is constructed. The experimental results for a small amount of training data show that the converted voice of our approach is intelligible and has the speaker individuality of the target speaker.
Gibran BENITEZ-GARCIA Gabriel SANCHEZ-PEREZ Hector PEREZ-MEANA Keita TAKAHASHI Masahide KANEKO
This paper presents a facial expression recognition algorithm based on segmentation of a face image into four facial regions (eyes-eyebrows, forehead, mouth, and nose). To unify the different results obtained from combinations of facial regions, a modal value approach that takes the most frequent decision of the classifiers is proposed. The robustness of the algorithm is also evaluated under partial occlusion, using four different types of occlusion (half left/right, eyes, and mouth occlusion). The proposed method employs a sub-block eigenphases algorithm that uses the phase spectrum and principal component analysis (PCA) for feature vector estimation, which is fed to a support vector machine (SVM) for classification. Experimental results show that the modal value approach improves the average recognition rate to more than 90%, and that performance remains high even under partial occlusion by excluding the occluded parts from the feature extraction process.
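The modal-value fusion step amounts to taking the most frequent decision among the per-region classifiers. A minimal sketch of the idea (illustrative only; breaking ties by first occurrence is an assumption, not the paper's stated policy):

```python
from collections import Counter

def modal_decision(labels):
    """Return the most frequent class label among the per-region
    classifier decisions. Minimal illustration of the modal-value
    fusion idea; ties are broken by first occurrence."""
    return Counter(labels).most_common(1)[0][0]
```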
Seng KHEANG Kouichi KATSURADA Yurie IRIBE Tsuneo NITTA
To achieve high-quality output in speech synthesis systems, data-driven grapheme-to-phoneme (G2P) conversion is usually used to generate the phonetic transcription of out-of-vocabulary (OOV) words. To improve the performance of G2P conversion, this paper deals with the problem of conflicting phonemes, where an input grapheme can, in the same context, produce many possible output phonemes. To this end, we propose a two-stage neural network-based approach that converts the input text to phoneme sequences in the first stage and then predicts each output phoneme in the second stage using the phonemic information obtained. The first-stage neural network is implemented as a many-to-many mapping model for automatic conversion of words to phoneme sequences, while the second stage uses a combination of the obtained phoneme sequences to predict the output phoneme corresponding to each input grapheme in a given word. We evaluate the performance of this approach using the American English pronunciation dictionary known as the auto-aligned CMUDict corpus [1]. In terms of phoneme and word accuracy on OOV words, compared with several baseline approaches, the evaluation results show that our proposed approach improves on the previous one-stage neural network-based approach to G2P conversion. Comparison with another existing approach indicates that our approach provides higher phoneme accuracy but lower word accuracy on a general dataset, and slightly higher phoneme and word accuracy on a selection of words containing more than one phoneme conflict.
Shun UMETSU Akinobu SHIMIZU Hidefumi WATANABE Hidefumi KOBATAKE Shigeru NAWANO
This paper presents a novel liver segmentation algorithm that achieves higher performance than conventional algorithms when segmenting cases with unusual liver shapes and/or large liver lesions. An L1 norm was introduced into the mean squared difference to find the cases in a training dataset most relevant to an input case. A patient-specific probabilistic atlas was generated from the retrieved cases to compensate for livers with unusual shapes; it accounts for liver shape more specifically than a conventional probabilistic atlas averaged over a number of training cases. To make the above process robust against large pathological lesions, we incorporated a novel term based on a set of “lesion bases” proposed in this study that account for the differences from normal liver parenchyma. Subsequently, the patient-specific probabilistic atlas was forwarded to a graph-cuts-based fine segmentation step, in which a penalty function was computed from the probabilistic atlas. A leave-one-out test using clinical abdominal CT volumes was conducted to validate the performance and showed that the proposed segmentation algorithm, with the patient-specific atlas reinforced by the lesion bases, outperformed the conventional algorithm with a statistically significant difference.
Sumxin JIANG Rendong YING Peilin LIU Zhenqi LU Zenghui ZHANG
This paper describes a new method for lossy audio signal compression via compressive sensing (CS). In this method, a structured shrinkage operator is employed to decompose the audio signal into three layers: two sparse layers, tonal and transient, and additive noise. Both the tonal and transient layers are then compressed using CS. Since the shrinkage operator takes into account the structure of the coefficients in the transform domain, it achieves a better sparse approximation of the audio signal than traditional methods do. In addition, we propose a sparsity allocation algorithm that adjusts the sparsity between the two layers, thus improving the performance of CS. Experimental results demonstrate that the new method provides better compression performance than conventional methods.
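The structured shrinkage operator generalizes plain soft-thresholding, which the following sketch shows for reference (the paper's operator additionally exploits the grouping structure of the coefficients, which is omitted in this simplified stand-in):

```python
import numpy as np

def soft_threshold(coeffs, lam):
    """Plain (unstructured) soft-thresholding: shrink each transform
    coefficient toward zero by lam, zeroing the small ones. Shown only
    as the baseline that structured shrinkage operators extend."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)
```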
As one of the popular social media that many people turn to in recent years, collaborative encyclopedia Wikipedia provides information in a more “Neutral Point of View” way than others. Towards this core principle, plenty of efforts have been put into collaborative contribution and editing. The trajectories of how such collaboration appears by revisions are valuable for group dynamics and social media research, which suggest that we should extract the underlying derivation relationships among revisions from chronologically-sorted revision history in a precise way. In this paper, we propose a revision graph extraction method based on supergram decomposition in the document collection of near-duplicates. The plain text of revisions would be measured by its frequency distribution of supergram, which is the variable-length token sequence that keeps the same through revisions. We show that this method can effectively perform the task than existing methods.
Rompei SUGAWARA Hao SAN Kazuyuki AIHARA Masao HOTTA
Proof-of-concept cyclic analog-to-digital converters (ADCs) have been designed and fabricated in 90-nm CMOS technology. The measurement results of an experimental prototype demonstrate the effectiveness of the proposed switched-capacitor (SC) architecture for realizing a non-binary ADC based on β expansion. Unlike the conventional binary ADC, a simple 1-bit/step structure for an SC multiplying digital-to-analog converter (MDAC) is proposed to perform residue amplification by β (1 < β < 2). The redundancy of non-binary ADCs with radix β tolerates the non-linear conversion errors caused by the offsets of comparators, the mismatches of capacitors, and the finite DC gains of amplifiers used in the MDAC. We also employed a radix value estimation algorithm to obtain an effective value of β for non-binary encoding; it can be realized by merely adding a simple conversion sequence and digital circuits. As a result, the power penalty of a high-gain wideband amplifier and the required accuracy of the circuit elements for a high-resolution ADC are largely relaxed, so the circuit design is greatly simplified. The implemented ADC achieves a measured peak signal-to-noise-and-distortion ratio (SNDR) of 60.44dB, even with an op-amp with a poor DC gain (< 50dB), while dissipating 780µW in analog circuits at 1.4V and occupying an active area of 0.25 × 0.26mm2.
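For reference, one common reconstruction rule for a radix-β bit sequence is x ≈ (β − 1) Σᵢ bᵢ β⁻ⁱ; the sketch below is illustrative only and may differ from the paper's exact encoder/decoder, but it reduces to ordinary binary decoding when β = 2:

```python
def beta_decode(bits, beta):
    """Reconstruct a normalized value from a radix-beta bit sequence,
    using the common beta-expansion rule
        x ~ (beta - 1) * sum_i b_i * beta**-(i + 1).
    Illustrative sketch; the paper's conversion details may differ."""
    return (beta - 1.0) * sum(b * beta ** -(i + 1) for i, b in enumerate(bits))
```

With 1 < β < 2 the representation is redundant, which is what lets the converter tolerate comparator offsets and capacitor mismatches.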
Sotarat THAMMABOOSADEE Bunthit WATANAPA Jonathan H. CHAN Udom SILPARCHA
A two-stage classifier is proposed that identifies criminal charges and a range of punishments given a set of case facts and attributes. Our supervised-learning model focuses only on the offences against life and body section of the criminal law code of Thailand. The first stage identifies a set of diagnostic issues from the case facts using a set of artificial neural networks (ANNs) modularized in hierarchical order. The second stage extracts a set of legal elements from the diagnostic issues by employing a set of C4.5 decision tree classifiers. These linked modular networks of ANNs and decision trees form an effective system in terms of determining power and the ability to trace or infer the relevant legal reasoning behind the determination. Isolated and system-integrated experiments are conducted to measure the performance of the proposed system. The overall accuracy of the integrated system can exceed 90%. An actual case is also demonstrated to show the effectiveness of the proposed system.
Yongjoo SHIN Sihu SONG Yunho LEE Hyunsoo YOON
This letter proposes a novel intrusion-tolerant system consisting of several virtual machines (VMs) that refresh the target system periodically, together with live migration that monitors many features of the VMs to identify and replace exhausted ones. The proposed scheme provides adequate performance and dependability against denial-of-service (DoS) attacks. To show its efficiency and security, we conducted experiments on the CSIM20 simulator, which showed a 22% improvement in response time in a normal situation and an approximately 77.83% improvement under heavy traffic compared with results reported in the literature. The results show that the proposed scheme achieves a shorter response time than other systems and maintains its services even under heavy traffic.
Qingyi GU Abdullah AL NOMAN Tadayoshi AOYAMA Takeshi TAKAKI Idaku ISHII
In this paper, we present a high frame rate (HFR) vision system that can automatically control its exposure time by executing brightness histogram-based image processing in real time at a high frame rate. Our aim is to obtain high-quality HFR images for robust image processing of high-speed phenomena even under dynamically changing illumination, such as lamps flickering at 100 Hz, corresponding to an AC power supply at 50 / 60 Hz. Our vision system can simultaneously calculate a 256-bin brightness histogram for an 8-bit gray image of 512×512 pixels at 2000 fps by implementing a brightness histogram calculation circuit module as parallel hardware logic on an FPGA-based high-speed vision platform. Based on the HFR brightness histogram calculation, our method realizes automatic exposure (AE) control of 512×512 images at 2000 fps using our proposed AE algorithm. The proposed AE algorithm can maximize the number of pixels in the effective range of the brightness histogram, thus excluding much darker and brighter pixels, to improve the dynamic range of the captured image without over- and under-exposure. The effectiveness of our HFR system with AE control is evaluated according to experimental results for several scenes with illumination flickering at 100 Hz, which is too fast for the human eye to see.
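A hedged sketch of the histogram-driven AE idea described above (the range bounds, step size, and dark-vs-bright comparison rule are illustrative assumptions, not the authors' exact algorithm):

```python
import numpy as np

LOW, HIGH = 16, 240   # bounds of the "effective" brightness range (assumptions)

def adjust_exposure(hist, exposure_us, step=1.1):
    """One AE step on a 256-bin brightness histogram: lengthen the
    exposure when under-exposed pixels dominate, shorten it when
    saturated pixels dominate, so more pixels fall in the effective
    range. Thresholds and step size are illustrative assumptions."""
    dark = hist[:LOW].sum()
    bright = hist[HIGH:].sum()
    if dark > bright:
        return exposure_us * step
    if bright > dark:
        return exposure_us / step
    return exposure_us
```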
Ryochi KATAOKA Kentaro NISHIMORI Takefumi HIRAGURI Naoki HONMA Tomohiro SEKI Ken HIRAGA Hideo MAKINO
A novel analog decoding method using only 90-degree phase shifters is proposed to simplify decoding for short-range multiple-input multiple-output (MIMO) transmission. In short-range MIMO transmission, there is an optimal element spacing that maximizes the channel capacity for a given distance between the transmitter and receiver. We focus on the fact that the zero-forcing (ZF) weight matrix at the optimal element spacing can be realized using dividers and 90-degree phase shifters because it can be expressed as a unitary matrix. The channel capacity of the proposed method is then derived to evaluate its exact capacity limit. Moreover, it is shown that an optimal weight when using directional antennas can be expressed using only dividers, 90-degree phase shifters, and attenuators, regardless of the beam width of the directional antenna. Finally, bit error rate and channel capacity evaluations by both simulation and measurement confirm the effectiveness of the proposed method.
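As a toy illustration of the unitary-weight observation (the specific matrix below is assumed for illustration and is not taken from the paper), a 2×2 weight built only from equal-power division and 90-degree phase shifts is indeed unitary:

```python
import numpy as np

def quadrature_weight():
    """A 2x2 weight realizable with only equal-power division (the
    1/sqrt(2) factor) and 90-degree phase shifts (the -1j entries).
    The specific matrix is an illustrative assumption, not the
    paper's exact ZF weight."""
    return np.array([[1.0, -1.0j], [-1.0j, 1.0]]) / np.sqrt(2.0)
```

Unitarity is what allows such a weight to be implemented passively, without amplifiers, in the analog domain.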
Takao MURAKAMI Kenta TAKAHASHI Kanta MATSUURA
Biometric identification has recently attracted attention because of its convenience: it requires neither a user ID nor a smart card. However, both the identification error rate and the response time increase as the number of enrollees grows. In this paper, we combine a score-level fusion scheme and a metric space indexing scheme to improve the accuracy and response time of biometric identification, using only scores as information sources. We first propose a score-level indexing and fusion framework that can be constructed from the following three schemes: (I) a pseudo-score based indexing scheme, (II) a multi-biometric search scheme, and (III) a score-level fusion scheme that handles missing scores. A multi-biometric search scheme can be newly obtained by applying a pseudo-score based indexing scheme to multi-biometric identification. We then propose the NBS (Naive Bayes search) scheme as a multi-biometric search scheme and discuss its optimality with respect to the retrieval error rate. We evaluated our proposal using datasets of multiple fingerprints and of face scores from multiple matchers. The results showed that our proposal significantly improved the accuracy of the unimodal biometrics while reducing the average number of score computations on both datasets.
Per-User Unitary Rate Control (PU2RC) performs poorly when the number of users is small and suffers from a sum-rate ceiling effect in the high signal-to-noise ratio (SNR) regime. In this paper, we propose a multimode transmission (MMT) strategy to overcome these inherent shortcomings of PU2RC. In the proposed MMT strategy, the transmitter determines the optimal transmission mode and schedules users using each user's instantaneous channel quality information (CQI) parameters. We first assume that each user's CQI parameters are reported perfectly in order to introduce the proposed MMT strategy. We then consider the quantization of CQI parameters using codebooks designed by the Lloyd algorithm. Moreover, we modify the CQI parameters to improve the system's robustness against quantization error. Finally, to reduce the quantization error, we design a hierarchical codebook that jointly quantizes the modified CQI parameters by exploiting the correlation between them. Simulation results show that the proposed MMT strategy effectively overcomes the shortcomings of PU2RC and is robust against low CQI quantization levels.
Ashir AHMED Andrew REBEIRO-HARGRAVE Yasunobu NOHARA Eiko KAI Zahidul HOSSEIN RIPON Naoki NAKASHIMA
This study examines how an e-Health system can reduce morbidity (poor health) in unreached communities. The e-Health system, called a Portable Health Clinic, combines affordable sensors and Body Area Networking technology with mobile health concepts. The clinic is portable because all the medical devices fit inside a briefcase and are carried to unreached communities by healthcare assistants. Patient morbidity is diagnosed using a software stratification algorithm and categorized according to a triage color-coding scheme within the briefcase. Morbid patients are connected to a remote doctor in a telemedicine call center over the mobile network. Electronic Health Records (EHR) are used for the medical consultancy, and an e-Prescription is generated. The effectiveness of the portable health clinic system in targeting morbidity was tested on 8690 patients in rural and urban areas of Bangladesh from September 2012 to January 2013. The experiment had two phases: the first identified the intensity of morbidity, and the second re-examined the morbid patients two months later. The results show a decrease in the number of patients identified as morbid among those who participated in the telemedicine process.
Bo WU Yan WANG Xiuying CAO Pengcheng ZHU
Attenuated and delayed versions of a pulse signal overlap in multipath propagation. Previous algorithms can resolve them only when the signal sampling is ideal and fail to resolve overlapped components under non-ideal sampling. In this paper, we propose a novel method that can resolve general types of non-ideally sampled pulse signals in the time domain via Taylor Series Expansion (TSE) and estimate the precise time delays and amplitudes of the multipath signals. In combination with the CLEAN algorithm, the overlapped pulse signal parameters are estimated one by one through an iterative method. Simulation results verify the effectiveness of the proposed method.
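A minimal CLEAN-style successive-cancellation sketch (integer delays only; the paper's TSE step, which refines delays to sub-sample precision, is omitted here):

```python
import numpy as np

def clean(signal, template, n_paths):
    """CLEAN-style successive cancellation: repeatedly locate the
    strongest match of the pulse template in the residual, record its
    (delay, amplitude), and subtract the scaled template. Integer
    delays only; this is a simplified stand-in for the paper's
    TSE-refined estimator."""
    residual = signal.astype(float)          # astype copies the input
    energy = float(np.dot(template, template))
    paths = []
    for _ in range(n_paths):
        corr = np.correlate(residual, template, mode="valid")
        delay = int(np.argmax(np.abs(corr)))
        amp = corr[delay] / energy           # least-squares amplitude at that lag
        residual[delay:delay + len(template)] -= amp * template
        paths.append((delay, amp))
    return paths, residual
```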