Tu Bao HO Saori KAWASAKI Katsuhiko TAKABAYASHI Canh Hao NGUYEN
Lessons learned from medical data mining projects show that integrating advanced computational techniques with human inspection is indispensable in medical data mining. We propose an integrated approach that merges data mining and text mining methods with visualization support for expert evaluation. We also developed temporal abstraction and text mining methods suited to exploiting the collected data. Furthermore, our visual discovery system D2MS allowed us to work actively and effectively with physicians. Significant findings in a hepatitis study were obtained with the integrated approach.
Jianguo WEI Xugang LU Jianwu DANG
Machine learning techniques have long been applied in many fields with considerable success. The purpose of a learning process is generally to obtain a set of parameters from a given data set by minimizing an objective function, so that the parameters explain the data in a maximum likelihood or minimum estimation error sense. However, most learned parameters are highly data dependent and rarely reflect the true physical mechanism behind the observed data. To obtain the inherent knowledge contained in the observed data, it is necessary to combine physical models with the learning process rather than only fitting the observations with a black-box model. To reveal underlying properties of human speech production, we propose a learning process based on a physiological articulatory model and a coarticulation model, both of which are derived from human mechanisms. A two-layer learning framework was designed to learn the parameters at the physiological level using the physiological articulatory model and the parameters at the motor planning level using the coarticulation model. The learning process was carried out on an articulatory database of human speech production. The learned parameters were evaluated by numerical experiments and listening tests. The phonetic targets obtained in the planning stage provide evidence for understanding the virtual targets of human speech production. As a result, the model-based learning process reveals the inherent mechanism of human speech through learned parameters that have physical meaning.
Kazuhiro TAKEUCHI Yukie NAKAO Hitoshi ISAHARA
Dividing a lecture speech into segments and providing those segments as learning objects is a general and convenient way to construct e-learning resources. However, it is difficult to assign each object an appropriate title that reflects its content. Since discourse segments can be analyzed from various aspects, researchers inevitably face this diversity when describing the "meanings" of discourse segments. In this paper, we propose assigning discourse segment titles from the representation of their "meanings." In this assignment procedure, we focus on the speaker's evaluation of the event or the speech object. To verify the effectiveness of our idea, we examined the identification of segment boundaries from the titles described by our procedure. We confirmed that the result of this identification was more accurate than that of intuitive identification.
Haruna MATSUSHITA Yoshifumi NISHIO
Since large amounts of data that include useless information have been accumulated in recent years, it is important to investigate methods for extracting clusters from data containing much noise. The Self-Organizing Map (SOM) has recently attracted attention for clustering. In this study, we propose a method that uses plural SOMs (TSOM: Tentacled SOM) for effective data extraction. TSOM consists of two kinds of SOM with different features: one self-organizes the area where input data are concentrated, and the other self-organizes the whole input space. Each SOM of TSOM can catch the information of other SOMs existing in its neighborhood and self-organizes with competing and accommodating behaviors. We apply TSOM to data extraction from input data containing much noise and confirm that TSOM successfully extracts only the clusters, even when the number of clusters is not known in advance.
Shu-Chen WANG Pei-Hwa HUANG Chi-Jui WU Yung-Sung CHUANG
This paper investigates the application of fuzzy c-means clustering to the direct identification of coherent synchronous generators in power systems. Because of its conceptual appropriateness and computational simplicity, this approach is a fast and flexible method. First, coherency measures are derived from the time-domain responses of the generators to reveal the relations between any pair of generators. They are then used as the initial element values of the membership matrix in the clustering procedure. An application of the proposed method to the Taiwan power (Taipower) system is demonstrated to show the effectiveness of this clustering approach. The effects of short-circuit fault locations, operating conditions, data sampling interval, and power system stabilizers are also investigated. The results are compared with those obtained from the similarity relation method. The presented approach is found to need less computation time and to be able to directly initialize a clustering process for any number of clusters.
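As a rough illustration of the clustering step, the following is a minimal sketch of standard fuzzy c-means that starts from a caller-supplied membership matrix instead of a random one. The function name `fuzzy_c_means`, the fuzzifier `m`, and the way the initial memberships `U0` would be built from coherency measures are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def fuzzy_c_means(X, U0, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy c-means starting from a given membership matrix.

    X  : (n_samples, n_features) data, e.g. sampled generator responses (assumed).
    U0 : (n_clusters, n_samples) initial memberships, any positive values.
    """
    # Normalize so that each sample's memberships sum to 1.
    U = U0 / U0.sum(axis=0, keepdims=True)
    centers = None
    for _ in range(n_iter):
        Um = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Distance from every center to every sample, shape (c, n).
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers
```

Each generator would then be assigned to the cluster with the largest membership, for example with `U.argmax(axis=0)`.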
Akari SATO Yoshihiro HAYAKAWA Koji NAKAJIMA
Many researchers have attempted to use neural networks to solve combinatorial optimization problems that are NP-hard or NP-complete. Although neural network methods have some advantages, the local minimum problem has not yet been solved. It has been shown that the Inverse Function Delayed (ID) model, a neuron model that has negative resistance in its dynamics and can destabilize an intended region, can be used as a powerful tool to avoid local minima. In our previous paper, we showed that the ID network can separate local minimum states from global minimum states in the case where the energy function of the embedded problem is zero. It can achieve a 100% success rate in the N-Queen problem within a certain parameter region. However, for a wider parameter region, the ID network cannot reach a global minimum state even though all local minimum states are unstable. In this paper, we show that the ID network falls into a particular permanent oscillating state in this situation. Several neurons in the network keep spiking in this oscillating state, and hence the state transition never proceeds toward global minima. However, we also clarify that the oscillating state is controlled by the parameter α, which affects the negative resistance region and the hysteresis property of the ID model. Consequently, there is a parameter region where combinatorial optimization problems are solved at a 100% success rate.
The present paper introduces an integrated construction of binary sequences having a zero-correlation zone. The cross-correlation function and the side-lobe of the auto-correlation function of the proposed sequence set are zero for phase shifts within the zero-correlation zone. The proposed method enables a more flexible design of the binary zero-correlation zone sequence set with respect to its member size, length, and width of the zero-correlation zone. Several previously reported construction methods for binary zero-correlation zone sequence sets can be explained as special cases of the proposed method.
Verayuth LERTNATTEE Thanaruk THEERAMUNKONG
Text classification is an important tool for supporting decision making. Recently, in addition to term frequency and inverse document frequency, term distributions have been shown to improve classification accuracy in multi-class classification. This paper investigates the performance of these term distributions on binary classification using a centroid-based approach. In such a one-against-the-rest setting, there are only two classes, the positive (focused) class and the negative class. To improve the performance, a so-called hierarchical EM method is applied to cluster the negative class, which is usually much larger and more diverse than the positive one, into several homogeneous groups. The experimental results on two collections of web pages, namely Drug Information (DI) and WebKB, show the merits of term distributions and clustering on binary classification. The performance of the proposed method is also investigated using the Thai Herbal collection, in which the texts are written in the Thai language.
Chin-Chen CHANG Wen-Chuan WU Chih-Chiang TSOU
The major application of digital data hiding techniques is to deliver confidential data secretly via public but unreliable computer networks. Most existing data hiding schemes, however, exploit the raw data of cover images to perform secret communications. In this paper, a novel data hiding scheme is presented that manipulates images compressed by side-match vector quantization (SMVQ). The proposed scheme provides adaptive alternatives for modulating the quantized indices in the compressed domain so that a considerable quantity of secret data can be embedded. As the experimental results demonstrate, the proposed scheme provides a larger payload capacity without noticeable distortion in comparison with schemes proposed in earlier works. Furthermore, the scheme also achieves satisfactory compression performance.
Zhipeng YE Wenbin CHEN Michael Peter KENNEDY
A Verilog-AMS model of a fractional-N frequency synthesizer is presented that is capable of predicting spurious tones as well as noise and jitter performance. The model is based on a voltage-domain behavioral simulation. Simulation efficiency is improved by merging the voltage controlled oscillator (VCO) and the frequency divider. Due to the benefits of Verilog-AMS, the ΔΣ modulator which is incorporated in the synthesizer is modeled in a fully digital way. This makes it accurate enough to evaluate how the performance of the frequency synthesizer is affected by cyclic behavior in the ΔΣ modulator. The spur-minimizing effect of an odd initial condition on the first accumulator of the ΔΣ modulator is verified. Sequence length control and its effect on the fractional-N frequency synthesizer are also discussed. The simulated results are in agreement with prior published data on fractional-N synthesizers and with new measurement results.
Koichiro ISHIKAWA Yoshihisa SHINOZAWA Akito SAKURAI
In this paper, we propose a SOM-like algorithm that accepts online, as inputs, the starts and ends of viewing of a multimedia content by many users; a one-dimensional map is then self-organized, providing an approximation of the density distribution that shows how many users view each part of the content. In this way, "viewing behavior of crowds" information is accumulated as experience accumulates, summarized into one SOM-like network as knowledge is extracted, and presented to new users as the knowledge is transmitted. The accumulation of multimedia contents on the Internet increases both the need for time-efficient viewing of the contents and the possibility of compiling information on many users' viewing experiences. Under these circumstances, a system has been proposed that presents, in the Internet environment, a kind of summary of the viewing records of many viewers of a multimedia content. The summary is expected to show that some parts are seen by many users while other parts are rarely seen. The function is similar to that of websites utilizing the "wisdom of crowds" and is facilitated by our proposed algorithm.
Sangwook LEE Haesun PARK Moongu JEON
Particle swarm optimization (PSO), inspired by social psychology principles and evolutionary computation, has been successfully applied to a wide range of continuous optimization problems. However, little research has been done on discrete problems, even though a discrete binary version of PSO (BPSO) was introduced by Kennedy and Eberhart in 1997. In this paper, we propose a modified BPSO algorithm that escapes from a local optimum by employing a bit change mutation. The proposed algorithm was tested on De Jong's suite, and the results show that BPSO with the proposed mutation outperforms the original BPSO.
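The idea can be sketched as follows: a binary PSO in the Kennedy and Eberhart style (a sigmoid of the velocity gives the probability that a bit is 1), followed by a mutation step that flips bits at random. This is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not define the bit change mutation, so the per-bit flip probability `p_mut`, the other parameter values, and the function name are hypothetical. The sketch maximizes `fitness`.

```python
import numpy as np

def bpso_with_mutation(fitness, n_bits, n_particles=20, n_iter=100,
                       w=1.0, c1=2.0, c2=2.0, v_max=4.0, p_mut=0.02, seed=0):
    """Binary PSO with an assumed per-bit flip mutation (maximization)."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_particles, n_bits))
    v = np.zeros((n_particles, n_bits))
    pbest = x.copy()
    pbest_fit = np.array([fitness(p) for p in x], dtype=float)
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Velocity update pulled toward personal and global bests.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -v_max, v_max)
        # Sigmoid of the velocity is the probability that a bit becomes 1.
        prob_one = 1.0 / (1.0 + np.exp(-v))
        x = (rng.random(x.shape) < prob_one).astype(int)
        # Assumed bit change mutation: flip each bit with probability p_mut.
        flip = rng.random(x.shape) < p_mut
        x = np.where(flip, 1 - x, x)

        fit = np.array([fitness(p) for p in x], dtype=float)
        improved = fit > pbest_fit
        pbest[improved] = x[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit
```

Setting `p_mut=0.0` recovers the plain BPSO update, which makes it easy to compare the two variants on the same fitness function.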
It has been shown that the output information produced by the soft output Viterbi algorithm (SOVA) is too optimistic. To compensate for this, the output information should be normalized. This letter proposes a simple normalization technique that extends the existing sign difference ratio (SDR) criterion. The new normalization technique counts the sign differences between the a-priori information and the extrinsic information, and then adaptively determines the corresponding normalization factor for each data block. Simulations comparing the new technique with other well-known normalization techniques show that the proposed normalization technique can achieve about 0.2 dB coding gain improvement on average while reducing up to about 1/2 iteration for decoding.
Kuniyasu SHIMIZU Tetsuro ENDO Hisa-Aki TANAKA
The averaged equation is derived for an arbitrary number of oscillators coupled by the nonlinear coupling scheme invented by S. Nagano. This system was devised as a model of a unicellular slime amoeba. By using the averaged equation, we investigate the synchronization characteristics of five coupled oscillators and of a large number of coupled oscillators. In particular, we present the statistical properties of the coupled oscillators in terms of the coupling factor γ. We also investigate the effect of the linear and nonlinear coupling terms on achieving synchronization, and confirm that the nonlinear coupling term plays a more important role in achieving strong synchronization than the linear coupling term does.
Sunae SEO Youil KIM Hyun-Goo KANG Taisook HAN
Correctness of Java programs is important because they are executed in distributed computing environments. The object initialization scheme in the Java programming language is complicated, and this complexity may lead to undesirable semantic bugs. Various tools have been developed for detecting program patterns that might cause errors during program execution. However, current tools cannot identify code patterns in which an uninitialized field is accessed when an object is initialized. We refer to such erroneous patterns as uninitialized field references. In this paper, we propose a static pattern detection algorithm for identifying uninitialized field references. We design a sound analysis for this problem and implement an analyzer using the Soot framework. In addition, we apply our algorithm to some real Java applications. From the experiments, we identify 12 suspicious field references in the applications, and among those we find two suspected errors by manual inspection.
Zhi-Ren TSAI Jiing-Dong HWANG Yau-Zen CHANG
This study introduces the fuzzy Lyapunov function to fuzzy PID control systems, which are modified fuzzy systems, with an optimized robust tracking performance. We propose a compound search strategy called the conditional linear matrix inequality (CLMI) approach, which is composed of the proposed improved random optimal algorithm (IROA) concatenated with the simplex method, to solve the linear matrix inequality (LMI) problem. If solutions for a specific system exist, the scheme finds more than one solution at a time, and these fixed potential solutions and variable PID gains are ready for tracking performance optimization. The effectiveness of the proposed control scheme is demonstrated by a numerical example of a cart-pole system.
Hyunggi CHO Myungseok KANG Jonghoon KIM Hagbae KIM
This paper presents a Maximum Likelihood Location Estimation (MLLE) algorithm for home network environments. We propose a deployment of a cluster-tree topology in ZigBee networks and derive the MLE under log-normal models for the Received Signal Strength (RSS) measurements. Experiments are also conducted to validate the effectiveness of the proposed algorithm.
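To make the estimation idea concrete, here is a minimal sketch of maximum likelihood localization under the common log-normal shadowing model, RSS(d) = P0 - 10 n log10(d / d0) + X with X Gaussian in dB. With equal noise variance at every anchor, maximizing the likelihood is equivalent to minimizing the sum of squared dB residuals, which the sketch does by a simple grid search. The reference power `p0`, path-loss exponent `n`, anchor layout, and grid bounds are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rss_model(d, p0, n, d0=1.0):
    """Mean RSS in dB at distance d under the log-normal path-loss model."""
    return p0 - 10.0 * n * np.log10(np.maximum(d, 1e-6) / d0)

def ml_locate(anchors, rss, p0, n, lo=0.0, hi=10.0, steps=101):
    """Grid-search ML position estimate from RSS readings at known anchors.

    anchors : (A, 2) anchor coordinates; rss : (A,) measured RSS in dB.
    """
    xs = np.linspace(lo, hi, steps)
    gx, gy = np.meshgrid(xs, xs)
    candidates = np.stack([gx.ravel(), gy.ravel()], axis=1)      # (G, 2)
    # Distance from every candidate position to every anchor, shape (G, A).
    d = np.linalg.norm(candidates[:, None, :] - anchors[None, :, :], axis=2)
    # Gaussian noise in dB: ML estimate minimizes squared dB residuals.
    residual = rss[None, :] - rss_model(d, p0, n)
    cost = (residual ** 2).sum(axis=1)
    return candidates[np.argmin(cost)]
```

A grid search keeps the sketch dependency-free; a gradient-based optimizer started from the best grid point would be a natural refinement.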
Jie JIA Eun-Ku JUNG Hae-Kwang KIM
This paper presents an adaptive transform coefficient scan method that effectively improves intra coding efficiency of H.264. Instead of applying one zig-zag scan to all transform blocks, the proposed method applies a field scan to a horizontally predicted block, a horizontal scan to a vertically predicted block, and a zig-zag scan to blocks predicted in other prediction modes. Experiments based on JM9.6 were performed using only intra coding. Results of the experiments show that the proposed method yields an average PSNR enhancement of 0.16 dB and a maximum PSNR enhancement of 0.31 dB over the current H.264 using zig-zag scan.
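The mode-dependent scan selection described above can be sketched for 4x4 blocks as follows. The zig-zag order is the usual 4x4 pattern; the abstract does not give the exact field and horizontal scan tables, so the column-first and row-first orders below are illustrative stand-ins, and the mode names and function names are hypothetical.

```python
# Scan orders as lists of raster indices (index = 4 * row + column).
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
# Illustrative stand-in for the field scan: walk down each column first.
COLUMN_FIRST_4X4 = [4 * r + c for c in range(4) for r in range(4)]
# Illustrative stand-in for the horizontal scan: walk along each row first.
ROW_FIRST_4X4 = list(range(16))

def select_scan(pred_mode):
    """Pick a scan order from the intra prediction mode of the block."""
    if pred_mode == "horizontal":
        # Horizontal prediction tends to leave residual energy in the first column.
        return COLUMN_FIRST_4X4
    if pred_mode == "vertical":
        # Vertical prediction tends to leave residual energy in the first row.
        return ROW_FIRST_4X4
    return ZIGZAG_4X4

def scan_block(block, pred_mode):
    """Serialize a 4x4 coefficient block using the mode-dependent scan."""
    flat = [value for row in block for value in row]
    return [flat[i] for i in select_scan(pred_mode)]
```

The intent is that non-zero coefficients are visited earlier in the scan, which shortens the run of symbols the entropy coder must handle before the trailing zeros.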
Yuki YOSHIDA Kazunori HAYASHI Hideaki SAKAI
This paper proposes low-complexity pre- and post-frequency domain equalization and frequency diversity combining methods for block transmission schemes with cyclic prefix. In the proposed methods, the equalization and diversity combining are performed simultaneously in discrete frequency domain. The weights for the proposed equalizer and combiner are derived based on zero-forcing and minimum-mean-square error criteria. We demonstrate the performance of the proposed methods, including bit-error rate performance and peak-to-average power ratios of the transmitted signal, via computer simulations.
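For orientation, the following minimal sketch shows textbook single-branch post-equalization in the discrete frequency domain for a cyclic-prefixed block, with zero-forcing weights 1 / H_k and MMSE weights conj(H_k) / (|H_k|^2 + noise variance). It assumes unit symbol power and a known channel, and it does not reproduce the paper's pre-equalization or diversity combining; all function names and parameters are hypothetical.

```python
import numpy as np

def fde_weights(h, n_fft, noise_var=0.0, criterion="mmse"):
    """Per-subcarrier equalizer weights for channel impulse response h."""
    H = np.fft.fft(h, n_fft)
    if criterion == "zf":
        return 1.0 / H                       # zero-forcing: invert the channel
    # MMSE with unit symbol power: regularize the inverse by the noise variance.
    return np.conj(H) / (np.abs(H) ** 2 + noise_var)

def transmit_block(symbols, h, cp_len):
    """Add a cyclic prefix, pass through the channel, and strip the prefix."""
    tx = np.concatenate([symbols[-cp_len:], symbols])
    rx = np.convolve(tx, h)
    # With cp_len >= len(h) - 1, this equals circular convolution with h.
    return rx[cp_len:cp_len + len(symbols)]

def equalize(rx, weights):
    """One-tap equalization per frequency bin, then back to the time domain."""
    return np.fft.ifft(weights * np.fft.fft(rx))
```

Because the cyclic prefix turns the channel into a circular convolution, equalization costs one complex multiplication per frequency bin plus the FFT pair, which is the source of the low complexity.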
In this paper, we present a novel force-directed method for automatically drawing intersecting compound mixed graphs (ICMGs), which can express complicated relations among elements such as adjacency, inclusion, and intersection. For this purpose, we adopt a strategy called unified simplification, which transforms the layout problem for an ICMG into that for an undirected graph. This method is useful for various information visualizations. We describe the definitions, aesthetics, force model, algorithm, evaluation, and applications.