Qieshi ZHANG Sei-ichiro KAMATA
This paper proposes an improved color barycenter model (CBM) and its separation for automatic road sign (RS) detection. The previous version of CBM can find the colors of RS, but its accuracy is not high enough for separating the magenta and blue regions, and the influence of how often the same color occurs is not considered. In this paper, the improved CBM expands the barycenter distribution to a cylindrical coordinate system (CCS) and takes the number of colors at each position into account for clustering. Under this distribution, the color information can be represented more clearly for analysis. Then, aiming at the characteristics of the barycenter distribution in CBM (CBM-BD), a constrained clustering method is presented to cluster the CBM-BD in CCS. Although the proposed clustering method resembles conventional K-means in some respects, it overcomes some limitations of K-means in our research. The experimental results show that the proposed method is able to detect RS with high robustness.
In this letter, we present a novel interference-aware clustering scheme for cell broadcasting service. The proposed approach is based on a genetic algorithm for re-clustering. Using the genetic algorithm, the suggested method efficiently re-clusters the user nodes when the relays fail to receive the cell broadcasting message from the base station. The simulation results show that the proposed clustering scheme can maintain much higher capacity than the conventional clustering scheme in cases of relay outage. The re-clustering method based on the genetic algorithm also shows lower complexity than the re-clustering approach based on exhaustive search.
Mutual information (MI) is a standard measure of statistical dependence of random variables. However, due to the log function and the ratio of probability densities included in MI, it is sensitive to outliers. On the other hand, the L2-distance variant of MI called quadratic MI (QMI) tends to be robust against outliers because QMI is just the integral of the squared difference between the joint density and the product of marginals. In this paper, we propose a kernel least-squares QMI estimator called least-squares QMI (LSQMI) that directly estimates the density difference without estimating each density. A notable advantage of LSQMI is that its solution can be analytically and efficiently computed just by solving a system of linear equations. We then apply LSQMI to dependence-maximization clustering, and demonstrate its usefulness experimentally.
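For reference, the two dependence measures contrasted in this abstract can be written out as follows. The notation (densities p(x, y), p(x), p(y)) is an assumption chosen for illustration and may differ from the paper's own symbols; the QMI form follows the abstract's description of it as the integral of the squared difference between the joint density and the product of marginals.

```latex
\mathrm{MI}(X, Y) = \iint p(x, y)\, \log \frac{p(x, y)}{p(x)\, p(y)} \,\mathrm{d}x\, \mathrm{d}y,
\qquad
\mathrm{QMI}(X, Y) = \iint \bigl( p(x, y) - p(x)\, p(y) \bigr)^{2} \,\mathrm{d}x\, \mathrm{d}y .
```

Both quantities vanish when X and Y are independent, but QMI contains neither the logarithm nor the density ratio, which is the source of the robustness to outliers mentioned above.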
Danyi LI Weifeng LI Qingmin LIAO
In this paper, we propose a hybrid fuzzy geometric active contour method, which embeds spatial fuzzy clustering into the evolution of the geometric active contour. In every iteration, the evolving curve works as a spatial constraint on the fuzzy clustering, and the clustering result is utilized to construct the fuzzy region force. On the one hand, the fuzzy region force provides a powerful capability to avoid leakage at weak boundaries and enhances the robustness to various types of noise. On the other hand, the local information obtained from the gradient feature map contributes to locating the object boundaries accurately and improves the performance on images with heterogeneous foreground or background. Experimental results on synthetic and real images show that our model can precisely extract object boundaries and performs better than the existing representative hybrid active contour approaches.
Bei HE Guijin WANG Chenbo SHI Xuanwu YIN Bo LIU Xinggang LIN
Based on sample-pair refinement and local optimization, this paper proposes a high-accuracy and fast matting algorithm. First, in order to gather foreground/background samples effectively, we shoot rays in hybrid (gradient and uniform) directions. This strategy utilizes prior knowledge to adjust the directions for effective searching. Second, we refine the sample-pairs of pixels by taking into account those of their neighbors. Both high-confidence sample-pairs and usable foreground/background components are utilized, and thus more accurate and smoother matting results are achieved. Third, to reduce the computational cost of sample-pair selection in coarse matting, this paper proposes an adaptive sample clustering approach. Most redundant samples are eliminated adaptively, so the computational cost decreases significantly. Finally, we convert fine matting into a de-noising problem, which is optimized by minimizing the observation and state errors iteratively and locally. This leads to lower space and time complexity compared with global optimization. Experiments demonstrate that our method outperforms other state-of-the-art methods in local matting in both accuracy and efficiency.
WonHee LEE Samuel Sangkon LEE Dong-Un AN
Clustering methods are divided into hierarchical clustering, partitioning clustering, and others. K-Means is a partitioning clustering method. We improve the performance of K-Means by selecting the initial cluster centers through a calculation rather than at random. This method maximizes the distance among the initial cluster centers. Consequently, the centers are distributed evenly and the results are more accurate than with initial cluster centers selected at random. This initialization is time-consuming, but it can reduce the total clustering time by minimizing allocation and recalculation. Compared with the standard algorithm, the F-Measure improves by 5.1%.
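The idea of spreading the initial centers apart can be sketched as follows. The abstract does not state the exact calculation, so this example substitutes a common greedy "farthest-first" (maximin) heuristic as an assumption; the function name `farthest_first_centers` and the choice of the first center are hypothetical and are not taken from the paper.

```python
import math

def farthest_first_centers(points, k):
    """Greedily pick k initial centers that are far apart (maximin heuristic).

    Assumption: this is a generic heuristic, not the paper's own calculation.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    dim = len(points[0])
    # Deterministic start: the point farthest from the mean of the data.
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    first = max(points, key=lambda p: math.dist(p, mean))
    centers = [first]
    # min_dist[i] is the distance from points[i] to its nearest chosen center.
    min_dist = [math.dist(p, first) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all centers chosen so far.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
    return centers
```

The returned centers would then seed an ordinary K-Means loop in place of random initial centers. The extra distance computations illustrate why such an initialization costs more up front than random selection.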
Chen ZHANG ShiXiong XIA Bing LIU Lei ZHANG
Maximum margin clustering (MMC) is a newly proposed clustering method that extends the large-margin computation of the support vector machine (SVM) to unsupervised learning. Traditionally, MMC is formulated as a nonconvex integer programming problem, which makes it difficult to solve. Several methods rely on reformulating and relaxing the nonconvex optimization problem as a semidefinite program (SDP) or a second-order cone program (SOCP), which are computationally expensive and have difficulty handling large-scale data sets. In linear cases, by making use of the constrained concave-convex procedure (CCCP) and the cutting plane algorithm, several MMC methods take linear time to converge to a local optimum, but in nonlinear cases the time complexity is still high. Since the extreme learning machine (ELM) has achieved similar generalization performance at much faster learning speed than traditional SVM and LS-SVM, we propose an extreme maximum margin clustering (EMMC) algorithm based on ELM. It can perform well in nonlinear cases. Moreover, because EMMC uses random feature mappings, its kernel parameters need not be tuned. Experimental results on several real-world data sets show that EMMC performs better than traditional MMC methods, especially in handling large-scale data sets.
In this paper, we propose an improved face clustering method using a weighted graph-based approach. We combine two parameters as the weight of a graph to improve clustering performance. One is average similarity, which is calculated with two constraints of geometric and symmetric properties, and the other is a newly proposed parameter called the orientation matching ratio, which is calculated from orientation analysis for matched keypoints in the face region. According to the results of face clustering for several datasets, the proposed method shows improved results compared to the previous method.
Jie GONG Sheng ZHOU Lu GENG Meng ZHENG Zhisheng NIU
In this letter, we propose a novel precoding scheme for base station (BS) cooperation in downlink cellular networks that allow overlapped clusters. The proposed precoding scheme is designed to mitigate the overlapping-BS interference by maximizing the so-called clustered virtual signal-to-interference-plus-noise ratio (CVSINR). Simulations show that with the proposed scheme, overlapped clustering provides substantial throughput gain over the traditional non-overlapped clustering methods, and user fairness is also improved.
In this paper, we propose a method for designing genetically optimized Linguistic Models (LM) with the aid of fuzzy granulation. The fundamental idea of the LM introduced by Pedrycz is followed, and its design framework based on a Genetic Algorithm (GA) is enhanced. An LM is designed by the use of information granulation realized via Context-based Fuzzy C-Means (CFCM) clustering. This clustering technique builds information granules represented as fuzzy sets. However, it is difficult to optimize the number of linguistic contexts, the number of clusters generated by each context, and the weighting exponent. Thus, we perform simultaneous optimization of the design parameters linking information granules in the input and output spaces based on GA. Experiments on the coagulant dosing process in a water purification plant reveal that the proposed method shows better performance than previous works and the LM itself.
Shuta KIMURA Masanori HASHIMOTO Takao ONOYE
Post-silicon tuning is attracting a lot of attention for coping with increasing process variation. However, its tuning cost via testing is still a crucial problem. In this paper, we propose tuning-friendly body bias clustering with multiple bias voltages. The proposed method provides a small set of compensation levels so that the speed and leakage current vary monotonically according to the level. Thanks to this monotonic leveling and the limit on the number of levels, the test cost of post-silicon tuning is significantly reduced. During the body bias clustering, the proposed method explicitly estimates and minimizes the average leakage after the post-silicon tuning. Experimental results demonstrate that the proposed method reduces the average leakage by 25.3 to 51.9% compared to the non-clustering case. In a test case of four clusters, the number of necessary tests is reduced by 83% compared to the conventional exhaustive test approach. We reveal that two bias voltages are sufficient when only a small number of compensation levels are allowed for test-cost reduction. We also discuss implications for how to synthesize a circuit to which post-silicon tuning will be applied.
In wireless sensor networks, unbalanced energy consumption and transmission collisions are two inherent problems and can significantly reduce network lifetime. This letter proposes an unequal clustering and TDMA-like scheduling mechanism (UCTSM) based on a gradient sinking model in wireless sensor networks. It integrates unequal clustering and TDMA-like transmission scheduling to balance the energy consumption among cluster heads and reduce transmission collisions. Simulation results show that UCTSM balances the energy consumption among the cluster heads, saves nodes' energy and so improves the network lifetime.
Fengwei AN Tetsushi KOIDE Hans Jürgen MATTAUSCH
In this paper, we propose a hardware solution for overcoming the problem of high computational demands in a nearest neighbor (NN) based multi-prototype learning system. The multiple prototypes are obtained by a high-speed K-means clustering algorithm utilizing a concept of software-hardware cooperation that takes advantage of the flexibility of the software and the efficiency of the hardware. The one nearest neighbor (1-NN) classifier is used to recognize an object by searching for the nearest Euclidean distance among the prototypes. The major deficiency in conventional implementations of both K-means and 1-NN is the high computational demand of the nearest neighbor search. This deficiency is resolved by an FPGA-implemented coprocessor that is a VLSI circuit for searching the nearest Euclidean distance. The coprocessor requires 12.9% of the logic elements and 58% of the block memory bits of an Altera Stratix III E110 FPGA device. The hardware communicates with the software by a PCI Express (×4) local-bus-compatible interface. We benchmark our learning system against the popular case of handwritten digit recognition, in which abundant previous works for comparison are available. In the case of the MNIST database, we could attain the most efficient accuracy rate of 97.91% with 930 prototypes, a learning speed of 1.3×10^-4 s/sample and a classification speed of 3.94×10^-8 s/character.
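The 1-NN search over prototypes that the coprocessor accelerates can be sketched in software as below. This is only an illustrative pure-Python version under assumed data structures (a list of `(vector, label)` pairs); the function name `nearest_prototype` is hypothetical and the code says nothing about the paper's FPGA design.

```python
def nearest_prototype(sample, prototypes):
    """Return the label of the prototype closest to `sample`.

    `prototypes` is a list of (vector, label) pairs (an assumed layout).
    Squared Euclidean distance is used because it ranks candidates in the
    same order as the true Euclidean distance and avoids a square root.
    """
    best_label, best_dist = None, float("inf")
    for vector, label in prototypes:
        dist = sum((a - b) ** 2 for a, b in zip(sample, vector))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

Every query scans all prototypes, and each K-means assignment step performs the same kind of search against the current centers, which is why the abstract identifies nearest neighbor searching as the shared bottleneck.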
Doo Hwa HONG June Sig SUNG Kyung Hwan OH Nam Soo KIM
Decision tree-based clustering and parameter estimation are essential steps in the training part of an HMM-based speech synthesis system. These two steps are usually performed based on the maximum likelihood (ML) criterion. However, one of the drawbacks of the ML criterion is that it is sensitive to outliers, which usually result in quality degradation of the synthesized speech. In this letter, we propose an approach to detect and remove outliers for HMM-based speech synthesis. Experimental results show that the proposed approach can improve the quality of the synthetic speech, particularly when the available training speech database is insufficient.
Sho TSUGAWA Hiroyuki OHSAKI Makoto IMASE
In the literature, two connectivity-based distributed clustering schemes exist: CDC (Connectivity-based Distributed node Clustering scheme) and SDC (SCM-based Distributed Clustering). While CDC and SDC have mechanisms for maintaining clusters against nodes joining and leaving, neither method assumes that frequent changes occur in the network topology. In this paper, we propose a lightweight distributed clustering method that we term SBDC (Schelling-Based Distributed Clustering) since this scheme is derived from Schelling's model – a popular segregation model in sociology. We evaluate the effectiveness of the proposed SBDC in an environment where frequent changes arise in the network topology. Our simulation results show that SBDC outperforms CDC and SDC under frequent changes in network topology caused by high node mobility.
Frank PERBET Bjorn STENGER Atsuto MAKI
This paper presents a novel algorithm to generate homogeneous superpixels from Markov random walks. We exploit Markov clustering (MCL) as the methodology, a generic graph clustering method based on stochastic flow circulation. In particular, we introduce a graph pruning strategy called compact pruning in order to capture intrinsic local image structure. The resulting superpixels are homogeneous, i.e. uniform in size and compact in shape. The original MCL algorithm does not scale well to a graph of an image due to the square computation of the Markov matrix which is necessary for circulating the flow. The proposed pruning scheme has the advantages of faster computation, smaller memory footprint, and straightforward parallel implementation. Through comparisons with other recent techniques, we show that the proposed algorithm achieves state-of-the-art performance.
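The basic MCL iteration referred to above alternates expansion (squaring the Markov matrix) with inflation (element-wise powering followed by column normalization). The following minimal sketch shows only that generic loop on a small dense matrix; it does not implement the paper's compact pruning, and the function names, the default inflation value of 2.0, the iteration count, and the 1e-6 threshold are all assumptions for illustration.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def mcl(adjacency, inflation=2.0, iterations=30):
    """Plain Markov clustering on a dense adjacency matrix (no pruning)."""
    n = len(adjacency)
    # Add self-loops, then turn the graph into a column-stochastic matrix.
    m = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(iterations):
        # Expansion: square the Markov matrix (the costly dense step).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalize the columns.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    # Assuming the iteration has converged, nodes whose columns keep mass on
    # the same set of rows are placed in the same cluster.
    groups = {}
    for j in range(n):
        attractors = tuple(i for i in range(n) if m[i][j] > 1e-6)
        groups.setdefault(attractors, []).append(j)
    return sorted(groups.values())
```

The dense matrix product in the expansion step grows cubically with the number of nodes in this naive form, which gives a sense of why the unpruned algorithm scales poorly to image-sized graphs.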
With the wide usage of multispectral images, a fast and efficient multidimensional clustering method becomes not only meaningful but also necessary. In general, to speed up the analysis of multidimensional images, a multidimensional feature vector should be transformed into a lower dimensional space. The Hilbert curve is a continuous one-to-one mapping from N-dimensional space to one-dimensional space, and can preserve neighborhoods as much as possible. However, because the Hilbert curve is generated by a recursive division process, 'Boundary Effects' will occur, which means that data that are close in N-dimensional space may not be close in one-dimensional Hilbert order. In this paper, a new efficient approach based on space-filling curves is proposed for classifying multispectral satellite images. In order to remove the 'Boundary Effects' of the Hilbert curve, multiple Hilbert curves, z curves, and the Pseudo-Hilbert curve are used jointly. The proposed method extracts category clusters from one-dimensional data without computing any distance in N-dimensional space. Furthermore, multispectral images can be analyzed hierarchically from coarse data distribution to fine data distribution according to the application. The experimental results on LANDSAT data demonstrate that the proposed method handles multispectral images efficiently and can be applied easily.
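As a small illustration of mapping N-dimensional data to one dimension, the sketch below computes a z curve (Morton) key by interleaving coordinate bits. It covers only a single z curve and is not the paper's joint use of multiple Hilbert, z, and Pseudo-Hilbert curves; the function name `morton_index` and the bit layout are assumptions.

```python
def morton_index(coords, bits):
    """Interleave the bits of non-negative integer coordinates into one key.

    Bit `b` of coordinate `d` is placed at position `b * len(coords) + d`,
    so sorting points by this key visits them in z-curve order.
    """
    key = 0
    n = len(coords)
    for bit in range(bits):
        for d, c in enumerate(coords):
            key |= ((c >> bit) & 1) << (bit * n + d)
    return key
```

With this layout the spatially adjacent cells (3, 0) and (4, 0) receive the keys 5 and 16 when `bits=3`, a small instance of neighbors in N-dimensional space ending up far apart in the one-dimensional order. A comparable effect on the Hilbert curve is what the abstract calls 'Boundary Effects'.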
Yuyu YUAN Chuanyi LIU Jie CHENG Xiaoliang WANG
Execution performance is critical for large-scale and data-intensive workflows. This paper proposes DISWOP, a novel scheduling algorithm for data-intensive workflow optimizations; it consists of three main steps: workflow process generation, task & resource mapping, and task clustering. To evaluate the effectiveness and efficiency of DISWOP, a comparative evaluation of different workflows is conducted on a prototype workflow platform. The results show that DISWOP can speed up execution by about 1.6-2.3 times depending on the task scale.
We are interested in retrieving video shots or videos containing particular people from a video dataset. Owing to the large variations in pose, illumination conditions, occlusions, hairstyles and facial expressions, face tracks have recently been researched in the fields of face recognition, face retrieval and name labeling from videos. However, when the number of face tracks is very large, conventional methods, which match all or some pairs of faces in face tracks, will not be effective. Therefore, in this paper, an efficient method for finding a given person in a video dataset is presented. In our study, in addition to performing research on face tracks in a single video, we also consider how to organize all the faces in the videos of a dataset and how to improve the search quality in the query process. Different videos may include the same person; thus, the management of individuals in different videos will be useful for their retrieval. The proposed method includes the following three points. (i) Face tracks of the same person appearing for a period in each video are first connected on the basis of scene information with a time constraint, then all the people in one video are organized by a proposed hierarchical clustering method. (ii) After obtaining the organizational structure of all the people in one video, the people are organized into an upper layer by affinity propagation. (iii) Finally, in the process of querying, a remeasuring method based on the index structure of videos is performed to improve the retrieval accuracy. We also build a video dataset that contains six types of videos: films, TV shows, educational videos, interviews, press conferences and domestic activities. The formation of face tracks in the six types of videos is first researched, then experiments are performed on this video dataset containing more than 1 million faces and 218,786 face tracks. The results show that the proposed approach has high search quality and a short search time.
Makoto NAKATSUJI Akimichi TANAKA Toshio UCHIYAMA Ko FUJIMURA
Users now often find their interests by checking the contents published or mentioned by their immediate neighbors in social networking services. We propose semantics-based link navigation, in which links guide the active user to potential neighbors who may provide new interests. Our method first creates a graph that has users as nodes and shared interests as links. Then it divides the graph by link pruning to extract a practical number of interest-sharing groups, i.e., communities of interest (COIs), that the active user can navigate. It then attaches a semantic tag to the link to each representative user, the one who best reflects the interests of the COIs in which that user is included, and to the link to each immediate neighbor of the active user. It finally calculates link attractiveness by analyzing the semantic tags on the links. The active user can select the link to access by checking the semantic tags and link attractiveness. User interests extracted from large-scale actual blog entries are used to confirm the efficiency of our proposal. Results show that navigation based on link attractiveness and representative users allows the user to find new interests much more accurately than is otherwise possible.