Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Hideyuki TOKUDA Jin NAKAZAWA Takuro YONEZAWA
Ubiquitous computing and communication are key technologies for achieving economic growth, sustainable development, and safe and secure communities on the way towards a ubiquitous network society. Although the technology alone cannot solve the emerging problems, it is important to deploy services everywhere and reach real people with sensor-enabled smart phones or devices. Using these devices and wireless sensor networks, we have been creating various types of ubiquitous services which support our everyday life. In this paper, we describe ubiquitous services based on a HOT-SPA model and discuss challenges in creating new ubiquitous services with smart enablers such as smart phones, wireless sensor nodes, social media, and cloud services. We first classify various types of ubiquitous services and introduce the HOT-SPA model, which is aimed at modeling ubiquitous services. Several ubiquitous services, such as DIY smart object services, Twitthings, Airy Notes, and SensingCloud, are described. We then address the challenges in creating advanced ubiquitous services by enhancing the coupling between cyber and physical spaces.
Doo-Won LEE Gye-Tae GIL Dong-Hoi KIM
This paper introduces a hard handover strategy with a novel adaptive hysteresis adjustment that is needed to reduce handover drop rate in 3GPP long term evolution (LTE). First of all, we adopt a Hybrid handover scheme considering both the received signal strength (RSS) and the load information of the adjacent evolved Node Bs (eNBs) as a factor for deciding the target eNB. The Hybrid scheme causes the load status between the adjacent eNBs to be largely similar. Then, we propose a modified load-based adaptive hysteresis scheme to find a suitable handover hysteresis value utilizing the feature of the small load difference between the target and serving eNBs obtained from the result of the Hybrid scheme. As a result, through the proposed modified load-based adaptive hysteresis scheme, the best target cell is very well selected according to the dynamically changing communication environments. The simulation results show that the proposed scheme provides good performance in terms of handover drop rate.
Daisuke KAMISAKA Shigeki MURAMATSU Takeshi IWAMOTO Hiroyuki YOKOYAMA
Pedestrian dead reckoning (PDR) based on human gait locomotion is a promising solution for indoor location services, which independently determine the relative position of the user using multiple sensors. Most existing PDR methods assume that all sensors are mounted in a fixed position on the user's body while walking. However, it is inconvenient for a user to mount his/her mobile phone or additional sensor modules in a specific position on his/her body such as the torso. In this paper, we propose a new PDR method and a prototype system suitable for indoor navigation systems on a mobile phone. Our method determines the user's relative position even if the sensors' orientation relative to the user is not given and changes from moment to moment. Therefore, the user does not have to mount the mobile phone containing sensors on the body and can carry it in a natural way while walking, e.g., while swinging the arms. Detailed algorithms, implementation and experimental evaluation results are presented.
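As a minimal illustration of the dead-reckoning idea only, the sketch below accumulates detected steps into a relative track. It assumes that a step length and a walking heading are already available for each step; the function name `dead_reckon` and the heading convention (radians, counterclockwise from the x-axis) are hypothetical, and the orientation-free estimation described in the abstract is not reproduced here.

```python
import math

def dead_reckon(start, steps):
    """Accumulate (step_length_m, heading_rad) pairs into a list of (x, y) positions.

    Assumption: headings are in radians, measured counterclockwise from the x-axis.
    """
    x, y = start
    track = [(x, y)]
    for length, heading in steps:
        # Each detected step moves the user by `length` along the current heading.
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        track.append((x, y))
    return track
```

For example, one 1 m step along the x-axis followed by one 1 m step along the y-axis ends at roughly (1, 1) relative to the start.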
Sozo INOUE Yasunobu NOHARA Masaki TAKEMORI Kozo SAKURAGAWA
We consider RFID bookshelves, which detect the location of books using RFID. An RFID bookshelf has the antennas of RFID readers in its boards and detects the location of an RFID tag attached to a book. However, in our experience with the existing system the accuracy is not good, and an antenna sometimes reads the tag of a book in the next area or even further away. In this paper, we propose a method to improve the location detection using a naive Bayes classifier, and show the experimental results. We obtained an F-measure of 78.6% for a total of 12658 instances, and show the advantage over the straightforward approach of calculating the center of gravity of the readers that read the tag. More importantly, we show by leave-1-layout/book-out cross validation that the performance is less dependent on changes of layout and differences between books. This is favorable for the feasibility in library operation.
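The contrast between the two approaches can be sketched as follows. This is an illustrative Bernoulli naive Bayes, not the authors' implementation: it assumes each observation is the set of antenna indices that read a tag, that the label is the shelf area holding the book, and that Laplace smoothing (`alpha`) is used. All function names, and the one-dimensional antenna positions used by the baseline, are hypothetical.

```python
import math
from collections import defaultdict

def train_bernoulli_nb(samples, n_antennas, alpha=1.0):
    """samples: list of (reads, area); reads is the set of antennas that saw the tag."""
    area_counts = defaultdict(int)
    read_counts = defaultdict(lambda: [0] * n_antennas)
    for reads, area in samples:
        area_counts[area] += 1
        for a in reads:
            read_counts[area][a] += 1
    model = {}
    for area, n in area_counts.items():
        log_prior = math.log(n / len(samples))
        # Smoothed probability that antenna a reads a tag located in this area.
        probs = [(read_counts[area][a] + alpha) / (n + 2 * alpha)
                 for a in range(n_antennas)]
        model[area] = (log_prior, probs)
    return model

def predict_area(model, reads, n_antennas):
    """Return the area with the highest posterior log-score for the observed reads."""
    best_area, best_score = None, -math.inf
    for area, (log_prior, probs) in model.items():
        score = log_prior
        for a in range(n_antennas):
            p = probs[a]
            score += math.log(p if a in reads else 1.0 - p)
        if score > best_score:
            best_area, best_score = area, score
    return best_area

def center_of_gravity(reads, antenna_positions):
    """Baseline: average position of the antennas that read the tag."""
    return sum(antenna_positions[a] for a in reads) / len(reads)
```

Unlike the center-of-gravity baseline, the classifier also uses the antennas that did not read the tag as evidence.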
Arei KOBAYASHI Shigeki MURAMATSU Daisuke KAMISAKA Takafumi WATANABE Atsunori MINAMIKAWA Takeshi IWAMOTO Hiroyuki YOKOYAMA
This paper proposes a method for using an accelerometer, microphone, and GPS in a mobile phone to recognize the movement of the user. Past attempts at identifying the movement associated with riding on a bicycle, train, bus or car and common human movements like standing still, walking or running have had problems with poor accuracy due to factors such as sudden changes in vibration or times when the vibrations resembled those for other types of movement. Moreover, previous methods have had the problem of high power consumption because of the sensor processing load. The proposed method aims to avoid these problems by estimating the reliability of the inference result, and by combining two inference modes to decrease the power consumption. Field trials demonstrate that our method achieves 90% or better average accuracy for the seven types of movement listed above. Shaka's power saving functionality enables us to extend the battery life of a mobile phone to over 100 hours while our estimation algorithm is running in the background. Furthermore, this paper uses experimental results to show the trade-off between accuracy and latency when estimating user activity.
Toshiya NAKAKURA Yasuyuki SUMI Toyoaki NISHIDA
This paper proposes a system called Neary that detects conversational fields based on similarity of auditory situation among users. The similarity of auditory situation between each pair of the users is measured by the similarity of frequency property of sound captured by head-worn microphones of the individual users. Neary is implemented with a simple algorithm and runs on portable PCs. Experimental result shows Neary can successfully distinguish groups of conversations and track dynamic changes of them. This paper also presents two examples of Neary deployment to detect user contexts during experience sharing in touring at the zoo and attending an academic conference.
This paper presents an algorithm that uses a tri-axial accelerometer to discriminate between a person walking upstairs and walking downstairs. The algorithm consists of vertical acceleration calibration, extraction of two kinds of features (Interquartile Range and Wavelet Energy), effective feature subset selection with the wrapper approach, and SVM classification. The proposed algorithm can recognize upstairs and downstairs walking with 95.64% average accuracy for different sensor locations, i.e., on the subject's waist belt, in the trousers pocket, and in the shirt pocket. Even for the mixed data from all sensor locations, the average recognition accuracy reaches 94.84%. Experimental results have validated the effectiveness of the proposed method.
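The two feature types named above can be illustrated on a single window of vertical acceleration samples. This is a sketch under assumptions: the abstract does not state which wavelet or decomposition depth is used, so a one-level Haar transform stands in for it, and both function names are hypothetical.

```python
import math
import statistics

def interquartile_range(window):
    """Spread of the middle half of the samples (third quartile minus first quartile)."""
    q1, _, q3 = statistics.quantiles(window, n=4)
    return q3 - q1

def haar_detail_energy(window):
    """Energy of the one-level Haar detail coefficients (assumed wavelet).

    Samples are paired as (w[0], w[1]), (w[2], w[3]), ...; a trailing unpaired
    sample is ignored.
    """
    pairs = zip(window[0::2], window[1::2])
    return sum(((a - b) / math.sqrt(2)) ** 2 for a, b in pairs)
```

A feature vector for the wrapper-based selection and the SVM would then be assembled from such values computed per axis and per window.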
In this letter, a new scatternet formation algorithm called hybrid mesh tree is proposed for Bluetooth ad hoc networks. The hybrid mesh tree constructs a mesh-shaped topology in one dense area that is extended by a tree-shaped topology to the other areas. First, the hybrid mesh tree uses a designated root to construct a tree-shaped subnet, and then propagates a constant k in its downstream direction to determine new roots. Each new root then asks its upstream master to start a return connection procedure to convert the first tree-shaped subnet into a mesh-shaped subnet. At the same time, each new root repeats the same procedure as the designated root to build its own tree-shaped subnet until the whole scatternet is formed. Simulation results show that the hybrid mesh tree achieves better network performance than Bluetree and generates an efficient scatternet configuration for various sizes of Bluetooth scatternets.
Kyusuk HAN Kwangjo KIM Taeshik SHON
Recent Location Based Services (LBS) extend beyond information services such as car navigation to support various applications such as augmented reality and emergency services in ubiquitous computing environments. However, location based services in the ubiquitous computing environment bring several security issues such as location privacy and forgery. While the privacy of location based services is considered an important security issue, security against location forgery has received less attention. In this paper, we propose an improved version of Han et al.'s protocol [1] that requires more lightweight computation. Our proposed model also improves the credibility of LBS by deploying multiple location sensing technologies.
Masashi KIYOMI Toshiki SAITOH Ryuhei UEHARA
The Voronoi game is a two-person perfect information game modeling a competitive facility location. The original version of the game is played on a continuous domain. Only two special cases (1-dimensional case and 1-round case) have been extensively investigated. Recently, the discrete Voronoi game of which the game arena is given as a graph was introduced. In this note, we give a complete analysis of the discrete Voronoi game on a path. There are drawing strategies for both the first and the second players, except for some trivial cases.
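The scoring rule of the game on a path can be sketched as below. It assumes the usual convention that a vertex is won by the player holding the nearest occupied vertex, that equidistant vertices count for neither player, and that occupied vertices count for their owner. The function name and the numbering of the path vertices as 0..n-1 are illustrative.

```python
def voronoi_scores(n, first, second):
    """Return (first_score, second_score, tied) for a path with vertices 0..n-1.

    first, second: non-empty collections of vertices occupied by each player.
    """
    s1 = s2 = tied = 0
    for v in range(n):
        d1 = min(abs(v - p) for p in first)   # distance to nearest vertex of player 1
        d2 = min(abs(v - p) for p in second)  # distance to nearest vertex of player 2
        if d1 < d2:
            s1 += 1
        elif d2 < d1:
            s2 += 1
        else:
            tied += 1
    return s1, s2, tied
```

On a path with four vertices where the players occupy the two middle vertices, each player wins two vertices, which is an instance of a drawn outcome.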
The Krivine-style evaluation mechanism is well-known in the implementation of higher-order functions, as it avoids some useless closure building. There have been a few type systems that can verify the safety of the mechanism. The incorporation of the proposed ideas into an existing compiler, however, would require significant changes in the type system of the compiler due to the use of some dedicated form of types and typing rules in the proposals. This limitation motivates us to propose an alternative light-weight Krivine typing mechanism that does not need to extend any existing type system significantly. This paper shows how GADTs (generalized algebraic data types) can be used for typing a ZINC machine following the Krivine-style evaluation mechanism. This idea is new as far as we know. Some existing typed compilers like GHC (Glasgow Haskell Compiler) already support GADTs; they can benefit from the Krivine-style evaluation mechanism in the operational semantics with no particular extension of their type systems for safety. We show that the GHC type checker makes it possible to prove mechanically that ZINC instructions are well-typed, which highlights the effectiveness of GADTs.
Although a large number of query processing algorithms in spatial network database (SNDB) have been studied, there exists little research on route-based queries. Since moving objects move only in spatial networks, route-based queries, like in-route nearest neighbor (IRNN), are essential for Location-based Service (LBS) and Telematics applications. However, the existing IRNN query processing algorithm has a problem in that it does not consider time and space constraints. Therefore, we, in this paper, propose IRNN query processing algorithms which take both time and space constraints into consideration. Finally, we show the effectiveness of our IRNN query processing algorithms considering time and space constraints by comparing them with the existing IRNN algorithm.
A safe prime p is a prime such that (p-1)/2 is also a prime. A primality test or a safe primality test is normally a combination of trial division and a probabilistic primality test. Since the number of small odd primes used in the trial division affects the performance of the combination, researchers have studied how to obtain the optimal number of small odd primes to be used in the trial division and the expected running time of the combination for primality tests. However, in the case of safe primality tests, the analysis of the combination is more difficult, and thus no such results have been given. In this paper, we present the first probabilistic analysis on the expected running time and the optimal number of small odd primes to be used in the trial division for optimizing the tests. Experimental results show that our probabilistic analysis estimates the behavior of the safe primality tests very well.
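The combination analyzed above can be sketched as follows: trial division by the first k small odd primes is applied to both p and (p-1)/2 before the more expensive Miller-Rabin test is run on each. This is an illustration, not the authors' procedure. The fixed Miller-Rabin bases are an assumption made for reproducibility (a probabilistic test would draw random bases), and the names `SMALL_ODD_PRIMES`, `miller_rabin`, and `is_safe_prime` are hypothetical.

```python
SMALL_ODD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

def miller_rabin(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin test with fixed bases (assumption; random bases are also common)."""
    if n < 2:
        return False
    for b in bases:
        if n == b:
            return True
        if n % b == 0:
            return False
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def is_safe_prime(p, k=len(SMALL_ODD_PRIMES)):
    """Check that both p and q = (p - 1) / 2 are prime, using k trial divisors first."""
    if p < 5 or p % 2 == 0:
        return False
    q = (p - 1) // 2
    # Cheap filter: reject if a small odd prime divides p or q.
    for r in SMALL_ODD_PRIMES[:k]:
        if (p != r and p % r == 0) or (q != r and q % r == 0):
            return False
    return miller_rabin(q) and miller_rabin(p)
```

The parameter `k` corresponds to the number of small odd primes used in the trial division, which is the quantity whose optimal value the analysis addresses.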
Kohei MIYASE Kenji NODA Hideaki ITO Kazumi HATAYAMA Takashi AIKYO Yuta YAMATO Hiroshi FURUKAWA Xiaoqing WEN Seiji KAJIHARA
Test data modification based on test relaxation and X-filling is the preferred approach for reducing excessive IR-drop in at-speed scan testing to avoid test-induced yield loss. However, none of the existing test relaxation methods can control the distribution of identified don't care bits (X-bits), thus adversely affecting the effectiveness of IR-drop reduction. In this paper, we propose a novel test relaxation method, called Distribution-Controlled X-Identification (DC-XID), which controls the distribution of X-bits identified in a set of fully-specified test vectors for the purpose of effectively reducing IR-drop. Experiments on large industrial circuits demonstrate the effectiveness and practicality of the proposed method in reducing IR-drop, without lowering fault coverage, increasing test data volume and circuit size.
Improving clustering performance remains a challenging issue for most document clustering methods. Document clustering based on semantic features is highly efficient. However, the method sometimes fails to cluster some documents, such as highly articulated documents. In order to improve the clustering success of complex documents using semantic features, this paper proposes a document clustering method that uses terms of the condensing document clusters and fuzzy association to efficiently cluster specific documents into meaningful topics based on the document set. The proposed method improves the quality of document clustering because it can extract documents from the perspective of the terms of the cluster topics using semantic features and synonyms, which can also better represent the inherent structure of the document in connection with the document cluster topics. The experimental results demonstrate that the proposed method can achieve better document clustering performance than other methods.
C. M. Althaff IRFAN Shusaku NOMURA Takaoi YAMAGISHI Yoshimasa KUROSAWA Kuniaki YAJIMA Katsuko T. NAKAHIRA Nobuyuki OGAWA Yoshimi FUKUMURA
This paper presents a new dimension in e-learning by collecting and analyzing physiological data during real-world e-learning sessions. Two different content materials, namely Interactive (IM) and Non-interactive (N-IM), were utilized to determine the physiological state of e-learners. Electrocardiogram (ECG) and Skin Conductance Level (SCL) were recorded continuously while learners experienced IM and N-IM for about 25 minutes each. Data from 18 students were collected for analysis. As a result, a significant difference between IM and N-IM was observed in SCL (p < .01), whereas there was no significant difference in other indices such as heart rate and its variability, and skin conductance response (SCR). This study suggests a new path in understanding e-learners' physiological state with regard to different e-learning materials; the results of this study suggest a clear distinction in physiological states in the context of different learning materials.
Hiromu TAKAHASHI Tomohiro YOSHIKAWA Takeshi FURUHASHI
Brain-Computer Interfaces (BCIs) are systems that translate one's thoughts into commands to restore control and communication to severely paralyzed people, and they are also appealing to healthy people. One of the challenges is to improve the performance of BCIs, often measured by the accuracy and the trial duration, or the information transfer rate (ITR), i.e., the mutual information per unit time. Since BCIs are communications between a user and a system, error control schemes such as forward error correction and automatic repeat request (ARQ) can be applied to BCIs to improve the accuracy. This paper presents reliability-based ARQ (RB-ARQ), a variation of ARQ designed for BCIs, which employs the maximum posterior probability for the repeat decision. The current results show that RB-ARQ is more effective than the conventional methods, i.e., better accuracy when trial duration was the same, and shorter trial duration when the accuracy was the same. This resulted in a greater information transfer rate and a greater utility, which is a more practical performance measure in the P300 speller task. The results also show that such users who achieve a poor accuracy for some reason can benefit the most from RB-ARQ, which could make BCIs more universal.
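The repeat decision can be sketched as a sequential Bayesian update that stops once the maximum posterior probability reaches a threshold. This is a minimal illustration under assumptions: each trial is taken to yield one likelihood value per candidate class, the prior is uniform, and the names `rb_arq`, `threshold`, and `max_trials` are hypothetical rather than taken from the paper.

```python
def rb_arq(likelihood_stream, n_classes, threshold=0.9, max_trials=10):
    """Repeat trials until the maximum posterior reaches `threshold`.

    likelihood_stream: iterable of per-trial likelihood lists, one value per class
    (assumed to be positive for at least one class).
    Returns (decided_class, max_posterior, trials_used).
    """
    posterior = [1.0 / n_classes] * n_classes  # uniform prior (assumption)
    trials = 0
    for likelihoods in likelihood_stream:
        trials += 1
        # Bayes update: multiply by the new evidence and renormalize.
        unnorm = [p * l for p, l in zip(posterior, likelihoods)]
        total = sum(unnorm)
        posterior = [u / total for u in unnorm]
        if max(posterior) >= threshold or trials >= max_trials:
            break
    best = max(range(n_classes), key=posterior.__getitem__)
    return best, posterior[best], trials
```

With weak but consistent evidence, such as likelihoods of 0.6 against 0.4 on every trial, the loop requests six trials before the posterior exceeds 0.9, whereas stronger evidence ends the repetition earlier. This is how the trial duration adapts to the reliability of each user's data.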
Tsuneo KATO Kengo FUJITA Nobuyuki NISHIZAWA
This paper presents efficient frame-synchronous beam pruning for HMM-based automatic speech recognition. In the conventional beam pruning, a few hypotheses that have greater potential to reach various words on a lexical tree are likely to be pruned out by a number of hypotheses that have limited potential, since all hypotheses are treated equally without considering this potential. To make the beam pruning less restrictive for hypotheses with greater potential and vice versa, the proposed method adds to the likelihood of each hypothesis a tentative reward as a monotonically increasing function of the number of reachable words from the HMM state where the hypothesis stays in a lexical tree. The reward is designed not to collapse the ASR probabilistic framework. The proposed method reduced 84% of the processing time for a grammar-based 10k-word short sentence recognition task. For a language-model-based dictation task, it also resulted in an additional 23% reduction in processing time from the beam pruning with the language model look-ahead technique.
Generally, two problems of bag-of-features in image retrieval are still considered unsolved: one is that spatial information about descriptors is not employed well, which affects the accuracy of retrieval; the other is the trade-off between vocabulary size and good precision, which decides the storage and retrieval performance. In this paper, we propose a novel approach called Hilbert scan based bag-of-features (HS-BoF) for image retrieval. Firstly, a Hilbert scan based tree representation (HSBT) is studied, which is built on the local descriptors while spatial relationships are added into the nodes by a novel grouping rule, resulting in a tree structure for each image. Further, we give two ways of codebook production based on HSBT: multi-layer codebook and multi-size codebook. Owing to the properties of Hilbert scanning and the merits of our grouping method, sub-regions of the tree are not only flexible to the distribution of local patches but also have hierarchical relations. Extensive experiments on Caltech-256, 13-scene and 1 million ImageNet images show that HS-BoF obtains higher accuracy with less memory usage.
Xue YUAN Xue-Ye WEI Yong-Duan SONG
This paper presents a pedestrian detection framework using a top-view camera. The paper contains two novel contributions for the pedestrian detection task: 1. Using the shape context method to estimate the pedestrian directions and normalize the pedestrian regions. 2. Based on the locations of the extracted head candidates, the system automatically chooses the most suitable classifier from several classifiers. Our proposed methods may resolve difficulties in the field of top-view pedestrian detection. Experiments were performed on video sequences with different illumination and crowded conditions, and the experimental results demonstrate the efficiency of our algorithm.
Hasan S.M. AL-KHAFFAF Abdullah Z. TALIB Rosalina ABDUL SALAM
Many factors, such as noise level in the original image and the noise-removal methods that clean the image prior to performing a vectorization, may play an important role in affecting the line detection of raster-to-vector conversion methods. In this paper, we propose an empirical performance evaluation methodology that is coupled with a robust statistical analysis method to study many factors that may affect the quality of line detection. Three factors are studied: noise level, noise-removal method, and the raster-to-vector conversion method. Eleven mechanical engineering drawings, three salt-and-pepper noise levels, six noise-removal methods, and three commercial vectorization methods were used in the experiment. The Vector Recovery Index (VRI) of the detected vectors was the criterion used for the quality of line detection. A repeated measure ANOVA analyzed the VRI scores. The statistical analysis shows that all the studied factors affected the quality of line detection. It also shows that two-way interactions between the studied factors affected line detection.
Lih-Shyang CHEN Young-Jinn LAY Je-Bin HUANG Yan-De CHEN Ku-Yaw CHANG Shao-Jer CHEN
Although the Marching Cube (MC) algorithm is very popular for displaying images of voxel-based objects, its slow surface extraction process is usually considered to be one of its major disadvantages. It has been pointed out that, for the original MC algorithm, vertex calculations can be limited to once per vertex to speed up the surface extraction process; however, how this could be done efficiently was not described. Neither has the reuse of these MC vertices been looked into seriously in the literature. In this paper, we propose a “Group Marching Cube” (GMC) algorithm to reduce the time needed for the vertex identification process, which is part of the surface extraction process. Since most of the triangle vertices of an iso-surface are shared by many MC triangles, the vertex identification process can avoid the duplication of the vertices in the vertex array of the resultant triangle data. This process is usually done through a hash table mechanism proposed in the literature and used by many software systems. Our proposed GMC algorithm considers a group of voxels simultaneously for the application of the MC algorithm to explore interesting features of the original MC algorithm that have not been discussed in the literature. Based on our experiments, for an object with more than 1 million vertices, the GMC algorithm is 3 to more than 10 times faster than the algorithm using a hash table. Another significant advantage of GMC is its compatibility with other algorithms that accelerate the MC algorithm. Together, the overall performance of the original MC algorithm is improved even further.
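For orientation, the hash-table style of vertex identification that serves as the comparison baseline can be sketched as below. This is an illustration rather than the GMC algorithm. It assumes that triangles arrive as tuples of exact (x, y, z) coordinates, so that shared vertices compare equal; practical implementations often key on the grid edge that produced the vertex instead. The function name is hypothetical.

```python
def index_vertices(triangles):
    """Convert a triangle soup into (vertices, faces) with shared vertices stored once.

    triangles: list of 3-tuples, each element an (x, y, z) coordinate tuple.
    """
    vertex_index = {}  # hash table: coordinate tuple -> position in `vertices`
    vertices = []
    faces = []
    for tri in triangles:
        face = []
        for v in tri:
            idx = vertex_index.get(v)
            if idx is None:
                # First time this vertex is seen: append it and remember its index.
                idx = len(vertices)
                vertex_index[v] = idx
                vertices.append(v)
            face.append(idx)
        faces.append(tuple(face))
    return vertices, faces
```

Every vertex of every triangle costs one hash lookup in this scheme, and that per-vertex overhead is the cost the vertex identification step of GMC aims to reduce.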
Takashi ONISHI Masao UTIYAMA Eiichiro SUMITA
Lattice decoding in statistical machine translation (SMT) is useful in speech translation and in the translation of German because it can handle input ambiguities such as speech recognition ambiguities and German word segmentation ambiguities. In this paper, we show that lattice decoding is also useful for handling input variations. “Input variations” refers to the differences in input texts with the same meaning. Given an input sentence, we build a lattice which represents paraphrases of the input sentence. We call this a paraphrase lattice. Then, we give the paraphrase lattice as an input to a lattice decoder. The lattice decoder searches for the best path of the paraphrase lattice and outputs the best translation. Experimental results using the IWSLT dataset and the Europarl dataset show that our proposed method obtains significant gains in BLEU scores.
Amir MEHRAFSA Alireza SOKHANDAN Ghader KARIMIAN
In this paper, a new algorithm called TGA is introduced which defines the concept of time more naturally for the first time. A parameter called TimeToLive is considered for each chromosome, which is a time duration in which it could participate in the process of the algorithm. This will lead to keeping the dynamism of algorithm in addition to maintaining its convergence sufficiently and stably. Thus, the TGA guarantees not to result in premature convergence or stagnation providing necessary convergence to achieve optimal answer. Moreover, the mutation operator is used more meaningfully in the TGA. Mutation probability has direct relation with parent similarity. This kind of mutation will decrease ineffective mating percent which does not make any improvement in offspring individuals and also it is more natural. Simulation results show that one run of the TGA is enough to reach the optimum answer and the TGA outperforms the standard genetic algorithm.
Xutao DU Chunxiao XING Lizhu ZHOU
We develop a distance function for finite Chu spaces based on their behavior. Typical examples are given to show the coincidence between the distance function and intuition. We show by example that the triangle inequality should not be satisfied when it comes to comparing two processes.
Let T be a tree in which every edge is associated with a real number. The sum of a path in T is the sum of the numbers associated with the edges of the path and its length is the number of the edges in it. For two positive integers L1 ≤ L2 and two real numbers S1 ≤ S2, a path is feasible if its length is between L1 and L2 and its sum is between S1 and S2. We address the problem: Given a tree T, and four numbers, L1, L2, S1 and S2, find the longest feasible path of T. We provide an optimal O(n log n) time algorithm for the problem, where n =|T|.
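To make the problem statement concrete, the sketch below is a quadratic-time brute-force reference, not the O(n log n) algorithm of the paper. It enumerates the unique path from every start vertex to every other vertex and keeps the longest one that satisfies both constraints. The function name and the representation of the tree (vertices numbered 0..n-1, edges as (u, v, weight) triples) are assumptions made for illustration.

```python
from collections import defaultdict

def longest_feasible_path(n, edges, L1, L2, S1, S2):
    """Return (length, start, end) of a longest feasible path, or None if none exists."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = None
    for start in range(n):
        # Depth-first walk from `start`; in a tree each vertex is reached by one path.
        stack = [(start, -1, 0, 0)]
        while stack:
            node, parent, length, total = stack.pop()
            if L1 <= length <= L2 and S1 <= total <= S2:
                if best is None or length > best[0]:
                    best = (length, start, node)
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, length + 1, total + w))
    return best
```

Such a reference implementation is mainly useful for checking a faster algorithm on small trees.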
In the present paper, a method for extracting user interest by constructing a hierarchy of words from social bookmarking (SBM) tags and emphasizing nouns based on the hierarchical structure (folksonomy) is proposed. Co-occurring SBM tags basically have a semantic relationship. As a result of an experimental evaluation using user profiles on Twitter, the authors discovered that the SBM tags and their word hierarchy have a rich vocabulary for extracting user interest.
Identifying the statistical independence of random variables is one of the important tasks in statistical data analysis. In this paper, we propose a novel non-parametric independence test based on a least-squares density ratio estimator. Our method, called least-squares independence test (LSIT), is distribution-free, and thus it is more flexible than parametric approaches. Furthermore, it is equipped with a model selection procedure based on cross-validation. This is a significant advantage over existing non-parametric approaches which often require manual parameter tuning. The usefulness of the proposed method is shown through numerical experiments.
Makoto YAMADA Masashi SUGIYAMA Gordon WICHERN Jaak SIMM
The least-squares probabilistic classifier (LSPC) is a computationally-efficient alternative to kernel logistic regression. However, to assure its learned probabilities to be non-negative, LSPC involves a post-processing step of rounding up negative parameters to zero, which can unexpectedly influence classification performance. In order to mitigate this problem, we propose a simple alternative scheme that directly rounds up the classifier's negative outputs, not negative parameters. Through extensive experiments including real-world image classification and audio tagging tasks, we demonstrate that the proposed modification significantly improves classification accuracy, while the computational advantage of the original LSPC remains unchanged.
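The proposed post-processing can be illustrated on the raw per-class outputs of a trained model: negative outputs are rounded up to zero and the result is normalized to sum to one. This sketch assumes the raw outputs are already computed. The function name and the uniform fallback for the case in which every output is non-positive are assumptions, not details taken from the paper.

```python
def posterior_from_outputs(raw_outputs):
    """Round negative class outputs up to zero, then normalize to probabilities."""
    clipped = [max(0.0, q) for q in raw_outputs]
    total = sum(clipped)
    if total == 0.0:
        # Assumed fallback: no class has positive support, so return a uniform estimate.
        return [1.0 / len(raw_outputs)] * len(raw_outputs)
    return [c / total for c in clipped]
```

For raw outputs of 0.6, -0.1, and 0.2, the estimated class probabilities are approximately 0.75, 0, and 0.25. The learned parameters themselves are left untouched, which is the difference from the original post-processing step.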
Jae-seong LEE Young-cheol PARK Dae-hee YOUN Kyung-ok KANG
Although the AMR-WB+ coder provides excellent quality for speech signal, its coding model for music signals is not as optimal as the HE-AAC v2. The main causes of the poor quality of the AMR-WB+ TCX are the non-critical sampling and block artifacts. The new TCX windowing scheme proposed in this paper uses an MDCT with a 50% frame overlap, so that the problems of non-critical sampling and blocking artifacts are significantly mitigated. Due to long overlaps, the proposed scheme involves an additional codec delay. It is, however, moderate for audio services. The results of objective and subjective tests indicate that the proposed scheme achieves noticeable quality improvements for music signals over the previous TCX schemes.
Ryo NAKASHIMA Kei UTSUGI Keita TAKAHASHI Takeshi NAEMURA
We propose a new stereo image retargeting method based on the framework of shift-map image editing. Retargeting is the process of changing the image size according to the target display while preserving as much of the richness of the image as possible, and is often applied to monocular images and videos. Retargeting stereo images poses a new challenge because pixel correspondences between the stereo pair should be preserved to keep the scene's structure. The main contribution of this paper is integrating a stereo correspondence constraint into the retargeting process. Among several retargeting methods, we adopt shift-map image editing because this framework can be extended naturally to stereo images, as we show in this paper. We confirmed the effectiveness of our method through experiments.
Xu YANG De XU Songhe FENG Yingjun TANG Shuoyan LIU
This paper presents an efficient yet powerful codebook model, named the classified codebook model, to categorize natural scenes. The current codebook model typically resorts to a large codebook to obtain higher performance for scene categorization, which severely limits the practical applicability of the model. Our model formulates the codebook model with the theory of vector quantization, and thus uses the well-known technique of classified vector quantization for scene-category modeling. The significant feature of our model is that it is beneficial for scene categorization, especially at small codebook sizes, while saving much of the computational complexity of quantization. We evaluate the proposed model on a well-known challenging scene dataset: 15 Natural Scenes. The experiments demonstrate that our model can decrease the computation time for codebook generation. What is more, our model achieves better performance for scene categorization, and the performance gain becomes more pronounced at small codebook sizes.
Hong BAO Song-He FENG De XU Shuoyan LIU
Localized content-based image retrieval (LCBIR) has recently emerged as a hot topic because, in the CBIR scenario, the user is interested in a portion of the image and the rest of the image is irrelevant. In this paper, we propose a novel region-level relevance feedback method to solve the LCBIR problem. Firstly, the visual attention model is employed to measure the regional saliency of each image in the feedback image set provided by the user. Secondly, the regions in the image set are used to form an affinity matrix, and a novel propagation energy function is defined which takes both low-level visual features and regional significance into consideration. After the iteration, regions in the positive images with high confidence scores are selected as the candidate query set to conduct the next-round retrieval task until the retrieval results are satisfactory. Experimental results on the SIVAL dataset demonstrate the effectiveness of the proposed approach.
We propose a statistical method for counting pedestrians. Previous pedestrian counting methods are not applicable to highly crowded areas because they rely on the detection and tracking of individuals. The performance of detection-and-tracking methods is easily degraded for highly crowded scenes in terms of both accuracy and computation time. The proposed method employs feature-based regression in the spatiotemporal domain to count pedestrians. The proposed method is accurate and requires less computation time, even for large crowds, because it does not include the detection and tracking of objects. Our test results from four hours of video sequences obtained from a highly crowded shopping mall reveal that the proposed method is able to measure human traffic with an accuracy of 97.2% and requires only 14 ms per frame.
This letter presents a new automatic musical genre classification method based on an informative song-level representation, in which the mutual information between the feature and the genre label is maximized. By efficiently combining distance-based indexing with informative features, the proposed method represents a song as one vector instead of complex statistical models. Experiments on an audio genre DB show that the proposed method can achieve classification accuracy comparable or superior to the state-of-the-art results.