The search functionality is under construction.

Keyword Search Result

[Keyword] REST (332 hits)

Hits 81-100 of 332

  • Robust Face Alignment with Random Forest: Analysis of Initialization, Landmarks Regression, and Shape Regularization Methods

    Chun Fui LIEW  Takehisa YAIRI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2015/10/27
    Vol: E99-D No:2
    Page(s): 496-504

    A random forest regressor has recently been proposed as a local landmark estimator in the face alignment problem. It has been shown that a random forest regressor can achieve accurate, fast, and robust performance when coupled with a global face-shape regularizer. In this paper, we extend this approach and propose a new Local Forest Classification and Regression (LFCR) framework in order to handle face images with large yaw angles. Specifically, the LFCR has an additional classification step prior to the regression step. Our experimental results show that this additional classification step is useful in rejecting outliers prior to the regression step, thus improving the face alignment results. We also analyze each system component through detailed experiments. In addition to the selection of feature descriptors and several important tuning parameters of the random forest regressor, we examine different initialization and shape regularization processes. We compare our best outcomes to the state-of-the-art system and show that our method outperforms other parametric shape-fitting approaches.

  • Nonlinear Regression of Saliency Guided Proposals for Unsupervised Segmentation of Dynamic Scenes

    Yinhui ZHANG  Mohamed ABDEL-MOTTALEB  Zifen HE  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2015/11/06
    Vol: E99-D No:2
    Page(s): 467-474

    This paper proposes an efficient video object segmentation approach that is tolerant to complex scene dynamics. Unlike existing approaches that rely on estimating object-like proposals on an intra-frame basis, the proposed approach employs a temporally consistent foreground hypothesis using nonlinear regression of saliency guided proposals across a video sequence. For this purpose, we first generate salient foreground proposals at superpixel level by leveraging a saliency signature in the discrete cosine transform domain. We propose to use a random forest based nonlinear regression scheme to learn both appearance and shape features from salient foreground regions in all frames of a sequence. Availability of such features can help rank every foreground proposal of a sequence, and we show that the regions with high ranking scores are well correlated with semantic foreground objects in dynamic scenes. Subsequently, we utilize a Markov Random Field to integrate both appearance and motion coherence of the top-ranked object proposals. A temporal nonlinear regressor for generating salient object support regions significantly improves the segmentation performance compared to using only per-frame objectness cues. Extensive experiments on challenging real-world video sequences are performed to validate the feasibility and superiority of the proposed approach for addressing dynamic scene segmentation.

  • Ontology Based Framework for Interactive Self-Assessment of e-Health Applications (Open Access)

    Wasin PASSORNPAKORN  Sinchai KAMOLPHIWONG  

     
    INVITED PAPER

    Publicized: 2015/10/21
    Vol: E99-D No:1
    Page(s): 2-9

    Personal e-healthcare service is growing significantly. A large number of personal e-health measuring and monitoring devices are now on the market. However, to achieve better health outcomes, various devices or services need to work together. This coordination among services remains a challenge, due to their variations and complexities. To address this issue, we have proposed an ontology-based framework for interactive self-assessment of RESTful e-health services. Existing e-health service frameworks couple services tightly, and their data schemas are difficult to change and extend. In our work, by contrast, loose coupling among services and flexibility of each service are achieved through a design and implementation based on the HYDRA vocabulary and REST principles. We have implemented clinical knowledge through the combination of OWL-DL and SPARQL rules. All of these services evolve independently; their interfaces are based on REST principles, especially the HATEOAS constraint. We have demonstrated how to apply our framework for interactive self-assessment in e-health applications. We have shown that it allows the medical knowledge to drive the system workflow according to event-driven principles. New data schemas can be maintained at run-time. This is the essential feature for supporting the arrival of IoT (Internet of Things) based medical devices, which have their own data schemas and evolve over time.

  • Efficient Anchor Graph Hashing with Data-Dependent Anchor Selection

    Hiroaki TAKEBE  Yusuke UEHARA  Seiichi UCHIDA  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2015/08/17
    Vol: E98-D No:11
    Page(s): 2030-2033

    Anchor graph hashing (AGH) is a promising hashing method for nearest neighbor (NN) search. AGH realizes efficient search by generating and utilizing a small number of points that are called anchors. In this paper, we propose a method for improving AGH, which considers data distribution in a similarity space and selects suitable anchors by performing principal component analysis (PCA) in the similarity space.

  • High-Speed and Local-Changes Invariant Image Matching

    Chao ZHANG  Takuya AKASHI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2015/08/03
    Vol: E98-D No:11
    Page(s): 1958-1966

    In recent years, many variants of key point based image descriptors have been designed for image matching, and they have achieved remarkable performance. However, for some images, local features appear to be inapplicable. Since these images usually have many local changes around key points compared with a normal image, we define this special image category as the image with local changes (IL). An IL pair (ILP) refers to an image pair which contains a normal image and its IL. An ILP usually loses local visual similarities between the two images while still holding global visual similarity. When an IL is given as a query image, the purpose of this work is to match the corresponding ILP in a large scale image set. As a solution, we use a compressed HOG feature descriptor to extract global visual similarity. For the nearest neighbor search problem, we propose random projection indexed KD-tree forests (rKDFs) to match ILPs efficiently instead of exhaustive linear search. rKDFs are built with large scale low-dimensional KD-trees. Each KD-tree is built in a random projection indexed subspace and contributes to the final result equally through a voting mechanism. We evaluated our method on a benchmark which contains 35,000 candidate images and 5,000 query images. The results show that our method is efficient for solving local-changes invariant image matching problems.

  • Posteriori Restoration of Turn-Taking and ASR Results for Incorrectly Segmented Utterances

    Kazunori KOMATANI  Naoki HOTTA  Satoshi SATO  Mikio NAKANO  

     
    PAPER-Speech and Hearing

    Publicized: 2015/07/24
    Vol: E98-D No:11
    Page(s): 1923-1931

    In spoken dialogue systems, appropriate turn-taking is important, as is generating correct responses. Especially if the dialogue features quick responses, a user utterance is often incorrectly segmented by voice activity detection (VAD) due to short pauses within it. Incorrectly segmented utterances cause problems both in the automatic speech recognition (ASR) results and in turn-taking: i.e., an incorrect VAD result leads to ASR errors and causes the system to start responding though the user is still speaking. We develop a method that performs a posteriori restoration for incorrectly segmented utterances and implement it as a plug-in for the MMDAgent open-source software. A crucial part of the method is to classify whether the restoration is required or not. We cast it as a binary classification problem of detecting originally single utterances from pairs of utterance fragments. Various features are used representing timing, prosody, and ASR result information. Experiments show that the proposed method outperformed a baseline with manually-selected features by 4.8% and 3.9% in cross-domain evaluations with two domains. More detailed analysis revealed that the dominant and domain-independent features were utterance intervals and results from the Gaussian mixture model (GMM).

  • Ambient Sensor Network Technologies for Global Connectivity Support (Open Access)

    Masayoshi OHASHI  Nao KAWANISHI  

     
    INVITED PAPER

    Vol: E98-B No:9
    Page(s): 1733-1740

    This paper discusses the core ambient sensor network (ASN) technologies in view of their support for global connectivity. First, we enumerate ASN services and use cases and then discuss the underlying core technologies, in particular, the importance of the RESTful approach for ensuring global accessibility to sensors and actuators. We also discuss several profile-handling technologies for context-aware services. Finally, we envisage the ASN trends, including our current work for cognitive behavior therapy (CBT) in mental healthcare. We strongly believe that ASN services will become widely available in the real world and an integral part of daily life and society in the near future.

  • Boosted Random Forest

    Yohei MISHINA  Ryuei MURATA  Yuji YAMAUCHI  Takayoshi YAMASHITA  Hironobu FUJIYOSHI  

     
    PAPER

    Publicized: 2015/06/22
    Vol: E98-D No:9
    Page(s): 1630-1636

    Machine learning is used in various fields and demand for implementations is increasing. Within machine learning, a Random Forest is a multi-class classifier with high-performance classification, achieved using bagging and feature selection, and is capable of high-speed training and classification. However, as a type of ensemble learning, Random Forest determines classifications using the majority vote of multiple trees, so many decision trees must be built. Performance increases with the number of decision trees, at the cost of more memory, and decreases when the number of decision trees is reduced. Because of this, the algorithm is not well suited to implementation on small-scale hardware such as an embedded system. As such, we have proposed Boosted Random Forest, which introduces a boosting algorithm into the Random Forest learning method to produce high-performance decision trees that are smaller. When evaluated using databases from the UCI Machine Learning Repository, Boosted Random Forest achieved performance as good as or better than ordinary Random Forest, while reducing memory use by 47%. Thus, it is suitable for implementing Random Forests on embedded hardware with limited memory.
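The majority vote mentioned in the abstract above can be sketched as follows. This is only a minimal illustration of ensemble voting, not the authors' Boosted Random Forest: the "trees" are stand-in callables, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter

def majority_vote(trees, sample):
    """Return the label predicted by the largest number of trees."""
    votes = Counter(tree(sample) for tree in trees)
    label, _count = votes.most_common(1)[0]
    return label

# Three toy "trees" (assumed stand-ins for trained decision trees).
trees = [
    lambda x: "A" if x[0] > 0.5 else "B",
    lambda x: "A" if x[1] > 0.5 else "B",
    lambda x: "A" if x[0] + x[1] > 1.0 else "B",
]

print(majority_vote(trees, [0.9, 0.2]))
```

Each tree must be stored to cast its vote, which is why memory use grows with the number of trees.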

  • 3D CG Image Quality Metrics by Regions with 8 Viewpoints Parallax Barrier Method

    Norifumi KAWABATA  Masaru MIYAO  

     
    PAPER

    Vol: E98-A No:8
    Page(s): 1696-1708

    Many previous studies on image quality assessment of 3D still images or video clips have been conducted. In particular, it is important to know the region in which assessors are interested or on which they focus in images or video clips, as represented by the ROI (Region of Interest). For multi-view 3D images, it is obvious that there are a number of viewpoints; however, it is not clear whether assessors focus on objects or background regions. It is also not clear what assessors focus on depending on whether the background region is colored or gray scale. Furthermore, while case studies on coded degradation in 2D or binocular stereoscopic videos have been conducted, no such case studies on multi-view 3D videos exist, and therefore, no results are available for coded degradation according to the object or background region in multi-view 3D images. In addition, it has not been revealed whether the background region being gray scale or not affects the assessors' gaze points and the subjective image quality. In this study, we conducted experiments on the subjective evaluation of the assessor in the case of coded degradation by JPEG coding of the background or object or both in 3D CG images using an eight viewpoint parallax barrier method. Then, we analyzed the results statistically and classified the evaluation scores using an SVM.

  • A Performance Study to Ensure Emergency Communications during Large Scale Disasters Using Satellite/Terrestrial Integrated Mobile Communications Systems

    Kazunori OKADA  Takayuki SHIMAZU  Akira FUJIKI  Yoshiyuki FUJINO  Amane MIURA  

     
    PAPER

    Vol: E98-A No:8
    Page(s): 1627-1636

    The Satellite/Terrestrial Integrated mobile Communication System (STICS), which allows terrestrial mobile phones to communicate directly through a satellite, has been studied [1]. Satellites are unaffected by the seismic activity that causes terrestrial damage, and therefore, the STICS can be expected to be a measure that ensures emergency call connection. This paper first describes the basic characteristics of call blocking rates of terrestrial mobile phone systems in areas where non-functional base stations are geographically clustered, as investigated through computer simulations that showed an increased call blocking rate as the number of non-functional base stations increased. Further simulations showed that restricting the use of the satellite system for emergency calls only ensures the STICS's capacity to transmit emergency communications; however, these simulations also revealed a weakness in the low channel utilization rate of the satellite system [2]. Therefore, in this paper, we propose increasing the channel utilization rate with a priority channel framework that divides the satellite channels between priority channels for emergency calls and non-priority channels that can be available for emergency or general use. Simulations of this priority channel framework showed that it increased the satellite system's channel utilization rate, while continuing to ensure emergency call connection [3]. These simulations showed that the STICS with a priority channel framework can provide efficient channel utilization and still be expected to provide a valuable secondary measure to ensure emergency communications in areas with clustered non-functional base stations during large-scale disasters.
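The priority channel idea in the abstract above can be sketched as a simple admission rule. This is an illustration under stated assumptions, not the authors' simulation model: the class name `ChannelPool`, the channel counts, and the choice to try priority channels before non-priority ones for emergency calls are all hypothetical.

```python
class ChannelPool:
    """Satellite channels split into priority and non-priority groups."""

    def __init__(self, priority_channels, non_priority_channels):
        self.priority_free = priority_channels
        self.non_priority_free = non_priority_channels

    def admit(self, is_emergency):
        """Return the group that carries the call, or None if it is blocked."""
        # Assumption: emergency calls use priority channels first.
        if is_emergency and self.priority_free > 0:
            self.priority_free -= 1
            return "priority"
        # Non-priority channels serve both emergency and general calls.
        if self.non_priority_free > 0:
            self.non_priority_free -= 1
            return "non-priority"
        return None  # call is blocked

pool = ChannelPool(priority_channels=1, non_priority_channels=1)
results = [pool.admit(False), pool.admit(False), pool.admit(True), pool.admit(True)]
print(results)
```

In this toy run the second general call is blocked while an emergency call that arrives later is still admitted on the reserved channel.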

  • Information-Theoretic Limits for the Multi-Way Relay Channel with Direct Links

    Yuping SU  Ying LI  Guanghui SONG  

     
    LETTER-Information Theory

    Vol: E98-A No:6
    Page(s): 1325-1328

    Information-theoretic limits of a multi-way relay channel with direct links (MWRC-DL), where multiple users exchange their messages through a relay terminal and direct links, are discussed in this paper. Under the assumption that a restricted encoder is employed at each user, an outer bound on the capacity region is derived first. Then, a decode-and-forward (DF) strategy is proposed and the corresponding rate region is characterized. The explicit outer bound and the achievable rate region for the Gaussian MWRC-DL are also derived. Numerical examples are provided to demonstrate the performance of the proposed DF strategy.

  • Robust Visual Tracking via Coupled Randomness

    Chao ZHANG  Yo YAMAGATA  Takuya AKASHI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2015/02/04
    Vol: E98-D No:5
    Page(s): 1080-1088

    Tracking algorithms for arbitrary objects are widely researched in the field of computer vision. At the beginning, an initialized bounding box is given as the input. After that, the algorithms are required to track the target in later frames on the fly. Tracking-by-detection is one of the main research branches of online tracking. However, two issues must still be addressed to improve performance. 1) The limited processing time requires the model to extract low-dimensional and discriminative features from the training samples. 2) The model is required to balance the prior and new appearance information of the target in order to maintain the relocation ability and avoid the drifting problem. In this paper, we propose a real-time tracking algorithm called coupled randomness tracking (CRT) which focuses on dealing with these two issues. One randomness represents random projection, and the other randomness represents online random forests (ORFs). In CRT, the gray-scale feature is compressed by a sparse measurement matrix, and ORFs are used to train on the sample sequence online. During the training procedure, we introduce a tree discarding strategy which helps the ORFs adapt to fast appearance changes caused by illumination, occlusion, etc. Our method can constantly adapt to the target's latest appearance changes while keeping the prior appearance information. The experimental results show that our algorithm performs robustly on many publicly available benchmark videos and outperforms several state-of-the-art algorithms. Additionally, our algorithm can easily be parallelized.

  • Interference Mitigation Framework Based on Interference Alignment for Femtocell-Macrocell Two Tier Cellular Systems

    Mohamed RIHAN  Maha ELSABROUTY  Osamu MUTA  Hiroshi FURUKAWA  

     
    PAPER-Wireless Communication Technologies

    Vol: E98-B No:3
    Page(s): 467-476

    This paper presents a downlink interference mitigation framework for two-tier heterogeneous networks that consist of spectrum-sharing macrocells and femtocells. This framework establishes cooperation between the two tiers through two algorithms, namely, the restricted waterfilling (RWF) algorithm and the iterative reweighted least squares interference alignment (IRLS-IA) algorithm. The proposed framework models the macrocell-femtocell two-tier cellular system as an overlay cognitive radio system in which the macrocell system plays the role of the primary user (PU) while the femtocell networks play the role of the cognitive secondary users (SUs). Through the RWF algorithm, the macrocell basestation (MBS) cooperates with the femtocell basestations (FBSs) by releasing some of its eigenmodes to the FBSs for their transmissions even if the traffic is heavy and the MBS's signal to noise power ratio (SNR) is high. Then, the FBSs are expected to achieve a near optimum sum rate by employing the IRLS-IA algorithm to mitigate both the co-tier and cross-tier interference at the femtocell users' (FUs) receivers. Simulation results show that the proposed IRLS-IA approach provides an improved sum rate for the femtocell users compared to the conventional IA techniques, such as the leakage minimization approach and the nuclear norm based rank constraint rank minimization approach. Additionally, the proposed framework involving both IRLS-IA and RWF algorithms provides an improved total system sum rate compared with the legacy approaches for the case of multiple femtocell networks.

  • Multiple Binary Codes for Fast Approximate Similarity Search

    Shinichi SHIRAKAWA  

     
    PAPER-Pattern Recognition

    Publicized: 2014/12/11
    Vol: E98-D No:3
    Page(s): 671-680

    One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
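The binary hashing and Hamming-distance comparison described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's learned hash functions or its multiple-code scheme: thresholding projections onto fixed hyperplanes is an assumed stand-in, and `binary_code`, `hamming_distance`, and `linear_scan` are hypothetical helper names.

```python
def binary_code(vector, hyperplanes):
    """Map a real-valued vector to an integer bit code, one bit per hyperplane."""
    code = 0
    for w in hyperplanes:
        dot = sum(wi * vi for wi, vi in zip(w, vector))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def hamming_distance(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

def linear_scan(query_code, database_codes, radius):
    """Return indices of codes within the given Hamming radius of the query."""
    return [i for i, c in enumerate(database_codes)
            if hamming_distance(query_code, c) <= radius]

# Assumed toy hyperplanes: the two coordinate axes.
hyperplanes = [[1.0, 0.0], [0.0, 1.0]]
print(binary_code([1.0, -1.0], hyperplanes))
print(linear_scan(0b1011, [0b1011, 0b0010, 0b1111], radius=1))
```

The linear scan shown here is the baseline the abstract compares against; a hash table lookup would instead enumerate all codes within the radius, and the number of such codes grows quickly with code length.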

  • Infrared Target Tracking Using Naïve-Bayes-Nearest-Neighbor

    Shujuan GAO  Insuk KIM  Seong Tae JHANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2014/11/18
    Vol: E98-D No:2
    Page(s): 471-474

    Robust yet efficient techniques for detecting and tracking targets in infrared (IR) images are a significant component of automatic target recognition (ATR) systems. In our previous works, we have proposed infrared target detection and tracking systems based on the sparse representation method. The proposed infrared target detection and tracking algorithms are based on sparse representation and Bayesian probabilistic techniques, respectively. In this paper, we adopt Naïve Bayes Nearest Neighbor (NBNN), which is an extremely simple, efficient algorithm that requires no training phase. State-of-the-art image classification techniques need a comprehensive learning and training step (e.g., using Boosting, SVM, etc.). In contrast, non-parametric Nearest Neighbor based image classifiers need no training time and also have other advantageous properties. Results of tracking in infrared sequences demonstrated that our algorithm is robust to illumination changes, and the tracking algorithm was found to be suitable for real-time tracking of a moving target in infrared sequences, and its performance was quite good.

  • Random Forest Algorithm for Linked Data Using a Parallel Processing Environment

    Dongkyu JEON  Wooju KIM  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2014/11/12
    Vol: E98-D No:2
    Page(s): 372-380

    In recent years, there has been a significant growth in the importance of data mining of graph-structured data due to this technology's rapid increase in both scale and application areas. Many previous studies have investigated decision tree learning on Semantic Web-based linked data to uncover implicit knowledge. In the present paper, we suggest a new random forest algorithm for linked data to overcome the underlying limitations of the decision tree algorithm, such as local optimal decisions and generalization error. Moreover, we designed a parallel processing environment for random forest learning to manage large-size linked data and increase the efficiency of multiple tree generation. For this purpose, we modified the previous candidate feature searching method of the decision tree algorithm for linked data to reduce the feature searching space of random forest learning and developed feature selection methods that are adjusted to linked data. Using a distributed index-based search engine, we designed a parallel random forest learning system for linked data to generate random forests in parallel. Our proposed system enables users to simultaneously generate multiple decision trees from distributed stored linked data. To evaluate the performance of the proposed algorithm, we performed experiments to compare its classification accuracy with that of the single decision tree algorithm. The experimental results revealed that our random forest algorithm is more accurate than the single decision tree algorithm.

  • Nearest Neighbor Search with the Revised TLAESA

    Dong WANG  Hiroyuki MITSUHARA  Masami SHISHIBORI  

     
    PAPER

    Vol: E98-D No:1
    Page(s): 65-77

    It is important to develop better search methods to handle the rapidly increasing volume of multimedia data. For NN (Nearest Neighbor) search in metric spaces, the TLAESA (Tree Linear Approximating and Eliminating Search Algorithm) is a state-of-the-art fast search method. In this paper, a method is proposed to improve the TLAESA by revising the tree structure with an optimal number of selected global pivots in the higher levels as representatives and employing the best-first search strategy. Based on an improved version of the TLAESA that succeeds in using the best-first search strategy to greatly reduce the distance calculations, this method addresses the drawback that fewer calculations come at the price of a lower pruning rate of branches. The lower pruning rate can further lead to lower search efficiency, because the priority queue used in the adopted best-first search strategy stores the information of the visited but unpruned nodes, and needs to be frequently accessed and sorted. In order to enhance the pruning rate of branches, the improved method tries to locate more selected global pivots in the higher levels of the search tree as representatives. As more real distances instead of lower bound estimations of the node-representatives are used for approximating the closest node and for “branch and bound”, not only can the nodes close to the query object be evaluated more effectively, but the pruning rate of branches can also be enhanced. Experiments show that for k-NN queries in Euclidean space, with a proper pivot selection strategy the proposed method can reach the same fewest distance calculations as the LAESA (Linear Approximating and Eliminating Search Algorithm), which saves more calculations than the TLAESA, and can achieve a higher search efficiency than the TLAESA.

  • Block Adaptive Algorithm for Signal Declipping Based on Null Space Alternating Optimization

    Tomohiro TAKAHASHI  Kazunori URUMA  Katsumi KONISHI  Toshihiro FURUKAWA  

     
    LETTER-Speech and Hearing

    Publicized: 2014/10/06
    Vol: E98-D No:1
    Page(s): 206-209

    This letter deals with the signal declipping algorithm based on the matrix rank minimization approach, which can be applied to the signal restoration in linear systems. We focus on the null space of a low-rank matrix and provide a block adaptive algorithm of the matrix rank minimization approach to signal declipping based on the null space alternating optimization (NSAO) algorithm. Numerical examples show that the proposed algorithm is faster and has better performance than other algorithms.

  • Disaster Recovery for Transport Network through Multiple Restoration Stages

    Shohei KAMAMURA  Daisaku SHIMAZAKI  Kouichi GENDA  Koji SASAYAMA  Yoshihiko UEMATSU  

     
    PAPER-Network System

    Vol: E98-B No:1
    Page(s): 171-179

    This paper proposes a disaster recovery method for transport networks. In a scenario of recovery from a disaster, a network is repaired through multiple restoration stages because repair resources are limited. In a practical case, a network should provide the reachability of important traffic in transient stages, even as service interruption risks and/or operational overheads caused by transport path switching are suppressed. We therefore define a multi-objective optimization problem: maximizing the traffic recovery ratio and minimizing the number of switched transport paths at each stage. We formulate our problem as a linear program, and show that it yields Pareto-optimal solutions of traffic recovery versus the number of switched paths. We also propose a heuristic algorithm for application to networks consisting of a few hundred nodes, and show that it can produce sub-optimal solutions that differ only slightly from optimal solutions.

  • Individual Restoration of Tampered Pixels for Statistical Fragile Watermarking

    Maki YOSHIDA  Kazuya OHKITA  Toru FUJIWARA  

     
    PAPER

    Vol: E98-D No:1
    Page(s): 58-64

    An important issue in fragile watermarking for images is to locate and restore the tampered pixels individually and accurately. This issue is resolved for concentrated tampering. In contrast, for diverse tampering, only localization is realized. This paper presents a restoration method for the most accurate scheme tolerant against diverse tampering. We analyze the error probability and experimentally confirm that the proposed method accurately restores the tampered pixels. We also show two variations based on the fact that the authentication data used for deriving the watermark is a maximum length sequence code.
