
Author Search Result

[Author] Ichiro IDE (14 hits)

1-14 of 14 hits
  • Estimation of the Attractiveness of Food Photography Based on Image Features

    Kazuma TAKAHASHI  Tatsumi HATTORI  Keisuke DOMAN  Yasutomo KAWANISHI  Takatsugu HIRAYAMA  Ichiro IDE  Daisuke DEGUCHI  Hiroshi MURASE  

     
    LETTER-Human-computer Interaction

      Publicized:
    2019/05/07
      Vol:
    E102-D No:8
      Page(s):
    1590-1593

    We introduce a method to estimate the attractiveness of a food photo. It extracts image features focusing on the appearances of 1) the entire food, and 2) the main ingredients. To estimate the attractiveness of an arbitrary food photo, these features are integrated in a regression scheme. We also constructed and released a food image dataset composed of images of ten food categories taken from 36 angles and accompanied by attractiveness values. Evaluation results showed the effectiveness of integrating the two kinds of image features.
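
    A minimal sketch of the feature-integration step described above, assuming the two kinds of image features (entire-food appearance and main-ingredient appearance) have already been extracted. The random placeholder features, variable names, and the choice of ridge regression are illustrative assumptions, not the authors' exact pipeline.

```python
# Hypothetical sketch: integrate two kinds of image features (whole-food
# appearance and main-ingredient appearance) in a regression scheme that
# predicts an attractiveness score.  Random vectors stand in for real features.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_photos, d_whole, d_ingr = 100, 64, 32

whole_feats = rng.normal(size=(n_photos, d_whole))      # placeholder "entire food" features
ingr_feats = rng.normal(size=(n_photos, d_ingr))        # placeholder "main ingredient" features
attractiveness = rng.uniform(1.0, 5.0, size=n_photos)   # placeholder subjective scores

X = np.hstack([whole_feats, ingr_feats])                # feature integration by concatenation
model = Ridge(alpha=1.0).fit(X, attractiveness)

# Estimate the attractiveness of an arbitrary (here, synthetic) food photo.
x_query = np.hstack([rng.normal(size=d_whole), rng.normal(size=d_ingr)])
print(model.predict(x_query.reshape(1, -1))[0])
```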

  • Human Wearable Attribute Recognition Using Probability-Map-Based Decomposition of Thermal Infrared Images

    Brahmastro KRESNARAMAN  Yasutomo KAWANISHI  Daisuke DEGUCHI  Tomokazu TAKAHASHI  Yoshito MEKADA  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Image

      Vol:
    E100-A No:3
      Page(s):
    854-864

    This paper addresses the attribute recognition problem, a field of research that is dominated by studies in the visible spectrum. Only a few works are available in the thermal spectrum, which is fundamentally different from the visible one. This research performs recognition specifically on wearable attributes, such as glasses and masks. These attributes are usually small compared with the human body, and the body itself shows large intra-class variation, so recognizing them is not an easy task. Our method utilizes a decomposition framework based on Robust Principal Component Analysis (RPCA) to extract the attribute information for recognition. However, because it is difficult to separate the body and the attributes without any prior knowledge, noise is also extracted along with the attributes, hampering the recognition capability. We therefore make use of prior knowledge, namely the location where each attribute is likely to be present. This knowledge, referred to as the Probability Map, is incorporated as a weight in the decomposition by RPCA. Using the Probability Map, we achieve an attribute-wise decomposition. The results show a significant improvement with this approach compared to the baseline, and the proposed method achieved the highest average performance with an F-score of 0.83.
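
    A rough sketch of the idea of weighting an RPCA-style low-rank plus sparse decomposition with a per-pixel Probability Map. The simple alternating singular-value/soft-thresholding scheme and all parameter choices below are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch: decompose a stack of thermal images M into a low-rank
# "body" part L and a sparse "attribute" part S, where the sparsity penalty is
# relaxed at pixels the Probability Map marks as likely attribute locations.
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def weighted_rpca(M, prob_map, lam=None, n_iter=50):
    """M: (n_pixels, n_images) data matrix; prob_map: (n_pixels,) in [0, 1]."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(M.shape))
    # Lower penalty where an attribute is likely -> easier to enter S.
    w = (1.0 - prob_map)[:, None] + 1e-3
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    mu = 1.0
    for _ in range(n_iter):
        # Low-rank update by singular value thresholding.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = U @ np.diag(soft_threshold(s, 1.0 / mu)) @ Vt
        # Sparse update with pixel-wise weighted soft thresholding.
        S = soft_threshold(M - L, lam * w / mu)
    return L, S

# Toy usage with random data standing in for thermal images.
rng = np.random.default_rng(0)
M = rng.normal(size=(256, 40))
prob_map = rng.uniform(size=256)
L, S = weighted_rpca(M, prob_map)
```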

  • Generation of Training Data by Degradation Models for Traffic Sign Symbol Recognition

    Hiroyuki ISHIDA  Tomokazu TAKAHASHI  Ichiro IDE  Yoshito MEKADA  Hiroshi MURASE  

     
    PAPER

      Vol:
    E90-D No:8
      Page(s):
    1134-1141

    We present a novel training method for recognizing traffic sign symbols. The symbol images captured by a car-mounted camera suffer from various forms of image degradation. To cope with these degradations, similarly degraded images should be used as training data. Our method artificially generates such training data from original templates of traffic sign symbols. Degradation models and a GA-based algorithm that simulates actual captured images are established. The proposed method enables us to obtain training data for all categories without exhaustively collecting them. Experimental results show the effectiveness of the proposed method for traffic sign symbol recognition.
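
    A minimal sketch of generating degraded training images from a clean template. The specific degradations (rotation, blur, brightness change, noise) and their parameter ranges are illustrative assumptions, not the paper's GA-optimized degradation models.

```python
# Hypothetical sketch: synthesize degraded variants of a clean traffic sign
# template so a classifier can be trained without exhaustive data collection.
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def degrade(template, blur_sigma, brightness, angle, noise_std, rng):
    """Apply simple degradation models to an 8-bit grayscale template."""
    img = rotate(template.astype(float), angle, reshape=False, mode="nearest")
    img = gaussian_filter(img, sigma=blur_sigma)       # defocus / motion smear
    img = img * brightness                             # illumination change
    img = img + rng.normal(0.0, noise_std, img.shape)  # sensor noise
    return np.clip(img, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
template = np.full((64, 64), 255, dtype=np.uint8)      # placeholder template image
training_set = [
    degrade(template,
            blur_sigma=rng.uniform(0.5, 3.0),
            brightness=rng.uniform(0.5, 1.2),
            angle=rng.uniform(-10, 10),
            noise_std=rng.uniform(2, 15),
            rng=rng)
    for _ in range(100)
]
```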

  • Pedestrian Detectability Estimation Considering Visual Adaptation to Drastic Illumination Change

    Yuki IMAEDA  Takatsugu HIRAYAMA  Yasutomo KAWANISHI  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2018/02/20
      Vol:
    E101-D No:5
      Page(s):
    1457-1461

    We propose an estimation method of pedestrian detectability considering the driver's visual adaptation to drastic illumination changes, which has not been studied in previous works. We assume that the driver's visual characteristics change in proportion to the elapsed time after an illumination change. In this paper, as a solution, we construct multiple estimators corresponding to different elapsed periods, and estimate the detectability by switching them according to the elapsed period. To evaluate the proposed method, we construct an experimental setup to present a participant with illumination changes and conduct a preliminary simulated experiment to measure and estimate the pedestrian detectability according to the elapsed period. Results show that the proposed method can estimate the detectability accurately even after a drastic illumination change.
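
    A minimal sketch of the switching idea described above: one detectability estimator per elapsed-time bin after the illumination change, selected at query time. The bin boundaries, features, regressor choice, and synthetic training data are illustrative assumptions.

```python
# Hypothetical sketch: train one detectability estimator per elapsed-time bin
# after an illumination change, then switch estimators according to the
# elapsed time of the query.
import numpy as np
from sklearn.linear_model import LinearRegression

bins = [(0.0, 2.0), (2.0, 5.0), (5.0, float("inf"))]  # seconds since illumination change

rng = np.random.default_rng(0)
estimators = []
for _ in bins:
    # Placeholder training data: scene features -> measured detectability.
    X = rng.normal(size=(50, 8))
    y = rng.uniform(0.0, 1.0, size=50)
    estimators.append(LinearRegression().fit(X, y))

def estimate_detectability(features, elapsed_time):
    """Pick the estimator whose elapsed-time bin contains the query."""
    for (lo, hi), est in zip(bins, estimators):
        if lo <= elapsed_time < hi:
            return est.predict(features.reshape(1, -1))[0]

print(estimate_detectability(rng.normal(size=8), elapsed_time=1.2))
```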

  • Resilient Self-Sizing ATM Network Operation and Its Evaluation

    Hiroyoshi MIWA  Jiro YAMADA  Ichiro IDE  Toyofumi TAKENAKA  

     
    PAPER-Communication Networks and Services

      Vol:
    E81-B No:10
      Page(s):
    1789-1796

    A new traffic engineering and operation scheme for ATM networks is described, which features adaptive virtual path (VP) bandwidth control and VP network reconfiguration capabilities. We call this operation system resilient self-sizing operation. By making full use of self-sizing network (SSN) capabilities, we can operate an ATM network efficiently and maintain high robustness against traffic demand fluctuations and network failures, while reducing operating costs. In a multimedia environment, the variety of services and the unpredictability of traffic demand make network traffic management a very challenging problem. SSNs, which are defined as ATM networks with self-sizing traffic engineering and operation capability, are expected to overcome these difficulties. This paper proposes VP network operation methods for self-sizing networks that achieve high flexibility and survivability. The VP network operation is composed of adaptive VP bandwidth control to absorb changes in traffic demand, VP rerouting control to recover from failures, and VP network reconfiguration control to optimize the network. The combination of these controls achieves good performance in terms of flexibility and survivability.
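
    A toy sketch of the adaptive VP bandwidth idea: periodically resize each virtual path from measured traffic while respecting link capacity. The update rule, head-room margin, and capacity value are illustrative assumptions, not the paper's control algorithm.

```python
# Hypothetical sketch: adapt each virtual path's bandwidth to measured demand
# with a safety margin, scaling down proportionally when the link capacity
# would be exceeded.
LINK_CAPACITY = 600.0   # Mbit/s, illustrative
MARGIN = 1.2            # head-room factor over measured demand

def resize_vps(measured_demand):
    """measured_demand: dict vp_id -> measured traffic (Mbit/s)."""
    desired = {vp: demand * MARGIN for vp, demand in measured_demand.items()}
    total = sum(desired.values())
    if total > LINK_CAPACITY:                      # not enough capacity:
        scale = LINK_CAPACITY / total              # shrink all VPs proportionally
        desired = {vp: bw * scale for vp, bw in desired.items()}
    return desired

print(resize_vps({"vp1": 120.0, "vp2": 300.0, "vp3": 150.0}))
```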

  • Efficient Tracking of News Topics Based on Chronological Semantic Structures in a Large-Scale News Video Archive

    Ichiro IDE  Tomoyoshi KINOSHITA  Tomokazu TAKAHASHI  Hiroshi MO  Norio KATAYAMA  Shin'ichi SATOH  Hiroshi MURASE  

     
    PAPER-Video Processing

      Vol:
    E95-D No:5
      Page(s):
    1288-1300

    Recent advances in digital storage technology have enabled us to archive a large volume of video data. Thanks to this trend, we have archived more than 1,800 hours of video data from a daily Japanese news show over the last ten years. When considering the effective use of such a large news video archive, we assume that analysis of its chronological and semantic structure becomes important. We also consider that presenting the development of news topics helps users understand current affairs better than providing a list of relevant news stories, as most current news video retrieval systems do. Therefore, in this paper, we propose a structuring method for a news video archive, together with an interface that visualizes the structure, so that users can efficiently track the development of news topics according to their interests. The proposed news video structure, namely the “topic thread structure”, is obtained as a result of an analysis of the chronological and semantic relations between news stories. Meanwhile, the proposed interface, namely “mediaWalker II”, allows users to track the development of news topics along the topic thread structure, and at the same time watch the video footage corresponding to each news story. Analyses of the topic thread structures obtained by applying the proposed method to actual news video footage revealed interesting and comprehensible relations between news topics in the real world. At the same time, analyses of their size quantified the efficiency of tracking a user's topic of interest based on the proposed topic thread structure. We consider this a first step towards facilitating video authoring by users based on existing contents in a large-scale news video archive.
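
    A small sketch of one way to chain chronologically ordered news stories into topic threads by linking each story to earlier, semantically similar ones. The TF-IDF representation, similarity threshold, and backward-linking rule are illustrative assumptions rather than the paper's actual structuring method.

```python
# Hypothetical sketch: build a "topic thread" graph by linking each news story
# to earlier stories whose text is sufficiently similar.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

stories = [  # chronologically ordered (placeholder texts)
    "Election campaign starts in the capital",
    "Candidates debate economic policy during the campaign",
    "Typhoon approaches the southern coast",
    "Election results announced after a close race",
]

tfidf = TfidfVectorizer().fit_transform(stories)
sim = cosine_similarity(tfidf)

THRESHOLD = 0.1
threads = []                   # edges (earlier_story, later_story) forming topic threads
for j in range(1, len(stories)):
    for i in range(j):         # only link backwards in time
        if sim[i, j] >= THRESHOLD:
            threads.append((i, j))

print(threads)
```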

  • FOREWORD Open Access

    Ichiro IDE  Yoko YAMAKATA  

     
    FOREWORD

      Vol:
    E102-D No:7
      Page(s):
    1228-1229

  • Using Super-Pixels and Human Probability Map for Automatic Human Subject Segmentation

    Esmaeil POURJAM  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Image

      Vol:
    E99-A No:5
      Page(s):
    943-953

    Human body segmentation has many applications in a wide variety of image processing tasks, from intelligent vehicles to entertainment. A substantial amount of research has been done in the field of segmentation, and it is still one of the active research areas, resulting in the introduction of many innovative methods in the literature. Still, a method that can overcome the problems of human segmentation and adapt itself to different kinds of situations has not been introduced. Many methods today try to use the graph-cut framework to solve the segmentation problem. Although powerful, these methods rely on a distance penalty term (intensity difference or RGB color distance). This term does not always lead to a good separation between two regions. For example, if two regions are close in color, they will be grouped together even if they belong to two different objects, which is not acceptable. Also, if one object has multiple parts with different colors, e.g., humans wear various clothes with different colors and patterns, each part will be segmented separately. Although this can be overcome by multiple inputs from the user, the inherent problem would not be solved. In this paper, we solve the problem by making use of a human probability map, super-pixels, and the GrabCut framework. Using this map relieves us of the need to match a model to the actual body, and thus helps to improve the segmentation accuracy. As a result, not only has the accuracy improved, but it has also become comparable to that of state-of-the-art interactive methods.
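
    A minimal sketch of seeding GrabCut from a human probability map instead of a user-drawn rectangle. The thresholds and the synthetic image and map are illustrative assumptions; the paper additionally uses super-pixels, which are omitted here.

```python
# Hypothetical sketch: initialize GrabCut's mask from a human probability map
# so no user interaction is required.
import numpy as np
import cv2

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(120, 80, 3), dtype=np.uint8)  # placeholder photo
prob_map = np.zeros((120, 80), dtype=float)                      # placeholder human probability map
prob_map[20:100, 20:60] = 0.9

# Seed the GrabCut mask: confident foreground/background where the map is
# extreme, "probable" labels elsewhere.
mask = np.full(prob_map.shape, cv2.GC_PR_BGD, dtype=np.uint8)
mask[prob_map > 0.5] = cv2.GC_PR_FGD
mask[prob_map > 0.8] = cv2.GC_FGD
mask[prob_map < 0.1] = cv2.GC_BGD

bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)
cv2.grabCut(image, mask, None, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_MASK)

segmentation = np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
print(segmentation.sum(), "foreground pixels")
```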

  • Forecasting Traffic Volumes for Intelligent Telecommunication Services Based on Service Characteristics

    Takeshi YADA  Isami NAKAJIMA  Ichiro IDE  Hideyo MURAKAMI  

     
    PAPER-Network Design, Operation, and Management

      Vol:
    E81-B No:12
      Page(s):
    2487-2494

    A method is proposed for deriving a traffic characteristics model that can be used to forecast the traffic volume for intelligent telecommunication services. Regression analysis with dummy variables is used to represent the service quantitatively and to construct the traffic characteristics model. Recursive least squares estimation, which is a special case of the Kalman filter, is applied to the traffic characteristics model to forecast the traffic volume. In the proposed modeling and forecasting, qualitative factors representing a certain service attribute are selected and, using an information criterion, the model with the best fit is identified as the most suitable forecasting model. Numerical results using practical observation data showed that the proposed method produces an accurate forecast and is thus effective for practical use.
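
    A compact sketch of recursive least squares applied to a regression model whose dummy (0/1) variables encode qualitative service attributes. The regressor layout, initial covariance, and placeholder observations are illustrative assumptions.

```python
# Hypothetical sketch: recursive least squares (a special case of the Kalman
# filter) updating regression coefficients as new traffic observations arrive.
import numpy as np

def rls_update(theta, P, x, y):
    """One RLS step: x is the regressor vector, y the observed traffic."""
    Px = P @ x
    k = Px / (1.0 + x @ Px)             # gain vector
    theta = theta + k * (y - x @ theta) # correct coefficients by the residual
    P = P - np.outer(k, Px)             # shrink the coefficient covariance
    return theta, P

n_params = 3                            # e.g. intercept + two dummy variables
theta = np.zeros(n_params)
P = np.eye(n_params) * 1e3              # large initial covariance

# Placeholder observations: [1, attribute_A, attribute_B] -> traffic volume.
observations = [(np.array([1.0, 1.0, 0.0]), 120.0),
                (np.array([1.0, 0.0, 1.0]), 80.0),
                (np.array([1.0, 1.0, 1.0]), 150.0)]
for x, y in observations:
    theta, P = rls_update(theta, P, x, y)

forecast = np.array([1.0, 1.0, 0.0]) @ theta   # forecast for one attribute pattern
print(forecast)
```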

  • Single Camera Vehicle Localization Using Feature Scale Tracklets

    David WONG  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Vision

      Vol:
    E100-A No:2
      Page(s):
    702-713

    Advances in intelligent vehicle systems have led to modern automobiles being able to aid drivers with tasks such as lane following and automatic braking. Such automated driving tasks increasingly require reliable ego-localization. Although a large number of sensors can be employed for this purpose, the use of a single camera remains one of the most appealing, but also one of the most challenging. GPS localization in urban environments may not be reliable enough for automated driving systems, and various combinations of range sensors and inertial navigation systems are often too complex and expensive for a consumer setup. Therefore, accurate localization with a single camera is a desirable goal. In this paper, we propose a method for vehicle localization using images captured from a single vehicle-mounted camera and a pre-constructed database. Image feature points are extracted, but the calculation of camera poses is not required; instead, we make use of the feature points' scales. For image feature-based localization methods, matching many features against candidate database images is time consuming, and database sizes can become large. Therefore, here we propose a method that constructs a database with pre-matched features of known good scale stability. This limits the number of unused and incorrectly matched features, and allows the database scales to be recorded as “tracklets”. These “feature scale tracklets” are used for fast image match voting based on scale comparison with the corresponding query image features. This process reduces the number of image-to-image matching iterations that need to be performed while improving the localization stability. We also present an analysis of the system performance using a dataset with high-accuracy ground truth. We demonstrate robust vehicle positioning even in challenging lane-change and real traffic situations.
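
    A toy sketch of the scale-comparison voting idea: each query feature's scale is compared against the recorded scale of the corresponding database tracklet at every candidate position, and the position collecting the most consistent votes wins. The data layout, tolerance, and synthetic scales are illustrative assumptions.

```python
# Hypothetical sketch: vote for the database position whose recorded feature
# scales best agree with the scales observed in the query image.
import numpy as np

# Placeholder "feature scale tracklets": the scale of each tracked feature at
# each database position along the route.
rng = np.random.default_rng(0)
n_positions, n_features = 50, 20
tracklets = rng.uniform(1.0, 8.0, size=(n_positions, n_features))

def localize(query_scales, tracklets, tol=0.5):
    """query_scales: scale of each matched feature in the query image."""
    votes = np.zeros(tracklets.shape[0], dtype=int)
    for pos in range(tracklets.shape[0]):
        agree = np.abs(tracklets[pos] - query_scales) < tol
        votes[pos] = np.count_nonzero(agree)   # one vote per consistent feature
    return int(np.argmax(votes)), votes

true_pos = 17
query_scales = tracklets[true_pos] + rng.normal(0.0, 0.1, size=n_features)
estimated_pos, _ = localize(query_scales, tracklets)
print(estimated_pos)   # should recover the position used to generate the query
```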

  • Construction of Appearance Manifold with Embedded View-Dependent Covariance Matrix for 3D Object Recognition

    Lina  Tomokazu TAKAHASHI  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Pattern Recognition

      Vol:
    E91-D No:4
      Page(s):
    1091-1100

    We propose the construction of an appearance manifold with an embedded view-dependent covariance matrix to recognize 3D objects which are influenced by geometric distortions and quality degradation effects. The appearance manifold is used to capture the pose variability, while the covariance matrix is used to learn the distribution of samples for gaining noise-invariance. However, since the appearance of an object in the captured image is different for every pose, the covariance matrix value is also different for every pose position. Therefore, it is important to embed view-dependent covariance matrices in the manifold of an object. We propose two models for constructing an appearance manifold with a view-dependent covariance matrix, called the View-dependent Covariance matrix by training-Point Interpolation (VCPI) and View-dependent Covariance matrix by Eigenvector Interpolation (VCEI) methods. Here, the embedded view-dependent covariance matrix of the VCPI method is obtained by interpolating each training point at one pose to the training points at a consecutive pose. Meanwhile, in the VCEI method, the embedded view-dependent covariance matrix is obtained by interpolating only the eigenvectors and eigenvalues, without considering the correspondences between training images. As it embeds the covariance matrix in the manifold, our view-dependent covariance matrix methods are robust to pose changes and are also noise invariant. Our main goal is to construct a robust and efficient manifold with an embedded view-dependent covariance matrix for recognizing objects from images which are influenced by various degradation effects.
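
    A toy sketch of the general idea of embedding pose-dependent covariance matrices along an appearance manifold and matching by Mahalanobis distance. The plain linear interpolation between two neighboring training poses is an illustrative simplification, not the VCPI/VCEI interpolation schemes themselves.

```python
# Hypothetical sketch: interpolate a view-dependent covariance matrix between
# two neighboring training poses and use it for Mahalanobis-distance matching
# against the interpolated manifold point.
import numpy as np

rng = np.random.default_rng(0)
dim = 16                                   # appearance feature dimensionality
poses = [0.0, 30.0]                        # two neighboring training poses (degrees)
means = [rng.normal(size=dim) for _ in poses]       # manifold points at each pose
covs = [np.eye(dim) * s for s in (0.5, 2.0)]        # per-pose sample covariances

def interpolate(pose):
    """Linearly interpolate mean and covariance between the two training poses."""
    t = (pose - poses[0]) / (poses[1] - poses[0])
    mean = (1 - t) * means[0] + t * means[1]
    cov = (1 - t) * covs[0] + t * covs[1]
    return mean, cov

def mahalanobis_distance(x, pose):
    mean, cov = interpolate(pose)
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

x = means[0] + rng.normal(0.0, 0.1, size=dim)   # noisy observation near pose 0
print(mahalanobis_distance(x, pose=10.0))
```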

  • Cross-Pose Face Recognition – A Virtual View Generation Approach Using Clustering Based LVTM

    Xi LI  Tomokazu TAKAHASHI  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Face Perception and Recognition

      Vol:
    E96-D No:3
      Page(s):
    531-537

    This paper presents an approach to cross-pose face recognition by virtual view generation using an appearance-clustering-based local view transition model. Previously, the traditional global pattern based view transition model (VTM) method was extended to its local version, called LVTM, which learns the linear transformation of pixel values between frontal and non-frontal image pairs from training images using a partial image in a small region for each location, instead of transforming the entire image pattern. In this paper, we show that the accuracy of the appearance transition model and the recognition rate can be further improved by better exploiting the inherent linear relationship between frontal-nonfrontal face image patch pairs. This is based on the observation that variations in appearance caused by pose are closely related to the corresponding 3D structure; intuitively, frontal-nonfrontal patch pairs from more similar local 3D face structures should have a stronger linear relationship. Thus, for each specific location, instead of learning a common transformation as in the LVTM, the corresponding local patches are first clustered based on an appearance similarity distance metric, and then the transition models are learned separately for each cluster. In the testing stage, each local patch of the input non-frontal probe image is transformed using the learned local view transition model corresponding to the most visually similar cluster. The experimental results on a real-world face dataset demonstrated the superiority of the proposed method in terms of recognition rate.
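
    A small sketch of the clustering-based local view transition idea for one patch location: cluster the non-frontal training patches by appearance and learn a separate linear transform to the frontal patch for each cluster. The k-means clustering, least-squares transform, and synthetic patches are illustrative assumptions.

```python
# Hypothetical sketch: per patch location, cluster non-frontal patches by
# appearance and learn one linear transition model (non-frontal -> frontal)
# per cluster; at test time, apply the model of the most similar cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_pairs, d = 200, 25                        # training pairs, patch dimensionality
nonfrontal = rng.normal(size=(n_pairs, d))  # placeholder non-frontal patches
frontal = nonfrontal @ rng.normal(size=(d, d)) * 0.5 + rng.normal(size=(n_pairs, d)) * 0.05

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(nonfrontal)

# One least-squares transition matrix per appearance cluster.
transforms = {}
for c in range(kmeans.n_clusters):
    idx = kmeans.labels_ == c
    W, *_ = np.linalg.lstsq(nonfrontal[idx], frontal[idx], rcond=None)
    transforms[c] = W

def to_frontal(patch):
    """Transform a probe patch with the model of its most similar cluster."""
    c = int(kmeans.predict(patch.reshape(1, -1))[0])
    return patch @ transforms[c]

print(to_frontal(rng.normal(size=d))[:3])
```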

  • Incremental Unsupervised-Learning of Appearance Manifold with View-Dependent Covariance Matrix for Face Recognition from Video Sequences

    Lina  Tomokazu TAKAHASHI  Ichiro IDE  Hiroshi MURASE  

     
    PAPER-Pattern Recognition

      Vol:
    E92-D No:4
      Page(s):
    642-652

    We propose an appearance manifold with a view-dependent covariance matrix for face recognition from video sequences in two learning frameworks: supervised learning and incremental unsupervised learning. The advantages of this method are as follows. First, the appearance manifold with view-dependent covariance matrix model is robust to pose changes and is also noise invariant, since the embedded covariance matrices are calculated based on their poses in order to learn the samples' distributions along the manifold. Moreover, the proposed incremental unsupervised-learning framework is more realistic for real-world face recognition applications, since it is difficult to collect large amounts of face sequences covering complete poses (from the left side view to the right side view) for training. Here, the incremental unsupervised-learning framework allows us to train the system with the available initial sequences, and later update the system's knowledge incrementally every time an unlabelled sequence is input. In addition, we also integrate the appearance manifold with view-dependent covariance matrix model with a pose estimation system to improve the classification accuracy and to easily detect sequences with overlapping poses for the merging process in the incremental unsupervised-learning framework. The experimental results showed that, in both frameworks, the proposed appearance manifold with view-dependent covariance matrix method could recognize faces from video sequences accurately.
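
    A toy sketch of the incremental unsupervised-learning loop: an unlabelled input sequence is either merged into the closest existing class model or used to start a new one. The running-mean class model and the fixed distance threshold are illustrative simplifications of the manifold update described above.

```python
# Hypothetical sketch: incrementally grow a set of class models from unlabelled
# face sequences.  Each class is kept as a running mean of sequence features;
# a new sequence is merged into the nearest class or starts a new one.
import numpy as np

THRESHOLD = 3.0           # illustrative merge threshold
classes = []              # list of dicts: {"mean": vector, "count": int}

def update(sequence_feature):
    if classes:
        dists = [np.linalg.norm(c["mean"] - sequence_feature) for c in classes]
        i = int(np.argmin(dists))
        if dists[i] < THRESHOLD:              # merge into the existing class
            c = classes[i]
            c["count"] += 1
            c["mean"] += (sequence_feature - c["mean"]) / c["count"]
            return i
    classes.append({"mean": sequence_feature.copy(), "count": 1})
    return len(classes) - 1                   # a new class was created

rng = np.random.default_rng(0)
for _ in range(10):                           # placeholder sequence features
    update(rng.normal(size=8))
print(len(classes), "classes learned")
```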

  • Attribute-Aware Loss Function for Accurate Semantic Segmentation Considering the Pedestrian Orientations Open Access

    Mahmud Dwi SULISTIYO  Yasutomo KAWANISHI  Daisuke DEGUCHI  Ichiro IDE  Takatsugu HIRAYAMA  Jiang-Yu ZHENG  Hiroshi MURASE  

     
    PAPER

      Vol:
    E103-A No:1
      Page(s):
    231-242

    Numerous applications such as autonomous driving, satellite imagery sensing, and biomedical imaging use computer vision as an important tool for perception tasks. For Intelligent Transportation Systems (ITS), it is required to precisely recognize and locate scenes in sensor data. Semantic segmentation is one of the computer vision methods intended to perform such tasks. However, existing semantic segmentation tasks label each pixel with a single object's class. Recognizing object attributes, e.g., pedestrian orientation, is more informative and helps achieve a better scene understanding. Thus, we propose a method to perform semantic segmentation and pedestrian attribute recognition simultaneously. We introduce an attribute-aware loss function that can be applied to an arbitrary base model. Furthermore, a re-annotation of the existing Cityscapes dataset enriches the ground-truth labels by annotating the attributes of pedestrian orientation. We implement the proposed method and compare the experimental results with others. The attribute-aware semantic segmentation shows the ability to outperform baseline methods both in the traditional object segmentation task and in the expanded attribute detection task.
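
    A minimal sketch of the general idea of an attribute-aware loss: a per-pixel segmentation cross-entropy plus a weighted cross-entropy over pedestrian-orientation attributes evaluated only on pedestrian pixels. The shapes, weighting, and toy tensors are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: combine per-pixel class cross-entropy with an
# attribute (pedestrian orientation) cross-entropy on pedestrian pixels.
import numpy as np

def cross_entropy(prob, target):
    """Mean cross-entropy; prob: (..., K) softmax outputs, target: (...,) ints."""
    p = np.take_along_axis(prob, target[..., None], axis=-1).squeeze(-1)
    return float(-np.mean(np.log(p + 1e-12)))

def attribute_aware_loss(class_prob, class_gt, attr_prob, attr_gt, ped_mask, w=1.0):
    """Segmentation loss + w * orientation loss restricted to pedestrian pixels."""
    seg_loss = cross_entropy(class_prob, class_gt)
    attr_loss = cross_entropy(attr_prob[ped_mask], attr_gt[ped_mask]) if ped_mask.any() else 0.0
    return seg_loss + w * attr_loss

# Toy tensors: 4x4 image, 3 semantic classes, 4 orientation attributes.
rng = np.random.default_rng(0)
class_prob = rng.dirichlet(np.ones(3), size=(4, 4))
attr_prob = rng.dirichlet(np.ones(4), size=(4, 4))
class_gt = rng.integers(0, 3, size=(4, 4))
attr_gt = rng.integers(0, 4, size=(4, 4))
ped_mask = class_gt == 2                    # assume class 2 is "pedestrian"
print(attribute_aware_loss(class_prob, class_gt, attr_prob, attr_gt, ped_mask))
```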