
Author Search Result

[Author] Hideki NAKAYAMA (4 hits)

Results 1-4
  • Improving Noised Gradient Penalty with Synchronized Activation Function for Generative Adversarial Networks

    Rui YANG  Raphael SHU  Hideki NAKAYAMA  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2022/05/27
    Vol: E105-D No:9
    Page(s): 1537-1545

    Generative Adversarial Networks (GANs) are one of the most successful learning principles for generative models and have been widely applied to many generation tasks. The gradient penalty (GP) was originally applied in Wasserstein GAN to enforce Lipschitz continuity on the discriminator. Although the vanilla gradient penalty has since been modified for different purposes, seeking a better equilibrium and higher generation quality in adversarial learning remains challenging. Recently, DRAGAN was proposed to achieve local linearity in a surrounding data manifold by applying a noised gradient penalty that promotes local convexity in model optimization. However, we show that this approach imposes a burden on satisfying Lipschitz continuity for the discriminator. This conflict between Lipschitz continuity and local linearity in DRAGAN results in a poor equilibrium, and the generation quality is therefore far from ideal. To this end, we propose a novel approach that benefits both local linearity and Lipschitz continuity, reaching a better equilibrium without conflict. Specifically, we apply our synchronized activation function in the discriminator so that it receives a particular form of noised gradient penalty, achieving local linearity without losing the Lipschitz continuity of the discriminator. Experimental results show that our method produces images of superior quality and outperforms WGAN-GP, DiracGAN, and DRAGAN in terms of Inception Score and Fréchet Inception Distance on real-world datasets.
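    The WGAN-GP-style gradient penalty the abstract builds on penalizes the discriminator's input-gradient norm on samples interpolated between real and fake data. A minimal NumPy sketch, not the paper's implementation: it uses a hypothetical linear discriminator D(x) = w·x, whose input gradient is simply w, so the penalty has a closed form.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def gradient_penalty(w, real, fake, rng):
        """Mean of (||grad_x D(x_hat)|| - 1)^2 over random interpolates
        x_hat = eps * real + (1 - eps) * fake (WGAN-GP-style penalty)."""
        eps = rng.uniform(size=(real.shape[0], 1))
        x_hat = eps * real + (1.0 - eps) * fake   # interpolated samples
        # For the linear D(x) = w @ x, the gradient w.r.t. every x_hat is w,
        # so all per-sample gradient norms equal ||w||.
        grad_norms = np.full(x_hat.shape[0], np.linalg.norm(w))
        return np.mean((grad_norms - 1.0) ** 2)

    dim = 16
    w = rng.normal(size=dim)                      # illustrative discriminator weights
    real = rng.normal(size=(8, dim))
    fake = rng.normal(size=(8, dim))
    gp = gradient_penalty(w, real, fake, rng)
    ```

    With unit-norm weights the penalty vanishes, which is exactly the 1-Lipschitz behavior the penalty encourages.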

  • Dense Sampling Low-Level Statistics of Local Features

    Hideki NAKAYAMA  Tatsuya HARADA  Yasuo KUNIYOSHI  

     
    PAPER

    Vol: E93-D No:7
    Page(s): 1727-1736

    Generic image recognition techniques are widely studied for automatic image indexing. However, many of these methods are computationally too heavy for practically large setups. Thus, to realize scalability, it is important to properly balance the trade-off between performance and computational cost. In recent years, methods based on the bag-of-keypoints approach have been successful and widely used. However, the preprocessing cost of building visual words becomes immense on large-scale datasets. On the other hand, methods based on global image features have been used for a long time. Because global image features can be extracted rapidly, they are relatively easy to use with large datasets. However, the performance of global-feature methods is usually poor compared to bag-of-keypoints methods. This paper proposes a simple but powerful scheme for boosting the performance of global image features by densely sampling low-level statistical moments of local features. We also use a scalable learning and classification method that is substantially lighter than an SVM. Our method achieves performance comparable to state-of-the-art methods despite its remarkable simplicity.
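    The core idea, summarizing densely sampled local features by their low-level statistical moments rather than quantizing them into visual words, can be sketched as follows. This is an illustrative NumPy reconstruction, not the paper's code; patch and stride sizes are arbitrary choices.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def global_feature(image, patch=4, step=2):
        """Densely sample patch vectors and summarize them by their mean
        (first moment) and covariance (second moment), giving a fixed-length
        global image feature with no visual-word preprocessing step."""
        h, w = image.shape
        vecs = [image[i:i + patch, j:j + patch].ravel()
                for i in range(0, h - patch + 1, step)
                for j in range(0, w - patch + 1, step)]
        X = np.asarray(vecs)                  # (n_patches, patch * patch)
        mu = X.mean(axis=0)                   # first moment
        cov = np.cov(X, rowvar=False)         # second moment
        iu = np.triu_indices(cov.shape[0])    # covariance is symmetric,
        return np.concatenate([mu, cov[iu]])  # so keep only its upper triangle

    img = rng.normal(size=(16, 16))           # stand-in for a grayscale image
    feat = global_feature(img)                # 16 means + 136 covariance entries
    ```

    Because the feature length depends only on the patch size, every image maps to the same fixed-length vector, which is what makes this usable with fast linear classifiers at scale.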

  • Efficient Two-Step Middle-Level Part Feature Extraction for Fine-Grained Visual Categorization

    Hideki NAKAYAMA  Tomoya TSUDA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2016/02/23
    Vol: E99-D No:6
    Page(s): 1626-1634

    Fine-grained visual categorization (FGVC) has drawn increasing attention as an emerging research field in recent years. In contrast to generic-domain visual recognition, FGVC is characterized by high intra-class and subtle inter-class variations. To distinguish conceptually and visually similar categories, highly discriminative visual features must be extracted. Moreover, FGVC is highly specialized and task-specific in nature, so it is not always easy to obtain a sufficiently large-scale training dataset. Therefore, the key to success in practical FGVC systems is to efficiently exploit discriminative features from a limited number of training examples. In this paper, we propose an efficient two-step dimensionality compression method to derive compact middle-level part-based features. To this end, we compare space-first and feature-first convolution schemes and investigate their effectiveness. Our approach is based on simple linear algebra with analytic solutions, and it is highly scalable compared with current one-vs-one or one-vs-all approaches, making it possible to quickly train middle-level features from many pairwise part regions. We experimentally show the effectiveness of our method on the standard Caltech-Birds and Stanford-Cars datasets.
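    A two-step, analytically solvable dimensionality compression in the "feature-first" spirit can be sketched with plain PCA (eigendecomposition of the covariance). The shapes and the use of PCA here are illustrative assumptions, not the paper's exact pipeline.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def pca_project(X, k):
        """Analytic PCA projection: eigendecomposition of the sample
        covariance, keeping the top-k principal directions."""
        Xc = X - X.mean(axis=0)
        cov = Xc.T @ Xc / (len(X) - 1)
        vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
        W = vecs[:, ::-1][:, :k]              # top-k principal directions
        return Xc @ W

    # Hypothetical part features: n parts, each an (s x d) grid of local
    # descriptors (s spatial positions, d feature channels).
    n, s, d = 200, 6, 32
    parts = rng.normal(size=(n, s, d))

    # Step 1 ("feature-first"): compress the feature axis of every position.
    step1 = pca_project(parts.reshape(n * s, d), 8).reshape(n, s * 8)
    # Step 2: compress the concatenated spatial responses of each part.
    step2 = pca_project(step1, 16)
    ```

    Both steps reduce to a single eigendecomposition each, which is why this kind of scheme scales to many pairwise part regions without iterative training.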

  • Two-Path Object Knowledge Injection for Detecting Novel Objects With Single-Stage Dense Detector

    KuanChao CHU  Hideki NAKAYAMA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2023/08/02
    Vol: E106-D No:11
    Page(s): 1868-1880

    We present an effective system for integrating generative zero-shot classification modules into a YOLO-like dense detector to detect novel objects. Most two-stage novel object detection methods work by refining the classification output branch, an approach that cannot be applied to a dense detector. Our system uses two paths to inject knowledge of novel objects into a dense detector. The first path injects class confidences for novel classes from a classifier trained on data synthesized via a dual-step generator; this generator learns a mapping function between two feature spaces, resulting in better classification performance. The second path re-trains the detector head with feature maps synthesized at different intensity levels. This approach significantly increases the predicted objectness for novel objects, which is a major challenge for a dense detector. We also introduce a stop-and-reload mechanism during re-training to optimize across head layers and better learn the synthesized features. Our method relaxes the constraint on the detector head architecture imposed by the previous method and markedly improves performance on the MSCOCO dataset.
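    The first injection path, widening each anchor's class-confidence vector with scores from a separate novel-class classifier, can be illustrated with a toy NumPy sketch. All shapes and names here are hypothetical; the paper's actual fusion is part of a full detection pipeline.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # A dense detector scores A anchors over B base classes; a separately
    # trained zero-shot classifier scores the same anchors over N novel
    # classes. Injection simply widens each anchor's confidence vector.
    A, B, N = 100, 80, 5
    base_scores = rng.uniform(size=(A, B))    # detector's base-class confidences
    novel_scores = rng.uniform(size=(A, N))   # classifier's novel-class confidences

    merged = np.concatenate([base_scores, novel_scores], axis=1)  # (A, B + N)
    pred_class = merged.argmax(axis=1)        # anchors can now predict novel classes
    ```

    The point of this path is that the detector's architecture is untouched: only the per-anchor classification output is extended, which is what makes it applicable to a single-stage dense head.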