The search functionality is under construction.

Author Search Result

[Author] Soh YOSHIDA (8 hits)

1-8 of 8 hits
  • Image Regularization with Total Variation and Optimized Morphological Gradient Priors

    Shoya OOHARA  Mitsuji MUNEYASU  Soh YOSHIDA  Makoto NAKASHIZUKA  

     
    LETTER-Image

      Vol:
    E102-A No:12
      Page(s):
    1920-1924

    For image restoration, an image prior that is obtained from the morphological gradient has been proposed. In the field of mathematical morphology, the optimization of the structuring element (SE) used for this morphological gradient using a genetic algorithm (GA) has also been proposed. In this paper, we introduce a new image prior that is the sum of the morphological gradients and total variation for an image restoration problem to improve the restoration accuracy. The proposed image prior makes it possible to almost match the fitness to a quantitative evaluation such as the mean square error. It also solves the problem of the artifact due to the unsuitability of the SE for the image. An experiment shows the effectiveness of the proposed image restoration method.
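
As a rough illustration of the kind of prior this abstract describes, the sketch below sums a morphological-gradient term (dilation minus erosion under a flat structuring element) and an anisotropic total-variation term. The function names, the cross-shaped structuring element, and the `tv_weight` parameter are assumptions made for this example; the abstract does not specify the exact formulation or how the structuring element is optimized.

```python
def total_variation(img):
    """Anisotropic TV: sum of absolute differences between adjacent pixels."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                tv += abs(img[y][x + 1] - img[y][x])
            if y + 1 < h:
                tv += abs(img[y + 1][x] - img[y][x])
    return tv


def morphological_gradient_sum(img, se_offsets):
    """Sum over pixels of (dilation - erosion) for a flat structuring element.

    se_offsets is a list of (dy, dx) offsets and is assumed to contain (0, 0),
    so every pixel has at least one sample inside the image.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            vals = [img[y + dy][x + dx] for dy, dx in se_offsets
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            total += max(vals) - min(vals)
    return total


def combined_prior(img, se_offsets, tv_weight=1.0):
    """Hypothetical combined prior: morphological gradient plus weighted TV."""
    return morphological_gradient_sum(img, se_offsets) + tv_weight * total_variation(img)


# Cross-shaped structuring element (an assumption, not the optimized SE).
CROSS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]
step_image = [[0, 0, 1, 1],
              [0, 0, 1, 1]]
print(combined_prior(step_image, CROSS))  # 6.0 for this step edge
```

A flat image gives a prior value of zero, while the step edge above is penalized by both terms, which is the behavior a restoration objective would rely on.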

  • Heterogeneous-Graph-Based Video Search Reranking Using Topic Relevance

    Soh YOSHIDA  Mitsuji MUNEYASU  Takahiro OGAWA  Miki HASEYAMA  

     
    PAPER-Vision

      Vol:
    E103-A No:12
      Page(s):
    1529-1540

    In this paper, we address the problem of analyzing topics, included in a social video group, to improve the retrieval performance of videos. Unlike previous methods that focused on an individual visual aspect of videos, the proposed method aims to leverage the “mutual reinforcement” of heterogeneous modalities such as tags and users associated with video on the Internet. To represent multiple types of relationships between each heterogeneous modality, the proposed method constructs three subgraphs: user-tag, video-video, and video-tag graphs. We combine the three types of graphs to obtain a heterogeneous graph. Then the extraction of latent features, i.e., topics, becomes feasible by applying graph-based soft clustering to the heterogeneous graph. By estimating the membership of each grouped cluster for each video, the proposed method defines a new video similarity measure. Since the understanding of video content is enhanced by exploiting latent features obtained from different types of data that complement each other, the performance of visual reranking is improved by the proposed method. Results of experiments on a video dataset that consists of YouTube-8M videos show the effectiveness of the proposed method, which achieves a 24.3% improvement in terms of the mean normalized discounted cumulative gain in a search ranking task compared with the baseline method.
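
The evaluation metric named in this abstract, normalized discounted cumulative gain (nDCG), can be sketched as follows. This version assumes linear gain and a log2 rank discount; the abstract does not say which nDCG variant or cutoff was used, so treat the formula details as assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mean_ndcg(rankings):
    """Average nDCG over several queries (each a list of graded relevances)."""
    return sum(ndcg(r) for r in rankings) / len(rankings)


# Relevance grades listed in ranked order; an already ideal ranking scores 1.0.
print(ndcg([3, 2, 1]))
```

A reranking method improves this score when it moves highly relevant videos toward the top of the list.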

  • Data Extraction Method from Printed Images with Different Formats

    Mitsuji MUNEYASU  Nayuta JINDA  Yuuya MORITANI  Soh YOSHIDA  

     
    LETTER-Image Processing

      Vol:
    E100-A No:11
      Page(s):
    2355-2357

    In this paper, we propose a method of embedding and detecting data in printed images with several formats, such as different resolutions and numbers of blocks, using the camera of a tablet device. To specify the resolution of an image and the number of blocks, invisible markers that are embedded in the amplitude domain of the discrete Fourier transform of the target image are used. The proposed method can increase the variety of images suitable for data embedding.

  • New Performance Evaluation Method for Data Embedding Techniques for Printed Images Using Mobile Devices Based on a GAN

    Masahiro YASUDA  Soh YOSHIDA  Mitsuji MUNEYASU  

     
    LETTER

      Publicized:
    2022/08/23
      Vol:
    E106-A No:3
      Page(s):
    481-485

    Methods that embed data into printed images and retrieve data from printed images captured using the camera of a mobile device have been proposed. Evaluating these methods requires printing and capturing actual embedded images, which is burdensome. In this paper, we propose a method for reducing the workload for evaluating the performance of data embedding algorithms by simulating the degradation caused by printing and capturing images using generative adversarial networks. The proposed method can represent various captured conditions. Experimental results demonstrate that the proposed method achieves the same accuracy as detecting embedded data under actual conditions.

  • U-Net Architecture for Ancient Handwritten Chinese Character Detection in Han Dynasty Wooden Slips

    Hojun SHIMOYAMA  Soh YOSHIDA  Takao FUJITA  Mitsuji MUNEYASU  

     
    PAPER-Image

      Publicized:
    2023/05/15
      Vol:
    E106-A No:11
      Page(s):
    1406-1415

    Recent character detectors have been modeled using deep neural networks and have achieved high performance in various tasks, such as text detection in natural scenes and character detection in historical documents. However, existing methods cannot achieve high detection accuracy for wooden slips because of their multi-scale character sizes and aspect ratios, high character density, and close character-to-character distance. In this study, we propose a new U-Net-based character detection and localization framework that learns character regions and boundaries between characters. The proposed method enhances the learning performance of character regions by simultaneously learning the vertical and horizontal boundaries between characters. Furthermore, by adding simple and low-cost post-processing using the learned regions of character boundaries, it is possible to more accurately detect the location of a group of characters in a close neighborhood. In this study, we construct a wooden slip dataset. Experiments demonstrated that the proposed method outperformed existing character detection methods, including state-of-the-art character detection methods for historical documents.

  • Unbiased Pseudo-Labeling for Learning with Noisy Labels

    Ryota HIGASHIMOTO  Soh YOSHIDA  Takashi HORIHATA  Mitsuji MUNEYASU  

     
    LETTER

      Publicized:
    2023/09/19
      Vol:
    E107-D No:1
      Page(s):
    44-48

    Noisy labels in training data can significantly harm the performance of deep neural networks (DNNs). Recent research on learning with noisy labels uses a property of DNNs called the memorization effect to divide the training data into a set of data with reliable labels and a set of data with unreliable labels. Methods introducing semi-supervised learning strategies discard the unreliable labels and assign pseudo-labels generated from the confident predictions of the model. So far, this semi-supervised strategy has yielded the best results in this field. However, we observe that even when models are trained on balanced data, the distribution of the pseudo-labels can still exhibit an imbalance that is driven by data similarity. Additionally, a data bias is seen that originates from the division of the training data using the semi-supervised method. If we address both types of bias that arise from pseudo-labels, we can avoid the decrease in generalization performance caused by biased noisy pseudo-labels. We propose a learning method with noisy labels that introduces unbiased pseudo-labeling based on causal inference. The proposed method achieves significant accuracy gains in experiments at high noise rates on the standard benchmarks CIFAR-10 and CIFAR-100.
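
The data-division and pseudo-labeling steps described in this abstract can be sketched in simplified form. The fixed loss threshold, the confidence cutoff, and all function names below are assumptions for illustration; the abstract does not give the actual division rule, and the causal-inference debiasing step is not reproduced here.

```python
from collections import Counter


def split_by_loss(losses, threshold):
    """Treat small-loss samples as reliably labeled (memorization-effect heuristic)."""
    reliable = [i for i, loss in enumerate(losses) if loss < threshold]
    unreliable = [i for i, loss in enumerate(losses) if loss >= threshold]
    return reliable, unreliable


def pseudo_labels(probabilities, indices, confidence=0.9):
    """Assign the arg-max class to samples whose top probability is confident enough."""
    labels = {}
    for i in indices:
        probs = probabilities[i]
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= confidence:
            labels[i] = best
    return labels


def label_histogram(labels):
    """Count pseudo-labels per class to inspect possible imbalance."""
    return Counter(labels.values())


losses = [0.1, 2.0, 0.2, 1.5]
probs = [[0.9, 0.1], [0.95, 0.05], [0.2, 0.8], [0.6, 0.4]]
reliable, unreliable = split_by_loss(losses, threshold=1.0)
print(label_histogram(pseudo_labels(probs, unreliable)))
```

Inspecting the histogram is one simple way to see the pseudo-label imbalance that the abstract identifies as a source of bias.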

  • Graph-Based Video Search Reranking with Local and Global Consistency Analysis

    Soh YOSHIDA  Takahiro OGAWA  Miki HASEYAMA  Mitsuji MUNEYASU  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2018/01/30
      Vol:
    E101-D No:5
      Page(s):
    1430-1440

    Video reranking is an effective way to improve the retrieval performance of text-based video search engines. This paper proposes a graph-based Web video search reranking method with local and global consistency analysis. Generally, the graph-based reranking approach constructs a graph whose nodes and edges correspond to videos and their pairwise similarities, respectively. Many reranking methods are built on a scheme that regularizes the smoothness of pairwise relevance scores between adjacent nodes with regard to a user's query. However, since the overall consistency is measured by aggregating only the local consistency over each pair, errors in score estimation increase when noisy samples are included among the neighbors of query-relevant videos. To deal with the noisy samples, the proposed method leverages the global consistency of the graph structure, unlike conventional methods. Specifically, to detect this consistency, the proposed method introduces a spectral clustering algorithm that detects groups of videos with strong semantic correlation on the graph. Furthermore, a new regularization term, which smooths ranking scores within the same group, is introduced into the reranking framework. Since the score regularization is performed from both local and global aspects simultaneously, accurate score estimation becomes feasible. Experimental results obtained by applying the proposed method to a real-world video collection show its effectiveness.
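
The conventional local-smoothness scheme that this abstract contrasts itself with can be illustrated by a generic score-propagation loop. This is not the proposed method: the spectral clustering step and the group-wise regularization term are omitted, and the row normalization, `alpha`, and iteration count are assumptions chosen for the example.

```python
def rerank_scores(similarity, initial_scores, alpha=0.5, iterations=50):
    """Propagate relevance scores over a similarity graph.

    Each update mixes the neighbor-averaged score (local smoothness)
    with the initial text-based score, weighted by alpha.
    """
    n = len(initial_scores)
    # Row-normalize so each node averages over its neighbors.
    norm = []
    for row in similarity:
        total = sum(row)
        norm.append([v / total if total > 0 else 0.0 for v in row])
    scores = list(initial_scores)
    for _ in range(iterations):
        scores = [alpha * sum(norm[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * initial_scores[i]
                  for i in range(n)]
    return scores


# Videos 0 and 1 are similar to each other; video 2 has no neighbors.
similarity = [[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]]
print(rerank_scores(similarity, [1.0, 0.0, 0.5]))
```

In this toy graph, video 1 starts with a score of zero but is pulled upward by its relevant neighbor, which also shows why a noisy neighbor can distort scores when only pairwise consistency is used.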

  • Video Search Reranking with Relevance Feedback Using Visual and Textual Similarities

    Takamasa FUJII  Soh YOSHIDA  Mitsuji MUNEYASU  

     
    PAPER-Multimedia Environment Technology

      Vol:
    E102-A No:12
      Page(s):
    1900-1909

    In video search reranking, in addition to the well-known semantic gap, the intent gap, which is the gap between the representation of the users' demand and the real search intention, is becoming a major problem restricting the improvement of reranking performance. To address this problem, we propose video search reranking based on a semantic representation by multiple tags. In the proposed method, we use relevance feedback, which the user can interact with by specifying some example videos from the initial search results. We apply the relevance feedback to reduce the gap between the real intent of the users and the video search results. In addition, we focus on the fact that multiple tags are used to represent video contents. By vectorizing multiple tags associated with videos on the basis of the Word2Vec algorithm and calculating the centroid of the tag vector as a collective representation, we can evaluate the semantic similarity between videos by using tag features. We conduct experiments on the YouTube-8M dataset, and the results show that our reranking approach is effective and efficient.
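
The tag-based similarity described in this abstract can be sketched as follows: each video is represented by the centroid of its tag vectors, and two videos are compared by cosine similarity. The tiny hand-written `embeddings` table stands in for vectors that would come from a trained Word2Vec model, and the function names are assumptions for this example.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def video_similarity(tags_a, tags_b, embeddings):
    """Compare two videos through the centroids of their known tag vectors."""
    vecs_a = [embeddings[t] for t in tags_a if t in embeddings]
    vecs_b = [embeddings[t] for t in tags_b if t in embeddings]
    if not vecs_a or not vecs_b:
        return 0.0
    return cosine(centroid(vecs_a), centroid(vecs_b))


# Toy 2-D vectors (placeholders, not real Word2Vec output).
embeddings = {"cat": [1.0, 0.0], "kitten": [1.0, 0.0], "car": [0.0, 1.0]}
print(video_similarity(["cat", "kitten"], ["cat"], embeddings))
print(video_similarity(["cat"], ["car"], embeddings))
```

Tags missing from the embedding table are skipped, so a video whose tags are all unknown gets a similarity of zero rather than raising an error.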