Author Search Result

[Author] Koichi KISE (11 hits)

1-11 of 11 hits
  • Face Image Generation of Anime Characters Using an Advanced First Order Motion Model with Facial Landmarks

    Junki OSHIBA  Motoi IWATA  Koichi KISE  

     
    PAPER

      Publicized:
    2022/10/12
      Vol:
    E106-D No:1
      Page(s):
    22-30

    Recently, deep learning methods for guided image generation have been making progress. Many methods generate an animation of changing facial expression from a single face image by transferring facial expression information to that image. In particular, methods that use facial landmarks as the expression information can generate a wide variety of expressions. However, most methods target human faces rather than anime characters. When we applied several existing methods to anime characters by training them on an anime character face dataset, they generated images with noise even in regions where nothing should change. The first order motion model (FOMM) is an image generation method that takes two images as input and transfers the facial expression or pose of one to the other. By explicitly computing the difference between the two images based on optical flow, FOMM generates images with little noise in the unchanged regions. Here, we focus on the face image generation aspect of FOMM. FOMM cannot take a facial landmark directly as the expression target, because the appearances of a face image and a facial landmark are quite different. We therefore propose an advanced FOMM that accepts facial landmarks as the expression target, changing the input data and the data flow accordingly. Additionally, to generate face images whose expressions follow the target landmarks more closely, we introduce a landmark estimation loss, computed by comparing the landmarks detected in the generated image with the target landmarks. Experiments on an anime character face image dataset demonstrated that our method is effective for landmark-guided face image generation for anime characters; it outperformed other methods quantitatively and generated face images with less noise.
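
    A minimal sketch of how a landmark estimation loss of this kind could be written in PyTorch is shown below; the generator and the landmark detection network are hypothetical stand-ins, and the L1 distance and loss weight are assumptions rather than the paper's exact formulation.

        import torch.nn.functional as F

        def landmark_estimation_loss(generated_image, target_landmarks, landmark_detector):
            # Detect landmarks on the generated image and penalize their
            # distance from the target landmarks that guided the generation.
            predicted_landmarks = landmark_detector(generated_image)  # (batch, n_points, 2)
            return F.l1_loss(predicted_landmarks, target_landmarks)

        # Sketch of its use in training (lam is an assumed weighting factor):
        # total_loss = fomm_losses + lam * landmark_estimation_loss(gen, target_lm, detector)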

  • Learning Pyramidal Feature Hierarchy for 3D Reconstruction

    Fairuz Safwan MAHAD  Masakazu IWAMURA  Koichi KISE  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/11/16
      Vol:
    E105-D No:2
      Page(s):
    446-449

    Neural network-based three-dimensional (3D) reconstruction methods have produced promising results. However, they do not pay particular attention to reconstructing the detailed parts of objects, because the networks are not designed to capture fine details. In this paper, we propose a network designed to capture both the coarse structure and the fine details of objects, improving the reconstruction of their detailed parts.
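
    As an illustration of the idea, the hedged PyTorch sketch below keeps the feature maps of every encoder level instead of only the deepest one, so both coarse and fine features remain available for reconstruction; the layer sizes are illustrative assumptions, not the paper's configuration.

        import torch.nn as nn

        class PyramidalEncoder(nn.Module):
            """Encoder that exposes a feature hierarchy from fine to coarse."""
            def __init__(self):
                super().__init__()
                self.stages = nn.ModuleList([
                    nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())
                    for c_in, c_out in [(3, 32), (32, 64), (64, 128)]
                ])

            def forward(self, x):
                features = []
                for stage in self.stages:
                    x = stage(x)
                    features.append(x)  # keep every level, not only the last
                return features         # [fine, middle, coarse]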

  • Exploring Sensor Modalities to Capture User Behaviors for Reading Detection

    Md. Rabiul ISLAM  Andrew W. VARGO  Motoi IWATA  Masakazu IWAMURA  Koichi KISE  

     
    LETTER-Human-computer Interaction

      Publicized:
    2022/06/20
      Vol:
    E105-D No:9
      Page(s):
    1629-1633

    Accurately describing user behaviors with appropriate sensors is important for developing computationally cost-effective systems. This paper employs datasets recorded for fine-grained reading detection with the J!NS MEME, an eyewear device with electrooculography (EOG), accelerometer, and gyroscope sensors. We train models for all possible combinations of the three sensors, using both self-supervised and supervised learning, to understand the optimal sensor settings. The results show that the EOG sensor alone performs roughly as well as the best-performing combination of the other sensors. This offers insight into selecting appropriate sensors for fine-grained reading detection, enabling cost-effective computation.
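
    For illustration, the sketch below enumerates the seven possible sensor combinations in Python; train_and_evaluate is a hypothetical placeholder for fitting and scoring a model on the channels of the chosen sensors.

        from itertools import combinations

        SENSORS = ("eog", "accelerometer", "gyroscope")

        def all_sensor_combinations():
            # Seven non-empty subsets of the three J!NS MEME sensors.
            for r in range(1, len(SENSORS) + 1):
                yield from combinations(SENSORS, r)

        for combo in all_sensor_combinations():
            print(combo)  # train_and_evaluate(combo) would go here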

  • Document Image Retrieval for QA Systems Based on the Density Distributions of Successive Terms

    Koichi KISE  Shota FUKUSHIMA  Keinosuke MATSUMOTO  

     
    PAPER-Document Image Retrieval

      Vol:
    E88-D No:8
      Page(s):
    1843-1851

    Question answering (QA) is the task of retrieving an answer in response to a question by analyzing documents. Although most efforts in developing QA systems have been devoted to electronic text, we consider it also necessary to develop systems for document images. In this paper, we propose a method of document image retrieval for such QA systems. Since the task is not to retrieve all relevant documents but to find an answer somewhere in the documents, retrieval should be precision-oriented. The main contribution of this paper is a method that improves the precision of document image retrieval by taking into account the co-occurrence of successive terms in a question. The indexing scheme is based on two-dimensional distributions of terms, and the weight of co-occurrence is measured by calculating the density distributions of terms. We tested the proposed method on 1253 pages of documents about Major League Baseball with 20 questions and found it superior to the baseline method previously proposed by the authors.
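
    A hedged sketch of density-based co-occurrence scoring is given below: each term's occurrence positions on a page are smoothed into a 2-D density map, and the weight for two successive query terms grows where their densities overlap. The grid resolution, Gaussian kernel, and product-based overlap measure are illustrative assumptions, not the paper's exact formulation.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def term_density(positions, page_shape=(64, 64), sigma=3.0):
            # Smooth discrete term occurrences into a density map over the page.
            grid = np.zeros(page_shape)
            for y, x in positions:
                grid[y, x] += 1.0
            return gaussian_filter(grid, sigma)

        def cooccurrence_weight(positions_a, positions_b):
            # High where the two successive terms occur close together.
            density_a = term_density(positions_a)
            density_b = term_density(positions_b)
            return float(np.sum(density_a * density_b))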

  • Digital Watermarking Method for Printed Matters Using Deep Learning for Detecting Watermarked Areas

    Hiroyuki IMAGAWA  Motoi IWATA  Koichi KISE  

     
    PAPER

      Publicized:
    2020/10/07
      Vol:
    E104-D No:1
      Page(s):
    34-42

    There are several technologies, such as QR codes, for obtaining digital information from printed matter; digital watermarking is one of them. Compared with other techniques, digital watermarking is suitable for adding information to images without spoiling their design. For this purpose, digital watermarking methods for printed matter that locate watermarked areas with detection markers or with image registration techniques have been proposed. However, detection markers themselves can damage the appearance, so the advantage of digital watermarking, namely preserving the design, is not fully realized, while methods based on image registration cannot handle non-registered images. In this paper, we propose a novel digital watermarking method that uses deep learning to detect watermarked areas instead of detection markers or image registration. The proposed method introduces a deep-learning-based semantic segmentation model that detects watermarked areas in printed matter. We prepare two datasets for training the model. The first consists of geometrically transformed non-watermarked and watermarked images; it is relatively large because the images can be generated by image processing, and it is used for pre-training. The second consists of photographs of non-watermarked and watermarked printed matter; it is relatively small because taking the photographs requires considerable effort and time, but pre-training allows fine-tuning with fewer images. This dataset is used for fine-tuning to improve robustness against print-cam attacks. In experiments, we investigated the performance of our method by implementing it on smartphones. The results show that our method can carry 96 bits of information in watermarked printed matter.
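
    The two-stage training regime could look roughly like the PyTorch sketch below; the model, data loaders, and epoch counts are hypothetical placeholders, with only the pre-train-then-fine-tune structure taken from the abstract.

        import torch

        def train(model, loader, optimizer, epochs, device="cpu"):
            # Per-pixel watermarked/non-watermarked segmentation loss.
            criterion = torch.nn.BCEWithLogitsLoss()
            model.to(device).train()
            for _ in range(epochs):
                for images, masks in loader:
                    optimizer.zero_grad()
                    loss = criterion(model(images.to(device)), masks.to(device))
                    loss.backward()
                    optimizer.step()

        # Stage 1: pre-train on the large, synthetically transformed dataset.
        # train(model, synthetic_loader, optimizer, epochs=50)
        # Stage 2: fine-tune on the small photographed dataset for print-cam robustness.
        # train(model, photo_loader, optimizer, epochs=10)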

  • Practical Watermarking Method Estimating Watermarked Region from Recaptured Videos on Smartphone

    Motoi IWATA  Naoyoshi MIZUSHIMA  Koichi KISE  

     
    PAPER

      Publicized:
    2016/10/07
      Vol:
    E100-D No:1
      Page(s):
    24-32

    Nowadays, digital signage can be seen in many places, such as inside stations or trains, showing attractive promotional video clips. Users can easily obtain additional information related to such video clips on mobile devices such as smartphones by searching websites, but such retrieval is time-consuming and sometimes leads users to incorrect information. It is therefore desirable to obtain the additional information directly from the video clips. We implement a suitable digital watermarking method on a smartphone to extract watermarks in real time from video clips shown on signage. The experimental results show that the proposed method correctly extracts watermarks within one second on a smartphone.

  • Effectiveness of Passage-Based Document Retrieval for Short Queries

    Koichi KISE  Markus JUNKER  Andreas DENGEL  Keinosuke MATSUMOTO  

     
    PAPER

      Vol:
    E86-D No:9
      Page(s):
    1753-1761

    Document retrieval is a fundamental and important task for intelligent access to the huge amount of information stored in documents. Although it has a long research history, it remains hard, especially when lengthy documents are retrieved with very short queries (a few keywords). For retrieving long documents, methods called passage-based document retrieval have proven effective. In this paper, we experimentally show that a passage-based method built on window passages is also effective for short queries, provided that the documents are not too short. We employ a method called "density distributions" as the window-passage method and compare it with three conventional methods: the simple vector space model, pseudo relevance feedback, and latent semantic indexing. We also compare it with a passage-based method based on discourse passages.
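
    As a rough illustration of window passages, the sketch below scores a document by its best fixed-size window, so a short query is not diluted by long off-topic stretches; the window size, stride, and simple term-count scorer are assumptions, and the paper's density distributions method is more refined.

        def window_passages(tokens, size=200, stride=100):
            # Overlapping fixed-size windows over the document's tokens.
            for start in range(0, max(1, len(tokens) - size + 1), stride):
                yield tokens[start:start + size]

        def document_score(tokens, query_terms):
            # A document scores as well as its best passage.
            def passage_score(window):
                return sum(window.count(term) for term in query_terms)
            return max(passage_score(w) for w in window_passages(tokens))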

  • Representing, Utilizing and Acquiring Knowledge for Document Image Understanding

    Koichi KISE  Noboru BABAGUCHI  

     
    PAPER

      Vol:
    E77-D No:7
      Page(s):
    770-777

    This paper discusses the role of knowledge in document image understanding from the viewpoints of representation, utilization, and acquisition. For representation, we propose two models: a layout model and a content model, which represent knowledge about the layout structure and the content of a document, respectively. For utilization, we implement layout analysis and content analysis, which draw on the layout model and the content model, respectively, and introduce a strategy of hypothesis generation and verification to integrate the two kinds of analysis. For acquisition, we propose a method of incrementally acquiring a layout model from a stream of example documents. Experiments on document image understanding and knowledge acquisition with 50 samples of visiting cards verified the effectiveness of the proposed method.
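
    The hypothesize-and-verify control structure might be sketched as below; all four callables are hypothetical placeholders standing in for the paper's model-driven layout analysis, content analysis, and verification.

        def understand_document(image, generate_hypotheses, analyze_contents, verify):
            # Layout analysis proposes candidate interpretations; content analysis
            # examines each one; verification keeps a consistent interpretation.
            for layout_hypothesis in generate_hypotheses(image):
                contents = analyze_contents(image, layout_hypothesis)
                if verify(layout_hypothesis, contents):
                    return layout_hypothesis, contents
            return None  # no hypothesis survived verification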

  • Individuality-Preserving Silhouette Extraction for Gait Recognition and Its Speedup

    Masakazu IWAMURA  Shunsuke MORI  Koichiro NAKAMURA  Takuya TANOUE  Yuzuko UTSUMI  Yasushi MAKIHARA  Daigo MURAMATSU  Koichi KISE  Yasushi YAGI  

     
    PAPER-Pattern Recognition

      Publicized:
    2021/03/24
      Vol:
    E104-D No:7
      Page(s):
    992-1001

    Most gait recognition approaches rely on silhouette-based representations because of their high recognition accuracy and computational efficiency. A fundamental problem for these approaches is how to accurately extract individuality-preserving silhouettes from real scenes, where foreground colors may resemble background colors and the background is cluttered. We therefore propose a method of individuality-preserving silhouette extraction for gait recognition that uses standard gait models (SGMs), composed of clean silhouette sequences of various training subjects, as shape priors. The SGMs are smoothly incorporated into a well-established graph-cut segmentation framework. Experiments showed that the proposed method improved silhouette extraction accuracy by more than 2.3% over representative methods and improved the identification rate of gait recognition by more than 11.0% at rank 20. In addition, to reduce the computational cost, we introduced an approximation into the dynamic programming calculation, which cut the computational cost by 85.0% without reducing segmentation accuracy.
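
    One way the shape prior might enter the graph-cut energy is sketched below: the SGM-rendered foreground probability is blended into the per-pixel unary costs. The blending weight and clipping are assumptions; the paper's formulation and its dynamic-programming speedup are more elaborate.

        import numpy as np

        def unary_costs(color_fg_prob, sgm_fg_prob, lam=0.5, eps=1e-6):
            # Blend color-model and shape-prior probabilities, then convert to
            # negative log-likelihood costs for the min-cut solver.
            fg_prob = (1 - lam) * color_fg_prob + lam * sgm_fg_prob
            fg_cost = -np.log(np.clip(fg_prob, eps, 1.0))
            bg_cost = -np.log(np.clip(1.0 - fg_prob, eps, 1.0))
            return fg_cost, bg_cost  # pairwise smoothness terms are added separately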

  • Data Embedding into Characters (Open Access)

    Koichi KISE  Shinichiro OMACHI  Seiichi UCHIDA  Masakazu IWAMURA  Marcus LIWICKI  

     
    INVITED PAPER

      Vol:
    E98-D No:1
      Page(s):
    10-20

    This paper reviews several trials of re-designing a conventional communication medium, namely characters, to enrich their functions with data-embedding techniques. For example, characters can be re-designed for better machine-readability even under various geometric distortions by embedding a geometric invariant into each character image to represent its class label. Another example is embedding information into a handwriting trajectory with a new pen device called a data-embedding pen; an experiment showed that 32 bits of information can be embedded in a handwritten line 5 cm long. Beyond these applications, we also discuss the relationship between data embedding and pattern recognition from a theoretical point of view: several theories indicate that, given appropriate supplementary information through data embedding, pattern recognition performance can be raised to 100%.

  • Learning Multi-Level Features for Improved 3D Reconstruction

    Fairuz Safwan MAHAD  Masakazu IWAMURA  Koichi KISE  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2022/12/08
      Vol:
    E106-D No:3
      Page(s):
    381-390

    3D reconstruction methods using neural networks are popular and have been studied extensively. However, the resulting models typically lack detail, reducing the quality of the reconstruction, because the networks are not designed to capture the fine details of an object. In this paper, we therefore propose two networks designed to capture both the coarse and fine details of an object, improving the reconstruction of its detailed parts. The first network uses a multi-scale architecture with skip connections to associate and merge features from different levels. The second is a multi-branch deep generative network that separately learns local features, generic features, and intermediate features through three tailored components. In both architectures, the principle is to let the network learn features at different levels so that it can reconstruct both the fine parts and the overall shape of the 3D model. We show that both methods outperform state-of-the-art approaches.
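
    A hedged PyTorch sketch of the first idea, merging features from different levels before decoding a 3D occupancy grid, might look like this; the channel counts, the pooling-based fusion, and the voxel decoder are illustrative assumptions rather than the paper's architecture.

        import torch
        import torch.nn as nn

        class MultiLevelVoxelNet(nn.Module):
            def __init__(self, voxel_res=32):
                super().__init__()
                self.down1 = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU())   # fine level
                self.down2 = nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU())  # coarse level
                self.pool = nn.AdaptiveAvgPool2d(1)
                self.decode = nn.Linear(32 + 64, voxel_res ** 3)  # fuse both levels
                self.voxel_res = voxel_res

            def forward(self, x):
                f1 = self.down1(x)
                f2 = self.down2(f1)
                fused = torch.cat([self.pool(f1).flatten(1), self.pool(f2).flatten(1)], 1)
                logits = self.decode(fused)
                return logits.view(-1, self.voxel_res, self.voxel_res, self.voxel_res)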