Author Search Result

[Author] Takio KURITA (20 hits)

1-20 hits
  • Gesture Recognition Using HLAC Features of PARCOR Images

    Takio KURITA  Satoru HAYAMIZU  

     
    PAPER-Image Processing, Image Pattern Recognition

      Vol:
    E86-D No:4
      Page(s):
    719-726

    This paper proposes a gesture recognition method that uses higher order local autocorrelation (HLAC) features extracted from PARCOR images. To extract dominant information from a sequence of images, we apply a linear prediction coding technique to the sequence of pixel intensities, and PARCOR images are constructed from the PARCOR coefficients of the pixel-value sequences. From the PARCOR images, HLAC features are extracted, and the sequences of these features are used as the input vectors of a Hidden Markov Model (HMM) based recognizer. Since HLAC features are inherently shift-invariant and computationally inexpensive, the proposed method is robust to changes in the person's position and makes real-time gesture recognition possible. Experimental results of gesture recognition are shown to evaluate the performance of the proposed method.

  • Optimum Nonlinear Discriminant Analysis and Discriminant Kernel Support Vector Machine

    Akinori HIDAKA  Takio KURITA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2016/08/04
      Vol:
    E99-D No:11
      Page(s):
    2734-2744

    Kernel discriminant analysis (KDA) is the mainstream approach to nonlinear discriminant analysis (NDA). Since it uses the kernel trick, KDA does not consider its nonlinear discriminant mapping explicitly. In this paper, another NDA approach, in which the nonlinear discriminant mapping is given analytically, is developed. This study is based on the theory of optimal nonlinear discriminant analysis (ONDA), in which the nonlinear mapping is exactly expressed by using the Bayesian posterior probability. This theory indicates that various NDA methods can be derived by estimating the Bayesian posterior probability in ONDA with various estimation methods. ONDA also provides insight into novel kernel functions, called discriminant kernels (DK), which are likewise defined by using the posterior probabilities. In this paper, several NDA methods and DKs derived from ONDA with several posterior probability estimators are developed and evaluated. Given fine estimation methods of the Bayesian posterior probability, they give good discriminant spaces for visualization or classification.

  • Sample Selection Approach with Number of False Predictions for Learning with Noisy Labels

    Yuichiro NOMURA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2022/07/21
      Vol:
    E105-D No:10
      Page(s):
    1759-1768

    In recent years, deep neural networks (DNNs) have made a significant impact on a variety of research fields and applications. One drawback of DNNs is that they require a huge amount of data for training. Since it is very expensive to ask experts to label the data, many non-expert data collection methods such as web crawling have been proposed. However, datasets created by non-experts often contain corrupted labels, and DNNs trained on such datasets are unreliable. Since DNNs have an enormous number of parameters, they tend to overfit to noisy labels, resulting in poor generalization performance. This problem is called Learning with Noisy Labels (LNL). Recent studies showed that DNNs are robust to noisy labels in the early stage of learning, before over-fitting to noisy labels, because DNNs learn simple patterns first. Therefore DNNs tend to output true labels for samples with noisy labels in the early stage of learning, and the number of false predictions for samples with noisy labels is higher than for samples with clean labels. Based on these observations, we propose a new sample selection approach for LNL using the number of false predictions. Our method periodically collects the records of false predictions during training and selects samples with a low number of false predictions from the recent records. Then our method iteratively performs sample selection and trains a DNN model using the updated dataset. Since the model is trained with more clean samples and records more accurate false predictions for sample selection, the generalization performance of the model gradually increases. We evaluated our method on two benchmark datasets, CIFAR-10 and CIFAR-100, with synthetically generated noisy labels, and obtained results that are better than or comparable to the state-of-the-art approaches.
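The selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `select_clean_samples`, the `max_false` threshold, and the rule "keep a sample if its false-prediction count is at most the threshold" are all assumptions, since the abstract does not state the exact selection criterion.

```python
def select_clean_samples(prediction_records, given_labels, max_false=0):
    """Count how often each sample's prediction disagreed with its given label.

    prediction_records: list of per-epoch prediction lists (one label per sample).
    given_labels: the (possibly noisy) training labels.
    max_false: hypothetical threshold; samples with at most this many
               false predictions are treated as likely clean.
    """
    counts = [0] * len(given_labels)
    for predictions in prediction_records:
        for idx, (predicted, given) in enumerate(zip(predictions, given_labels)):
            if predicted != given:
                counts[idx] += 1
    selected = [idx for idx, count in enumerate(counts) if count <= max_false]
    return selected, counts


# Three recorded epochs of predictions for three samples.
records = [[0, 1, 2], [0, 2, 2], [0, 1, 1]]
labels = [0, 1, 2]
selected, counts = select_clean_samples(records, labels, max_false=0)
```

In an iterative scheme like the one the abstract describes, the selected indices would define the training subset for the next round, after which new prediction records are collected.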

  • Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval

    Jonathan MOJOO  Takio KURITA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/03/08
      Vol:
    E104-D No:6
      Page(s):
    873-880

    Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete, quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.
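As a rough illustration of the Jaccard-based similarity mentioned in this abstract, the sketch below computes the intersection over union of two label sets. The function name `jaccard_similarity`, the example labels, and the convention of returning 0.0 for two empty label sets are assumptions made for illustration; the paper's handling of object multiplicity and its adaptive-margin triplet loss are not reproduced here.

```python
def jaccard_similarity(labels_a, labels_b):
    """Intersection over union of two label collections, in [0, 1]."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Assumed convention: two unlabeled images have similarity 0.
        return 0.0
    return len(a & b) / len(union)


# One shared label out of three distinct labels gives a similarity of 1/3.
sim = jaccard_similarity({"cat", "dog"}, {"dog", "car"})
```

Because the value varies continuously between 0 and 1, it can rank image pairs by how much their label sets overlap instead of only marking them as similar or dissimilar.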

  • Multi-Structural Texture Analysis Using Mathematical Morphology

    Lei YANG  Akira ASANO  Liang LI  Chie MURAKI ASANO  Takio KURITA  

     
    PAPER-Image

      Vol:
    E95-A No:10
      Page(s):
    1759-1767

    In this paper, we propose a novel texture analysis method capable of estimating multiple primitives, which are elements repetitively arranged to compose a texture, in multi-structured textures. The approach is based on a texture description model that uses mathematical morphology, called the “Primitive, Grain, and Point Configuration (PGPC)” texture model. The estimation of primitives based on the PGPC texture model involves searching for the optimal structuring element for primitives according to a size distribution function and a magnification. The proposed method achieves the following two improvements: (1) the simultaneous estimation of multiple primitives in multi-structured textures, with a ranking of primitives on the basis of their significance, and (2) the introduction of flexibility in the process of magnification to obtain a higher degree of fitness of large grains. With a computational combination of different primitives, the method provides an ordered priority to denote the significance of elements. The promising performance of the proposed method is experimentally shown by a comparison with conventional methods.

  • Completion of Missing Labels for Multi-Label Annotation by a Unified Graph Laplacian Regularization

    Jonathan MOJOO  Yu ZHAO  Muthu Subash KAVITHA  Junichi MIYAO  Takio KURITA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2020/07/03
      Vol:
    E103-D No:10
      Page(s):
    2154-2161

    The task of image annotation is becoming enormously important for efficient image retrieval from the web and other large databases. However, the huge amount of semantic information and the complex dependency of labels on an image make the task challenging. Hence determining the semantic similarity between multiple labels on an image is useful for understanding any incomplete label assignment for image retrieval. This work proposes a novel method to solve the problem of multi-label image annotation by unifying two different types of Laplacian regularization terms in a deep convolutional neural network (CNN) for robust annotation performance. The unified Laplacian regularization model is implemented to address the missing labels efficiently by generating the contextual similarity between labels both internally and externally through their semantic similarities, which is the main contribution of this study. Specifically, we generate similarity matrices between labels internally by using Hayashi's quantification method-type III and externally by using the word2vec method. The generated similarity matrices from the two different methods are then combined as a Laplacian regularization term, which is used as the new objective function of the deep CNN. The regularization term implemented in this study is able to address the multi-label annotation problem, enabling a more effectively trained neural network. Experimental results on public benchmark datasets reveal that the proposed unified regularization model with deep CNN produces significantly better results than the baseline CNN without regularization and other state-of-the-art methods for predicting missing labels.

  • An Efficient Clustering Algorithm for Region Merging

    Takio KURITA  

     
    PAPER

      Vol:
    E78-D No:12
      Page(s):
    1546-1551

    This paper proposes an efficient clustering algorithm for region merging. To speed up the search for the best pair of regions to be merged into one region, the dissimilarity values of all possible pairs of regions are stored in a heap. The best pair can then be found as the element at the root node of the binary tree corresponding to the heap. Since only adjacent pairs of regions can be merged in image segmentation, these constraints of neighboring relations are represented by sorted linked lists. This reduces the computation needed to update the dissimilarity values and neighboring relations that are affected by merging the best pair. The proposed algorithm is applied to the segmentation of a monochrome image and range images.
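The heap-driven merging loop outlined in this abstract can be sketched as follows. This is a simplified illustration under several assumptions, not the paper's algorithm: the dissimilarity is taken to be the absolute difference of region means, adjacency is kept in Python sets rather than sorted linked lists, and stale heap entries are skipped lazily via per-region version counters instead of being updated in place. The name `merge_regions` and its parameters are hypothetical.

```python
import heapq


def merge_regions(means, sizes, adjacency, n_target):
    """Greedily merge the most similar adjacent regions until n_target remain.

    means, sizes: dicts mapping integer region id -> mean value / pixel count.
    adjacency: dict mapping region id -> set of neighboring region ids.
    """
    means, sizes = dict(means), dict(sizes)
    adjacency = {r: set(n) for r, n in adjacency.items()}
    version = {r: 0 for r in means}  # bumped whenever a region changes
    heap = []

    def push(i, j):
        # Assumed dissimilarity: absolute difference of region means.
        d = abs(means[i] - means[j])
        heapq.heappush(heap, (d, i, j, version[i], version[j]))

    for i in adjacency:
        for j in adjacency[i]:
            if i < j:
                push(i, j)

    while len(means) > n_target and heap:
        _, i, j, vi, vj = heapq.heappop(heap)  # root of the heap = best pair
        if version.get(i) != vi or version.get(j) != vj:
            continue  # stale entry: one of the regions changed or vanished
        # Merge region j into region i.
        total = sizes[i] + sizes[j]
        means[i] = (means[i] * sizes[i] + means[j] * sizes[j]) / total
        sizes[i] = total
        version[i] += 1
        neighbors = (adjacency[i] | adjacency[j]) - {i, j}
        for k in adjacency.pop(j):
            adjacency[k].discard(j)
        del means[j], sizes[j], version[j]
        adjacency[i] = neighbors
        # Only pairs involving the merged region need new dissimilarities.
        for k in neighbors:
            adjacency[k].add(i)
            push(i, k)
    return means, sizes


# Four regions in a chain: 0 - 1 - 2 - 3.
result_means, result_sizes = merge_regions(
    means={0: 10.0, 1: 11.0, 2: 50.0, 3: 52.0},
    sizes={0: 1, 1: 1, 2: 1, 3: 1},
    adjacency={0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}},
    n_target=2,
)
```

Restricting heap entries to adjacent pairs keeps the number of candidates proportional to the number of region boundaries, which is the point of tracking neighboring relations alongside the heap.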

  • A Method to Reduce Redundant Hidden Nodes

    Iwao SEKITA  Takio KURITA  David K. Y. CHIU  Hideki ASOH  

     
    PAPER-Network Synthesis

      Vol:
    E77-D No:4
      Page(s):
    443-449

    The number of nodes in a hidden layer of a feed-forward layered network reflects an optimality condition of the network in coding a function. It also affects the computation time and the ability of the network to generalize. When an arbitrary number of hidden nodes is used in designing the network, redundancy among hidden nodes can often be seen. In this paper, a method of reducing hidden nodes is proposed on the condition that the reduced network maintains the performance of the original network within an accepted level of tolerance. This method can be applied to estimate the performance of a network with fewer hidden nodes. The estimated performance indicates the lower bounds of the actual performance of the network. Experiments were performed using Fisher's IRIS data, a set of SONAR data, and the XOR data for classification. The results suggest that a sufficient number of hidden nodes, fewer than the original number, can be estimated by the proposed method.

  • Consistency Regularization on Clean Samples for Learning with Noisy Labels

    Yuichiro NOMURA  Takio KURITA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/10/28
      Vol:
    E105-D No:2
      Page(s):
    387-395

    In recent years, deep learning has achieved significant results in various areas of machine learning. Deep learning requires a huge amount of data to train a model, and data collection techniques such as web crawling have been developed. However, there is a risk that these data collection techniques may generate incorrect labels. If a deep learning model for image classification is trained on a dataset with noisy labels, the generalization performance significantly decreases. This problem is called Learning with Noisy Labels (LNL). One recent study on LNL, called DivideMix [1], has successfully divided the dataset into samples with clean labels and ones with noisy labels by modeling the loss distribution of all training samples with a two-component Gaussian mixture model (GMM). It then treats the divided dataset as labeled and unlabeled samples and trains the classification model in a semi-supervised manner. Since the selected samples have lower loss values and are easy to classify, the model is at risk of overfitting to simple patterns during training. To train the classification model without overfitting to the simple patterns, we propose to introduce consistency regularization on the samples selected by the GMM. The consistency regularization perturbs input images and encourages the model to output the same value for the perturbed images and the original images. The classification model simultaneously receives the samples selected as clean and their perturbed versions, and it achieves higher generalization performance with less overfitting to the selected samples. We evaluated our method with synthetically generated noisy labels on CIFAR-10 and CIFAR-100 and obtained results that are comparable to or better than the state-of-the-art method.

  • Object Tracking by Maximizing Classification Score of Detector Based on Rectangle Features

    Akinori HIDAKA  Kenji NISHIDA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E91-D No:8
      Page(s):
    2163-2170

    In this paper, we propose a novel classifier-based object tracker. Our tracker combines a Rectangle Feature (RF) based detector [17],[18] with an optical-flow based tracking method [1]. We show that the gradient of extended RFs can be calculated rapidly by using the Integral Image method. The proposed tracker was tested on real video sequences in face tracking and car tracking experiments. Our tracker ran at over 100 fps while maintaining accuracy comparable to the RF based detector. Our tracking routine, excluding image I/O processing, can run at about 500 to 2,500 fps with sufficient tracking accuracy.

  • Adaptive Background Estimation: Computing a Pixel-Wise Learning Rate from Local Confidence and Global Correlation Values

    Mickael PIC  Luc BERTHOUZE  Takio KURITA  

     
    PAPER-Background Estimation

      Vol:
    E87-D No:1
      Page(s):
    50-57

    Adaptive background techniques are useful for a wide spectrum of applications, ranging from security surveillance and traffic monitoring to medical and space imaging. With a properly estimated background, moving or new objects can be easily detected and tracked. Existing techniques are not suitable for real-world implementation, either because they are slow or because they do not perform well in the presence of frequent outliers or camera motion. We address the issue by computing a learning rate for each pixel as a function of a local confidence value, which estimates whether or not a pixel is an outlier, and a global correlation value, which detects camera motion. After discussing the role of each parameter, we report experimental results showing that our technique is fast and efficient, even in a real-world situation. Furthermore, we show that the same method applies equally well to a 3-camera stereoscopic system for depth perception.

  • Improvements of Local Descriptor in HOG/SIFT by BOF Approach

    Zhouxin YANG  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:5
      Page(s):
    1293-1303

    Numerous studies have focused on improving bag of features (BOF), histogram of oriented gradients (HOG) and scale invariant feature transform (SIFT). However, few works have attempted to learn the connection between them, even though the latter two are widely used as local feature descriptors for the former. Motivated by the resemblance between BOF and HOG/SIFT in descriptor construction, we improve the performance of HOG/SIFT by a) interpreting HOG/SIFT as a variant of BOF in descriptor construction, and then b) introducing recently proposed BOF approaches such as locality preservation, data-driven vocabulary, and spatial information preservation into the descriptor construction of HOG/SIFT, which yields the BOF-driven HOG/SIFT. Experimental results show that the BOF-driven HOG/SIFT outperform the original ones in pedestrian detection (for HOG) and in scene matching and image classification (for SIFT). Our proposed BOF-driven HOG/SIFT can easily replace the original HOG/SIFT in current systems since they are generalized versions of the original ones.

  • A Kernel-Based Fisher Discriminant Analysis for Face Detection

    Takio KURITA  Toshiharu TAGUCHI  

     
    PAPER-Pattern Recognition

      Vol:
    E88-D No:3
      Page(s):
    628-635

    This paper presents a modification of kernel-based Fisher discriminant analysis (FDA) to design a one-class classifier for face detection. In face detection, it is reasonable to assume that "face" images cluster in a certain way, but "non face" images usually do not cluster since different kinds of images are included. It is difficult to model "non face" images as a single distribution in the discriminant space constructed by the usual two-class FDA. Also, the dimension of the discriminant space constructed by the usual two-class FDA is bounded by 1. This means that we cannot obtain a higher dimensional discriminant space. To overcome these drawbacks of the usual two-class FDA, the discriminant criterion of FDA is modified such that the trace of the covariance matrix of the "face" class is minimized and the sum of squared errors between the average vector of the "face" class and the feature vectors of "non face" images is maximized. By this modification a higher dimensional discriminant space can be obtained. Experiments are conducted on "face" and "non face" classification using face images gathered from the available face databases and many face images on the Web. The results show that the proposed method can outperform the support vector machine (SVM). A close relationship between the proposed kernel-based FDA and kernel-based Principal Component Analysis (PCA) is also discussed.

  • Scale Invariant Face Detection and Classification Method Using Shift Invariant Features Extracted from Log-Polar Image

    Kazuhiro HOTTA  Taketoshi MISHIMA  Takio KURITA  

     
    PAPER

      Vol:
    E84-D No:7
      Page(s):
    867-878

    This paper presents a scale invariant face detection and classification method which uses shift invariant features extracted from a Log-Polar image. Scale changes of a face in an image are represented as shift along the horizontal axis in the Log-Polar image. In order to obtain scale invariant features, shift invariant features are extracted from each row of the Log-Polar image. Autocorrelations, Fourier spectrum, and PARCOR coefficients are used as shift invariant features. These features are then combined with simple classification methods based on Linear Discriminant Analysis to realize scale invariant face detection and classification. The effectiveness of the proposed face detection method is confirmed by experiments using face images captured under different scales, backgrounds, illuminations, and dates. To evaluate the proposed face classification method, we performed experiments using 2,800 face images with 7 scales under 2 different backgrounds and face images of 52 persons.

  • Estimation of Camera Rotation Using Quasi Moment Features

    Hiroyuki SHIMAI  Toshikatsu KAWAMOTO  Takaomi SHIGEHARA  Taketoshi MISHIMA  Masaru TANAKA  Takio KURITA  

     
    PAPER

      Vol:
    E83-A No:6
      Page(s):
    1005-1013

    We present two estimation methods for camera rotation from two images obtained by the active camera before and after rotation. Based on the representation of the projected rotation group, quasi moment features are constructed. Camera rotation can be estimated by applying the singular value decomposition (SVD) or Newton's method to tensor quasi moment features. In both cases, we can estimate 3D rotation of the active camera from only two projected images. We also give some experiments for the estimation of the actual active camera rotation to show the effectiveness of these methods.

  • Effect of Additive Noise for Multi-Layered Perceptron with AutoEncoders

    Motaz SABRI  Takio KURITA  

     
    PAPER-Biocybernetics, Neurocomputing

      Publicized:
    2017/04/20
      Vol:
    E100-D No:7
      Page(s):
    1494-1504

    This paper investigates the effect of noises added to hidden units of AutoEncoders linked to multilayer perceptrons. It is shown that internal representation of learned features emerges and sparsity of hidden units increases when independent Gaussian noises are added to inputs of hidden units during the deep network training. It is also shown that the weights that connect the contaminated hidden units with the next layer have smaller values and outputs of hidden units tend to be more definite (0 or 1). This is expected to improve the generalization ability of the network through this automatic structuration by adding the noises. This network structuration was confirmed by experiments for MNIST digits classification via a deep neural network model.

  • Extraction of Combined Features from Global/Local Statistics of Visual Words Using Relevant Operations

    Tetsu MATSUKAWA  Takio KURITA  

     
    LETTER-Image Recognition, Computer Vision

      Vol:
    E93-D No:10
      Page(s):
    2870-2874

    This paper presents a combined feature extraction method to improve the performance of bag-of-features image classification. We apply 10 relevant operations to global/local statistics of visual words. Because the number of pairwise combinations of visual words is large, we apply feature selection methods including the Fisher discriminant criterion and L1-SVM. The effectiveness of the proposed method is confirmed through experiments.

  • FOREWORD Open Access

    Takio KURITA  

     
    FOREWORD

      Vol:
    E92-D No:7
      Page(s):
    1337-1337

  • Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection

    Shinji UCHINOURA  Takio KURITA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/10/23
      Vol:
    E107-D No:1
      Page(s):
    115-124

    We investigated the influence of horizontal shifts of the input images on a one-stage object detection method. We found that the object detector's class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by grid-level shifts and are used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.

  • An Efficient Search Method Based on Dynamic Attention Map by Ising Model

    Kazuhiro HOTTA  Masaru TANAKA  Takio KURITA  Taketoshi MISHIMA  

     
    PAPER

      Vol:
    E88-D No:10
      Page(s):
    2286-2295

    This paper presents a Dynamic Attention Map based on the Ising model for face detection. In general, a face detector cannot know in advance where faces are or how many faces there are. Therefore, the face detector must search all regions of the image, which requires much computational time. To speed up the search, the information obtained at previous search points should be used effectively. In order to use the likelihood of a face obtained at previous search points effectively, the Ising model is applied to face detection. The Ising model has two-state spins: "up" and "down". The state of a spin is updated depending on the neighboring spins and an external magnetic field. Ising spins are assigned to the "face" and "non-face" states of face detection. In addition, the measured likelihood of a face is integrated into the energy function of the Ising model as the external magnetic field. It is confirmed that face candidates are reduced effectively by spin flip dynamics. To improve the search performance further, the single level Ising search method is extended to the multilevel Ising search. The interactions between two layers, which are characterized by the renormalization group method, are used to reduce the face candidates. The effectiveness of the multilevel Ising search method is also confirmed by comparison with the single level Ising search method.
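The spin update described in this abstract can be illustrated with a small sketch. It assumes a 4-neighborhood on a 2D grid, a uniform coupling constant, and a deterministic (zero-temperature) update in which each spin aligns with its local field; the function name `ising_sweep`, the `coupling` parameter, and the way a face likelihood would be mapped to the external field are all hypothetical and not taken from the paper.

```python
def ising_sweep(spins, field, coupling=1.0):
    """One sequential sweep of deterministic Ising updates on a 2D grid.

    spins: 2D list of +1 ("face" candidate) or -1 ("non-face").
    field: 2D list of external field values (assumed to be derived from
           the measured face likelihood at each search point).
    """
    rows, cols = len(spins), len(spins[0])
    new = [row[:] for row in spins]
    for r in range(rows):
        for c in range(cols):
            neighbor_sum = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbor_sum += new[rr][cc]
            local_field = coupling * neighbor_sum + field[r][c]
            # Align the spin with its local field; keep it on an exact tie.
            if local_field > 0:
                new[r][c] = 1
            elif local_field < 0:
                new[r][c] = -1
    return new


# An isolated candidate surrounded by "non-face" spins and no external
# field is flipped to "non-face" by its neighbors.
grid = [[-1, -1, -1], [-1, 1, -1], [-1, -1, -1]]
zero_field = [[0.0] * 3 for _ in range(3)]
after = ising_sweep(grid, zero_field)
```

In this toy setting a strongly positive field value at a location keeps its spin "up" even when its neighbors are "down", which mirrors how a high measured likelihood would preserve a face candidate.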