
Keyword Search Result

[Keyword] pLSA (7 hits)

1-7 hits
  • Self-Learning pLSA Model for Abnormal Behavior Detection in Crowded Scenes

    Shuoyan LIU  Enze YANG  Kai FANG  

     
    LETTER-Pattern Recognition

    Publicized: 2020/11/30
    Vol: E104-D No:3
    Page(s): 473-476

    Abnormal behavior detection is now a widely studied research field, especially for crowded scenes. However, most traditional unsupervised approaches suffer when the normal events in a scene exhibit large visual variety. This paper proposes a self-learning probabilistic Latent Semantic Analysis model, which aims to take full advantage of high-level abnormal information to address this problem. We select informative observations from the training set to construct “reference events” that serve as a high-level guidance cue. Specifically, the training set is randomly divided into two separate subsets. One is used to learn the model and defines the initial sequence of “reference events”. The other is used to update the model, with infrequent samples added to the “reference events”. Finally, we define anomalies as the events that are least similar to the “reference events”. The experimental results demonstrate that the proposed model can detect anomalies accurately and robustly in real-world crowd environments.
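
    A minimal Python sketch of the “reference events” scoring idea, under assumptions not stated in the abstract: events are already represented as per-event topic distributions (e.g., from a pLSA/NMF model fitted to bag-of-visual-words histograms), similarity is cosine, and the self-learning update is reduced to a simple threshold rule. The function names and thresholds are illustrative, not the paper's.

      import numpy as np

      def cosine_sim(a, b):
          # cosine similarity between two topic-distribution vectors
          return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

      def anomaly_scores(events, reference_events):
          # score each event by its dissimilarity to the closest reference event
          return np.array([1.0 - max(cosine_sim(e, r) for r in reference_events)
                           for e in events])

      def update_references(reference_events, events, scores, low=0.2, high=0.5):
          # self-learning step (simplified): absorb infrequent-but-plausible events so
          # the reference set keeps covering the visual variety of normal behavior
          for e, s in zip(events, scores):
              if low < s < high:
                  reference_events.append(e)
          return reference_events

      # toy usage with random topic distributions
      rng = np.random.default_rng(0)
      refs = [rng.dirichlet(np.ones(10)) for _ in range(20)]
      test = [rng.dirichlet(np.ones(10)) for _ in range(5)]
      print(anomaly_scores(test, refs))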

  • Multi-Scale Multi-Level Generative Model in Scene Classification

    Wenjie XIE  De XU  Yingjun TANG  Geng CUI  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E94-D No:1
    Page(s): 167-170

    Previous work has shown that the probabilistic Latent Semantic Analysis (pLSA) model is one of the best generative models for scene categorization and can achieve acceptable classification accuracy. However, this method uses a fixed number of topics to construct the final image representation; in this way it restricts the image description to one level of visual detail and cannot reach higher accuracy. To address this problem, we propose a novel generative model, referred to as the multi-scale multi-level probabilistic Latent Semantic Analysis model (msml-pLSA). The method consists of two parts: a multi-scale part, which extracts visual detail from the image at diverse resolutions, and a multi-level part, which combines multiple levels of topic representation to model the scene. The msml-pLSA model thus allows fine and coarse local image detail to be described in one framework. The proposed method is evaluated on the well-known scene classification dataset with 15 scene categories, and the experimental results show that msml-pLSA improves classification accuracy compared with typical classification methods.
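
    A rough Python sketch of the multi-scale multi-level construction, with assumptions: scikit-learn's NMF stands in for pLSA, the per-scale bag-of-visual-words histograms are simulated with random counts, and the topic sizes are arbitrary. The point is only to show topic mixtures from several scales and several topic counts being concatenated into one descriptor.

      import numpy as np
      from sklearn.decomposition import NMF

      def msml_descriptor(histograms_per_scale, topic_sizes=(5, 10, 20)):
          # histograms_per_scale: list of (n_images, vocab_size) count matrices,
          # one matrix per image resolution (the multi-scale part)
          parts = []
          for H in histograms_per_scale:
              for k in topic_sizes:          # the multi-level part: several topic counts
                  W = NMF(n_components=k, init="nndsvda", max_iter=400).fit_transform(H)
                  W = W / (W.sum(axis=1, keepdims=True) + 1e-12)   # per-image topic mixtures
                  parts.append(W)
          return np.hstack(parts)            # concatenated multi-scale multi-level vector

      rng = np.random.default_rng(0)
      scales = [rng.integers(0, 5, size=(30, 200)).astype(float) for _ in range(3)]
      print(msml_descriptor(scales).shape)   # (30, 3 * (5 + 10 + 20))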

  • Multiple Object Category Detection and Localization Using Generative and Discriminative Models

    Dipankar DAS  Yoshinori KOBAYASHI  Yoshinori KUNO  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E92-D No:10
    Page(s): 2112-2121

    This paper proposes an integrated approach to simultaneous detection and localization of multiple object categories using both generative and discriminative models. Our approach first generates a set of hypotheses for each object category using a generative model (pLSA) with a bag of visual words representing each object. Based on the variation of objects within a category, the pLSA model automatically fits to an optimal number of topics. The discriminative part then verifies each hypothesis using a multi-class SVM classifier with merging features that combine the spatial shape and appearance of an object. In the post-processing stage, environmental context information along with the probabilistic output of the SVM classifier is used to improve the overall performance of the system. Our integrated approach with merging features and context information allows reliable detection and localization of various object categories in the same image. The performance of the proposed framework is evaluated on various standard datasets (MIT-CSAIL, UIUC, TUD, etc.) and on the authors' own datasets. In experiments we achieved results superior to several state-of-the-art methods on a number of standard datasets. An extensive experimental evaluation on up to ten diverse object categories over thousands of images demonstrates that our system detects and localizes multiple objects within an image in the presence of cluttered backgrounds, substantial occlusion, and significant scale changes.
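
    A compact Python sketch of the hypothesize-then-verify pipeline described above, with heavy simplifications: NMF stands in for pLSA, region extraction and the spatial-shape/context cues are omitted, and all data are random. Only the control flow (generative topic mixture as hypothesis, multi-class SVM as verifier) mirrors the abstract.

      import numpy as np
      from sklearn.decomposition import NMF
      from sklearn.svm import SVC

      def detect(regions, topic_model, svm, accept_margin=0.0):
          # regions: list of (bbox, bag-of-words histogram) pairs already extracted
          detections = []
          for bbox, hist in regions:
              theta = topic_model.transform(hist.reshape(1, -1))   # generative hypothesis
              theta = theta / (theta.sum() + 1e-12)                # normalised topic mixture
              label = int(svm.predict(theta)[0])                   # discriminative verification
              score = float(svm.decision_function(theta).max())
              if score > accept_margin:                            # keep confident detections
                  detections.append((bbox, label, score))
          return detections

      # toy usage with random training histograms and labels
      rng = np.random.default_rng(1)
      train = rng.integers(0, 5, size=(40, 100)).astype(float)
      topic_model = NMF(n_components=8, init="nndsvda", max_iter=400).fit(train)
      Z = topic_model.transform(train)
      Z = Z / (Z.sum(axis=1, keepdims=True) + 1e-12)               # normalised topic mixtures
      svm = SVC(kernel="linear").fit(Z, rng.integers(0, 3, 40))
      regions = [((0, 0, 32, 32), rng.integers(0, 5, 100).astype(float)) for _ in range(4)]
      print(detect(regions, topic_model, svm))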

  • Category Constrained Learning Model for Scene Classification

    Yingjun TANG  De XU  Guanghua GU  Shuoyan LIU  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E92-D No:2
    Page(s): 357-360

    We present a novel model, named Category Constraint-Latent Dirichlet Allocation (CC-LDA), to learn and recognize natural scene categories. Previous work had to resort to an additional classifier after obtaining the image topic representation. Our model incorporates the category information into topic inference, so each category is represented by its own topic simplex and topic size, which is consistent with human cognitive habits. The significant feature of our model is that it can perform discrimination without an additional classifier, at the same time as it obtains the topic representation. We investigate the classification performance on tasks with varying numbers of scene categories. The experiments demonstrate that our learning model achieves better performance with less training data.
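
    A hedged Python sketch of the core idea of classifying directly from the topic model, with no separate classifier on top: one topic model is fitted per category, and a test image takes the label of the model that assigns it the highest likelihood. scikit-learn's LDA is used as a stand-in, the per-category topic count is arbitrary, and this reflects the spirit of CC-LDA rather than its exact formulation.

      import numpy as np
      from sklearn.decomposition import LatentDirichletAllocation

      def fit_per_category(histograms, labels, n_topics=5):
          # one topic model per scene category (category information enters inference)
          return {c: LatentDirichletAllocation(n_components=n_topics, random_state=0)
                     .fit(histograms[labels == c])
                  for c in np.unique(labels)}

      def classify(models, histogram):
          # label = category whose model gives the highest approximate log-likelihood
          return max(models, key=lambda c: models[c].score(histogram.reshape(1, -1)))

      # toy usage with random bag-of-words counts and three categories
      rng = np.random.default_rng(0)
      X = rng.integers(0, 6, size=(60, 150)).astype(float)
      y = rng.integers(0, 3, 60)
      print(classify(fit_per_category(X, y), X[0]))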

  • Adaptively Combining Local with Global Information for Natural Scenes Categorization

    Shuoyan LIU  De XU  Xu YANG  

     
    LETTER-Image Recognition, Computer Vision

    Vol: E91-D No:7
    Page(s): 2087-2090

    This paper proposes the Extended Bag-of-Visterms (EBOV) representation for semantic scenes. In most previous methods, the representation is a bag-of-visterms (BOV), where visterms refer to quantized local texture information. Our new representation is built by introducing global texture information to extend the standard bag-of-visterms. In particular, we apply an adaptive weight to fuse the local and global information together in order to provide a better visterm representation. Given these representations, scene classification can be performed with a pLSA (probabilistic Latent Semantic Analysis) model. The experimental results show that the appropriate use of global information improves the performance of scene classification compared with the BOV representation, which takes only local information into account.
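
    A minimal Python sketch of the fusion step only: the local bag-of-visterms histogram and a global texture histogram are normalised and concatenated under a weight. The adaptive weighting rule below is a placeholder heuristic, not the paper's, and the downstream pLSA classification stage is omitted.

      import numpy as np

      def ebov(local_hist, global_hist, alpha=None):
          local_hist = local_hist / (local_hist.sum() + 1e-12)      # local visterm histogram
          global_hist = global_hist / (global_hist.sum() + 1e-12)   # global texture histogram
          if alpha is None:
              # placeholder "adaptive" weight: trust the global cue more when it is peaked
              alpha = float(global_hist.max())
          return np.concatenate([(1 - alpha) * local_hist, alpha * global_hist])

      rng = np.random.default_rng(0)
      local = rng.integers(0, 5, 300).astype(float)       # e.g. 300 quantised local visterms
      global_tex = rng.integers(0, 5, 64).astype(float)   # e.g. 64-bin global texture descriptor
      print(ebov(local, global_tex).shape)                # (364,)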

  • Language Modeling Using PLSA-Based Topic HMM

    Atsushi SAKO  Tetsuya TAKIGUCHI  Yasuo ARIKI  

     
    PAPER-Language Modeling

    Vol: E91-D No:3
    Page(s): 522-528

    In this paper, we propose a PLSA-based language model for sports-related live speech. The model is implemented using a unigram rescaling technique that combines a topic model with an n-gram. In the conventional method, unigram rescaling is performed with a topic distribution estimated from the recognized transcription history. This can improve performance, but it cannot express topic transitions. By incorporating the concept of topic transition, the recognition performance is expected to improve further. The proposed method therefore employs a "Topic HMM" instead of a history to estimate the topic distribution. The Topic HMM is an ergodic HMM that expresses typical topic distributions as well as topic transition probabilities. Word accuracy results from our experiments confirmed the superiority of the proposed method over a trigram model and a conventional PLSA-based method that uses the recognized history.
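
    A small Python sketch of unigram rescaling, the technique the abstract builds on: n-gram probabilities are scaled by the ratio of a topic-conditioned unigram to the background unigram and renormalised over the vocabulary. In the paper the topic distribution comes from the Topic HMM; here it is simply an input vector, and the exponent is an assumed tuning parameter.

      import numpy as np

      def unigram_rescale(p_ngram, p_topic_unigram, p_background_unigram, beta=1.0):
          # all arguments are probability vectors over the same vocabulary
          scaled = p_ngram * (p_topic_unigram / np.maximum(p_background_unigram, 1e-12)) ** beta
          return scaled / scaled.sum()

      rng = np.random.default_rng(0)
      V = 1000
      p_ngram, p_topic, p_bg = (rng.dirichlet(np.ones(V)) for _ in range(3))
      print(unigram_rescale(p_ngram, p_topic, p_bg).sum())   # ~= 1.0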

  • Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions

    Yuya AKITA  Tatsuya KAWAHARA  

     
    PAPER-Spoken Language Systems

    Vol: E88-D No:3
    Page(s): 439-445

    Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied to automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. The baseline language model is a mixture of two models trained with different corpora covering various topics and speakers, respectively. Probabilistic latent semantic analysis (PLSA) is then performed on the same respective corpora and the initial ASR result to provide two sets of unigram probabilities conditioned on the input speech, with regard to topics and speaker characteristics, respectively. Finally, the baseline model is adapted by scaling N-gram probabilities with these unigram probabilities. For speaker adaptation, we make use of a portion of the Corpus of Spontaneous Japanese (CSJ), in which a large number of speakers gave talks on given topics. Experimental evaluation on real discussions showed that both topic and speaker adaptation reduced test-set perplexity, for a combined average reduction rate of 8.5%. Furthermore, the proposed adaptation method also improved word accuracy.
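
    A sketch of the two-factor scaling described above, in the same spirit as the previous example: the baseline N-gram distribution is scaled by both a topic-conditioned and a speaker-conditioned unigram (each as a PLSA-derived ratio against the background unigram) and renormalised. The exponents and the way the two factors are combined are assumptions for illustration, not the paper's exact formulation.

      import numpy as np

      def adapt_ngram(p_base, p_topic, p_speaker, p_background,
                      beta_topic=0.5, beta_speaker=0.5):
          # scale baseline N-gram probabilities by topic- and speaker-conditioned unigram ratios
          ratio_topic = (p_topic / np.maximum(p_background, 1e-12)) ** beta_topic
          ratio_speaker = (p_speaker / np.maximum(p_background, 1e-12)) ** beta_speaker
          scaled = p_base * ratio_topic * ratio_speaker
          return scaled / scaled.sum()

      rng = np.random.default_rng(1)
      V = 500
      p_base, p_topic, p_speaker, p_bg = (rng.dirichlet(np.ones(V)) for _ in range(4))
      print(adapt_ngram(p_base, p_topic, p_speaker, p_bg).sum())   # ~= 1.0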