
Keyword Search Results

[Keyword] tagging (8 hits)

1-8 hits
  • Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection

    Yuzhuo LIU  Hangting CHEN  Qingwei ZHAO  Pengyuan ZHANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2022/01/13
      Vol:
    E105-D No:4
      Page(s):
    828-831

    Weakly labelled semi-supervised audio tagging (AT) and sound event detection (SED) have become significant in real-world applications. A popular method is teacher-student learning, making student models learn from pseudo-labels generated by teacher models from unlabelled data. To generate high-quality pseudo-labels, we propose a master-teacher-student framework trained with a dual-lead policy. Our experiments illustrate that our model outperforms the state-of-the-art model on both tasks.
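
    As a rough illustration of the generic teacher-student step described above (not the authors' master-teacher-student framework or its dual-lead policy, which the abstract does not detail), the sketch below turns a teacher's per-class probabilities on unlabelled clips into multi-hot pseudo-labels. The names `make_pseudo_labels` and `toy_teacher`, the 0.5 threshold, and the toy features are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple


def make_pseudo_labels(
    teacher: Callable[[Sequence[float]], List[float]],
    unlabelled: List[Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[Sequence[float], List[int]]]:
    """Pair each unlabelled clip with a multi-hot label from the teacher."""
    pairs = []
    for clip in unlabelled:
        probs = teacher(clip)  # per-class probabilities in [0, 1]
        labels = [1 if p >= threshold else 0 for p in probs]
        pairs.append((clip, labels))
    return pairs


def toy_teacher(clip: Sequence[float]) -> List[float]:
    """Hypothetical teacher: scores two classes from the clip's mean and max."""
    return [sum(clip) / len(clip), max(clip)]


# A student model would then be trained on these (clip, pseudo-label) pairs.
pseudo = make_pseudo_labels(toy_teacher, [[0.4, 0.9], [0.2, 0.3]])
```

    In a real system the teacher would be a trained network and the threshold would typically be tuned per class.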

  • Phrase-Based Statistical Model for Korean Morpheme Segmentation and POS Tagging

    Seung-Hoon NA  Young-Kil KIM  

     
    PAPER-Natural Language Processing

      Publicized:
    2017/11/13
      Vol:
    E101-D No:2
      Page(s):
    512-522

    In this paper, we propose a novel phrase-based model for Korean morphological analysis that treats the phrase as the basic processing unit, which generalizes all other existing processing units. This use of phrases is largely motivated by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results using the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with the conditional random field (CRF) model, our model leads to statistically significant improvements over the state-of-the-art CRF method.

  • Character-Level Dependency Model for Joint Word Segmentation, POS Tagging, and Dependency Parsing in Chinese

    Zhen GUO  Yujie ZHANG  Chen SU  Jinan XU  Hitoshi ISAHARA  

     
    PAPER-Natural Language Processing

      Publicized:
    2015/10/06
      Vol:
    E99-D No:1
      Page(s):
    257-264

    Recent work on joint word segmentation, part-of-speech (POS) tagging, and dependency parsing in Chinese has two key problems: first, character-based word segmentation and word-based dependency parsing are not combined well in the transition-based framework; second, the joint model suffers from a shortage of annotated corpora. To resolve the first problem, we transform the traditional word-based dependency tree into a character-based dependency tree by using the internal structure of words, and then propose a novel character-level joint model for the three tasks. To resolve the second problem, we propose a novel semi-supervised joint model that exploits n-gram features and dependency subtree features from a partially annotated corpus. Experimental results on the Chinese Treebank show that our joint model achieves 98.31%, 94.84% and 81.71% for Chinese word segmentation, POS tagging, and dependency parsing, respectively, outperforming the pipeline model of the three tasks by 0.92%, 1.77% and 3.95%, respectively. In particular, the F1 values for word segmentation and POS tagging are the best among those reported to date.

  • Social Network and Tag Sources Based Augmenting Collaborative Recommender System

    Tinghuai MA  Jinjuan ZHOU  Meili TANG  Yuan TIAN  Abdullah AL-DHELAAN  Mznah AL-RODHAAN  Sungyoung LEE  

     
    PAPER-Office Information Systems, e-Business Modeling

      Publicized:
    2014/12/26
      Vol:
    E98-D No:4
      Page(s):
    902-910

    Recommender systems, which provide users with recommendations of content suited to their needs, have received great attention in today's online business world. However, most recommendation approaches exploit only a single source of input data and suffer from the data sparsity problem and the cold-start problem. To improve recommendation accuracy in this situation, additional sources of information, such as friend relationships and user-generated tags, should be incorporated into recommender systems. In this paper, we revise the user-based collaborative filtering (CF) technique and propose two recommendation approaches that fuse user-generated tags and social relations in a novel way. To evaluate the performance of our approaches, we compare experimental results with two baseline methods, user-based CF and user-based CF with weighted friendship similarity, using real datasets (Last.fm and MovieLens). Our experimental results show that our methods achieve higher accuracy. We also verify our methods in cold-start settings, where they produce more precise recommendations than the compared approaches.
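
    To make the baseline idea of user-based CF with a friendship-weighted similarity concrete, here is a minimal sketch. It is not the authors' method: the blending formula, the weight `alpha`, the function names, and the toy ratings are assumptions chosen only for illustration.

```python
import math
from typing import Dict, Optional, Set

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings: Ratings, friends: Dict[str, Set[str]],
            user: str, item: str, alpha: float = 0.7) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for `item`.

    Assumed blend: alpha * rating similarity + (1 - alpha) * friendship flag.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        is_friend = 1.0 if other in friends.get(user, set()) else 0.0
        sim = alpha * cosine(ratings[user], their) + (1 - alpha) * is_friend
        num += sim * their[item]
        den += sim
    return num / den if den > 0 else None


ratings = {
    "ann": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "cat": {"a": 1, "c": 2},
}
with_friends = predict(ratings, {"ann": {"bob"}}, "ann", "c")
without_friends = predict(ratings, {}, "ann", "c")
```

    In this toy data, marking bob as ann's friend pulls the prediction for item "c" toward bob's rating.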

  • Exploring Social Relations for Personalized Tag Recommendation in Social Tagging Systems

    Kaipeng LIU  Binxing FANG  Weizhe ZHANG  

     
    PAPER

      Vol:
    E94-D No:3
      Page(s):
    542-551

    With the emergence of Web 2.0, social tagging systems have become highly popular in recent years, giving rise to so-called folksonomies. Personalized tag recommendation in social tagging systems aims to provide a user with a ranked list of tags for a specific resource that best serves the user's needs. Many existing tag recommendation approaches assume that users are independent and identically distributed. This assumption ignores the social relations between users, which are increasingly common. In this paper, we investigate the role of social relations in the task of tag recommendation and propose a personalized collaborative filtering algorithm. In addition to the social annotations made by collaborative users, we inject the social relations between users and the content similarities between resources into a graph representation of folksonomies. To fully explore the structure of this graph, instead of computing similarities between objects using feature vectors, we compute similarities by random walks, which furthermore enables us to model a user's tag preferences with the similarities between the user and all the tags. We combine both the collaborative information and the tag preferences to recommend personalized tags to users. We conduct experiments on a dataset collected from a real-world system. The results of comparative experiments show that the proposed algorithm outperforms state-of-the-art tag recommendation algorithms in terms of prediction quality measured by precision, recall and NDCG.
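
    The abstract does not specify which random-walk formulation is used, so the following is only a sketch of one common choice, a random walk with restart, on a tiny hypothetical graph of users and tags. The function name, the restart probability of 0.15, the iteration count, and the graph itself are assumptions.

```python
from typing import Dict, List


def random_walk_with_restart(graph: Dict[str, List[str]], start: str,
                             restart: float = 0.15,
                             iters: int = 50) -> Dict[str, float]:
    """Approximate visiting probabilities of a walker that restarts at `start`.

    Every neighbour listed in `graph` must also appear as a key.
    """
    score = {node: 0.0 for node in graph}
    score[start] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            moving = (1 - restart) * score[node]
            if not neighbours:
                nxt[start] += moving  # dangling node: send mass back to start
                continue
            for m in neighbours:
                nxt[m] += moving / len(neighbours)
        nxt[start] += restart  # total mass is 1, so the restart share is `restart`
        score = nxt
    return score


# Hypothetical folksonomy fragment: friendship and user-tag edges (undirected).
graph = {
    "user:alice": ["tag:python", "user:bob"],
    "user:bob": ["user:alice", "tag:ml"],
    "tag:python": ["user:alice"],
    "tag:ml": ["user:bob"],
    "user:carol": ["tag:cooking"],
    "tag:cooking": ["user:carol"],
}
scores = random_walk_with_restart(graph, "user:alice")
ranked_tags = sorted((n for n in graph if n.startswith("tag:")),
                     key=lambda n: scores[n], reverse=True)
```

    Here the tag alice used herself ranks first, a tag reachable only through her friend ranks second, and a tag in a disconnected part of the graph receives no score.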

  • Minimizing Human Intervention for Constructing Korean Part-of-Speech Tagged Corpus

    Do-Gil LEE  Gumwon HONG  Seok Kee LEE  Hae-Chang RIM  

     
    LETTER-Natural Language Processing

      Vol:
    E93-D No:8
      Page(s):
    2336-2338

    The construction of annotated corpora requires considerable manual effort. This paper presents a pragmatic method to minimize human intervention in the construction of a Korean part-of-speech (POS) tagged corpus. Instead of focusing on improving the performance of conventional automatic POS taggers, we devise a discriminative POS tagger that can selectively produce either a single analysis or multiple analyses based on the tagging reliability. The proposed approach uses two decision rules to judge the tagging reliability. Experimental results show that the proposed approach can effectively control the quality of the corpus and the amount of manual annotation through the threshold value of the rule.
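
    The selective behaviour described above can be pictured with a small sketch. The abstract does not state the two decision rules, so the single rule used here (compare the top candidate's probability against a threshold) is an assumption, as are the function name, the 0.9 default, and the example tags.

```python
from typing import List, Tuple


def select_analyses(candidates: List[Tuple[str, float]],
                    threshold: float = 0.9) -> List[str]:
    """Return one analysis if the tagger is confident, else all candidates.

    `candidates` holds (analysis, probability) pairs for one word.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_analysis, best_prob = ranked[0]
    if best_prob >= threshold:
        return [best_analysis]  # reliable: accept automatically
    return [analysis for analysis, _ in ranked]  # unreliable: defer to a human


confident = select_analyses([("NNG", 0.97), ("VV", 0.03)])
uncertain = select_analyses([("NNG", 0.55), ("VV", 0.45)])
```

    Raising the threshold sends more words to human annotators and should raise corpus quality; lowering it reduces manual work at the cost of more automatic errors.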

  • Detecting New Words from Chinese Text Using Latent Semi-CRF Models

    Xiao SUN  Degen HUANG  Fuji REN  

     
    PAPER-Natural Language Processing

      Vol:
    E93-D No:6
      Page(s):
    1386-1393

    Chinese new words and their parts of speech (POS) are particularly problematic in Chinese natural language processing. With the rapid development of the Internet and information technology, it is impossible to obtain a complete system dictionary for Chinese natural language processing, as new words outside the basic system dictionary are constantly being created. A latent semi-CRF model, which combines the strengths of LDCRF (Latent-Dynamic Conditional Random Field) and semi-CRF, is proposed to detect new words together with their POS simultaneously, regardless of the types of the new words, from Chinese text that has not been pre-segmented. Unlike the original semi-CRF, the LDCRF is applied to generate the candidate entities for training and testing the latent semi-CRF, which speeds up training and decreases the computation cost. The complexity of the latent semi-CRF can be further adjusted by tuning the number of hidden variables in the LDCRF and the number of candidate entities taken from the N-best outputs of the LDCRF. A new-word-generating framework is proposed for model training and testing, under which the definitions and distributions of the new words conform to those found in real text. Specific features called "Global Fragment Information" for new word detection and POS tagging are adopted in model training and testing. The experimental results show that the proposed method is capable of detecting even low-frequency new words together with their POS tags. The proposed model is found to perform competitively with state-of-the-art models.

  • Joint Chinese Word Segmentation and POS Tagging Using an Error-Driven Word-Character Hybrid Model

    Canasai KRUENGKRAI  Kiyotaka UCHIMOTO  Jun'ichi KAZAMA  Yiou WANG  Kentaro TORISAWA  Hitoshi ISAHARA  

     
    PAPER-Morphological/Syntactic Analysis

      Vol:
    E92-D No:12
      Page(s):
    2298-2305

    In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe our strategies that yield a good balance for learning the characteristics of known and unknown words and propose an error-driven policy that delivers such a balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.