1-3hit |
Xuan WANG Bofeng ZHANG Mingqing HUANG Furong CHANG Zhuocheng ZHOU
When individuals make a purchase from online sources, they may lack first-hand knowledge of the product. In such cases, they will judge the quality of the item by the reviews other consumers have posted. Therefore, it is significant to determine whether comments about a product are credible. Most often, conventional research on comment credibility has employed supervised machine learning methods, which have the disadvantage of needing large quantities of training data. This paper proposes an unsupervised method for judging comment credibility based on the Biterm Sentiment Latent Dirichlet Allocation (BS-LDA) model. Using this approach, first we derived some distributions and calculated each comment's credibility score via them. A comment's credibility was judged based on whether it achieved a threshold score. Our experimental results using comments from Amazon.com demonstrated that the overall performance of our approach can play an important role in determining the credibility of comments in some situation.
Daichi KITAMURA Hiroshi SARUWATARI Kosuke YAGI Kiyohiro SHIKANO Yu TAKAHASHI Kazunobu KONDO
In this letter, we address monaural source separation based on supervised nonnegative matrix factorization (SNMF) and propose a new penalized SNMF. Conventional SNMF often degrades the separation performance owing to the basis-sharing problem. Our penalized SNMF forces nontarget bases to become different from the target bases, which increases the separated sound quality.
In-Su KANG Seung-Hoon NA Jong-Hyeok LEE
Compound noun segmentation is a key component for Korean language processing. Supervised approaches require some types of human intervention such as maintaining lexicons, manually segmenting the corpora, or devising heuristic rules. Thus, they suffer from the unknown word problem, and cannot distinguish domain-oriented or corpus-directed segmentation results from the others. These problems can be overcome by unsupervised approaches that employ segmentation clues obtained purely from a raw corpus. However, most unsupervised approaches require tuning of empirical parameters or learning of the statistical dictionary. To develop a tuning-less, learning-free unsupervised segmentation algorithm, this study proposes a pruning-based unsupervised technique that eliminates unhelpful segmentation candidates. In addition, unlike previous unsupervised methods that have relied on purely character-based segmentation clues, this study utilizes word-based segmentation clues. Experimental evaluations show that the pruning scheme is very effective to unsupervised segmentation of Korean compound nouns, and the use of word-based prior knowledge enables better segmentation accuracy. This study also shows that the proposed algorithm performs competitively with or better than other unsupervised methods.