IEICE global.ieice.org Site

Keyword Search Result

[Keyword] microblog (6 hits)

Capacity Control of Social Media Diffusion for Real-Time Analysis System
Miki ENOKI Issei YOSHIDA Masato OGUCHI

PAPER

Publicized:
2017/01/17
Vol:
E100-D No:4
Page(s):
776-784
In Twitter-like services, countless messages are posted in real time every second all around the world. Timely knowledge about what kinds of information are diffusing in social media is quite important. For example, in emergency situations such as earthquakes, users provide instant information on their situation through social media. The collective intelligence of social media is useful as a means of information detection complementary to conventional observation. We have developed a system for monitoring and analyzing information diffusion data in real time by tracking retweeted tweets. A tweet retweeted by many users indicates that they find the content interesting and impactful. Analysts who use this system can find tweets retweeted by many users and identify the key people who are retweeted frequently by many users or who have retweeted tweets about particular topics. However, burst situations occur when thousands of social media messages are suddenly posted simultaneously, and the lack of machine resources to handle such situations lowers the system's query performance. Since our system is designed to be used interactively in real time by many analysts, waiting more than one second for query results is simply not acceptable. To maintain acceptable query performance, we propose a capacity control method for filtering incoming tweets using extra attribute information from the tweets themselves. Conventionally, there is a trade-off between the query performance and the accuracy of the analysis results. We show that the query performance is improved by our proposed method and that our method is better than the existing methods in terms of maintaining query accuracy.
Microblog Retrieval Using Ensemble of Feature Sets through Supervised Feature Selection
Abu Nowshed CHY Md Zia ULLAH Masaki AONO

PAPER

Publicized:
2017/01/17
Vol:
E100-D No:4
Page(s):
793-806
Microblogs, especially Twitter, have become an integral part of our daily life for searching for the latest news and event information. Due to the short length of tweets and the frequent use of unconventional abbreviations, content-relevance-based search cannot satisfy users' information needs. Recent research has shown that considering temporal and contextual aspects improves retrieval performance significantly. In this paper, we focus on microblog retrieval, emphasizing the alleviation of vocabulary mismatch and the leverage of the temporal (e.g., recency and burst nature) and contextual characteristics of tweets. To address the temporal and contextual aspects of tweets, we propose new features based on query-tweet time, word embedding, and query-tweet sentiment correlation. We also introduce some popularity features to estimate the importance of a tweet. A three-stage query expansion technique is applied to improve the relevance of tweets. Moreover, to determine the temporal and sentiment sensitivity of a query, we introduce query type determination techniques. After supervised feature selection, we apply random forest as a feature ranking method to estimate the importance of the selected features. Then, we use an ensemble learning-to-rank (L2R) framework to estimate the relevance of each query-tweet pair. We conducted experiments on the TREC Microblog 2011 and 2012 test collections over the TREC Tweets2011 corpus. Experimental results demonstrate the effectiveness of our method over the baseline and known related works in terms of precision at 30 (P@30), mean average precision (MAP), normalized discounted cumulative gain at 30 (NDCG@30), and R-precision (R-Prec) metrics.
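For readers unfamiliar with two of the metrics named in the abstract above, the following is a minimal sketch of precision at k (P@30 is the case k = 30) and average precision, whose mean over queries gives MAP. It uses the standard textbook definitions, not the authors' evaluation code; the function names and the toy ranking are assumptions made for illustration.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (1 = relevant, 0 = not)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[int], total_relevant: int) -> float:
    """Sum of precision values at each rank holding a relevant item,
    divided by the total number of relevant items for the query."""
    if total_relevant <= 0:
        return 0.0
    hits = 0
    score = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / rank
    return score / total_relevant


# Hypothetical ranked list for one query: relevant items at ranks 1, 3, and 4.
ranked = [1, 0, 1, 1, 0]
p_at_3 = precision_at_k(ranked, 3)
ap = average_precision(ranked, total_relevant=3)
```

MAP is then the arithmetic mean of `average_precision` over all queries in the test collection.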
Incorporation of Target Specific Knowledge for Sentiment Analysis on Microblogging
Yongyos KAEWPITAKKUN Kiyoaki SHIRAI

PAPER

Publicized:
2016/01/14
Vol:
E99-D No:4
Page(s):
959-968
Sentiment analysis of microblogging has become an important classification task because a large amount of user-generated content is published on the Internet. In Twitter, it is common for a user to express several sentiments in one tweet. Therefore, it is important to classify the polarity not of the whole tweet but of a specific target about which people express their opinions. Moreover, the performance of the machine learning approach greatly depends on the domain of the training data, and it is very time-consuming to manually annotate a large set of tweets for a specific domain. In this paper, we propose a method for sentiment classification at the target level by incorporating on-target sentiment features and user-aware features into a classifier trained automatically from data created for the specific target. An add-on lexicon, extended target list, and competitor list are also constructed as knowledge sources for the sentiment analysis. None of the processes in the proposed framework requires manual annotation. The results of our experiment show that our method is effective and improves the performance of sentiment classification compared to the baselines.
Exploring Time Aware Features in Microblog to Measure TV Ratings
Joon Yeon CHOEH Hong Joo LEE Eugene J. S. WON

LETTER-Office Information Systems, e-Business Modeling

Vol:
E97-D No:10
Page(s):
2810-2813
In measuring TV ratings, some features can be significant at a certain time, whereas they can be meaningless in other time periods. Because the importance of features can change, a model capturing this time-changing relevance is required in order to estimate TV ratings more accurately. Therefore, we focus on the time-awareness of features, particularly the time when the words of tweets are used. We develop a correlation-based, time-aware feature selection algorithm that finds the optimal time period of each feature, and an estimation method using ε-SVR based on the top-n features ordered by correlation. We find that the correlation values between features and TV ratings vary according to the time of posting (before and after the broadcast time). This implies that the relevance of features can change according to the time of the tweets. Experimental results indicate that the proposed method performs better than a method based on count-based features. This result implies that understanding the time-dependency of features can be helpful in improving the accuracy of measuring TV ratings.
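The selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each feature is a word count per episode, that time periods are discrete named windows, and that the "optimal" window is the one with the largest absolute Pearson correlation to the ratings. All names and data are hypothetical, and the regression step is omitted.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_window_per_feature(counts, ratings):
    """For each feature, pick the time window whose counts correlate
    most strongly (in absolute value) with the ratings."""
    best = {}
    for feature, by_window in counts.items():
        scored = ((w, pearson(c, ratings)) for w, c in by_window.items())
        best[feature] = max(scored, key=lambda wr: abs(wr[1]))
    return best


def top_n_features(best, n):
    """Order features by absolute correlation and keep the first n."""
    return sorted(best, key=lambda f: abs(best[f][1]), reverse=True)[:n]


# Hypothetical data: ratings for four episodes and word counts per window.
ratings = [5.0, 6.0, 7.0, 8.0]
counts = {
    "finale": {"before": [1, 2, 3, 4], "after": [4, 1, 3, 2]},
    "boring": {"before": [2, 2, 2, 2], "after": [8, 6, 4, 2]},
}
best = best_window_per_feature(counts, ratings)
selected = top_n_features(best, 2)
```

In this toy data, "finale" is most informative before the broadcast and "boring" after it, which mirrors the abstract's observation that a feature's relevance depends on when the tweet was posted.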
Creating Stories from Socially Curated Microblog Messages
Akisato KIMURA Kevin DUH Tsutomu HIRAO Katsuhiko ISHIGURO Tomoharu IWATA Albert AU YEUNG

PAPER-Artificial Intelligence, Data Mining

Vol:
E97-D No:6
Page(s):
1557-1566
Social media such as microblogs have become so pervasive that it is now possible to use them as sensors for real-world events and memes. While much recent research has focused on developing automatic methods for filtering and summarizing these data streams, we explore a different trend called social curation. In contrast to automatic methods, social curation is characterized as a human-in-the-loop and sometimes crowd-sourced mechanism for exploiting social media as sensors. Although social curation web services like Togetter, Naver Matome and Storify are gaining popularity, little academic research has studied the phenomenon. In this paper, our goal is to investigate the phenomenon and potential of this new field of social curation. First, we perform an in-depth analysis of a large corpus of curated microblog data. We seek to understand why and how people participate in this laborious curation process. We then explore new ways in which information retrieval and machine learning technologies can be used to assist curators. In particular, we propose a novel method based on a learning-to-rank framework that increases the curator's productivity and breadth of perspective by suggesting which novel microblogs should be added to the curated content.
Affect Computation of Chinese Short Text
Xia MAO Lin JIANG Yuli XUE

LETTER-Natural Language Processing

Vol:
E95-D No:11
Page(s):
2741-2744
Microblogging is a rising social network service with distinguishing features such as simplicity and convenience; it has already attracted a large number of users and triggered a massive information explosion concerning individuals' own statuses and opinions. While sentiment analysis of the messages in microblogs is of great value, most present studies are on English microblogs and few are on Chinese microblogs. Compared to English, Chinese has its own unique expression style, such as the absence of spaces or other word delimiters. Furthermore, Chinese short text also has its own properties. Thus we are inspired to explore effective features for sentiment classification of Chinese short text. In this paper, we propose to study user-related sentiment classification of Chinese microblogs in terms of statistical and semantic characteristics, and design the corresponding features: ratio of positive words and negative words (PNR), position feature (POS), collocation of verbs (COL), and auxiliary words (AU). Then we employ an SVM-based method to classify the sentiment. Experiments show that the features we design are effective in recognizing the sentiment of messages in microblogs.
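As a rough illustration of the PNR feature named in the abstract above, the sketch below computes a ratio of positive to negative words for one message. The abstract does not give the exact formula, so the add-one smoothing, the tiny lexicons, and the assumption that the text is already segmented into words are all hypothetical choices made here, not the authors' definition.

```python
def pnr(tokens, positive, negative, smoothing=1.0):
    """Ratio of positive to negative word counts for one message.

    The smoothing term is an assumption added here to avoid division
    by zero when a message contains no negative words.
    """
    pos = sum(1 for t in tokens if t in positive)
    neg = sum(1 for t in tokens if t in negative)
    return (pos + smoothing) / (neg + smoothing)


# Hypothetical sentiment lexicons and pre-segmented messages.
positive_words = {"开心", "喜欢"}
negative_words = {"累", "讨厌"}

mixed = ["今天", "很", "开心", "但是", "有点", "累"]
happy = ["喜欢", "开心"]

mixed_score = pnr(mixed, positive_words, negative_words)
happy_score = pnr(happy, positive_words, negative_words)
```

A value above 1 suggests more positive than negative words. In a setup like the one described, such a value would be one component of the feature vector passed to the SVM classifier, alongside the other features.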