IEICE global.ieice.org Site

Keyword Search Result

[Keyword] naive Bayes (8 hits)

Machine Learning-Based Approach for Depression Detection in Twitter Using Content and Activity Features
Hatoon S. ALSAGRI Mourad YKHLEF

PAPER-Data Engineering, Web Information Systems

Publicized: 2020/04/24
Vol: E103-D No: 8
Page(s): 1825-1832
Social media channels, such as Facebook, Twitter, and Instagram, have altered our world forever. People are now more connected than ever and reveal a sort of digital persona. Although social media certainly has several remarkable features, its demerits are undeniable as well. Recent studies have indicated a correlation between heavy use of social media sites and increased depression. The present study aims to exploit machine learning techniques for detecting a probably depressed Twitter user based on both his/her network behavior and tweets. For this purpose, we trained and tested classifiers to distinguish whether a user is depressed or not using features extracted from his/her activities in the network and tweets. The results showed that the more features are used, the higher the accuracy and F-measure scores in detecting depressed users. This method is a data-driven, predictive approach for early detection of depression or other mental illnesses. This study's main contribution is the exploration of the features and their impact on detecting the depression level.
Hardware-Accelerated Secured Naïve Bayesian Filter Based on Partially Homomorphic Encryption
Song BIAN Masayuki HIROMOTO Takashi SATO

PAPER-Cryptography and Information Security

Vol: E102-A No: 2
Page(s): 430-439
In this work, we provide the first practical secure email filtering scheme based on homomorphic encryption. Specifically, we construct a secure naïve Bayesian filter (SNBF) using the Paillier scheme, a partially homomorphic encryption (PHE) scheme. We first show that SNBF can be implemented with only the additive homomorphism, thus eliminating the need to employ expensive fully homomorphic schemes. In addition, the design space for a specialized hardware architecture realizing SNBF is explored. We utilize a recursive Karatsuba Montgomery structure to accelerate the homomorphic operations, in which multiplications of 2048-bit integers are carried out. In the experiments, both software and hardware versions of the SNBF are implemented. In software, 10^4-10^5x runtime and 10^3x storage reductions are achieved by SNBF when compared to existing fully homomorphic approaches. By instantiating the designed hardware for SNBF, a further 33x runtime and 1919x power reduction are achieved. The proposed hardware implementation classifies an average-length email in under 0.5s, which is much more practical than existing solutions.
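As an illustration of why additive homomorphism can be enough for a naive Bayes style score, the following minimal sketch implements textbook Paillier encryption with toy parameters and sums encrypted integer weights without decrypting them. It is not the SNBF protocol from the paper: the tiny primes, the function names, and the idea of using integer-scaled per-word weights are all assumptions made for the example, and the hardware acceleration is not modeled.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation. The default primes are illustrative
    only; a real deployment would use primes of about 1024 bits each."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # standard simple choice of g
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt a non-negative integer m < n as g^m * r^n mod n^2."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def encrypted_score(pub, encrypted_weights):
    """Sum encrypted per-word weights, as a log-domain naive Bayes score
    would require, without ever seeing the plaintext weights."""
    total = encrypt(pub, 0)
    for c in encrypted_weights:
        total = add_encrypted(pub, total, c)
    return total
```

For example, encrypting the hypothetical weights 12, 7, and 30, combining them with `encrypted_score`, and decrypting the result yields 49, the plaintext sum.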
Empirical Studies of a Kernel Density Estimation Based Naive Bayes Method for Software Defect Prediction
Haijin JI Song HUANG Xuewei LV Yaning WU Yuntian FENG

PAPER-Software Engineering

Publicized: 2018/10/03
Vol: E102-D No: 1
Page(s): 75-84
Software defect prediction (SDP) plays a significant part in allocating testing resources reasonably, reducing testing costs, and ensuring software quality. One of the most widely used algorithms in SDP models is Naive Bayes (NB) because of its simplicity, effectiveness and robustness. In NB, continuous or numeric attributes are generally assumed to follow normal distributions, and the probability density function of the normal distribution is incorporated into their conditional probability estimates. However, after conducting a Kolmogorov-Smirnov test, we find that the 21 main software metrics follow non-normal distributions at the 5% significance level. Therefore, this paper proposes an improved NB approach, which estimates the conditional probabilities of NB with kernel density estimation on the training data sets, to help improve the prediction accuracy of NB for SDP. To evaluate the proposed method, we carry out experiments on 34 software releases obtained from 10 open source projects provided by the PROMISE repository. Four well-known classification algorithms are included for comparison, namely Naive Bayes, Support Vector Machine, Logistic Regression and Random Tree. The obtained results show that this new method is more successful than the four well-known classification algorithms in most software releases.
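The idea of replacing the normal-distribution assumption with a kernel density estimate can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the normal-reference rule-of-thumb bandwidth, the density floor, and all names are assumptions chosen for the example.

```python
import math

def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth h = 1.06 * std * n^(-1/5).
    Falls back to 1.0 when the spread cannot be estimated (assumption)."""
    n = len(samples)
    if n < 2:
        return 1.0
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2) or 1.0

def gaussian_kde_pdf(x, samples, bandwidth):
    """Density at x: the average of Gaussian kernels centred on each sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

class KdeNaiveBayes:
    """Naive Bayes whose per-feature class-conditional densities are
    estimated with KDE instead of a fitted normal distribution."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors = {}
        self.columns = {}  # class -> list of per-feature training values
        for c in self.classes:
            rows = [row for row, label in zip(X, y) if label == c]
            self.priors[c] = len(rows) / len(X)
            self.columns[c] = [list(col) for col in zip(*rows)]
        return self

    def log_posterior(self, x, c):
        """Unnormalised log posterior: log prior plus summed log densities."""
        score = math.log(self.priors[c])
        for value, samples in zip(x, self.columns[c]):
            h = rule_of_thumb_bandwidth(samples)
            density = gaussian_kde_pdf(value, samples, h)
            score += math.log(max(density, 1e-300))  # floor avoids log(0)
        return score

    def predict(self, x):
        return max(self.classes, key=lambda c: self.log_posterior(x, c))
```

A model would be trained with `KdeNaiveBayes().fit(X, y)`, where each row of `X` holds the metric values of one module and `y` holds the defect labels, and queried with `predict(x)` for a new module.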
Naive Bayes Classifier Based Partitioner for MapReduce
Lei CHEN Wei LU Ergude BAO Liqiang WANG Weiwei XING Yuanyuan CAI

PAPER-Graphs and Networks

Vol: E101-A No: 5
Page(s): 778-786
MapReduce is an effective framework for processing large datasets in parallel over a cluster. Data locality and data skew on the reduce side are two essential issues in MapReduce. Improving data locality can decrease network traffic by moving reduce tasks to the nodes where the reducer input data is located. Data skew will lead to load imbalance among reducer nodes. Partitioning is an important feature of MapReduce because it determines the reducer nodes to which map output results will be sent. Therefore, an effective partitioner can improve MapReduce performance by increasing data locality and decreasing data skew on the reduce side. Previous studies considering both essential issues can be divided into two categories: those that preferentially improve data locality, such as LEEN, and those that preferentially improve load balance, such as CLP. However, all these studies ignore the fact that for different types of jobs, the priority of data locality and data skew on the reduce side may produce different effects on the execution time. In this paper, we propose a naive Bayes classifier based partitioner, namely, BAPM, which achieves better performance because it can automatically choose the proper algorithm (LEEN or CLP) by leveraging the naive Bayes classifier, i.e., considering job type and bandwidth as classification attributes. Our experiments are performed in a Hadoop cluster, and the results show that BAPM boosts the computing performance of MapReduce. The selection accuracy reaches 95.15%. Further, compared with other popular algorithms, under specific bandwidths, the improvement BAPM achieved is up to 31.31%.
Scalable Privacy-Preserving Data Mining with Asynchronously Partitioned Datasets
Hiroaki KIKUCHI Daisuke KAGAWA Anirban BASU Kazuhiko ISHII Masayuki TERADA Sadayuki HONGO

PAPER-Public Key Based Protocols

Vol: E96-A No: 1
Page(s): 111-120
In the Naive Bayes classification problem using a vertically partitioned dataset, the conventional scheme to preserve privacy of each partition uses a secure scalar product and is based on the assumption that the data is synchronized amongst common unique identities. In this paper, we attempt to discard this assumption in order to develop a more efficient and secure scheme to perform classification with minimal disclosure of private data. Our proposed scheme is based on the work by Vaidya and Clifton [2], which uses commutative encryption to perform secure set intersection so that the parties with access to the individual partitions have no knowledge of the intersection. The evaluations presented in this paper are based on experimental results, which show that our proposed protocol scales well with large sparse datasets.
Comparative Analysis of Automatic Exudate Detection between Machine Learning and Traditional Approaches
Akara SOPHARAK Bunyarit UYYANONVARA Sarah BARMAN Thomas WILLIAMSON

PAPER-Biological Engineering

Vol: E92-D No: 11
Page(s): 2264-2271
To prevent blindness from diabetic retinopathy, periodic screening and early diagnosis are necessary. Due to the lack of expert ophthalmologists in rural areas, automated early detection of exudates (one of the visible signs of diabetic retinopathy) could help to reduce the incidence of blindness in diabetic patients. Traditional automatic exudate detection methods are based on specific parameter configurations, while machine learning approaches, which seem more flexible, may be computationally costly. A comparative analysis of traditional and machine learning approaches to exudate detection, namely mathematical morphology, fuzzy c-means clustering, naive Bayesian classifier, Support Vector Machine and Nearest Neighbor classifier, is presented. Detected exudates are validated against expert ophthalmologists' hand-drawn ground truths. The sensitivity, specificity, precision, accuracy and time complexity of each method are also compared.
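The evaluation measures named in this abstract have standard definitions in terms of confusion-matrix counts. The sketch below computes them from true/false positive and negative counts; the function name and the choice to return NaN for an undefined ratio are assumptions for the example, and the abstract does not say whether counts are taken per pixel or per image.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: exudate regions detected / missed;
    tn/fp: non-exudate regions correctly rejected / wrongly flagged.
    """
    def ratio(num, den):
        # An undefined ratio (empty denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),            # true positive rate
        "specificity": ratio(tn, tn + fp),            # true negative rate
        "precision": ratio(tp, tp + fp),              # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

With the hypothetical counts `tp=80, fp=10, tn=90, fn=20`, this gives a sensitivity of 0.8, a specificity of 0.9, and an accuracy of 0.85.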
Semi-Supervised Learning to Classify Evaluative Expressions from Labeled and Unlabeled Texts
Yasuhiro SUZUKI Hiroya TAKAMURA Manabu OKUMURA

PAPER

Vol: E90-D No: 10
Page(s): 1516-1522
In this paper, we present a method to automatically acquire a large-scale vocabulary of evaluative expressions from a large corpus of blogs. For this purpose, we present a semi-supervised method for classifying evaluative expressions, that is, tuples of subjects, their attributes, and evaluative words that indicate either favorable or unfavorable opinions towards a specific subject. Due to its characteristics, our semi-supervised method can classify evaluative expressions in a corpus by their polarities, starting from a very small set of seed training examples and using contextual information in the sentences the expressions belong to. Our experimental results with real Weblog data as our corpus show that this bootstrapping approach can improve the accuracy of methods for classifying favorable and unfavorable opinions. We also show that a reasonable number of evaluative expressions can actually be acquired.
Topic Document Model Approach for Naive Bayes Text Classification
Sang-Bum KIM Hae-Chang RIM Jin-Dong KIM

LETTER-Natural Language Processing

Vol: E88-D No: 5
Page(s): 1091-1094
The multinomial naive Bayes model has been widely used for probabilistic text classification. However, the parameter estimation for this model sometimes generates inappropriate probabilities. In this paper, we propose a topic document model for multinomial naive Bayes text classification, where the parameters are estimated from normalized term frequencies of each training document. Experiments are conducted on the Reuters-21578 and 20 Newsgroups collections, and our proposed approach obtained a significant improvement in performance compared to the traditional multinomial naive Bayes.
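One way to read "parameters estimated from normalized term frequencies of each training document" is sketched below: each document contributes its length-normalized term frequencies, so a long document does not dominate the class estimate. The abstract does not give the exact estimator, so the normalization by document length, the additive smoothing constant `alpha`, and all names here are assumptions rather than the authors' formulation.

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0):
    """Estimate P(word | class) from per-document normalized term frequencies.

    docs: list of token lists; labels: one class label per document.
    Each non-empty document adds a total mass of 1 to its class, spread
    over its words in proportion to their frequency.
    """
    vocab = sorted({w for tokens in docs for w in tokens})
    mass = defaultdict(Counter)
    doc_count = Counter(labels)
    for tokens, label in zip(docs, labels):
        for word, count in Counter(tokens).items():
            mass[label][word] += count / len(tokens)
    model = {}
    for label, n_docs in doc_count.items():
        denom = n_docs + alpha * len(vocab)  # total mass plus smoothing
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_prob": {w: math.log((mass[label][w] + alpha) / denom) for w in vocab},
        }
    return model

def classify(model, tokens):
    """Return the class with the highest log posterior; unseen words are skipped."""
    def score(label):
        entry = model[label]
        return entry["log_prior"] + sum(
            entry["log_prob"][w] for w in tokens if w in entry["log_prob"]
        )
    return max(model, key=score)
```

Because every training document carries the same total weight, repeating a word many times inside one long document raises that word's probability less than it would under raw term counts.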
