
Author Search Result

[Author] Akito MONDEN (13 hits)

  • Industry Application of Software Development Task Measurement System: TaskPit

    Pawin SUTHIPORNOPAS  Pattara LEELAPRUTE  Akito MONDEN  Hidetake UWANO  Yasutaka KAMEI  Naoyasu UBAYASHI  Kenji ARAKI  Kingo YAMADA  Ken-ichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2016/12/20
    Vol: E100-D No:3
    Page(s): 462-472

    To identify problems in a software development process, we have been developing an automated measurement tool called TaskPit, which monitors software development tasks such as programming, testing and documentation based on the execution history of software applications. This paper introduces the system requirements, design, and implementation of TaskPit, and then presents two real-world case studies applying TaskPit to actual software development. In the first case study, we applied TaskPit to 12 software developers in a certain software development division. Several concerns to be addressed were revealed, such as (a) a project leader spent too much time on development tasks although he was supposed to act as a manager rather than a developer, (b) several developers rarely used e-mail despite the company's instruction to use e-mail as much as possible to leave communication records during development, and (c) several developers wrote excessively long e-mails to their customers. In the second case study, we recorded the planned, actual, and self-reported times of development tasks. We found that (d) unplanned tasks arose on more than half of the days, and (e) the self-reported time converged day by day toward the actual time measured by TaskPit. These findings suggest that TaskPit is useful not only for a project manager who is responsible for process monitoring and improvement but also for a developer who wants to improve his or her own work.
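
    As a rough illustration of execution-history-based task measurement, the sketch below maps active-application log entries to task categories and totals the time spent per task. The log format, process names, and category mapping are illustrative assumptions, not TaskPit's actual design.

        # Minimal sketch (assumed log format; not the actual TaskPit code).
        from collections import defaultdict

        # Hypothetical execution history: (process name, active minutes).
        log = [("devenv.exe", 95), ("outlook.exe", 10),
               ("winword.exe", 30), ("devenv.exe", 45)]

        # Assumed mapping from processes to development task categories.
        TASK_OF = {"devenv.exe": "programming",
                   "winword.exe": "documentation",
                   "outlook.exe": "communication"}

        def summarize(entries):
            """Aggregate active time per task category."""
            totals = defaultdict(int)
            for process, minutes in entries:
                totals[TASK_OF.get(process, "other")] += minutes
            return dict(totals)

        print(summarize(log))
        # {'programming': 140, 'communication': 10, 'documentation': 30}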

  • Investigating and Projecting Population Structures in Open Source Software Projects: A Case Study of Projects in GitHub

    Saya ONOUE  Hideaki HATA  Akito MONDEN  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2016/02/05
    Vol: E99-D No:5
    Page(s): 1304-1315

    GitHub is a social networking service for developers that hosts a great number of open source software (OSS) projects. Although some of the hosted projects are growing and have many developers, most projects are organized by a few developers and face difficulties in terms of sustainability. OSS projects depend mainly on volunteer developers, and attracting and retaining these volunteers are major concerns of the project stakeholders. To investigate the population structures of OSS development communities in detail and to obtain actionable information through software analytics, we apply a demographic approach. Demography is the scientific study of populations and seeks to identify the levels and trends in the size and components of a population. This paper presents a case study investigating the characteristics of the population structures of OSS projects on GitHub, and shows population projections generated with the well-known cohort component method. We found that there are four types of population structures in OSS development communities in terms of experience and contributions. In addition, we projected future populations accurately using a cohort component population projection method, which predicts the population of the next period using survival rates calculated from past populations. To the best of our knowledge, this is the first study to apply demography to the field of OSS research. Our demography-based approach to OSS-related problems should bring new insights, since the study of populations is novel in OSS research. Understanding the current and future structures of OSS projects can help practitioners monitor a project, gain awareness of what is happening, manage risks, and evaluate past decisions.
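
    To make the cohort component idea concrete, the sketch below advances an OSS developer population by one period: each cohort survives into the next with its survival rate, and a newcomer cohort enters. All numbers are illustrative, not the paper's GitHub data.

        # Minimal sketch of one cohort component projection step.
        def project(cohorts, survival, newcomers):
            """Cohort i survives into cohort i+1 at rate survival[i];
            a fresh newcomer cohort enters at index 0."""
            aged = [round(n * s) for n, s in zip(cohorts, survival)]
            return [newcomers] + aged

        population = [50, 20, 10, 5]       # developers by participation age
        survival = [0.4, 0.5, 0.6, 0.7]    # fraction retained per period
        print(project(population, survival, newcomers=60))
        # [60, 20, 10, 6, 4]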

  • An Algorithm for Automatic Collation of Vocabulary Decks Based on Word Frequency

    Zeynep YÜCEL  Parisa SUPITAYAKUL  Akito MONDEN  Pattara LEELAPRUTE  

     
    PAPER-Educational Technology

    Publicized: 2020/05/07
    Vol: E103-D No:8
    Page(s): 1865-1874

    This study focuses on computer-based foreign language vocabulary learning systems. Our objective is to automatically build vocabulary decks with desired levels of relative difficulty relations. To realize this goal, we exploit the fact that word frequency is a good indicator of vocabulary difficulty. For composing the decks, we pose two requirements: uniformity and diversity. Namely, the difficulty levels of the cards in the same deck need to be uniform enough that they can be grouped together, and the difficulty levels of the cards in different decks need to be diverse enough that they can be grouped in different decks. To assess uniformity and diversity, we use rank-biserial correlation and propose an iterative algorithm that helps attain the desired levels of uniformity and diversity based on word frequency in daily use of language. In our experiments, we employed spaced-repetition flashcard software and presented users with various decks built with the proposed algorithm, containing cards of different content types. From users' activity logs, we derived several behavioral variables and examined the polyserial correlation between these variables and difficulty levels across different word classes. This analysis confirmed that the decks compiled with the proposed algorithm affect the behavioral variables in line with expectations. In addition, a series of experiments with decks involving varying content types confirmed that this relation is independent of word class.
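
    The sketch below shows one way to compute the rank-biserial correlation used to assess uniformity and diversity; the word frequencies are illustrative, and the paper's iterative deck-building loop is omitted.

        # Minimal sketch; Wendt's formula r = 1 - 2U/(n_x * n_y).
        def rank_biserial(xs, ys):
            """U counts pairs where a y value exceeds an x value
            (ties count half). r near +/-1: groups separate;
            r near 0: groups overlap."""
            u = sum((x < y) + 0.5 * (x == y) for x in xs for y in ys)
            return 1.0 - 2.0 * u / (len(xs) * len(ys))

        easy_half_a = [920, 700]      # word frequencies (higher = easier)
        easy_half_b = [850, 780]
        hard_deck = [120, 90, 60, 40]

        # Uniformity: halves of one deck should overlap (r close to 0).
        print(rank_biserial(easy_half_a, easy_half_b))               # 0.0
        # Diversity: different decks should separate (|r| close to 1).
        print(rank_biserial(easy_half_a + easy_half_b, hard_deck))   # 1.0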

  • LSA-X: Exploiting Productivity Factors in Linear Size Adaptation for Analogy-Based Software Effort Estimation

    Passakorn PHANNACHITTA  Akito MONDEN  Jacky KEUNG  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2015/10/15
    Vol: E99-D No:1
    Page(s): 151-162

    Analogy-based software effort estimation has gained a considerable amount of attention in current research and practice. Its estimation accuracy relies on its solution adaptation stage, where an effort estimate is produced from similar past projects. This study proposes a solution adaptation technique named LSA-X that exploits the potential of productivity factors, i.e., project variables with a high correlation with software productivity, in the solution adaptation stage. The LSA-X technique tailors the exploitation of the productivity factors with a procedure based on the Linear Size Adaptation (LSA) technique. The results, based on 19 datasets, show that in circumstances where a dataset exhibits a high correlation coefficient between productivity and a related factor (r≥0.30), the proposed LSA-X technique statistically outperformed (95% confidence) the 8 other commonly used techniques compared in this study. In other circumstances, our results suggest using any linear adaptation technique based on software size to compensate for the limitations of the LSA-X technique.
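
    The sketch below illustrates plain linear size adaptation (LSA), which scales each analogue's effort by the size ratio before averaging; the productivity-factor adjustment noted in the final comment is an illustrative reading of the LSA-X idea, not the paper's exact procedure.

        # Minimal sketch of linear size adaptation (illustrative data).
        def lsa_estimate(target_size, analogues):
            """Scale each analogue's effort by the size ratio, then average."""
            adapted = [effort * (target_size / size)
                       for size, effort in analogues]
            return sum(adapted) / len(adapted)

        # k=3 nearest analogues: (size in function points, effort in hours).
        analogues = [(120, 900), (150, 1000), (100, 850)]
        print(round(lsa_estimate(130, analogues)))   # 982

        # LSA-X-style idea (assumption): when a factor correlates with
        # productivity (r >= 0.30), also scale by factor_target/factor_i.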

  • Empirical Evaluation of Mimic Software Project Data Sets for Software Effort Estimation

    Maohua GAN  Zeynep YÜCEL  Akito MONDEN  Kentaro SASAKI  

     
    PAPER-Software Engineering

    Publicized: 2020/07/03
    Vol: E103-D No:10
    Page(s): 2094-2103

    To conduct empirical research on industrial software development, it is necessary to obtain data on real software projects from industry. However, only a few such industry data sets are publicly available, and unfortunately, most of them are very old. In addition, most of today's software companies cannot make their data open, because software development involves many stakeholders and data confidentiality must therefore be strongly preserved. To address this problem, this study proposes a method for artificially generating a “mimic” software project data set whose characteristics (such as averages, standard deviations and correlation coefficients) are very similar to those of a given confidential data set. Instead of using the original (confidential) data set, researchers are expected to use the mimic data set to produce results similar to those obtained from the original. The proposed method uses the Box-Muller transform for generating normally distributed random numbers, and exponential transformation and number reordering for data mimicry. To evaluate the efficacy of the proposed method, effort estimation is considered as a potential application domain for employing mimic data. Estimation models are built from 8 reference data sets and their corresponding mimic data sets. Our experiments confirmed that models built from mimic data sets show effort estimation performance similar to that of models built from the original data sets, which indicates the capability of the proposed method to generate representative samples.
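
    The sketch below strings together the three ingredients named above: Box-Muller normal variates, an exponential transform to skew them, and reordering so the mimic column follows the original column's rank order (preserving correlations). Parameters and data are illustrative; the paper's procedure is more elaborate.

        # Minimal sketch of mimic-column generation.
        import math, random

        def box_muller(n):
            """n standard normal variates via the Box-Muller transform."""
            out = []
            while len(out) < n:
                u1, u2 = 1.0 - random.random(), random.random()
                r = math.sqrt(-2.0 * math.log(u1))
                out.append(r * math.cos(2 * math.pi * u2))
                out.append(r * math.sin(2 * math.pi * u2))
            return out[:n]

        def mimic_column(original, mu, sigma):
            """Normal draws -> exponential transform -> reordered to
            match the rank order of the original column."""
            draws = sorted(math.exp(mu + sigma * z)
                           for z in box_muller(len(original)))
            order = sorted(range(len(original)), key=lambda i: original[i])
            mimic = [0.0] * len(original)
            for rank, idx in enumerate(order):
                mimic[idx] = draws[rank]
            return mimic

        effort = [120, 340, 95, 800, 210]   # original confidential column
        print(mimic_column(effort, mu=5.5, sigma=0.8))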

  • A Novel Approach to Address External Validity Issues in Fault Prediction Using Bandit Algorithms

    Teruki HAYAKAWA  Masateru TSUNODA  Koji TODA  Keitaro NAKASAI  Amjed TAHIR  Kwabena Ebo BENNIN  Akito MONDEN  Kenichi MATSUMOTO  

     
    LETTER-Software Engineering

    Publicized: 2020/10/30
    Vol: E104-D No:2
    Page(s): 327-331

    Various software fault prediction models have been proposed in the past twenty years. Many studies have compared and evaluated existing prediction approaches in order to identify the most effective ones. However, in most cases, such models and techniques provide varying results, and their outcomes do not achieve the best possible performance across different datasets. This is mainly due to the diverse nature of software development projects; therefore, there is a risk that the selected models lead to inconsistent results across multiple datasets. In this work, we propose the use of bandit algorithms in cases where the accuracy of the models is inconsistent across multiple datasets. In the experiment discussed in this work, we used four conventional prediction models, tested them on three different datasets, and then selected the best possible model dynamically by applying bandit algorithms. We then compared our results with those obtained using majority voting. As a result, epsilon-greedy with ϵ=0.3 showed the best or second-best prediction performance compared with using only one prediction model and with majority voting. Our results show that bandit algorithms can provide promising outcomes when used in fault prediction.
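
    The sketch below shows epsilon-greedy selection among fault prediction models treated as bandit arms; the stub model names, their accuracies, and the reward definition (1 when a prediction is correct) are illustrative assumptions.

        # Minimal sketch of epsilon-greedy model selection.
        import random

        def choose(rewards, counts, eps=0.3):
            """Explore a random arm with probability eps; otherwise
            exploit the arm with the best average reward so far."""
            if random.random() < eps:
                return random.randrange(len(counts))
            avg = [r / c if c else 0.0 for r, c in zip(rewards, counts)]
            return max(range(len(counts)), key=lambda i: avg[i])

        models = ["logistic", "random_forest", "naive_bayes"]  # stubs
        rewards, counts = [0.0] * 3, [0] * 3
        for _ in range(1000):               # stream of modules to classify
            arm = choose(rewards, counts, eps=0.3)
            hit = random.random() < 0.5 + 0.1 * arm   # stub accuracies
            rewards[arm] += hit
            counts[arm] += 1
        print(dict(zip(models, counts)))    # the best arm dominates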

  • Tamper-Resistant Software System Based on a Finite State Machine

    Akito MONDEN  Antoine MONSIFROT  Clark THOMBORSON  

     
    PAPER-Tamper-Resistance

    Vol: E88-A No:1
    Page(s): 112-122

    Many computer systems are designed to make it easy for end-users to install and update software. An undesirable side effect, from the perspective of many software producers, is that hostile end-users may analyze or tamper with the software being installed or updated. This paper proposes a way to avoid the side effect without affecting the ease of installation and updating. We construct a computer system M with the following properties: 1) the end-user may install a program P in any conveniently accessible area of M; 2) the program P contains encoded instructions whose semantics are obscure and difficult to understand; and 3) an internal interpreter W, embedded in a non-accessible area of M, interprets the obfuscated instructions without revealing their semantics. Our W is a finite state machine (FSM) which gives context-dependent semantics and operand syntax to the encoded instructions in P; thus, attempts to statically analyze the relation between instructions and their semantics will not succeed. We present a systematic method for constructing a P whose instruction stream is always interpreted correctly regardless of its input, even though changes in input will (in general) affect the execution sequence of instructions in P. Our framework is easily applied to conventional computer systems by adding an FSM unit to a virtual machine or a reconfigurable processor.
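
    The sketch below captures the core trick: the same encoded opcode is given different semantics depending on the interpreter's current state, so a static table of opcode meanings cannot be recovered. The two-state machine and opcode table are illustrative, not the paper's construction of W.

        # Minimal sketch: semantics[state][opcode] -> (operation, next state).
        SEMANTICS = {
            0: {0x1: ("ADD", 1), 0x2: ("LOAD", 0)},
            1: {0x1: ("STORE", 0), 0x2: ("ADD", 1)},
        }

        def interpret(encoded):
            """Decode an instruction stream with state-dependent meanings."""
            state, trace = 0, []
            for opcode in encoded:
                op, state = SEMANTICS[state][opcode]
                trace.append(op)
            return trace

        # The same opcode 0x1 means ADD or STORE depending on state.
        print(interpret([0x1, 0x1, 0x2, 0x1]))
        # ['ADD', 'STORE', 'LOAD', 'ADD']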

  • Java Birthmarks--Detecting the Software Theft--

    Haruaki TAMADA  Masahide NAKAMURA  Akito MONDEN  Ken-ichi MATSUMOTO  

     
    PAPER-Application Information Security

    Vol: E88-D No:9
    Page(s): 2148-2158

    To detect the theft of Java class files efficiently, we propose the concept of Java birthmarks, which are unique and native characteristics of every class file. For a pair of class files p and q, if q has the same birthmark as p, q is suspected of being a copy of p. Ideally, the birthmarks should satisfy the following properties: (a) preservation - the birthmarks should be preserved even if the original class file is tampered with, and (b) distinction - independent class files must be distinguished by completely different birthmarks. Taking (a) and (b) into account, we propose four types of birthmarks for Java class files. To show the effectiveness of the proposed birthmarks, we conduct three experiments. In the first experiment, we demonstrate that the proposed birthmarks are sufficiently robust against automatic program transformation (93.3876% of the birthmarks were preserved). The second experiment shows that the proposed birthmarks successfully distinguish non-copied files in a practical Java application (97.8005% of given class files were distinguished). In the third experiment, we exploit different Java compilers to confirm that the proposed Java birthmarks are core characteristics independent of compiler-specific issues.
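
    As a rough illustration, the sketch below compares hypothetical "used classes" birthmarks with a simple set-overlap similarity; the extraction step, the similarity definition, and the threshold are simplified assumptions, not the paper's exact procedure.

        # Minimal sketch of birthmark comparison.
        def similarity(bm_p, bm_q):
            """Fraction of shared birthmark elements (Jaccard-style)."""
            p, q = set(bm_p), set(bm_q)
            return len(p & q) / len(p | q) if p | q else 1.0

        # Hypothetical "used classes" birthmarks of three class files.
        p = {"java/lang/String", "java/util/List", "java/io/File"}
        q = {"java/lang/String", "java/util/List", "java/io/File"}
        r = {"java/lang/String", "java/net/Socket"}

        THRESHOLD = 0.8   # assumed decision threshold
        print(similarity(p, q) >= THRESHOLD)   # True: q suspected copy of p
        print(similarity(p, r) >= THRESHOLD)   # False: r looks independent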

  • Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization

    Takashi WATANABE  Akito MONDEN  Zeynep YÜCEL  Yasutaka KAMEI  Shuji MORISAKI  

     
    PAPER-Software Engineering

    Publicized: 2018/06/13
    Vol: E101-D No:9
    Page(s): 2269-2278

    Association rule mining discovers relationships among variables in a data set and represents them as rules. These rules are expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. The results of evaluating this metric experimentally using four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) show that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric provides better rule prioritization performance than conventional metrics, or at least similar performance even in the worst case.
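
    The sketch below conveys the cross-validation idea: a rule is scored by its precision on held-out folds rather than on the data it was mined from. The toy records and the rule itself are placeholders, not the paper's metric definition.

        # Minimal sketch of held-out precision for one rule.
        import random

        def cv_precision(records, rule, k=4):
            """Average precision of `rule` over k held-out folds.
            Each record is (features, defective); `rule` returns True
            when it predicts a defect."""
            random.shuffle(records)
            folds = [records[i::k] for i in range(k)]
            scores = []
            for fold in folds:
                fired = [defect for feats, defect in fold if rule(feats)]
                if fired:
                    scores.append(sum(fired) / len(fired))
            return sum(scores) / len(scores) if scores else 0.0

        # Toy modules with a 'churn' feature; rule: high churn -> defect.
        data = [({"churn": random.randint(0, 100)}, random.random() < 0.4)
                for _ in range(200)]
        print(cv_precision(data, rule=lambda f: f["churn"] > 70))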

  • Exploiting Eye Movements for Evaluating Reviewer's Performance in Software Review

    Hidetake UWANO  Masahide NAKAMURA  Akito MONDEN  Ken-ichi MATSUMOTO  

     
    PAPER-Reliability, Maintainability and Safety Analysis

    Vol: E90-A No:10
    Page(s): 2290-2300

    This paper proposes to use eye movements to characterize the performance of individuals in reviewing software documents. We design and implement a system called DRESREM, which measures and records the eye movements of document reviewers. Based on the eye movements captured by an eye-tracking device, the system computes the line number of the document that the reviewer is currently looking at. The system can also record and play back how the eyes moved during the review process. To evaluate the effectiveness of the system, we conducted an experiment to analyze 30 source code review processes (6 programs, 5 subjects) using the system. As a result, we identified a particular pattern, called a scan, in the subjects' eye movements. Quantitative analysis showed that reviewers who did not spend enough time on the scan took more time, on average, to find defects.
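
    A tiny sketch of the mapping DRESREM performs, converting a fixation's vertical coordinate into a document line number; the viewport geometry values are illustrative assumptions.

        # Minimal sketch of gaze-to-line mapping.
        def gaze_to_line(y_px, top_margin=40, line_height=18):
            """Map a vertical gaze coordinate (pixels) to a 1-based
            line number; None if the gaze is above the document."""
            if y_px < top_margin:
                return None
            return 1 + (y_px - top_margin) // line_height

        for y in [45, 120, 138, 400]:        # a replayed fixation stream
            print(y, "->", gaze_to_line(y))  # 45 -> 1, 120 -> 5, ...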

  • Customizing GQM Models for Software Project Monitoring

    Akito MONDEN  Tomoko MATSUMURA  Mike BARKER  Koji TORII  Victor R. BASILI  

     
    PAPER

    Vol: E95-D No:9
    Page(s): 2169-2182

    This paper customizes Goal/Question/Metric (GQM) project monitoring models for various projects and organizations, both to take advantage of the data from the software tool EPM and to allow the interpretation models to be tailored to the context and success criteria of each project and organization. The basic idea is to build less concrete models that do not include explicit baseline values for interpreting metric values. Instead, we add hypothesis and interpretation layers to the models to help people in different projects make decisions in their own contexts. We applied the models to two industrial projects and found that our less concrete models could successfully identify typical problems in software projects.

  • Analysis of Work Efficiency and Quality of Software Maintenance Using Cross-Company Dataset

    Masateru TSUNODA  Akito MONDEN  Kenichi MATSUMOTO  Sawako OHIWA  Tomoki OSHINO  

     
    PAPER

    Publicized: 2020/08/31
    Vol: E104-D No:1
    Page(s): 76-90

    Software maintenance is an important activity in the software lifecycle. It does not mean only removing faults found after software release; software also needs extensions or modifications of its functions owing to changes in the business environment, and software maintenance covers these as well. To help users and service suppliers benchmark work efficiency for software maintenance, and to clarify the relationships between software quality, work efficiency, and unit cost of staff, we used a dataset of 134 data points collected by the Economic Research Association in 2012 and analyzed the factors that affected the work efficiency of software maintenance. In the analysis, using a multiple regression model, we clarified the relationships between work efficiency and the programming language and productivity factors. To analyze the influence on quality, relationships with the fault ratio were analyzed using correlation coefficients. We found that the programming language and productivity factors affect work efficiency, whereas higher work efficiency and higher unit cost of staff do not affect the quality of software maintenance.
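
    The sketch below fits the kind of multiple regression model used in the analysis, regressing work efficiency on a programming-language dummy and a productivity factor. All values are fabricated illustrations, not the Economic Research Association data.

        # Minimal sketch of a multiple regression fit (illustrative data).
        import numpy as np

        # Columns: intercept, language dummy (assumed), staff skill level.
        X = np.array([[1, 1, 3], [1, 0, 4], [1, 1, 2],
                      [1, 0, 5], [1, 1, 4], [1, 0, 3]], dtype=float)
        y = np.array([0.8, 1.4, 0.6, 1.7, 1.0, 1.2])  # work efficiency

        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        print(dict(zip(["intercept", "language", "skill"], beta.round(3))))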

  • Influence of Outliers on Estimation Accuracy of Software Development Effort

    Kenichi ONO  Masateru TSUNODA  Akito MONDEN  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2020/10/02
    Vol: E104-D No:1
    Page(s): 91-105

    When applying estimation methods, the issue of outliers is inevitable. Although several studies have evaluated outlier elimination methods, the extent of outliers' influence has not been clarified. It is unclear whether we should always be sensitive to outliers, whether outliers should always be removed before estimation, and how much precaution is required when collecting project data. Therefore, the goal of this study is to provide a guideline that suggests how sensitively we should handle outliers. In the analysis, we experimentally add outliers to three datasets to analyze their influence. We modified the percentage of outliers, their extent (e.g., we varied the actual effort from 100 to 200 person-hours when the extent was 100%), the variables containing outliers (e.g., adding outliers to function points or effort), and the locations of outliers in a dataset. Next, the effort was estimated using these datasets. We used multiple linear regression analysis and analogy-based estimation to estimate the development effort. The experimental results indicate that the influence of outliers on estimation accuracy is non-trivial when the extent or percentage of outliers is large (i.e., 100% and 20%, respectively). In contrast, their influence is negligible when the extent and percentage are small (i.e., 50% and 10%, respectively). Moreover, in some cases, linear regression analysis was less affected by outliers than analogy-based estimation.
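
    The sketch below mirrors the injection step described above: a chosen percentage of projects have their effort inflated by a chosen extent, so an extent of 100% turns 100 person-hours into 200. Dataset values are illustrative.

        # Minimal sketch of outlier injection into an effort column.
        import random

        def inject_outliers(efforts, percentage=0.2, extent=1.0, seed=0):
            """Scale `percentage` of the entries by (1 + extent)."""
            rng = random.Random(seed)
            out = list(efforts)
            k = round(len(out) * percentage)
            for i in rng.sample(range(len(out)), k):
                out[i] *= 1.0 + extent
            return out

        efforts = [100, 150, 220, 90, 400, 310, 130, 260, 180, 75]
        print(inject_outliers(efforts, percentage=0.2, extent=1.0))
        # two of the ten efforts are doubled (e.g., 100 -> 200)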