Khine Yin MON Masanari KONDO Eunjong CHOI Osamu MIZUNO
Defect prediction approaches have contributed greatly to software quality assurance activities such as code review and unit testing. Just-in-time defect prediction approaches predict whether a commit is defect-inducing or not. Prior research has shown that commit-level prediction is not fine-grained enough in terms of inspection effort, since a defective commit may contain both defective and non-defective files. As the defect prediction community promotes finer-grained prediction approaches, we propose a novel class-level prediction, finer-grained than file-level prediction, based on the files of the commits. We designed our model for Python projects and tested it on ten open-source Python projects. We performed our experiment with two settings: one with product metrics only, and one with product metrics plus commit information. Our investigation was conducted with three different classifiers and two validation strategies. We found that the model built with a random forest classifier performs best, and that commit information contributes significantly beyond the product metrics in 10-fold cross-validation. We also created a commit-based file-level prediction for Python files that do not contain classes. The file-level model showed similar trends to the class-level model. However, the results showed a large deviation under time-series validation at both levels, highlighting the challenge of predicting defects in Python classes and files in a realistic scenario.
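As a hedged sketch of the two experimental settings on synthetic stand-in data (the feature layout, counts, and AUC metric are assumptions for illustration, not the paper's exact setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 400
# assumed feature layout: three product metrics followed by three
# commit-information features (names and counts are placeholders)
X = rng.normal(size=(n, 6))
y = (X[:, 0] + X[:, 3] + rng.normal(size=n) > 0).astype(int)  # stand-in labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)
auc_product_only = cross_val_score(clf, X[:, :3], y, cv=10, scoring="roc_auc").mean()
auc_with_commit = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
print(f"product metrics only : {auc_product_only:.3f}")
print(f"+ commit information : {auc_with_commit:.3f}")
```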
Kosuke OHARA Hirohisa AMAN Sousuke AMASAKI Tomoyuki YOKOGAWA Minoru KAWAHARA
This paper focuses on the “data collection period” for training a better Just-In-Time (JIT) defect prediction model (early commit data vs. recent commit data) and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the selection of a machine learning algorithm can become a threat to validity. Hence, this study adopts automated machine learning to mitigate selection bias in the comparative study. The empirical results on 122 open-source software projects support the trend that a dataset composed of recent commits makes a better training set for JIT defect prediction models.
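A hedged sketch of the core comparison on synthetic stand-in data: train one model on the earliest commits and one on the most recent, then evaluate both on held-out future commits. The paper uses automated machine learning; a fixed logistic regression stands in here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 8))                          # stand-in commit features, in time order
y = (X[:, 0] + rng.normal(size=n) > 0).astype(int)   # stand-in defect labels

def window_auc(train_idx, test_idx):
    m = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    return roc_auc_score(y[test_idx], m.predict_proba(X[test_idx])[:, 1])

k = n // 3
early = np.arange(k)                  # oldest third of commits
recent = np.arange(n - 2 * k, n - k)  # most recent third before the test period
test = np.arange(n - k, n)            # held-out future commits
print("early-window AUC :", window_auc(early, test))
print("recent-window AUC:", window_auc(recent, test))
```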
Gang JIN Jingsheng ZHAI Jianguo WEI
In this paper, we propose an end-to-end two-branch feature attention network, called CAA-Net, mainly for single image dehazing. The network consists of two branches: 1) a U-Net composed of attention-based different-level feature fusion (FEPA) structures and residual dense blocks (RDBs). To make full use of all the hierarchical features of the image, we use RDBs, which contain densely connected layers and local feature fusion with local residual learning. We also propose the FEPA structure, which retains information from shallow layers and transfers it to deep layers. FEPA is composed of several feature attention (FPA) modules. An FPA module combines local residual learning with channel attention and pixel attention mechanisms, and extracts features from different channels and image pixels. 2) A network composed of several different-level FEPA structures. This branch lets feature weights be learned adaptively from FPA, giving more weight to important features. The final output of CAA-Net is the combination of all branch prediction results. Experimental results show that the proposed CAA-Net surpasses previous state-of-the-art algorithms for single image dehazing.
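A minimal PyTorch sketch of one feature attention (FPA) module as described: channel attention, then pixel attention, wrapped in local residual learning. Layer widths and reduction ratios are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class FPA(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))
        # channel attention: global pooling -> per-channel weights
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // 8, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // 8, ch, 1), nn.Sigmoid())
        # pixel attention: per-pixel weight map
        self.pa = nn.Sequential(
            nn.Conv2d(ch, ch // 8, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // 8, 1, 1), nn.Sigmoid())

    def forward(self, x):
        y = self.conv(x)
        y = y * self.ca(y)   # reweight channels
        y = y * self.pa(y)   # reweight pixels
        return x + y         # local residual learning

x = torch.randn(1, 64, 32, 32)
print(FPA(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```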
It has been widely recognized that in compressed sensing, many restricted isometry property (RIP) conditions can be easily obtained by using the null space property (NSP) with its null space constant (NSC) 0<θ≤1 to construct a proof by contradiction for sparse signal recovery. However, the traditional NSP with θ=1 leads to conservative RIP conditions. In this paper, we extend the NSP with 0<θ<1 to a scale NSP, which uses a factor τ to scale down all vectors belonging to the null space of a sensing matrix. Following the popular proof procedure and using the scale NSP, we establish more relaxed RIP conditions with the scale factor τ, which guarantee bounded-approximation recovery of all sparse signals in the bounded-noise setting through constrained l1 minimization. An application verifies the advantages of the scale factor in terms of the number of measurements.
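For reference, a LaTeX sketch of the two standard ingredients the abstract builds on, the NSP with constant θ and the constrained l1 minimization; the exact formulation of the scale NSP with the factor τ is as given in the paper.

```latex
% Classical NSP of order k with null space constant 0 < \theta \le 1:
\|v_S\|_1 \;\le\; \theta \,\|v_{S^c}\|_1,
\qquad \forall\, v \in \mathcal{N}(A)\setminus\{0\},\;\; \forall\, S,\ |S| \le k.

% Constrained l1 minimization for recovery under bounded noise \|e\|_2 \le \epsilon:
\hat{x} \;=\; \arg\min_{x}\ \|x\|_1
\quad \text{subject to} \quad \|Ax - y\|_2 \le \epsilon .
```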
Naotake YAMAMOTO Taichi SASAKI Atsushi YAMAMOTO Tetsuya HISHIKAWA Kentaro SAITO Jun-ichi TAKADA Toshiyuki MAEYAMA
A path loss prediction formula for IoT (Internet of Things) wireless communication close to ceiling beams in the 920 MHz band is presented. In the first step of our investigation, we conduct simulations using the FDTD (finite-difference time-domain) method and propagation tests close to a beam on the ceiling of a concrete building. In the second step, we derive a path loss prediction formula from the FDTD simulation results by dividing the problem into three regions: the LoS (line-of-sight) situation, the situation in the vicinity of the beam, and the NLoS (non-line-of-sight) situation, depending on the positional relationship between the beam and the transmitter (Tx) and receiver (Rx) antennas. For each condition, the prediction formula takes a relatively simple form as a function of the antenna heights with respect to the beam bottom. The prediction formula is therefore very useful for wireless site planning for IoT devices placed close to concrete ceiling beams.
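As a rough illustration of such a region-wise formula, a Python sketch follows; the region classification rule, coefficients, and boundaries below are placeholders, not the derived formula.

```python
import math

def path_loss_db(dist_m, tx_h, rx_h, beam_bottom_h, freq_mhz=920.0):
    # free-space path loss for d in meters, f in MHz
    fspl = 20 * math.log10(dist_m) + 20 * math.log10(freq_mhz) - 27.55
    if tx_h > beam_bottom_h and rx_h > beam_bottom_h:   # LoS region (placeholder rule)
        a, b = 0.0, 0.0
    elif abs(rx_h - beam_bottom_h) < 0.2:               # vicinity of the beam
        a, b = 3.0, 5.0                                 # placeholder excess-loss terms
    else:                                               # NLoS region
        a, b = 8.0, 10.0                                # placeholder excess-loss terms
    # simple form: excess loss as a function of antenna height vs. beam bottom
    return fspl + a * (beam_bottom_h - min(tx_h, rx_h)) + b

print(f"{path_loss_db(10.0, 2.8, 2.5, 2.7):.1f} dB")
```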
In [2], Choi et al. proposed an identity-based password-authenticated key exchange (iPAKE) protocol using the Boneh-Franklin IBE scheme, and its generic construction (UKAM-PiE), which was standardized in ISO/IEC 11770-4/AMD 1. In this paper, we show that the iPAKE and UKAM-PiE protocols are insecure against passive/active attacks by a malicious PKG (Private Key Generator): the malicious PKG can find out all clients' passwords by just eavesdropping on the communications, and can share a session key with any client by impersonating the server. We then propose a strengthened PAKE protocol with IBE (SPAIBE, for short) that prevents such passive/active attacks by a malicious PKG. We also formally prove the security of the SPAIBE protocol in the random oracle model and compare relevant PAKE protocols in terms of efficiency, number of passes, and security against a malicious PKG.
Hailan ZHOU Longyun KANG Xinwei DUAN Ming ZHAO
In a conventional single-phase PWM rectifier, the sinusoidally fluctuating current and voltage on the grid side generate power ripple at twice the grid frequency, which leads to a second-order ripple in the DC output voltage; moreover, the switching frequency of the conventional model predictive control strategy is not fixed. To solve these two problems, a control strategy for suppressing the second-order ripple based on three-vector fixed-frequency model predictive current control is proposed. Taking the capacitive-energy-storage single-phase PWM rectifier as the research object, the principle of its active filtering is analyzed and a model predictive control strategy is proposed. Simulation and experimental results show that the proposed strategy can significantly reduce the second-order ripple of the DC output voltage, reduce the harmonic content of the input current, and achieve a constant switching frequency.
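As a generic illustration of predictive current control (not the paper's exact formulation), assuming the usual discretized grid-side model with resistance neglected:

```latex
% One-step current prediction from the discretized grid-side model
% (L: filter inductance, T_s: sampling period, u_g: grid voltage,
% u_c: converter AC-side voltage) -- a generic form, assumed for illustration:
i(k+1) \;=\; i(k) + \frac{T_s}{L}\,\bigl(u_g(k) - u_c(k)\bigr)

% Cost function evaluated for candidate voltage vectors; selecting three
% vectors per control period with duty ratios that fill the whole period is
% what fixes the switching frequency:
g \;=\; \bigl(i^{*}(k+1) - i(k+1)\bigr)^{2}
```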
Koji ISHIBASHI Takanori HARA Sota UCHIMURA Tetsuya IYE Yoshimi FUJII Takahide MURAKAMI Hiroyuki SHINBO
In this paper, we propose a new radio access network (RAN) architecture for reliable millimeter-wave (mmWave) communications, with the flexibility to meet users' diverse and fluctuating requirements in terms of communication quality. The architecture is composed of multiple radio units (RUs) connected to a common distributed unit (DU) via fronthaul links to virtually enlarge its coverage. As the technical pillars of the proposed RAN, we further present grant-free non-orthogonal multiple access (GF-NOMA) for low-latency uplink communications with a massive number of users, and robust coordinated multi-point (CoMP) transmission using blockage prediction for uplink/downlink communications with a high data rate and a guaranteed minimum data rate. The numerical results indicate that the proposed architecture can meet completely different user requirements and realize a user-centric RAN design for beyond-5G/6G.
Stance prediction on social media aims to infer users' stances towards a specific topic or event when those stances are not expressed explicitly. Extracting and determining users' stances from user-generated content on social media is of great significance for public opinion analysis. Existing research makes use of various signals, ranging from text content to users' online network connections on these platforms, but lacks joint modeling of this heterogeneous information for stance prediction. In this paper, we propose a self-supervised heterogeneous graph contrastive learning framework for stance prediction in online debate forums. First, we perform data augmentation on the original heterogeneous information network to generate an augmented view. The original and augmented views are each encoded by a meta-path-based graph encoder. Then, contrastive learning between the two views is conducted to obtain high-quality representations of users and issues. Finally, stance prediction is accomplished by matrix factorization between users and issues. Experimental results on an online debate forum dataset show that our model significantly outperforms competitive baseline methods.
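A minimal PyTorch sketch of the cross-view contrastive step; the graph encoder is abstracted away, and the InfoNCE form and temperature are common-practice assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, tau=0.5):
    """z1, z2: (n_nodes, dim) embeddings from the original and augmented views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau              # pairwise cosine similarities
    labels = torch.arange(z1.size(0))    # positive pair = same node in both views
    return F.cross_entropy(sim, labels)  # pull positives together, push others apart

z_orig, z_aug = torch.randn(128, 64), torch.randn(128, 64)  # stand-in embeddings
print(contrastive_loss(z_orig, z_aug).item())
```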
Yoshitaka KIDANI Haruhisa KATO Kei KAWAMURA Hiroshi WATANABE
Geometric partitioning mode (GPM) is a new inter prediction tool adopted in Versatile Video Coding (VVC), the latest international video coding standard, developed by the Joint Video Experts Team in 2020. Unlike regular inter prediction performed on rectangular blocks, GPM separates a coding block into two regions along one of 64 pre-defined straight lines, generates inter-predicted samples for each region, and then blends them to obtain the final inter-predicted samples. With this feature, GPM improves prediction accuracy at boundaries between foreground and background with different motions. However, GPM has room to further improve prediction accuracy if the final predicted samples can be generated using not only inter prediction but also intra prediction. In this paper, we propose a GPM with inter and intra prediction to achieve compression capability beyond VVC. To maximize the coding performance of the proposed method, we also propose restricting the number of applicable intra prediction modes and prohibiting intra prediction from being applied to both GPM-separated regions. The experimental results show that the proposed method improves the coding gain of the conventional GPM method of VVC by a factor of 1.3, and provides an additional coding gain of 1% bitrate savings in a coding structure for low-latency video transmission where the conventional GPM method cannot be utilized.
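A conceptual NumPy sketch of the blending step, with one inter and one intra prediction per the proposed method; the soft mask construction below is a simplified stand-in for VVC's actual GPM blending matrices.

```python
import numpy as np

def gpm_blend(pred0, pred1, angle_deg, offset=0.0, width=2.0):
    h, w = pred0.shape
    ys, xs = np.mgrid[0:h, 0:w]
    theta = np.deg2rad(angle_deg)
    # signed distance of each sample from the partition line
    d = (xs - w / 2) * np.cos(theta) + (ys - h / 2) * np.sin(theta) - offset
    w0 = np.clip(0.5 + d / (2 * width), 0.0, 1.0)  # soft ramp across the edge
    return w0 * pred0 + (1 - w0) * pred1

inter_pred = np.full((16, 16), 100.0)  # stand-in inter-predicted samples
intra_pred = np.full((16, 16), 160.0)  # stand-in intra-predicted samples
blended = gpm_blend(inter_pred, intra_pred, angle_deg=30)
print(blended[0, 0], blended[-1, -1])  # samples on opposite sides of the line
```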
Zhi LIU Jia CAO Xiaohan GUAN Mengmeng ZHANG
Inter-channel correlation is one of the redundancies that needs to be eliminated in video coding. In the latest video coding standard, H.266/VVC, the DM (Direct Mode) and CCLM (Cross-Component Linear Model) modes have been introduced to reduce the similarity between luma and chroma. However, inter-channel correlation is still observed. In this paper, a new inter-channel prediction algorithm is proposed, which utilizes the coloring principle to predict chroma pixels. From the coloring perspective, the three components Y, U, and V demonstrate similar coloring patterns for most natural-content video frames. Therefore, the U and V components can be predicted using the coloring pattern of the Y component. In the proposed algorithm, correlation coefficients describing the coloring relationship between the current pixel and reference pixels in the Y component are obtained in a lightweight way and used to predict chroma pixels. The optimal position for the reference samples is also designed, and based on the selected position, two new chroma prediction modes are defined. Experimental results show that, compared with VTM 12.1, the proposed algorithm achieves average BD-rate improvements of -0.92% and -0.96% for the U and V components under the All Intra (AI) configuration, while the increases in encoding and decoding time are negligible.
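A toy sketch of the coloring idea: weights describing how the current Y sample relates to its reference Y samples are reused to predict the co-located chroma sample from the corresponding reference chroma samples. The inverse-luma-difference weighting below is an illustrative stand-in for the paper's lightweight correlation coefficients.

```python
import numpy as np

def predict_chroma(y_cur, y_ref, c_ref, eps=1.0):
    """y_cur: current luma sample; y_ref, c_ref: reference luma/chroma arrays."""
    w = 1.0 / (np.abs(y_ref - y_cur) + eps)     # closer luma -> larger weight
    return float(np.sum(w * c_ref) / np.sum(w)) # weighted prediction of chroma

y_ref = np.array([90.0, 100.0, 140.0])  # reference luma samples
u_ref = np.array([60.0, 64.0, 80.0])    # co-located reference chroma samples
print(predict_chroma(y_cur=102.0, y_ref=y_ref, c_ref=u_ref))
```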
Tomu MAKITA Atsuki NAGAO Tatsuki OKADA Kazuhisa SETO Junichi TERUYAMA
A branching program is a well-studied model of computation and a representation of Boolean functions. It is a directed acyclic graph with a unique root node, some accepting nodes, and some rejecting nodes. Except for the accepting and rejecting nodes, each node is labeled with a variable, and each outgoing edge of the node is labeled with a 0/1 assignment of that variable. The satisfiability problem for branching programs is, given a branching program with n variables and m nodes, to determine whether there exists an assignment that activates a consistent path from the root to an accepting node. The width of a branching program is the maximum number of nodes at any level. The satisfiability problem for width-2 branching programs is known to be NP-complete. In this paper, we present a satisfiability algorithm for width-2 branching programs with n variables and cn nodes, and show that its running time is poly(n)·2^((1-µ(c))n), where µ(c)=1/2^(O(c log c)). Our algorithm consists of two phases. First, we transform a given width-2 branching program into a set of structured formulas consisting of AND and Exclusive-OR gates. Then, we check the satisfiability of these formulas by a greedy restriction method depending on the frequency of occurrence of variables.
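To make the problem concrete, here is a small Python sketch of a branching program represented as a node table, with a brute-force satisfiability check; it illustrates the problem definition only, not the paper's poly(n)·2^((1-µ(c))n)-time algorithm.

```python
from itertools import product

# node -> (variable index, successor if 0, successor if 1); "ACC"/"REJ" are sinks.
bp = {
    "root": (0, "a", "b"),
    "a":    (1, "REJ", "ACC"),
    "b":    (1, "ACC", "REJ"),
}

def accepts(bp, assignment):
    node = "root"
    while node not in ("ACC", "REJ"):
        var, succ0, succ1 = bp[node]
        node = succ1 if assignment[var] else succ0
    return node == "ACC"

n_vars = 2
# brute-force satisfiability: does any assignment reach an accepting node?
print(any(accepts(bp, a) for a in product([0, 1], repeat=n_vars)))  # True
```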
Masamoto FUKAWA Xiaoqi DENG Shinya IMAI Taiga HORIGUCHI Ryo ONO Ikumi RACHI Sihan A Kazuma SHINOMURA Shunsuke NIWA Takeshi KUDO Hiroyuki ITO Hitoshi WAKABAYASHI Yoshihiro MIYAKE Atsushi HORI
A method to predict lightning by machine-learning analysis of atmospheric electric fields is proposed for the first time. In this study, we calculated an anomaly score with long short-term memory (LSTM), a recurrent neural network method, using electric field data recorded every second on the ground. A threshold for the anomaly score was defined, and a lightning alarm at the observation point was issued or canceled accordingly. Using this method, it was confirmed that 88.9% of lightning events occurred while an alarm was active. These results suggest that a lightning prediction system combining an electric field sensor and machine learning can be developed in the future.
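The following PyTorch sketch illustrates one common way to derive an anomaly score with an LSTM, assumed here for illustration: the network predicts the next one-second field sample and the prediction error serves as the score. Window length, architecture details, and the threshold are placeholders.

```python
import torch
import torch.nn as nn

class FieldLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict the next sample

model = FieldLSTM()
window = torch.randn(1, 60, 1)        # last 60 s of electric field (stand-in data)
next_obs = torch.randn(1, 1)
anomaly_score = (model(window) - next_obs).abs().item()
print("alarm" if anomaly_score > 0.5 else "clear")  # placeholder threshold
```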
Chenchen MENG Jun WANG Chengzhi DENG Yuanyun WANG Shengqian WANG
Feature representation is a key component of most visual tracking algorithms. Low-level hand-crafted features have weak representation capacities and struggle with complex appearance changes. In this paper, we propose a novel tracking algorithm that combines joint dictionary pair learning with convolutional neural networks (CNNs). We utilize a CNN model trained on ImageNet-Vid to extract target features; the CNN includes three convolutional layers and two fully connected layers, and dictionary pair learning follows the second fully connected layer. The joint dictionary pair is learned on the deep features extracted by the trained CNN model, and the temporal variations of target appearance are captured during dictionary learning. We use the learned dictionaries to encode target candidates, representing each candidate as a linear combination of atoms in the learned dictionary. Extensive experimental evaluations on OTB2015 demonstrate superior performance against state-of-the-art trackers.
Recently, several researchers have proposed various methods to build intelligent stock trading and portfolio management systems using rapid advancements in artificial intelligence, including machine learning techniques. However, existing technical-analysis-based stock price prediction studies depend primarily on price changes or price-related moving-average patterns, and trading volume information is used only as an auxiliary indicator. This study focuses on the effect of changes in trading volume on stock prices and proposes a novel method for short-term stock price prediction based on trading volume patterns. Two rapid volume decrease patterns are defined from combinations of multiple volume moving averages. The dataset filtered by these patterns is learned through supervised learning of neural networks. Experimental results on data from the Korea Composite Stock Price Index and the Korean Securities Dealers Automated Quotation show that the proposed prediction system can achieve trading performance that significantly exceeds the market average.
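A toy pandas sketch of how such a volume-pattern filter might look; the window lengths (5 and 20 periods) and the crossing rule are illustrative placeholders, not the paper's pattern definitions.

```python
import pandas as pd

def rapid_volume_decrease(volume: pd.Series) -> pd.Series:
    ma_short = volume.rolling(5).mean()
    ma_long = volume.rolling(20).mean()
    # pattern: short-term volume average falls well below the long-term average
    return ma_short < 0.5 * ma_long

vol = pd.Series([100, 120, 110, 130, 90, 40, 30, 25, 20, 15] * 3)  # stand-in volumes
mask = rapid_volume_decrease(vol)
print(vol[mask])  # periods matching the pattern feed the supervised learner
```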
Loan default prediction has been a significant problem in the financial domain because overdue loans may incur significant losses. Machine learning methods have been introduced to solve this problem, but there are still many challenges including feature multicollinearity, imbalanced labels, and small data sample problems. To replicate the success of deep learning in many areas, an effective regularization technique named muddling label regularization is introduced in this letter, and an ensemble of feed-forward neural networks is proposed, which outperforms machine learning and deep learning baselines in a real-world dataset.
The shared last-level cache (SLLC) in tiled chip multiprocessors (TCMPs) provides a low off-chip miss rate but incurs a long on-chip access latency. In the two-level cache hierarchy, data replication stores replicas of L1 victims in the local LLC (L2 cache) to obtain a short local LLC access latency on subsequent accesses. Many data replication mechanisms have been proposed, but they do not consider both L1 victim reuse behavior and the LLC's replica reception capability: they either produce many useless replicas or increase LLC pressure, which limits the improvement of system performance. In this paper, we propose a two-level cache-aware adaptive data replication mechanism (TCDR), which controls replication based on both L1 victim reuse behavior prediction and LLC replica reception capability monitoring. TCDR not only increases the accuracy of L1 replica selection, but also avoids the pressure of replication on the LLC. The results show that TCDR improves system performance with reasonable hardware overhead.
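As a rough software analogy of the decision TCDR makes in hardware, a sketch follows; both predicates are simplified stand-ins for the actual reuse predictor and reception-capability monitor.

```python
def should_replicate(victim_reuse_count, llc_replica_hits, llc_evictions):
    # replicate an L1 victim into the local LLC slice only when the victim is
    # predicted to be reused AND the LLC can currently receive replicas
    predicted_reuse = victim_reuse_count >= 2        # stand-in reuse-behavior predictor
    reception_ok = llc_replica_hits > llc_evictions  # stand-in reception monitor
    return predicted_reuse and reception_ok

print(should_replicate(victim_reuse_count=3, llc_replica_hits=40, llc_evictions=10))
```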
Hiroaki NAKABAYASHI Kiyoaki ITOI
Basic characteristics for radio link design and base station layout design in land mobile communications are provided through a propagation model for path loss prediction. Owing to the rapid annual increase in traffic data, the number of base stations has increased accordingly; therefore, propagation models for various scenarios and frequency bands are needed. To address the optimization and creation of such propagation models, a path loss prediction method that merges multiple models via machine learning is proposed herein. The method is evaluated on measurement values from Kitakyushu-shi. In machine learning, the selection of input parameters and the suppression of overfitting are important for achieving highly accurate predictions. Therefore, we propose acquiring conventional models appropriate to the propagation environment and using input parameters of high importance. The prediction accuracy for Kitakyushu-shi using the proposed method yields a root mean square error (RMSE) of 3.68 dB. In addition, predictions are performed in Narashino-shi to confirm the effectiveness of the method in other urban scenarios. The results confirm the effectiveness of the proposed method for the urban scenario in Narashino-shi, with an RMSE of 4.39 dB.
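A hedged sketch of the merging idea: the outputs of conventional propagation models serve as input features, alongside environment parameters, for a machine-learning regressor that learns to combine them. The stand-in model, features, and synthetic data below are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500
dist = rng.uniform(50, 2000, n)                # Tx-Rx distance [m] (stand-in)
conv_model = 120 + 35 * np.log10(dist / 1000)  # stand-in conventional model output [dB]
features = np.column_stack([dist, conv_model]) # environment + conventional-model features
measured = conv_model + rng.normal(0, 4, n)    # stand-in measured path loss [dB]

model = GradientBoostingRegressor().fit(features, measured)
rmse = np.sqrt(np.mean((model.predict(features) - measured) ** 2))
print(f"training RMSE: {rmse:.2f} dB")
```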
Jiaheng LIU Ryusuke EGAWA Hiroyuki TAKIZAWA
As the number of cores on a processor increases, cache hierarchies contain more cache levels and a larger last-level cache (LLC), so the power and energy consumption of the cache hierarchy becomes non-negligible. Meanwhile, because the cache usage behaviors of individual applications differ, higher energy efficiency can be achieved by determining the appropriate cache configuration for each application. This paper proposes a cache control mechanism to improve energy efficiency by adjusting the cache hierarchy to each application. Our mechanism first bypasses and disables a less-significant cache level, then partially disables the LLC, and finally adjusts the associativity if the application suffers from a large number of conflict misses. The mechanism achieves significant energy savings at the cost of small performance degradation. The evaluation results show that our mechanism improves energy efficiency by 23.9% and 7.0% on average over the baseline and the cache-level bypassing mechanism, respectively. In addition, even when LLC resource contention occurs, the proposed mechanism remains effective in improving energy efficiency.
Kana MIYAMOTO Hiroki TANAKA Satoshi NAKAMURA
Music is often used for emotion induction because it can change people's emotions. However, since people subjectively feel different emotions when listening to the same music, we propose an emotion induction system that generates music adapted to each individual. Our system automatically generates music suitable for emotion induction based on emotions predicted from an electroencephalogram (EEG). We examined three elements for constructing the system: 1) a music generator that creates music inducing emotions that resemble its inputs, 2) real-time emotion prediction from EEG, and 3) control of the music generator using the predicted emotions to make music suitable for inducing emotions. We constructed our proposed system from these elements and evaluated it. The results showed its effectiveness for inducing emotions and suggest that feedback loops that tailor stimuli to individuals can successfully induce emotions.