Kosuke OHARA Hirohisa AMAN Sousuke AMASAKI Tomoyuki YOKOGAWA Minoru KAWAHARA
This paper focuses on the “data collection period” for training a better Just-In-Time (JIT) defect prediction model, namely whether to use the early commit data or the recent commit data, and conducts a large-scale comparative study to explore an appropriate data collection period. Since there are many possible machine learning algorithms for training defect prediction models, the choice of algorithm can become a threat to validity. Hence, this study adopts an automated machine learning method to mitigate selection bias in the comparative study. The empirical results on 122 open-source software projects show a trend that a dataset composed of recent commits makes a better training set for JIT defect prediction models.
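The comparison of collection periods can be illustrated with a minimal sketch. It assumes that the newest commits are held out as the prediction target and that the "early" and "recent" training sets have the same size; the function name `split_by_period`, the record fields, and the sizes are hypothetical and are not taken from the paper.

```python
from datetime import date


def split_by_period(commits, train_size, test_size):
    """Split a commit history into early/recent training sets and a test set.

    Assumption (not from the paper): the newest `test_size` commits are the
    ones to be predicted; both training sets are drawn from the older history.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    ordered = sorted(commits, key=lambda c: c["date"])
    history, test = ordered[:-test_size], ordered[-test_size:]
    if train_size > len(history):
        raise ValueError("not enough history for the requested train_size")
    early = history[:train_size]     # oldest commits of the project
    recent = history[-train_size:]   # commits just before the test period
    return early, recent, test


# Hypothetical commit records: one commit per day, ids 0..9.
commits = [
    {"id": i, "date": date(2020, 1, 1 + i), "defective": i % 3 == 0}
    for i in range(10)
]
early, recent, test = split_by_period(commits, train_size=3, test_size=2)
```

A model trained on `early` and another trained on `recent` could then be evaluated on the same `test` commits, so that only the collection period differs between the two.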
Takuma HAMAGAMI Shinsuke HARA Hiroyuki YOMO Ryusuke MIYAMOTO Yasutaka KAWAMOTO Takunori SHIMAZAKI Hiroyuki OKUHATA
When we collect vital data from exercisers by attaching wireless sensor nodes to them, the reliability of the wireless data collection depends on the position of the node on the exerciser's body. Therefore, to determine a suitable body position, it is essential to evaluate the data collection performance while changing the body positions of the nodes in experiments involving human subjects. However, a fair comparison is problematic because the experiments are not repeatable; that is, we cannot evaluate the performance for multiple body positions at the same time in a single experiment. In this paper, we predict the performance with a software network simulator. Using two main functions, a channel state function and a mobility function, the network simulator can repeatedly generate the same channel and mobility conditions for the nodes. Numerical results obtained by the network simulator show that, when collecting vital data from twenty-two footballers in a game, among three body positions (waist, forearm, and calf), the forearm position gives the highest data collection rate, and the predicted data collection rates agree well with those obtained in an experiment involving real subjects.
Yimin ZHAO Song XIAO Hongping GAN Lizhao LI Lina XIAO
To efficiently collect sensor readings in cluster-based wireless sensor networks, we propose a structural compressed network coding (SCNC) scheme that jointly considers structural compressed sensing (SCS) and network coding (NC). The proposed scheme exploits the structural compressibility of sensor readings for data compression and reconstruction. Random linear network coding (RLNC) is used to re-project the measurements and thus enhance network reliability. Furthermore, we calculate the energy consumption of intra- and inter-cluster transmission and analyze the effect of the cluster size on the total transmission energy consumption. To that end, we introduce an iterative reweighted sparsity recovery algorithm to address the all-or-nothing effect of RLNC and decrease the recovery error. Experiments show that the SCNC scheme can decrease the number of measurements required for decoding and improve the network's robustness, particularly when the loss rate is high. Moreover, the proposed recovery algorithm has better reconstruction performance than several other state-of-the-art recovery algorithms.
Nobuyoshi KOMURO Sho MOTEGI Kosuke SANADA Jing MA Zhetao LI Tingrui PEI Young-June CHOI Hiroo SEKIYA
This paper proposes a routing method for wireless sensor networks based on the Watts and Strogatz model, combined with a link-exchange operation. The proposed routing achieves low data-collection delay because of the existence of hub nodes. By applying link exchanges, a node with a low remaining battery level can escape from the hub-node role. Therefore, the proposed routing method achieves fair battery-power consumption among sensor nodes. The proposed method can prolong the network lifetime while keeping the small-world properties. Simulation results show the effectiveness of the proposed method.
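For readers unfamiliar with the Watts and Strogatz model, the following is a minimal sketch of the standard graph construction only: a ring lattice whose edges are rewired with probability `p`. It does not reproduce the paper's routing or link-exchange operation; the function name, parameters, and the "highest degree" notion of a hub are illustrative assumptions.

```python
import random


def watts_strogatz(n, k, p, seed=0):
    """Build a Watts-Strogatz small-world graph as a dict of neighbor sets.

    n: number of nodes, k: even number of lattice neighbors per node,
    p: rewiring probability. All names here are illustrative.
    """
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Step 1: ring lattice, each node linked to its k/2 nearest neighbors per side.
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    # Step 2: rewire each lattice edge (i, i+j) with probability p.
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < p:
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


lattice = watts_strogatz(20, 4, 0.0)           # no rewiring: regular ring
small_world = watts_strogatz(20, 4, 0.3, seed=1)
# One possible notion of a hub (an assumption): the node with the most links.
hub = max(small_world, key=lambda v: len(small_world[v]))
```

Rewiring keeps the total number of links constant while creating a few long-range shortcuts and unevenly connected nodes, which is the property the abstract relies on when it refers to hub nodes and small-world behavior.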
Takaaki SUETSUGU Takayuki TORIKAI Hiroshi FURUKAWA
In tree-based wireless sensor networks (WSNs), multihop sensor nodes require a longer time frame to send sensed data to a sink node as the number of hops increases. The time taken to deliver sensed data becomes a critical issue when a large WSN is deployed. This paper proposes a new data collection scheme with rapid data delivery that utilizes the so-called mobile agent technique. The proposed scheme achieves high data collection efficiency without relying on route optimization, unlike conventional data collection techniques. Simulation results show that the larger the size or the maximum number of hops of the network, the more effective the proposed scheme becomes. The effectiveness of the proposed scheme is also confirmed through field experiments with actual sensor devices.
Yoshito TOBE Niwat THEPVILOJANAPONG Kaoru SEZAKI
Because of the large scale of wireless sensor networks, their configuration needs to be done autonomously. In this paper, we present the Scalable Data Collection (SDC) protocol, a tree-based protocol for collecting data over multi-hop wireless sensor networks. The design of the protocol aims to satisfy the requirement of sensor networks that every sensor transmits sensed data to a sink node periodically or spontaneously. The sink nodes construct the tree by broadcasting a solicit packet to discover their child nodes. A sensor receiving this packet decides on an appropriate parent to which it will attach, and then broadcasts the same packet to discover its own child nodes. Through this process, the tree is created autonomously without any flooding of routing packets. SDC avoids periodic updating of routing information, but the tree needs to be reconstructed upon node failures or the addition of new nodes. The state required on each sensor is constant and independent of network size; therefore, SDC scales better than existing protocols. Moreover, each sensor can make forwarding decisions without any knowledge of geographical information. We evaluated the performance of SDC using the ns-2 simulator and compared it with Directed Diffusion, DSR, AODV, and OLSR. The simulation results demonstrate that SDC achieves a much higher delivery ratio and shorter delay, as well as high scalability, in various scenarios.
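The tree construction described above can be sketched roughly as follows. This is not the SDC implementation: it assumes an idealized network in which a solicit packet reaches every radio neighbor, and it takes "appropriate parent" to mean the first sender heard, which is a simplifying assumption. The function names and the example topology are hypothetical.

```python
from collections import deque


def build_tree(neighbors, sink):
    """Return a parent map built by propagating a solicit packet from the sink.

    neighbors: dict mapping each node to the nodes within radio range.
    Assumption: a node attaches to the first sender it hears and then
    rebroadcasts the packet once to discover its own children.
    """
    parent = {sink: None}
    queue = deque([sink])
    while queue:
        sender = queue.popleft()          # this node broadcasts the solicit packet
        for node in neighbors[sender]:
            if node not in parent:        # node has not attached to a parent yet
                parent[node] = sender
                queue.append(node)        # node will rebroadcast in turn
    return parent


def route_to_sink(parent, node):
    """Follow parent pointers; each node only needs to know its own parent."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


# Hypothetical five-node topology.
neighbors = {
    "sink": ["a", "b"],
    "a": ["sink", "c"],
    "b": ["sink", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
parent = build_tree(neighbors, "sink")
path = route_to_sink(parent, "d")
```

Because forwarding only needs the single `parent` entry, the per-node state in this sketch does not grow with the number of nodes, which mirrors the constant-state property claimed in the abstract.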
Niwat THEPVILOJANAPONG Yoshito TOBE Kaoru SEZAKI
In this paper, we present the Scalable Data Collection (SDC) protocol, a tree-based protocol for collecting data over multi-hop wireless sensor networks. The design of the protocol aims to satisfy the requirement of sensor networks that every sensor transmits sensed data to a sink node periodically or spontaneously. The sink nodes construct the tree by broadcasting a HELLO packet to discover their child nodes. A sensor receiving this packet decides on an appropriate parent to which it will attach, and then broadcasts the HELLO packet to discover its own child nodes. Through this process, the tree is quickly created without flooding any routing packets. SDC avoids periodic updating of routing information, but the tree is reconstructed upon node failures or the addition of new nodes. The state required on each sensor is constant and independent of network size; thus, SDC scales better than existing protocols. Moreover, each sensor can make forwarding decisions without any knowledge of geographical information. We evaluate the performance of SDC using the ns-2 simulator and compare it with Directed Diffusion, DSR, AODV, and OLSR. The simulation results demonstrate that SDC achieves a much higher delivery ratio and lower delay, as well as scalability, in various scenarios.
Akira FUKUDA Kaiji MUKUMOTO Yasuaki YOSHIHIRO Kei NAKANO Satoshi OHICHI Masashi NAGASAWA Hisao YAMAGISHI Natsuo SATO Akira KADOKURA Huigen YANG Mingwu YAO Sen ZHANG Guojing HE Lijun JIN
In December 2001, the authors started two kinds of experiments on meteor burst communication (MBC) in Antarctica to study the capability of MBC as a communication medium for data collection systems in that region. In the first experiment, a continuous tone signal is transmitted from Zhongshan Station. The signal received at Syowa Station (about 1,400 km away) is recorded and analyzed. This experiment aims to study the basic properties of the meteor burst channel in that high-latitude region. The second experiment, on the other hand, is designed to estimate the data throughput of a commercial MBC system in that region. A remote station at Zhongshan Station tries to transfer data packets, each consisting of 10 data words, to the master station at Syowa Station. Data packets are generated at five-minute intervals. In this paper, we explain the experiments, briefly examine the results of the first year (from April 2002 to March 2003), and put forward the plan for the experiments in the second and third years. From the data available thus far, we can see, among other things, that 1) the sinusoidal daily variation in meteor activity typical of middle- and low-latitude regions cannot be clearly seen, 2) non-meteoric propagation frequently dominates the channel, especially during night hours, and 3) about 60% of the generated data packets are successfully transferred to the master station within a delay of two hours, even though we are currently operating the data transfer system for only five minutes in each ten-minute interval.
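The figures quoted above allow a rough back-of-the-envelope estimate of daily volume. The sketch below only combines numbers stated in the abstract (one packet every five minutes, 10 data words per packet, about 60% delivered within two hours); treating 60% as an exact, uniform daily ratio is an assumption, and the function name is hypothetical.

```python
def daily_delivery_estimate(interval_min=5, success_ratio=0.60, words_per_packet=10):
    """Estimate packets generated and delivered per day from the quoted figures.

    Assumption: the approximate 60% success ratio applies uniformly over a day.
    """
    packets_per_day = (24 * 60) // interval_min          # packets generated per day
    delivered_packets = packets_per_day * success_ratio  # delivered within two hours
    delivered_words = delivered_packets * words_per_packet
    return packets_per_day, delivered_packets, delivered_words


packets, delivered, words = daily_delivery_estimate()
```

Under these assumptions, 288 packets are generated per day, of which roughly 173 (about 1,730 data words) would reach the master station within two hours.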
We propose a new and effective method of managing flash memory space for flash memory-specific file systems based on a log-structured file system. Flash memory has attractive features such as non-volatility and fast I/O speed, but it also suffers from the inability to update in situ and from a limited number of usage (erase) cycles. These drawbacks necessitate a number of changes to conventional storage (file) management techniques. Our focus is on lowering the cleaning cost and evenly utilizing flash memory cells while maintaining a balance between these two often-conflicting goals. The proposed cleaning method performs well, especially when storage utilization and the degree of locality are high. The cleaning efficiency is enhanced by dynamically separating cold data from non-cold data, an operation we call the 'collection operation.' The second goal, cycle-leveling, is achieved to the degree that the maximum difference between erase cycles is below the error range of the hardware. Experimental results show that the proposed technique provides sufficient performance for reliable flash storage systems.
Shu NAKAZATO Ikuo KUDO Katsuhiko SHIRAI
In this paper, we propose a new method of dialogue data collection which can be used to evaluate modules of a spoken dialogue system. To evaluate a module, it is necessary to use suitable data. Human-human dialogue data have not been appropriate for module evaluation, because spontaneous data usually include too many specific phenomena such as fillers, restarts, pauses, and hesitations. Human-machine dialogue data have not been appropriate for module evaluation either, because the dialogue was unnatural and the available vocabulary was limited. Here, we propose a 'Hybrid method' for the collection of spoken dialogue data. In our method, a human takes the role of some modules of the system, and the machine works as the remaining modules. For example, a human works as the speech recognition and dialogue management modules, and a machine handles the other part, the response generation module. The collected data are good for the evaluation of the speech recognition and dialogue management modules, for the following reasons. (1) Lexicon: The lexicon was composed of a limited set of words and was dependent on the task. (2) Grammar: The intentions expressed by the subjects were concise and clear. (3) Topics: There were few utterances outside the task domain. The merit of the method is therefore that the collected data can be used as test data for the evaluation of a spoken dialogue system without any modification.