Jung-Been LEE Taek LEE Hoh Peter IN
Mining software artifacts is a useful way to understand the source code of software projects. Topic modeling in particular has been widely used to discover meaningful information from software artifacts. However, software artifacts are unstructured and contain a mix of textual types within the natural text. These software artifact characteristics worsen the performance of topic modeling. Among several natural language pre-processing tasks, removing stop words to reduce meaningless and uninteresting terms is an efficient way to improve the quality of topic models. Although many approaches are used to generate effective stop words, the lists are outdated or too general to apply to mining software artifacts. In addition, the performance of the topic model is sensitive to the datasets used in the training for each approach. To resolve these problems, we propose an automatic stop word generation approach for topic models of software artifacts. By measuring topic coherence among words in the topic using Pointwise Mutual Information (PMI), we added words with a low PMI score to our stop words list for every topic modeling loop. Through our experiment, we proved that our stop words list results in a higher performance of the topic model than lists from other approaches.
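The PMI-based filtering described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes document-level co-occurrence counts, scores each topic word by its mean PMI against the other words of the same topic, and uses a hypothetical `threshold` parameter. The function names `pmi` and `low_pmi_words` are invented for this example.

```python
import math


def pmi(word_a, word_b, docs):
    """Pointwise mutual information from document-level co-occurrence.

    docs is a list of sets of tokens. Returns -inf when the two words
    never co-occur (an assumption made here to keep the sketch simple).
    """
    n = len(docs)
    n_a = sum(1 for d in docs if word_a in d)
    n_b = sum(1 for d in docs if word_b in d)
    n_ab = sum(1 for d in docs if word_a in d and word_b in d)
    if n_ab == 0:
        return float("-inf")
    # log( P(a, b) / (P(a) * P(b)) ) with probabilities estimated as counts / n
    return math.log((n_ab * n) / (n_a * n_b))


def low_pmi_words(topic_words, docs, threshold):
    """Return topic words whose mean PMI with the other topic words is below threshold."""
    candidates = []
    for word in topic_words:
        others = [o for o in topic_words if o != word]
        if not others:
            continue
        mean_score = sum(pmi(word, o, docs) for o in others) / len(others)
        if mean_score < threshold:
            candidates.append(word)
    return candidates


docs = [
    {"parse", "token", "get"},
    {"parse", "token"},
    {"get", "socket"},
    {"socket", "connect", "get"},
]
stop_word_candidates = low_pmi_words(["parse", "token", "get"], docs, threshold=0.0)
```

In an iterative setting, the returned candidates would be appended to the stop word list before the next topic modeling run.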
Yoichi SASAKI Tetsuo SHIBUYA Kimihito ITO Hiroki ARIMURA
In this paper, we study the approximate point set matching (APSM) problem with minimum RMSD score under translation, rotation, and one-to-one correspondence in d dimensions. Since most previous work on APSM problems uses similarity scores that do not take one-to-one correspondence between points into account, such as the Hausdorff distance, previously proposed methods cannot easily be applied to our APSM problem. We therefore focus on speeding up exhaustive search algorithms that can find all approximate matches. First, we present an efficient branch-and-bound algorithm using a novel lower bound function of the minimum RMSD score for the enumeration version of the APSM problem. Then, we modify this algorithm for the optimization version. Next, we present another algorithm that runs fast with high probability when a set of parameters is fixed. Experimental results on both synthetic datasets and real 3-D molecular datasets showed that our branch-and-bound algorithm achieved significant speed-up over the naive algorithm while retaining the advantage of generating all answers.
Ryo KAZAMA Kazuki SEKINE Satoshi ITO
Image quality depends on the randomness of the k-space signal under-sampling in compressed sensing MRI (CS-MRI), especially for two-dimensional image acquisition. We investigate the feasibility of non-random signal under-sampling CS-MRI to stabilize the quality of reconstructed images and avoid arbitrariness in sampling point selection. Regular signal under-sampling for the phase-encoding direction is adopted, in which sampling points are chosen at equal intervals for the phase-encoding direction while varying the sampling density. Curvelet transform was adopted to remove the aliasing artifacts due to regular signal under-sampling. To increase the incoherence between the measurement matrix and the sparsifying transform function, the scale of the curvelet transform was varied in each iterative image reconstruction step. We evaluated the obtained images by the peak-signal-to-noise ratio and root mean squared error in localized 3×3 pixel regions. Simulation studies and experiments showed that the signal-to-noise ratio and the structural similarity index of reconstructed images were comparable to standard random under-sampling CS. This study demonstrated the feasibility of non-random under-sampling based CS by using the multi-scale curvelet transform as a sparsifying transform function. The technique may help to stabilize the obtained image quality in CS-MRI.
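For reference, the two error metrics named above can be computed as in the sketch below. It assumes a region is given as a flat list of pixel intensities (for example the nine values of a 3×3 window) and an 8-bit peak value of 255; both are assumptions for illustration, not details taken from the study.

```python
import math


def rmse(reference, reconstructed):
    """Root mean squared error between two equally sized pixel lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("regions must be non-empty and of equal size")
    squared = sum((r - x) ** 2 for r, x in zip(reference, reconstructed))
    return math.sqrt(squared / len(reference))


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical regions."""
    error = rmse(reference, reconstructed)
    if error == 0.0:
        return math.inf
    return 20.0 * math.log10(peak / error)


# A 3x3 region flattened row by row, with one pixel off by 3.
region_ref = [10, 10, 10, 10, 10, 10, 10, 10, 10]
region_rec = [10, 10, 10, 10, 10, 10, 10, 10, 13]
region_rmse = rmse(region_ref, region_rec)
region_psnr = psnr(region_ref, region_rec)
```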
Yoshiki KURIHARA Yuki KOIZUMI Toru HASEGAWA Mayutan ARUMAITHURAI
Location-based forwarding is a key driver for location-based services. This paper designs forwarding information data structures for location-based forwarding in Internet Service Provider (ISP) scale networks based on Named Data Networking (NDN). Its important feature is a naming scheme which represents locations by leveraging space-filling curves.
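One way to picture a location name built from a space-filling curve is the Z-order (Morton) curve sketched below. The abstract does not say which curve or name layout the paper uses, so the choice of curve, the `/loc` prefix, and the one-digit-per-level component layout are all assumptions made for illustration.

```python
def interleave_bits(x, y, bits):
    """Interleave the low `bits` bits of x and y into a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x occupies even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y occupies odd bit positions
    return code


def location_name(lat, lon, bits=8, prefix="/loc"):
    """Map a latitude/longitude pair to a hierarchical name (hypothetical layout).

    Each name component is one base-4 digit of the Morton code, most
    significant first, so nearby locations tend to share a name prefix.
    """
    cells = 1 << bits
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    code = interleave_bits(x, y, bits)
    digits = [str((code >> (2 * level)) & 3) for level in reversed(range(bits))]
    return prefix + "/" + "/".join(digits)


name_a = location_name(35.68, 139.76, bits=4)
name_b = location_name(35.69, 139.77, bits=4)
```

Because each additional component refines the cell by a factor of four, a prefix of such a name denotes an enclosing area, which fits naturally with prefix-based name lookup in NDN.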
Yi JIANG Kenichiro YAMAZAKI Toshihiro HAYATA Kohei IZUI Kanada NAKAYASU Toshifumi SATO Tatsuki OKUYAMA Jun MASHINO Satoshi SUYAMA Yukihiko OKUMURA
Massive multiple input and multiple output (Massive MIMO) is a key technique to achieve high system capacity and user data rate for the fifth generation (5G) radio access network (RAN). To implement Massive MIMO in 5G, the extent to which it meets expectations with various user equipment (UEs) in different environments should be carefully assessed. We focused on using Massive MIMO in the low super-high-frequency (SHF) band, which is expected to be used for 5G commercial bands relatively soon. We previously developed a prototype low-SHF-band centralized-RAN Massive MIMO system that has a flexible active antenna system (AAS)-unit configuration and facilitates advanced radio coordination features, such as coordinated beamforming (CB) coordinated multi-point (CoMP). In this study, we conduct field trials to evaluate downlink (DL) multi-user (MU)-MIMO performance by using our prototype system in outdoor and indoor environments. The results indicate that about 96% of the maximum total DL system throughput can be achieved with 1 AAS unit outdoors and 2 AAS units indoors. We also investigate channel capacity based on the real propagation channel estimation data measured by the prototype system. Compared with without-CB mode, the channel capacity of with-CB mode increases by a maximum of 80% and 104%, respectively, when the locations of UEs are randomly selected in the outdoor and indoor environments. Furthermore, the results from the field trial of with-CB mode with eight UEs indicate that the total DL system throughput and user data rate can be significantly improved.
Shinnosuke SARUWATARI Fuyuki ISHIKAWA Tsutomu KOBAYASHI Shinichi HONIDEN
Refinement-based formal specification is a promising approach to the increasing complexity of software systems, as demonstrated in the formal method Event-B. It allows stepwise modeling and verifying of complex systems with multiple steps at different abstraction levels. However, making changes is more difficult, as caution is necessary to avoid breaking the consistency between the steps. Judging whether a change is valid or not is a non-trivial task, as the logical dependency relationships between the modeling elements (predicates) are implicit and complex. In this paper, we propose a method for analyzing the impact of the changes of Event-B. By attaching labels to modeling elements (predicates), the method helps engineers understand how a model is structured and what needs to be modified to accomplish a change.
LINE is currently the most popular messaging service in Japan. Communications using LINE are protected by the original encryption scheme, called LINE Encryption, and specifications of the client-to-server transport encryption protocol and the client-to-client message end-to-end encryption protocol are published in the Technical Whitepaper. Though a spoofing attack (i.e., a malicious client makes another client misunderstand the identity of the peer) and a replay attack (i.e., a message in a session is sent again in another session by a man-in-the-middle adversary, and the receiver accepts these messages) on the end-to-end protocol have been shown, no formal security analysis of these protocols is known. In this paper, we show a formal verification result of secrecy of application data and authenticity for protocols of LINE Encryption (Version 1.0) by using the automated security verification tool ProVerif. In particular, since it is claimed that the transport protocol satisfies forward secrecy (i.e., even if the static private key is leaked, security of application data is guaranteed), we verify forward secrecy for the client's data and for the server's data of the transport protocol, and we find an attack that breaks the secrecy of the client's application data. Moreover, we find the spoofing attack and the replay attack, which are reported in previous papers.
Daisuke KITAYAMA Kiichi TATEISHI Daisuke KURITA Atsushi HARADA Minoru INOMATA Tetsuro IMAI Yoshihisa KISHIYAMA Hideshi MURAI Shoji ITOH Arne SIMONSSON Peter ÖKVIST
This paper describes the results of outdoor mobility measurements and high-speed vehicle tests that clarify the 4-by-8 multiple-input multiple-output (MIMO) throughput performance when applying distributed MIMO with narrow antenna-beam tracking in a 28-GHz frequency band in the downlink of a 5G cellular radio access system. To clarify suitable transmission point (TP) deployment for mobile stations (MS) moving at high speed, we examine two arrangements for three TPs. The first sets all TPs in a line along the same side of the path traversed by the MS, and the other sets one TP on the other side of the path. The experiments in which the MS is installed on a moving wagon reveal that the latter deployment enables a high peak data rate and high average throughput performance, exhibiting a peak throughput of 15Gbps at a vehicle speed of 3km/h. Setting the MS in a vehicle travelling at 30km/h yielded a peak throughput of 13Gbps. A peak throughput of 11Gbps is achieved at a vehicle speed of 100km/h, and beam tracking and intra-baseband unit handover operation are successfully demonstrated even at this high vehicle speed.
Tatsuki OKUYAMA Satoshi SUYAMA Jun MASHINO Kazushi MURAOKA Kohei IZUI Kenichiro YAMAZAKI Yukihiko OKUMURA
The beamforming (BF) provided by Massive MIMO is a promising technique for the fifth-generation (5G) mobile communication system. In low SHF bands such as 3-6GHz, fully digital Massive MIMO can be a feasible option. Previous works proposed eigenvector zero-forcing (E-ZF) as a digital precoding algorithm to lower the complexity of block diagonalization (BD). On the other hand, another previous work aiming to reduce the complexity of BD due to the number of antenna elements proposed digital fixed BF and channel-state-information based precoding (Digital FBCP) with BD, whose parameter is the number of beams. Moreover, in order to lower the complexity of Digital FBCP with BD while retaining the transmission performance, this paper proposes Digital FBCP with E-ZF as a lower complexity digital BF algorithm. The pros and cons of these digital BF algorithms in terms of transmission performance and computational complexity are clarified to select the most appropriate algorithm for fully digital Massive MIMO. Furthermore, E-ZF can be implemented in 4.5GHz-band fully digital Massive MIMO equipment only when the number of antenna elements is less than or equal to 64, and thus a 5G experimental trial employing E-ZF was carried out in Tokyo, Japan, where early 5G commercial services will launch. To the best of our knowledge, this was the first outdoor experiment on 4.5GHz-band Massive MIMO in a dense urban area. An outdoor experiment in a rural area was also carried out. This paper shows both the coverage performance under the single-user condition and the system throughput performance under a densely deployed four-user condition in the outdoor experimental trials employing the E-ZF algorithm. We reveal that, in the MU-MIMO experiment, the measured system throughput is almost 80% of the maximum system throughput even if users are closely located in the dense urban area, thanks to the E-ZF algorithm.
Daisuke KURITA Kiichi TATEISHI Daisuke KITAYAMA Atsushi HARADA Yoshihisa KISHIYAMA Hideshi MURAI Shoji ITOH Arne SIMONSSON Peter ÖKVIST
This paper evaluates a variety of key 5G technologies such as base station (BS) massive multiple-input multiple-output (MIMO) antennas, beamforming and tracking, intra-baseband unit (BBU) handover (HO), and coverage. The evaluation covers several 5G deployment scenarios with a variety of radio conditions: an indoor office building lobby, an outdoor parking area, and a realistic urban deployment of a 5G radio access system with BSs installed in buildings to form a 5G trial area in the Tokyo Odaiba waterfront area. Experimental results show that throughput exceeding 10Gbps is achieved in a 730MHz bandwidth using 8 component carriers, and distributed MIMO throughput gain is achieved in various transmission point deployments in the indoor office building lobby and outdoor parking area using two radio units (RUs). In particular, in the outdoor parking area, a distinct advantage from distributed MIMO is expected and a distributed MIMO gain in throughput of 60% is achieved. The experimental results also clarify the downlink performance in an urban deployment. The experimental results show that throughput exceeding 1.5Gbps is achieved in the area and approximately 200 Mbps is achieved at 500m away from the BS. We also confirm that the beam tracking and intra-BBU HO work well, compensating for high path loss at 28-GHz, and achieve coverage 500m from the BS. On the other hand, line-of-sight (LoS) and non-line-of-sight (N-LoS) conditions are critical to 5G performance in the 28-GHz band, and we observe that 5G connections are sometimes dropped behind trees, buildings, and under footbridges.
Xinyu DA Lei NI Hehao NIU Hang HU Shaohua YUE Miao ZHANG
In this work, we investigate a joint transmit beamforming and artificial noise (AN) covariance matrix design in a multiple-input multiple-output (MIMO) cognitive radio (CR) downlink network with simultaneous wireless information and power transfer (SWIPT), where the malicious energy receivers (ERs) may decode the desired information and hence can be treated as potential eavesdroppers (Eves). In order to improve the security of the transmission, AN is embedded in the information-bearing signal, which acts as interference to the Eves and provides energy to all receivers. Specifically, this joint design is studied under a practical non-linear energy harvesting (EH) model. Our aim is to maximize the secrecy rate at the SR subject to the transmit power budget, EH constraints and quality of service (QoS) requirement. The original problem is not convex and is challenging to solve. To circumvent its intractability, an equivalent reformulation of this secrecy rate maximization (SRM) problem is introduced, wherein the resulting problem is primal decomposable and thus can be handled by alternately solving two convex subproblems. Finally, numerical results are presented to verify the effectiveness of our proposed scheme.
Mizuki SUGA Atsushi OHTA Kazuto GOTO Takahiro TSUCHIYA Nobuaki OTSUKI Yushi SHIRATO Naoki KITA Takeshi ONIZAWA
A propagation experiment on an actual channel is conducted to confirm the effectiveness of the 1-tap time domain beamforming (TDBF) technique we proposed in previous work. This technique offers simple beamforming for the millimeter waveband massive multiple-input multiple-output (MIMO) applied wireless backhaul and so supports the rapid deployment of fifth generation mobile communications (5G) small cells. This paper details propagation experiments in the 75GHz band and the characteristics evaluations of 1-tap TDBF as determined from actual channel measurements. The results show that 1-tap TDBF array gain nearly equals the frequency domain maximal ratio combining (MRC) value, which is ideal processing; the difference is within 0.5dB. In addition, 1-tap TDBF can improve on the signal-to-interference power ratio (SIR) by about 13% when space division multiplexing (SDM) is performed assuming existing levels of channel estimation error.
Shohei YOSHIOKA Satoshi SUYAMA Tatsuki OKUYAMA Jun MASHINO Yukihiko OKUMURA
Towards furthering the industrial revolution, the concept of a new cellular network began to be drawn up around 2010 as the fifth generation (5G) mobile wireless communication system. One of the main differences between the fourth generation (4G) mobile communication system Long Term Evolution (LTE) and 5G new radio (NR) is the frequency bands utilized. 5G NR assumes higher frequency bands. Effective utilization of the higher frequency bands requires resolving the technical issue of larger path loss. Massive multiple-input multiple-output (Massive MIMO) beamforming (BF) technology helps to overcome this problem; hence, further study of Massive MIMO BF for each frequency band is necessary to achieve high performance and easy implementation. In this paper, we propose a Massive MIMO method with fully-digital BF based on two-tap precoding for low super high frequency (SHF) band downlink (DL) transmissions (called Digital FBCP). Additionally, three intersite coordination algorithms for Digital FBCP are presented for multi-site environments and one of the three algorithms is enhanced. It is shown that Digital FBCP achieves better throughput performance than a conventional algorithm with one-tap precoding. Considering the performance of intersite coordination as well, it is concluded that Digital FBCP can achieve around 5 Gbps in various practical environments.
Jan LEWANDOWSKY Gerhard BAUCH Matthias TSCHAUNER Peter OPPERMANN
Receiver implementations with very low quantization resolution will play an important role in 5G, as high precision quantization and signal processing are costly in terms of computational resources and chip area. Therefore, low resolution receivers with quasi-optimum performance will be required to meet complexity and latency constraints. The Information Bottleneck method allows for a novel, information-centric approach to designing such receivers. The method was originally introduced by Naftali Tishby et al. and has mostly been used in the machine learning field so far. Interestingly, it can also be applied to build surprisingly good digital communication receivers which work fundamentally differently from state-of-the-art receivers. Instead of minimizing the quantization error, receiver components with maximum preservation of relevant information for a given bit width can be designed. All signal processing in the resulting receivers is performed using only simple lookup operations. In this paper, we first provide a brief introduction to the design of receiver components with the Information Bottleneck method. Throughout, we refer to the decoding of low-density parity-check codes as a practical example. The focus of the paper lies on practical decoder implementations on a digital signal processor which illustrate the potential of the proposed technique. An Information Bottleneck decoder with 4bit message passing decoding is found to outperform 8bit implementations of the well-known min-sum decoder in terms of bit error rate and to perform extremely close to an 8bit belief propagation decoder, while offering considerably higher net decoding throughput than both conventional decoders.
Eun-Sung JUNG Si LIU Rajkumar KETTIMUTHU Sungwook CHUNG
The scale of scientific data generated by experimental facilities and simulations in high-performance computing facilities has been proliferating with the emergence of Internet of Things (IoT)-based big data. In many cases, this data must be transmitted rapidly and reliably to remote facilities for storage, analysis, or sharing in IoT applications. To ensure its integrity, IoT data can be verified using a checksum after the data has been written to the disk at the destination. However, this end-to-end integrity verification inevitably creates overheads (extra disk I/O and more computation). Thus, the overall data transfer time increases. In this article, we evaluate strategies to maximize the overlap between data transfer and checksum computation for astronomical observation data. Specifically, we examine file-level and block-level (with various block sizes) pipelining to overlap data transfer and checksum computation. We analyze these pipelining approaches in the context of GridFTP, a widely used protocol for scientific data transfers. Theoretical analysis and experiments are conducted to evaluate our methods. The results show that block-level pipelining is effective in maximizing the overlap mentioned above, and can improve the overall data transfer time with end-to-end integrity verification by up to 70% compared to the sequential execution of transfer and checksum, and by up to 60% compared to file-level pipelining.
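The idea of block-level pipelining can be illustrated with the small sketch below, in which one thread "receives" blocks while another updates the checksum as soon as each block is available. It is an in-memory stand-in only: it does not use GridFTP or disk I/O, and the choice of SHA-256, the queue depth, and the function name `transfer_with_pipelined_checksum` are assumptions for illustration.

```python
import hashlib
import queue
import threading


def transfer_with_pipelined_checksum(data, block_size):
    """Copy `data` block by block while hashing completed blocks concurrently."""
    blocks = queue.Queue(maxsize=8)   # bounded buffer between the two stages
    received = bytearray()            # stand-in for the destination file
    digest = hashlib.sha256()

    def receive():
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            received.extend(block)    # stand-in for writing the block to disk
            blocks.put(block)         # hand the finished block to the checksum stage
        blocks.put(None)              # sentinel: no more blocks

    def checksum():
        while True:
            block = blocks.get()
            if block is None:
                break
            digest.update(block)      # incremental hash, one block at a time

    workers = [threading.Thread(target=receive), threading.Thread(target=checksum)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return bytes(received), digest.hexdigest()


payload = bytes(range(256)) * 1000
copied, pipelined_digest = transfer_with_pipelined_checksum(payload, block_size=4096)
```

File-level pipelining corresponds to the same structure with one block per file, so the checksum of a file can only start once that whole file has arrived.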
Hanxing XUE Jiali YOU Jinlin WANG
Smart-routers have developed rapidly in recent years as representative products of IoT and the smart home. Unlike traditional routers, they have storage and processing capacity. Smart-routers in the same location or ISP have better link conditions and can provide high-quality service to each other. Therefore, for services that request content, how to construct the overlay network and efficiently deploy replications of popular content in the smart-router network is critical. The performance of existing centralized models is limited by the bottleneck of a single point's performance. In order to improve the stability and scalability of the system by exploiting the capability of smart-routers, we propose a novel intelligent and decentralized content diffusion system for smart-router networks. In the system, content is quickly and autonomously diffused through the network so as to meet a specified coverage-rate requirement among neighbors. Furthermore, we design a heuristic node selection algorithm (MIG) and a replacement algorithm (MCL) to assist the diffusion of content. Specifically, the system based on MIG selects the neighbor with the maximum information gain to cache the replication. The system based on MCL replaces the replication with the least loss of coverage rate gain. Simulation experiments show that, for the same coverage-rate requirement, MIG can reduce the number of replications by at least 20.2% compared with other algorithms. Compared with other replacement algorithms, MCL achieves the best successful service rate, i.e., the proportion of service requests that can be served by neighbors. The system based on MIG and MCL can provide stable service with the lowest bandwidth and storage cost.
Yuta TAKAHASHI Tatsuki OKUYAMA Kazushi MURAOKA Satoshi SUYAMA Jun MASHINO Yukihiko OKUMURA
Field trials of the fifth-generation (5G) mobile communication system using the 28GHz band, in which almost 1GHz of bandwidth will be available, have been performed all over the world. To realize large coverage in such a high frequency band, beamforming by Massive MIMO (Multiple Input Multiple Output) is necessary to compensate for the large path loss. Furthermore, beam tracking, which adaptively changes the beam direction according to user location, is an important function for supporting user mobility. In previous works, field trials in a subway environment in the 25GHz band were carried out, but only fixed beams were employed. On the other hand, field trial results of 28 GHz-band 5G transmission employing beam tracking in a road environment have been reported. Therefore, we conducted 5G field trials in an actual railway environment using 28GHz band experimental equipment employing beam tracking. This paper reveals the downlink performance achieved by using railway cars traveling at 90km/h. In addition, we show how the position of the mobile station in the railway car affects the performance of 5G transmission.
Nam Tuan LY Kha Cong NGUYEN Cuong Tuan NGUYEN Masaki NAKAGAWA
This paper presents recognition of anomalously deformed Kana sequences in Japanese historical documents, for which a contest was held by IEICE PRMU 2017. The contest was divided into three levels in accordance with the number of characters to be recognized: level 1: single characters, level 2: sequences of three vertically written Kana characters, and level 3: unrestricted sets of characters composed of three or more characters possibly in multiple lines. This paper focuses on the methods for levels 2 and 3 that won the contest. We basically follow the segmentation-free approach and employ the hierarchy of a Convolutional Neural Network (CNN) for feature extraction, Bidirectional Long Short-Term Memory (BLSTM) for frame prediction, and Connectionist Temporal Classification (CTC) for text recognition, which is named a Deep Convolutional Recurrent Network (DCRN). We compare the pretrained CNN approach and the end-to-end approach with more detailed variations for level 2. Then, we propose a method of vertical text line segmentation and multiple line concatenation before applying DCRN for level 3. We also examine a two-dimensional BLSTM (2DBLSTM) based method for level 3. We present the evaluation of the best methods by cross validation. We achieved an accuracy of 89.10% for the three-Kana-character sequence recognition and an accuracy of 87.70% for the unrestricted Kana recognition without employing linguistic context. These results prove the performances of the proposed models on the level 2 and 3 tasks.
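The final CTC step of such a pipeline can be illustrated by greedy (best-path) decoding: take the most likely label per frame, collapse consecutive repeats, and drop blanks. The abstract does not state which decoding strategy was used, so greedy decoding, the toy label set, and the blank index 0 are assumptions made for this sketch.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Best-path CTC decoding.

    frame_scores: one list of per-label scores for each frame.
    labels: label strings indexed like the scores; labels[blank] is unused.
    Returns the decoded label sequence as a list.
    """
    # Most likely label index for every frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame and is not blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return decoded


labels = ["<blank>", "ka", "na"]
frames = [
    [0.1, 0.8, 0.1],    # ka
    [0.1, 0.8, 0.1],    # ka (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank separates the two "ka" labels
    [0.2, 0.7, 0.1],    # ka
    [0.1, 0.2, 0.7],    # na
    [0.1, 0.2, 0.7],    # na (repeat, collapsed)
    [0.8, 0.1, 0.1],    # blank
]
decoded_sequence = ctc_greedy_decode(frames, labels)
```

The blank between the two runs of "ka" is what allows the same character to appear twice in a row in the output.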
Tao WANG Mingfang WANG Yating WU Yanzan SUN
This paper proposes an energy efficiency (EE) maximized resource allocation (RA) algorithm in orthogonal frequency division multiple access (OFDMA) downlink networks with multiple relays, where a novel opportunistic subcarrier pair based decode-and-forward (DF) protocol with beamforming is used. Specifically, every data transmission is carried out in two consecutive time slots. During every transmission, multiple parallel paths, including relayed paths and direct paths, are established by the proposed RA algorithm. As for the protocol, each subcarrier in the 1st slot can be paired with any subcarrier in the 2nd slot to best utilize subcarrier resources. Furthermore, for each relayed path, multiple (not just single or all) relays can be chosen to apply beamforming at the subcarrier in the 2nd slot. Each direct path is constructed by an unpaired subcarrier in either the 1st or 2nd slot. In order to guarantee an acceptable spectrum efficiency, we also introduce a minimum rate constraint. The EE-maximized problem is a highly nonlinear optimization problem, which contains both continuous and discrete variables and has a fractional structure. To solve the problem, the best relay set and resource allocation for a relayed path are derived first; then we design an iterative algorithm to find the optimal RA for the network. Finally, numerical experiments are conducted to demonstrate the effectiveness of the proposed algorithm and to show the impact of the minimum rate requirement, user number and circuit power on the network EE.
Hideaki KINSHO Rie TAGYO Daisuke IKEGAMI Takahiro MATSUDA Jun OKAMOTO Tetsuya TAKINE
In this paper, we consider network monitoring techniques to estimate communication qualities in wide-area mobile networks, where an enormous number of heterogeneous components such as base stations, routers, and servers are deployed. We assume that average delays of neighboring base stations are comparable, most servers have small delays, and delays at core routers are negligible. Under these assumptions, we propose Heterogeneous Delay Tomography (HDT) to estimate the average delay at each network component from end-to-end round trip times (RTTs) between mobile terminals and servers. HDT employs a crowdsourcing approach to collecting RTTs, where voluntary mobile users report their empirical RTTs to a data collection center. From the collected RTTs, HDT estimates average delays at base stations in the Graph Fourier Transform (GFT) domain and average delays at servers, by means of Compressed Sensing (CS). In the crowdsourcing approach, the performance of HDT may be degraded when the voluntary mobile users are unevenly distributed. To resolve this problem, we further extend HDT by considering the number of voluntary mobile users. With simulation experiments, we evaluate the performance of HDT.