Ping ZENG Qingping TAN Xiankai MENG Haoyu ZHANG Jianjun XU
Determining the validity of knowledge triples and filling in missing entities or relationships in the knowledge graph are crucial tasks for large-scale knowledge graph completion. So far, the main solutions use machine learning methods to learn low-dimensional distributed representations of entities and relationships in order to complete the knowledge graph. Among them, translation models obtain excellent performance. However, existing translation models do not adequately consider the indirect relationships among entities, which affects the precision of the representation. Based on the long short-term memory neural network and existing translation models, we propose a multiple-module hybrid neural network model called TransP. By modeling the entity paths and their relationship paths, TransP can effectively mine the indirect relationships among the entities and thus improve the quality of knowledge graph completion tasks. Experimental results show that TransP outperforms state-of-the-art models in the entity prediction task and achieves performance comparable to previous models in the relationship prediction task.
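As a rough illustration of the translation-model idea that this abstract builds on, the sketch below scores a triple by the distance between head + relation and tail, and composes a two-hop relationship path by simple vector addition. The additive composition, the function names and the toy embeddings are assumptions made for illustration only; TransP itself encodes paths with an LSTM, which is not reproduced here.

```python
import math

def translation_score(head, relation, tail):
    """Euclidean distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def compose_path(relations):
    """Hypothetical path composition: element-wise sum of the relation vectors.

    This stands in for a learned path encoder and is NOT the TransP encoder.
    """
    dim = len(relations[0])
    return [sum(rel[i] for rel in relations) for i in range(dim)]

# Toy 2-dimensional embeddings (made up for the example).
h = [0.1, 0.2]
r1 = [0.3, 0.0]
r2 = [0.0, 0.4]
t = [0.4, 0.6]

path = compose_path([r1, r2])
direct = translation_score(h, r1, t)      # one-hop relation only
indirect = translation_score(h, path, t)  # two-hop path r1 followed by r2
print(direct, indirect)
```

In this toy setting the two-hop path explains the tail entity better than the single relation does, which is the kind of indirect evidence a path-aware model can exploit.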
Koki ISHIDA Masamitsu TANAKA Takatsugu ONO Koji INOUE
CMOS microprocessors are limited in their capacity for clock speed improvement because of increasing power consumption, i.e., they face a power-wall problem. Single-flux-quantum (SFQ) circuits offer a solution with their ultra-high-speed and ultra-low-power nature. This paper introduces our contributions towards ultra-high-speed cryogenic SFQ computing. The first step is to design SFQ microprocessors. From qualitative and quantitative evaluation of previously designed SFQ microprocessors, we have found that revisiting the architecture of SFQ microprocessors and on-chip caches is the first critical challenge. On the basis of cross-layer discussions and analysis, we came to the conclusion that a bit-parallel gate-level pipeline architecture is the best solution for SFQ designs. This paper summarizes our current research results targeting SFQ microprocessors and on-chip cache architectures.
Qian LI Xiaojuan LI Bin WU Yunpeng XIAO
In social networks, predicting user behavior under social hotspots can aid in understanding the development trend of a topic. In this paper, we propose a retweeting prediction method for social hotspots based on tensor decomposition, using user information, relationship and behavioral data. The method can be used to predict the behavior of users and analyze the evolution of topics. Firstly, we propose a tensor-based mechanism for mining user interaction, and we use the tensor to solve the problem of inaccuracy that arises when calculating interaction intensity for sparse user interaction data. At the same time, we can analyze the influence of the following relationship on the interaction between users based on characteristics of the tensor in data space conversion and projection. Secondly, a time decay function is introduced for the tensor to further quantify the evolution of user behavior in current social hotspots. That function can be fitted to the behavior of a user dynamically, and can also account for the decay of interaction between users over time. Finally, we introduce time slices and discretization of the topic life cycle and construct a user retweeting prediction model based on logistic regression. In this way, we can both explore the temporal characteristics of user behavior in social hotspots and solve the problem of uneven interaction behavior between users. Experiments show that the proposed method can effectively improve the accuracy of user behavior prediction and aid in understanding the development trend of a topic.
Chang-Hee KANG Sung-Soon PARK Young-Hwan YOU Hyoung-Kyu SONG
In wireless communication systems, OFDM is a communication method that can yield high data rates. However, OFDM systems suffer from high PAPR values due to the use of many subcarriers. The SLM and PTS techniques were proposed to solve the PAPR problem in OFDM systems. However, these approaches have the disadvantage of high complexity. This paper proposes a method that has lower complexity than the conventional PTS method with only slight performance degradation.
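To make the PAPR problem concrete, the following minimal sketch computes the peak-to-average power ratio of one OFDM symbol with a plain inverse DFT and then picks the lowest-PAPR candidate among a few phase-rotated versions, in the spirit of SLM. It is not the method proposed in the paper; the function names and the example phase sequences are assumptions, and no oversampling is applied.

```python
import cmath
import math

def idft(symbols):
    """Inverse DFT of the subcarrier symbols (time-domain OFDM samples)."""
    n = len(symbols)
    return [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def papr_db(symbols):
    """Peak-to-average power ratio of one OFDM symbol, in dB."""
    powers = [abs(s) ** 2 for s in idft(symbols)]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

def slm_select(symbols, phase_sequences):
    """SLM-style selection: rotate the symbols by each candidate phase
    sequence and return (index, PAPR in dB) of the best candidate."""
    candidates = [
        papr_db([s * p for s, p in zip(symbols, seq)]) for seq in phase_sequences
    ]
    best = min(range(len(candidates)), key=candidates.__getitem__)
    return best, candidates[best]

# Worst case: identical symbols on all 8 subcarriers add up coherently.
data = [1] * 8
sequences = [
    [1] * 8,                        # identity (no rotation)
    [1, -1, 1, 1, -1, 1, -1, -1],   # hypothetical +/-1 phase sequence
    [1, 1j, -1, -1j, 1, -1j, -1, 1j],
]
print(papr_db(data))
print(slm_select(data, sequences))
```

The exhaustive search over candidates is what makes SLM and PTS costly: every candidate needs its own inverse transform, which is the complexity that reduced-complexity schemes try to avoid.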
Guan YUAN Mingjun ZHU Shaojie QIAO Zhixiao WANG Lei ZHANG
With the extensive use of location-based devices, trajectories of various kinds of moving objects can be collected and stored. As time goes on, the volume of trajectory data increases exponentially, which presents a series of problems in storage, transmission and analysis. Moreover, GPS trajectories are never perfectly accurate and sometimes contain high noise. Therefore, overcoming these problems has become an urgent task in trajectory data mining and related applications. In this paper, an adaptive noise filtering trajectory compression and recovery algorithm based on Compressed Sensing (CS) is proposed. Firstly, a noise reduction model is introduced to filter the high noise in GPS trajectories. Secondly, the compressed data are obtained by the improved GPS Trajectory Data Compression Algorithm. Thirdly, an adaptive GPS trajectory data recovery algorithm is adopted to approximately restore the compressed trajectories to their original status. Finally, comprehensive experiments on real and synthetic datasets demonstrate that the proposed algorithm is not only good at noise filtering, but also achieves a high compression ratio and recovery performance compared to current algorithms.
Ziwei DENG Yilin HOU Xina CHENG Takeshi IKENAGA
3D ball tracking is of great significance in ping-pong game analysis and can be applied to TV content production and tactical analysis, some of which require real-time implementation. This paper proposes a CPU-GPU platform based Particle Filter for multi-view ball tracking comprising 4 proposals. The multi-peak estimation and the ball-like observation model are proposed in the algorithm design. The multi-peak estimation aims at obtaining a precise ball position in case the particles' likelihood distribution has multiple peaks under complex circumstances. The ball-like observation model, with 4 different likelihood evaluations, utilizes the ball's unique features to evaluate each particle's similarity with the target. In the GPU implementation, the double-queue structure and the vectorized data combination are proposed. The double-queue structure aims at achieving task parallelism between data-independent tasks. The vectorized data combination reduces the time cost of memory access by combining 3 different image data into 1 vector data. Experiments are based on ping-pong videos of an official match recorded by 4 cameras located at the 4 corners of the court. The tracking success rate reaches 99.59% on CPU. With the GPU acceleration, the time consumption is 8.8 ms/frame, a speedup by a factor of 98 compared with the CPU version.
Li QUAN Zhi-liang WANG Xin LIU
Reinforcement learning has been used for adaptive service composition. However, traditional algorithms are not suitable for large-scale service composition. Based on the Q-learning algorithm, a multi-task oriented algorithm named multi-Q learning is proposed to realize a subtask-assistance strategy for large-scale and adaptive service composition. Unlike previous studies that focus on one task, we take the relationship between multiple service composition tasks into account. We decompose a complex service composition task into multiple subtasks according to graph theory. Different tasks with the same subtasks can assist each other to improve their learning speed. The experimental results show that our algorithm learns noticeably faster than the traditional Q-learning algorithm. Compared with multi-agent Q-learning, our algorithm also converges faster. Moreover, for all involved service composition tasks that share subtasks with each other, our algorithm can improve their speed of learning the optimal policy simultaneously in real time.
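For readers unfamiliar with the baseline, the sketch below shows the standard tabular Q-learning update that the abstract refers to, applied to a toy service-selection table. The state and service names, the learning rate and the discount factor are hypothetical; the multi-Q learning algorithm and its subtask sharing are not reproduced here.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

    States missing from the table (e.g. a terminal state) contribute 0.
    """
    next_actions = q_table.get(next_state)
    best_next = max(next_actions.values()) if next_actions else 0.0
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q_table[state][action]

# Hypothetical composition workflow: at each state, pick one candidate service.
q = {
    "s0": {"svcA": 0.0, "svcB": 0.0},
    "s1": {"svcC": 1.0},
}
first = q_update(q, "s0", "svcA", reward=1.0, next_state="s1")
second = q_update(q, "s1", "svcC", reward=2.0, next_state="end")
print(first, second)
```

Because each task keeps its own table in this basic form, nothing learned for one composition task transfers to another, which is the limitation that sharing subtask knowledge is meant to address.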
Motoko TACHIBANA Kohei YAMAMOTO Kurato MAENO
Radar is expected to be used in advanced driver-assistance systems for environmentally robust measurements. In this paper, we propose a novel radar signal segmentation method using a complex-valued fully convolutional network (CvFCN) that comprises complex-valued layers, real-valued layers, and a bidirectional conversion layer between them. We also propose an efficient automatic annotation system for dataset generation. We apply the CvFCN to two-dimensional (2D) complex-valued radar signal maps (r-maps) that comprise angle and distance axes. An r-map is a 2D complex-valued matrix that is generated from raw radar signals by 2D Fourier transformation. We annotate the r-maps automatically using LiDAR measurements. In our experiment, we semantically segment r-map signals into pedestrian and background regions, achieving an accuracy of 99.7% for the background and 96.2% for pedestrians.
Abdel MARTINEZ ALONSO Masaya MIYAHARA Akira MATSUZAWA
A 7 GS/s complete DDFS solution featuring a two-times interleaved RDAC with 1.2 Vpp-diff output swing was fabricated in 65 nm CMOS. The frequency tuning and amplitude resolutions are 24 bits and 10 bits, respectively. The RDAC includes a mixed-signal, high-speed architecture for random swapping thermometer coding dynamic element matching (RSTC-DEM) that improves the narrowband SFDR by up to 8 dB for output frequencies below 1.85 GHz. The proposed techniques enable 7 GS/s operation with a spurious-free dynamic range better than 32 dBc over the full Nyquist bandwidth. The worst-case narrowband SFDR is 42 dBc. This system consumes 87.9 mW/(GS/s) from a 1.2 V power supply when the RSTC-DEM method is enabled, resulting in a FoM of 458.9 GS/s·2^(SFDR/6)/W. A proof-of-concept chip with an active area of only 0.22 mm² was measured in prototypes encapsulated in a 144-pin low-profile quad flat package.
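The relationship between the tuning resolution and the output frequency can be sketched with the usual phase-accumulator formula f_out = FTW · f_clk / 2^N. The snippet assumes that the 24-bit frequency tuning resolution corresponds to a 24-bit phase accumulator clocked at the 7 GS/s sample rate; that mapping and the function names are assumptions, not details given in the abstract.

```python
def ddfs_output_frequency(tuning_word, clock_hz=7e9, accumulator_bits=24):
    """Output frequency of a phase-accumulator DDFS: FTW * f_clk / 2^N."""
    return tuning_word * clock_hz / (1 << accumulator_bits)

def tuning_word_for(target_hz, clock_hz=7e9, accumulator_bits=24):
    """Nearest integer frequency tuning word for a desired output frequency."""
    return round(target_hz * (1 << accumulator_bits) / clock_hz)

# Smallest frequency step (tuning word = 1), in Hz.
resolution_hz = ddfs_output_frequency(1)

# Tuning word closest to 1.85 GHz and the frequency it actually produces.
ftw = tuning_word_for(1.85e9)
actual_hz = ddfs_output_frequency(ftw)
print(resolution_hz, ftw, actual_hz)
```

Under these assumptions the frequency step is a few hundred hertz, so any target frequency below Nyquist can be approximated to within half a step.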
Motofumi NAKANISHI Shintaro IZUMI Mio TSUKAHARA Hiroshi KAWAGUCHI Hiromitsu KIMURA Kyoji MARUMOTO Takaaki FUCHIKAMI Yoshikazu FUJIMORI Masahiko YOSHIMOTO
This paper presents an algorithm for physical activity (PA) classification and metabolic equivalents (METs) monitoring, and its System-on-a-Chip (SoC) implementation, to realize both power reduction and high estimation accuracy. Long-term PA monitoring is an effective means of preventing lifestyle-related diseases. Low power consumption and long battery life are key features supporting the wider dissemination of the monitoring system. As described herein, an adaptive sampling method is implemented for longer battery life by minimizing the active rate of the accelerometer without decreasing accuracy. Furthermore, advanced PA classification using both the heart rate and acceleration is introduced. The proposed algorithms are evaluated by experimentation with eight subjects in actual conditions. Evaluation results show that the root mean square error with respect to the result of processing with a fixed sampling rate is less than 0.22 METs, and the mean absolute error is less than 0.06 METs. Furthermore, to minimize the system-level power dissipation, a dedicated SoC is implemented using a 130-nm CMOS process with FeRAM. A non-volatile CPU using non-volatile memory and a flip-flop is used to reduce the stand-by power. The proposed algorithm, which is implemented using dedicated hardware, reduces the active rate of the CPU and accelerometer. The current consumption of the SoC is less than 3 µA. The evaluation system using the test chip achieves a 74% system-level power reduction. The total current consumption, including that of the accelerometer, is 11.3 µA on average.
Jianbin ZHOU Dajiang ZHOU Takeshi YOSHIMURA Satoshi GOTO
The compressed sensing based CMOS image sensor (CS-CIS) is a new generation of CMOS image sensor that significantly reduces power consumption. For CS-CIS, the image quality and the output data volume are two important concerns. In this paper, we first propose an algorithm to generate a series of deterministic ternary matrices, which improve the image quality, reduce the data volume and are compatible with CS-CIS. The proposed matrices are derived from the approximate DCT and trimmed in 2D-zigzag order, thus preserving the energy compaction property of the DCT. Moreover, we propose matrix row operations adapted to the proposed matrix to further compress the data (measurements) without any image quality loss. Finally, a low-cost VLSI architecture for measurement compression with the proposed matrix row operations is implemented. Experimental results show that our proposed matrix significantly improves the coding efficiency, with a BD-PSNR increase of 4.2 dB compared with the random binary matrix used in the state-of-the-art CS-CIS. The proposed matrix row operations for measurement compression further increase the coding efficiency by 0.24 dB BD-PSNR (4.8% BD-rate reduction). The VLSI architecture requires only 4.3 K gates in area and 0.3 mW in power consumption.
Maoshen JIA Jundai SUN Feng DENG Junyue SUN
In this work, a multiple source separation method with joint sparse and non-sparse components recovery is proposed by using dual similarity determination. Specifically, a dual similarity coefficient is designed based on normalized cross-correlation and Jaccard coefficients, and its reasonability is validated via a statistical analysis on a quantitative effective measure. Thereafter, by regarding the sparse components as a guide, the non-sparse components are recovered using the dual similarity coefficient. Eventually, a separated signal is obtained by a synthesis of the sparse and non-sparse components. Experimental results demonstrate the separation quality of the proposed method outperforms some existing BSS methods including sparse components separation based methods, independent components analysis based methods and soft threshold based methods.
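The two ingredients named in the abstract, normalized cross-correlation and the Jaccard coefficient, can be sketched as follows. How the paper combines them into its dual similarity coefficient is not stated in the abstract, so the snippet only computes each measure separately; the function names, the zero-lag correlation without mean removal, and the toy inputs are assumptions.

```python
import math

def normalized_cross_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator if denominator else 0.0

def jaccard(a, b):
    """Jaccard coefficient |A intersect B| / |A union B| of two index sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy example: magnitudes of two components, and the sets of
# time-frequency bins in which each component is active (hypothetical).
ncc = normalized_cross_correlation([1, 2, 3], [2, 4, 6])
overlap = jaccard({1, 2, 3}, {2, 3, 4})
print(ncc, overlap)
```

The first measure compares the shapes of two components, while the second compares where they are active, so the two can disagree and carry complementary information.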
Ayae ICHINOSE Atsuko TAKEFUSA Hidemoto NAKADA Masato OGUCHI
Many life-log analysis applications, which transfer data from cameras and sensors to a Cloud and analyze them in the Cloud, have been developed as the use of various sensors and Cloud computing technologies has spread. However, difficulties arise because of the limited network bandwidth between such sensors and the Cloud. In addition, sending raw sensor data to a Cloud may introduce privacy issues. Therefore, we propose a pipelined method for distributed deep learning processing between sensors and the Cloud to reduce the amount of data sent to the Cloud and protect the privacy of users. In this study, we measured the processing times and evaluated the performance of our method using two different datasets. In addition, we performed experiments using three types of machines with different performance characteristics on the client side and compared the processing times. The experimental results show that the accuracy of deep learning with coarse-grained data is comparable to that achieved with the default parameter settings, and the proposed distributed processing method has performance advantages in cases of insufficient network bandwidth between realistic sensors and a Cloud environment. In addition, it is confirmed that the process that most affects the overall processing time varies depending on the machine performance on the client side, and the most efficient distribution method similarly differs.
Chihiro TSUTAKE Toshiyuki YOSHIDA
Many affine motion compensation techniques proposed thus far employ least-squares-based techniques for estimating affine parameters, which require a hardware structure different from the conventional block-matching-based one. This paper proposes a new affine motion estimation/compensation framework friendly to block-matching-based parameter estimation, and applies it to an HEVC encoder to demonstrate its coding efficiency and computation cost. To avoid nested search loops, a new affine motion model is first introduced by decomposing the conventional 4-parameter affine model into two 3-parameter ones. Then, a block-matching-based fast parameter estimation technique is proposed for the models. The experimental results given in this paper show that our approach is advantageous over conventional techniques.
Xiaomin JIN Yuanan LIU Wenhao FAN Fan WU Bihua TANG
Mobile cloud computing (MCC) has been proposed as a new approach to enhance mobile device performance via computation offloading. The growth in cloud computing energy consumption is placing pressure on both the environment and cloud operators. In this paper, we focus on energy-efficient resource management in MCC and aim to reduce cloud operators' energy consumption through resource management. We establish a deterministic resource management model by solving a combinatorial optimization problem with constraints. To obtain the resource management strategy in deterministic scenarios, we propose a deterministic strategy algorithm based on the adaptive group genetic algorithm (AGGA). Wireless networks are used to connect to the cloud in MCC, which causes uncertainty in resource management in MCC. Based on the deterministic model, we establish a stochastic model that involves a stochastic optimization problem with chance constraints. To solve this problem, we propose a stochastic strategy algorithm based on Monte Carlo simulation and AGGA. Experiments show that our deterministic strategy algorithm obtains approximate optimal solutions with low algorithmic complexity with respect to the problem size, and our stochastic strategy algorithm saves more energy than other algorithms while satisfying the chance constraints.
In this paper, we propose a Mobile Edge Internet of Things (MEIoT) architecture by leveraging the fiber-wireless access technology, the cloudlet concept, and the software defined networking framework. The MEIoT architecture brings computing and storage resources close to Internet of Things (IoT) devices in order to speed up IoT data sharing and analytics. Specifically, the IoT devices belonging to the same user are associated with a specific proxy Virtual Machine (VM) in the nearby cloudlet. The proxy VM stores and analyzes the IoT data generated by its IoT devices in real time. Moreover, we introduce the semantic and social IoT technology in the context of MEIoT to solve the interoperability and inefficient access control problems in the IoT system. In addition, we propose two dynamic proxy VM migration methods to minimize the end-to-end delay between proxy VMs and their IoT devices and to minimize the total on-grid energy consumption of the cloudlets, respectively. The performance of the proposed methods is validated via extensive simulations.
Wenhua SHI Xiongwei ZHANG Xia ZOU Meng SUN Wei HAN Li LI Gang MIN
A monaural speech enhancement method combining a deep neural network (DNN) with low rank analysis and speech presence probability is proposed in this letter. Low rank and sparse analysis is first applied to the noisy speech spectrogram to obtain an approximate low rank representation of the noise. Then a joint feature training strategy for DNN-based speech enhancement is presented, which helps the DNN better predict the target speech. To reduce the residual noise in highly overlapping regions and in the high frequency domain, speech presence probability (SPP) weighted post-processing is employed to further improve the quality of the speech enhanced by the trained DNN model. Compared with supervised non-negative matrix factorization (NMF) and the conventional DNN method, the proposed method obtains improved speech enhancement performance under stationary and non-stationary conditions.
Jiahui LUO Zhijian CHEN Xiaoyan XIANG Jianyi MENG
This work presents a low-complexity lossless electrocardiogram (ECG) compression ASIC for wireless sensors. Three linear predictors targeting different signal characteristics are provided, and the predictor is selected based on a history table that records the optimum predictors for recent samples. Unlike traditional methods that use a unified encoder, the prediction error is encoded by a hybrid Golomb encoder that combines Exp-Golomb and Golomb-Rice coding and adaptively configures the encoding scheme according to the predictor selection. The novel adaptive prediction and encoding scheme contributes to a compression rate of 2.77 for the MIT-BIH Arrhythmia database. Implemented in a 40 nm CMOS process, the design takes a small gate count of 1.82 K with 37.6 nW power consumption under a 0.9 V supply voltage.
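The two entropy codes that the hybrid encoder combines can be sketched as below: a signed prediction error is first mapped to a non-negative integer, then written either as a Golomb-Rice code with parameter k or as an order-0 Exp-Golomb code. The unary convention (ones terminated by a zero), the signed-to-unsigned mapping and the function names are assumptions; the rule the ASIC uses to switch between the two codes is not reproduced.

```python
def map_signed(error):
    """Map a signed prediction error to a non-negative integer:
    0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * error if error >= 0 else -2 * error - 1

def golomb_rice(n, k):
    """Golomb-Rice code: quotient n >> k in unary (ones, then a zero),
    followed by the k low-order bits of n."""
    quotient = n >> k
    remainder = n & ((1 << k) - 1)
    remainder_bits = format(remainder, "b").zfill(k) if k else ""
    return "1" * quotient + "0" + remainder_bits

def exp_golomb(n):
    """Order-0 Exp-Golomb code: binary of n + 1, prefixed by one zero
    for each bit after the leading one."""
    bits = format(n + 1, "b")
    return "0" * (len(bits) - 1) + bits

# Encode a few prediction errors with both codes.
for err in (0, -1, 3, -20):
    n = map_signed(err)
    print(err, n, golomb_rice(n, 2), exp_golomb(n))
```

Golomb-Rice codes are short for small errors but grow linearly with the error magnitude, whereas Exp-Golomb codes grow logarithmically, which is one plausible reason for choosing between them per predictor.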
Yuan GAO Chengdong WU Xiaosheng YU Wei ZHOU Jiahui WU
Efficient optic disc (OD) segmentation plays a significant role in retinal image analysis and retinal disease screening. In this paper, we present a fully automatic segmentation approach called double boundary extraction for OD segmentation. The proposed approach consists of the following two stages. First, we utilize an unsupervised learning technique and a statistical method based on OD boundary information to obtain the initial contour adaptively. Second, the final optic disc boundary is extracted using the proposed LSO model. The performance of the proposed method is tested on the public DIARETDB1 database, and the experimental results demonstrate the effectiveness and advantage of the proposed method.
Heemang SONG Seunghoon CHO Kyung-Jin YOU Hyun-Chool SHIN
In this paper, we propose an automotive radar sensor compensation method that improves direction of arrival (DOA) estimation and prevents target split tracking. Amplitude and phase mismatch and mutual coupling between radar sensor array elements cause inaccuracy in DOA estimation. By quantifying amplitude and phase distortion levels for each angle, we compensate for the sensor distortion. Applying the proposed method to the Bartlett, Capon and multiple signal classification (MUSIC) algorithms, we experimentally demonstrate the performance improvement using both experimental data from the chamber and real data obtained on actual roads.
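A minimal sketch of Bartlett DOA estimation with a simple calibration step is given below. It assumes a uniform linear array with half-wavelength spacing, a single noiseless snapshot, and one complex gain per element that is divided out before beamforming. The paper quantifies distortion per angle, which is more general than this angle-independent correction; all names and the example gains are hypothetical.

```python
import cmath
import math

def steering_vector(theta_deg, num_elements, spacing_wavelengths=0.5):
    """Steering vector of a uniform linear array for angle theta (degrees)."""
    phase = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * n) for n in range(num_elements)]

def bartlett_spectrum(snapshot, angles_deg, gains=None):
    """Bartlett (conventional beamformer) spectrum |a(theta)^H x|^2 / M^2.

    If per-element complex gains are supplied, they are divided out of the
    snapshot first (a simplified stand-in for sensor compensation).
    """
    m = len(snapshot)
    if gains is not None:
        snapshot = [x / g for x, g in zip(snapshot, gains)]
    spectrum = []
    for theta in angles_deg:
        a = steering_vector(theta, m)
        response = sum(ai.conjugate() * xi for ai, xi in zip(a, snapshot))
        spectrum.append(abs(response) ** 2 / m ** 2)
    return spectrum

def estimate_doa(snapshot, angles_deg, gains=None):
    """Angle at which the Bartlett spectrum peaks."""
    spectrum = bartlett_spectrum(snapshot, angles_deg, gains)
    return angles_deg[max(range(len(spectrum)), key=spectrum.__getitem__)]

angles = list(range(-90, 91))
clean = steering_vector(20, 8)  # one target at 20 degrees

# Hypothetical amplitude/phase mismatch for each of the 8 elements.
gains = [(0.6 + 0.1 * n) * cmath.exp(1j * 0.4 * (n % 3)) for n in range(8)]
distorted = [g * x for g, x in zip(gains, clean)]

print(estimate_doa(clean, angles))
print(estimate_doa(distorted, angles, gains))
```

With the known gains removed, the distorted snapshot yields the same peak as the ideal one; without compensation, the mismatch can shift or split the peak, which relates to the tracking problem described above.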