Riku AKEMA Masao YAMAGISHI Isao YAMADA
Approximate Simultaneous Diagonalization (ASD) is the problem of finding a common similarity transformation that approximately diagonalizes a given tuple of square matrices. Many data science problems have been reduced to ASD through ingenious modelling. For ASD, the so-called Jacobi-like methods have been used extensively. However, these methods offer no guarantee of suppressing the magnitude of the off-diagonal entries of the transformed tuple, even if the given tuple has an exact common diagonalizer, i.e., the given tuple is simultaneously diagonalizable. In this paper, to establish a powerful alternative strategy for ASD, we present a novel two-step strategy called the Approximate-Then-Diagonalize-Simultaneously (ATDS) algorithm. The ATDS algorithm decomposes ASD into (Step 1) finding a simultaneously diagonalizable tuple near the given one; and (Step 2) finding a common similarity transformation that exactly diagonalizes the tuple obtained in Step 1. The proposed approach to Step 1 is realized by solving a Structured Low-Rank Approximation (SLRA) problem with Cadzow's algorithm. In Step 2, by exploiting the idea in the constructive proof of the conditions for exact simultaneous diagonalizability, we obtain an exact common diagonalizer of the tuple obtained in Step 1 as a solution to the original ASD. Unlike the Jacobi-like methods, the ATDS algorithm is guaranteed to find an exact common diagonalizer if the given tuple happens to be simultaneously diagonalizable. Numerical experiments show that the ATDS algorithm achieves better performance than the Jacobi-like methods.
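As a rough illustration of what an exact common diagonalizer is, the sketch below builds a tuple that is simultaneously diagonalizable by construction and recovers a diagonalizer from the eigenvectors of a random linear combination of the matrices. This is a textbook shortcut that assumes the combination has distinct eigenvalues; it is not the ATDS algorithm or its Step 2, and all function names are hypothetical.

```python
import numpy as np

def off_diagonal_norm(M):
    """Frobenius norm of the off-diagonal part of M."""
    return np.linalg.norm(M - np.diag(np.diag(M)))

def common_diagonalizer(mats, seed=0):
    """Eigenvectors of a random linear combination of the tuple.

    Assumption: the tuple is exactly simultaneously diagonalizable and the
    combination has distinct eigenvalues, so its eigenvectors diagonalize
    every matrix in the tuple.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal(len(mats))
    combo = sum(w * A for w, A in zip(weights, mats))
    _, S = np.linalg.eig(combo)
    return S

def make_diagonalizable_tuple(n=4, k=3, seed=1):
    """Build A_k = S D_k S^{-1} with a shared, well-conditioned S."""
    rng = np.random.default_rng(seed)
    S = np.eye(n) + 0.3 * rng.standard_normal((n, n))
    S_inv = np.linalg.inv(S)
    return [S @ np.diag(rng.standard_normal(n)) @ S_inv for _ in range(k)]

mats = make_diagonalizable_tuple()
S_hat = common_diagonalizer(mats)
S_hat_inv = np.linalg.inv(S_hat)
residuals = [off_diagonal_norm(S_hat_inv @ A @ S_hat) for A in mats]
print(max(residuals))
```

The printed value is the largest off-diagonal residual over the tuple. It should be near machine precision here because the input is exactly simultaneously diagonalizable; for a perturbed tuple this shortcut carries no such guarantee.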
Shucong TIAN Meng YANG Jianpeng WANG
Z-complementary pairs (ZCPs) were proposed by Fan et al. to make up for the scarcity of Golay complementary pairs (GCPs). A ZCP of odd length N is called Z-optimal if its zero correlation zone width achieves the maximum value (N + 1)/2. In this letter, by inserting three elements into a GCP of length L, or by deleting one element of a GCP of length L, we propose two constructions of Z-optimal ZCPs with lengths L + 3 and L - 1, where $L = 2^{\alpha} 10^{\beta} 26^{\gamma}$ and α ≥ 1, β ≥ 0, γ ≥ 0 are integers. The proposed constructions generate ZCPs with new lengths that cannot be produced by earlier constructions.
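The zero correlation zone width can be checked numerically. The sketch below assumes the usual definition: a pair (a, b) has zone width Z when the sum of the two aperiodic autocorrelations vanishes for every shift from 1 to Z - 1. The function names and the two example pairs are hand-picked for illustration and are not produced by the letter's constructions.

```python
def aperiodic_autocorrelation(seq, shift):
    """Aperiodic autocorrelation of a real sequence at a non-negative shift."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))

def zcz_width(a, b):
    """Largest Z such that the autocorrelation sums vanish for shifts 1..Z-1."""
    n = len(a)
    for shift in range(1, n):
        total = aperiodic_autocorrelation(a, shift) + aperiodic_autocorrelation(b, shift)
        if total != 0:
            return shift
    return n  # all out-of-phase sums vanish: the pair is a GCP

# A binary Golay complementary pair of length 4: the zone covers the whole length.
gcp_a, gcp_b = [1, 1, 1, -1], [1, 1, -1, 1]
print(zcz_width(gcp_a, gcp_b))  # 4

# A hand-picked odd-length pair (N = 3) whose width meets the bound (N + 1)/2 = 2.
zcp_a, zcp_b = [1, 1, 1], [1, -1, 1]
print(zcz_width(zcp_a, zcp_b))  # 2
```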
Jingjing SI Wenwen SUN Chuang LI Yinbo CHENG
Deep learning is playing an increasingly important role in the signal processing field due to its excellent performance on many inference problems. Parametric bilinear generalized approximate message passing (P-BiG-AMP) is a new approximate message passing based approach to a general class of structure-matrix bilinear estimation problems. In this letter, we propose a novel feed-forward neural network architecture that realizes the P-BiG-AMP methodology with deep learning for the inference problem of compressive sensing under matrix uncertainty. The linear transforms used in the recovery process and the parameters of the input and output measurement channels are jointly learned from training data. Simulation results show that the trained P-BiG-AMP network achieves higher reconstruction performance than the P-BiG-AMP algorithm with parameters tuned via the expectation-maximization method.
In this letter, a low-latency, high-throughput, and hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to a real-valued model and then applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC), which performs the QRD directly in the complex domain and then converts the complex result to its real counterpart. The proposed scheme greatly improves the processing parallelism and shortens the nullification and sorting procedures. We also design the corresponding pipelined hardware architecture of the MMSE-SQRD for 4×4 MIMO detectors, based on a highly parallel Givens rotation structure with the CORDIC algorithm. The proposed MMSE-SQRD is implemented in SMIC 55nm CMOS technology, achieving a throughput of up to 50M QRD/s and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared with previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.
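To make the idea of a QRD carried out directly in the complex domain concrete, here is a minimal floating-point sketch using complex Givens rotations on the MMSE-extended channel matrix. It assumes the standard extension that stacks the channel matrix on a scaled identity. It omits the sorting step, the CORDIC arithmetic, and the pipelining of the letter's design, and the function names are hypothetical.

```python
import numpy as np

def givens_qr_complex(A):
    """QR decomposition via complex Givens rotations (no real-valued expansion)."""
    R = np.array(A, dtype=complex)
    m, n = R.shape
    Qh = np.eye(m, dtype=complex)  # accumulates Q^H so that Qh @ A == R
    for j in range(min(m - 1, n)):
        for i in range(j + 1, m):
            a, b = R[j, j], R[i, j]
            r = np.hypot(abs(a), abs(b))
            if r == 0.0:
                continue  # nothing to nullify
            c, s = a / r, b / r
            # Unitary 2x2 rotation that maps (a, b) to (r, 0).
            G = np.array([[np.conj(c), np.conj(s)], [-s, c]])
            R[[j, i], :] = G @ R[[j, i], :]
            Qh[[j, i], :] = G @ Qh[[j, i], :]
            R[i, j] = 0.0  # remove rounding residue in the nullified entry
    return Qh.conj().T, R

def mmse_extended(H, sigma):
    """Stack H on sigma * I (assumed MMSE extension of the channel matrix)."""
    n_tx = H.shape[1]
    return np.vstack([H, sigma * np.eye(n_tx)])

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H_ext = mmse_extended(H, sigma=0.1)
Q, R = givens_qr_complex(H_ext)
print(np.allclose(Q @ R, H_ext))
```

Each rotation touches only two rows, which is why rotations acting on disjoint row pairs can run in parallel in hardware.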
Masayuki ODAGAWA Takumi OKAMOTO Tetsushi KOIDE Toru TAMAKI Bisser RAYTCHEV Kazufumi KANEDA Shigeto YOSHIDA Hiroshi MIENO Shinji TANAKA Takayuki SUGAWARA Hiroshi TOISHI Masayuki TSUJI Nobuo TAMBA
In this paper, we present a hardware implementation of a colorectal cancer diagnosis support system for colorectal endoscopic video images on a customizable embedded DSP. In endoscopic video images, color shift, blurring, or light reflection can occur in a lesion area, which affects the computer's discrimination result. Therefore, to identify lesions robustly and classify them stably despite these video-specific artifacts, we implement a computer-aided diagnosis (CAD) system for colorectal endoscopic images with Narrow Band Imaging (NBI) magnification, using Convolutional Neural Network (CNN) features and Support Vector Machine (SVM) classification. Since the CNN and SVM need to perform many multiplication and accumulation (MAC) operations, we implement the proposed hardware system on a customizable embedded DSP, which provides high-speed MAC operations and parallel processing with a Very Long Instruction Word (VLIW) architecture. Before porting the system to the customizable embedded DSP, we profile and analyze the processing cycles of the CAD system and optimize the bottlenecks. We show the effectiveness of the real-time diagnosis support system on the embedded system for endoscopic video images. The prototype system demonstrated real-time processing at video frame rate (over 30 fps at 200 MHz) and more than 90% accuracy.
Di YAO Aijun LIU Hongzhi LI Changjun YU
In the user-congested high-frequency band, radio frequency interference (RFI) is a dominant factor that degrades the detection performance of high-frequency surface wave radar (HFSWR). Various RFI suppression algorithms have been proposed so far, but they are usually inapplicable to compact HFSWR because of its minimal array aperture. Therefore, this letter proposes a novel RFI mitigation scheme for compact HFSWR that works even with a single antenna. The scheme uses robust principal component analysis to separate the RFI from the target, based on the time-frequency distribution characteristics of the RFI. Measured data demonstrate that the scheme effectively suppresses RFI without losing the target signal.
Hiroaki YAMANAKA Yuuichi TERANISHI Eiji KAWAI
Edge computing offers computing capability with ultra-low response times by leveraging servers close to end-user devices. Due to the mobility of end-user devices, the latency between the servers and the end-user devices can become long, and the response time might become unacceptable for an application service. Service (container) migration that follows the handover of end-user devices maintains the response time. However, service migration that follows the mass movement of people in the same geographic area at the same time due to an event (e.g., commuting) generates heavy bandwidth usage in the mobile backhaul network. Heavy usage by service migration reduces the bandwidth available for ordinary application traffic in the network. Shaping the migration traffic limits the bandwidth usage, but it delays service migration and increases the response time of the container for the moving end-user device. Furthermore, the number of targets of migration decisions (i.e., the system load) increases because delaying a migration process accumulates containers waiting for migration. In this paper, we propose a migration scheduling method that controls the bandwidth used for migration in a network and ensures timely processing of service migration. Simulations that compare the proposal with state-of-the-art methods show that the proposal always keeps the bandwidth usage under the predetermined threshold. The method reduced the number of containers exceeding the acceptable response time up to 40% of the compared state-of-the-art methods. Furthermore, the proposed method minimized the targets of migration decisions.
Lei SONG Xue-Cheng SUN Zhe-Ming LU
In this Letter, we propose a blind and robust multiple watermarking scheme using the Contourlet transform and singular value decomposition (SVD). The host image is first decomposed by the Contourlet transform. Singular values of Contourlet coefficient blocks are used to embed the watermark information, and a fast calculation method is proposed to avoid the heavy computation of SVD. The watermark is embedded in both low- and high-frequency Contourlet coefficients to increase the robustness against various attacks. Moreover, the proposed scheme intrinsically exploits the characteristics of the human visual system and thus can ensure the invisibility of the watermark. Simulation results show that the proposed scheme outperforms other related methods in terms of both robustness and execution time.
Ryousei TAKANO Kuniyasu SUZAKI
A conventional data center that consists of monolithic servers faces limitations including a lack of operational flexibility, low resource utilization, and low maintainability. Resource disaggregation is a promising solution to these issues. We propose a disaggregated cloud data center architecture concept called Flow-in-Cloud (FiC), which enables an existing cluster computer system to expand an accelerator pool through a high-speed network. FlowOS-RM manages the entire pool of resources and deploys a user job on a slice that is constructed dynamically according to the user's request. This slice consists of compute nodes and accelerators, where each accelerator is attached to the corresponding compute node. This paper demonstrates the feasibility of FiC in a proof-of-concept experiment that runs a distributed deep learning application on the prototype system. The results confirm the applicability of the proposed system.
Robert Chen-Hao CHANG Wei-Chih CHEN Shao-Che SU
A switching-based Li-ion battery charger without any additional compensation circuit is proposed. The proposed charger adopts a dual-current sensor and a current window control to ensure system stability in different charge modes: trickle current, constant current, and constant voltage. The proposed Li-ion battery charger has less chip area and a simpler structure to design than a conventional Li-ion battery charger with pulse width modulation. Simulation with a 1000µF capacitor as the battery equivalent, a 5V input, and a 1A charge current resulted in a charging time of 1.47ms and a 91% power efficiency.
Toshihiko NISHIMURA Yasutaka OGAWA Takeo OHGANE Junichiro HAGIWARA
Sparse modeling is one of the most active research areas in engineering and science. The technique provides solutions from far fewer samples by exploiting sparsity, that is, the property that the majority of the data are zero. This paper reviews sparse modeling in radio techniques. The first half of this paper introduces direction-of-arrival (DOA) estimation from signals received by multiple antennas. The estimation is carried out using compressed sensing, an effective tool for sparse modeling, which produces solutions to an underdetermined linear system with a sparse regularization term. The DOA estimation performance is compared among three compressed sensing algorithms. The second half reviews channel state information (CSI) acquisition in multiple-input multiple-output (MIMO) systems. In time-varying environments, CSI estimated with pilot symbols may be outdated at the actual transmission time. We describe CSI prediction based on sparse DOA estimation and show excellent precoding performance when using the CSI prediction. The other topic in the second half is sparse Bayesian learning (SBL)-based channel estimation. A base station (BS) has many antennas in a massive MIMO system. A major obstacle to using a massive MIMO system in frequency-division duplex mode is the overhead of downlink CSI acquisition, because many pilot symbols must be sent from the BS and feedback must be obtained from the user equipment. An SBL-based channel estimation method can mitigate this issue. In this paper, we outline the method and show that the technique can reduce the number of downlink pilot symbols.
Tatsuya SUGIYAMA Keigo TAKEUCHI
Sparse orthogonal matrices are proposed to improve the convergence property of expectation propagation (EP) for sparse signal recovery from compressed linear measurements subject to known dense and ill-conditioned multiplicative noise. As a typical problem, this letter addresses generalized spatial modulation (GSM) in over-loaded and spatially correlated multiple-input multiple-output (MIMO) systems. The proposed sparse orthogonal matrices are used in precoding and constructed efficiently via a generalization of the fast Walsh-Hadamard transform. Numerical simulations show that the proposed sparse orthogonal precoding improves the convergence property of EP in over-loaded GSM MIMO systems with known spatially correlated channel matrices.
Krittin INTHARAWIJITR Katsuyoshi IIDA Hiroyuki KOGA Katsunori YAMAOKA
The Internet of Things (IoT), with its support for cyber-physical systems (CPS), will provide many latency-sensitive services that require very fast responses from network services. Mobile edge computing (MEC), one of the distributed computing models, is a promising component of a low-latency network architecture. In network architectures with MEC, mobile devices offload heavy computing tasks to edge servers. A number of studies have addressed low-latency network architectures with MEC. However, none of the existing studies simultaneously satisfies the following: (1) guaranteeing the latency of computing tasks and (2) implementing a real system. In this paper, we design and implement an MEC-based network architecture that guarantees the latency of offloaded tasks. More specifically, we first estimate the total latency, including the computing and communication latencies, at a centralized node called the orchestrator. If the estimated value exceeds the latency requirement, the task is rejected. We then evaluate the performance of the architecture in terms of the blocking probability of the tasks. To analyze the results, we compare the performance obtained from experiments with that obtained from simulations. Based on the comparison, we clarify that the accuracy of the computing latency estimation is a significant factor for this system.
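The admission rule described above (estimate the total latency, reject the task if the estimate exceeds the requirement) can be sketched as follows. The latency model, the field names, and the numeric values are assumptions made purely for illustration; the abstract does not specify how the orchestrator computes its estimate.

```python
from dataclasses import dataclass

@dataclass
class Task:
    cycles: float      # CPU cycles the task needs (hypothetical field)
    data_bits: float   # bits to transfer to the edge server (hypothetical field)
    deadline_s: float  # latency requirement in seconds

def estimate_latency(task, cpu_hz, link_bps, rtt_s):
    """Assumed model: computing latency plus communication latency."""
    computing = task.cycles / cpu_hz
    communication = task.data_bits / link_bps + rtt_s
    return computing + communication

def admit(task, cpu_hz, link_bps, rtt_s):
    """Accept the task only if the estimate meets its latency requirement."""
    return estimate_latency(task, cpu_hz, link_bps, rtt_s) <= task.deadline_s

def blocking_probability(tasks, cpu_hz, link_bps, rtt_s):
    """Fraction of offered tasks that the orchestrator rejects."""
    rejected = sum(not admit(t, cpu_hz, link_bps, rtt_s) for t in tasks)
    return rejected / len(tasks)

tasks = [
    Task(cycles=2e8, data_bits=1e6, deadline_s=0.5),  # 0.31 s estimated, accepted
    Task(cycles=8e8, data_bits=1e6, deadline_s=0.5),  # 0.91 s estimated, rejected
    Task(cycles=1e8, data_bits=4e6, deadline_s=0.6),  # 0.51 s estimated, accepted
    Task(cycles=1e8, data_bits=4e6, deadline_s=0.3),  # 0.51 s estimated, rejected
]
p_block = blocking_probability(tasks, cpu_hz=1e9, link_bps=1e7, rtt_s=0.01)
print(p_block)  # 0.5
```

In such a scheme, an estimate that is too optimistic admits tasks that later miss their deadlines, while a pessimistic one raises the blocking probability, which is consistent with the abstract's remark that estimation accuracy matters.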
Hiroshi FUJIWARA Yuta WANIKAWA Hiroaki YAMAMOTO
The performance of online algorithms for the bin packing problem is usually measured by the asymptotic approximation ratio. However, even if an online algorithm is explicitly described, it is in general difficult to obtain the exact value of the asymptotic approximation ratio. In this paper we show a theorem that gives the exact value of the asymptotic approximation ratio in a closed form when the item sizes and the online algorithm satisfy some conditions. Moreover, we demonstrate that our theorem serves as a powerful tool for the design of online algorithms combined with mathematical optimization.
Shinobu KUDO Shota ORIHASHI Ryuichi TANIDA Seishi TAKAMURA Hideaki KIMATA
Recently, image compression systems based on convolutional neural networks that use flexible nonlinear analysis and synthesis transformations have been developed to improve the restoration accuracy of decoded images. Although methods that optimize objective metrics such as peak signal-to-noise ratio and multi-scale structural similarity attain high objective results, such metrics may not reflect human visual characteristics and may thus degrade subjective image quality. A method using a framework called a generative adversarial network (GAN) has been reported as one way to improve subjective image quality. It optimizes the distribution of restored images to be close to that of natural images, and thus suppresses visual artifacts such as blurring, ringing, and blocking. However, since methods of this type are optimized to focus on whether the restored image is subjectively natural, components that are not correlated with the original image are mixed into the restored image during decoding. Thus, even though the result looks natural, subjective similarity may be degraded. In this paper, we investigate why conventional GAN-based compression techniques degrade subjective similarity, and then tackle this problem by rethinking how to handle image generation in the GAN framework between image sources with different probability distributions. The paper describes a method that maximizes the mutual information between the coding features and the restored images. Experimental results show that the proposed mutual information amount is clearly correlated with subjective similarity and that the method makes it possible to develop image compression systems with high subjective similarity.
AI (artificial intelligence) has grown at an overwhelming speed for the last decade, to the extent that it has become one of the mainstream tools that drive the advancements in science and technology. Meanwhile, the paradigm of edge computing has emerged as one of the foremost areas in which applications using the AI technology are being most actively researched, due to its potential benefits and impact on today's widespread networked computing environments. In this paper, we evaluate two major entry-level offerings in the state-of-the-art edge device technology, which highlight increased computing power and specialized hardware support for AI applications. We perform a set of deep learning benchmarks on the devices to measure their performance. By comparing the performance with other GPU (graphics processing unit) accelerated systems in different platforms, we assess the computational capability of the modern edge devices featuring a significant amount of hardware parallelism.
The circuit satisfiability problem has been intensively studied since Ryan Williams showed a connection between the problem and lower bounds for circuit complexity. In this letter, we present a #SAT algorithm for synchronous Boolean circuits of n inputs and s gates that runs in time $2^{n\left(1 - \frac{1}{2^{O(s/n)}}\right)}$ if $s = o(n \log n)$.
Vu-Tran-Minh KHUONG Khanh-Minh PHAN Huy-Quang UNG Cuong-Tuan NGUYEN Masaki NAKAGAWA
Many approaches enable teachers to digitize students' answers and mark them on a computer. However, they remain limited in supporting the marking of descriptive mathematical answers, which can best evaluate learners' understanding. This paper presents clustering of offline handwritten mathematical expressions (HMEs) to help teachers efficiently mark answers in the form of HMEs. In this work, we investigate a method of combining feature types from low-level directional features and multiple levels of recognition: bag-of-symbols, bag-of-relations, and bag-of-positions. Moreover, we propose a marking cost function to measure the marking effort. To show the effectiveness of our method, we used two datasets and another sampled from CROHME 2016 with synthesized patterns to prepare correct answers and incorrect answers for each question. In experiments, we employed the k-means++ algorithm for each level of features and considered their combination to produce better performance. The experiments show that, with the number of answer clusters set appropriately, the best combination of all the feature types can reduce the marking cost to about 0.6 of that of manual one-by-one marking.
Bing LIU Zhengchun ZHOU Udaya PARAMPALLI
Inspired by an idea due to Levenshtein, we apply the low correlation zone constraint in the analysis of the weighted mean square aperiodic correlation. We then derive a lower bound on this measure for quasi-complementary sequence sets with low correlation zone (LCZ-QCSS) and discuss the conditions under which the proposed bound is tight. It turns out that the proposed bound is tighter than the Liu-Guan-Ng-Chen bound for LCZ-QCSS. We also derive a lower bound for QCSS, which improves the Liu-Guan-Mow bound in general.
Zhenhui XU Tielong SHEN Daizhan CHENG
This paper studies the infinite-time-horizon optimal control problem for continuous-time nonlinear systems. A completely model-free approximate optimal control design method is proposed, which uses only real-time measured data from trajectories instead of a dynamical model of the system. This approach is based on the actor-critic structure, where the weights of the critic neural network and the actor neural network are updated sequentially by the method of weighted residuals. It should be noted that an external input is introduced to replace the input-to-state dynamics to improve the control policy. Moreover, a rigorous proof of convergence to the optimal solution, along with the stability of the closed-loop system, is given. Finally, a numerical example is given to show the efficiency of the method.