Taito MANABE Kazuya UETSUHARA Akane TAHARA Yuichiro SHIBATA
This paper presents the design and implementation of an image-based vibration detection system on a field-programmable gate array (FPGA), aimed at tremor suppression in microsurgery assistance systems. The system extracts a vibration component within a user-specified frequency band from moving images in real time. For fast and robust detection, we employ a statistical approach that uses dense optical flow to derive the vibration component, and we design custom hardware based on the Lucas-Kanade (LK) method to compute the optical flow. For band-pass filtering without phase delay, we implement the band-limited multiple Fourier linear combiner (BMFLC), an adaptive band-pass filter that recomposes an input signal as a mixture of sinusoids at multiple frequencies within the specified band. The whole system is implemented as a deep pipeline on a Xilinx Kintex-7 XC7K325T FPGA without using any external memory. We employ fixed-point arithmetic to reduce resource utilization while maintaining accuracy close to that of double-precision floating-point arithmetic. Experiments show that the proposed system extracts a high-frequency tremor component from hand motions while successfully filtering out intentional low-frequency motions. The system can process VGA moving images at 60 fps with a BMFLC delay of less than 1 µs, suggesting the effectiveness of the deep pipelined architecture. In addition, we plan to integrate a CNN-based segmentation system to improve detection accuracy, and we show preliminary software evaluation results.
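As a rough software illustration of the band-pass idea described above, the following sketch adapts the weights of a bank of sine and cosine references with a plain LMS update, which is the form in which the BMFLC is commonly stated. The function name, the 0.5 Hz frequency spacing, the step size `mu`, and the floating-point arithmetic are all assumptions made for illustration; the paper's fixed-point hardware pipeline may differ.

```python
import math

def bmflc_filter(signal, fs, f_lo, f_hi, df=0.5, mu=0.01):
    """Estimate the part of `signal` lying in [f_lo, f_hi] Hz.

    Hypothetical helper: a floating-point BMFLC with an LMS weight update.
    fs is the sampling rate in Hz, df the spacing of the reference
    frequencies, and mu the adaptation step size (all assumed values).
    """
    n_freqs = int(round((f_hi - f_lo) / df)) + 1
    omegas = [2.0 * math.pi * (f_lo + i * df) for i in range(n_freqs)]
    weights = [0.0] * (2 * n_freqs)
    output = []
    for k, s in enumerate(signal):
        t = k / fs
        # Reference vector: one sine and one cosine per in-band frequency.
        x = [math.sin(w * t) for w in omegas] + [math.cos(w * t) for w in omegas]
        # Band-limited estimate is the weighted sum of the references.
        y = sum(wi * xi for wi, xi in zip(weights, x))
        err = s - y
        # LMS update driven by the instantaneous estimation error.
        weights = [wi + 2.0 * mu * xi * err for wi, xi in zip(weights, x)]
        output.append(y)
    return output
```

Because the estimate is rebuilt from references that are generated at the current time instant, an in-band sinusoid is tracked in phase once the weights have converged, which is the property the abstract refers to as filtering with no phase delay.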
Kouki SEO Chihiro GO Yuma KINOSHITA Hitoshi KIYA
We propose a novel hue-correction scheme for multi-exposure image fusion (MEF). Various MEF methods have been studied so far to generate higher-quality images. However, unlike in other fields of image processing, few MEF methods consider hue distortion, because no reference image with correct hue is available. In the proposed scheme, we generate an HDR image from the input multi-exposure images as a reference for hue correction. Hue distortion in images fused by an MEF method is then removed by using the hue information of the HDR image, on the basis of the constant-hue plane in the RGB color space. In simulations, the proposed scheme is demonstrated to be effective in correcting hue distortion caused by conventional MEF methods. Experimental results also show that the proposed scheme can generate high-quality images regardless of the exposure conditions of the input multi-exposure images.
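One way to picture hue correction on a constant-hue plane is the per-pixel sketch below. It assumes the usual decomposition of an RGB color into white, black, and a maximally saturated color of the same hue, and it swaps in the maximally saturated color of the reference pixel while keeping the white and chroma amounts of the fused pixel. The function names and this exact formulation are assumptions for illustration, not necessarily the procedure used in the paper.

```python
def max_saturated(rgb):
    """Maximally saturated color with the same hue as `rgb` (None if gray)."""
    lo, hi = min(rgb), max(rgb)
    if hi == lo:
        return None  # achromatic pixel: hue is undefined
    return tuple((v - lo) / (hi - lo) for v in rgb)

def correct_hue(fused, reference):
    """Give the `fused` pixel the hue of `reference` (RGB values in [0, 1]).

    Hypothetical helper: keeps the fused pixel's minimum (white amount)
    and max - min (chroma amount), and takes the hue from the reference.
    """
    ref_color = max_saturated(reference)
    if ref_color is None:
        return tuple(fused)  # no hue to copy, leave the pixel unchanged
    white = min(fused)
    chroma = max(fused) - min(fused)
    return tuple(white + chroma * v for v in ref_color)
```

With this formulation the maximum and minimum channel values of the fused pixel are preserved, so only the hue changes.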
Ayana KAWAMURA Yuma KINOSHITA Takayuki NAKACHI Sayaka SHIOTA Hitoshi KIYA
We propose a privacy-preserving machine learning scheme with encryption-then-compression (EtC) images, where EtC images are images encrypted by using a block-based encryption method proposed for EtC systems with JPEG compression. In this paper, a novel property of EtC images is first discussed, although EtC images have already been shown to be compressible. The novel property allows us to directly apply EtC images to machine learning algorithms that are not specialized for computing on encrypted data. In addition, the proposed scheme is demonstrated to cause no degradation in the performance of some typical machine learning algorithms, including the support vector machine with the kernel trick and random forests, under the use of z-score normalization. A number of facial recognition experiments are carried out to confirm the effectiveness of the proposed scheme.
Masakazu IWAI Takuya FUTAGAMI Noboru HAYASAKA Takao ONOYE
In this paper, we improve upon the automatic building extraction method, which uses a variational inference Gaussian mixture model for color clustering, by accelerating its computation. The improved method decreases the computational time by applying color clustering to an image with reduced resolution. According to our experiment, in which we used 106 scenery images, the improved method could extract buildings at a rate 86.54% faster than that of the conventional methods. Furthermore, the improved method significantly increased the extraction accuracy by 1.8% or more, because the reduced image, which also contained fewer colors, prevented over-clustering.
Mengmeng LI Xiaoguang REN Yanzhen WANG Wei QIN Yi LIU
Feature selection is important for learning algorithms, and it is still an open problem. The antlion optimizer is an excellent nature-inspired method, but it does not work well for feature selection. This paper proposes a hybrid approach called the Ant-Antlion Optimizer, which combines the smart behavior of antlions in the antlion optimizer with the powerful searching movement of ants in ant colony optimization. A mutation operator is also adopted to strengthen the exploration ability. Comprehensive experiments on binary classification problems show that the proposed algorithm is superior to other state-of-the-art methods on four performance indicators.
Masayuki SHIMODA Youki SADA Ryosuke KURAMOCHI Shimpei SATO Hiroki NAKAHARA
In the realization of convolutional neural networks (CNNs) in resource-constrained embedded hardware, the memory footprint of weights is one of the primary problems. Pruning techniques are often used to reduce the number of weights. However, the distribution of nonzero weights is highly skewed, which makes it more difficult to utilize the underlying parallelism. To address this problem, we present SENTEI*, filter-wise pruning with distillation, to realize hardware-aware network architecture with comparable accuracy. The filter-wise pruning eliminates weights such that each filter has the same number of nonzero weights, and retraining with distillation retains the accuracy. Further, we develop a zero-weight skipping inter-layer pipelined accelerator on an FPGA. The equalization enables inter-filter parallelism, where a processing block for a layer executes filters concurrently with straightforward architecture. Our evaluation of semantic-segmentation tasks indicates that the resulting mIoU only decreased by 0.4 points. Additionally, the speedup and power efficiency of our FPGA implementation were 33.2× and 87.9× higher than those of the mobile GPU. Therefore, our technique realizes hardware-aware network with comparable accuracy.
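The equalization step described above can be illustrated with a small sketch that keeps the same number of weights in every filter. Selecting the weights by largest magnitude is an assumption made here for illustration; the abstract only states that each filter ends up with the same number of nonzero weights. The function name and the flat-list representation of a filter are likewise hypothetical.

```python
def prune_filterwise(filters, keep):
    """Zero all but the `keep` largest-magnitude weights of every filter.

    `filters` is a list of filters, each given as a flat list of weights.
    Assumed criterion: weight magnitude (not confirmed by the source).
    """
    pruned = []
    for weights in filters:
        # Indices of the `keep` weights with the largest absolute value.
        order = sorted(range(len(weights)),
                       key=lambda i: abs(weights[i]), reverse=True)
        kept = set(order[:keep])
        pruned.append([w if i in kept else 0.0 for i, w in enumerate(weights)])
    return pruned
```

Since every filter then carries the same amount of work, filters can be processed side by side without load imbalance, which is the property the accelerator exploits for inter-filter parallelism.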
Nayeon KIM Woongsoo NA Byungjun BAE
This article proposes a dynamic linkage service, a specific service model of integrated broadcast-broadband services based on ATSC 3.0. The dynamic linkage service is useful to viewers who want to continue watching a program on a TV or on their personal devices, even after the terrestrial broadcast ends due to the start of the next regular programming. In addition, we verify the feasibility of the proposed extended dynamic linkage service through an emulation system developed on the basis of ATSC 3.0. In consideration of the personal network capabilities of the viewer environment, the service was tested with 4K/2K Ultra HD content, and reception of the service was completed within 4 seconds over an intranet.
Hiroyuki OKUDA Nobuto SUGIE Tatsuya SUZUKI Kentaro HARAGUCHI Zibo KANG
Path planning and motion control are fundamental components for realizing safe and reliable autonomous driving. The division of roles between these two components, however, is somewhat obscure because of the strong mathematical interaction between them. This often results in redundant computation in the implementation. One attractive idea for overcoming this redundancy is simultaneous path planning and motion control (SPPMC) based on a model predictive control framework. SPPMC finds the optimal control input by considering not only the vehicle dynamics but also various constraints, such as physical limitations and safety constraints, to achieve the goal of a given behavior. When driving in a real traffic environment, decision making also interacts strongly with planning and control. This becomes even more pronounced when several tasks are switched depending on the context to realize higher-level tasks. This paper presents a basic idea for integrating decision making, path planning, and motion control that can be executed in real time. In particular, lane-changing behavior, together with the decision of its initiation, is selected as the target task. The proposed idea is based on nonlinear model predictive control with appropriate switching of its cost function and constraints. As a result, the decision of the initiation, the planning, and the control of the lane-changing behavior are achieved by solving a single optimization problem under several constraints such as safety. The validity of the proposed method is tested by using a vehicle simulator.
In this paper, we propose a method that enables us to control the variance of the coefficients of LMS-type adaptive filters. In the method, each coefficient of the adaptive filter is modeled as a random variable with a Gaussian distribution, and its value is estimated as the mean of the distribution. In addition, at each time step, we check whether the updated value lies within a predefined range of the distribution. The update of a coefficient is canceled when its updated value falls outside the range. We propose an implementation whose formulas are similar to those of the Gaussian mixture model (GMM) widely used in signal processing and machine learning. The effectiveness of the proposed method is evaluated by computer simulations.
Motion deblurring for noisy and blurry images is an arduous and fundamental problem in the image processing community. The problem is ill-posed because many different pairs of latent image and blur kernel can render the same blurred image, and thus the optimization of this problem is still unsolved. To tackle it, we present an effective motion deblurring method for noisy and blurry images based on prominent structure and a data-driven heavy-tailed prior of the enhanced gradient. Specifically, we first employ denoising as a preprocessing step to remove the noise in the input image, and then restore strong edges for accurate kernel estimation. Priors based on the extreme channels of the image (the dark channel prior and the bright channel prior) are exploited as sparse complementary knowledge to extract the prominent structure. The extracted structure can be brought close to the structure of the clear image by tuning the parameters of the extraction function. Next, an integration term combining the enhanced interim image gradient and the heavy-tailed prior of clear images is proposed and embedded into the image restoration model, which favors sharp images over blurry ones. A large number of experiments on both synthetic and real-life images verify the superiority of the proposed method over state-of-the-art algorithms, both qualitatively and quantitatively.
Yukihiro BANDOH Seishi TAKAMURA Hideaki KIMATA
Designing an optimum quantizer can be treated as the optimization problem of finding the quantization indices that minimize the quantization error. One solution to this optimization problem, DP quantization, is based on dynamic programming. Some applications, such as bit-depth scalable codecs and tone mapping, require the construction of multiple quantizers with different quantization levels, for example, from 12 bits/channel to 10 bits/channel and 8 bits/channel. Unfortunately, the above-mentioned DP quantization optimizes the quantizer for just one quantization level. That is, it is unable to simultaneously optimize multiple quantizers. Therefore, when DP quantization is used to design multiple quantizers, there are many redundant computations in the optimization process. This paper proposes an extended DP quantization with a complexity reduction algorithm for the optimal design of multiple quantizers. Experiments show that the proposed algorithm reduces complexity by 20.8%, on average, compared to conventional DP quantization.
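For orientation, a single-quantizer design by dynamic programming can be sketched as follows. The sketch covers only the conventional one-level case, not the multi-quantizer extension proposed in the paper. It assumes a squared-error criterion over a histogram of sorted values, and the function name and data layout are hypothetical.

```python
def dp_quantize(values, counts, levels):
    """Split sorted `values` (with histogram `counts`) into `levels` bins.

    Returns (total squared error, list of (start, end) index ranges),
    where each bin is represented by its weighted mean. Assumes a
    squared-error criterion and levels <= len(values).
    """
    n = len(values)
    # Prefix sums of count, count*value and count*value^2 give O(1) bin cost.
    c0, c1, c2 = [0.0], [0.0], [0.0]
    for v, c in zip(values, counts):
        c0.append(c0[-1] + c)
        c1.append(c1[-1] + c * v)
        c2.append(c2[-1] + c * v * v)

    def cost(i, j):
        """Squared error of values[i:j] around their weighted mean."""
        w = c0[j] - c0[i]
        if w == 0:
            return 0.0
        s = c1[j] - c1[i]
        return (c2[j] - c2[i]) - s * s / w

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(levels + 1)]
    cut = [[0] * (n + 1) for _ in range(levels + 1)]
    best[0][0] = 0.0
    for k in range(1, levels + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cand = best[k - 1][i] + cost(i, j)
                if cand < best[k][j]:
                    best[k][j] = cand
                    cut[k][j] = i
    # Walk the stored cut points backwards to recover the bin boundaries.
    bins, j = [], n
    for k in range(levels, 0, -1):
        i = cut[k][j]
        bins.append((i, j))
        j = i
    bins.reverse()
    return best[levels][n], bins
```

For example, `dp_quantize([0, 1, 10, 11], [1, 1, 1, 1], 2)` groups the two small values and the two large values into separate bins.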
Kazuki NAGANUMA Takashi SUZUKI Hiroyuki TSUJI Tomoaki KIMURA
Gaussian integers have the potential to enhance the safety of elliptic curve cryptography (ECC), from the viewpoint of the order of the finite field, on systems where the bit lengths of the integral and floating-point types are fixed. However, there seems to have been no algorithm that makes Gaussian integer ECC safer under this condition. We present an algorithm that enhances the safety of ECC under this condition. We then confirm that, from the viewpoint of the order of the finite field, our Gaussian integer ECC is safer than rational integer ECC and than Gaussian integer ECC based on naive methods under the same condition.
Shanqi PANG Ruining ZHANG Xiao ZHANG
In this work, we introduce notions of quantum frequency arrangements, consisting of quantum frequency squares, cubes, and hypercubes, and a notion of orthogonality between them. We also propose a notion of quantum mixed orthogonal array (QMOA). By using the irredundant mixed orthogonal arrays proposed by Goyeneche et al., we can obtain k-uniform states of heterogeneous systems from quantum frequency arrangements and QMOAs. Furthermore, some examples are presented to illustrate our method.
The nearest neighbor method is a simple and flexible scheme for the classification of data points in a vector space. It predicts a class label of an unseen data point using a majority rule for the labels of known data points inside a neighborhood of the unseen data point. Because it sometimes achieves good performance even for complicated problems, several derivatives of it have been studied. Among them, the discriminant adaptive nearest neighbor method is particularly worth revisiting to demonstrate its application. The main idea of this method is to adjust the neighbor metric of an unseen data point to the set of known data points before label prediction. It often improves the prediction, provided the neighbor metric is adjusted well. For statistical shape analysis, shape classification attracts attention because it is a vital topic in shape analysis. However, because a shape is generally expressed as a matrix, it is non-trivial to apply the discriminant adaptive nearest neighbor method to shape classification. Thus, in this study, we develop the discriminant adaptive nearest neighbor method to make it slightly more useful in shape classification. To achieve this development, a mixture model and optimization algorithm for shape clustering are incorporated into the method. Furthermore, we describe several helpful techniques for the initial guess of the model parameters in the optimization algorithm. Using several shape datasets, we demonstrated that our method is successful for shape classification.
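The majority rule described at the start of this abstract can be sketched in a few lines. The sketch uses a fixed Euclidean metric, which is an assumption for illustration. The discriminant adaptive variant discussed above would instead adjust the metric around each query point, and that adjustment is not shown here.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points, using plain Euclidean distance (assumed metric)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

Replacing `math.dist` with a locally adapted distance function is the point at which an adaptive-metric method would plug in.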
Shusuke NARIEDA Daiki CHO Hiromichi OGASAWARA Kenta UMEBAYASHI Takeo FUJII Hiroshi NARUSE
This paper provides theoretical analyses of maximum cyclic autocorrelation selection (MCAS)-based spectrum sensing techniques in cognitive radio networks. MCAS-based spectrum sensing has lower computational complexity than some cyclostationary detection techniques. However, the characteristics of MCAS-based spectrum sensing have never been derived theoretically. In this study, we derive closed-form solutions for the signal detection probability and the false alarm probability of MCAS-based spectrum sensing. The theoretical values are compared with numerical examples, and the two agree well.
Jingcheng SHEN Fumihiko INO Albert FARRÉS Mauricio HANZICH
Graphics processing units (GPUs) are highly efficient architectures for parallel stencil code; however, the small device (i.e., GPU) memory capacity (several tens of GBs) necessitates the use of out-of-core computation to process excess data. Great programming effort is needed to manually implement efficient out-of-core stencil code. To relieve such programming burdens, directive-based frameworks emerged, such as the pipelined accelerator (PACC); however, they usually lack specific optimizations to reduce data transfer. In this paper, we extend PACC with two data-centric optimizations to address data transfer problems. The first is a direct-mapping scheme that eliminates host (i.e., CPU) buffers, which intermediate between the original data and device buffers. The second is a region-sharing scheme that significantly reduces host-to-device data transfer. The extended PACC was applied to an acoustic wave propagator, automatically extending the length of original serial code 2.3-fold to obtain the out-of-core code. Experimental results revealed that on a Tesla V100 GPU, the generated code ran 41.0, 22.1, and 3.6 times as fast as implementations based on Open Multi-Processing (OpenMP), Unified Memory, and the previous PACC, respectively. The generated code also demonstrated usefulness with small datasets that fit in the device capacity, running 1.3 times as fast as an in-core implementation.
Data sorting is an important operation in computer science. It is extensively used in applications such as databases and searching. While high-performance sorting accelerators are in demand, it is very important to pay attention to the hardware resources that such high-performance sorters require. In this paper, we propose three FPGA-based architectures to accelerate sorting based on the merge sort algorithm. We call our proposals WMS (Wide Merge Sorter), EHMS (Efficient Hardware Merge Sorter), and EHMSP (Efficient Hardware Merge Sorter Plus). We target the Virtex UltraScale FPGA device. Evaluation results show that our proposed merge sorters are both high-performance and cost-effective. While using far fewer hardware resources, our proposed merge sorters achieve higher performance than the state-of-the-art. For instance, when 256 sorted records are produced per cycle, implementation results of the proposed EHMS show a significant reduction in the required number of flip-flops (FFs) and look-up tables (LUTs), to about 66% and 79%, respectively, of those of the state-of-the-art merge sorter. Moreover, while requiring fewer hardware resources, EHMS achieves about 1.4x higher throughput than the state-of-the-art merge sorter. For the same number of produced records, the proposed WMS also achieves about 1.6x higher throughput than the state-of-the-art while requiring about 81% of the FFs and 76% of the LUTs needed by the state-of-the-art sorter.
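As a purely software reference for the operation these architectures accelerate, a two-way merge and a bottom-up merge sort can be sketched as below. This sketch emits one record per loop iteration and says nothing about how WMS, EHMS, or EHMSP produce many records per cycle. The function names are hypothetical.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Emit the smaller head record; ties take the left input first.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_sort(records):
    """Bottom-up merge sort: repeatedly merge neighboring sorted runs."""
    runs = [[r] for r in records]
    while len(runs) > 1:
        merged = [merge(runs[i], runs[i + 1])
                  for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])  # odd run out is carried to the next pass
        runs = merged
    return runs[0] if runs else []
```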
In this paper, we analyze the interval algorithm for random number generation proposed by Han and Hoshi in the case of Markov coin tossing. Using the expression of real numbers on the interval [0,1) based on number systems, we first establish an explicit representation of the interval algorithm. Next, using this representation, we give a rigorous analysis of the interval algorithm. We discuss the difference between the expected number of coin tosses in the interval algorithm and its upper bound derived by Han and Hoshi, and show that the difference can be characterized explicitly with the established representation of the interval algorithm.
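The basic mechanism of the interval algorithm can be sketched for the simpler case of independent, identically distributed coin tosses. The Markov coin analyzed in the paper is not modeled here. The sketch partitions [0,1) according to the target distribution, narrows a working interval with each toss in proportion to the coin probabilities, and stops as soon as the working interval fits inside one target cell. The function name and the `toss` callback are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def interval_algorithm(coin_probs, target_probs, toss):
    """Generate one sample from `target_probs` using tosses of a coin.

    coin_probs and target_probs are lists of Fractions summing to 1.
    toss() returns the index of the next coin outcome (i.i.d. assumed).
    Returns (sample index, number of tosses used).
    """
    # Cumulative boundaries of the target cells on [0, 1).
    bounds = [Fraction(0)]
    for q in target_probs:
        bounds.append(bounds[-1] + q)
    lo, hi = Fraction(0), Fraction(1)
    tosses = 0
    while True:
        # Stop when the working interval lies inside a single target cell.
        for j in range(len(target_probs)):
            if bounds[j] <= lo and hi <= bounds[j + 1]:
                return j, tosses
        a = toss()
        tosses += 1
        # Keep the sub-interval that corresponds to the observed outcome.
        width = hi - lo
        start = lo + width * sum(coin_probs[:a], Fraction(0))
        lo, hi = start, start + width * coin_probs[a]
```

For a fair coin and the target distribution (1/4, 3/4), a first toss of 1 already places the working interval [1/2, 1) inside the second cell, so a single toss suffices in that case.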