Peiqi ZHANG Shinya TAKAMAEDA-YAMAZAKI
Binary Neural Networks (BNNs) have binarized neuron and connection values so that their accelerators can be realized by extremely efficient hardware. However, there is a significant accuracy gap between BNNs and networks with wider bit-widths. Conventional BNNs binarize feature maps with static, globally unified thresholds, which makes the resulting bipolar image lose local details. This paper proposes a multi-input activation function to enable adaptive thresholding for binarizing feature maps: (a) At the algorithm level, instead of processing each input pixel independently, adaptive thresholding dynamically changes the threshold according to the pixels surrounding the target pixel. When optimizing weights, adaptive thresholding is equivalent to an accompanying depth-wise convolution inserted between the normal convolution and binarization. The accompanying weights in the depth-wise filters are ternarized and optimized end-to-end. (b) At the hardware level, adaptive thresholding is realized through a multi-input activation function, which is compatible with common accelerator architectures. Compact activation hardware with only one extra accumulator is devised. By deploying the proposed method on an FPGA, a 4.1% accuracy improvement over the original BNN is achieved with only 1.1% extra LUT resources. Compared with state-of-the-art methods, the proposed idea further increases network accuracy by 0.8% on the CIFAR-10 dataset and 0.4% on the ImageNet dataset.
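The equivalence between adaptive thresholding and a depth-wise convolution followed by a sign function can be illustrated with a minimal NumPy sketch. The function names, the 3x3 kernel size, and the zero padding are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np

def binarize_static(fmap, threshold=0.0):
    """Static global threshold: +1 where value >= threshold, else -1."""
    return np.where(fmap >= threshold, 1, -1)

def binarize_adaptive(fmap, ternary_kernel):
    """Binarize after a depth-wise 3x3 filtering with ternary weights.

    fmap:           (C, H, W) real-valued feature maps.
    ternary_kernel: (C, 3, 3) weights in {-1, 0, +1}, one kernel per channel
                    (hypothetical shape, chosen for this sketch).
    The sign of the filtered value acts as a threshold on the centre pixel
    that depends on its neighbours.
    """
    c, h, w = fmap.shape
    padded = np.pad(fmap, ((0, 0), (1, 1), (1, 1)))  # zero padding (assumption)
    out = np.zeros_like(fmap, dtype=float)
    for dy in range(3):
        for dx in range(3):
            # Per-channel weight broadcast over the shifted window.
            out += ternary_kernel[:, dy, dx, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return np.where(out >= 0, 1, -1)
```

With a kernel whose only non-zero entry is +1 at the centre, the adaptive version reduces to the static threshold at zero; a -1 on a neighbour makes the effective threshold of a pixel equal to that neighbour's value.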
This study considered an extension of a sparse regularization method with scaling, especially in thresholding methods that are simple and typical examples of sparse modeling. In the setting of a non-parametric orthogonal regression problem, we developed and analyzed a thresholding method in which soft thresholding estimators are independently expanded by empirical scaling values. The scaling values have a common hyper-parameter that is the order of expansion of an ideal scaling value to achieve hard thresholding. We simply refer to this estimator as a scaled soft thresholding estimator. The scaled soft thresholding method is a bridge method between the soft and hard thresholding methods. This new estimator coincides with an adaptive LASSO estimator in the orthogonal case; i.e., it is another derivation of an adaptive LASSO estimator. It is a general method that includes soft thresholding and non-negative garrote as special cases. We subsequently derived the degrees of freedom of scaled soft thresholding for calculating Stein's unbiased risk estimate. We found that it decomposes into the degrees of freedom of soft thresholding and a remainder term connected to hard thresholding. As the degrees of freedom reflect the degree of over-fitting, this implies that scaled soft thresholding has another source of over-fitting in addition to the number of un-removed components. The theoretical result was verified by a simple numerical example. In this process, we also focused on the non-monotonicity in the above remainder term of the degrees of freedom and found that, in a sparse and large sample setting, it is mainly caused by useless components that are not related to the target function.
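The bridge between soft and hard thresholding can be sketched with the closed-form adaptive LASSO solution in the orthogonal case, sign(z) * max(|z| - lambda * w, 0) with weights w = 1/|z|^gamma. The parameterization lambda = tau^(1 + gamma), the function name, and the symbols tau and gamma are assumptions chosen so that the threshold level stays at tau; they are not necessarily the abstract's exact scaled estimator.

```python
import numpy as np

def bridge_threshold(z, tau, gamma):
    """Compute sign(z) * max(|z| - tau**(1 + gamma) / |z|**gamma, 0).

    gamma = 0 gives soft thresholding, gamma = 1 gives the non-negative
    garrote z * max(1 - tau**2 / z**2, 0), and a large gamma approaches
    hard thresholding at level tau (parameterization is an assumption).
    """
    z = np.asarray(z, dtype=float)
    a = np.abs(z)
    safe = np.where(a == 0, 1.0, a)           # avoid division by zero at z = 0
    shrink = tau ** (1.0 + gamma) / safe ** gamma
    out = np.sign(z) * np.maximum(a - shrink, 0.0)
    return np.where(a == 0, 0.0, out)
```

For gamma = 1 the result equals the soft thresholding estimate multiplied by the data-dependent factor (1 + tau/|z|), which is one way to read "soft thresholding expanded by a scaling value". Very large gamma combined with very small |z| can overflow in floating point, so moderate values are advisable in this sketch.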
Qin CHENG Linghua ZHANG Bo XUE Feng SHU Yang YU
As an emerging technology, device-free localization (DFL), which uses wireless sensor networks to detect targets that do not carry any electronic devices, has spawned extensive applications such as security safeguards and smart homes or hospitals. Previous studies formulate DFL as a classification problem, but there are still some challenges in terms of accuracy and robustness. In this paper, we exploit a generalized thresholding algorithm with parameter p as a penalty function to solve inverse problems with sparsity constraints for DFL. By reducing the value of p, the function applies less bias to the large coefficients and penalizes small coefficients more strongly. By exploiting the distinctive capability of the p thresholding function to measure sparsity, the proposed approach can achieve accurate and robust localization performance in challenging environments. Extensive experiments show that the algorithm outperforms current alternatives.
Yuta SAKAGAWA Kosuke NAKAJIMA Gosuke OHASHI
We propose a method that detects vehicles from in-vehicle monocular camera images captured during nighttime driving. Detecting vehicles from their shape is difficult at night; however, many vehicle detection methods focusing on light have been proposed. We detect bright spots by appropriate binarization based on the characteristics of vehicle lights such as brightness and color. Because the detected bright spots include lights other than those of vehicles, we need to distinguish the vehicle lights from other bright spots. Therefore, the bright spots are classified using Random Forest, a multiclass classification machine-learning algorithm. The features of bright spots not associated with vehicles are effectively utilized for vehicle detection in our proposed method. More precisely, vehicle detection is performed by weighting the results of the Random Forest based on the features of vehicle bright spots and the features of bright spots not related to vehicles. Our proposed method was applied to nighttime images, and its effectiveness was confirmed.
Sipeng ZHANG Wei JIANG Shin'ichi SATOH
In this paper, a multilevel thresholding color image segmentation method is proposed using a modified Artificial Bee Colony (ABC) algorithm. To improve the local search ability of the ABC algorithm, the Krill Herd algorithm is incorporated into its onlooker bees phase. The proposed algorithm is named the Krill herd-inspired modified Artificial Bee Colony algorithm (KABC algorithm). Experimental results verify the robustness of the KABC algorithm, as well as its improvement in optimization accuracy and convergence speed. In this work, the KABC algorithm is used to solve the problem of multilevel thresholding for color image segmentation. To deal with luminance variation, rather than using a gray scale histogram, an HSV space-based pre-processing method is proposed to obtain a 1D feature vector. The KABC algorithm is then applied to find thresholds of the feature vector. Finally, an additional local search around the quasi-optimal solutions is employed to improve segmentation accuracy. In this stage, we use a modified objective function which combines Structural Similarity Index Matrix (SSIM) with Kapur's entropy. The pre-processing method, the global optimization with the KABC algorithm, and the local optimization stage form the whole color image segmentation method. Experimental results show that the proposed method enhances segmentation accuracy.
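As a rough illustration of the Kapur's entropy criterion mentioned above, the following sketch scores a set of thresholds on a 1D histogram as the sum of the Shannon entropies of the resulting classes. The exhaustive single-threshold search is a stand-in used only for illustration; it is not the KABC optimizer, and the SSIM term of the modified objective is not included. Function names are hypothetical.

```python
import numpy as np

def kapur_entropy(hist, thresholds):
    """Sum of class entropies for classes split at the given bin indices.

    A threshold t puts bins 0..t in one class and bins t+1.. in the next.
    Returns -inf when some class is empty (treated as invalid here).
    """
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    edges = [0, *[t + 1 for t in sorted(thresholds)], len(p)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w <= 0:
            return -np.inf
        q = p[lo:hi] / w              # within-class distribution
        q = q[q > 0]                  # 0 * log 0 is taken as 0
        total += -np.sum(q * np.log(q))
    return total

def best_single_threshold(hist):
    """Brute-force search over one threshold (stand-in for a metaheuristic)."""
    scores = [kapur_entropy(hist, [t]) for t in range(len(hist) - 1)]
    return int(np.argmax(scores))
```

For several thresholds the search space grows combinatorially, which is the usual motivation for using a population-based optimizer instead of brute force.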
Yu Min HWANG Gyeong Hyeon CHA Jong Kwan SEO Jae-Jo LEE Jin Young KIM
This paper proposes a novel wavelet de-noising scheme for burst noise, which consists of background and impulsive noise, in power-line communications. The proposed de-noising scheme employs multi-level threshold functions to efficiently and adaptively reduce the burst noise. The experimental results show that the proposed de-noising scheme significantly outperforms the conventional schemes.
Soft-thresholding is a sparse modeling method typically applied to wavelet denoising in statistical signal processing. It is also important in machine learning since it is an essential element of the well-known LASSO (Least Absolute Shrinkage and Selection Operator). It is known that soft-thresholding, and thus LASSO, suffers from a dilemma between sparsity and generalization. This is caused by excessive shrinkage at a sparse representation. There are several methods for mitigating this problem in the fields of signal processing and machine learning. In this paper, we extend and analyze a method of scaling soft-thresholding estimators. In the setting of a non-parametric orthogonal regression problem, including the discrete wavelet transform, we introduce a component-wise and data-dependent scaling that is identical to the non-negative garrote. We consider the case where the parameter value of soft-thresholding is chosen from the absolute values of the least squares estimates, by which the model selection problem reduces to determining the number of non-zero coefficient estimates. In this case, we first derive the risk and construct SURE (Stein's unbiased risk estimator), which can be used for determining the number of non-zero coefficient estimates. We also analyze some properties of the risk curve and find that our scaling method with the derived SURE can yield a model with low risk and high sparsity compared to a naive soft-thresholding method with SURE. This theoretical prediction was verified by a simple numerical experiment on wavelet denoising.
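For reference, the naive baseline mentioned above (plain soft-thresholding with SURE, with the threshold chosen among the absolute values of the least squares estimates) can be sketched as follows. This uses the standard SURE expression for soft thresholding under Gaussian noise with known standard deviation sigma; the function names are hypothetical, and the SURE derived in the paper for the scaled estimator is not reproduced here.

```python
import numpy as np

def sure_soft(z, lam, sigma=1.0):
    """SURE for soft thresholding of z ~ N(theta, sigma^2 I) at level lam.

    n*sigma^2 - 2*sigma^2 * #{i : |z_i| <= lam} + sum_i min(|z_i|, lam)^2
    """
    z = np.asarray(z, dtype=float)
    zeroed = np.sum(np.abs(z) <= lam)          # coefficients set to zero
    shrink = np.sum(np.minimum(np.abs(z), lam) ** 2)
    return z.size * sigma**2 - 2 * sigma**2 * zeroed + shrink

def select_threshold(z, sigma=1.0):
    """Pick the threshold among the |z_i| that minimizes SURE."""
    cands = np.sort(np.abs(np.asarray(z, dtype=float)))
    risks = [sure_soft(z, lam, sigma) for lam in cands]
    best = int(np.argmin(risks))
    return cands[best], risks[best]
```

Because every candidate threshold equals some |z_i|, choosing the threshold is the same as choosing how many coefficients remain non-zero, which is the reduction described in the abstract.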
Huan HAO Huali WANG Weijun ZENG Hui TIAN
This paper presents a novel MEMD interval thresholding denoising method, where relevant modes are selected by a similarity measure between the probability density function of the input and that of each mode. Simulation and measured EEG data processing results show that the proposed scheme achieves better performance than traditional denoising methods.
Regularized forward selection is viewed as a method for obtaining a sparse representation in a nonparametric regression problem. In regularized forward selection, regression output is represented by a weighted sum of several significant basis functions that are selected from among a large number of candidates by using a greedy training procedure in terms of a regularized cost function and applying an appropriate model selection method. In this paper, we propose a model selection method for regularized forward selection. For this purpose, we focus on the reduction of the cost function that is brought about by appending a new basis function in the greedy training procedure. We first clarify a bias and variance decomposition of the cost reduction and then derive a probabilistic upper bound for the variance of the cost reduction under some conditions. The derived upper bound reflects an essential feature of the greedy training procedure; i.e., it selects the basis function which maximally reduces the cost function. We then propose a thresholding method for determining significant basis functions by applying the derived upper bound as a threshold level and effectively combining it with the leave-one-out cross validation method. Several numerical experiments show that the generalization performance of the proposed method is comparable to that of the other methods, while the number of basis functions selected by the proposed method is much smaller than that selected by the other methods. We can therefore say that the proposed method is able to yield a sparse representation while keeping relatively good generalization performance. Moreover, our method has the advantage that it does not require selecting a regularization parameter.
Natsuki AIZAWA Shogo MURAMATSU Masahiro YUKAWA
A directional lapped orthogonal transform (DirLOT) is an orthonormal transform whose basis is allowed to be anisotropic while retaining the symmetric, real-valued and compact-support properties. Due to its directional property, DirLOT is superior to existing separable transforms such as DCT and DWT in expressing diagonal edges and textures. The goal of this paper is to enhance the ability of DirLOT further. To achieve this goal, we propose a novel image restoration technique using multiple DirLOTs. This paper generalizes an image denoising technique in [1], and expands the application of multiple DirLOTs by introducing a linear degradation operator P. The idea is to use multiple DirLOTs to construct a redundant dictionary. More precisely, the redundant dictionary is constructed as a union of symmetric orthonormal discrete wavelet transforms generated by DirLOTs. To select atoms fitting a target image from the dictionary, we formulate an image restoration problem as an l1-regularized least squares problem, which can efficiently be solved by the iterative-shrinkage/thresholding algorithm (ISTA). The proposed technique is beneficial in expressing multiple directions of edges/textures. Simulation results show that the proposed technique significantly outperforms the non-subsampled Haar wavelet transform for deblurring, super-resolution, and inpainting.
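The ISTA iteration for an l1-regularized least squares problem can be sketched generically as below. Here the matrix A is an assumption standing in for the product of the degradation operator P and the redundant dictionary; the DirLOT dictionary itself is not constructed, and the function names and iteration count are illustrative only.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft thresholding (proximal operator of t * ||x||_1)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 by ISTA.

    A is a generic 2D array (hypothetical stand-in for degradation operator
    times dictionary). The step size is 1/L with L the squared spectral norm.
    """
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)         # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

When A is the identity, the iteration reaches soft_threshold(y, lam) in a single step, which matches the closed-form solution for an orthonormal transform; the redundancy of a union of transforms is what makes the iteration necessary.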
Wei YI Lingjiang KONG Jianyu YANG
The Dynamic Programming (DP) based Track-Before-Detect (TBD) algorithm is effective in detecting low signal-to-noise ratio (SNR) targets. However, its complexity increases exponentially as the dimension of the target state space increases, so the exact implementation of DP-TBD becomes computationally prohibitive if the state dimension is more than two or three, which greatly limits its application to many realistic problems. In order to improve the computational efficiency of DP-TBD, a thresholding process based DP-TBD (TP-DP-TBD) is proposed in this paper. In TP-DP-TBD, a low threshold is first used to eliminate the noise-like (low-amplitude) measurements. Then the DP integration process is modified to focus only on the thresholded higher-amplitude measurements, so that the large amount of computation otherwise devoted to the less meaningful low-amplitude measurements is saved. Additionally, a merit function transfer process is integrated into the DP recursion to guarantee the inheritance and utilization of the target merits. The performance of TP-DP-TBD is investigated under both an optical style Cartesian model and a surveillance radar model. The results show that a substantial computation reduction is achieved with limited performance loss; consequently, TP-DP-TBD provides a cost-efficient tradeoff between computational cost and performance. The effect of the merit function transfer on performance is also studied.
Qingyong LI Yaping HUANG Zhengping LIANG Siwei LUO
Automatic thresholding is an important technique for rail defect detection, but traditional methods do not fit the characteristics of this application well. This paper proposes the Maximum Weighted Object Correlation (MWOC) thresholding method, which exploits the facts that rail images are unimodal and that the defect proportion is small. MWOC selects a threshold by optimizing the product of the object correlation and a weight term that expresses the proportion of thresholded defects. Our experimental results demonstrate that MWOC achieves a misclassification error of 0.85% and outperforms other well-established thresholding methods, including Otsu, maximum correlation thresholding, maximum entropy thresholding and the valley-emphasis method, for the application of rail defect detection.
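For context, the Otsu baseline named above can be sketched as follows. It picks the gray level that maximizes the between-class variance of the histogram. The MWOC criterion itself is not reproduced because the abstract does not give its formula; the function name and the assumption of integer gray levels in the range 0 to 255 are illustrative.

```python
import numpy as np

def otsu_threshold(image, n_bins=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form one class, the rest the other.
    Assumes integer gray levels in [0, n_bins - 1] (assumption of this sketch).
    """
    hist, _ = np.histogram(image, bins=n_bins, range=(0, n_bins))
    p = hist / hist.sum()
    levels = np.arange(n_bins)
    w0 = np.cumsum(p)                 # probability of the class <= t
    mu = np.cumsum(p * levels)        # cumulative first moment up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)       # skip splits with an empty class
    sigma_b = np.zeros(n_bins)
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))
```

Otsu's criterion implicitly favors two classes of comparable size and spread, which is one common explanation for its weakness on unimodal images where the defect class is a small fraction of the pixels.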
In this paper, we consider a nonparametric regression problem using a learning machine defined by a weighted sum of fixed basis functions, where the number of basis functions, or equivalently, the number of weights, is equal to the number of training data. For the learning machine, we propose a training scheme that is based on orthogonalization and thresholding. In this scheme, vectors of basis function outputs are orthogonalized, and coefficients of the orthogonalized vectors are estimated instead of weights. A coefficient is set to zero if it is less than a predetermined threshold level assigned component-wise to each coefficient. We then obtain the resulting weight vector by transforming the thresholded coefficients. In this training scheme, we propose asymptotically reasonable threshold levels to distinguish contributing components from unnecessary ones. To see how this works in a simple case, we derive an upper bound for the generalization error of the training scheme with the given threshold levels. It tells us that the increase in the generalization error is of O(log n/n) when there is a sparse representation of a target function in an orthogonal domain. In implementing the training scheme, eigen-decomposition or the Gram–Schmidt procedure is employed for orthogonalization, and the corresponding training methods are referred to as OHTED and OHTGS. Furthermore, modified versions of OHTED and OHTGS, called OHTED2 and OHTGS2 respectively, are proposed for reduced estimation bias. On real benchmark datasets, OHTED2 and OHTGS2 are found to exhibit relatively good generalization performance. In addition, OHTGS2 is found to obtain a sparse representation of a target function in terms of the basis functions.
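The orthogonalize, threshold, and transform-back steps can be sketched with a QR decomposition, which yields the same orthonormal vectors as the Gram–Schmidt procedure up to sign. This is only an illustration under stated assumptions: the function name is hypothetical, the threshold level is a user-supplied parameter compared against the absolute value of each coefficient, and the paper's asymptotic threshold levels and the bias-reducing variants (OHTED2, OHTGS2) are not reproduced.

```python
import numpy as np

def orth_threshold_fit(Phi, y, level):
    """Fit weights w for y ~ Phi @ w via orthogonalization and thresholding.

    Phi:   (n, n) matrix of basis function outputs, assumed non-singular.
    level: scalar or length-n array of threshold levels (hypothetical input).
    """
    Q, R = np.linalg.qr(Phi)                   # Phi = Q R, Q orthonormal columns
    coef = Q.T @ y                             # coefficients in orthogonal domain
    coef = np.where(np.abs(coef) >= level, coef, 0.0)   # hard thresholding
    weights = np.linalg.solve(R, coef)         # back to the original weights
    return weights, coef
```

With level set to zero nothing is removed and the weights interpolate the training data exactly, which shows why some thresholding is needed when the number of weights equals the number of training points.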
Suk Tae SEO Hye Cheun JEONG In Keun LEE Chang Sik SON Soon Hak KWON
An approach to image thresholding based on the plausibility of object and background regions by adopting a co-occurrence matrix and category utility is presented. The effectiveness of the proposed method is shown through the experimental results tested on several images and compared with conventional methods.
Soon Hak KWON Hye Cheun JEONG Suk Tae SEO In Keun LEE Chang Sik SON
The thresholding results for gray level images depend greatly on the thresholding method applied. To address this, this letter proposes a histogram equalization-based thresholding algorithm that makes the thresholding results insensitive to the thresholding method applied. Experimental results are presented to demonstrate the effectiveness of the proposed thresholding algorithm.
Aryuanto SOETEDJO Koichi YAMADA
This paper describes a new color segmentation based on a normalized RGB chromaticity diagram for face detection. Face skin is extracted from color images using a coarse skin region with fixed boundaries followed by a fine skin region with variable boundaries. Two newly developed histograms that have prominent peaks of skin color and non-skin colors are employed to adjust the boundaries of the skin region. The proposed approach does not need a skin color model, which depends on a specific camera parameter and is usually limited to a particular environment condition, and no sample images are required. The experimental results using color face images of various races under varying lighting conditions and complex backgrounds, obtained from four different resources on the Internet, show a high detection rate of 87%. The results of the detection rate and computation time are comparable to the well known real-time face detection method proposed by Viola-Jones [11],[12].
In this study, we propose a simple, yet general and powerful framework for constructing accurate affine invariant regions. In our framework, a method for extracting reliable seed points is first proposed. Then, regions which are invariant to most common affine transformations can be extracted from the seed points by one of two new methods: Path Growing (PG) or Thresholding Seeded Growing Region (TSGR). After that, an improved ellipse fitting method based on Direct Least Square Fitting (DLSF) is used to fit the irregularly shaped contours from PG or TSGR to obtain ellipse regions as the final invariant regions. In the experiments, our framework is first evaluated by the criteria of Mikolajczyk's evaluation framework [1], and then on the near-duplicate detection problem [2]. Our framework shows its superiority over the other detectors for differently transformed images under Mikolajczyk's evaluation framework, and the variant with TSGR also gives satisfactory results when applied to the near-duplicate detection problem.
Kan'ya SASAKI Takashi MORIE Atsushi IWATA
An integrate-and-fire-type spiking feedback network is discussed in this paper. In our spiking neuron model, analog information expressing processing results is given by the relative timing of spike firing. Therefore, for spiking feedback networks, all neurons should fire (pseudo-)periodically. However, an integrate-and-fire-type neuron generates no spike unless its internal potential exceeds the threshold. To solve this problem, we propose a negative thresholding operation. In this paper, this operation is achieved by a global excitatory unit. This unit operates immediately after receiving the first spike input. We have designed a CMOS spiking feedback network VLSI circuit with the global excitatory unit for Hopfield-type associative memory. The circuit simulation results show that the network achieves correct association operation.
Aryuanto SOETEDJO Koichi YAMADA
Traffic sign recognition usually consists of two stages: detection and classification. In this paper, we describe the classification stage using the ring-partitioned method. The proposed method uses a specified grayscale image in the pre-processing step and ring-partitioned matching in the matching step. The method does not need many carefully prepared samples of traffic sign images for the training process; instead, only the standard traffic signs are used as the reference images. The experimental results show the effectiveness of the method in matching traffic sign images that are occluded, rotated, or affected by illumination changes, with a fast computation time.
Bing-Fei WU Yen-Lin CHEN Chung-Cheng CHIU
In this study, we have proposed an efficient automatic multilevel thresholding method for image segmentation. An effective criterion for measuring the separability of the homogeneous objects in the image, based on discriminant analysis, has been introduced to automatically determine the number of thresholding levels to be performed. Then, by applying this discriminant criterion, the object regions with homogeneous illumination in the image can be recursively and automatically thresholded into separate segmented images. The proposed method is fast and effective in analyzing and thresholding the histogram of the image. In order to conduct an equitable comparative performance evaluation of the proposed method with other thresholding methods, a combinatorial scheme is also introduced to properly reduce the computational complexity of performing multilevel thresholding. The experimental results demonstrate that the proposed method is feasible and computationally efficient in automatic multilevel thresholding for image segmentation.