The search functionality is under construction.

Keyword Search Result

[Keyword] gradient(160hit)

1-20hit(160hit)

  • Robust Bilinear Form Identification: A Subgradient Method with Geometrically Decaying Stepsize in the Presence of Heavy-Tailed Noise Open Access

    Guowei YANG  

     
    PAPER-Fundamental Theories for Communications

      Vol:
    E107-B No:10
      Page(s):
    627-632

    This paper delves into the utilisation of the subgradient method with geometrically decaying stepsize for Bilinear Form Identification. We introduce the iterative Wiener Filter, an l2 regression method, and highlight its limitations when confronted with noise, particularly heavy-tailed noise. To address these challenges, the paper suggests employing the l1 regression method with a subgradient method utilizing a geometrically decaying step size. The effectiveness of this approach is compared to existing methods, including the ALS algorithm. The study demonstrates that the l1 algorithm, especially when paired with the proposed subgradient method, excels in stability and accuracy under conditions of heavy-tailed noise. Additionally, the paper introduces the standard rounding procedure and the S-outlier bound as relaxations of traditional assumptions. Numerical experiments provide support and validation for the presented results.
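
    A minimal sketch of the idea, not the paper's algorithm: it applies a normalized subgradient step with geometrically decaying stepsize to plain linear l1 regression rather than to bilinear form identification. The function name `l1_subgradient`, the parameters `step0` and `decay`, and the synthetic Cauchy-corrupted data are all assumptions made for illustration.

```python
import numpy as np

def l1_subgradient(A, b, step0=1.0, decay=0.97, iters=500):
    """Minimize (1/n) * ||A x - b||_1 with stepsize step0 * decay**k (illustrative)."""
    n, d = A.shape
    x = np.zeros(d)
    for k in range(iters):
        # A subgradient of the l1 loss at x; np.sign(0) = 0 is a valid choice.
        g = A.T @ np.sign(A @ x - b) / n
        g_norm = np.linalg.norm(g)
        if g_norm == 0.0:
            break  # zero subgradient: x is already a minimizer
        # Normalized subgradient step with geometrically decaying stepsize.
        x = x - step0 * decay**k * g / g_norm
    return x

# Synthetic data (assumption): 10% of the measurements carry heavy-tailed noise.
rng = np.random.default_rng(0)
n, d = 400, 5
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true
outliers = rng.choice(n, size=n // 10, replace=False)
b[outliers] += rng.standard_cauchy(n // 10)

x_l1 = l1_subgradient(A, b)
x_l2 = np.linalg.lstsq(A, b, rcond=None)[0]  # least-squares baseline
print("l1 error:", np.linalg.norm(x_l1 - x_true))
print("l2 error:", np.linalg.norm(x_l2 - x_true))
```

    The least-squares line is included only as a baseline for comparison; the decay factor and initial stepsize would need tuning for other problems.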

  • An Investigation on LP Decoding of Short Binary Linear Codes With the Subgradient Method Open Access

    Haiyang LIU  Xiaopeng JIAO  Lianrong MA  

     
    LETTER-Coding Theory

      Publicized:
    2023/11/21
      Vol:
    E107-A No:8
      Page(s):
    1395-1399

    In this letter, we investigate the application of the subgradient method to design an efficient algorithm for linear programming (LP) decoding of binary linear codes. A major drawback of the original formulation of LP decoding is that the description complexity of the feasible region is exponential in the check node degrees of the code. In order to tackle the problem, we propose a processing technique for LP decoding with the subgradient method, whose complexity is linear in the check node degrees. Consequently, a message-passing type decoding algorithm can be obtained, whose per-iteration complexity is extremely low. Moreover, if the algorithm converges to a valid codeword, it is guaranteed to be a maximum likelihood codeword. Simulation results on several binary linear codes with short lengths suggest that the performances of LP decoding based on the subgradient method and the state-of-the-art LP decoding implementation approach are comparable.

  • Precoder Optimization Using Data Correlation for Wireless Data Aggregation

    Ayano NAKAI-KASAI  Naoyuki HAYASHI  Tadashi WADAYAMA  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E107-B No:3
      Page(s):
    330-338

    In this paper, we consider precoder design for wireless data aggregation in sensor networks. The precoder optimization problem can be formulated as minimization of mean squared error under transmit power and block diagonal constraints. We include statistical correlation of data into the optimization problem, which appears in typical applications but is ignored in conventional design methods. We propose precoder optimization algorithms based on projected gradient descent with projection onto the constraint sets. The proposed method can achieve better performance than the conventional methods that do not incorporate data correlation, especially when data are highly correlated. We also extend the proposed approach to the context of over-the-air computation.
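
    As a rough illustration of projected gradient descent under a power constraint and a block diagonal constraint, the sketch below minimizes a stand-in objective ||A W - T||_F^2. The objective, the matrices `A` and `T`, the power budget `P`, and the function names are assumptions; the paper's MSE formulation with data correlation is not reproduced here.

```python
import numpy as np

def project(W, mask, P):
    """Project onto {W : W is zero outside mask, ||W||_F^2 <= P}."""
    W = np.where(mask, W, 0.0)          # enforce the block diagonal pattern
    norm = np.linalg.norm(W)            # Frobenius norm
    if norm > np.sqrt(P):
        W = W * (np.sqrt(P) / norm)     # scale back onto the power ball
    return W

def projected_gradient_descent(A, T, mask, P, iters=200):
    L = 2.0 * np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    W = np.zeros((A.shape[1], T.shape[1]))
    for _ in range(iters):
        grad = 2.0 * A.T @ (A @ W - T)   # gradient of ||A W - T||_F^2
        W = project(W - grad / L, mask, P)
    return W

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
T = rng.standard_normal((6, 4))
mask = np.kron(np.eye(2), np.ones((2, 2))).astype(bool)  # two 2x2 blocks
P = 1.0

W = projected_gradient_descent(A, T, mask, P)
loss_init = np.linalg.norm(T) ** 2        # objective at the starting point W = 0
loss_final = np.linalg.norm(A @ W - T) ** 2
print(loss_init, loss_final)
```

    Because the power ball is centered at the origin, masking followed by rescaling is the exact projection onto the intersection of the two constraint sets, so a single projection step suffices per iteration.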

  • Hardware-Trojan Detection at Gate-Level Netlists Using a Gradient Boosting Decision Tree Model and Its Extension Using Trojan Probability Propagation

    Ryotaro NEGISHI  Tatsuki KURIHARA  Nozomu TOGAWA  

     
    PAPER

      Publicized:
    2023/08/16
      Vol:
    E107-A No:1
      Page(s):
    63-74

    Technological devices have become deeply embedded in people's lives, and their demand is growing every year. It has been indicated that outsourcing the design and manufacturing of integrated circuits, which are essential for technological devices, may lead to the insertion of malicious circuitry, called hardware Trojans (HTs). This paper proposes an HT detection method at gate-level netlists based on XGBoost, one of the best gradient boosting decision tree models. We first propose the optimal set of HT features among many feature candidates at a netlist level through thorough evaluations. Then, we construct an XGBoost-based HT detection method with its optimized hyperparameters. Evaluation experiments were conducted on the netlists from Trust-HUB benchmarks and showed the average F-measure of 0.842 using the proposed method. Also, we newly propose a Trojan probability propagation method that effectively corrects the HT detection results and apply it to the results obtained by XGBoost-based HT detection. Evaluation experiments showed that the average F-measure is improved to 0.861. This value is 0.194 points higher than that of the existing best method proposed so far.

  • Gradient Descent Direction Random Walk MIMO Detection Using Intermediate Search Point

    Naoki ITO  Yukitoshi SANADA  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2023/07/24
      Vol:
    E106-B No:11
      Page(s):
    1192-1199

    In this paper, multi-input multi-output (MIMO) signal detection with random walk along a gradient descent direction using an intermediate search point is presented. As a low-complexity MIMO signal detection scheme, a gradient descent algorithm with Metropolis-Hastings (MH) methods has been proposed. Random walk along a gradient descent direction speeds up the MH-based search using the gradient of a least-squares cost function. However, the gradient vector may be discarded through QAM constellation quantization in some cases. For further performance improvement, this paper proposes an improved search scheme in which the gradient vector is stored for the next search iteration to generate an intermediate search point. The performance of the proposed scheme improves with higher order modulation symbols as compared with that of a conventional scheme. Numerical results obtained through computer simulation show that the bit error rate (BER) performance improves by 5 dB at a BER of 10^-3 for 64QAM symbols in a 16×16 MIMO system.

  • On Gradient Descent Training Under Data Augmentation with On-Line Noisy Copies

    Katsuyuki HAGIWARA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/06/12
      Vol:
    E106-D No:9
      Page(s):
    1537-1545

    In machine learning, data augmentation (DA) is a technique for improving the generalization performance of models. In this paper, we mainly consider gradient descent of linear regression under DA using noisy copies of datasets, in which noise is injected into inputs. We analyze the situation where noisy copies are newly generated and injected into inputs at each epoch, i.e., the case of using on-line noisy copies. Therefore, this article can also be viewed as an analysis on a method using noise injection into a training process by DA. We considered the training process under three training situations, which are full-batch training under the sum of squared errors, and full-batch and mini-batch training under the mean squared error. We showed that, in all cases, training for DA with on-line copies is approximately equivalent to the l2 regularization training for which variance of injected noise is important, whereas the number of copies is not. Moreover, we showed that DA with on-line copies apparently leads to an increase of learning rate in the full-batch condition under the sum of squared errors and the mini-batch condition under the mean squared error. The apparent increase in learning rate and regularization effect can be attributed to the original input and additive noise in noisy copies, respectively. These results are confirmed in a numerical experiment in which we found that our result can be applied to usual off-line DA in an under-parameterization scenario but not in an over-parameterization scenario. Moreover, we experimentally investigated the training process of neural networks under DA with off-line noisy copies and found that our analysis on linear regression can be qualitatively applied to neural networks.

  • Intrusion Detection Model of Internet of Things Based on LightGBM Open Access

    Guosheng ZHAO  Yang WANG  Jian WANG  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2023/02/20
      Vol:
    E106-B No:8
      Page(s):
    622-634

    Internet of Things (IoT) devices are widely used in various fields. However, their limited computing resources make them extremely vulnerable and difficult to protect effectively. Traditional intrusion detection systems (IDS) focus on high accuracy and low false alarm rate (FAR), which often makes their spatiotemporal complexity too high for deployment on IoT devices. In response to the above problems, this paper proposes an intrusion detection model of IoT based on the light gradient boosting machine (LightGBM). Firstly, the one-dimensional convolutional neural network (CNN) is used to extract features from network traffic to reduce the feature dimensions. Then, the LightGBM is used for classification to determine the type to which the network traffic belongs. The LightGBM is more lightweight while inheriting the advantages of the gradient boosting tree. The LightGBM has a faster decision tree construction process. Experiments on the TON-IoT and BoT-IoT datasets show that the proposed model performs better and is more lightweight than the comparison models. The proposed model can shorten the prediction time by 90.66% and is better than the comparison models in accuracy and other performance metrics. The proposed model has strong detection capability for denial of service (DoS) and distributed denial of service (DDoS) attacks. Experimental results on the testbed built with IoT devices such as Raspberry Pi show that the proposed model can perform effective and real-time intrusion detection on IoT devices.

  • Simplification and Accurate Implementation of State Evolution Recursion for Conjugate Gradient

    Sakyo HASHIMOTO  Keigo TAKEUCHI  

     
    LETTER-Communication Theory and Signals

      Publicized:
    2022/12/15
      Vol:
    E106-A No:6
      Page(s):
    952-956

    This letter simplifies and analyzes existing state evolution recursions for conjugate gradient. The proposed simplification reduces the complexity for solving the recursions from cubic order to square order in the total number of iterations. The simplified recursions are still catastrophically sensitive to numerical errors, so that arbitrary-precision arithmetic is used for accurate evaluation of the recursions.

  • Proximal Decoding for LDPC Codes

    Tadashi WADAYAMA  Satoshi TAKABE  

     
    PAPER-Coding Theory and Techniques

      Publicized:
    2022/09/01
      Vol:
    E106-A No:3
      Page(s):
    359-367

    This paper presents a novel optimization-based decoding algorithm for LDPC codes. The proposed decoding algorithm is based on a proximal gradient method for solving an approximate maximum a posteriori (MAP) decoding problem. The key idea of the proposed algorithm is the use of a code-constraint polynomial to penalize a vector far from a codeword as a regularizer in the approximate MAP objective function. A code proximal operator is naturally derived from a code-constraint polynomial. The proposed algorithm, called proximal decoding, can be described by a simple recursive formula consisting of the gradient descent step for a negative log-likelihood function corresponding to the channel conditional probability density function and the code proximal operation regarding the code-constraint polynomial. Proximal decoding is experimentally shown to be applicable to several non-trivial channel models such as LDPC-coded massive MIMO channels, correlated Gaussian noise channels, and nonlinear vector channels. In particular, in MIMO channels, proximal decoding outperforms known massive MIMO detection algorithms, such as an MMSE detector with belief propagation decoding. The simple optimization-based formulation of proximal decoding allows a way for developing novel signal processing algorithms involving LDPC codes.

  • A State-Space Approach and Its Estimation Bias Analysis for Adaptive Notch Digital Filters with Constrained Poles and Zeros

    Yoichi HINAMOTO  Shotaro NISHIMURA  

     
    PAPER-Digital Signal Processing

      Publicized:
    2022/09/16
      Vol:
    E106-A No:3
      Page(s):
    582-589

    This paper deals with a state-space approach for adaptive second-order IIR notch digital filters with constrained poles and zeros. A simplified iterative algorithm is derived from the gradient-descent method to minimize the mean-squared output of an adaptive notch digital filter. Then, stability and parameter-estimation bias are analyzed for the simplified iterative algorithm. A numerical example is presented to demonstrate the validity and effectiveness of the proposed adaptive state-space notch digital filter and parameter-estimation bias analysis.

  • Improving Noised Gradient Penalty with Synchronized Activation Function for Generative Adversarial Networks

    Rui YANG  Raphael SHU  Hideki NAKAYAMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2022/05/27
      Vol:
    E105-D No:9
      Page(s):
    1537-1545

    Generative Adversarial Networks (GANs) are one of the most successful learning principles of generative models and were widely applied to many generation tasks. In the beginning, the gradient penalty (GP) was applied to enforce the discriminator in GANs to satisfy Lipschitz continuity in Wasserstein GAN. Although the vanilla version of the gradient penalty was further modified for different purposes, seeking a better equilibrium and higher generation quality in adversarial learning remains challenging. Recently, DRAGAN was proposed to achieve the local linearity in a surrounding data manifold by applying the noised gradient penalty to promote the local convexity in model optimization. However, we show that their approach will impose a burden on satisfying Lipschitz continuity for the discriminator. Such conflict between Lipschitz continuity and local linearity in DRAGAN will result in poor equilibrium, and thus the generation quality is far from ideal. To this end, we propose a novel approach to benefit both local linearity and Lipschitz continuity for reaching a better equilibrium without conflict. In detail, we apply our synchronized activation function in the discriminator to receive a particular form of noised gradient penalty for achieving local linearity without losing the property of Lipschitz continuity in the discriminator. Experimental results show that our method can reach the superior quality of images and outperforms WGAN-GP, DiracGAN, and DRAGAN in terms of Inception Score and Fréchet Inception Distance on real-world datasets.

  • Convergence Acceleration via Chebyshev Step: Plausible Interpretation of Deep-Unfolded Gradient Descent

    Satoshi TAKABE  Tadashi WADAYAMA  

     
    PAPER-Numerical Analysis and Optimization

      Publicized:
    2022/01/25
      Vol:
    E105-A No:8
      Page(s):
    1110-1120

    Deep unfolding is a promising deep-learning technique, whose network architecture is based on expanding the recursive structure of existing iterative algorithms. Although deep unfolding realizes convergence acceleration, its theoretical aspects have not been revealed yet. This study details the theoretical analysis of the convergence acceleration in deep-unfolded gradient descent (DUGD) whose trainable parameters are step sizes. We propose a plausible interpretation of the learned step-size parameters in DUGD by introducing the principle of Chebyshev steps derived from Chebyshev polynomials. The use of Chebyshev steps in gradient descent (GD) enables us to bound the spectral radius of a matrix governing the convergence speed of GD, leading to a tight upper bound on the convergence rate. Numerical results show that Chebyshev steps numerically explain the learned step-size parameters in DUGD well.
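
    The following sketch illustrates Chebyshev steps on a small quadratic problem, where the step sizes are the reciprocals of Chebyshev nodes mapped to the eigenvalue interval. It is a toy example under assumed values (a diagonal matrix with eigenvalues in [1, 10], T = 8 iterations, hypothetical function names), not a reproduction of the paper's experiments or of a trained DUGD network.

```python
import numpy as np

def chebyshev_steps(lam_min, lam_max, T):
    """Reciprocals of the T Chebyshev nodes mapped to [lam_min, lam_max]."""
    t = np.arange(T)
    nodes = (lam_max + lam_min) / 2 + (lam_max - lam_min) / 2 * np.cos(
        (2 * t + 1) * np.pi / (2 * T)
    )
    return 1.0 / nodes

def gradient_descent(A, b, steps):
    """GD on f(x) = 0.5 x^T A x - b^T x with one step size per iteration."""
    x = np.zeros_like(b)
    for gamma in steps:
        x = x - gamma * (A @ x - b)
    return x

lam_min, lam_max, T = 1.0, 10.0, 8
A = np.diag(np.linspace(lam_min, lam_max, 10))  # assumed test matrix
x_star = np.ones(10)
b = A @ x_star

x_cheb = gradient_descent(A, b, chebyshev_steps(lam_min, lam_max, T))
x_const = gradient_descent(A, b, np.full(T, 2.0 / (lam_min + lam_max)))

err_cheb = np.linalg.norm(x_cheb - x_star)
err_const = np.linalg.norm(x_const - x_star)
# Worst-case contraction after T Chebyshev steps: 1 / T_T((lam_max+lam_min)/(lam_max-lam_min)).
bound = 1.0 / np.cosh(T * np.arccosh((lam_max + lam_min) / (lam_max - lam_min)))
print(err_cheb, err_const, bound * np.linalg.norm(x_star))
```

    The constant step 2/(lam_min + lam_max) is used as the comparison because it is the best fixed step size for this quadratic. For many iterations or ill-conditioned problems, the order in which Chebyshev steps are applied can affect numerical stability.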

  • Label-Adversarial Jointly Trained Acoustic Word Embedding

    Zhaoqi LI  Ta LI  Qingwei ZHAO  Pengyuan ZHANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2022/05/20
      Vol:
    E105-D No:8
      Page(s):
    1501-1505

    Query-by-example spoken term detection (QbE-STD) is a task of using speech queries to match utterances, and the acoustic word embedding (AWE) method of generating fixed-length representations for speech segments has shown high performance and efficiency in recent work. We propose an AWE training method using a label-adversarial network to reduce the interference information learned during AWE training. Experiments demonstrate that our method achieves significant improvements on multilingual and zero-resource test sets.

  • Weighted Gradient Pretrain for Low-Resource Speech Emotion Recognition

    Yue XIE  Ruiyu LIANG  Xiaoyan ZHAO  Zhenlin LIANG  Jing DU  

     
    LETTER-Speech and Hearing

      Publicized:
    2022/04/04
      Vol:
    E105-D No:7
      Page(s):
    1352-1355

    To alleviate the problem of the dependency on the quantity of the training sample data in speech emotion recognition, a weighted gradient pre-train algorithm for low-resource speech emotion recognition is proposed. Multiple public emotion corpora are used for pre-training to generate shared hidden layer (SHL) parameters with the generalization ability. The parameters are used to initialize the downstream network of the recognition task for the low-resource dataset, thereby improving the recognition performance on low-resource emotion corpora. However, the emotion categories are different among the public corpora, and the number of samples varies greatly, which will increase the difficulty of joint training on multiple emotion datasets. To this end, a weighted gradient (WG) algorithm is proposed to enable the shared layer to learn the generalized representation of different datasets without affecting the priority of the emotion recognition on each corpus. Experiments show that the accuracy is improved by using CASIA, IEMOCAP, and eNTERFACE as the known datasets to pre-train the emotion models of GEMEP, and the performance could be improved further by combining WG with gradient reversal layer.

  • Multi-Agent Distributed Route Selection under Consideration of Time Dependency among Agents' Road Usage for Vehicular Networks

    Takanori HARA  Masahiro SASABE  Shoji KASAHARA  

     
    PAPER

      Publicized:
    2021/08/05
      Vol:
    E105-B No:2
      Page(s):
    140-150

    Traffic congestion in road networks has been studied as the congestion game in game theory. In the existing work, the road usage by each agent was assumed to be static during the whole time horizon of the agent's travel, as in the classical congestion game. This assumption, however, should be reconsidered because each agent sequentially uses roads composing the route. In this paper, we propose a multi-agent distributed route selection scheme based on a gradient descent method considering the time-dependency among agents' road usage for vehicular networks. The proposed scheme first estimates the time-dependent flow on each road by considering the agents' probabilistic occupation under the first-in-first-out (FIFO) policy. Then, it calculates the optimal route choice probability of each route candidate using the gradient descent method and the estimated time-dependent flow. Each agent finally selects one route according to the optimal route choice probabilities. We first prove that the proposed scheme can exponentially converge to the steady-state at the convergence rate inversely proportional to the product of the number of agents and that of individual route candidates. Through simulations under a grid-like network and a real road network, we show that the proposed scheme can improve the actual travel time by 5.1% and 2.5% compared with the conventional static-flow based approach, respectively. In addition, we demonstrate that the proposed scheme is robust against incomplete information sharing among agents, which would be caused by its low penetration ratio or limited transmission range of wireless communications.

  • Adaptive Normal State-Space Notch Digital Filters: Algorithm and Frequency-Estimation Bias Analysis

    Yoichi HINAMOTO  Shotaro NISHIMURA  

     
    PAPER-Digital Signal Processing

      Publicized:
    2021/05/17
      Vol:
    E104-A No:11
      Page(s):
    1585-1592

    This paper investigates an adaptive notch digital filter that employs normal state-space realization of a single-frequency second-order IIR notch digital filter. An adaptive algorithm is developed to minimize the mean-squared output error of the filter iteratively. This algorithm is based on a simplified form of the gradient-descent method. Stability and frequency estimation bias are analyzed for the adaptive iterative algorithm. Finally, a numerical example is presented to demonstrate the validity and effectiveness of the proposed adaptive notch digital filter and the frequency-estimation bias analyzed for the adaptive iterative algorithm.

  • Gradient Corrected Approximation for Binary Neural Networks

    Song CHENG  Zixuan LI  Yongsen WANG  Wanbing ZOU  Yumei ZHOU  Delong SHANG  Shushan QIAO  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2021/07/05
      Vol:
    E104-D No:10
      Page(s):
    1784-1788

    Binary neural networks (BNNs), where both activations and weights are radically quantized to be {-1, +1}, can massively accelerate the run-time performance of convolution neural networks (CNNs) for edge devices, by computation complexity reduction and memory footprint saving. However, the non-differentiable binarizing function used in BNNs makes the binarized models hard to optimize and introduces significant performance degradation relative to the full-precision models. Many previous works managed to correct the backward gradient of binarizing function with various improved versions of straight-through estimation (STE), or in a gradual approximate approach, but the gradient suppression problem was not analyzed and handled. Thus, we propose a novel gradient corrected approximation (GCA) method to match the discrepancy between binarizing function and backward gradient in a gradual and stable way. Our work has two primary contributions: The first is to approximate the backward gradient of binarizing function using a simple leaky-steep function with variable window size. The second is to correct the gradient approximation by standardizing the backward gradient propagated through binarizing function. Experiment results show that the proposed method outperforms the baseline by 1.5% Top-1 accuracy on ImageNet dataset without introducing extra computation cost.

  • Distributed UAVs Placement Optimization for Cooperative Communication

    Zhaoyang HOU  Zheng XIANG  Peng REN  Qiang HE  Ling ZHENG  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2020/12/08
      Vol:
    E104-B No:6
      Page(s):
    675-685

    In this paper, the distributed cooperative communication of unmanned aerial vehicles (UAVs) is studied, where the condition number (CN) and the inner product (InP) are used to measure the quality of communication links. By optimizing the relative position of UAVs, large channel capacity and stable communication links can be obtained. Using the spherical wave model under the line of sight (LOS) channel, CN expression of the channel matrix is derived when there are Nt transmitters and two receivers in the system. In order to maximize channel capacity, we derive the UAVs position constraint equation (UAVs-PCE), and the constraint between BS elements distance and carrier wavelength is analyzed. The result shows there is an area where no matter how the UAVs' positions are adjusted, the CN is still very large. Then a special scenario is considered where UAVs form a rectangular lattice array, and the optimal constraint between communication distance and UAVs distance is derived. After that, we derive the InP of channel matrix and the gradient expression of InP with respect to UAVs' position. The particle swarm optimization (PSO) algorithm is used to minimize the CN and the gradient descent (GD) algorithm is used to minimize the InP by optimizing UAVs' position iteratively. Both algorithms show great potential for optimizing the CN and InP, respectively. Furthermore, a hybrid algorithm named PSO-GD combining the advantage of the two algorithms is proposed to maximize the communication capacity with lower complexity. Simulations show that PSO-GD is more efficient than PSO and GD. PSO helps GD to break away from local extremum and provides better positions for GD, and GD can converge to an optimal solution quickly by using the gradient information based on the better positions. Simulations also reveal that a better channel can be obtained when those parameters satisfy the UAVs position constraint equation (UAVs-PCE); meanwhile, theoretical analysis also explains the abnormal phenomena in simulations.

  • Joint Extreme Channels-Inspired Structure Extraction and Enhanced Heavy-Tailed Priors Heuristic Kernel Estimation for Motion Deblurring of Noisy and Blurry Images

    Hongtian ZHAO  Shibao ZHENG  

     
    PAPER-Vision

      Vol:
    E103-A No:12
      Page(s):
    1520-1528

    Motion deblurring for noisy and blurry images is an arduous and fundamental problem in the image processing community. The problem is ill-posed as many different pairs of latent image and blur kernel can render the same blurred image, and thus, the optimization of this problem is still unsolved. To tackle it, we present an effective motion deblurring method for noisy and blurry images based on prominent structure and a data-driven heavy-tailed prior of enhanced gradient. Specifically, first, we employ denoising as a preprocessing step to remove the input image noise, and then restore strong edges for accurate kernel estimation. The image extreme channels-based priors (dark channel prior and bright channel prior) as sparse complementary knowledge are exploited to extract prominent structure. High closeness of the extracted structure to the clear image structure can be obtained via tuning the parameters of extraction function. Next, the integration term of enhanced interim image gradient and clear image heavy-tailed prior is proposed and then embedded into the image restoration model, which favors sharp images over blurry ones. A large number of experiments on both synthetic and real-life images verify the superiority of the proposed method over state-of-the-art algorithms, both qualitatively and quantitatively.

  • An Efficient Method for Training Deep Learning Networks Distributed

    Chenxu WANG  Yutong LU  Zhiguang CHEN  Junnan LI  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2020/09/07
      Vol:
    E103-D No:12
      Page(s):
    2444-2456

    Training deep learning (DL) is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High performance computing clusters, especially supercomputers, are equipped with a large amount of computing resources, storage resources, and efficient interconnection ability, which can train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which can make full use of hardware resources and greatly increase computational efficiency. Second, we present a two-level parameter synchronization scheme which can reduce communication overhead by transmitting parameters of the first layer models in shared memory. Third, we optimize the parallel I/O by making each reader read data as continuously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has tremendous performance advantages relative to unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.
