The search functionality is under construction.

Keyword Search Result

[Keyword] stage(117hit)

1-20hit(117hit)

  • FA-YOLO: A High-Precision and Efficient Method for Fabric Defect Detection in Textile Industry Open Access

    Kai YU  Wentao LYU  Xuyi YU  Qing GUO  Weiqiang XU  Lu ZHANG  

     
    PAPER-Neural Networks and Bioengineering

      Publicized:
    2023/09/04
      Vol:
    E107-A No:6
      Page(s):
    890-898

    Automatic defect detection in fabric images is an essential task in the textile industry. However, detecting defects in fabric images has some inherent difficulties, such as the complexity of the background and the highly uneven scales of defects. Moreover, the trade-off between accuracy and speed should be considered in real applications. To address these problems, we propose a novel model based on YOLOv4 to detect defects in fabric images, called Feature Augmentation YOLO (FA-YOLO). In terms of network structure, FA-YOLO adds an additional detection head to improve the detection of small defects and builds a powerful Neck structure to enhance feature fusion. First, to reduce information loss during feature fusion, we perform residual feature augmentation (RFA) on the features after dimensionality reduction by 1×1 convolution. Afterward, the attention module (SimAM) is embedded into the locations with rich features to improve adaptation to complex backgrounds. Adaptive spatial feature fusion (ASFF) is also applied to the output of the Neck to filter inconsistencies across layers. Finally, the cross-stage partial (CSP) structure is introduced for optimization. Experimental results on three real industrial datasets, namely the Tianchi fabric dataset (72.5% mAP), the ZJU-Leaper fabric dataset (average F1-score of 0.714) and the NEU-DET steel dataset (77.2% mAP), demonstrate that the proposed FA-YOLO achieves competitive results compared to other state-of-the-art (SoTA) methods.

  • Two-Path Object Knowledge Injection for Detecting Novel Objects With Single-Stage Dense Detector

    KuanChao CHU  Hideki NAKAYAMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/08/02
      Vol:
    E106-D No:11
      Page(s):
    1868-1880

    We present an effective system for integrating generative zero-shot classification modules into a YOLO-like dense detector to detect novel objects. Most two-stage novel object detection methods work by refining the classification output branch and cannot be applied to a dense detector. Our system utilizes two paths to inject knowledge of novel objects into a dense detector. The first path injects the class confidence for novel classes from a classifier trained on data synthesized via a dual-step generator. This generator learns a mapping function between two feature spaces, resulting in better classification performance. The second path re-trains the detector head with feature maps synthesized at different intensity levels. This approach significantly increases the predicted objectness for novel objects, which is a major challenge for a dense detector. We also introduce a stop-and-reload mechanism during re-training that optimizes across head layers to better learn synthesized features. Our method relaxes the constraint that the previous method places on the detector head architecture and markedly improves performance on the MSCOCO dataset.

  • Multi-Stage Contour Primitive of Interest Extraction Network with Dense Direction Classification

    Jinyan LU  Quanzhen HUANG  Shoubing LIU  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2022/07/06
      Vol:
    E105-D No:10
      Page(s):
    1743-1750

    For intelligent vision measurement, geometric image feature extraction is an essential issue. A contour primitive of interest (CPI) is a regular-shaped contour feature lying on a target object, and it is widely used for geometric calculation in vision measurement and servoing. So that the CPI extraction model can be applied flexibly to different novel objects, one-shot learning-based CPI extraction can be implemented with a deep convolutional neural network, using only one annotated support image to guide the CPI extraction process. In this paper, we propose a multi-stage contour primitive of interest extraction network (MS-CPieNet), which uses a multi-stage strategy to improve the discrimination between CPIs and complex backgrounds. In addition, a spatial non-local attention module is utilized to enhance the deep features by globally fusing the image features over both short and long ranges. Moreover, a dense 4-direction classification is designed to obtain the normal direction of the contour, and the directions can further be used in the contour-thinning post-process. The effectiveness of the proposed methods is validated by experiments on the OCP and ROCM datasets. 2-D measurement experiments are conducted to demonstrate the convenient application of the proposed MS-CPieNet.

  • Smaller Residual Network for Single Image Depth Estimation

    Andi HENDRA  Yasushi KANAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/08/17
      Vol:
    E104-D No:11
      Page(s):
    1992-2001

    We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward, employing a two-stage architecture: a residual network and a simple decoder network. Our residual network is a remodeled version of the original ResNet-50 architecture, which consists of only thirty-eight convolution layers in the residual block followed by a pair of two up-sampling and layers. The simple decoder network, a stack of five convolution layers, accepts the initial depth and refines it into the final output depth. During training, we monitor the loss behavior and adjust the learning rate hyperparameter in order to improve the performance. Furthermore, instead of using a single common pixel-wise loss, we also compute losses based on gradient direction and structural similarity. This setting in our network can significantly reduce the number of network parameters and simultaneously yield a more accurate image depth map. The performance of our approach has been evaluated through both quantitative and qualitative comparisons with several prior related methods on the publicly available NYU and KITTI datasets.

  • RVCoreP: An Optimized RISC-V Soft Processor of Five-Stage Pipelining

    Hiromu MIYAZAKI  Takuto KANAMORI  Md Ashraful ISLAM  Kenji KISE  

     
    PAPER-Computer System

      Publicized:
    2020/09/07
      Vol:
    E103-D No:12
      Page(s):
    2494-2503

    RISC-V is a RISC-based, open and royalty-free instruction set architecture that has been developed since 2010 and can be used for cost-effective soft processors on FPGAs. The basic 32-bit integer instruction set in RISC-V is defined as RV32I, which is sufficient to support an operating system environment and is well suited to embedded systems. In this paper, we propose an optimized RV32I soft processor named RVCoreP that adopts five-stage pipelining. Three effective methods are applied to the processor to improve the operating frequency: instruction fetch unit optimization, ALU optimization, and data memory optimization. We implement RVCoreP in Verilog HDL and verify its behavior using Verilog simulation and an actual Xilinx Artix-7 FPGA board. We evaluate IPC (instructions per cycle), operating frequency, hardware resource utilization, and processor performance. From the evaluation results, we show that RVCoreP achieves a 30.0% performance improvement compared with VexRiscv, a high-performance, open-source RV32I processor selected from related works.

  • Single Stage Vehicle Logo Detector Based on Multi-Scale Prediction

    Junxing ZHANG  Shuo YANG  Chunjuan BO  Huimin LU  

     
    PAPER-Pattern Recognition

      Publicized:
    2020/07/14
      Vol:
    E103-D No:10
      Page(s):
    2188-2198

    Vehicle logo detection technology is one of the research directions in the application of intelligent transportation systems. It is an important extension of detection technology based on license plates and motorcycle types. A vehicle logo is characterized by uniqueness, conspicuousness, and diversity. Therefore, thorough research is important in both theory and application. Although there are some related works on object detection, most of them cannot achieve real-time detection in different scenes. Meanwhile, some single-stage real-time detection methods perform poorly on small objects. To address the scarcity of training samples, our work constructs a vehicle logo dataset (VLD-45-S) and applies multi-stage pre-training, multi-scale prediction, feature fusion between deeper and shallower layers, dimension clustering of the bounding boxes, and multi-scale detection training. While maintaining speed, this article improves the detection precision for vehicle logos. The generalization of the detection model and its anti-interference capability in real scenes are improved by data enrichment. Experimental results show that the accuracy and speed of the detection algorithm are improved for small objects.

  • Combining Siamese Network and Regression Network for Visual Tracking

    Yao GE  Rui CHEN  Ying TONG  Xuehong CAO  Ruiyu LIANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2020/05/13
      Vol:
    E103-D No:8
      Page(s):
    1924-1927

    We combine the siamese network and the recurrent regression network, proposing a two-stage tracking framework termed SiamReg. Our method solves the problem that the classic siamese network cannot judge the target size precisely, and it simplifies the regression procedures in the training and testing processes. We perform experiments on three challenging tracking datasets: VOT2016, OTB100, and VOT2018. The results indicate that, after offline training, SiamReg can obtain a higher expected average overlap measure.

  • A Two-Stage Crack Detection Method for Concrete Bridges Using Convolutional Neural Networks

    Yundong LI  Weigang ZHAO  Xueyan ZHANG  Qichen ZHOU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2018/09/05
      Vol:
    E101-D No:12
      Page(s):
    3249-3252

    Crack detection is a vital task in maintaining a bridge's health and safety. Traditional computer-vision-based methods are easily disturbed by noise and clutter in real bridge inspections. To address this limitation, we propose in this letter a two-stage crack detection approach based on Convolutional Neural Networks (CNNs). A predictor with a small receptive field is exploited in the first detection stage, while another predictor with a large receptive field is used to refine the detection results in the second stage. Benefiting from the fusion of the confidence maps produced by both predictors, our method can accurately predict the probability that each pixel belongs to a cracked area. Experimental results show that the proposed method is superior to an up-to-date method on real concrete surface images.

  • Hybrid Mechanism to Detect Paroxysmal Stage of Atrial Fibrillation Using Adaptive Threshold-Based Algorithm with Artificial Neural Network

    Mohamad Sabri bin SINAL  Eiji KAMIOKA  

     
    PAPER-Biological Engineering

      Publicized:
    2018/03/14
      Vol:
    E101-D No:6
      Page(s):
    1666-1676

    Automatic detection of heart cycle abnormalities in long-duration ECG data is a crucial technique for diagnosing early-stage heart disease. Concretely, Paroxysmal stage of Atrial Fibrillation rhythms (ParAF) must be discriminated from Normal Sinus rhythms (NS). The two waveforms are very similar in ECG data, and thus it is difficult to detect the Paroxysmal stage of Atrial Fibrillation rhythms completely. Previous studies have tried to solve this issue, and some of them achieved the discrimination with a high degree of accuracy. However, their accuracies do not reach 100%. In addition, no research has achieved it for long-duration ECG data, e.g. 12 hours. In this study, a new mechanism to tackle these issues is proposed: the “Door-to-Door” algorithm is introduced to accurately and quickly detect significant peaks of the heart cycle in 12 hours of ECG data and to discriminate obvious ParAF rhythms from NS rhythms. In addition, a quantitative method using an Artificial Neural Network (ANN), which discriminates unobvious ParAF rhythms from NS rhythms, is investigated. The performance evaluation of the Door-to-Door algorithm revealed that it achieves an accuracy of 100% in detecting the significant peaks of the heart cycle in 17 NS ECG data. In addition, it was verified that the ANN-based method achieves an accuracy of 100% in discriminating the Paroxysmal stage of 15 Atrial Fibrillation data from 17 NS data. Furthermore, it was confirmed that the computational time of the proposed mechanism is less than half that of the previous study. From these achievements, it is concluded that the proposed mechanism can practically be used to diagnose early-stage heart disease.

  • A Spatiotemporal Statistical Model for Eyeballs of Human Embryos

    Masashi KISHIMOTO  Atsushi SAITO  Tetsuya TAKAKUWA  Shigehito YAMADA  Hiroshi MATSUZOE  Hidekata HONTANI  Akinobu SHIMIZU  

     
    PAPER-Biological Engineering

      Publicized:
    2017/04/17
      Vol:
    E100-D No:7
      Page(s):
    1505-1515

    During the development of a human embryo, the position of the eyes moves medially and caudally in the viscerocranium. A statistical model of this process can play an important role in embryology by facilitating qualitative analyses of change. This paper proposes an algorithm to construct a spatiotemporal statistical model for the eyeballs of a human embryo. The proposed modeling algorithm builds a statistical model of the spatial coordinates of the eyeballs independently for each Carnegie stage (CS) by using principal component analysis (PCA). In the process, a q-Gaussian distribution with a model selection scheme based on the Akaike information criterion is used to handle a non-Gaussian distribution with a small sample size. Subsequently, it seamlessly interpolates the statistical models of neighboring CSs, and we present 10 interpolation methods. We also propose an estimation algorithm for the CS using our spatiotemporal statistical model. A set of images of eyeballs in human embryos from the Kyoto Collection was used to train the model and assess its performance. The modeling results suggested that information geometry-based interpolation under the assumption of a q-Gaussian distribution is the best modeling method. The average error in CS estimation was 0.409. We proposed an algorithm to construct a spatiotemporal statistical model of the eyeballs of a human embryo and tested its performance using the Kyoto Collection.
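
    The model selection step mentioned in this abstract can be illustrated with a minimal sketch of the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of fitted parameters and L is the maximized likelihood. The candidate names, parameter counts, and log-likelihood values below are hypothetical placeholders, not figures from the paper, and the paper's actual selection scheme may differ in detail.

```python
from typing import Dict, Tuple


def aic(num_params: int, log_likelihood: float) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


def select_by_aic(candidates: Dict[str, Tuple[int, float]]) -> str:
    """Return the name of the candidate with the smallest AIC."""
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical candidates: name -> (number of parameters, maximized log-likelihood).
candidates = {
    "gaussian": (2, -41.3),
    "q_gaussian_a": (3, -38.9),
    "q_gaussian_b": (3, -40.2),
}

best = select_by_aic(candidates)
for name, (k, log_l) in candidates.items():
    print(f"{name}: AIC = {aic(k, log_l):.1f}")
print("selected:", best)
```

    The extra parameter of a q-Gaussian candidate is penalized by the 2k term, so it is selected only when its likelihood gain outweighs that penalty.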

  • Power-Supply Rejection Model Analysis of Capacitor-Less LDO Regulator Designs

    Soyeon JOO  Jintae KIM  SoYoung KIM  

     
    PAPER-Electronic Circuits

      Vol:
    E100-C No:5
      Page(s):
    504-512

    This paper presents accurate DC and high frequency power-supply rejection (PSR) models for low drop-out (LDO) regulators using different types of active loads and pass transistors. Based on the proposed PSR model, we suggest design guidelines to achieve a high DC PSR or flat bandwidth (BW) by choosing appropriate active loads and pass transistors. Our PSR model captures the intricate interaction between the error amplifiers (EAs) and the pass devices by redefining the transfer function of the LDO topologies. The accuracy of our model has been verified through SPICE simulation and measurements. Moreover, the measurement results of the LDOs fabricated using the 0.18 µm CMOS process are consistent with the design guidelines suggested in this work.

  • An Efficient Algorithm of Discrete Particle Swarm Optimization for Multi-Objective Task Assignment

    Nannan QIAO  Jiali YOU  Yiqiang SHENG  Jinlin WANG  Haojiang DENG  

     
    PAPER-Distributed system

      Publicized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    2968-2977

    In this paper, a discrete particle swarm optimization method is proposed to solve the multi-objective task assignment problem in a distributed environment. The optimization objectives include the makespan of task execution and the budget incurred by resource occupation. A two-stage approach is designed as follows. In the first stage, several artificial particles are added to the initialized swarm to guide the search direction. In the second stage, we redefine the operators of the discrete PSO to implement addition, subtraction and multiplication. In addition, a fuzzy-cost-based elite selection is used to improve the computational efficiency. Evaluation shows that the proposed algorithm achieves a Pareto improvement over state-of-the-art algorithms.

  • Application of Feature Engineering for Phishing Detection

    Wei ZHANG  Huan REN  Qingshan JIANG  

     
    PAPER

      Publicized:
    2016/01/28
      Vol:
    E99-D No:4
      Page(s):
    1062-1070

    Phishing attacks target financial returns by luring Internet users into exposing their sensitive information. Phishing originated in e-mail fraud, and recently it has also spread through social networks and short message service (SMS), which makes it more widespread. Phishing attacks have drawn great attention because of their high volume and the heavy losses they cause, and many methods have been developed to fight them. However, most studies suffer from low detection accuracy or a high false positive (FP) rate, and phishing attacks continue to confront Internet users. In this paper, we are concerned with feature engineering for improving classification performance in phishing web page detection. We propose a novel anti-phishing framework that employs feature engineering, including feature selection and feature extraction. First, we perform feature selection based on a genetic algorithm (GA) to divide features into critical and non-critical features. Then, the non-critical features are projected to a new feature by feature extraction based on a two-stage projection pursuit (PP) algorithm. Finally, we take the critical features and the new feature as input data to construct the detection model. Our anti-phishing framework does not simply eliminate the non-critical features but uses their projection in the classification process, which differs from the existing literature. Experimental results show that the proposed framework is effective in detecting phishing web pages.

  • An InP-Based 27-GHz-Bandwidth Limiting TIA IC Designed to Suppress Undershoot and Ringing in Its Output Waveform

    Hiroyuki FUKUYAMA  Michihiro HIRATA  Kenji KURISHIMA  Minoru IDA  Masami TOKUMITSU  Shogo YAMANAKA  Munehiko NAGATANI  Toshihiro ITOH  Kimikazu SANO  Hideyuki NOSAKA  Koichi MURATA  

     
    PAPER-Electronic Circuits

      Vol:
    E99-C No:3
      Page(s):
    385-396

    A design scheme for a high-speed differential-input limiting transimpedance amplifier (TIA) was developed. The output-stage amplifier of the TIA is investigated in detail in order to suppress undershoot and ringing in the output waveform. The amplifier also includes a peak detector for the received signal strength indicator (RSSI) output, which is used to control the optical demodulator for differential-phase-shift-keying or differential-quadrature-phase-shift-keying formats. The limiting TIA was fabricated on the basis of 1-µm emitter-width InP-based heterojunction-bipolar-transistor (HBT) IC technology. Its differential gain is 39 dB, its 3-dB bandwidth is 27 GHz, and its estimated differential transimpedance gain is 73 dBΩ. The obtained output waveform shows that the developed design scheme is effective for suppressing undershoot and ringing.
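
    As a reading aid for the figures quoted in this abstract, the following sketch converts the decibel values to linear quantities. It assumes the usual 20·log10 convention for voltage gain and for transimpedance expressed in dBΩ (referenced to 1 Ω); the abstract itself does not state the convention, and the function name is illustrative.

```python
def from_db20(value_db: float) -> float:
    """Convert a 20*log10 decibel value to a linear ratio."""
    return 10 ** (value_db / 20)


# Figures quoted in the abstract.
differential_gain_db = 39.0      # dB
transimpedance_dbohm = 73.0      # dBΩ, assumed to be referenced to 1 Ω

voltage_gain = from_db20(differential_gain_db)
transimpedance_ohm = from_db20(transimpedance_dbohm)

print(f"Differential gain: about {voltage_gain:.0f} V/V")
print(f"Transimpedance:    about {transimpedance_ohm / 1e3:.2f} kΩ")
```

    Under that assumption, 39 dB corresponds to a gain of roughly 89 V/V and 73 dBΩ to a transimpedance of roughly 4.5 kΩ.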

  • Multistage Function Speculation Adders

    Yinan SUN  Yongpan LIU  Zhibo WANG  Huazhong YANG  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E98-A No:4
      Page(s):
    954-965

    Function speculation design with error recovery mechanisms is quite promising due to its high performance and low area overhead. Previous work has focused on two-stage function speculation and thus lacks a systematic way to address the challenge of the multistage function speculation approach. This paper proposes a multistage function speculation with adaptive predictors and applies it in a novel adder. We deduced the analytical performance and area models for the design and validated them in our experiments. Based on those models, a general methodology is presented to guide design optimization. Both analytical proofs and experimental results on the fabricated chips show that the proposed adder's delay and area have a logarithmic and linear relationship with its bit number, respectively. Compared with the DesignWare IP, the proposed adder provides the same performance with 6-17% area reduction under different bit lengths.

  • Reference-Free Deterministic Calibration of Pipelined ADC

    Takashi OSHIMA  Taizo YAMAWAKI  

     
    PAPER-Analog Signal Processing

      Vol:
    E98-A No:2
      Page(s):
    665-675

    A novel deterministic digital calibration for pipelined ADCs is proposed and analyzed theoretically. Each MDAC is dithered during the calibration by exploiting its inherent redundancy. The dither enables fast, accurate convergence of the calibration without requiring any accurate reference signal, and hence with minimum area and power overhead. The proposed calibration can be applied to both 1.5-bit/stage MDACs and multi-bit/stage MDACs. Owing to its simple structure and algorithm, it can easily be adapted to background calibration. The effectiveness of the proposed calibration has been confirmed by both extensive simulations and measurement of a prototype 0.13-µm CMOS 50-MS/s pipelined ADC using op-amps with only 37-dB gain. As expected, the proposed calibration improves SNDR and SFDR from 35.5 dB to 58.1 dB and from 37.4 dB to 70.4 dB, respectively.
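
    To put the reported SNDR figures in perspective, the sketch below converts them to effective number of bits (ENOB) using the standard relation ENOB = (SNDR - 1.76 dB) / 6.02 dB, which holds for a full-scale sine-wave input. The abstract does not report ENOB, so these values are derived here for illustration only.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (full-scale sine input)."""
    return (sndr_db - 1.76) / 6.02


# SNDR values quoted in the abstract, before and after calibration.
enob_before = enob_from_sndr(35.5)
enob_after = enob_from_sndr(58.1)

print(f"ENOB before calibration: {enob_before:.2f} bits")
print(f"ENOB after calibration:  {enob_after:.2f} bits")
print(f"Improvement:             {enob_after - enob_before:.2f} bits")
```

    By this relation, the calibration raises the effective resolution from about 5.6 bits to about 9.4 bits.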

  • Resource Allocation for MDC Multicast in CRNs with Imperfect Spectrum Sensing and Channel Feedback

    Shengyu LI  Wenjun XU  Zhihui LIU  Kai NIU  Jiaru LIN  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E98-B No:2
      Page(s):
    335-343

    In this paper, resource-efficient multiple description coding (MDC) multicast is investigated in cognitive radio networks with the consideration of imperfect spectrum sensing and imperfect channel feedback. Our objective is to maximize the system goodput, which is defined as the total successfully received data rate of all multicast users, while guaranteeing the maximum transmit power budget and the maximum average received interference constraint. Owing to the uncertainty of the spectrum state and the non-closed-form expression of the objective function, it is difficult to solve the problem directly. To circumvent this problem, a pretreatment is performed, in which we first estimate the real spectrum state of primary users and then propose a Gaussian approximation for the probability density functions of transmission channel gains to simplify the computation of the objective function. Thereafter, a two-stage resource allocation algorithm is presented to accomplish the subcarrier assignment, the optimal transmit channel gain to interference plus noise ratio (T-CINR) setting, and the transmit power allocation separately. Simulation results show that the proposed scheme is able to offset more than 80% of the performance loss caused by imperfect channel feedback when the feedback error is not high, while keeping the average interference on primary users below the prescribed threshold.

  • Disaster Recovery for Transport Network through Multiple Restoration Stages

    Shohei KAMAMURA  Daisaku SHIMAZAKI  Kouichi GENDA  Koji SASAYAMA  Yoshihiko UEMATSU  

     
    PAPER-Network System

      Vol:
    E98-B No:1
      Page(s):
    171-179

    This paper proposes a disaster recovery method for transport networks. In a disaster recovery scenario, a network is repaired through multiple restoration stages because repair resources are limited. In practice, a network should maintain the reachability of important traffic in the transient stages while suppressing the service interruption risks and/or operational overheads caused by switching transport paths. We therefore define a multi-objective optimization problem: maximizing the traffic recovery ratio and minimizing the number of switched transport paths at each stage. We formulate our problem as a linear program and show that it yields Pareto-optimal solutions of traffic recovery versus the number of switched paths. We also propose a heuristic algorithm applicable to networks consisting of a few hundred nodes and show that it can produce sub-optimal solutions that differ only slightly from the optimal solutions.
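
    The notion of Pareto optimality used in this abstract can be sketched with a brute-force filter over candidate restoration plans. This is not the paper's linear programming formulation or its heuristic; it only illustrates which plans survive when the traffic recovery ratio is maximized and the number of switched paths is minimized. The candidate values are hypothetical.

```python
from typing import List, Tuple

# A plan is (traffic recovery ratio, number of switched transport paths).
Plan = Tuple[float, int]


def dominates(a: Plan, b: Plan) -> bool:
    """True if plan a is at least as good as b on both objectives and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical candidate plans for one restoration stage.
candidates = [(0.60, 2), (0.80, 5), (0.80, 7), (0.95, 9), (0.70, 6)]
front = pareto_front(candidates)
print(front)
```

    Each surviving plan represents a different trade-off: recovering more traffic requires switching more paths, so none of them can be improved on one objective without worsening the other.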

  • Temperature-Aware Layer Assignment for Three-Dimensional Integrated Circuits

    Shih-Hsu HUANG  Hua-Hsin YEH  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:8
      Page(s):
    1699-1708

    Because dielectrics between active layers have low thermal conductivities, there is a demand to reduce the temperature increase in three-dimensional integrated circuits (3D ICs). This paper demonstrates that, in the design of 3D ICs, different layer assignments often lead to different temperature increases. Based on this observation, we are motivated to perform temperature-aware layer assignment. Our work includes two parts. Firstly, an integer linear programming (ILP) approach that guarantees a minimum temperature increase is proposed. Secondly, a polynomial-time heuristic algorithm that reduces the temperature increase is proposed. Compared with the previous work, which does not take the temperature increase into account, the experimental results show that both our ILP approach and our heuristic algorithm produce a significant reduction in the temperature increase with a very small area overhead.

  • Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach

    Seng KHEANG  Kouichi KATSURADA  Yurie IRIBE  Tsuneo NITTA  

     
    PAPER-Speech and Hearing

      Vol:
    E97-D No:4
      Page(s):
    901-910

    To achieve high-quality output in speech synthesis systems, data-driven grapheme-to-phoneme (G2P) conversion is usually used to generate the phonetic transcription of out-of-vocabulary (OOV) words. To improve the performance of G2P conversion, this paper deals with the problem of conflicting phonemes, where an input grapheme can, in the same context, produce many possible output phonemes at the same time. To this end, we propose a two-stage neural network-based approach that converts the input text to phoneme sequences in the first stage and then predicts each output phoneme in the second stage using the phonemic information obtained. The first-stage neural network is fundamentally implemented as a many-to-many mapping model for automatic conversion of words to phoneme sequences, while the second stage uses a combination of the obtained phoneme sequences to predict the output phoneme corresponding to each input grapheme in a given word. We evaluate the performance of this approach using the American English word-based pronunciation dictionary known as the auto-aligned CMUDict corpus[1]. In terms of phoneme and word accuracy on the OOV words, comparison with several proposed baseline approaches shows that our proposed approach improves on the previous one-stage neural network-based approach for G2P conversion. Comparison with another existing approach indicates that it provides higher phoneme accuracy but lower word accuracy on a general dataset, and slightly higher phoneme and word accuracy on a selection of words containing more than one phoneme conflict.