The search functionality is under construction.

Keyword Search Result

[Keyword] vision(776hit)

21-40hit(776hit)

  • Motion Parameter Estimation Based on Overlapping Elements for TDM-MIMO FMCW Radar

    Feng TIAN  Wan LIU  Weibo FU  Xiaojun HUANG  

     
    PAPER-Sensing

    Publicized: 2023/02/06
    Vol: E106-B No:8
    Page(s): 705-713

    Intelligent traffic monitoring provides information support for autonomous driving, which is widely used in intelligent transportation systems (ITSs). A method for estimating vehicle moving target parameters based on millimeter-wave radars is proposed to solve the problem of low detection accuracy due to velocity ambiguity and Doppler-angle coupling in the process of traffic monitoring. First of all, a MIMO antenna array with overlapping elements is constructed by introducing them into the typical design of MIMO radar array antennas. The motion-induced phase errors are eliminated by the phase difference among the overlapping elements. Then, the position errors among them are corrected through an iterative method, and the angle of multiple targets is estimated. Finally, velocity disambiguation is performed by adopting the error-corrected phase difference among the overlapping elements. An accurate estimation of vehicle moving target angle and velocity is achieved. Through Monte Carlo simulation experiments, the angle error is 0.1° and the velocity error is 0.1m/s. The simulation results show that the method can be used to effectively solve the problems related to velocity ambiguity and Doppler-angle coupling, meanwhile the accuracy of velocity and angle estimation can be improved. An improved algorithm is tested on the vehicle datasets that are gathered in the forward direction of ordinary public scenes of a city. The experimental results further verify the feasibility of the method, which meets the real-time and accuracy requirements of ITSs on vehicle information monitoring.
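
The velocity ambiguity mentioned in this abstract can be illustrated with the standard FMCW relation v_max = λ / (4 · N_tx · T_c), where time-division multiplexing over N_tx transmitters stretches the effective chirp repetition interval. The sketch below is not the authors' method; the 77 GHz carrier, 50 µs chirp period, 3 transmitters, and the function names are assumptions chosen only for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_unambiguous_velocity(carrier_hz, chirp_period_s, num_tx=1):
    """Largest radial speed measurable without Doppler aliasing.

    With TDM-MIMO, each transmitter fires once every num_tx chirps, so the
    effective sampling interval of the Doppler phase is num_tx * chirp_period_s.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return wavelength / (4.0 * num_tx * chirp_period_s)


def aliased_velocity(true_velocity, v_max):
    """Wrap a true radial velocity into the interval [-v_max, v_max)."""
    return (true_velocity + v_max) % (2.0 * v_max) - v_max


# Assumed, illustrative parameters (not taken from the paper).
v_single = max_unambiguous_velocity(77e9, 50e-6)          # one transmitter
v_tdm = max_unambiguous_velocity(77e9, 50e-6, num_tx=3)   # three TDM transmitters
measured = aliased_velocity(15.0, v_tdm)                  # what the radar would report

print(f"v_max, 1 Tx: {v_single:.2f} m/s; v_max, 3 Tx TDM: {v_tdm:.2f} m/s")
print(f"true 15.00 m/s is observed as {measured:.2f} m/s")
```

A target moving faster than the TDM limit is reported at a wrapped velocity that differs from the true one by a whole multiple of 2 · v_max, which is the ambiguity a disambiguation step has to resolve.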

  • Multi-Target Recognition Utilizing Micro-Doppler Signatures with Limited Supervision

    Jingyi ZHANG  Kuiyu CHEN  Yue MA  

     
    BRIEF PAPER-Electronic Instrumentation and Control

    Publicized: 2023/03/06
    Vol: E106-C No:8
    Page(s): 454-457

    Previously, convolutional neural networks have made tremendous progress in target recognition based on micro-Doppler radar. However, these studies only considered the presence of one target at a time in the surveillance area. Simultaneous multi-targets recognition for surveillance radar remains a pretty challenging issue. To alleviate this issue, this letter develops a multi-instance multi-label (MIML) learning strategy, which can automatically locate the crucial input patterns that trigger the labels. Benefitting from its powerful target-label relation discovery ability, the proposed framework can be trained with limited supervision. We emphasize that only echoes from single targets are involved in training data, avoiding the preparation and annotation of multi-targets echo in the training stage. To verify the validity of the proposed method, we model two representative ground moving targets, i.e., person and wheeled vehicles, and carry out numerous comparative experiments. The result demonstrates that the developed framework can simultaneously recognize multiple targets and is also robust to variation of the signal-to-noise ratio (SNR), the initial position of targets, and the difference in scattering coefficient.

  • Temporal-Based Action Clustering for Motion Tendencies

    Xingyu QIAN  Xiaogang CHEN  Aximu YUEMAIER  Shunfen LI  Weibang DAI  Zhitang SONG  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2023/05/02
    Vol: E106-D No:8
    Page(s): 1292-1295

    Video-based action recognition encompasses the recognition of appearance and the classification of action types. This work proposes a discrete-temporal-sequence-based motion tendency clustering framework to implement motion clustering by extracting motion tendencies and self-supervised learning. A published traffic intersection dataset (inD) and a self-produced gesture video set are used for evaluation and to validate the motion tendency action recognition hypothesis.

  • Space Division Multiplexing Using High-Luminance Cell-Size Reduction Arrangement for Low-Luminance Smartphone Screen to Camera Uplink Communication

    Alisa KAWADE  Wataru CHUJO  Kentaro KOBAYASHI  

     
    PAPER

    Publicized: 2022/11/01
    Vol: E106-A No:5
    Page(s): 793-802

    To simultaneously enhance data rate and physical layer security (PLS) for low-luminance smartphone screen to camera uplink communication, space division multiplexing using high-luminance cell-size reduction arrangement is numerically analyzed and experimentally verified. The uplink consists of a low-luminance smartphone screen and an indoor telephoto camera at a long distance of 3.5 meters. The high-luminance cell-size reduction arrangement avoids the influence of spatial inter-symbol interference (ISI) and ambient light to obtain a stable low-luminance screen. To reduce the screen luminance without decreasing the screen pixel value, the arrangement reduces only the high-luminance cell area while keeping the cell spacing. In this study, two technical issues related to high-luminance cell-size reduction arrangement are solved. First, a numerical analysis and experimental results show that the high-luminance cell-size reduction arrangement is more effective in reducing the spatial ISI at low luminance than the conventional low-luminance cell arrangement. Second, from the viewpoint of PLS enhancement at wide angles, the symbol error rate should be low in front of the screen and high at wide angles. A numerical analysis and experimental results show that the high-luminance cell-size reduction arrangement is more suitable for enhancing PLS at wide angles than the conventional low-luminance cell arrangement.

  • Computer Vision-Based Tracking of Workers in Construction Sites Based on MDNet

    Wen LIU  Yixiao SHAO  Shihong ZHAI  Zhao YANG  Peishuai CHEN  

     
    PAPER-Smart Industry

    Publicized: 2022/10/20
    Vol: E106-D No:5
    Page(s): 653-661

    Automatic continuous tracking of objects involved in a construction project is required for such tasks as productivity assessment, unsafe behavior recognition, and progress monitoring. Many computer-vision-based tracking approaches have been investigated and successfully tested on construction sites; however, their practical applications are hindered by the tracking accuracy limited by the dynamic, complex nature of construction sites (i.e. clutter with background, occlusion, varying scale and pose). To achieve better tracking performance, a novel deep-learning-based tracking approach called the Multi-Domain Convolutional Neural Networks (MD-CNN) is proposed and investigated. The proposed approach consists of two key stages: 1) multi-domain representation of learning; and 2) online visual tracking. To evaluate the effectiveness and feasibility of this approach, it is applied to a metro project in Wuhan China, and the results demonstrate good tracking performance in construction scenarios with complex background. The average distance error and F-measure for the MDNet are 7.64 pixels and 67, respectively. The results demonstrate that the proposed approach can be used by site managers to monitor and track workers for hazard prevention in construction sites.

  • Post-Processing of Iterative Estimation and Cancellation Scheme for Clipping Noise in OFDM Systems

    Kee-Hoon KIM  Chanki KIM  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2022/09/30
    Vol: E106-B No:4
    Page(s): 352-358

    Clipping is an efficient and simple method that can reduce the peak-to-average power ratio (PAPR) of orthogonal frequency division multiplexing (OFDM) signals. However, clipping causes in-band distortion referred to as clipping noise. To resolve this problem, a novel iterative estimation and cancellation (IEC) scheme for clipping noise is one of the most popular schemes because it can significantly improve the performance of clipped OFDM systems. However, IEC exploits detected symbols at the receiver to estimate the clipping noise in principle and the detected symbols are not the sufficient statistic in terms of estimation theory. In this paper, we propose the post-processing technique of IEC, which fully exploits given sufficient statistic at the receiver and thus further enhances the performance of a clipped OFDM system as verified by simulations.
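
As a rough illustration of the clipping operation and the resulting clipping noise described in this abstract, the sketch below builds one OFDM symbol, clips its amplitude, and compares the PAPR before and after. It does not implement IEC or the proposed post-processing; the 64 subcarriers, QPSK mapping, clipping ratio of 1.2, and all function names are assumptions made for this example.

```python
import cmath
import math
import random


def idft(freq):
    """Naive unitary inverse DFT: frequency-domain symbols to time-domain samples."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / math.sqrt(n)
        for t in range(n)
    ]


def papr_db(signal):
    """Peak-to-average power ratio in dB."""
    powers = [abs(s) ** 2 for s in signal]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def clip(signal, clipping_ratio):
    """Limit each sample's amplitude to clipping_ratio * RMS, keeping its phase."""
    rms = math.sqrt(sum(abs(s) ** 2 for s in signal) / len(signal))
    threshold = clipping_ratio * rms
    clipped = [s if abs(s) <= threshold else threshold * s / abs(s) for s in signal]
    return clipped, threshold


random.seed(0)  # fixed seed so the run is repeatable
qpsk = [
    complex(random.choice((-1, 1)), random.choice((-1, 1))) / math.sqrt(2)
    for _ in range(64)
]
x = idft(qpsk)
x_clipped, threshold = clip(x, clipping_ratio=1.2)

# Clipping noise: the distortion term a receiver-side scheme would try to estimate.
clipping_noise = [c - s for c, s in zip(x_clipped, x)]

print(f"PAPR before clipping: {papr_db(x):.2f} dB")
print(f"PAPR after clipping:  {papr_db(x_clipped):.2f} dB")
```

The `clipping_noise` list is zero wherever the sample was below the threshold, which is why the distortion depends on the transmitted symbols and can in principle be re-estimated from detected symbols at the receiver.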

  • Learning Multi-Level Features for Improved 3D Reconstruction

    Fairuz SAFWAN MAHAD  Masakazu IWAMURA  Koichi KISE  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2022/12/08
    Vol: E106-D No:3
    Page(s): 381-390

    3D reconstruction methods using neural networks are popular and have been studied extensively. However, the resulting models typically lack detail, reducing the quality of the 3D reconstruction. This is because the network is not designed to capture the fine details of the object. Therefore, in this paper, we propose two networks designed to capture both the coarse and fine details of the object to improve the reconstruction of the detailed parts of the object. To accomplish this, we design two networks. The first network uses a multi-scale architecture with skip connections to associate and merge features from other levels. For the second network, we design a multi-branch deep generative network that separately learns the local features, generic features, and the intermediate features through three different tailored components. In both network architectures, the principle entails allowing the network to learn features at different levels that can reconstruct the fine parts and the overall shape of the reconstructed 3D model. We show that both of our methods outperformed state-of-the-art approaches.

  • Image and Model Transformation with Secret Key for Vision Transformer

    Hitoshi KIYA  Ryota IIJIMA  Aprilpyone MAUNGMAUNG  Yuma KINOSHITA  

     
    INVITED PAPER

    Publicized: 2022/11/02
    Vol: E106-D No:1
    Page(s): 2-11

    In this paper, we propose a combined use of transformed images and vision transformer (ViT) models transformed with a secret key. We show for the first time that models trained with plain images can be directly transformed to models trained with encrypted images on the basis of the ViT architecture, and the performance of the transformed models is the same as models trained with plain images when using test images encrypted with the key. In addition, the proposed scheme does not require any specially prepared data for training models or network modification, so it also allows us to easily update the secret key. In an experiment, the effectiveness of the proposed scheme is evaluated in terms of performance degradation and model protection performance in an image classification task on the CIFAR-10 dataset.

  • Vehicle Re-Identification Based on Quadratic Split Architecture and Auxiliary Information Embedding

    Tongwei LU  Hao ZHANG  Feng MIN  Shihai JIA  

     
    LETTER-Image

    Publicized: 2022/05/24
    Vol: E105-A No:12
    Page(s): 1621-1625

    Convolutional neural network (CNN) based vehicle re-identification (ReID) inevitably has many disadvantages, such as information loss caused by the downsampling operation. Therefore, we propose a vision transformer (ViT) based vehicle ReID method to solve this problem. To improve the feature representation of the vision transformer and make full use of additional vehicle information, the following methods are presented. (I) We propose a Quadratic Split Architecture (QSA) to learn both global and local features. More precisely, we split an image into many patches as the “global part” and further split them into smaller sub-patches as the “local part”. Features of both the global and local parts are aggregated to enhance the representation ability. (II) The Auxiliary Information Embedding (AIE) is proposed to improve the robustness of the model by plugging a learnable camera/viewpoint embedding into ViT. Experimental results on several benchmarks indicate that our method is superior to many advanced vehicle ReID methods.
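
The two-level split described in (I) can be sketched on a toy grid: an image is first cut into patches (the "global part"), and each patch is cut again into sub-patches (the "local part"). This only shows the splitting step, not the feature extraction or aggregation; the 8×8 image, patch sizes of 4 and 2, and the helper name `split_into_patches` are assumptions for illustration.

```python
def split_into_patches(grid, size):
    """Cut a 2-D list into non-overlapping size x size blocks, in row-major order."""
    rows, cols = len(grid), len(grid[0])
    return [
        [row[c:c + size] for row in grid[r:r + size]]
        for r in range(0, rows, size)
        for c in range(0, cols, size)
    ]


# Toy 8x8 "image" whose pixel value encodes its position (row * 8 + col).
image = [[r * 8 + c for c in range(8)] for r in range(8)]

# First split: four 4x4 patches (the "global part").
global_patches = split_into_patches(image, 4)

# Second split: each patch becomes four 2x2 sub-patches (the "local part").
local_patches = [split_into_patches(patch, 2) for patch in global_patches]

print(len(global_patches), "patches,", len(local_patches[0]), "sub-patches per patch")
```

In a transformer pipeline each patch and sub-patch would then be flattened and embedded as a token; that step is omitted here.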

  • Multi-Stage Contour Primitive of Interest Extraction Network with Dense Direction Classification

    Jinyan LU  Quanzhen HUANG  Shoubing LIU  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2022/07/06
    Vol: E105-D No:10
    Page(s): 1743-1750

    For intelligent vision measurement, geometric image feature extraction is an essential issue. A contour primitive of interest (CPI) is a regular-shaped contour feature lying on a target object, which is widely used for geometric calculation in vision measurement and servoing. So that the CPI extraction model can be flexibly applied to different novel objects, one-shot learning based CPI extraction can be implemented with a deep convolutional neural network, using only one annotated support image to guide the CPI extraction process. In this paper, we propose a multi-stage contour primitives of interest extraction network (MS-CPieNet), which uses the multi-stage strategy to improve the ability to discriminate the CPI from complex background. Second, the spatial non-local attention module is utilized to enhance the deep features, by globally fusing the image features with both short and long ranges. Moreover, the dense 4-direction classification is designed to obtain the normal direction of the contour, and the directions can be further used for the contour thinning post-process. The effectiveness of the proposed methods is validated by experiments with the OCP and ROCM datasets. 2-D measurement experiments are conducted to demonstrate the convenient application of the proposed MS-CPieNet.

  • Integral Cryptanalysis on Reduced-Round KASUMI

    Nobuyuki SUGIO  Yasutaka IGARASHI  Sadayuki HONGO  

     
    PAPER-Cryptography and Information Security

    Publicized: 2022/04/22
    Vol: E105-A No:9
    Page(s): 1309-1316

    Integral cryptanalysis is one of the most powerful attacks on symmetric key block ciphers. Attackers preliminarily search integral characteristics of a target cipher and use them to perform the key recovery attack. Todo proposed a novel technique named the bit-based division property to find integral characteristics. Xiang et al. extended the Mixed Integer Linear Programming (MILP) method to search integral characteristics of lightweight block ciphers based on the bit-based division property. In this paper, we apply these techniques to the symmetric key block cipher KASUMI which was developed by modifying MISTY1. As a result, we found new 4.5-round characteristics of KASUMI for the first time. We show that 7-round KASUMI is attackable with 2^63 data and 2^120 encryptions.

  • Single Suction Grasp Detection for Symmetric Objects Using Shallow Networks Trained with Synthetic Data

    Suraj Prakash PATTAR  Tsubasa HIRAKAWA  Takayoshi YAMASHITA  Tetsuya SAWANOBORI  Hironobu FUJIYOSHI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2022/06/21
    Vol: E105-D No:9
    Page(s): 1600-1609

    Predicting the grasping point accurately and quickly is crucial for successful robotic manipulation. However, to commercially deploy a robot, such as a dishwasher robot in a commercial kitchen, we also need to consider the constraints of limited usable resources. We present a deep learning method to predict the grasp position when using a single suction gripper for picking up objects. The proposed method is based on a shallow network to enable lower training costs and efficient inference on limited resources. Costs are further reduced by collecting data in a custom-built synthetic environment. For evaluating the proposed method, we developed a system that models a commercial kitchen for a dishwasher robot to manipulate symmetric objects. We tested our method against a model-fitting method and an algorithm-based method in our developed commercial kitchen environment and found that a shallow network trained with only the synthetic data achieves high accuracy. We also demonstrate the practicality of using a shallow network in sequence with an object detector for ease of training, prediction speed, low computation cost, and easier debugging.

  • A Large-Scale SCMA Codebook Optimization and Codeword Allocation Method

    Shiqing QIAN  Wenping GE  Yongxing ZHANG  Pengju ZHANG  

     
    PAPER-Fundamental Theories for Communications

    Publicized: 2021/12/24
    Vol: E105-B No:7
    Page(s): 788-796

    Sparse code division multiple access (SCMA) is a non-orthogonal multiple access (NOMA) technology that can improve frequency band utilization and allow many users to share quite a few resource elements (REs). This paper uses the modulation of lattice theory to develop a systematic construction procedure for the design of SCMA codebooks under Gaussian channel environments that can achieve near-optimal designs, especially for cases that consider large-scale SCMA parameters. However, under the condition of large-scale SCMA parameters, the mother constellation (MC) points will overlap, which can be solved by the method of the partial dimensions transformation (PDT). More importantly, we consider the upper bounded error probability of the signal transmission in the AWGN channels, and design a codeword allocation method to reduce the inter symbol interference (ISI) on the same RE. Simulation results show that under different codebook sizes and different overload rates, using two different message passing algorithms (MPA) to verify, the codebook proposed in this paper has a bit error rate (BER) significantly better than the reference codebooks, moreover the convergence time does not exceed that of the reference codebooks.

  • Anomaly Detection Using Spatio-Temporal Context Learned by Video Clip Sorting

    Wen SHAO  Rei KAWAKAMI  Takeshi NAEMURA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2022/02/08
    Vol: E105-D No:5
    Page(s): 1094-1102

    Previous studies on anomaly detection in videos have trained detectors in which reconstruction and prediction tasks are performed on normal data so that frames on which their task performance is low will be detected as anomalies during testing. This paper proposes a new approach that involves sorting video clips, by using a generative network structure. Our approach learns spatial contexts from appearances and temporal contexts from the order relationship of the frames. Experiments were conducted on four datasets, and we categorized the anomalous sequences by appearance and motion. Evaluations were conducted not only on each total dataset but also on each of the categories. Our method improved detection performance on both anomalies with different appearance and different motion from normality. Moreover, combining our approach with a prediction method produced improvements in precision at a high recall.

  • Improved Metric Function for AlphaSeq Algorithm to Design Ideal Complementary Codes for Multi-Carrier CDMA Systems

    Shucong TIAN  Meng YANG  Jianpeng WANG  Rui WANG  Avik R. ADHIKARY  

     
    LETTER-Communication Theory and Signals

    Publicized: 2021/11/15
    Vol: E105-A No:5
    Page(s): 901-905

    AlphaSeq is a new paradigm to design sequences with desired properties based on deep reinforcement learning (DRL). In this work, we propose a new metric function and a new reward function to design an improved version of AlphaSeq. We show analytically and also through numerical simulations that the proposed algorithm can discover sequence sets with preferable properties faster than the previous algorithm.

  • BOTDA-Based Technique for Measuring Maximum Loss and Crosstalk at Splice Point in Few-Mode Fibers (Open Access)

    Tomokazu ODA  Atsushi NAKAMURA  Daisuke IIDA  Hiroyuki OSHIDA  

     
    PAPER-Optical Fiber for Communications

    Publicized: 2021/11/05
    Vol: E105-B No:5
    Page(s): 504-511

    We propose a technique based on Brillouin optical time domain analysis for measuring loss and crosstalk in few-mode fibers (FMFs). The proposed technique extracts the loss and crosstalk of a specific mode in FMFs from the Brillouin gains and Brillouin gain coefficients measured under two different conditions in terms of the frequency difference between the pump and probe lights. The technique yields the maximum loss and crosstalk at a splice point by changing the electrical field injected into an FMF as the pump light. Experiments demonstrate that the proposed technique can measure the maximum loss and crosstalk of the LP11 mode at a splice point in a two-mode fiber.

  • Numerical Analysis of Pulse Response for Slanted Grating Structure with an Air Regions in Dispersion Media by TE Case (Open Access)

    Ryosuke OZAKI  Tsuneki YAMASAKI  

     
    BRIEF PAPER

    Publicized: 2021/10/18
    Vol: E105-C No:4
    Page(s): 154-158

    In our previous paper, we have proposed a new numerical technique for transient scattering problem of periodically arrayed dispersion media by using a combination of the fast inversion Laplace transform (FILT) method and Fourier series expansion method (FSEM), and analyzed the pulse response for several widths of the dispersion media or rectangular cavities. From the numerical results, we examined the influence of a periodically arrayed dispersion media with a rectangular cavity on the pulse response. In this paper, we analyzed the transient scattering problem for the case of dispersion media with slanted air regions by utilizing a combination of the FILT, FSEM, and multilayer division method (MDM), and investigated an influence for the slanted angle of an air region. In addition, we verified the computational accuracy for term of the MDM and truncation mode number of the electromagnetic fields.

  • Dual Self-Guided Attention with Sparse Question Networks for Visual Question Answering

    Xiang SHEN  Dezhi HAN  Chin-Chen CHANG  Liang ZONG  

     
    PAPER-Natural Language Processing

    Publicized: 2022/01/06
    Vol: E105-D No:4
    Page(s): 785-796

    Visual Question Answering (VQA) is multi-task research that requires simultaneous processing of vision and text. Recent research on VQA models employs a co-attention mechanism to build a model between the context and the image. However, the features of questions and the modeling of the image region force irrelevant information to be calculated in the model, thus affecting the performance. This paper proposes a novel dual self-guided attention with sparse question networks (DSSQN) to address this issue. The aim is to avoid having irrelevant information calculated into the model when modeling the internal dependencies on both the question and image. Simultaneously, it overcomes the coarse interaction between sparse question features and image features. First, the sparse question self-attention (SQSA) unit in the encoder calculates the feature with the highest weight. From the self-attention learning of question words, the question features of larger weights are reserved. Second, sparse question features are utilized to guide the focus on image features to obtain fine-grained image features, and also to prevent irrelevant information from being calculated into the model. A dual self-guided attention (DSGA) unit is designed to improve modal interaction between questions and images. Third, the sparse question self-attention of the parameter δ is optimized to select these question-related object regions. Our experiments with VQA 2.0 benchmark datasets demonstrate that DSSQN outperforms the state-of-the-art methods. For example, the accuracy of our proposed model on the test-dev and test-std is 71.03% and 71.37%, respectively. In addition, we show through visualization results that our model can pay more attention to important features than other advanced models. At the same time, we also hope that it can promote the development of VQA in the field of artificial intelligence (AI).

  • An Efficient Secure Division Protocol Using Approximate Multi-Bit Product and New Constant-Round Building Blocks (Open Access)

    Keitaro HIWATASHI  Satsuya OHATA  Koji NUIDA  

     
    PAPER-Cryptography and Information Security

    Publicized: 2021/09/28
    Vol: E105-A No:3
    Page(s): 404-416

    Integer division is one of the most fundamental arithmetic operators and is ubiquitously used. However, the existing division protocols in secure multi-party computation (MPC) are inefficient and very complex, and this has been a barrier to applications of MPC such as secure machine learning. We already have some secure division protocols working in Z_{2^n}. However, these existing protocols have the drawbacks that they need many communication rounds and need to use bigger integers than the input/output. In this paper, we improve a secure division protocol in two ways. First, we construct a new protocol using only integers of the same size as the input/output. Second, we build efficient constant-round building blocks used as subprotocols in the division protocol. With these two improvements, communication rounds of our division protocol are reduced to about 36% (87 rounds → 31 rounds) for 64-bit integers in comparison with the most efficient previous one.
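
For readers unfamiliar with MPC over the ring of integers modulo 2^n, the sketch below shows two-party additive secret sharing with n = 64, the setting the abstract refers to. It is not the paper's division protocol: it only shows why addition is cheap (purely local, no communication rounds), which is the baseline against which round-heavy operations such as division are measured. The function names and the two-party setup are assumptions for illustration.

```python
import secrets

N_BITS = 64
MOD = 1 << N_BITS  # arithmetic is done modulo 2^64


def share(value):
    """Split value into two additive shares that sum to value modulo 2^64."""
    first = secrets.randbelow(MOD)
    second = (value - first) % MOD
    return first, second


def reconstruct(first, second):
    """Recombine two shares into the original value."""
    return (first + second) % MOD


def add_shares(x_shares, y_shares):
    """Each party adds its own shares locally; no messages are exchanged."""
    return (
        (x_shares[0] + y_shares[0]) % MOD,
        (x_shares[1] + y_shares[1]) % MOD,
    )


x_shares = share(1000)
y_shares = share(234)
sum_shares = add_shares(x_shares, y_shares)

print("reconstructed sum:", reconstruct(*sum_shares))
```

Each share on its own is a uniformly random 64-bit value, so neither party learns the input; only the combination reveals it.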

  • A Sparsely-Connected OTFS-BFDM System Using Message-Passing Decoding (Open Access)

    Tingyao WU  Zhisong BIE  Celimuge WU  

     
    PAPER-Communication Theory and Signals

    Publicized: 2021/08/27
    Vol: E105-A No:3
    Page(s): 576-583

    The newly proposed orthogonal time frequency space (OTFS) system exhibits excellent error performance on high-Doppler fading channels. However, the rectangular prototype window function (PWF) inherent in OTFS leads to high out-of-band emission (OOBE), which reduces the spectral efficiency in multi-user scenarios. To this end, this paper presents an OTFS system based on bi-orthogonal frequency division multiplexing (OTFS-BFDM) modulation. In OTFS-BFDM systems, PWFs with bi-orthogonal properties can be optimized to provide lower OOBE than OTFS, which is a special case with rectangular PWF. We further derive that the OTFS-BFDM system is sparsely-connected so that the low-complexity message passing (MP) decoding algorithm can be adopted. Moreover, the power spectral density, peak to average power ratio (PAPR) and bit error rate (BER) of the OTFS-BFDM system with different PWFs are compared. Simulation results show that: i) the use of BFDM modulation significantly inhibits the OOBE of OTFS system; ii) the better the frequency-domain localization of PWFs, the smaller the BER and PAPR of OTFS-BFDM system.
