The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] ATI(18690hit)

441-460hit(18690hit)

  • SPSD: Semantics and Deep Reinforcement Learning Based Motion Planning for Supermarket Robot

    Jialun CAI  Weibo HUANG  Yingxuan YOU  Zhan CHEN  Bin REN  Hong LIU  

     
    PAPER-Positioning and Navigation

      Pubricized:
    2022/09/15
      Vol:
    E106-D No:5
      Page(s):
    765-772

    Robot motion planning is an important part of the unmanned supermarket. The challenges of motion planning in supermarkets lie in the diversity of the supermarket environment, the complexity of obstacle movement, the vastness of the search space. This paper proposes an adaptive Search and Path planning method based on the Semantic information and Deep reinforcement learning (SPSD), which effectively improves the autonomous decision-making ability of supermarket robots. Firstly, based on the backbone of deep reinforcement learning (DRL), supermarket robots process real-time information from multi-modality sensors to realize high-speed and collision-free motion planning. Meanwhile, in order to solve the problem caused by the uncertainty of the reward in the deep reinforcement learning, common spatial semantic relationships between landmarks and target objects are exploited to define reward function. Finally, dynamics randomization is introduced to improve the generalization performance of the algorithm in the training. The experimental results show that the SPSD algorithm is excellent in the three indicators of generalization performance, training time and path planning length. Compared with other methods, the training time of SPSD is reduced by 27.42% at most, the path planning length is reduced by 21.08% at most, and the trained network of SPSD can be applied to unfamiliar scenes safely and efficiently. The results are motivating enough to consider the application of the proposed method in practical scenes. We have uploaded the video of the results of the experiment to https://www.youtube.com/watch?v=h1wLpm42NZk.

  • An Improved BPNN Method Based on Probability Density for Indoor Location

    Rong FEI  Yufan GUO  Junhuai LI  Bo HU  Lu YANG  

     
    PAPER-Positioning and Navigation

      Pubricized:
    2022/12/23
      Vol:
    E106-D No:5
      Page(s):
    773-785

    With the widespread use of indoor positioning technology, the need for high-precision positioning services is rising; nevertheless, there are several challenges, such as the difficulty of simulating the distribution of interior location data and the enormous inaccuracy of probability computation. As a result, this paper proposes three different neural network model comparisons for indoor location based on WiFi fingerprint - indoor location algorithm based on improved back propagation neural network model, RSSI indoor location algorithm based on neural network angle change, and RSSI indoor location algorithm based on depth neural network angle change - to raise accurately predict indoor location coordinates. Changing the action range of the activation function in the standard back-propagation neural network model achieves the goal of accurately predicting location coordinates. The revised back-propagation neural network model has strong stability and enhances indoor positioning accuracy based on experimental comparisons of loss rate (loss), accuracy rate (acc), and cumulative distribution function (CDF).

  • Learning Pixel Perception for Identity and Illumination Consistency Face Frontalization in the Wild

    Yongtang BAO  Pengfei ZHOU  Yue QI  Zhihui WANG  Qing FAN  

     
    PAPER-Person Image Generation

      Pubricized:
    2022/06/21
      Vol:
    E106-D No:5
      Page(s):
    794-803

    A frontal and realistic face image was synthesized from a single profile face image. It has a wide range of applications in face recognition. Although the frontal face method based on deep learning has made substantial progress in recent years, there is still no guarantee that the generated face has identity consistency and illumination consistency in a significant posture. This paper proposes a novel pixel-based feature regression generative adversarial network (PFR-GAN), which can learn to recover local high-frequency details and preserve identity and illumination frontal face images in an uncontrolled environment. We first propose a Reslu block to obtain richer feature representation and improve the convergence speed of training. We then introduce a feature conversion module to reduce the artifacts caused by face rotation discrepancy, enhance image generation quality, and preserve more high-frequency details of the profile image. We also construct a 30,000 face pose dataset to learn about various uncontrolled field environments. Our dataset includes ages of different races and wild backgrounds, allowing us to handle other datasets and obtain better results. Finally, we introduce a discriminator used for recovering the facial structure of the frontal face images. Quantitative and qualitative experimental results show our PFR-GAN can generate high-quality and high-fidelity frontal face images, and our results are better than the state-of-art results.

  • Multi-Scale Correspondence Learning for Person Image Generation

    Shi-Long SHEN  Ai-Guo WU  Yong XU  

     
    PAPER-Person Image Generation

      Pubricized:
    2022/04/15
      Vol:
    E106-D No:5
      Page(s):
    804-812

    A generative model is presented for two types of person image generation in this paper. First, this model is applied to pose-guided person image generation, i.e., converting the pose of a source person image to the target pose while preserving the texture of that source person image. Second, this model is also used for clothing-guided person image generation, i.e., changing the clothing texture of a source person image to the desired clothing texture. The core idea of the proposed model is to establish the multi-scale correspondence, which can effectively address the misalignment introduced by transferring pose, thereby preserving richer information on appearance. Specifically, the proposed model consists of two stages: 1) It first generates the target semantic map imposed on the target pose to provide more accurate guidance during the generation process. 2) After obtaining the multi-scale feature map by the encoder, the multi-scale correspondence is established, which is useful for a fine-grained generation. Experimental results show the proposed method is superior to state-of-the-art methods in pose-guided person image generation and show its effectiveness in clothing-guided person image generation.

  • Enhanced Full Attention Generative Adversarial Networks

    KaiXu CHEN  Satoshi YAMANE  

     
    LETTER-Core Methods

      Pubricized:
    2023/01/12
      Vol:
    E106-D No:5
      Page(s):
    813-817

    In this paper, we propose improved Generative Adversarial Networks with attention module in Generator, which can enhance the effectiveness of Generator. Furthermore, recent work has shown that Generator conditioning affects GAN performance. Leveraging this insight, we explored the effect of different normalization (spectral normalization, instance normalization) on Generator and Discriminator. Moreover, an enhanced loss function called Wasserstein Divergence distance, can alleviate the problem of difficult to train module in practice.

  • Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow

    Rebeka SULTANA  Gosuke OHASHI  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/01/26
      Vol:
    E106-D No:5
      Page(s):
    1018-1026

    In recent years, driver's visual attention has been actively studied for driving automation technology. However, the number of models is few to perceive an insight understanding of driver's attention in various moments. All attention models process multi-level image representations by a two-stream/multi-stream network, increasing the computational cost due to an increment of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and use multi-level image representation, this work proposes a single stream driver's visual attention model for a critical situation. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results highlight that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers to process optical flow, and the computational cost compared to a two-stream model.

  • OPENnet: Object Position Embedding Network for Locating Anti-Bird Thorn of High-Speed Railway

    Zhuo WANG  Junbo LIU  Fan WANG  Jun WU  

     
    LETTER-Intelligent Transportation Systems

      Pubricized:
    2022/11/14
      Vol:
    E106-D No:5
      Page(s):
    824-828

    Machine vision-based automatic anti-bird thorn failure inspection, instead of manual identification, remains a great challenge. In this paper, we proposed a novel Object Position Embedding Network (OPENnet), which can improve the precision of anti-bird thorn localization. OPENnet can simultaneously predict the location boxes of the support device and anti-bird thorn by using the proposed double-head network. And then, OPENnet is optimized using the proposed symbiotic loss function (SymLoss), which embeds the object position into the network. The comprehensive experiments are conducted on the real railway video dataset. OPENnet yields competitive performance on anti-bird thorn localization. Specifically, the localization performance gains +3.65 AP, +2.10 AP50, and +1.22 AP75.

  • Clustering-Based Neural Network for Carbon Dioxide Estimation

    Conghui LI  Quanlin ZHONG  Baoyin LI  

     
    LETTER-Intelligent Transportation Systems

      Pubricized:
    2022/08/01
      Vol:
    E106-D No:5
      Page(s):
    829-832

    In recent years, the applications of deep learning have facilitated the development of green intelligent transportation system (ITS), and carbon dioxide estimation has been one of important issues in green ITS. Furthermore, the carbon dioxide estimation could be modelled as the fuel consumption estimation. Therefore, a clustering-based neural network is proposed to analyze clusters in accordance with fuel consumption behaviors and obtains the estimated fuel consumption and the estimated carbon dioxide. In experiments, the mean absolute percentage error (MAPE) of the proposed method is only 5.61%, and the performance of the proposed method is higher than other methods.

  • Effective Language Representations for Danmaku Comment Classification in Nicovideo

    Hiroyoshi NAGAO  Koshiro TAMURA  Marie KATSURAI  

     
    PAPER

      Pubricized:
    2023/01/16
      Vol:
    E106-D No:5
      Page(s):
    838-846

    Danmaku commenting has become popular for co-viewing on video-sharing platforms, such as Nicovideo. However, many irrelevant comments usually contaminate the quality of the information provided by videos. Such an information pollutant problem can be solved by a comment classifier trained with an abstention option, which detects comments whose video categories are unclear. To improve the performance of this classification task, this paper presents Nicovideo-specific language representations. Specifically, we used sentences from Nicopedia, a Japanese online encyclopedia of entities that possibly appear in Nicovideo contents, to pre-train a bidirectional encoder representations from Transformers (BERT) model. The resulting model named Nicopedia BERT is then fine-tuned such that it could determine whether a given comment falls into any of predefined categories. The experiments conducted on Nicovideo comment data demonstrated the effectiveness of Nicopedia BERT compared with existing BERT models pre-trained using Wikipedia or tweets. We also evaluated the performance of each model in an additional sentiment classification task, and the obtained results implied the applicability of Nicopedia BERT as a feature extractor of other social media text.

  • 3D Multiple-Contextual ROI-Attention Network for Efficient and Accurate Volumetric Medical Image Segmentation

    He LI  Yutaro IWAMOTO  Xianhua HAN  Lanfen LIN  Akira FURUKAWA  Shuzo KANASAKI  Yen-Wei CHEN  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/02/21
      Vol:
    E106-D No:5
      Page(s):
    1027-1037

    Convolutional neural networks (CNNs) have become popular in medical image segmentation. The widely used deep CNNs are customized to extract multiple representative features for two-dimensional (2D) data, generally called 2D networks. However, 2D networks are inefficient in extracting three-dimensional (3D) spatial features from volumetric images. Although most 2D segmentation networks can be extended to 3D networks, the naively extended 3D methods are resource-intensive. In this paper, we propose an efficient and accurate network for fully automatic 3D segmentation. Specifically, we designed a 3D multiple-contextual extractor to capture rich global contextual dependencies from different feature levels. Then we leveraged an ROI-estimation strategy to crop the ROI bounding box. Meanwhile, we used a 3D ROI-attention module to improve the accuracy of in-region segmentation in the decoder path. Moreover, we used a hybrid Dice loss function to address the issues of class imbalance and blurry contour in medical images. By incorporating the above strategies, we realized a practical end-to-end 3D medical image segmentation with high efficiency and accuracy. To validate the 3D segmentation performance of our proposed method, we conducted extensive experiments on two datasets and demonstrated favorable results over the state-of-the-art methods.

  • Maximizing External Action with Information Provision Over Multiple Rounds in Online Social Networks

    Masaaki MIYASHITA  Norihiko SHINOMIYA  Daisuke KASAMATSU  Genya ISHIGAKI  

     
    PAPER

      Pubricized:
    2023/02/03
      Vol:
    E106-D No:5
      Page(s):
    847-855

    Online social networks have increased their impact on the real world, which motivates information senders to control the propagation process of information to promote particular actions of online users. However, the existing works on information provisioning seem to oversimplify the users' decision-making process that involves information reception, internal actions of social networks, and external actions of social networks. In particular, characterizing the best practices of information provisioning that promotes the users' external actions is a complex task due to the complexity of the propagation process in OSNs, even when the variation of information is limited. Therefore, we propose a new information diffusion model that distinguishes user behaviors inside and outside of OSNs, and formulate an optimization problem to maximize the number of users who take the external actions by providing information over multiple rounds. Also, we define a robust provisioning policy for the problem, which selects a message sequence to maximize the expected number of desired users under the probabilistic uncertainty of OSN settings. Our experiment results infer that there could exist an information provisioning policy that achieves nearly-optimal solutions in different types of OSNs. Furthermore, we empirically demonstrate that the proposed robust policy can be such a universally optimal solution.

  • Construction of a Support Tool for Japanese User Reading of Privacy Policies and Assessment of its User Impact

    Sachiko KANAMORI  Hirotsune SATO  Naoya TABATA  Ryo NOJIMA  

     
    PAPER

      Pubricized:
    2023/02/08
      Vol:
    E106-D No:5
      Page(s):
    856-867

    To protect user privacy and establish self-information control rights, service providers must notify users of their privacy policies and obtain their consent in advance. The frameworks that impose these requirements are mandatory. Although originally designed to protect user privacy, obtaining user consent in advance has become a mere formality. These problems are induced by the gap between service providers' privacy policies, which prioritize the observance of laws and guidelines, and user expectations which are to easily understand how their data will be handled. To reduce this gap, we construct a tool supporting users in reading privacy policies in Japanese. We designed the tool to present users with separate unique expressions containing relevant information to improve the display format of the privacy policy and render it more comprehensive for Japanese users. To accurately extract the unique expressions from privacy policies, we created training data for machine learning for the constructed tool. The constructed tool provides a summary of privacy policies for users to help them understand the policies of interest. Subsequently, we assess the effectiveness of the constructed tool in experiments and follow-up questionnaires. Our findings reveal that the constructed tool enhances the users' subjective understanding of the services they read about and their awareness of the related risks. We expect that the developed tool will help users better understand the privacy policy content and and make educated decisions based on their understanding of how service providers intend to use their personal data.

  • Privacy-Preserving Correlation Coefficient

    Tomoaki MIMOTO  Hiroyuki YOKOYAMA  Toru NAKAMURA  Takamasa ISOHARA  Masayuki HASHIMOTO  Ryosuke KOJIMA  Aki HASEGAWA  Yasushi OKUNO  

     
    PAPER

      Pubricized:
    2023/02/08
      Vol:
    E106-D No:5
      Page(s):
    868-876

    Differential privacy is a confidentiality metric and quantitatively guarantees the confidentiality of individuals. A noise criterion, called sensitivity, must be calculated when constructing a probabilistic disturbance mechanism that satisfies differential privacy. Depending on the statistical process, the sensitivity may be very large or even impossible to compute. As a result, the usefulness of the constructed mechanism may be significantly low; it might even be impossible to directly construct it. In this paper, we first discuss situations in which sensitivity is difficult to calculate, and then propose a differential privacy with additional dummy data as a countermeasure. When the sensitivity in the conventional differential privacy is calculable, a mechanism that satisfies the proposed metric satisfies the conventional differential privacy at the same time, and it is possible to evaluate the relationship between the respective privacy parameters. Next, we derive sensitivity by focusing on correlation coefficients as a case study of a statistical process for which sensitivity is difficult to calculate, and propose a probabilistic disturbing mechanism that satisfies the proposed metric. Finally, we experimentally evaluate the effect of noise on the sensitivity of the proposed and direct methods. Experiments show that privacy-preserving correlation coefficients can be derived with less noise compared to using direct methods.

  • Geo-Graph-Indistinguishability: Location Privacy on Road Networks with Differential Privacy

    Shun TAKAGI  Yang CAO  Yasuhito ASANO  Masatoshi YOSHIKAWA  

     
    PAPER

      Pubricized:
    2023/01/16
      Vol:
    E106-D No:5
      Page(s):
    877-894

    In recent years, concerns about location privacy are increasing with the spread of location-based services (LBSs). Many methods to protect location privacy have been proposed in the past decades. Especially, perturbation methods based on Geo-Indistinguishability (GeoI), which randomly perturb a true location to a pseudolocation, are getting attention due to its strong privacy guarantee inherited from differential privacy. However, GeoI is based on the Euclidean plane even though many LBSs are based on road networks (e.g. ride-sharing services). This causes unnecessary noise and thus an insufficient tradeoff between utility and privacy for LBSs on road networks. To address this issue, we propose a new privacy notion, Geo-Graph-Indistinguishability (GeoGI), for locations on a road network to achieve a better tradeoff. We propose Graph-Exponential Mechanism (GEM), which satisfies GeoGI. Moreover, we formalize the optimization problem to find the optimal GEM in terms of the tradeoff. However, the computational complexity of a naive method to find the optimal solution is prohibitive, so we propose a greedy algorithm to find an approximate solution in an acceptable amount of time. Finally, our experiments show that our proposed mechanism outperforms GeoI mechanisms, including optimal GeoI mechanism, with respect to the tradeoff.

  • MicroState: An Anomaly Localization Method in Heterogeneous Microservice Systems

    Jingjing YANG  Yuchun GUO  Yishuai CHEN  

     
    PAPER

      Pubricized:
    2023/01/13
      Vol:
    E106-D No:5
      Page(s):
    904-912

    Microservice architecture has been widely adopted for large-scale applications because of its benefits of scalability, flexibility, and reliability. However, microservice architecture also proposes new challenges in diagnosing root causes of performance degradation. Existing methods rely on labeled data and suffer a high computation burden. This paper proposes MicroState, an unsupervised and lightweight method to pinpoint the root cause with detailed descriptions. We decompose root cause diagnosis into element location and detailed reason identification. To mitigate the impact of element heterogeneity and dynamic invocations, MicroState generates elements' invoked states, quantifies elements' abnormality by warping-based state comparison, and infers the anomalous group. MicroState locates the root cause element with the consideration of anomaly frequency and persistency. To locate the anomalous metric from diverse metrics, MicroState extracts metrics' trend features and evaluates metrics' abnormality based on their trend feature variation, which reduces the reliance on anomaly detectors. Our experimental evaluation based on public data of the Artificial intelligence for IT Operations Challenge (AIOps Challenge 2020) shows that MicroState locates root cause elements with 87% precision and diagnoses anomaly reasons accurately.

  • A Fast Handover Mechanism for Ground-to-Train Free-Space Optical Communication using Station ID Recognition by Dual-Port Camera

    Kosuke MORI  Fumio TERAOKA  Shinichiro HARUYAMA  

     
    PAPER

      Pubricized:
    2023/03/08
      Vol:
    E106-D No:5
      Page(s):
    940-951

    There are demands for high-speed and stable ground-to-train optical communication as a network environment for trains. The existing ground-to-train optical communication system developed by the authors uses a camera and a QPD (Quadrant photo diode) to capture beacon light. The problem with the existing system is that it is impossible to identify the ground station. In the system proposed in this paper, a beacon light modulated with the ID of the ground station is transmitted, and the ground station is identified by demodulating the image from the dual-port camera on the opposite side. In this paper, we developed an actual system and conducted experiments using a car on the road. The results showed that only one packet was lost with the ping command every 1 ms near handover. Although the communication device itself has a bandwidth of 100 Mbps, the throughput before and after the handover was about 94 Mbps, and only dropped to about 89.4 Mbps during the handover.

  • High-Precision Mobile Robot Localization Using the Integration of RAR and AKF

    Chen WANG  Hong TAN  

     
    PAPER-Information Network

      Pubricized:
    2023/01/24
      Vol:
    E106-D No:5
      Page(s):
    1001-1009

    The high-precision indoor positioning technology has gradually become one of the research hotspots in indoor mobile robots. Relax and Recover (RAR) is an indoor positioning algorithm using distance observations. The algorithm restores the robot's trajectory through curve fitting and does not require time synchronization of observations. The positioning can be successful with few observations. However, the algorithm has the disadvantages of poor resistance to gross errors and cannot be used for real-time positioning. In this paper, while retaining the advantages of the original algorithm, the RAR algorithm is improved with the adaptive Kalman filter (AKF) based on the innovation sequence to improve the anti-gross error performance of the original algorithm. The improved algorithm can be used for real-time navigation and positioning. The experimental validation found that the improved algorithm has a significant improvement in accuracy when compared to the original RAR. When comparing to the extended Kalman filter (EKF), the accuracy is also increased by 12.5%, which can be used for high-precision positioning of indoor mobile robots.

  • Subjective Difficulty Estimation of Educational Comics Using Gaze Features

    Kenya SAKAMOTO  Shizuka SHIRAI  Noriko TAKEMURA  Jason ORLOSKY  Hiroyuki NAGATAKI  Mayumi UEDA  Yuki URANISHI  Haruo TAKEMURA  

     
    PAPER-Educational Technology

      Pubricized:
    2023/02/03
      Vol:
    E106-D No:5
      Page(s):
    1038-1048

    This study explores significant eye-gaze features that can be used to estimate subjective difficulty while reading educational comics. Educational comics have grown rapidly as a promising way to teach difficult topics using illustrations and texts. However, comics include a variety of information on one page, so automatically detecting learners' states such as subjective difficulty is difficult with approaches such as system log-based detection, which is common in the Learning Analytics field. In order to solve this problem, this study focused on 28 eye-gaze features, including the proposal of three new features called “Variance in Gaze Convergence,” “Movement between Panels,” and “Movement between Tiles” to estimate two degrees of subjective difficulty. We then ran an experiment in a simulated environment using Virtual Reality (VR) to accurately collect gaze information. We extracted features in two unit levels, page- and panel-units, and evaluated the accuracy with each pattern in user-dependent and user-independent settings, respectively. Our proposed features achieved an average F1 classification-score of 0.721 and 0.742 in user-dependent and user-independent models at panel unit levels, respectively, trained by a Support Vector Machine (SVM).

  • New Training Method for Non-Dominant Hand Pitching Motion Based on Reversal Trajectory of Dominant Hand Pitching Motion Using AR and Vibration

    Masato SOGA  Taiki MORI  

     
    PAPER-Educational Technology

      Pubricized:
    2023/02/08
      Vol:
    E106-D No:5
      Page(s):
    1049-1058

    In this paper, we propose a new method for non-dominant limb training. The method is that a learner aims at a motion which is generated by reversing his/her own motion of dominant limb, when he/she tries to train himself/herself for non-dominant limb training. In addition, we designed and developed interface for the new method which can select feedback types. One is an interface using AR and sound, and the other is an interface using AR and vibration. We found that vibration feedback was effective for non-dominant hand training of pitching motion, while sound feedback was not so effective as vibration.

  • A Computer Simulation Study on Movement Control by Functional Electrical Stimulation Using Optimal Control Technique with Simplified Parameter Estimation

    Fauzan ARROFIQI  Takashi WATANABE  Achmad ARIFIN  

     
    PAPER-Rehabilitation Engineering and Assistive Technology

      Pubricized:
    2023/02/21
      Vol:
    E106-D No:5
      Page(s):
    1059-1068

    The purpose of this study was to develop a practical functional electrical stimulation (FES) controller for joint movements restoration based on an optimal control technique by cascading a linear model predictive control (MPC) and a nonlinear transformation. The cascading configuration was aimed to obtain an FES controller that is able to deal with a nonlinear system. The nonlinear transformation was utilized to transform the linear solution of linear MPC to become a nonlinear solution in form of optimized electrical stimulation intensity. Four different types of nonlinear functions were used to realize the nonlinear transformation. A simple parameter estimation to determine the value of the nonlinear transformation parameter was also developed. The tracking control capability of the proposed controller along with the parameter estimation was examined in controlling the 1-DOF wrist joint movement through computer simulation. The proposed controller was also compared with a fuzzy FES controller. The proposed MPC-FES controller with estimated parameter value worked properly and had a better control accuracy than the fuzzy controller. The parameter estimation was suggested to be useful and effective in practical FES control applications to reduce the time-consuming of determining the parameter value of the proposed controller.

441-460hit(18690hit)