
Keyword Search Result

[Keyword] Ti(30728hit)

501-520hit(30728hit)

  • Facial Mask Completion Using StyleGAN2 Preserving Features of the Person

    Norihiko KAWAI  Hiroaki KOIKE  

     
    PAPER

      Publicized:
    2023/05/30
      Vol:
    E106-D No:10
      Page(s):
    1627-1637

    Due to the global outbreak of coronaviruses, people are increasingly wearing masks even when photographed. As a result, photos uploaded to web pages and social networking services with the lower half of the face hidden are less likely to convey the attractiveness of the photographed persons. In this study, we propose a method to complete facial mask regions using StyleGAN2, a type of generative adversarial network (GAN). In the proposed method, a reference image of the same person without a mask is prepared separately from a target image of the person wearing a mask. After the mask region in the target image is temporarily inpainted, the face orientation and contour of the person in the reference image are changed to match those of the target image using StyleGAN2. The changed image is then composited into the mask region, with color-tone correction, to produce a mask-free image that preserves the person's features.

  • Social Relation Atmosphere Recognition with Relevant Visual Concepts

    Ying JI  Yu WANG  Kensaku MORI  Jien KATO  

     
    PAPER

      Publicized:
    2023/06/02
      Vol:
    E106-D No:10
      Page(s):
    1638-1649

    Social relationships (e.g., couples, opponents) are a foundational part of society. Social relation atmosphere describes the overall interaction environment between social relationships. Discovering social relation atmosphere can help machines better comprehend human behaviors and improve the performance of socially intelligent applications. Most existing research focuses on investigating social relationships while ignoring the social relation atmosphere. Due to the complexity of the expressions in video data and the uncertainty of the social relation atmosphere, the atmosphere is difficult even to define and evaluate. In this paper, we analyze the social relation atmosphere in video data. We introduce a Relevant Visual Concept (RVC) from the social relationship recognition task to facilitate social relation atmosphere recognition, because social relationships contain useful information about human interactions and surrounding environments, which are crucial clues for social relation atmosphere recognition. Our approach consists of two main steps: (1) we first generate a group of visual concepts that preserve the inherent social relationship information by utilizing a 3D explanation module; (2) the extracted relevant visual concepts are used to supplement the social relation atmosphere recognition. In addition, we present a new dataset based on the existing Video Social Relation Dataset. Each video is annotated with four kinds of social relation atmosphere attributes and one social relationship. We evaluate the proposed method on our dataset. Experiments with various 3D ConvNets and fusion methods demonstrate that the proposed method can effectively improve recognition accuracy compared to end-to-end ConvNets. The visualization results also indicate that essential information in social relationships can be discovered and used to enhance social relation atmosphere recognition.

  • Filter Bank for Perfect Reconstruction of Light Field from Its Focal Stack

    Akira KUBOTA  Kazuya KODAMA  Daiki TAMURA  Asami ITO  

     
    PAPER

      Publicized:
    2023/07/19
      Vol:
    E106-D No:10
      Page(s):
    1650-1660

    Focal stacks (FS) have attracted attention as an alternative representation of light field (LF). However, the problem of reconstructing LF from its FS is considered ill-posed. Although many regularization methods have been discussed, no method has been proposed to solve this problem perfectly. This paper showed that the LF can be perfectly reconstructed from the FS through a filter bank in theory for Lambertian scenes without occlusion if the camera aperture for acquiring the FS is a Cauchy function. The numerical simulation demonstrated that the filter bank allows perfect reconstruction of the LF.

  • Fusion-Based Edge and Color Recovery Using Weighted Near-Infrared Image and Color Transmission Maps for Robust Haze Removal

    Onhi KATO  Akira KUBOTA  

     
    PAPER

      Publicized:
    2023/05/23
      Vol:
    E106-D No:10
      Page(s):
    1661-1672

    Various haze removal methods based on the atmospheric scattering model have been presented in recent years. Most methods have targeted strong haze images where light is scattered equally in all color channels. This paper presents a haze removal method using near-infrared (NIR) images for relatively weak haze images. In order to recover the lost edges, the presented method first extracts edges from an appropriately weighted NIR image and fuses it with the color image. By introducing a wavelength-dependent scattering model, our method then estimates the transmission map for each color channel and recovers the color more naturally from the edge-recovered image. Finally, the edge-recovered and the color-recovered images are blended. In this blending process, the regions with high lightness, such as sky and clouds, where unnatural color shifts are likely to occur, are effectively estimated, and the optimal weighting map is obtained. Our qualitative and quantitative evaluations using 59 pairs of color and NIR images demonstrated that our method can recover edges and colors more naturally in weak haze images than conventional methods.

  • Feedback Node Sets in Pancake Graphs and Burnt Pancake Graphs

    Sinyu JUNG  Keiichi KANEKO  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2023/06/30
      Vol:
    E106-D No:10
      Page(s):
    1677-1685

    A feedback node set (FNS) of a graph is a subset of the nodes of the graph whose deletion makes the residual graph acyclic. By finding an FNS in an interconnection network, we can set a check point at each node in it to avoid a livelock configuration. Hence, finding an FNS is a critical issue in enhancing the dependability of a parallel computing system. In this paper, we propose a method to find FNSs in n-pancake graphs and n-burnt pancake graphs. By analyzing the types of cycles proposed in our method, we also give the number of the nodes in the FNS in an n-pancake graph, $(n-2.875)(n-1)!+1.5(n-3)!$, and that in an n-burnt pancake graph, $2^{n-1}(n-1)!(n-3.5)$.
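
    The FNS definition above can be checked mechanically on a small case. The following is a minimal sketch, not the paper's construction: it assumes the usual undirected pancake graph (nodes are permutations of 1..n, edges join permutations that differ by one prefix reversal), and the helper names `pancake_graph` and `is_feedback_node_set` are hypothetical.

```python
from itertools import permutations


def pancake_graph(n):
    """Build the undirected n-pancake graph (assumed standard definition)."""
    nodes = list(permutations(range(1, n + 1)))
    edges = set()
    for p in nodes:
        for k in range(2, n + 1):
            q = p[:k][::-1] + p[k:]  # reverse the first k symbols
            edges.add(frozenset((p, q)))
    return nodes, edges


def is_feedback_node_set(nodes, edges, removed):
    """Return True if deleting `removed` leaves an acyclic (forest) graph."""
    removed = set(removed)
    parent = {v: v for v in nodes if v not in removed}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for edge in edges:
        u, w = tuple(edge)
        if u in removed or w in removed:
            continue
        root_u, root_w = find(u), find(w)
        if root_u == root_w:
            return False  # this edge closes a cycle in the residual graph
        parent[root_u] = root_w
    return True


nodes, edges = pancake_graph(3)  # the 3-pancake graph is a 6-cycle
print(is_feedback_node_set(nodes, edges, []))          # no node removed
print(is_feedback_node_set(nodes, edges, [nodes[0]]))  # one node removed
```

    For n = 3 the graph is a single 6-cycle, so the empty set is not an FNS while any single node is. The sketch only verifies a candidate set; it does not reproduce the sizes given in the abstract.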

  • Decentralized Incentive Scheme for Peer-to-Peer Video Streaming using Solana Blockchain

    Yunqi MA  Satoshi FUJITA  

     
    PAPER-Information Network

      Publicized:
    2023/07/13
      Vol:
    E106-D No:10
      Page(s):
    1686-1693

    Peer-to-peer (P2P) technology has gained popularity as a way to enhance system performance. Nodes in a P2P network work together by providing network resources to one another. In this study, we examine the use of P2P technology for video streaming and develop a distributed incentive mechanism to prevent free-riding. Our proposed solution combines WebTorrent and the Solana blockchain and can be accessed through a web browser. To incentivize uploads, some of the received video chunks are encrypted using AES. Smart contracts on the blockchain are used for third-party verification of uploads and for managing access to the video content. Experimental results on a test network showed that our system can encrypt and decrypt chunks in about 1/40th the time it takes using WebRTC, without affecting the quality of video streaming. Smart contracts were also found to quickly verify uploads in about 860 milliseconds. The paper also explores how to effectively reward virtual points for uploads.

  • GPU-Accelerated Estimation and Targeted Reduction of Peak IR-Drop during Scan Chain Shifting

    Shiling SHI  Stefan HOLST  Xiaoqing WEN  

     
    PAPER-Dependable Computing

      Publicized:
    2023/07/07
      Vol:
    E106-D No:10
      Page(s):
    1694-1704

    High power dissipation during scan test often causes undue yield loss, especially for low-power circuits. One major reason is that the resulting IR-drop in shift mode may corrupt test data. A common approach to solving this problem is partial-shift, in which multiple scan chains are formed and only one group of scan chains is shifted at a time. However, existing partial-shift based methods suffer from two major problems: (1) their IR-drop estimation is not accurate enough or computationally too expensive to be done for each shift cycle; (2) partial-shift is hence applied to all shift cycles, resulting in long test time. This paper addresses these two problems with a novel IR-drop-aware scan shift method, featuring: (1) Cycle-based IR-Drop Estimation (CIDE) supported by a GPU-accelerated dynamic power simulator to quickly find potential shift cycles with excessive peak IR-drop; (2) a scan shift scheduling method that generates a scan chain grouping targeted for each considered shift cycle to reduce the impact on test time. Experiments on ITC'99 benchmark circuits show that: (1) the CIDE is computationally feasible; (2) the proposed scan shift schedule can achieve a global peak IR-drop reduction of up to 47%. Its scheduling efficiency is 58.4% higher than that of an existing typical method on average, which means our method has less test time.

  • Local-to-Global Structure-Aware Transformer for Question Answering over Structured Knowledge

    Yingyao WANG  Han WANG  Chaoqun DUAN  Tiejun ZHAO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/06/27
      Vol:
    E106-D No:10
      Page(s):
    1705-1714

    Question-answering tasks over structured knowledge (i.e., tables and graphs) require the ability to encode structural information. Traditional pre-trained language models trained on linear-chain natural language cannot be directly applied to encode tables and graphs. The existing methods adopt the pre-trained models in such tasks by flattening structured knowledge into sequences. However, the serialization operation leads to the loss of the structural information of the knowledge. To better employ pre-trained transformers for structured knowledge representation, we propose a novel structure-aware transformer (SATrans) that injects the local-to-global structural information of the knowledge into the mask of the different self-attention layers. Specifically, in the lower self-attention layers, SATrans focuses on the local structural information of each knowledge token to learn a more robust representation of it. In the upper self-attention layers, SATrans further injects the global information of the structured knowledge to integrate the information among knowledge tokens. In this way, SATrans can effectively learn the semantic representation and structural information from the knowledge sequence and the attention mask, respectively. We evaluate SATrans on the table fact verification task and the knowledge base question-answering task. Furthermore, we explore two methods to combine symbolic and linguistic reasoning for these tasks to solve the problem that the pre-trained models lack symbolic reasoning ability. The experimental results reveal that the methods consistently outperform strong baselines on the two benchmarks.

  • Visual Inspection Method for Subway Tunnel Cracks Based on Multi-Kernel Convolution Cascade Enhancement Learning

    Baoxian WANG  Zhihao DONG  Yuzhao WANG  Shoupeng QIN  Zhao TAN  Weigang ZHAO  Wei-Xin REN  Junfang WANG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/06/27
      Vol:
    E106-D No:10
      Page(s):
    1715-1722

    As a typical surface defect of tunnel lining structures, cracking disease affects the durability of tunnel structures and poses hidden dangers to tunnel driving safety. Factors such as interference from the complex service environment of the tunnel and the low signal-to-noise ratio of the crack targets themselves have left existing crack recognition methods based on semantic segmentation unable to meet actual engineering needs. Based on this, this paper uses the Unet network as the basic framework for crack identification and proposes a multi-kernel convolution cascade enhancement (MKCE) model to achieve accurate detection and identification of crack diseases. First, to ensure the performance of crack feature extraction, the model replaces the main feature extraction network in the basic framework with the ResNet-50 residual network. Compared with the VGG-16 network, this modification can extract richer crack detail features while reducing model parameters. Second, considering that the Unet network cannot effectively perceive multi-scale crack features in the skip connection stage, a multi-kernel convolution cascade enhancement module is proposed by combining a cascaded connection of multi-kernel convolution groups and multi-expansion rate dilated convolution groups. This module achieves a comprehensive perception of local details and the global content of tunnel lining cracks. In addition, to better weaken the effect of tunnel background clutter interference, a convolutional block attention calculation module is further introduced after the multi-kernel convolution cascade enhancement module, which effectively reduces the false alarm rate of crack recognition. The algorithm is tested on a large number of subway tunnel crack image datasets. The experimental results show that, compared with other crack recognition algorithms based on deep learning, the method in this paper achieves the best results in terms of accuracy and intersection over union (IoU) indicators, which verifies that the method has better applicability.

  • Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias

    Takao YAMANAKA  Tatsuya SUZUKI  Taiki NOBUTSUNE  Chenjunlin WU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2023/07/19
      Vol:
    E106-D No:10
      Page(s):
    1723-1731

    Omni-directional images have been used in a wide range of applications including virtual/augmented realities, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gazing points with a head-mounted display, to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for omni-directional images that extracts overlapping 2-dimensional (2D) plane images from omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of images (center bias), the high-probability region appears at horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). Therefore, the 2D saliency model with a center-bias layer was fine-tuned with an omni-directional dataset by replacing the center-bias layer with an equator-bias layer conditioned on the elevation angle for the extraction of the 2D plane image. The limited availability of omni-directional images in saliency datasets can be compensated for by using the well-established 2D saliency model pretrained on a large number of training images with the ground truth of 2D saliency maps. In addition, this paper proposes a multi-scale estimation method that extracts 2D images at multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated from the multiple angles of view were integrated by using pixel-wise attention weights calculated in an integration layer for weighting the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps. It was confirmed that the accuracy of the saliency maps was improved by the proposed method.

  • Context-Aware Stock Recommendations with Stocks' Characteristics and Investors' Traits

    Takehiro TAKAYANAGI  Kiyoshi IZUMI  

     
    PAPER-Natural Language Processing

      Publicized:
    2023/07/20
      Vol:
    E106-D No:10
      Page(s):
    1732-1741

    Personalized stock recommendations aim to suggest stocks tailored to individual investor needs, significantly aiding the financial decision making of an investor. This study shows the advantages of incorporating context into personalized stock recommendation systems. We embed item contextual information such as technical indicators, fundamental factors, and business activities of individual stocks. Simultaneously, we consider user contextual information such as investors' personality traits, behavioral characteristics, and attributes to create a comprehensive investor profile. Our model, validated on novel stock recommendation tasks, demonstrated a notable improvement over baseline models when incorporating these contextual features. Consistent outperformance across various hyperparameters further underscores the robustness and utility of our model in integrating stocks' features and investors' traits into personalized stock recommendations.

  • Fault-Resilient Robot Operating System Supporting Rapid Fault Recovery with Node Replication

    Jonghyeok YOU  Heesoo KIM  Kilho LEE  

     
    LETTER-Software System

      Publicized:
    2023/07/07
      Vol:
    E106-D No:10
      Page(s):
    1742-1746

    This paper proposes a fault-resilient ROS platform supporting rapid fault detection and recovery. The platform employs heartbeat-based fault detection and node replication-based recovery. Our prototype implementation on top of ROS Melodic performs well in evaluations with an NVIDIA development board and an inverted pendulum device.

  • Large-Scale Gaussian Process Regression Based on Random Fourier Features and Local Approximation with Tsallis Entropy

    Hongli ZHANG  Jinglei LIU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2023/07/11
      Vol:
    E106-D No:10
      Page(s):
    1747-1751

    With the emergence of a large quantity of data in science and industry, it is urgent to improve the prediction accuracy and reduce the high complexity of Gaussian process regression (GPR). However, the traditional global and local approximations each have shortcomings: global approximation tends to ignore local features, whereas local approximation is prone to over-fitting. In order to solve these problems, a large-scale Gaussian process regression algorithm (RFFLT) combining random Fourier features (RFF) and local approximation is proposed. 1) In order to shorten the training time, we use a random Fourier feature map to map the input data into a randomized low-dimensional feature space for processing. The main innovation of the algorithm is to design features by using existing fast linear processing methods, so that the inner product of the transformed data is approximately equal to the inner product in the feature space of the shift-invariant kernel specified by the user. 2) The generalized robust Bayesian committee machine (GRBCM) based on the Tsallis mutual information method is used in local approximation, which enhances the flexibility of the model and generates a sparse representation of the expert weight distribution compared with previous work. The RFFLT algorithm was tested on six real data sets, where it greatly shortened the time of regression prediction and improved the prediction accuracy.
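
    The property stated in step 1), that inner products of transformed data approximate a shift-invariant kernel, can be illustrated with the standard random Fourier feature construction for the Gaussian kernel. This is a generic sketch under assumed settings, not the RFFLT algorithm: the kernel choice, the bandwidth `sigma`, the feature count, and the function names are all illustrative assumptions.

```python
import math
import random


def make_rff_map(dim, num_features, sigma, seed=0):
    """Return a map z(x) whose inner products approximate a Gaussian kernel."""
    rng = random.Random(seed)
    # Frequencies drawn from the kernel's spectral density, N(0, 1/sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    # Random phases drawn uniformly from [0, 2*pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def gaussian_kernel(x, y, sigma):
    """Exact kernel value exp(-||x - y||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


z = make_rff_map(dim=2, num_features=4000, sigma=1.0)
x, y = [0.3, -0.5], [1.0, 0.2]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = gaussian_kernel(x, y, sigma=1.0)
print(f"approximate: {approx:.3f}, exact: {exact:.3f}")
```

    Once the data are mapped this way, a linear model on the features stands in for the kernel method, which is where the speed-up comes from; the approximation error shrinks as the number of features grows.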

  • Prior Information Based Decomposition and Reconstruction Learning for Micro-Expression Recognition

    Jinsheng WEI  Haoyu CHEN  Guanming LU  Jingjie YAN  Yue XIE  Guoying ZHAO  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/07/13
      Vol:
    E106-D No:10
      Page(s):
    1752-1756

    Micro-expression recognition (MER) draws intensive research interest because micro-expressions (MEs) can reveal genuine emotions. Prior information can guide the model to learn discriminative ME features effectively. However, most works focus on general models with a stronger representation ability that adaptively aggregate ME movement information in a holistic way, which may ignore the prior information and properties of MEs. To solve this issue, driven by the prior information that the category of an ME can be inferred from the relationship between the actions of different facial components, this work designs a novel model that conforms to this prior information and learns ME movement features in an interpretable way. Specifically, this paper proposes a Decomposition and Reconstruction-based Graph Representation Learning (DeRe-GRL) model to effectively learn high-level ME features. DeRe-GRL includes two modules: Action Decomposition Module (ADM) and Relation Reconstruction Module (RRM), where ADM learns action features of facial key components and RRM explores the relationship between these action features. Based on facial key components, ADM divides the geometric movement features extracted by the graph model-based backbone into several sub-features, and learns the map matrix to map these sub-features into multiple action features; then, RRM learns weights to weight all action features to build the relationship between action features. The experimental results demonstrate the effectiveness of the proposed modules, and the proposed method achieves competitive performance.

  • Quantitative Estimation of Video Forgery with Anomaly Analysis of Optical Flow

    Wan Yeon LEE  Yun-Seok CHOI  Tong Min KIM  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/05/19
      Vol:
    E106-D No:10
      Page(s):
    1757-1760

    We propose a quantitative measurement technique for video forgery that eliminates the burden of deciding the subtle boundary between normal and tampered patterns. We also propose an automatic adjustment scheme for spatial and temporal target zones, which maximizes the abnormality measurement of forged videos. Evaluation shows that the proposed scheme provides manifest detection capability against both inter-frame and intra-frame forgeries.

  • Practical Improvement and Performance Evaluation of Road Damage Detection Model using Machine Learning

    Tomoya FUJII  Rie JINKI  Yuukou HORITA  

     
    LETTER-Image

      Publicized:
    2023/06/13
      Vol:
    E106-A No:9
      Page(s):
    1216-1219

    The social infrastructure, including roads and bridges built during the period of rapid economic growth in Japan, is now aging, and there is a need to strategically maintain and renew it. On the other hand, road maintenance in rural areas is facing serious problems such as reduced budgets for maintenance and a shortage of engineers due to the declining birthrate and aging population. Therefore, it is difficult for maintenance engineers to visually inspect all roads in rural areas, and a system to automatically detect road damage is required. This paper reports practical improvements to the road damage detection model using YOLOv5, an object detection model capable of real-time operation, focusing on road image features.

  • Mitigate: Toward Comprehensive Research and Development for Analyzing and Combating IoT Malware

    Koji NAKAO  Katsunari YOSHIOKA  Takayuki SASAKI  Rui TANABE  Xuping HUANG  Takeshi TAKAHASHI  Akira FUJITA  Jun'ichi TAKEUCHI  Noboru MURATA  Junji SHIKATA  Kazuki IWAMOTO  Kazuki TAKADA  Yuki ISHIDA  Masaru TAKEUCHI  Naoto YANAI  

     
    INVITED PAPER

      Publicized:
    2023/06/08
      Vol:
    E106-D No:9
      Page(s):
    1302-1315

    In this paper, we developed the latest IoT honeypots to capture IoT malware currently on the loose, analyzed IoT malware with new features such as persistent infection, and developed malware removal methods to be provided to IoT device users. Furthermore, as attack behaviors using IoT devices become more diverse and sophisticated every year, we conducted research related to various factors involved in understanding the overall picture of attack behaviors from the perspective of incident responders. As the final stage of countermeasures, we also conducted research and development of IoT malware disabling technology to stop only IoT malware activities in IoT devices and IoT system disabling technology to remotely control (including stopping) IoT devices themselves.

  • Enumerating Empty and Surrounding Polygons

    Shunta TERUI  Katsuhisa YAMANAKA  Takashi HIRAYAMA  Takashi HORIYAMA  Kazuhiro KURITA  Takeaki UNO  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2023/04/03
      Vol:
    E106-A No:9
      Page(s):
    1082-1091

    We are given a set S of n points in the Euclidean plane. We assume that S is in general position. A simple polygon P is an empty polygon of S if each vertex of P is a point in S and every point in S is either outside P or a vertex of P. In this paper, we consider the problem of enumerating all the empty polygons of a given point set. To design an efficient enumeration algorithm, we use a reverse search by Avis and Fukuda with child lists. We propose an algorithm that enumerates all the empty polygons of S in $O(n^2 |\varepsilon(S)|)$ time, where $\varepsilon(S)$ is the set of empty polygons of S. Moreover, by applying the same idea to the problem of enumerating surrounding polygons of a given point set S, we propose an enumeration algorithm that enumerates them in $O(n^2)$ delay, while the known algorithm enumerates in $O(n^2 \log n)$ delay, where a surrounding polygon of S is a polygon such that each vertex of the polygon is a point in S and every point in S is either inside the polygon or a vertex of the polygon.

  • Optimal Online Bin Packing Algorithms for Some Cases with Two Item Sizes

    Hiroshi FUJIWARA  Masaya KAWAGUCHI  Daiki TAKIZAWA  Hiroaki YAMAMOTO  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2023/03/07
      Vol:
    E106-A No:9
      Page(s):
    1100-1110

    The bin packing problem is a problem of finding an assignment of a sequence of items to a minimum number of bins, each of capacity one. An online algorithm for the bin packing problem is an algorithm that irrevocably assigns each item one by one from the head of the sequence. Gutin, Jensen, and Yeo (2006) considered a version in which all items are only of two different sizes and the online algorithm knows the two possible sizes in advance, and gave an optimal online algorithm for the case when the larger size exceeds 1/2. In this paper we provide an optimal online algorithm for some of the cases when the larger size is at most 1/2, on the basis of a framework that facilitates the design and analysis of algorithms.
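
    The online setting described above, in which each item is assigned irrevocably as it arrives, can be sketched with the classic First-Fit rule. First-Fit is a well-known baseline swapped in here purely for illustration; it is not the optimal algorithm of the paper, and the item sizes 1/2 and 1/3 are an assumed example of two sizes with the larger at most 1/2.

```python
from fractions import Fraction


def first_fit(items):
    """Pack items online with First-Fit; return the number of bins used."""
    levels = []  # current fill level of each open bin (capacity is 1)
    for size in items:
        for index, level in enumerate(levels):
            if level + size <= 1:
                levels[index] = level + size  # place in the first bin that fits
                break
        else:
            levels.append(size)  # no open bin fits, so open a new one
    return len(levels)


half, third = Fraction(1, 2), Fraction(1, 3)

# Same multiset of items, two arrival orders.
print(first_fit([third, third, third, half, half]))
print(first_fit([half, third, half, third, third]))
```

    The first order uses 2 bins and the second uses 3, although both sequences contain the same items and fit in 2 bins offline. This gap between online and offline packing is what the competitive analysis of such algorithms measures.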

  • Computational Complexity of Allow Rule Ordering and Its Greedy Algorithm

    Takashi FUCHINO  Takashi HARADA  Ken TANAKA  Kenji MIKAWA  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2023/03/20
      Vol:
    E106-A No:9
      Page(s):
    1111-1118

    Packet classification is used to determine the behavior of incoming packets in network devices according to defined rules. As it is achieved using a linear search on a classification rule list, a large number of rules will lead to longer communication latency. To solve this, the problem of finding the order of rules minimizing the latency has been studied. Misherghi et al. and Harada et al. have proposed a problem that relaxes to policy-based constraints. In this paper, we show that the Relaxed Optimal Rule Ordering (RORO) for the allowlist is NP-hard, and by reducing from this we show that RORO for the general rule list is NP-hard. We also propose a heuristic algorithm based on the greedy method for an allowlist. Furthermore, we demonstrate the effectiveness of our method using ClassBench, which is a benchmark for packet classification algorithms.
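
    The link between rule order and latency under a linear search can be made concrete with a toy classifier. This is a minimal sketch under simplifying assumptions that are not taken from the paper: packets are single integers, each rule is a hypothetical `(low, high, action)` range, the rules do not overlap (so any order keeps the same policy), and latency is counted as the number of rules compared.

```python
def classify(rules, packet, default="deny"):
    """Linear search: return (action, number of rules compared)."""
    for position, (low, high, action) in enumerate(rules, start=1):
        if low <= packet <= high:
            return action, position
    return default, len(rules)  # no rule matched after scanning the whole list


def total_comparisons(rules, packets):
    """Sum of comparisons over a packet trace, a simple latency proxy."""
    return sum(classify(rules, packet)[1] for packet in packets)


# Two orderings of the same non-overlapping allowlist.
rules_a = [(0, 9, "allow"), (10, 99, "allow")]
rules_b = [(10, 99, "allow"), (0, 9, "allow")]

# Assumed traffic: most packets hit the (10, 99) rule.
packets = [50, 50, 50, 5]

print(total_comparisons(rules_a, packets))
print(total_comparisons(rules_b, packets))
```

    Placing the frequently matched rule first lowers the total from 7 to 5 comparisons on this trace. With overlapping rules, reordering can change which rule matches first, which is the source of the ordering constraints in the problem.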
