Keyword Search Result

[Keyword] RSA(346hit)


  • Conversational AI as a Facilitator Improves Participant Engagement and Problem-Solving in Online Discussion: Sharing Evidence from Five Cities in Afghanistan Open Access

    Sofia SAHAB  Jawad HAQBEEN  Takayuki ITO  


    E107-D No:4

    Despite the increasing use of conversational artificial intelligence (AI) in online discussion environments, few studies explore the application of AI as a facilitator in forming problem-solving debates and influencing opinions in cross-venue scenarios, particularly in diverse and war-ravaged countries. This study aims to investigate the impact of AI on enhancing participant engagement and collaborative problem-solving in online-mediated discussion environments, especially in diverse and heterogeneous discussion settings, such as the five cities in Afghanistan. We seek to assess the extent to which AI participation in online conversations succeeds by examining the depth of discussions and participants' contributions, comparing discussions facilitated by AI with those not facilitated by AI across different venues. The results are discussed with respect to forming and changing opinions with and without AI-mediated communication. The findings indicate that the number of opinions generated in AI-facilitated discussions significantly differs from discussions without AI support. Additionally, statistical analyses reveal quantitative disparities in online discourse sentiments when conversational AI is present compared to when it is absent. These findings contribute to a better understanding of the role of AI-mediated discussions and offer several practical and social implications, paving the way for future developments and improvements.

  • An Efficient Bayes Coding Algorithm for Changing Context Tree Model

    Koshi SHIMADA  Shota SAITO  Toshiyasu MATSUSHIMA  

    PAPER-Source Coding and Data Compression

    E107-A No:3

    In the context tree model, the occurrence probability of each symbol is determined by a finite past sequence; the model forms a broad class of sources that includes i.i.d. and Markov sources. This paper proposes a non-stationary source with context tree models that change from interval to interval. The Bayes code for this source requires weighting of the posterior probabilities of the context tree models and change points, so its computational complexity usually grows exponentially. Therefore, the challenge is how to reduce the computational complexity. In this paper, we propose a special class of prior probability distribution of context tree models and change points and develop an efficient Bayes coding algorithm by combining two existing Bayes coding algorithms. The algorithm minimizes the Bayes risk function for the proposed source, and its computational complexity is of polynomial order. We investigate the behavior and performance of the proposed algorithm by conducting experiments.

  • Equivalences among Some Information Measures for Individual Sequences and Their Applications for Fixed-Length Coding Problems

    Tomohiko UYEMATSU  Tetsunao MATSUTA  

    PAPER-Source Coding and Data Compression

    E107-A No:3

    This paper proposes three new information measures for individual sequences and clarifies their properties. The new measures are called the non-overlapping max-entropy, the overlapping smooth max-entropy, and the non-overlapping smooth max-entropy, respectively. These measures are related to the fixed-length coding of individual sequences. We investigate these measures and show the following three properties: (1) The non-overlapping max-entropy coincides with the topological entropy. (2) The overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the Ziv-entropy. (3) When an individual sequence is drawn from an ergodic source, the overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the entropy rate of the source. Further, we apply these information measures to the fixed-length coding of individual sequences and propose some new universal coding schemes that are asymptotically optimal.

  • Adversarial Examples Created by Fault Injection Attack on Image Sensor Interface

    Tatsuya OYAMA  Kota YOSHIDA  Shunsuke OKURA  Takeshi FUJINO  


    E107-A No:3

    Adversarial examples (AEs), which cause misclassification by adding subtle perturbations to input images, have been proposed as an attack method on image-classification systems using deep neural networks (DNNs). Physical AEs created by attaching stickers to traffic signs have been reported, which are a threat to traffic-sign-recognition DNNs used in advanced driver assistance systems. We previously proposed an attack method for generating a noise area on images by superimposing an electrical signal on the mobile industry processor interface and showed that it can generate a single adversarial mark that triggers a backdoor attack on the input image. Therefore, we propose a misclassification attack method on DNNs that creates AEs with small perturbations at multiple places on the image by fault injection. The perturbation position for AEs is calculated in advance against the target traffic-sign image, which will be captured on future driving. With 5.2% to 5.5% of a specific image on the simulation, the perturbation that induces misclassification to the target label was calculated. In the experiments, we confirmed that the traffic-sign-recognition DNN on a Raspberry Pi was successfully misclassified when the target traffic sign was captured with. In addition, we created robust AEs that cause misclassification of images with varying positions and sizes by adding a common perturbation. We propose a method to reduce the amount of perturbation in robust AEs. Our results demonstrated successful misclassification of the captured image with a high attack success rate even if the position and size of the captured image are slightly changed.

  • Content-Adaptive Optimization Framework for Universal Deep Image Compression

    Koki TSUBOTA  Kiyoharu AIZAWA  

    PAPER-Image Processing and Video Processing

    E107-D No:2

    While deep image compression performs better than traditional codecs like JPEG on natural images, it faces a challenge as a learning-based approach: compression performance drastically decreases for out-of-domain images. To investigate this problem, we introduce a novel task that we call universal deep image compression, which involves compressing images in arbitrary domains, such as natural images, line drawings, and comics. Furthermore, we propose a content-adaptive optimization framework to tackle this task. This framework adapts a pre-trained compression model to each target image during testing for addressing the domain gap between pre-training and testing. For each input image, we insert adapters into the decoder of the model and optimize the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion, with the adapter parameters transmitted per image. To achieve the evaluation of the proposed universal deep compression, we constructed a benchmark dataset containing uncompressed images of four domains: natural images, line drawings, comics, and vector arts. We compare our proposed method with non-adaptive and existing adaptive compression methods, and the results show that our method outperforms them. Our code and dataset are publicly available at

  • A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation

    Gang LIU  Xin CHEN  Zhixiang GAO  

    PAPER-Artificial Intelligence, Data Mining

    E107-D No:1

    Photo animation transforms photos of real-world scenes into anime style images, which is a challenging task in AIGC (AI Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail for outputting coarse-grained anime style images and a main tail for refining coarse-grained anime style images. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. In order to improve the visual quality of the generated anime style images, two novel loss functions suitable for photo animation are proposed: 1) the region smoothing loss function, which is used to weaken the texture details of the generated images to achieve anime effects with abstract details; 2) the fine-grained revision loss function, which is used to eliminate artifacts and noise in the generated anime style image while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight generator framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can be easily trained end-to-end with unpaired training data. Extensive experiments have been conducted to qualitatively and quantitatively demonstrate that our method can produce high-quality anime style images from real-world photos and perform better than the state-of-the-art models.

  • Imbalanced Data Over-Sampling Method Based on ISODATA Clustering

    Zhenzhe LV  Qicheng LIU  

    PAPER-Artificial Intelligence, Data Mining

    E106-D No:9

    Class imbalance is one of the challenges faced in the field of machine learning. It is difficult for traditional classifiers to predict the minority class data. If the imbalanced data is not processed, the effect of the classifier will be greatly reduced. To address the problem that traditional classifiers favor the majority class data and ignore the minority class data, an imbalanced data over-sampling method based on iterative self-organizing data analysis technique algorithm (ISODATA) clustering is proposed. The minority class is divided into different sub-clusters by ISODATA, and each sub-cluster is over-sampled according to the sampling ratio, so that the sampled minority class data also conforms to the imbalance of the original minority class data. The new imbalanced data composed of new minority class data and majority class data is classified by SVM and Random Forest classifiers. Experiments on 12 datasets from the KEEL datasets show that the method has better G-means and F-value, improving the classification accuracy.
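The per-sub-cluster over-sampling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the minority class has already been split into sub-clusters (by ISODATA or any other clustering), that the "sampling ratio" means allocating new samples in proportion to sub-cluster size, and that new points are made by SMOTE-style linear interpolation within a sub-cluster. The function names `allocate_synthetic_counts` and `oversample` are hypothetical.

```python
import random


def allocate_synthetic_counts(cluster_sizes, n_new):
    """Split n_new synthetic samples across sub-clusters in proportion to size.

    Uses largest-remainder rounding so the counts always sum to n_new.
    Proportional allocation is an assumption about the 'sampling ratio'.
    """
    total = sum(cluster_sizes)
    quotas = [n_new * size / total for size in cluster_sizes]
    counts = [int(q) for q in quotas]
    leftover = n_new - sum(counts)
    # Hand the remaining samples to the largest fractional parts.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def oversample(clusters, n_new, seed=0):
    """Create n_new synthetic minority points, sub-cluster by sub-cluster.

    clusters: list of sub-clusters, each a list of equal-length tuples.
    Interpolating between two points of the same sub-cluster is a
    SMOTE-style choice made for this sketch only.
    """
    rng = random.Random(seed)
    counts = allocate_synthetic_counts([len(c) for c in clusters], n_new)
    synthetic = []
    for cluster, k in zip(clusters, counts):
        for _ in range(k):
            a, b = rng.choice(cluster), rng.choice(cluster)
            t = rng.random()
            synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic
```

Because every synthetic point is drawn inside one sub-cluster, the relative sizes of the sub-clusters are preserved after over-sampling, which is the property the abstract emphasizes.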

  • Multiple Layout Design Generation via a GAN-Based Method with Conditional Convolution and Attention

    Xing ZHU  Yuxuan LIU  Lingyu LIANG  Tao WANG  Zuoyong LI  Qiaoming DENG  Yubo LIU  

    LETTER-Computer Graphics

    E106-D No:9

    Recently, many AI-aided layout design systems based on deep learning have been developed to reduce tedious manual intervention. However, most methods focus on a specific generation task. This paper explores the challenging problem of multiple layout design generation (LDG), which generates a floor plan or an urban plan from a boundary input under a unified framework. One of the main challenges of multiple LDG is to obtain reasonable topological structures of layout generation with irregular boundaries and layout elements for different types of design. This paper formulates the multiple LDG task as an image-to-image translation problem and proposes a conditional generative adversarial network (GAN), called LDGAN, with adaptive modules. The framework of LDGAN is based on a generator-discriminator architecture, where the generator is integrated with conditional convolution constrained by the boundary input and the attention module with channel and spatial features. Qualitative and quantitative experiments were conducted on the SCUT-AutoALP and RPLAN datasets, and the comparison with the state-of-the-art methods illustrates the effectiveness and superiority of the proposed LDGAN.

  • Multi-Scale Correspondence Learning for Person Image Generation

    Shi-Long SHEN  Ai-Guo WU  Yong XU  

    PAPER-Person Image Generation

    E106-D No:5

    A generative model is presented for two types of person image generation in this paper. First, this model is applied to pose-guided person image generation, i.e., converting the pose of a source person image to the target pose while preserving the texture of that source person image. Second, this model is also used for clothing-guided person image generation, i.e., changing the clothing texture of a source person image to the desired clothing texture. The core idea of the proposed model is to establish the multi-scale correspondence, which can effectively address the misalignment introduced by transferring pose, thereby preserving richer information on appearance. Specifically, the proposed model consists of two stages: 1) It first generates the target semantic map imposed on the target pose to provide more accurate guidance during the generation process. 2) After obtaining the multi-scale feature map by the encoder, the multi-scale correspondence is established, which is useful for a fine-grained generation. Experimental results show the proposed method is superior to state-of-the-art methods in pose-guided person image generation and show its effectiveness in clothing-guided person image generation.

  • Enhanced Full Attention Generative Adversarial Networks

    KaiXu CHEN  Satoshi YAMANE  

    LETTER-Core Methods

    E106-D No:5

    In this paper, we propose improved Generative Adversarial Networks with an attention module in the Generator, which can enhance the effectiveness of the Generator. Furthermore, recent work has shown that Generator conditioning affects GAN performance. Leveraging this insight, we explored the effect of different normalization methods (spectral normalization, instance normalization) on the Generator and Discriminator. Moreover, an enhanced loss function called the Wasserstein Divergence distance can alleviate the problem that the model is difficult to train in practice.

  • New Training Method for Non-Dominant Hand Pitching Motion Based on Reversal Trajectory of Dominant Hand Pitching Motion Using AR and Vibration

    Masato SOGA  Taiki MORI  

    PAPER-Educational Technology

    E106-D No:5

    In this paper, we propose a new method for non-dominant limb training. In this method, a learner training the non-dominant limb aims at a target motion generated by reversing his/her own dominant-limb motion. In addition, we designed and developed an interface for the new method that can select between feedback types. One is an interface using AR and sound, and the other is an interface using AR and vibration. We found that vibration feedback was effective for non-dominant hand training of pitching motion, while sound feedback was not as effective as vibration.

  • Fish Detecting Using YOLOv4 and CVAE in Aquaculture Ponds with a Non-Uniform Strong Reflection Background

    Meng ZHAO  Junfeng WU  Hong YU  Haiqing LI  Jingwen XU  Siqi CHENG  Lishuai GU  Juan MENG  

    PAPER-Smart Agriculture

    E106-D No:5

    Accurate fish detection is of great significance in aquaculture. However, the non-uniform strong reflection in aquaculture ponds will affect the precision of fish detection. This paper combines YOLOv4 and CVAE to accurately detect fishes in images with non-uniform strong reflection: the reflection in the image is removed first, and the reflection-removed image is then used for fish detection. Firstly, the improved YOLOv4 is applied to detect and mask the strong reflective region, to locate and label the reflective region for the subsequent reflection removal. Then, CVAE is combined with the improved YOLOv4 to infer the prior distribution of the reflection region and restore the reflection region from that distribution so that the reflection can be removed. To further improve the quality of the reflection-removed images, adversarial learning is appended to CVAE. Finally, YOLOv4 is used to detect fishes in the high-quality image. In addition, a new image dataset of pond-cultured Takifugu rubripes is constructed, which includes 1000 images with fishes annotated manually; a synthetic dataset including 2000 images with strong reflection is also created and merged with the generated dataset for training and verifying the robustness of the proposed method. Comprehensive experiments are performed to compare the proposed method with the state-of-the-art fish detecting methods without reflection removal on the generated dataset. The results show that the fish detecting precision and recall of the proposed method are improved by 2.7% and 2.4% respectively.

  • How Many Tweets Describe the Topics on TV Programs: An Investigation on the Relation between Twitter and Mass Media

    Jun IIO  


    E106-D No:4

    As the Internet has become prevalent, the popularity of net media has been growing, to a point that it has taken over conventional mass media. However, TWtrends, the Twitter trends visualization system operated by our research team since 2019, indicates that many topics on TV programs frequently appear on Twitter trendlines. This study investigates the relationship between Twitter and TV programs by collecting information on Twitter trends and TV programs simultaneously. Although this study provides a rough estimation of the volume of tweets that mention TV programs, the results show that several tweets mention TV programs at a constant rate, which tends to increase on the weekend. This tendency of TV-related tweets stems from the audience rating survey results. The study outcome, together with the fact that many TV programs introduce topics popular in social media, implies a codependency between Internet media (social media) and mass media.

  • Information Leakage Through Passive Timing Attacks on RSA Decryption System

    Tomonori HIRATA  Yuichi KAJI  

    PAPER-Cryptography and Information Security

    E106-A No:3

    A side channel attack is a means of security attacks that tries to restore secret information by analyzing side-information such as electromagnetic wave, heat, electric energy and running time that are unintentionally emitted from a computer system. The side channel attack that focuses on the running time of a cryptosystem is specifically named a “timing attack”. Timing attacks are relatively easy to carry out, and particularly threatening for tiny systems that are used in smart cards and IoT devices because the system is so simple that the processing time would be clearly observed from the outside of the card/device. The threat of timing attacks is especially serious when an attacker actively controls the input to a target program. Countermeasures are studied to deter such active attacks, but the attacker still has the chance to learn something about the concealed information by passively watching the running time of the target program. The risk of passive timing attacks can be measured by the mutual information between the concealed information and the running time. However, the computation of the mutual information is hardly possible except for toy examples. This study focuses on three algorithms for RSA decryption, derives formulas of the mutual information under several assumptions and approximations, and calculates the mutual information numerically for practical security parameters.
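The idea of measuring passive timing leakage as the mutual information between the concealed key and the running time can be illustrated on a toy model. The sketch below is not one of the three RSA decryption algorithms studied in the paper: it assumes plain left-to-right square-and-multiply, a running time equal to the number of modular operations, a uniformly distributed k-bit exponent, and no measurement noise. The names `running_time`, `mutual_information`, and `leakage_bits` are hypothetical.

```python
from collections import Counter
from math import log2


def running_time(d: int) -> int:
    """Count modular operations of left-to-right square-and-multiply.

    Assumed cost model: one unit per squaring, one extra unit per
    multiplication (performed only when the exponent bit is 1).
    """
    ops = 0
    for bit in bin(d)[3:]:      # skip '0b' and the leading 1 bit
        ops += 1                # squaring
        if bit == "1":
            ops += 1            # multiplication
    return ops


def mutual_information(pairs) -> float:
    """I(X;Y) in bits for a list of equally likely (x, y) observations."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def leakage_bits(k: int) -> float:
    """Leakage about a uniform k-bit exponent (top bit set) via timing."""
    keys = range(1 << (k - 1), 1 << k)
    return mutual_information([(d, running_time(d)) for d in keys])
```

In this noise-free model the running time is a deterministic function of the key, so the mutual information equals the entropy of the running time; for example, `leakage_bits(4)` is about 1.81 bits out of the 3 unknown bits. Exhaustive enumeration like this is feasible only for toy key sizes, which is consistent with the abstract's remark that the exact computation is hardly possible for realistic parameters.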

  • Lookahead Search-Based Low-Complexity Multi-Type Tree Pruning Method for Versatile Video Coding (VVC) Intra Coding

    Qi TENG  Guowei TENG  Xiang LI  Ran MA  Ping AN  Zhenglong YANG  

    PAPER-Coding Theory

    E106-A No:3

    The latest versatile video coding (VVC) introduces some novel techniques such as quadtree with nested multi-type tree (QTMT), multiple transform selection (MTS) and multiple reference line (MRL). These tools improve compression efficiency compared with the previous standard H.265/HEVC, but they suffer from very high computational complexity. One of the most time-consuming parts of VVC intra coding is the coding tree unit (CTU) structure decision. In this paper, we propose a low-complexity multi-type tree (MT) pruning method for VVC intra coding. This method consists of lookahead search and MT pruning. The lookahead search process is performed to derive the approximate rate-distortion (RD) cost of each MT node at depth 2 or 3. Subsequently, the improbable MT nodes are pruned by different strategies under different cost errors. These strategies are designed according to the priority of the node. Experimental results show that the overall proposed algorithm can achieve 47.15% time saving with only 0.93% Bjøntegaard delta bit rate (BDBR) increase over natural scene sequences, and 45.39% time saving with 1.55% BDBR increase over screen content sequences, compared with the VVC reference software VTM 10.0. Such results demonstrate that our method achieves a good trade-off between computational complexity and compression quality compared to recent methods.

  • Adversarial Reinforcement Learning-Based Coordinated Robust Spatial Reuse in Broadcast-Overlaid WLANs

    Yuto KIHIRA  Yusuke KODA  Koji YAMAMOTO  Takayuki NISHIO  

    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

    E106-B No:2

    Broadcast services for wireless local area networks (WLANs) are being standardized in the IEEE 802.11 task group bc. Envisaging the upcoming coexistence of broadcast access points (APs) with densely-deployed legacy APs, this paper addresses a learning-based spatial reuse with only partial receiver-awareness. This partial awareness means that the broadcast APs can leverage few acknowledgment frames (ACKs) from recipient stations (STAs). This is in view of the specific concerns of broadcast communications. In broadcast communications for a very large number of STAs, ACK implosions occur unless some STAs are stopped from responding with ACKs. Given this, the main contribution of this paper is to demonstrate the feasibility to improve the robustness of learning-based spatial reuse to hidden interferers only with the partial receiver-awareness while discarding any re-training of broadcast APs. The core idea is to leverage robust adversarial reinforcement learning (RARL), where before a hidden interferer is installed, a broadcast AP learns a rate adaptation policy in a competition with a proxy interferer that provides jamming signals intelligently. Therein, the recipient STAs experience interference and the partial STAs provide a feedback overestimating the effect of interference, allowing the broadcast AP to select a data rate to avoid frame losses in a broad range of recipient STAs. Simulations demonstrate the suppression of the throughput degradation under a sudden installation of a hidden interferer, indicating the feasibility of acquiring robustness to the hidden interferer.

  • Toward Selective Adversarial Attack for Gait Recognition Systems Based on Deep Neural Network

    Hyun KWON  

    LETTER-Information Network

    E106-D No:2

    Deep neural networks (DNNs) perform well for image recognition, speech recognition, and pattern analysis. However, such neural networks are vulnerable to adversarial examples. An adversarial example is a data sample created by adding a small amount of noise to an original sample in such a way that it is difficult for humans to identify but that will cause the sample to be misclassified by a target model. In a military environment, adversarial examples that are correctly classified by a friendly model while deceiving an enemy model may be useful. In this paper, we propose a method for generating a selective adversarial example that is correctly classified by a friendly gait recognition system and misclassified by an enemy gait recognition system. The proposed scheme generates the selective adversarial example by combining the loss for correct classification by the friendly gait recognition system with the loss for misclassification by the enemy gait recognition system. In our experiments, we used the CASIA Gait Database as the dataset and TensorFlow as the machine learning library. The results show that the proposed method can generate selective adversarial examples that have a 98.5% attack success rate against an enemy gait recognition system and are classified with 87.3% accuracy by a friendly gait recognition system.
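The combination of a "stay correct for the friendly model" loss with a "be misclassified by the enemy model" loss can be sketched on a toy problem. The code below is an illustration under strong assumptions, not the scheme from the paper: both models are hypothetical linear binary scorers rather than gait recognition DNNs, the losses are logistic (softplus) terms, and a small squared-distortion penalty with weight `c` is added to keep the example near the original sample.

```python
from math import exp


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def dot(w, x) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def selective_example(x, w_friend, w_enemy, c=0.05, lr=0.5, steps=200):
    """Gradient descent on a combined loss for a positive-class sample x.

    loss(x') = softplus(-w_friend . x')   # friendly model keeps label +
             + softplus(+w_enemy . x')    # enemy model pushed to label -
             + c * ||x' - x||^2           # assumed distortion penalty
    A positive score means 'correct class' for both toy models.
    """
    x_adv = list(x)
    for _ in range(steps):
        f = dot(w_friend, x_adv)
        e = dot(w_enemy, x_adv)
        for i in range(len(x_adv)):
            grad = (-sigmoid(-f) * w_friend[i]      # d/dx softplus(-f)
                    + sigmoid(e) * w_enemy[i]       # d/dx softplus(e)
                    + 2.0 * c * (x_adv[i] - x[i]))  # d/dx distortion term
            x_adv[i] -= lr * grad
    return x_adv
```

With `x = [1.0, 1.0]`, `w_friend = [1.0, 0.0]`, and `w_enemy = [0.0, 1.0]`, both toy models initially score the sample as positive; after optimization the friendly score stays positive while the enemy score becomes negative.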

  • Face Image Generation of Anime Characters Using an Advanced First Order Motion Model with Facial Landmarks

    Junki OSHIBA  Motoi IWATA  Koichi KISE  


    E106-D No:1

    Recently, deep learning for image generation with a guide for the generation has been progressing. Many methods have been proposed to generate the animation of facial expression change from a single face image by transferring some facial expression information to the face image. In particular, the method of using facial landmarks as facial expression information can generate a variety of facial expressions. However, most methods do not focus on anime characters but humans. Moreover, we attempted to apply several existing methods to anime characters by training the methods on an anime character face dataset; however, they generated images with noise, even in regions where there was no change. The first order motion model (FOMM) is an image generation method that takes two images as input and transfers one facial expression or pose to the other. By explicitly calculating the difference between the two images based on optical flow, FOMM can generate images with low noise in the unchanged regions. In the following, we focus on the aspect of the face image generation in FOMM. When we think about the employment of facial landmarks as targets, the performance of FOMM is not enough because FOMM cannot use a facial landmark as a facial expression target because the appearances of a face image and a facial landmark are quite different. Therefore, we propose an advanced FOMM method to use facial landmarks as a facial expression target. In the proposed method, we change the input data and data flow to use facial landmarks. Additionally, to generate face images with expressions that follow the target landmarks more closely, we introduce the landmark estimation loss, which is computed by comparing the landmark detected from the generated image with the target landmark. Our experiments on an anime character face image dataset demonstrated that our method is effective for landmark-guided face image generation for anime characters. Furthermore, our method outperformed other methods quantitatively and generated face images with less noise.

  • Projection-Based Physical Adversarial Attack for Monocular Depth Estimation

    Renya DAIMO  Satoshi ONO  


    E106-D No:1

    Monocular depth estimation has improved drastically due to the development of deep neural networks (DNNs). However, recent studies have revealed that DNNs for monocular depth estimation contain vulnerabilities that can lead to misestimation when perturbations are added to input. This study investigates whether DNNs for monocular depth estimation are vulnerable to misestimation when patterned light is projected on an object using a video projector. To this end, this study proposes an evolutionary adversarial attack method with a multi-fidelity evaluation scheme that allows creating adversarial examples under black-box conditions while suppressing the computational cost. Experiments in both simulated and real scenes showed that the designed light pattern caused a DNN to misestimate objects as if they had moved to the back.

  • Adversarial Example Detection Based on Improved GhostBusters

    Hyunghoon KIM  Jiwoo SHIN  Hyo Jin JO  


    E105-D No:11

    In various studies of attacks on autonomous vehicles (AVs), a phantom attack, in which an advanced driver assistance system (ADAS) misclassifies a fake object created by an adversary as a real object, has been proposed. In this paper, we propose F-GhostBusters, which is an improved version of GhostBusters that detects phantom attacks. The proposed model uses a new feature, i.e., the frequency of images. Experimental results show that F-GhostBusters not only improves the detection performance of GhostBusters but also can complement the accuracy against adversarial examples.
