The search functionality is under construction.

Keyword Search Result

[Keyword] SI(16314hit)

81-100hit(16314hit)

  • A Novel Remote-Tracking Heart Rate Measurement Method Based on Stepping Motor and mm-Wave FMCW Radar Open Access

    Yaokun HU  Xuanyu PENG  Takeshi TODA  

     
    PAPER-Sensing

    Vol: E107-B No:6  Page(s): 470-486

    Conventional radar-based non-contact vital sign measurements require the subject to remain motionless. Additionally, the measurement range is limited by the design of the radar module itself. Although measurement accuracy has been improving, practical applications have been slow to develop. This paper proposes a novel radar-based adaptive tracking method for measuring the heart rate of a moving subject. The radar module is fixed on a circular plate that is rotated by stepping motors. To protect the user’s privacy, the method uses radar signal processing to detect the subject’s position and control a stepping motor that adjusts the radar’s measurement range. The results of the fixed-route experiments revealed that when the subject was moving at a speed of 0.5 m/s, the mean values of RMSE for heart rate measurements were all below 2.85 beats per minute (bpm), and when moving at a speed of 1 m/s, they were all below 4.05 bpm. When subjects walked along random routes at random speeds, the RMSEs of the measurements were all below 6.85 bpm, with a mean value of 4.35 bpm. The average RR interval time of the reconstructed heartbeat signal was highly correlated with the electrocardiography (ECG) data, with a correlation coefficient of 0.9905. In addition, this study not only evaluated the potential effect of arm swing (a more natural walking motion) on heart rate measurement but also demonstrated the ability of the proposed method to measure heart rate in a multiple-person scenario.

  • Federated Deep Reinforcement Learning for Multimedia Task Offloading and Resource Allocation in MEC Networks Open Access

    Rongqi ZHANG  Chunyun PAN  Yafei WANG  Yuanyuan YAO  Xuehua LI  

     
    PAPER-Network

    Vol: E107-B No:6  Page(s): 446-457

    With the maturation of 5G technology in recent years, multimedia services such as live video streaming and online games on the Internet have flourished. These multimedia services frequently require low latency, which poses a significant challenge for computing multimedia tasks with stringent latency requirements. Mobile edge computing (MEC) is considered a key technology for addressing these challenges. It offloads computation-intensive tasks to edge servers by sinking mobile nodes, which reduces task execution latency and relieves computing pressure on multimedia devices. To use the MEC paradigm reasonably and efficiently, resource allocation has become a new challenge. In this paper, we focus on multimedia tasks that need to be uploaded and processed in the network. We formulate an optimization problem with the goal of minimizing the latency and energy consumption required to perform tasks on multimedia devices. To solve this complex and non-convex problem, we reformulate it as a distributed deep reinforcement learning (DRL) problem and propose a federated Dueling deep Q-network (DDQN) based multimedia task offloading and resource allocation algorithm (FDRL-DDQN). In the algorithm, DRL is trained on the local device, while federated learning (FL) is responsible for aggregating and updating the parameters from the trained local models. Further, to address the non-independent and identically distributed (non-IID) data problem of multimedia devices, we develop a method for selecting participating federated devices. The simulation results show that the FDRL-DDQN algorithm can reduce the total cost by 31.3% compared to the DQN algorithm when the task data is 1000 kbit, and the maximum reduction can be 35.3% compared to the traditional baseline algorithm.

  • Physical Layer Security Enhancement for mmWave System with Multiple RISs and Imperfect CSI Open Access

    Qingqing TU  Zheng DONG  Xianbing ZOU  Ning WEI  

     
    PAPER-Fundamental Theories for Communications

    Vol: E107-B No:6  Page(s): 430-445

    Despite the appealing advantages of reconfigurable intelligent surface (RIS)-aided mmWave communications, there remain practical issues that need to be addressed before the large-scale deployment of RISs in future wireless networks. In this study, we jointly consider the non-negligible practical issues in a multi-RIS-aided mmWave system that can significantly affect the secrecy performance, including high computational complexity, imperfect channel state information (CSI), and the finite resolution of phase shifters. To solve this challenging non-convex stochastic optimization problem, we propose a robust and low-complexity algorithm to maximize the achievable secrecy rate. Specifically, by combining the benefits of fractional programming and stochastic successive convex approximation techniques, we transform the joint optimization problem into several convex ones and solve them sub-optimally. The theoretical analysis and simulation results demonstrate that the proposed algorithms mitigate the joint negative effects of these practical issues and yield a tradeoff between secrecy performance and complexity/overhead that outperforms non-robust benchmarks, which increases the robustness and flexibility of multiple RIS deployments in future wireless networks.

  • An Adaptively Biased OFDM Based on Hartley Transform for Visible Light Communication Systems Open Access

    Menglong WU  Yongfa XIE  Yongchao SHI  Jianwen ZHANG  Tianao YAO  Wenkai LIU  

     
    LETTER-Communication Theory and Signals

    Publicized: 2023/09/20  Vol: E107-A No:6  Page(s): 928-931

    Direct-current biased optical orthogonal frequency division multiplexing (DCO-OFDM) converts bipolar OFDM signals into unipolar non-negative signals by introducing a high DC bias, which satisfies the requirement that the signal transmitted by intensity modulated/direct detection (IM/DD) must be positive. However, the high DC bias results in low power efficiency of DCO-OFDM. In this letter, an adaptively biased optical OFDM is proposed, in which different biases are designed according to the signal amplitude to improve power efficiency. The adaptive bias does not need to be removed deliberately at the receiver, and the interference caused by the adaptive bias falls only on the reserved subcarriers, so it does not affect the effective information. Moreover, the proposed OFDM uses the Hartley transform instead of the Fourier transform used in conventional optical OFDM, which gives it low computational complexity and high spectral efficiency. The simulation results show that the normalized optical bit energy to noise power ratio (Eb(opt)/N0) required by the proposed OFDM at a bit error rate (BER) of 10^-3 is, on average, 7.5 dB and 3.4 dB lower than that of DCO-OFDM and superimposed asymmetrically clipped optical OFDM (ACO-OFDM), respectively.

  • Secrecy Outage Probability and Secrecy Diversity Order of Alamouti STBC with Decision Feedback Detection over Time-Selective Fading Channels Open Access

    Gyulim KIM  Hoojin LEE  Xinrong LI  Seong Ho CHAE  

     
    LETTER-Communication Theory and Signals

    Publicized: 2023/09/19  Vol: E107-A No:6  Page(s): 923-927

    This letter studies the secrecy outage probability (SOP) and the secrecy diversity order of Alamouti STBC with decision feedback (DF) detection over time-selective fading channels. For given temporal correlations, we derive the exact SOPs and their asymptotic approximations for all possible combinations of detection schemes, including joint maximum likelihood (JML), zero-forcing (ZF), and DF, at Bob and Eve. We reveal that the SOP is mainly influenced by the detection scheme of the legitimate receiver rather than that of the eavesdropper, and that the achievable secrecy diversity order converges to two for JML only at Bob (i.e., JML-JML/ZF/DF) and to one for the other cases (i.e., ZF-JML/ZF/DF, DF-JML/ZF/DF). Here, the p-q combination pair indicates that Bob and Eve adopt the detection methods p ∈ {JML, ZF, DF} and q ∈ {JML, ZF, DF}, respectively.

  • Dynamic Limited Variable Step-Size Algorithm Based on the MSD Variation Cost Function Open Access

    Yufei HAN  Jiaye XIE  Yibo LI  

     
    LETTER-Digital Signal Processing

    Publicized: 2023/09/11  Vol: E107-A No:6  Page(s): 919-922

    Steady-state and convergence performance are important indicators for evaluating adaptive algorithms, and the step-size directly affects both. Many researchers have proposed variable step-size adaptive algorithms to improve performance. However, existing variable step-size adaptive algorithms still have problems, such as insufficient theoretical analysis, imbalanced performance, and unachievable parameters. These problems greatly influence the actual performance of some algorithms. Therefore, this paper further explores the inherent relationship between the key performance indicators and the step-size. The variation of the mean square deviation (MSD) is adopted as the cost function. Based on theoretical analyses and derivations, a novel variable step-size algorithm with a dynamic limited function (DLF) is proposed. In addition, a thorough theoretical analysis is conducted on the weight deviation and the convergence stability. The proposed algorithm is also compared with several typical algorithms in many different environments. Both the theoretical analysis and the experimental results verify that the proposed algorithm achieves superior performance.

  • Analysis of Blood Cell Image Recognition Methods Based on Improved CNN and Vision Transformer Open Access

    Pingping WANG  Xinyi ZHANG  Yuyan ZHAO  Yueti LI  Kaisheng XU  Shuaiyin ZHAO  

     
    PAPER-Neural Networks and Bioengineering

    Publicized: 2023/09/15  Vol: E107-A No:6  Page(s): 899-908

    Leukemia is a common and highly dangerous blood disease that requires early detection and treatment. Currently, the diagnosis of leukemia types mainly relies on the pathologist’s morphological examination of blood cell images, which is a tedious and time-consuming process, and the diagnosis results are highly subjective and prone to misdiagnosis and missed diagnosis. To address these problems, this research proposes a blood cell image recognition technique based on an enhanced Vision Transformer. First, this paper incorporates convolutions with token embedding to replace the positional encoding, which represents coarse spatial information. Then, based on the Transformer’s self-attention mechanism, this paper proposes a sparse attention module that can select identifying regions in the image, further enhancing the model’s fine-grained feature expression capability. Finally, this paper uses a contrastive loss function to further increase the intra-class consistency and inter-class difference of classification features. According to the experimental results, the model in this study has an identification accuracy of 92.49% on the Munich single-cell morphological dataset, which is an improvement of 1.41% over the baseline. Compared with the state-of-the-art Swin Transformer, this method still achieves better performance. Our method therefore has the potential to provide a reference for clinical diagnosis by physicians.

  • Operational Resilience of Network Considering Common-Cause Failures Open Access

    Tetsushi YUGE  Yasumasa SAGAWA  Natsumi TAKAHASHI  

     
    PAPER-Reliability, Maintainability and Safety Analysis

    Publicized: 2023/09/11  Vol: E107-A No:6  Page(s): 855-863

    This paper discusses the resilience of networks based on graph theory and stochastic processes. The target network is assumed to be an electric power network in which edges may fail simultaneously and whose performance is measured by the ratio of connected nodes. For the restoration, under the constraint that resources are limited, the failed edges are repaired one by one, and the repair order for several failed edges is determined by giving priority to the edge whose repair yields the largest increase in system performance. Two types of resilience are discussed: one is resilience in the recovery stage according to the conventional definition of resilience, and the other is steady-state operational resilience considering long-term operation in which the network state changes stochastically. The second represents a comprehensive capacity of resilience for a system and is analytically derived by Markov analysis. In the analysis, we assume that a large-scale disruption occurs due to the simultaneous failure of edges caused by common-cause failures. A Marshall-Olkin type shock model and the α factor method are incorporated to model the common-cause failures. Then two resilience measures, “operational resilience” and “operational resilience in recovery stage”, are proposed. We also propose approximation methods to obtain these two operational resilience measures for complex networks.

  • Investigating the Efficacy of Partial Decomposition in Kit-Build Concept Maps for Reducing Cognitive Load and Enhancing Reading Comprehension Open Access

    Nawras KHUDHUR  Aryo PINANDITO  Yusuke HAYASHI  Tsukasa HIRASHIMA  

     
    PAPER-Educational Technology

    Publicized: 2024/01/11  Vol: E107-D No:5  Page(s): 714-727

    This study investigates the efficacy of a partial decomposition approach in concept map recomposition tasks to reduce cognitive load while maintaining the benefits of traditional recomposition approaches. Prior research has demonstrated that concept map recomposition, involving the rearrangement of unconnected concepts and links, can enhance reading comprehension. However, this task often imposes a significant burden on learners’ working memory. To address this challenge, this study proposes a partial recomposition approach where learners are tasked with recomposing only a portion of the concept map, thereby reducing the problem space. The proposed approach aims at lowering the cognitive load while maintaining the benefits of the traditional recomposition task, that is, learning effect and motivation. To investigate the differences in cognitive load, learning effect, and motivation between the full decomposition (the traditional approach) and partial decomposition (the proposed approach), we conducted an experiment (N=78) where the participants were divided into two groups, “full decomposition” and “partial decomposition”. The full decomposition group was assigned the task of recomposing a concept map from a set of unconnected concept nodes and links, while the partial decomposition group worked with partially connected nodes and links. The experimental results show a significant reduction in the embedded cognitive load of concept map recomposition across different dimensions while learning effect and motivation remained similar between the conditions. On the basis of these findings, educators are recommended to incorporate partially disconnected concept maps in recomposition tasks to optimize time management and sustain learner motivation. By implementing this approach, instructors can conserve cognitive resources and allocate saved energy and time to other activities that enhance the overall learning process.

  • Weighted Generalized Hesitant Fuzzy Sets and Its Application in Ensemble Learning Open Access

    Haijun ZHOU  Weixiang LI  Ming CHENG  Yuan SUN  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2024/01/22  Vol: E107-D No:5  Page(s): 694-703

    Traditional intuitionistic fuzzy sets and hesitant fuzzy sets lose some information when representing vague information. To avoid this problem, this paper constructs weighted generalized hesitant fuzzy sets by retaining multiple intuitionistic fuzzy values and assigning them corresponding weights. For weighted generalized hesitant fuzzy elements in weighted generalized hesitant fuzzy sets, the paper defines some basic operations and proves their operational properties. On this basis, the paper gives comparison rules for weighted generalized hesitant fuzzy elements and presents two kinds of aggregation operators. For the weighted generalized hesitant fuzzy preference relation, this paper gives its definition and a method for computing the corresponding consistency index. Furthermore, the paper designs an ensemble learning algorithm based on weighted generalized hesitant fuzzy sets, carries out experiments on six datasets from the UCI database, and compares it with various classification algorithms. The experiments show that the ensemble learning algorithm based on weighted generalized hesitant fuzzy sets has better performance in all indicators.

  • Automated Labeling of Entities in CVE Vulnerability Descriptions with Natural Language Processing Open Access

    Kensuke SUMOTO  Kenta KANAKOGI  Hironori WASHIZAKI  Naohiko TSUDA  Nobukazu YOSHIOKA  Yoshiaki FUKAZAWA  Hideyuki KANUKA  

     
    PAPER

    Publicized: 2024/02/09  Vol: E107-D No:5  Page(s): 674-682

    Security-related issues have become more significant due to the proliferation of IT. Collating security-related information in a database improves security. For example, Common Vulnerabilities and Exposures (CVE) is a security knowledge repository containing descriptions of vulnerabilities in software or source code. Although the descriptions include various entities, there is no uniform entity structure, which makes security analysis using individual entities difficult. Developing a consistent entity structure will enhance the security field. Herein we propose a method to automatically label select entities from CVE descriptions by applying the Named Entity Recognition (NER) technique. We manually labeled 3287 CVE descriptions and conducted experiments using a machine learning model called BERT to compare the proposed method to labeling with regular expressions. Machine learning using the proposed method significantly improves the labeling accuracy. It has an F1 score of about 0.93, precision of about 0.91, and recall of about 0.95, demonstrating that our method has the potential to automatically label select entities from CVE descriptions.

  • A Personalised Session-Based Recommender System with Sequential Updating Based on Aggregation of Item Embeddings Open Access

    Yuma NAGI  Kazushi OKAMOTO  

     
    PAPER

    Publicized: 2024/01/09  Vol: E107-D No:5  Page(s): 638-649

    The study proposes a personalised session-based recommender system that embeds items by using Word2Vec and sequentially updates the session and user embeddings with the hierarchicalization and aggregation of item embeddings. To process a recommendation request, the system constructs a real-time user embedding that considers users’ general preferences and sequential behaviour to handle short-term changes in user preferences with a low computational cost. The system performance was experimentally evaluated in terms of the accuracy, diversity, and novelty of the ranking of recommended items and the training and prediction times of the system for three different datasets. The results of these evaluations were then compared with those of the five baseline systems. According to the evaluation experiment, the proposed system achieved a relatively high recommendation accuracy compared with baseline systems and the diversity and novelty scores of the proposed system did not fall below 90% for any dataset. Furthermore, the training times of the Word2Vec-based systems, including the proposed system, were shorter than those of FPMC and GRU4Rec. The evaluation results suggest that the proposed recommender system succeeds in keeping the computational cost for training low while maintaining high-level recommendation accuracy, diversity, and novelty.

  • Locating Concepts on Use Case Steps in Source Code Open Access

    Shinpei HAYASHI  Teppei KATO  Motoshi SAEKI  

     
    PAPER

    Publicized: 2023/12/20  Vol: E107-D No:5  Page(s): 602-612

    Use case descriptions describe features consisting of multiple concepts that follow a procedural flow. Because existing feature location techniques do not consider the relations between concepts in such features, it is difficult to identify the concepts in the source code with high accuracy. This paper presents a technique to locate concepts in a feature described in a use case description consisting of multiple use case steps, using the dependencies between them. We regard each use case step as a description of a concept, apply an existing concept location technique to the descriptions of concepts, and obtain lists of modules. Also, three types of dependencies among use case steps (time, call, and data dependencies) are extracted based on their textual descriptions. Modules in the obtained lists that fail to match the dependencies between concepts are filtered out. Thus, we can obtain more precise lists of modules. We have applied our technique to use case descriptions in a benchmark. Results show that our technique outperformed the baseline setting without the filtering.

  • Changes in Reading Voice to Convey Design Intention for Users with Visual Impairment Open Access

    Junko SHIROGANE  Daisuke SAYAMA  Hajime IWATA  Yoshiaki FUKAZAWA  

     
    PAPER

    Publicized: 2023/12/27  Vol: E107-D No:5  Page(s): 589-601

    Webpage texts are often emphasized by decorations such as bold, italic, underline, and text color using HTML (HyperText Markup Language) tags and CSS (Cascading Style Sheets). However, users with visual impairment often struggle to recognize decorations appropriately because most screen readers do not read decorations appropriately. To overcome this limitation, we propose a method to read emphasized texts by changing the reading voice parameters of a screen reader and adding sound effects. First, the strong emphasis types and reading voices are investigated. Second, the intensity of the emphasis type is used to calculate a score. Then the score is used to assign the reading method for the emphasized text. Finally, the proposed method is evaluated by users with and without visual impairment. The proposed method can convey emphasized texts, but future improvements are necessary.

  • Analysis of Optical Power Splitter with Resonator Structure Constructed by Two-Dimensional MDM Plasmonic Waveguide Open Access

    Yoshihiro NAKA  Masahiko NISHIMOTO  Mitsuhiro YOKOTA  

     
    BRIEF PAPER-Electromagnetic Theory

    Publicized: 2023/12/07  Vol: E107-C No:5  Page(s): 141-145

    An efficient optical power splitter constructed by a metal-dielectric-metal plasmonic waveguide with a resonator structure has been analyzed. The method of solution is the finite difference time domain (FD-TD) method with the piecewise linear recursive convolution (PLRC) method. The resonator structure consists of input/output waveguides and a narrow waveguide with a T-junction. The power splitter with the resonator structure is expressed by an equivalent transmission-line circuit. We can find that the transmittance and reflectance calculated by the FD-TD method and the equivalent circuit are matched when the difference in width between the input/output waveguides and the narrow waveguide is small. It is also shown that the transmission wavelength can be adjusted by changing the narrow waveguide lengths that satisfy the impedance matching condition in the equivalent circuit.

  • Simplified Reactive Torque Model Predictive Control of Induction Motor with Common Mode Voltage Suppression Open Access

    Siyao CHU  Bin WANG  Xinwei NIU  

     
    PAPER-Electronic Instrumentation and Control

    Publicized: 2023/11/30  Vol: E107-C No:5  Page(s): 132-140

    To reduce the common mode voltage (CMV), suppress the CMV spikes, and improve the steady-state performance, a simplified reactive torque model predictive control (RT-MPC) for induction motors (IMs) is proposed. The proposed prediction model can effectively reduce the complexity of the control algorithm with the direct torque control (DTC) based voltage vector (VV) preselection approach. In addition, the proposed CMV suppression strategy can restrict the CMV within ±Vdc/6 and does not require the exclusion of non-adjacent non-opposite VVs, so the system shows good steady-state performance. The effectiveness of the proposed design has been tested and verified by practical experiments. The proposed algorithm can reduce the execution time by an average of 26.33% compared to the major competitors.

  • An Extension of Physical Optics Approximation for Dielectric Wedge Diffraction for a TM-Polarized Plane Wave Open Access

    Duc Minh NGUYEN  Hiroshi SHIRAI  Se-Yun KIM  

     
    PAPER-Electromagnetic Theory

    Publicized: 2023/11/08  Vol: E107-C No:5  Page(s): 115-123

    In this study, the edge diffraction of a TM-polarized electromagnetic plane wave by two-dimensional dielectric wedges has been analyzed. An asymptotic solution for the radiation field has been derived from equivalent electric and magnetic currents which can be determined by the geometrical optics (GO) rays. This method may be regarded as an extended version of physical optics (PO). The diffracted field has been represented in terms of cotangent functions whose singularity behaviors are closely related to GO shadow boundaries. Numerical calculations are performed to compare the results with those by other reference solutions, such as the hidden rays of diffraction (HRD) and a numerical finite-difference time-domain (FDTD) simulation. Comparisons of the diffraction effect among these results have been made to propose additional lateral waves in the denser media.

  • Traffic Reduction for Speculative Video Transmission in Cloud Gaming Systems Open Access

    Takumasa ISHIOKA  Tatsuya FUKUI  Toshihito FUJIWARA  Satoshi NARIKAWA  Takuya FUJIHASHI  Shunsuke SARUWATARI  Takashi WATANABE  

     
    PAPER-Network

    Vol: E107-B No:5  Page(s): 408-418

    Cloud gaming systems allow users to play games that require high-performance computational capability on their mobile devices at any location. However, playing games through cloud gaming systems increases the Round-Trip Time (RTT) due to increased network delay. To simulate a local gaming experience for cloud users, we must minimize RTTs, which include network delays. Speculative video transmission pre-generates and encodes video frames corresponding to all possible user inputs and sends them to the user before the user’s input. Speculative video transmission mitigates the network delay, whereas a simple solution significantly increases the video traffic. This paper proposes tile-wise delta detection for traffic reduction of speculative video transmission. More specifically, the proposed method determines a reference video frame from the generated video frames and divides the reference video frame into multiple tiles. We calculate the similarity between each tile of the reference video frame and other video frames based on a hash function. Based on the calculated similarity, we determine redundant tiles and do not transmit them, which reduces traffic volume in minimal processing time without implementing a video compression technique with a high compression ratio. Evaluations using commercial games showed that the proposed method reduced traffic volume by 40-50% when the SSIM index was around 0.98 in certain genres, compared with the speculative video transmission method. Furthermore, to evaluate the feasibility of the proposed method, we investigated the effectiveness of network delay reduction with existing computational capability and the requirements in the future. As a result, we found that the proposed scheme may mitigate network delay by one to two frames, even with existing computational capability under limited conditions.

  • High-Throughput Exact Matching Implementation on FPGA with Shared Rule Tables among Parallel Pipelines Open Access

    Xiaoyong SONG  Zhichuan GUO  Xinshuo WANG  Mangu SONG  

     
    PAPER-Network System

    Vol: E107-B No:5  Page(s): 387-397

    In software-defined networking (SDN), packet processing is commonly implemented using the match-action model, where packets are processed based on matched actions in a match-action table. Due to limited FPGA on-board resources, it is an important challenge to achieve large-scale, high-throughput exact matching (EM) while solving hash conflicts and out-of-order problems. To address these issues, this study proposes an FPGA-based EM table that leverages shared rule tables across multiple pipelines to eliminate memory replication and enhance overall throughput. A reordering function is used to ensure packet sequencing within the pipelines. Moreover, to handle collisions and increase the load factor of the hash table, multiple hash table blocks are combined and an auxiliary CAM-based EM table is integrated in each pipeline. To the best of our knowledge, this is the first design to consider the recovery of out-of-order operations in a multi-channel EM table for high-speed network packet processing applications. Furthermore, it is implemented on a Xilinx Alveo U250 field programmable gate array (FPGA), holds one million rules, and achieves a processing speed of 200 million operations per second, theoretically enabling throughput exceeding 100 Gbps for 64-byte packets.

  • Dance-Conditioned Artistic Music Generation by Creative-GAN Open Access

    Jiang HUANG  Xianglin HUANG  Lifang YANG  Zhulin TAO  

     
    PAPER-Multimedia Environment Technology

    Publicized: 2023/08/23  Vol: E107-A No:5  Page(s): 836-844

    We present a novel adversarial, end-to-end framework based on Creative-GAN to generate artistic music conditioned on dance videos. Our proposed framework takes visual and motion posture data as input and adopts a quantized vector as the audio representation to generate complex music corresponding to the input. However, the GAN algorithm merely imitates and reproduces works that humans have created, instead of generating something new and creative. Therefore, we introduce Creative-GAN, which extends the original GAN framework to two discriminators: one determines whether the music is real, and the other classifies the music style. The paper shows that our proposed Creative-GAN can generate novel and interesting music that is not found in the training dataset. To evaluate our model, a comprehensive evaluation scheme is introduced that covers both subjective and objective evaluation. Compared with advanced methods, our method performs better in terms of music rhythm, generation diversity, dance-music correlation, and overall quality of the generated music.
