The search functionality is under construction.

Keyword Search Result

[Keyword] CTI(8214hit)


  • A Novel Anomaly Detection Framework Based on Model Serialization

    Byeongtae PARK  Dong-Kyu CHAE  

    LETTER-Artificial Intelligence, Data Mining

    E107-D No:3

    Recently, multivariate time-series data has been generated in various environments, such as sensor networks and IoT, making anomaly detection in time-series data an essential research topic. Unsupervised learning anomaly detectors identify anomalies by training a model on normal data and producing high residuals for abnormal observations. However, a fundamental issue arises as anomalies do not consistently result in high residuals, necessitating a focus on the time-series patterns of residuals rather than individual residual sizes. In this paper, we present a novel framework comprising two serialized anomaly detectors: the first model calculates residuals as usual, while the second one evaluates the time-series pattern of the computed residuals to determine whether they are normal or abnormal. Experiments conducted on real-world time-series data demonstrate the effectiveness of our proposed framework.
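
    The serialized idea described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' models: the moving-average predictor in stage 1, the (mean, std) window features and z-score test in stage 2, and all function names and parameters are assumptions made for this example. The data are synthetic.

```python
import numpy as np

def residuals(series, window=5):
    """Stage 1 (assumed stand-in): residual = |x_t - mean of the previous `window` points|."""
    res = np.zeros(len(series))
    for t in range(window, len(series)):
        res[t] = abs(series[t] - series[t - window:t].mean())
    return res

def window_features(res, width=10):
    """Summarise each sliding window of residuals by its (mean, std)."""
    wins = np.lib.stride_tricks.sliding_window_view(res, width)
    return np.stack([wins.mean(axis=1), wins.std(axis=1)], axis=1)

def fit_stage2(normal_res, width=10):
    """Stage 2 (assumed stand-in): learn what residual windows look like on normal data."""
    feats = window_features(normal_res, width)
    return feats.mean(axis=0), feats.std(axis=0) + 1e-9

def score_stage2(res, params, width=10):
    """Score each residual window by its largest absolute z-score against the normal windows."""
    mu, sigma = params
    z = (window_features(res, width) - mu) / sigma
    return np.abs(z).max(axis=1)

# Synthetic demonstration: a slow sinusoid with small noise, plus a noise burst in the test series.
rng = np.random.default_rng(0)
t = np.arange(600)
train = np.sin(0.02 * t) + 0.05 * rng.normal(size=t.size)
test = np.sin(0.02 * t) + 0.05 * rng.normal(size=t.size)
test[300:340] += rng.normal(size=40)  # injected anomaly (synthetic)

params = fit_stage2(residuals(train))
train_scores = score_stage2(residuals(train), params)
test_scores = score_stage2(residuals(test), params)
print("highest-scoring window starts at index", int(test_scores.argmax()))
```

    The second stage never thresholds a single residual; it only judges how a window of residuals compares with windows seen on normal data, which is the point the abstract makes.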

  • Hierarchical Latent Alignment for Non-Autoregressive Generation under High Compression Ratio

    Wang XU  Yongliang MA  Kehai CHEN  Ming ZHOU  Muyun YANG  Tiejun ZHAO  

    PAPER-Natural Language Processing

    E107-D No:3

    Non-autoregressive generation has attracted more and more attention due to its fast decoding speed. Latent alignment objectives, such as CTC, are designed to capture the monotonic alignments between the predicted and output tokens, which have been used for machine translation and sentence summarization. However, our preliminary experiments revealed that CTC performs poorly on document abstractive summarization, where a high compression ratio between the input and output is involved. To address this issue, we conduct a theoretical analysis and propose Hierarchical Latent Alignment (HLA). The basic idea is a two-step alignment process: we first align the sentences in the input and output, and subsequently derive token-level alignment using CTC based on aligned sentences. We evaluate the effectiveness of our proposed approach on two widely used datasets XSUM and CNNDM. The results indicate that our proposed method exhibits remarkable scalability even when dealing with high compression ratios.
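
    For readers unfamiliar with CTC, the monotonic alignment it relies on comes from a collapsing map: repeated tokens in a latent path are merged, then blank symbols are dropped. The sketch below shows only this standard CTC map, not the HLA method itself; the blank symbol `_` and the function name are assumptions chosen for illustration.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol for this example

def ctc_collapse(path, blank=BLANK):
    """Map a latent CTC path to its output: merge adjacent repeats, then drop blanks."""
    merged = [tok for tok, _ in groupby(path)]
    return [tok for tok in merged if tok != blank]

path = list("aa_b_bb")
output = ctc_collapse(path)
print(output, "compression ratio:", len(path) / len(output))
```

    Because the latent path is as long as the input, a high compression ratio means most positions must collapse to blanks or repeats.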

  • DanceUnisoner: A Parametric, Visual, and Interactive Simulation Interface for Choreographic Composition of Group Dance

    Shuhei TSUCHIDA  Satoru FUKAYAMA  Jun KATO  Hiromu YAKURA  Masataka GOTO  

    PAPER-Human-computer Interaction

    E107-D No:3

    Composing choreography is challenging because it involves numerous iterative refinements. According to our video analysis and interviews, choreographers typically need to imagine dancers' movements to revise drafts on paper since testing new movements and formations with actual dancers takes time. To address this difficulty, we present an interactive group-dance simulation interface, DanceUnisoner, that assists choreographers in composing a group dance in a simulated environment. With DanceUnisoner, choreographers can arrange excerpts from solo-dance videos of dancers throughout a three-dimensional space. They can adjust various parameters related to the dancers in real time, such as each dancer's position and size and each movement's timing. To evaluate the effectiveness of the system's parametric, visual, and interactive interface, we asked seven choreographers to use it and compose group dances. Our observations, interviews, and quantitative analysis revealed their successful usage in iterative refinements and visual checking of choreography, providing insights to facilitate further computational creativity support for choreographers.

  • An Intra- and Inter-Emotion Transformer-Based Fusion Model with Homogeneous and Diverse Constraints Using Multi-Emotional Audiovisual Features for Depression Detection

    Shiyu TENG  Jiaqing LIU  Yue HUANG  Shurong CHAI  Tomoko TATEYAMA  Xinyin HUANG  Lanfen LIN  Yen-Wei CHEN  


    E107-D No:3

    Depression is a prevalent mental disorder affecting a significant portion of the global population, leading to considerable disability and contributing to the overall burden of disease. Consequently, designing efficient and robust automated methods for depression detection has become imperative. Recently, deep learning methods, especially multimodal fusion methods, have been increasingly used in computer-aided depression detection. Importantly, individuals with depression and those without respond differently to various emotional stimuli, providing valuable information for detecting depression. Building on these observations, we propose an intra- and inter-emotional stimulus transformer-based fusion model to effectively extract depression-related features. The intra-emotional stimulus fusion framework aims to prioritize different modalities, capitalizing on their diversity and complementarity for depression detection. The inter-emotional stimulus model maps each emotional stimulus onto both invariant and specific subspaces using individual invariant and specific encoders. The emotional stimulus-invariant subspace facilitates efficient information sharing and integration across different emotional stimulus categories, while the emotional stimulus specific subspace seeks to enhance diversity and capture the distinct characteristics of individual emotional stimulus categories. Our proposed intra- and inter-emotional stimulus fusion model effectively integrates multimodal data under various emotional stimulus categories, providing a comprehensive representation that allows accurate task predictions in the context of depression detection. We evaluate the proposed model on the Chinese Soochow University students dataset, and the results outperform state-of-the-art models in terms of concordance correlation coefficient (CCC), root mean squared error (RMSE) and accuracy.
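
    The two regression metrics named above have standard definitions, sketched here for reference. The function names and the toy scores are assumptions for illustration, and the population (1/n) variance form of the concordance correlation coefficient is used; the paper may compute it differently.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ccc(y_true, y_pred):
    """Concordance correlation coefficient: 2*cov / (var_t + var_p + (mean_t - mean_p)^2)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return float(2 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2))

# Toy scores (made up for the example).
truth = [1.0, 2.0, 3.0]
print(ccc(truth, truth), ccc(truth, [2.0, 3.0, 4.0]), rmse(truth, [1.0, 2.0, 5.0]))
```

    Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels, which is why the shifted prediction scores below 1.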

  • Feasibility of Estimating Concentration Level of Japanese Document Workers Based on Kana-Kanji Conversion Confirmation Time

    Ryosuke SAEKI  Takeshi HAYASHI  Ibuki YAMAMOTO  Kinya FUJITA  


    E107-D No:3

    This study discusses the feasibility of estimating the concentration level of Japanese document workers using a computer. Based on previous findings that dual-task scenarios increase reaction time, we hypothesized that the Kana-Kanji conversion confirmation time (KKCCT) would increase due to the decrease in cognitive resources allocated to the document task, i.e., the level of concentration on the task at hand. To examine this hypothesis, we conducted a set of experiments in which sixteen participants copied Kana text by typing and concurrently converted it into Kanji under three conditions: Normal, Dual-task, and Mental-fatigue. The results suggested that KKCCT increased when participants were less concentrated on the task due to a subtask or mental fatigue. These findings imply the potential utility of using confirmation time as a measure of concentration level in Japanese document workers.

  • Collecting Balls on a Line by Robots with Limited Energy

    Tesshu HANAKA  Nicolás HONORATO DROGUETT  Kazuhiro KURITA  Hirotaka ONO  Yota OTACHI  


    E107-D No:3

    In this paper, we study BALL COLLECTING WITH LIMITED ENERGY, which is a problem of scheduling robots with limited energy confined to a line to catch moving balls that eventually cross the line. For this problem, we show the NP-completeness of the general case and some algorithmic results for some cases with a small number of robots.

  • Non-Cooperative Rational Synthesis Problem on Stochastic Games for Positional Strategies

    So KOIDE  Yoshiaki TAKATA  Hiroyuki SEKI  


    E107-D No:3

    Synthesis problems on multiplayer non-zero-sum games (MG) with multiple environment players that behave rationally are problems of finding a good strategy for the system, and they have been extensively studied. This paper concerns the synthesis problems on stochastic MG (SMG), where a special controller other than the players, called nature, which chooses a move randomly in its turn, may exist. Two types of synthesis problems on SMG exist: the cooperative rational synthesis problem (CRSP) and the non-cooperative rational synthesis problem (NCRSP). The rationality of environment players is modeled by Nash equilibria, and CRSP is the problem of deciding whether there exists a Nash equilibrium that gives the system a payoff not less than a given threshold. Ummels et al. studied the complexity of CRSP for various classes of objectives and strategies of players. CRSP fits the situation where the system can suggest a strategy profile (a tuple of strategies of all players) to the environment players. However, in real applications, the system may rarely have an opportunity to make suggestions to the environment, and thus CRSP is optimistic. NCRSP is the problem of deciding whether there exists a strategy σ_0 of the system such that, for every strategy profile of the environment players that forms a 0-fixed Nash equilibrium (a Nash equilibrium where the system's strategy is fixed to σ_0), the system obtains a payoff not less than a given threshold. In this paper, we investigate the complexity of NCRSP for positional (i.e., pure memoryless) strategies. We consider ω-regular objectives as the model of players' objectives and show complexity results for the problem for several subclasses of ω-regular objectives. In particular, the problem for terminal reachability (TR) objectives is shown to be Σ^p_2-complete.

  • Prediction of Residual Defects after Code Review Based on Reviewer Confidence

    Shin KOMEDA  Masateru TSUNODA  Keitaro NAKASAI  Hidetake UWANO  


    E107-D No:3

    A major approach to enhancing software quality is reviewing the source code to identify defects. To aid in identifying flaws, an approach in which a machine learning model predicts residual defects after implementing a code review is adopted. After the model has predicted the existence of residual defects, a second-round review is performed to identify such residual flaws. To enhance the prediction accuracy of the model, information known to developers but not recorded as data is utilized. Confidence in the review is evaluated by reviewers using a 10-point scale. The assessment result is used as an independent variable of the prediction model of residual defects. Experimental results indicate that confidence improves the prediction accuracy.

  • Uniaxially Symmetrical T-Junction OMT with 45°-Tilted Branch Waveguide Ports


    PAPER-Electromagnetic Theory

    E107-C No:3

    A T-junction orthomode transducer (OMT) is a waveguide component that separates two orthogonal linear polarizations in the same frequency band. It has a common circular waveguide short-circuited at one end and two branch rectangular waveguides arranged in opposite directions near the short circuit. One of the advantages of a T-junction OMT is its short axial length. However, the two rectangular ports, which need to be orthogonal, have different levels of performance because of asymmetry. We therefore propose a uniaxially symmetrical T-junction OMT, which is configured such that the two branch waveguides are tilted 45° to the short circuit. The uniaxially symmetrical configuration enables the same level of performance for the two ports, and its impedance matching is easier than that of the conventional configuration. The polarization separation principle can be explained using the principles of the orthomode junction (OMJ) and the turnstile OMT. Based on calculations, the proposed configuration demonstrated a return loss of 25dB, XPD of 30dB, isolation of 21dB between the two branch ports, and loss of 0.25dB, with a bandwidth of 15% in the K band. The OMT was then fabricated as a single piece via 3D printing and evaluated against the calculated performance indices.

  • Low Complexity Overloaded MIMO Non-Linear Detector with Iterative LLR Estimation

    Satoshi DENNO  Shuhei MAKABE  Yafei HOU  

    PAPER-Wireless Communication Technologies

    E107-B No:3

    This paper proposes a non-linear overloaded MIMO detector that outperforms the conventional soft-input maximum likelihood detector (MLD) with less computational complexity. We propose iterative log-likelihood ratio (LLR) estimation and multi-stage LLR estimation for the proposed detector to achieve such superior performance. While the iterative LLR estimation achieves better BER performance, the multi-stage LLR estimation makes the detector less complex than the conventional soft-input MLD. The computer simulation reveals that the proposed detector achieves about 0.6dB better BER performance than the soft-input MLD with about half of the soft-input MLD's complexity in a 6×3 overloaded MIMO OFDM system.

  • On a Spectral Lower Bound of Treewidth

    Tatsuya GIMA  Tesshu HANAKA  Kohei NORO  Hirotaka ONO  Yota OTACHI  


    E107-D No:3

    In this letter, we present a new lower bound for the treewidth of a graph in terms of the second smallest eigenvalue of its Laplacian matrix. Our bound slightly improves the lower bound given by Chandran and Subramanian [Inf. Process. Lett., 87 (2003)].
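
    The quantity the bound is stated in, the second smallest eigenvalue of the graph Laplacian (often called the algebraic connectivity), can be computed as in the sketch below. The function name and the two small test graphs are assumptions for illustration; the bound itself is not reproduced here.

```python
import numpy as np

def laplacian_lambda2(adj):
    """Second smallest eigenvalue of L = D - A for an undirected graph given by adjacency matrix A."""
    adj = np.asarray(adj, float)
    lap = np.diag(adj.sum(axis=1)) - adj
    eigenvalues = np.linalg.eigvalsh(lap)  # symmetric matrix, eigenvalues in ascending order
    return float(eigenvalues[1])

# Path on 3 vertices (treewidth 1) and complete graph on 4 vertices (treewidth 3).
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
k4 = np.ones((4, 4)) - np.eye(4)
print(laplacian_lambda2(path3), laplacian_lambda2(k4))
```

    The sparse path has a small second eigenvalue while the dense complete graph has a large one, which matches the intuition that spectral lower bounds on treewidth grow with connectivity.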

  • Dynamic Attentive Convolution for Facial Beauty Prediction

    Zhishu SUN  Zilong XIAO  Yuanlong YU  Luojun LIN  

    LETTER-Image Recognition, Computer Vision

    E107-D No:2

    Facial Beauty Prediction (FBP) is a significant pattern recognition task that aims to assess facial attractiveness consistently with human perception. Currently, Convolutional Neural Networks (CNNs) have become the mainstream method for FBP. The training objective of most conventional CNNs is to learn static convolution kernels, which, however, makes it difficult for the network to capture global attentive information, so it often ignores key facial regions, e.g., the eyes and nose. To tackle this problem, we devise a new form of convolution, Dynamic Attentive Convolution (DyAttenConv), which integrates the dynamic and attention mechanisms into convolution at the kernel level, with the aim of adapting the convolution kernels to each face dynamically. DyAttenConv is a plug-and-play module that can be flexibly combined with existing CNN architectures, making the acquisition of beauty-related features more global and attentive. Extensive ablation studies show that our method is superior to other fusion and attention mechanisms, and the comparison with other state-of-the-art methods also demonstrates the effectiveness of DyAttenConv on the facial beauty prediction task.

  • Re-Evaluating Syntax-Based Negation Scope Resolution

    Asahi YOSHIDA  Yoshihide KATO  Shigeki MATSUBARA  

    LETTER-Natural Language Processing

    E107-D No:1

    Negation scope resolution is the process of detecting the negated part of a sentence. Unlike the syntax-based approach employed in previous research, state-of-the-art methods performed better without the explicit use of syntactic structure. This work revisits the syntax-based approach and re-evaluates the effectiveness of syntactic structure in negation scope resolution. We replace the parser utilized in the prior works with state-of-the-art parsers and modify the syntax-based heuristic rules. The experimental results demonstrate that the simple modifications enhance the performance of the prior syntax-based method to the same level as state-of-the-art end-to-end neural-based methods.

  • Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection

    Shinji UCHINOURA  Takio KURITA  

    PAPER-Image Recognition, Computer Vision

    E107-D No:1

    We investigated the influence of horizontal shifts of the input images on one-stage object detection methods. We found that the object detector class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by the grid-level shifts and are used in training. The effectiveness of the proposed approaches is demonstrated using the COCO validation set after applying the proposed method to the FCOS architecture.

  • Efficient Action Spotting Using Saliency Feature Weighting

    Yuzhi SHI  Takayoshi YAMASHITA  Tsubasa HIRAKAWA  Hironobu FUJIYOSHI  Mitsuru NAKAZAWA  Yeongnam CHAE  Björn STENGER  

    PAPER-Image Processing and Video Processing

    E107-D No:1

    Action spotting is a key component in high-level video understanding. The large number of similar frames poses a challenge for recognizing actions in videos. In this paper, we use frame saliency to represent the importance of frames for guiding the model to focus on keyframes. We propose the frame saliency weighting module to improve frame saliency and video representation at the same time. Our proposed model contains two encoders, for pre-action and post-action time windows, to encode video context. We validate our design choices and the generality of the proposed method in extensive experiments. On the public SoccerNet-v2 dataset, the method achieves an average mAP of 57.3%, improving over the state of the art. Using embedding features obtained from multiple feature extractors, the average mAP further increases to 75%. We show that reducing the model size by over 90% does not significantly impact performance. Additionally, we use ablation studies to demonstrate the effectiveness of the saliency weighting module. Further, we show that our frame saliency weighting strategy is applicable to existing methods on more general action datasets, such as SoccerNet-v1, ActivityNet v1.3, and UCF101.

  • Node-to-Set Disjoint Paths Problem in Cross-Cubes

    Rikuya SASAKI  Hiroyuki ICHIDA  Htoo Htoo Sandi KYAW  Keiichi KANEKO  

    PAPER-Fundamentals of Information Systems

    E107-D No:1

    The increasing demand for high-performance computing in recent years has led to active research on massively parallel systems. The interconnection network in a massively parallel system interconnects hundreds of thousands of processing elements so that they can process large tasks while communicating with one another. By regarding the processing elements as nodes and the links between processing elements as edges, we can discuss various problems of interconnection networks in the framework of graph theory. Many topologies have been proposed for interconnection networks of massively parallel systems. The hypercube is a very popular topology and it has many variants. The cross-cube is one such topology, which can be obtained by adding one extra edge to each node of the hypercube. The cross-cube reduces the diameter of the hypercube and allows cycles of odd lengths. Therefore, we focus on the cross-cube and propose an algorithm that constructs disjoint paths from a node to a set of nodes. We give a proof of correctness of the algorithm. Also, we show that the time complexity and the maximum path length of the algorithm are O(n^3 log n) and 2n - 3, respectively. Moreover, we estimate that the average execution time of the algorithm is O(n^2) based on a computer experiment.

  • CASEformer — A Transformer-Based Projection Photometric Compensation Network

    Yuqiang ZHANG  Huamin YANG  Cheng HAN  Chao ZHANG  Chaoran ZHU  


    E107-D No:1

    In this paper, we present a novel photometric compensation network named CASEformer, which is built upon the Swin module. For the first time, we combine coordinate attention and channel attention mechanisms to extract rich features from input images. Employing a multi-level encoder-decoder architecture with skip connections, we establish multiscale interactions between projection surfaces and projection images, achieving precise inference and compensation. Furthermore, through an attention fusion module, which simultaneously leverages both coordinate and channel information, we enhance the global context of feature maps while preserving enhanced texture coordinate details. The experimental results demonstrate the superior compensation effectiveness of our approach compared to the current state-of-the-art methods. Additionally, we propose a method for multi-surface projection compensation, further enriching our contributions.

  • Location and History Information Aided Efficient Initial Access Scheme for High-Speed Railway Communications

    Chang SUN  Xiaoyu SUN  Jiamin LI  Pengcheng ZHU  Dongming WANG  Xiaohu YOU  

    PAPER-Wireless Communication Technologies

    E107-B No:1

    The application of millimeter wave (mmWave) directional transmission technology in high-speed railway (HSR) scenarios helps to achieve the goal of multiple gigabit data rates with low latency. However, due to the high mobility of trains, the traditional initial access (IA) scheme, with its high time consumption, struggles to guarantee effective beam alignment. In addition, the high path loss at the coverage edge of the millimeter wave remote radio unit (mmW-RRU) also poses great challenges to the stability of IA performance. Fortunately, the train trajectory in HSR scenarios is periodic and regular. Moreover, the cell-free network helps to improve the system coverage performance. Based on these observations, this paper proposes an efficient IA scheme based on location and history information in cell-free networks, where the train can flexibly select a set of mmW-RRUs according to the received signal quality. We specifically analyze the collaborative IA process based on the exhaustive search and based on location and history information, derive expressions for IA success probability and delay, and perform the numerical analysis. The results show that the proposed scheme can significantly reduce the IA delay and effectively improve the stability of IA success probability.

  • Investigation of a Non-Contact Bedsore Detection System

    Tomoki CHIBA  Yusuke ASANO  Masaharu TAKAHASHI  

    PAPER-Antennas and Propagation

    E107-B No:1

    The proportion of persons over 65 years old is projected to increase worldwide between 2022 and 2050. The increasing burden on medical staff and the shortage of human resources are growing problems. Bedsores are injuries caused by prolonged pressure on the skin and stagnation of blood flow. The more the damage caused by bedsores progresses, the longer the treatment period becomes. Moreover, patients require surgery in some serious cases. Therefore, early detection is essential. In our research, we are developing a non-contact bedsore detection system using electromagnetic waves at 10.5GHz. In this paper, we extracted appropriate information from a scalogram and utilized it to detect the sizes of bedsores. In addition, experiments using a phantom were conducted to confirm the basic operation of the bedsore detection system. As a result, using the approximate curves and lines obtained from prior analysis data, it was possible to estimate the volume of each defected area, as well as combinations of the depth of the defected area and the length of the defected area. Moreover, the experiments showed that it was possible to detect bedsore presence and estimate their sizes, although the detection results had slight variations.

  • Belief Propagation Detection with MRC Reception and MMSE Pre-Cancellation for Overloaded MIMO

    Yuto SUZUKI  Yukitoshi SANADA  

    PAPER-Transmission Systems and Transmission Equipment for Communications

    E107-B No:1

    In this paper, belief propagation (BP) multi-input multi-output (MIMO) detection with maximum ratio combining (MRC) and minimum mean square error (MMSE) pre-cancellation is proposed for overloaded MIMO. The proposed scheme applies MRC before MMSE pre-cancellation. The BP MIMO detection with MMSE pre-cancellation leads to a reduction in diversity gain due to the decreased number of connections between variable nodes and observation nodes in a factor graph. MRC increases the diversity gain and contributes to improving bit error rate (BER) performance. Numerical results obtained through computer simulation show that the proposed BP MIMO detection with MRC and MMSE pre-cancellation performs approximately 0.5dB better than conventional BP MIMO detection with MMSE pre-cancellation at a BER of 10^-3.
