Highly conflicting evidence, which may lead to counter-intuitive results, is one of the challenges for information fusion in Dempster-Shafer evidence theory. To deal with this issue, evidence conflict is investigated based on belief divergence, which measures the discrepancy between pieces of evidence. In this paper, the pignistic probability transform belief χ2 divergence, named the BBχ2 divergence, is proposed. By introducing the pignistic probability transform, the proposed BBχ2 divergence can accurately quantify the difference between pieces of evidence while taking multi-element sets into account. Compared with several existing belief divergences, the novel divergence is more precise. Based on this divergence, a new multi-source information fusion method is devised. The proposed method considers both credibility weights and information volume weights to determine the overall weight of each piece of evidence. Finally, the proposed method is applied to target recognition and fault diagnosis, where comparative analysis indicates that it achieves the highest accuracy in managing evidence conflict.
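As a rough illustration of the two ingredients named in this abstract, the sketch below applies the standard pignistic probability transform to two basic probability assignments and then compares the results with a symmetric χ2 divergence. The abstract does not give the exact BBχ2 formula, so the divergence used here, the function names, and the toy masses are all assumptions made for illustration only.

```python
def pignistic(bpa):
    """Pignistic transform: split the mass of each focal set evenly among its elements.

    `bpa` maps frozensets (focal sets) to masses; the empty set is assumed to carry no mass.
    """
    betp = {}
    for focal_set, mass in bpa.items():
        share = mass / len(focal_set)
        for element in focal_set:
            betp[element] = betp.get(element, 0.0) + share
    return betp


def symmetric_chi2(p, q):
    """Symmetric chi-square divergence, sum of (p - q)^2 / (p + q).

    This is an assumed stand-in, not the paper's BBchi2 definition.
    """
    total = 0.0
    for element in set(p) | set(q):
        pi, qi = p.get(element, 0.0), q.get(element, 0.0)
        if pi + qi > 0:
            total += (pi - qi) ** 2 / (pi + qi)
    return total


# Toy evidence over the frame {A, B}; the multi-element set {A, B} carries mass.
m1 = {frozenset("A"): 0.6, frozenset("AB"): 0.4}
m2 = {frozenset("B"): 0.7, frozenset("AB"): 0.3}

divergence = symmetric_chi2(pignistic(m1), pignistic(m2))
print(f"divergence between m1 and m2: {divergence:.4f}")
```

Because the transform redistributes the mass of multi-element sets before the comparison, two bodies of evidence that differ only in how they assign mass to compound hypotheses still produce a nonzero divergence.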
Haochen LYU Jianjun LI Yin YE Chin-Chen CHANG
The purpose of Facial Beauty Prediction (FBP) is to automatically assess facial attractiveness based on human aesthetics. Most neural network-based prediction methods do not consider the ranking information in the task. For scoring tasks like facial beauty prediction, there is abundant ranking information both between images and within images. Reasonable utilization of this information during training can greatly improve the performance of the model. In this paper, we propose a novel end-to-end Convolutional Neural Network (CNN) model based on ranking information of images, incorporating a Rank Module and an Adaptive Weight Module. We also design pairwise ranking loss functions to fully leverage the ranking information of images. Considering training efficiency and model inference capability, we choose ResNet-50 as the backbone network. We conduct experiments on the SCUT-FBP5500 dataset, and the results show that our model achieves new state-of-the-art performance. Furthermore, ablation experiments show that our approach greatly contributes to improving the model performance. Finally, the Rank Module with the corresponding ranking loss is plug-and-play and can be extended to any CNN model and any task with ranking information. Code is available at https://github.com/nehcoah/Rank-Info-Net.
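To make the idea of a pairwise ranking loss concrete, here is a minimal sketch of a generic margin-based (hinge) formulation over a batch of predicted scores. It is not the loss defined in the paper or in the linked repository; the function name, the margin value, and the toy scores are assumptions.

```python
def pairwise_ranking_loss(pred, target, margin=0.1):
    """Mean hinge loss over all pairs whose ground-truth scores differ.

    A pair contributes zero loss when the predicted scores are ordered like
    the ground truth and separated by at least `margin`.
    """
    losses = []
    n = len(pred)
    for i in range(n):
        for j in range(i + 1, n):
            if target[i] == target[j]:
                continue  # no ranking information in a tied pair
            sign = 1.0 if target[i] > target[j] else -1.0
            losses.append(max(0.0, margin - sign * (pred[i] - pred[j])))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical beauty scores for three images.
ground_truth = [4.5, 3.0, 1.2]
well_ordered = [3.0, 2.0, 1.0]
mis_ordered = [1.0, 2.0, 3.0]

print(pairwise_ranking_loss(well_ordered, ground_truth))
print(pairwise_ranking_loss(mis_ordered, ground_truth))
```

In a training setup, a term like this would typically be added to the usual regression loss so that the network is penalised for ordering errors as well as for score errors.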
The study proposes a personalised session-based recommender system that embeds items by using Word2Vec and sequentially updates the session and user embeddings with the hierarchicalization and aggregation of item embeddings. To process a recommendation request, the system constructs a real-time user embedding that considers users’ general preferences and sequential behaviour to handle short-term changes in user preferences with a low computational cost. The system performance was experimentally evaluated in terms of the accuracy, diversity, and novelty of the ranking of recommended items and the training and prediction times of the system for three different datasets. The results of these evaluations were then compared with those of the five baseline systems. According to the evaluation experiment, the proposed system achieved a relatively high recommendation accuracy compared with baseline systems and the diversity and novelty scores of the proposed system did not fall below 90% for any dataset. Furthermore, the training times of the Word2Vec-based systems, including the proposed system, were shorter than those of FPMC and GRU4Rec. The evaluation results suggest that the proposed recommender system succeeds in keeping the computational cost for training low while maintaining high-level recommendation accuracy, diversity, and novelty.
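The following sketch illustrates, under simplifying assumptions, how a real-time user embedding could be aggregated from item embeddings. Session embeddings are taken as the mean of their item vectors, the user's general preference as the mean of past session embeddings, and the two are blended with a hypothetical weight `alpha`. The item vectors are hand-written stand-ins for trained Word2Vec embeddings, and none of the names come from the paper.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def realtime_user_embedding(past_sessions, current_session, item_vecs, alpha=0.5):
    """Blend general preference (past sessions) with the current session."""
    general = mean_vector(
        [mean_vector([item_vecs[i] for i in session]) for session in past_sessions]
    )
    recent = mean_vector([item_vecs[i] for i in current_session])
    return [alpha * g + (1 - alpha) * r for g, r in zip(general, recent)]


def recommend(user_vec, item_vecs, exclude, k=2):
    """Rank unseen items by cosine similarity to the user embedding."""
    candidates = [item for item in item_vecs if item not in exclude]
    candidates.sort(key=lambda item: cosine(user_vec, item_vecs[item]), reverse=True)
    return candidates[:k]


# Toy 2-dimensional item embeddings (placeholders for Word2Vec output).
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
user_vec = realtime_user_embedding([["a"]], ["c"], item_vecs, alpha=0.2)
print(recommend(user_vec, item_vecs, exclude={"a", "c"}))
```

A small `alpha` lets the current session dominate, which is one simple way to react to short-term changes in preference without retraining anything at request time.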
Li HE Xiaowu ZHANG Jianyong DUAN Hao WANG Xin LI Liang ZHAO
Chinese spelling correction (CSC) models detect and correct a text typo based on the misspelled character and its context. Recently, BERT-based models have dominated the research of Chinese spelling correction. However, these methods only focus on the semantic information of the text during the pretraining stage, neglecting the learning of correcting spelling errors. Moreover, when multiple incorrect characters are in the text, the context introduces noisy information, making it difficult for the model to accurately detect the positions of the incorrect characters and leading to false corrections. To address these limitations, we apply the multimodal pre-trained language model ChineseBERT to the task of spelling correction. We propose a self-distillation learning-based pretraining strategy, where a confusion set is used to construct text containing erroneous characters, allowing the model to jointly learn how to understand language and correct spelling errors. Additionally, we introduce a single-channel masking mechanism to mitigate the noise caused by the incorrect characters. This mechanism masks the semantic encoding channel while preserving the phonetic and glyph encoding channels, reducing the noise introduced by incorrect characters during the prediction process. Finally, experiments are conducted on widely used benchmarks. Our model outperforms state-of-the-art methods by a remarkable margin.
Tomohiko UYEMATSU Tetsunao MATSUTA
This paper proposes three new information measures for individual sequences and clarifies their properties. Our new information measures are called the non-overlapping max-entropy, the overlapping smooth max-entropy, and the non-overlapping smooth max-entropy, respectively. These measures are related to the fixed-length coding of individual sequences. We investigate these measures and show the following three properties: (1) The non-overlapping max-entropy coincides with the topological entropy. (2) The overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the Ziv-entropy. (3) When an individual sequence is drawn from an ergodic source, the overlapping smooth max-entropy and the non-overlapping smooth max-entropy coincide with the entropy rate of the source. Further, we apply these information measures to the fixed-length coding of individual sequences and propose some new universal coding schemes which are asymptotically optimum.
Information-theoretic security and computational security are fundamental paradigms of security in the theory of cryptography. The two paradigms interact with each other but have shown different progress, which motivates us to explore the intersection between them. In this paper, we focus on Multi-Party Computation (MPC) because the security of MPC is formulated by simulation-based security, which originates from computational security, even if it requires information-theoretic security. We provide several equivalent formalizations of the security of MPC under a semi-honest model from the viewpoints of information theory and statistics. The interpretations of these variants are so natural that they support the other aspects of simulation-based security. Specifically, the variants based on conditional mutual information and sufficient statistics are interesting because security proofs for those variants can be given by information measures and the factorization theorem, respectively. To exemplify this, we show several security proofs of the BGW (Ben-Or, Goldwasser, Wigderson) protocols, which are basically proved by constructing a simulator.
Ren TAKEUCHI Rikima MITSUHASHI Masakatsu NISHIGAKI Tetsushi OHKI
The war between cyber attackers and security analysts is gradually intensifying. Owing to the ease of obtaining and creating support tools, recent malware continues to diversify into variants and new species. This increases the burden on security analysts and hinders quick analysis. Identifying malware families is crucial for efficiently analyzing diversified malware; thus, numerous low-cost, general-purpose, deep-learning-based classification techniques have been proposed in recent years. Among these methods, malware images that represent binary features as images are often used. However, no models or architectures specific to malware classification have been proposed in previous studies. Herein, we conduct a detailed analysis of the behavior and structure of malware and focus on PE sections that capture the unique characteristics of malware. First, we validate the features of each PE section that can distinguish malware families. Then, we identify PE sections that contain adequate features to classify families. Further, we propose an ensemble learning-based classification method that combines features of highly discriminative PE sections to improve classification accuracy. Validation on two datasets confirms that the proposed method improves accuracy over the baseline, thereby emphasizing its importance.
Secure two-party computation is a cryptographic tool that enables two parties to compute a function jointly without revealing their inputs. It is known that any function can be realized in the correlated randomness (CR) model, where a trusted dealer distributes input-independent CR to the parties beforehand. Sometimes we can construct a more efficient secure two-party protocol for a function g than for a function f, where g is a restriction of f. However, it is not known in which cases we can construct a more efficient protocol for a domain-restricted function. In this paper, we focus on the size of CR. We prove that we can construct a more efficient protocol for a domain-restricted function when there is a “good” structure in the CR space of a protocol for the original function, and show a unified way to construct a more efficient protocol in such a case. In addition, we show two applications of the above result: the first application shows that some known techniques for reducing the CR size for domain-restricted functions can be derived in a unified way, and the second application shows that we can construct a more efficient protocol than an existing one using our result.
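For readers unfamiliar with the correlated randomness model, the sketch below shows a textbook example rather than the construction of this paper: a trusted dealer hands out XOR shares of a Beaver triple (a, b, c = a AND b), and the two parties then compute XOR shares of the AND of their private input bits. The function names are hypothetical, and the two "parties" run inside one process purely for illustration.

```python
import secrets


def deal_triple():
    """Trusted dealer: XOR-share a random triple (a, b, c) with c = a AND b.

    The triple is independent of the parties' inputs, as required in the CR model.
    """
    a, b = secrets.randbits(1), secrets.randbits(1)
    c = a & b
    a0, b0, c0 = secrets.randbits(1), secrets.randbits(1), secrets.randbits(1)
    return (a0, b0, c0), (a ^ a0, b ^ b0, c ^ c0)


def secure_and(x, y, cr_alice, cr_bob):
    """Return XOR shares (z0, z1) of x AND y.

    Alice holds x and cr_alice; Bob holds y and cr_bob. Only the masked
    values d and e are exchanged between the parties.
    """
    a0, b0, c0 = cr_alice
    a1, b1, c1 = cr_bob
    d = (x ^ a0) ^ a1  # opens x XOR a; a is uniform and unknown to each party
    e = b0 ^ (y ^ b1)  # opens y XOR b; b is uniform and unknown to each party
    z0 = c0 ^ (d & b0) ^ (e & a0) ^ (d & e)  # Alice's output share
    z1 = c1 ^ (d & b1) ^ (e & a1)            # Bob's output share
    return z0, z1


cr_alice, cr_bob = deal_triple()
z0, z1 = secure_and(1, 1, cr_alice, cr_bob)
print(z0 ^ z1)
```

In this toy protocol each party stores three bits of CR per AND gate; the size of that stored randomness is the kind of cost the abstract refers to when it speaks of the size of CR.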
Shuichi MAEDA Akihiro FUKAMI Kaiki YAMAZAKI
Information that is invisible to the human eye offers several benefits. “Invisible” here means that it can still be visualized or quantified when using instruments. For example, it can improve security without compromising product design. We have succeeded in making an invisible digital image on a metal substrate using periodic repeatability by thin-film interference of niobium oxides. Although this digital information is invisible in the visible light wavelength range of 400-800 nm, it is detectable in the infrared range of 800-1150 nm. This technology has the potential to be applied to anti-counterfeiting and traceability.
Tomoki MINAMATA Hiroki HAMASAKI Hiroshi KAWASAKI Hajime NAGAHARA Satoshi ONO
This paper proposes a novel application of coded apertures (CAs) for visual information hiding. CA is one of the representative computational photography techniques, in which a patterned mask is attached to a camera as an alternative to a conventional circular aperture. With image processing in the post-processing phase, various functions such as omnifocal image capturing and depth estimation can be performed. In general, a watermark embedded as high-frequency components is difficult to extract if it is captured outside the focal length and defocus blur occurs. Installing a CA in the camera is a simple way to mitigate this difficulty, and several attempts have been made to design CAs for stable extraction. In contrast, our motivation is to design a specific CA together with an information hiding scheme such that the secret information can only be decoded if an image with hidden information is captured with the key aperture at a certain distance outside the focus range. The proposed technique designs the key aperture patterns and information hiding scheme through evolutionary multi-objective optimization so as to minimize the decryption error of a hidden image when using the key aperture while minimizing the decoding accuracy when using other apertures. During the optimization process, solution candidates, i.e., key aperture patterns and information hiding schemes, are evaluated on actual devices to account for disturbances that cannot be considered in optical simulations. Experimental results have shown that decoding can be performed with the designed key aperture and similar ones, that decrypted image quality deteriorates as the similarity between the key and the aperture used for decryption decreases, and that the proposed information hiding technique works on actual devices.
Chang SUN Xiaoyu SUN Jiamin LI Pengcheng ZHU Dongming WANG Xiaohu YOU
The application of millimeter wave (mmWave) directional transmission technology in high-speed railway (HSR) scenarios helps to achieve the goal of multiple gigabit data rates with low latency. However, due to the high mobility of trains, the traditional initial access (IA) scheme, with its high time consumption, can hardly guarantee effective beam alignment. In addition, the high path loss at the coverage edge of the millimeter wave remote radio unit (mmW-RRU) also poses great challenges to the stability of IA performance. Fortunately, the train trajectory in HSR scenarios is periodic and regular. Moreover, the cell-free network helps to improve the system coverage performance. Based on these observations, this paper proposes an efficient IA scheme based on location and history information in cell-free networks, where the train can flexibly select a set of mmW-RRUs according to the received signal quality. We specifically analyze the collaborative IA process based on exhaustive search and the one based on location and history information, derive expressions for IA success probability and delay, and perform numerical analysis. The results show that the proposed scheme can significantly reduce the IA delay and effectively improve the stability of IA success probability.
Hitoshi ASAEDA Kazuhisa MATSUZONO Yusaku HAYAMIZU Htet Htet HLAING Atsushi OOKA
Information-Centric Networking (ICN) is an innovative technology that provides low-loss, low-latency, high-throughput, and high-reliability communications for diversified and advanced services and applications. In this article, we present a technical survey of ICN functionalities such as in-network caching, routing, transport, and security mechanisms, as well as recent research findings. We focus on CCNx, which is a prominent ICN protocol whose message types are defined by the Internet Research Task Force. To facilitate the development of functional code and encourage application deployment, we introduce an open-source software platform called Cefore that facilitates CCNx-based communications. Cefore consists of networking components such as packet forwarding and in-network caching daemons, and it provides APIs and a Python wrapper program that enable users to easily develop CCNx applications on Cefore. We introduce a Mininet-based Cefore emulator and lightweight Docker containers for running CCNx experiments on Cefore. In addition to exploring ICN features and implementations, we also consider promising research directions for further innovation.
Jiansheng BAI Jinjie YAO Yating HOU Zhiliang YANG Liming WANG
Modulated signal detection has been rapidly advancing in various wireless communication systems, as it is a core technology of spectrum sensing. To address the non-Gaussian statistics of noise in radio channels, especially its impulsive characteristics in the time/frequency domain, this paper proposes a method based on Information Geometric Difference Mapping (IGDM) to solve the signal detection problem under Alpha-stable distribution (α-stable) noise and improve performance under low Generalized Signal-to-Noise Ratio (GSNR). Scale Mixtures of Gaussians are used to approximate the probability density function (PDF) of signals and model the statistical moments of observed data. Drawing on the principles of information geometry, we map the PDFs of different types of data into a manifold space. Through the application of statistical moment models, the signal is projected as coordinate points within the manifold structure. We then design a dual-threshold mechanism based on the geometric mean and use the Kullback-Leibler divergence (KLD) to measure the information distance between coordinates. Numerical simulations and experiments were conducted to demonstrate the superiority of IGDM for detecting multiple modulated signals in non-Gaussian noise. The results show that IGDM is adaptable and effective under extremely low GSNR.
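The role of the KLD as an information distance between moment-based coordinates can be illustrated with the simplest possible case. The sketch below estimates the mean and variance of two sample sets and evaluates the closed-form KLD between the corresponding univariate Gaussians. This is a deliberate simplification: the abstract works with Scale Mixtures of Gaussians under α-stable noise, which a single Gaussian does not capture, and the function names and sample values are assumptions.

```python
import math


def sample_moments(samples):
    """Return the sample mean and the (biased) sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    return mean, variance


def gaussian_kld(mu1, var1, mu2, var2):
    """KL divergence KL(N(mu1, var1) || N(mu2, var2)) for univariate Gaussians."""
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)


# Hypothetical observations: a noise-only reference and a window under test.
noise_only = [-1.0, 0.5, 0.2, -0.4, 0.7, -0.1]
observed = [1.8, 2.5, 1.1, 2.9, 2.2, 1.6]

distance = gaussian_kld(*sample_moments(observed), *sample_moments(noise_only))
print(f"KLD between observed window and noise reference: {distance:.3f}")
```

A detector built on this idea would compare such a distance against one or more thresholds; how the paper's dual thresholds are derived from the geometric mean is not specified in the abstract and is not reproduced here.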
Leif Katsuo OXENLØWE Quentin SAUDAN Jasper RIEBESEHL Mujtaba ZAHIDY Smaranika SWAIN
This paper summarizes recent reports on the internet's energy consumption and the internet's benefits for climate action. It discusses energy efficiency and the need for a common standard for evaluating the climate impact of future communication technologies, and suggests a model that can be adapted to different internet applications such as streaming, online reading and downloading. The two main approaches today are based on how much data is transmitted or how much time the data is under way. The paper concludes that there is a need for a standardized method to estimate energy consumption and CO2 emission related to internet services. This standard should include a method for energy-optimizing future networks, where every Wh will be scrutinized.
Jinsheng WEI Haoyu CHEN Guanming LU Jingjie YAN Yue XIE Guoying ZHAO
Micro-expression recognition (MER) draws intensive research interest as micro-expressions (MEs) can infer genuine emotions. Prior information can guide the model to learn discriminative ME features effectively. However, most works focus on researching general models with a stronger representation ability to adaptively aggregate ME movement information in a holistic way, which may ignore the prior information and properties of MEs. To solve this issue, driven by the prior information that the category of an ME can be inferred from the relationship between the actions of different facial components, this work designs a novel model that conforms to this prior information and learns ME movement features in an interpretable way. Specifically, this paper proposes a Decomposition and Reconstruction-based Graph Representation Learning (DeRe-GRL) model to effectively learn high-level ME features. DeRe-GRL includes two modules: Action Decomposition Module (ADM) and Relation Reconstruction Module (RRM), where ADM learns action features of facial key components and RRM explores the relationship between these action features. Based on facial key components, ADM divides the geometric movement features extracted by the graph model-based backbone into several sub-features, and learns the map matrix to map these sub-features into multiple action features; then, RRM learns weights to weight all action features to build the relationship between action features. The experimental results demonstrate the effectiveness of the proposed modules, and the proposed method achieves competitive performance.
Tomohiko YANO Hiroki KUZUNO Kenichi MAGATA
Information leakage is a significant threat to organizations, and effective measures are required to protect information assets. As confidential files can be leaked through various paths, a countermeasure is necessary to prevent information leakage from various paths, from simple drag-and-drop movements to complex transformations such as encryption and encoding. However, with existing methods it is difficult to take countermeasures suited to each information leakage path. Furthermore, it is also necessary to create a visualization format in which information leakage can be found easily and a method that can remove unnecessary parts while leaving the necessary parts of information leakage to improve visibility. This paper proposes a new information leakage countermeasure method that incorporates file tracking and visualization. The file tracking component recursively extracts all events related to confidential files. Therefore, tracking is possible even when data have been transformed significantly from the original file. The visualization component represents the results of file tracking as a network graph. This allows security administrators to find information leakage even if a file is transformed through multiple events. Furthermore, by pruning the network graph using the frequency of past events, the indicators of information leakage can be more easily found by security administrators. In the experiments, network graphs were generated for two information leakage scenarios in which files were moved and copied. The visualization results were obtained according to the scenarios, and the network graph was pruned to reduce vertices by 17.6% and edges by 10.9%.
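The tracking and pruning steps described in this abstract can be sketched as a small graph computation. In the sketch below, events are hypothetical `(source, action, destination)` tuples; tracking follows every event whose source is already known to derive from the confidential file, and pruning drops edges that appeared often in a log of past events, on the assumption that frequent events are routine. The event format, the pruning rule, and the threshold are all assumptions, and event timestamps are ignored for brevity.

```python
from collections import Counter


def track_file(events, confidential):
    """Collect all edges reachable from the confidential file.

    Repeats passes over the event list until no new derived file is found,
    so chains such as copy -> encrypt -> upload are followed to the end.
    """
    tracked = {confidential}
    edges = []
    changed = True
    while changed:
        changed = False
        for src, action, dst in events:
            edge = (src, action, dst)
            if src in tracked and edge not in edges:
                edges.append(edge)
                tracked.add(dst)
                changed = True
    return edges


def prune(edges, past_events, max_count=2):
    """Drop edges seen more than `max_count` times in the past (assumed routine)."""
    frequency = Counter(past_events)
    return [edge for edge in edges if frequency[edge] <= max_count]


events = [
    ("secret.doc", "copy", "tmp.doc"),
    ("tmp.doc", "encrypt", "tmp.enc"),
    ("tmp.enc", "upload", "cloud"),
    ("notes.txt", "copy", "backup"),  # unrelated to the confidential file
]
history = [("secret.doc", "copy", "tmp.doc")] * 5

graph = track_file(events, "secret.doc")
print(prune(graph, history))
```

The remaining edges form the network graph an administrator would inspect; in this toy run the routine copy is pruned while the unusual encrypt and upload steps stay visible.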
Xiaoguang YUAN Chaofan DAI Zongkai TIAN Xinyu FAN Yingyi SONG Zengwen YU Peng WANG Wenjun KE
Question answering (QA) systems are designed to answer questions based on given information or with the help of external information. Recent advances in QA systems are overwhelmingly driven by deep learning techniques, which have been employed in a wide range of fields such as finance, sports and biomedicine. For generative QA in open-domain QA, although deep learning can leverage massive data to learn meaningful feature representations and generate free text as answers, problems remain in limiting the length and content of answers. To alleviate this problem, we focus on YNQA, a variant of generative QA, and propose a model, CasATT (cascade prompt learning framework with the sentence-level attention mechanism). In CasATT, we excavate text semantic information from document level to sentence level, mine evidence accurately from large-scale documents by retrieval and ranking, and answer questions with ranked candidates by discriminative question answering. Our experiments on several datasets demonstrate the superior performance of CasATT over state-of-the-art baselines; its accuracy reaches 93.1% on the IR&QA Competition dataset and 90.5% on the BoolQ dataset.
This paper addresses the novel task of detecting chorus sections in English and Japanese lyrics text. Although chorus-section detection using audio signals has been studied, whether chorus sections can be detected from text-only lyrics is an open issue. Another open issue is whether patterns of repeating lyric lines such as those appearing in chorus sections depend on language. To investigate these issues, we propose a neural-network-based model for sequence labeling. It can learn phrase repetition and linguistic features to detect chorus sections in lyrics text. It is, however, difficult to train this model since there was no dataset of lyrics with chorus-section annotations as there was no prior work on this task. We therefore generate a large amount of training data with such annotations by leveraging pairs of musical audio signals and their corresponding manually time-aligned lyrics; we first automatically detect chorus sections from the audio signals and then use their temporal positions to transfer them to the line-level chorus-section annotations for the lyrics. Experimental results show that the proposed model with the generated data contributes to detecting the chorus sections, that the model trained on Japanese lyrics can detect chorus sections surprisingly well in English lyrics, and that patterns of repeating lyric lines are language-independent.
Xinqun LIU Tao LI Yingxiao ZHAO Jinlin PENG
The conventional Nyquist folding receiver (NYFR) uses zero crossing rising (ZCR) voltage times to control the RF sample clock, which is easily affected by noise. Moreover, the analog and digital parts are not synchronized, so the initial phase of the input signal is lost. Furthermore, it is assumed in most literature that the input signal is in a single Nyquist zone (NZ), which is inconsistent with the actual situation. In this paper, we propose an improved architecture, termed a dual-channel NYFR with adjustable local oscillator (LOS), and an information recovery algorithm. The simulation results demonstrate the validity and viability of the proposed architecture and the corresponding algorithm.
This paper proposes an algorithm for estimating the location of wireless access points (APs) in indoor environments to realize smartphone positioning based on Wi-Fi without pre-constructing a database. The proposed method is designed to overcome the main problem of existing positioning methods requiring the advance construction of a database with coordinates or precise AP location measurements. The proposed algorithm constructs a local coordinate system with the first four APs that are activated in turn, and estimates the AP installation location using Wi-Fi round-trip time (RTT) lateration and the ranging results between the APs. The effectiveness of the proposed algorithm is confirmed by conducting experiments in a real indoor environment consisting of two rooms of different sizes to evaluate the positioning performance of the algorithm. The experimental results showed the proposed algorithm using Wi-Fi RTT lateration delivers high smartphone positioning performance without a pre-constructed database or precise AP location measurements.
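As background for the lateration step mentioned in this abstract, the sketch below converts a round-trip time into a distance and solves a planar lateration problem by linearised least squares. It is a generic 2-D illustration with known anchor coordinates, not the paper's algorithm, which first builds a local coordinate system from four APs. The function names, the zero processing delay, and the anchor layout are assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def rtt_to_distance(rtt_seconds, processing_delay=0.0):
    """One-way distance from a round-trip time, after removing any known delay."""
    return SPEED_OF_LIGHT * (rtt_seconds - processing_delay) / 2.0


def laterate(anchors, distances):
    """Estimate a 2-D position from at least three anchors and ranges.

    Subtracting the first range equation from the others removes the
    quadratic terms; the resulting linear system is solved through its
    2x2 normal equations.
    """
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 + yi**2 - x0**2 - y0**2)
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Hypothetical AP positions in metres and ranges to a device at (1, 1).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
ranges = [math.dist(anchor, (1.0, 1.0)) for anchor in anchors]
print(laterate(anchors, ranges))
```

With noisy ranges, adding more anchors makes the least-squares estimate more stable; the same routine applies whether the unknown is a smartphone or, as in the abstract, a newly activated AP ranged from already-placed ones.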