Loan default prediction has been a significant problem in the financial domain because overdue loans may incur significant losses. Machine learning methods have been introduced to solve this problem, but many challenges remain, including feature multicollinearity, imbalanced labels, and small sample sizes. To replicate the success of deep learning in many areas, an effective regularization technique named muddling label regularization is introduced in this letter, and an ensemble of feed-forward neural networks is proposed, which outperforms machine learning and deep learning baselines on a real-world dataset.
Xiaoxiao CUI Cuiling FAN Xiaoni DU
Low-hit-zone frequency-hopping sequences (LHZ-FHSs) are frequency-hopping sequences with low Hamming correlation in a low-hit-zone (LHZ), which have important applications in quasi-synchronous communication systems. However, because strict quasi-synchronization may be hard to maintain at all times in practical FHMA networks, it is also necessary to minimize the Hamming correlation for time-shifts outside of the LHZ. The main objective of this letter is to propose a lower bound on the maximum correlation magnitude outside the low-hit-zone for LHZ-FHS sets. It turns out that the proposed bound is tight or almost tight in the sense that it can be achieved by some LHZ-FHS sets.
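As a rough illustration of the quantities involved, the sketch below computes the periodic Hamming correlation of frequency-hopping sequences and the maximum correlation over a chosen set of time-shifts. The function names and the zone width parameter `zone` are hypothetical, and the code does not implement the bound proposed in the letter.

```python
def hamming_correlation(x, y, tau):
    """Periodic Hamming correlation: number of positions t with x[t] == y[(t + tau) mod n]."""
    n = len(x)
    return sum(1 for t in range(n) if x[t] == y[(t + tau) % n])


def split_shifts(n, zone):
    """Split the shifts 0..n-1 into those inside and outside a zone of half-width `zone`.

    A shift tau is treated as inside the zone when its cyclic distance to 0
    is at most `zone` (an illustrative convention, not taken from the letter).
    """
    inside = [tau for tau in range(n) if min(tau, n - tau) <= zone]
    outside = [tau for tau in range(n) if min(tau, n - tau) > zone]
    return inside, outside


def max_correlation(seqs, shifts):
    """Maximum Hamming correlation over all sequence pairs and the given shifts.

    The trivial in-phase autocorrelation (same sequence, zero shift) is skipped.
    """
    n = len(seqs[0])
    best = 0
    for i, x in enumerate(seqs):
        for j, y in enumerate(seqs):
            for tau in shifts:
                if i == j and tau % n == 0:
                    continue
                best = max(best, hamming_correlation(x, y, tau))
    return best
```

Calling `max_correlation(seqs, inside)` and `max_correlation(seqs, outside)` on the two lists returned by `split_shifts` gives the maximum correlation inside and outside the zone, respectively.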
Manufacturers are coping with increasing pressures in quality, cost and efficiency as more and more industries move from a traditional setup to an industry 4.0 based, digitally transformed setup because of its numerous paybacks. Within the manufacturing domain, organizational structures and processes are complex; therefore, adopting industry 4.0 and finding an optimized re-engineered business process is difficult without a systematic methodology. Authors have developed Business Process Re-engineering (BPR) and Business Process Optimization (BPO) methods, but no consolidated methodology that is based on industry 4.0 and incorporates both BPR and BPO has been seen in the literature. We present a consolidated and systematic re-engineering and optimization framework for a manufacturing industry setup. The proposed framework performs Evolutionary Multi-Objective Combinatorial Optimization using a Multi-Objective Genetic Algorithm (MOGA). An example process from an aircraft manufacturing factory has been optimized and re-engineered with an available set of technologies from industry 4.0 based on the criteria of lower cost, reduced processing time and reduced error rate. Finally, to validate the proposed framework, Business Process Model and Notation (BPMN), a widely used standard for business process specification, is used to run simulations and to compare the AS-IS and TO-BE processes. The proposed framework will be used in converting an industry from a traditional setup to industry 4.0, resulting in cost reduction and increased performance and quality.
Recently, the performance of discriminative correlation filter (CF) trackers in visual tracking has been steadily improving. In this paper, we propose spatial-temporal regularization with precise state estimation based on discriminative correlation filter (STPSE) to further improve tracking performance. First, we consider the continuous change of the object state, using the information from the previous two filters for training the correlation filter model. Here, we train the correlation filter model with the hand-crafted features. Second, we introduce update control in which average peak-to-correlation energy (APCE) and the distance between the object locations obtained by HOG features and hand-crafted features are utilized to detect abnormality of the state around the object. APCE and the distance indicate the reliability of the filter response; thus, if abnormality is detected, the proposed method does not update the scale and the object location estimated by the filter response. In the experiments, our tracker (STPSE) achieves strong, real-time performance using only a CPU on the challenging benchmark sequences (OTB2013, OTB2015, and TC128).
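The update control described above can be sketched as follows. The APCE formula is the commonly used definition (squared peak-to-minimum gap divided by the mean squared deviation from the minimum); the gate function `update_allowed` and both of its thresholds are hypothetical, since the abstract does not state the exact decision rule.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D filter response (list of rows)."""
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    # Mean squared deviation of the response from its minimum value.
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0:
        return 0.0  # completely flat response: no reliable peak
    return (f_max - f_min) ** 2 / energy


def update_allowed(apce_value, location_distance, apce_threshold, distance_threshold):
    """Hypothetical gate: allow the update only when the response is sharply peaked
    and the two location estimates agree. Both thresholds are assumed parameters."""
    return apce_value >= apce_threshold and location_distance <= distance_threshold
```

A single sharp peak yields a high APCE, while a response with several comparable peaks yields a low one, which is why a low value can be read as a sign of occlusion or drift.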
In this paper, we study the number of failed components in a consecutive-k-out-of-n:G system. The distributions and expected values of the number of failed components when the system is failed or working at a particular time t are evaluated. We also apply them to optimization problems concerned with the optimal number of components and the optimal replacement time. Finally, we present illustrative examples for the expected number of failed components and give numerical results for the optimization problems.
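For small systems, the quantities studied here can be checked by brute-force enumeration. The sketch below assumes independent components that each work with the same probability `p` at the time of interest (an assumption made for illustration; the paper's lifetime model is not given in the abstract). A consecutive-k-out-of-n:G system works if and only if at least k consecutive components work.

```python
from itertools import product


def system_works(state, k):
    """True if the tuple of booleans `state` contains at least k consecutive working components."""
    run = 0
    for up in state:
        run = run + 1 if up else 0
        if run >= k:
            return True
    return False


def failed_count_distribution(n, k, p, working=True):
    """Conditional distribution of the number of failed components.

    Returns a list pmf where pmf[j] is the probability of exactly j failed
    components, given that the system is working (or failed if working=False).
    Enumerates all 2**n component states, so it is only meant for small n.
    """
    weights = [0.0] * (n + 1)
    for state in product((True, False), repeat=n):
        if system_works(state, k) != working:
            continue
        up = sum(state)
        weights[n - up] += p ** up * (1 - p) ** (n - up)
    total = sum(weights)
    if total == 0:
        raise ValueError("conditioning event has probability zero")
    return [w / total for w in weights]


def expected_failed(n, k, p, working=True):
    """Expected number of failed components under the same conditioning."""
    pmf = failed_count_distribution(n, k, p, working)
    return sum(j * q for j, q in enumerate(pmf))
```

For example, `expected_failed(3, 2, 0.5, working=True)` conditions on the three working states of a consecutive-2-out-of-3:G system (all up, or exactly one end component down).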
Hitoshi SUDA Gaku KOTANI Daisuke SAITO
In this paper, we propose a new training framework named the INmfCA algorithm for nonparallel voice conversion (VC) systems. To train conversion models, traditional VC frameworks require parallel corpora, in which source and target speakers utter the same linguistic contents. Although the frameworks have achieved high-quality VC, they are not applicable in situations where parallel corpora are unavailable. To acquire conversion models without parallel corpora, nonparallel methods are widely studied. Although the frameworks achieve VC under nonparallel conditions, they tend to require huge background knowledge or many training utterances. This is because of difficulty in disentangling linguistic and speaker information without a large amount of data. In this work, we tackle this problem by exploiting NMF, which can factorize acoustic features into time-variant and time-invariant components in an unsupervised manner. The method acquires alignment between the acoustic features of a source speaker's utterances and a target dictionary and uses the obtained alignment as activation of NMF to train the source speaker's dictionary without parallel corpora. The acquisition method is based on the INCA algorithm, which obtains the alignment of nonparallel corpora. In contrast to the INCA algorithm, the alignment is not restricted to observed samples, and thus the proposed method can efficiently utilize small nonparallel corpora. The results of subjective experiments show that the combination of the proposed algorithm and the INCA algorithm outperformed not only an INCA-based nonparallel framework but also CycleGAN-VC, which performs nonparallel VC without any additional training data. The results also indicate that a one-shot VC framework, which does not need to train source speakers, can be constructed on the basis of the proposed method.
With the increasing densification of 5G and future 6G networks, high-capacity backhaul links to connect the numerous base stations become an issue. Since not all base stations can be connected via fibre links for either technical or economic reasons, wireless connections at 300GHz, which may provide data rates comparable to fibre links, are an alternative. This paper deals with the planning of 300GHz backhaul links and describes two novel automatic planning approaches for backhaul links arranged in ring and star topology. The two planning approaches are applied to various scenarios, and the corresponding planning results are evaluated by comparing the signal to interference plus noise ratio under various simulation conditions, including weather impacts, showing the feasibility of wireless backhaul links.
Ai-ichiro SASAKI Ken FUKUSHIMA
Magnetic fields are often utilized for position sensing of mobile devices. In typical sensing systems, multiple sensors are used to detect magnetic fields generated by target devices. To determine the positions of the devices, magnetic-field data detected by the sensors must be converted to device-position data. The data conversion is not trivial because it is a nonlinear inverse problem. In this study, we propose a machine-learning approach suitable for data conversion required in the magnetic-field-based position sensing of target devices. In our approach, two different sets of training data are used. One of the training datasets is composed of raw data of magnetic fields to be detected by sensors. The other set is composed of logarithmically represented data of the fields. We can obtain two different predictor functions by learning with these training datasets. Results show that the prediction accuracy of the target position improves when the two different predictor functions are used. Based on our simulation, the error of the target position estimated with the predictor functions is within 10cm in a 2m × 2m × 2m cubic space for 87% of all the cases of the target device states. The computational time required for predicting the positions of the target device is 4ms. As the prediction method is accurate and rapid, it can be utilized for the real-time tracking of moving objects and people.
Ding LI Chunxiang GU Yuefei ZHU
Website Fingerprinting (WF) enables a passive attacker to identify which website a user is visiting over an encrypted tunnel. Current WF attacks have two strong assumptions: (i) specific tunnel, i.e., the attacker can train on traffic samples collected in a simulated tunnel with the same tunnel settings as the user, and (ii) pseudo-open-world, where the attacker has access to training samples of unmonitored sites and treats them as a separate class. These assumptions, while experimentally feasible, render WF attacks less usable in practice. In this paper, we present Gene Fingerprinting (GF), a new WF attack that achieves cross-tunnel transferability by generating fingerprints that reflect the intrinsic profile of a website. The attack leverages Zero-shot Learning — a machine learning technique not requiring training samples to identify a given class — to reduce the effort to collect data from different tunnels and achieve a real open-world. We demonstrate the attack performance using three popular tunneling tools: OpenSSH, Shadowsocks, and OpenVPN. The GF attack attains over 94% accuracy on each tunnel, far better than existing CUMUL, DF, and DDTW attacks. In the more realistic open-world scenario, the attack still obtains 88% TPR and 9% FPR, outperforming the state-of-the-art attacks. These results highlight the danger of our attack in various scenarios where gathering and training on a tunnel-specific dataset would be impractical.
Kyogo OTA Daisuke INOUE Mamoru SAWAHASHI Satoshi NAGATA
This paper proposes individual computation processes of the partial demodulation reference signal (DM-RS) sequence in a synchronization signal (SS)/physical broadcast channel (PBCH) block to be used to detect the radio frame timing based on SS/PBCH block index detection for New Radio (NR) initial access. We present the radio frame timing detection probability using the proposed partial DM-RS sequence detection method that is applied subsequent to the physical-layer cell identity (PCID) detection in five tapped delay line (TDL) models in both non-line-of-sight (NLOS) and line-of-sight (LOS) environments. Computer simulation results show that by using the proposed method, the radio frame timing detection probabilities of almost 100% and higher than 90% are achieved for the LOS and NLOS channel models, respectively, at the average received signal-to-noise power ratio (SNR) of 0dB with the frequency stability of a local oscillator in a set of user equipment (UE) of 5ppm at the carrier frequency of 4GHz.
Yuyao LIU Shi BAO Go TANAKA Yujun LIU Dongsheng XU
When collecting images, low-illumination images with insufficient exposure are often obtained owing to the influence of shooting equipment, shooting environment, and other factors. For low-illumination images, it is necessary to improve the contrast. In this paper, a digital color image contrast enhancement method based on luminance weight adjustment is proposed. This method improves the contrast of the image and maintains the detail and naturalness of the image. In the proposed method, the illumination of the histogram equalization image and the adaptive gamma correction with weighted distribution image are adjusted by the luminance weight w1 to obtain a detailed image of the bright areas. Thereafter, the suppressed multi-scale retinex (MSR) is used to process the input image and obtain a detailed image of the dark areas. Finally, the luminance weight w2 is used to adjust the illumination component of the detailed images of the bright and dark areas, respectively, to obtain the output image. The experimental results show that the proposed method can enhance the details of the input image and avoid excessive enhancement of contrast, which maintains the naturalness of the input image well. Furthermore, we used the discrete entropy and lightness order error function to perform a numerical evaluation to verify the effectiveness of the proposed method.
In this study, we aim to improve the performance of audio source separation for monaural mixture signals. For monaural audio source separation, semisupervised nonnegative matrix factorization (SNMF) can achieve higher separation performance by employing small supervised signals. In particular, penalized SNMF (PSNMF) with orthogonality penalty is an effective method. PSNMF forces two basis matrices for target and nontarget sources to be orthogonal to each other and improves the separation accuracy. However, the conventional orthogonality penalty is based on an inner product and does not affect the estimation of the basis matrix properly because of the scale indeterminacy between the basis and activation matrices in NMF. To cope with this problem, a new PSNMF with cosine similarity between the basis matrices is proposed. The experimental comparison shows the efficacy of the proposed cosine similarity penalty in supervised audio source separation.
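The scale problem mentioned above can be seen in a small numerical sketch. Here each basis matrix is represented as a list of nonzero column vectors, the inner-product penalty is taken as the sum of squared inner products between target and nontarget bases, and the cosine penalty as the sum of their cosine similarities. These exact forms are assumptions for illustration and may differ from the penalties used in the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def inner_product_penalty(F, G):
    """Sum of squared inner products between every column of F and every column of G."""
    return sum(dot(f, g) ** 2 for f in F for g in G)


def cosine_penalty(F, G):
    """Sum of cosine similarities between every column of F and every column of G."""
    total = 0.0
    for f in F:
        for g in G:
            total += dot(f, g) / (math.sqrt(dot(f, f)) * math.sqrt(dot(g, g)))
    return total


def scale_columns(F, c):
    """Multiply every basis vector in F by the scalar c."""
    return [[c * a for a in f] for f in F]
```

Shrinking the bases with `scale_columns(F, 0.1)` (and enlarging the activations to compensate) reduces the inner-product penalty by a factor of 100 without making the bases any more orthogonal, whereas the cosine penalty is unchanged.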
Takashi ISHIO Naoto MAEDA Kensuke SHIBUYA Kenho IWAMOTO Katsuro INOUE
Software developers may write a number of similar source code fragments that include the same mistake in software products. To remove such faulty code fragments, developers inspect code clones when they find a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than a code block, e.g., a single line of code. To enable developers to search for code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The results show that our method efficiently reports cloned faulty code fragments and that the performance is acceptable for software developers.
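A minimal sketch of Lempel-Ziv Jaccard Distance on strings is shown below. It builds the set of Lempel-Ziv phrases for each input and uses the exact Jaccard index; published LZJD implementations typically approximate the index with min-hashing, which is omitted here. The `rank_candidates` helper is hypothetical and is not the search procedure of the paper.

```python
def lz_set(data):
    """Set of Lempel-Ziv phrases: each phrase is the shortest substring,
    starting where the previous phrase ended, that has not been seen before."""
    seen = set()
    start = 0
    for end in range(1, len(data) + 1):
        chunk = data[start:end]
        if chunk not in seen:
            seen.add(chunk)
            start = end
    return seen


def lzjd(a, b):
    """Lempel-Ziv Jaccard Distance: 1 minus the Jaccard index of the two phrase sets."""
    set_a, set_b = lz_set(a), lz_set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty inputs are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def rank_candidates(query, candidates):
    """Hypothetical helper: sort candidate fragments by increasing distance to the query."""
    return sorted(candidates, key=lambda fragment: lzjd(query, fragment))
```

Given a faulty line as `query` and the lines of a code base as `candidates`, the fragments at the front of the ranked list are the ones most likely to be clones of the faulty code.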
Ryota YOSHIMURA Ichiro MARUTA Kenji FUJIMOTO Ken SATO Yusuke KOBAYASHI
Particle filters have been widely used for state estimation problems in nonlinear and non-Gaussian systems. Their performance depends on the given system and measurement models, which need to be designed by the user for each target system. This paper proposes a novel method to design these models for a particle filter. This is a numerical optimization method, where the particle filter design process is interpreted into the framework of reinforcement learning by assigning the randomnesses included in both models of the particle filter to the policy of reinforcement learning. In this method, estimation by the particle filter is repeatedly performed and the parameters that determine both models are gradually updated according to the estimation results. The advantage is that it can optimize various objective functions, such as the estimation accuracy of the particle filter, the variance of the particles, the likelihood of the parameters, and the regularization term of the parameters. We derive the conditions to guarantee that the optimization calculation converges with probability 1. Furthermore, in order to show that the proposed method can be applied to practical-scale problems, we design the particle filter for mobile robot localization, which is an essential technology for autonomous navigation. By numerical simulations, it is demonstrated that the proposed method further improves the localization accuracy compared to the conventional method.
Rubin ZHAO Xiaolong ZHENG Zhihua YING Lingyan FAN
Most existing object detection methods and text detection methods are designed to detect either text or objects. In scenarios where the task is to find the target word pointed at by an object, the results of existing methods are far from satisfactory. However, such scenarios occur often in human-computer interaction, when the computer needs to figure out which word the user is pointing at. Compared with object detection, pointed-at word localization (PAWL) requires higher accuracy, especially in dense text scenarios. Moreover, in printed documents, characters are much smaller than those in scene text detection datasets such as ICDAR-2013, ICDAR-2015 and ICPR-2018. To address these problems, the authors propose a novel target word localization network (TWLN) to detect the pointed-at word in printed documents. In this work, a single deep neural network is trained to extract the features of markers and text sequentially. For each image, the location of the marker is predicted first; according to the predicted location, a smaller image is cropped from the original image and fed into the same network, and then the location of the pointed-at word is predicted. To train and test the networks, an efficient approach is proposed to generate the dataset from PDF format documents by inserting markers pointing at the words in the documents, which avoids laborious labeling work. Experiments on the proposed dataset demonstrate that TWLN outperforms the compared object detection method and optical character recognition method on every category of targets, especially when the target is a single character that occupies only several pixels in the image. TWLN is also tested with real photographs, and the accuracy shows no significant differences, which supports the validity of the generating method used to construct the dataset.
Weiguo ZHANG Jiaqi LU Jing ZHANG Xuewen LI Qi ZHAO
Haze seriously affects the quality of license plate recognition and reduces the performance of visual processing algorithms. In order to improve the quality of hazy pictures, a license plate recognition algorithm for hazy weather is proposed in this paper. The algorithm mainly consists of two parts. The first part is MPGAN image dehazing, which uses a generative adversarial network to dehaze the image and combines multi-scale convolution and perceptual loss. Multi-scale convolution is conducive to better feature extraction. The perceptual loss makes up for the shortcoming that the mean square error (MSE) is greatly affected by outliers. The second part recognizes the license plate: first, YOLOv3 locates the license plate; then, the STN network corrects the license plate; finally, the corrected plate is fed into the improved LPRNet network to obtain the license plate information. Experimental results show that the dehazing model proposed in this paper achieves good results, and the evaluation indicators PSNR and SSIM are better than those of other representative algorithms. In a comparison with the LPRNet algorithm, the proposed license plate recognition algorithm reaches an average accuracy rate of 93.9%.
This letter proposes a post-processing method to improve the smoothness and safety of the path for an autonomous vehicle navigating in an urban environment. The proposed method transforms the initial path given by local path planning algorithms using a stochastic approach to improve its smoothness and safety. Using the proposed method, the initial path is efficiently transformed by iteratively updating the position of each waypoint within it. The proposed method also guarantees the feasibility of the transformed path. Experimental results verify that the proposed method can improve the smoothness and safety of the initial path and ensure the feasibility of the transformed path.
In this letter, we propose a deep neural network and semi-supervised learning based dehazing algorithm. The dehazing network uses a pyramidal architecture to recover the haze-free scene from a single hazy image in a coarse-to-fine order. To faithfully restore the objects with different scales, we incorporate cascaded multi-scale convolutional blocks into each level of the pyramid. Feature fusion and transfer in the network are achieved using the paths constructed by interleaved residual connections. For better generalization to the complicated haze in real-world environments, we also devise a discriminator that enables semi-supervised adversarial training. Experimental results demonstrate that the proposed work outperforms comparative ones with higher quantitative metrics and more visually pleasant outputs. It can also enhance the robustness of object detection under haze.
Katsuki TOKANO Wenqi ZHU Tatsuki OSATO Kien NGUYEN Hiroo SEKIYA
This paper presents a design method of a two-hop wireless power transfer (WPT) system for installation on a robot arm. The class-E inverter and the class-D rectifier are used on the transmission and receiving sides, respectively, in the proposed WPT system. Analytical equations for the proposed WPT system are derived as functions of the geometrical and physical parameters of the coils, such as the outer diameter and height of the coils, winding-wire diameter, and number of turns. Using the analytical equations, we can optimize the WPT system to obtain the design values with the theoretically highest power-delivery efficiency under the size limitation of the robot arm. The circuit experiments are in quantitative agreement with the theoretical predictions obtained from the analysis, indicating the validity of the analysis and design method. The experimental prototype achieved 83.6% power-delivery efficiency at 6.78MHz operating frequency and 39.3W output power.
Rizal Setya PERDANA Yoshiteru ISHIDA
This study presents a formulation for generating context-aware natural language by machine from visual representation. Given an image sequence input, the visual storytelling task (VST) aims to generate a coherent, object-focused, and contextualized sentence story. Previous works in this domain faced a problem in modeling an architecture that works in temporal multi-modal data, which led to a low-quality output, such as low lexical diversity, monotonous sentences, and inaccurate context. This study introduces a further improvement, that is, an end-to-end architecture, called cross-modal contextualize attention, optimized to extract visual-temporal features and generate a plausible story. Visual object and non-visual concept features are encoded from the convolutional feature map, and object detection features are joined with language features. Three scenarios are defined in decoding language generation by incorporating weights from a pre-trained language generation model. Extensive experiments are conducted to confirm that the proposed model outperforms other models in terms of automatic metrics and manual human evaluation.