IEICE global.ieice.org Site

Keyword Search Result

[Keyword] SEM(686hit)

41-60hit(686hit)

Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection
Yuzhuo LIU Hangting CHEN Qingwei ZHAO Pengyuan ZHANG

LETTER-Speech and Hearing

Publicized: 2022/01/13
Vol: E105-D No:4
Page(s): 828-831
Weakly labelled semi-supervised audio tagging (AT) and sound event detection (SED) have become significant in real-world applications. A popular method is teacher-student learning, making student models learn from pseudo-labels generated by teacher models from unlabelled data. To generate high-quality pseudo-labels, we propose a master-teacher-student framework trained with a dual-lead policy. Our experiments illustrate that our model outperforms the state-of-the-art model on both tasks.
Sea Clutter Image Segmentation Method of High Frequency Surface Wave Radar Based on the Improved Deeplab Network
Haotian CHEN Sukhoon LEE Di YAO Dongwon JEONG

LETTER-Digital Signal Processing

Publicized: 2021/10/12
Vol: E105-A No:4
Page(s): 730-733
High Frequency Surface Wave Radar (HFSWR) can achieve over-the-horizon detection: it can effectively detect and track ships and ultra-low-altitude aircraft, and it can acquire sea state information such as icebergs and ocean currents. However, HFSWR is seriously affected by clutter, especially sea clutter and ionospheric clutter. In this paper, we propose a deep learning image semantic segmentation method based on an optimized Deeplabv3+ network to automatically detect sea clutter and ionospheric clutter, using the measured R-D spectrum images of HFSWR during a typhoon as experimental data. The method avoids the disadvantage of traditional detection methods, which require a large amount of a priori knowledge, and provides a basis for subsequent clutter suppression or clutter characteristics research.
Numerical Analysis of Pulse Response for Slanted Grating Structure with an Air Regions in Dispersion Media by TE Case Open Access
Ryosuke OZAKI Tsuneki YAMASAKI

BRIEF PAPER

Publicized: 2021/10/18
Vol: E105-C No:4
Page(s): 154-158
In our previous paper, we proposed a new numerical technique for the transient scattering problem of periodically arrayed dispersion media by using a combination of the fast inversion Laplace transform (FILT) method and the Fourier series expansion method (FSEM), and analyzed the pulse response for several widths of the dispersion media or rectangular cavities. From the numerical results, we examined the influence of periodically arrayed dispersion media with a rectangular cavity on the pulse response. In this paper, we analyze the transient scattering problem for the case of dispersion media with slanted air regions by utilizing a combination of the FILT, FSEM, and multilayer division method (MDM), and investigate the influence of the slant angle of an air region. In addition, we verify the computational accuracy for the term of the MDM and the truncation mode number of the electromagnetic fields.
Semi-Supervised Representation Learning via Triplet Loss Based on Explicit Class Ratio of Unlabeled Data
Kazuhiko MURASAKI Shingo ANDO Jun SHIMAMURA

PAPER-Image Recognition, Computer Vision

Publicized: 2022/01/17
Vol: E105-D No:4
Page(s): 778-784
In this paper, we propose a semi-supervised triplet loss function that realizes semi-supervised representation learning in a novel manner. We extend conventional triplet loss, which uses labeled data to achieve representation learning, so that it can deal with unlabeled data. We estimate, in advance, the degree to which each label applies to each unlabeled data point, and optimize the loss function with unlabeled features according to the resulting ratios. Since the proposed loss function has the effect of adjusting the distribution of all unlabeled data, it complements methods based on consistency regularization, which has been extensively studied in recent years. Combined with a consistency regularization-based method, our method achieves more accurate semi-supervised learning. Experiments show that the proposed loss function achieves a higher accuracy than the conventional fine-tuning method.
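For reference, the conventional triplet loss that this abstract takes as its starting point can be sketched as follows. This shows only the standard labeled-data form, max(0, d(a, p) - d(a, n) + margin). The function name `triplet_loss`, the choice of Euclidean distance, and the margin value are illustrative assumptions; the proposed class-ratio extension to unlabeled data is not reproduced here.

```python
import math
from typing import Sequence


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, for illustration only
) -> float:
    """Standard triplet loss on three feature vectors (hypothetical helper)."""
    d_pos = math.dist(anchor, positive)  # distance to a same-class sample
    d_neg = math.dist(anchor, negative)  # distance to a different-class sample
    # The loss is zero once the negative is farther than the positive by `margin`.
    return max(0.0, d_pos - d_neg + margin)


# Positive is close, negative is far: the constraint is already satisfied.
satisfied = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))
# Roles swapped: the positive is far and the negative is close, so the loss is large.
violated = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0))
```

Minimizing this quantity pulls same-class features together and pushes different-class features apart, which is the behavior the abstract extends to unlabeled points.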
Latent Space Virtual Adversarial Training for Supervised and Semi-Supervised Learning
Genki OSADA Budrul AHSAN Revoti PRASAD BORA Takashi NISHIDE

PAPER-Artificial Intelligence, Data Mining

Publicized: 2021/12/09
Vol: E105-D No:3
Page(s): 667-678
Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem, we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in more adverse effect and thus more effective regularization. The latent space is built by a generative model, and in this paper we examine two different types of models: variational auto-encoder and normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.
Upper Bounds on the Error Probability for the Ensemble of Linear Block Codes with Mismatched Decoding Open Access
Toshihiro NIINOMI Hideki YAGI Shigeichi HIRASAWA

PAPER-Coding Theory

Publicized: 2021/10/08
Vol: E105-A No:3
Page(s): 363-371
In channel decoding, a decoder with suboptimal metrics may be used because of the uncertainty of the channel statistics or the limitations of the decoder. In this case, the decoding metric is different from the actual channel metric, and thus it is called mismatched decoding. In this paper, applying the technique of the DS2 bound, we derive an upper bound on the error probability of mismatched decoding over a regular channel for the ensemble of linear block codes, which was defined by Hof, Sason and Shamai. Assuming the ensemble of random linear block codes defined by Gallager, we show that the obtained bound is not looser than the conventional bound. We also give a numerical example for the ensemble of LDPC codes also introduced by Gallager, which shows that our proposed bound is tighter than the conventional bound. Furthermore, we obtain a single letter error exponent for linear block codes.
Semantic Shilling Attack against Heterogeneous Information Network Based Recommend Systems
Yizhi REN Zelong LI Lifeng YUAN Zhen ZHANG Chunhua SU Yujuan WANG Guohua WU

PAPER

Publicized: 2021/11/30
Vol: E105-D No:2
Page(s): 289-299
The recommend system has been widely used in many web application areas such as e-commerce services. With the development of the recommend system, the heterogeneous information network (HIN) modeling method has replaced the traditional bipartite graph modeling method for representing the recommend system. Several studies have already shown that the recommend system is vulnerable to shilling attack (injecting attack). However, the effectiveness of traditional shilling attacks has rarely been studied directly in the HIN model. Moreover, no study has focused on how to enhance shilling attacks against the HIN recommend system by using high-level semantic information. This work analyzes the relationship between the high-level semantic information and the attacking effects in the HIN recommend system, and proves that attack results are proportional to the high-level semantic information. Therefore, we propose a heuristic attack method based on high-level semantic information, named Semantic Shilling Attack (SSA), on a HIN recommend system (HERec). This method injects a specific score into each selected item that is semantically related to the target. It ensures that misleading information is transmitted towards target items and normal users, and attempts to interfere with the effect of the recommend system. The experiment is based on two real-world datasets, and proves that the attacking effect is positively correlated with the number of meta-paths. The results show that our method is more effective than existing baseline algorithms.
Semantic Guided Infrared and Visible Image Fusion
Wei WU Dazhi ZHANG Jilei HOU Yu WANG Tao LU Huabing ZHOU

LETTER-Image

Publicized: 2021/06/10
Vol: E104-A No:12
Page(s): 1733-1738
In this letter, we propose a semantic guided infrared and visible image fusion method, which can train a network to fuse different semantic objects with different fusion weights according to their own characteristics. First, we design the appropriate fusion weights for each semantic object instead of the whole image. Second, we employ the semantic segmentation technology to obtain the semantic region of each object, and generate special weight maps for the infrared and visible image via pre-designed fusion weights. Third, we feed the weight maps into the loss function to guide the image fusion process. The trained fusion network can generate fused images with better visual effect and more comprehensive scene representation. Moreover, we can enhance the modal features of various semantic objects, benefiting subsequent tasks and applications. Experiment results demonstrate that our method outperforms the state-of-the-art in terms of both visual effect and quantitative metrics.
GECNN for Weakly Supervised Semantic Segmentation of 3D Point Clouds
Zifen HE Shouye ZHU Ying HUANG Yinhui ZHANG

PAPER-Image Recognition, Computer Vision

Publicized: 2021/09/24
Vol: E104-D No:12
Page(s): 2237-2243
This paper presents a novel method for weakly supervised semantic segmentation of 3D point clouds using a novel graph and edge convolutional neural network (GECNN), targeting point clouds in which only 1% or 10% of the points are labeled. Our general framework facilitates semantic segmentation by encoding both global and local scale features via a parallel graph and edge aggregation scheme. More specifically, global scale graph structure cues of point clouds are captured by a graph convolutional neural network, which is propagated from pairwise affinity representation over the whole graph established in a d-dimensional feature embedding space. We integrate local scale features derived from a dynamic edge feature aggregation convolutional neural network, which allows us to fuse both global and local cues of 3D point clouds. The proposed GECNN model is trained by using a comprehensive objective which consists of incomplete, inexact, self-supervision and smoothness constraints based on partially labeled points. The proposed approach enforces global and local consistency constraints directly on the objective losses. It inherently handles the challenges of segmenting sparse 3D point clouds with limited annotations in a large scale point cloud space. Our experiments on the ShapeNet and S3DIS benchmarks demonstrate the effectiveness of the proposed approach for efficient (within 20 epochs) learning of large scale point cloud semantics despite very limited labels.
Fragmentation-Minimized Periodic Network-Bandwidth Expansion Employing Aligned Channel Slot Allocation in Flexible Grid Optical Networks
Hiroshi HASEGAWA Takuma YASUDA Yojiro MORI Ken-ichi SATO

PAPER-Fiber-Optic Transmission for Communications

Publicized: 2021/06/01
Vol: E104-B No:12
Page(s): 1514-1523
We propose an efficient network upgrade and expansion method that can make the most of the next generation channel resources to accommodate further increases in traffic. Semi-flexible grid configuration and two cost metrics are introduced to establish a regularity in frequency assignment and minimize disturbance in the upgrade process; both reduce the fragmentation in frequency assignment and the number of fibers necessary. Various investigations of different configurations elucidate that the number of fibers necessary is reduced by about 10-15% for any combination of upgrade scenario, channel frequency bandwidth, and topology adopted.
Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition
Fuma HORIE Hideaki GOTO Takuo SUGANUMA

PAPER-Image Recognition, Computer Vision

Publicized: 2021/08/24
Vol: E104-D No:11
Page(s): 2002-2010
Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.
Code-Switching ASR and TTS Using Semisupervised Learning with Machine Speech Chain
Sahoko NAKAYAMA Andros TJANDRA Sakriani SAKTI Satoshi NAKAMURA

PAPER-Speech and Hearing

Publicized: 2021/07/08
Vol: E104-D No:10
Page(s): 1661-1677
The phenomenon where a speaker mixes two or more languages within the same conversation is called code-switching (CS). Handling CS is challenging for automatic speech recognition (ASR) and text-to-speech (TTS) because it requires coping with multilingual input. Although CS text or speech may be found in social media, the datasets of CS speech and corresponding CS transcriptions are hard to obtain even though they are required for supervised training. This work adopts a deep learning-based machine speech chain to train CS ASR and CS TTS with each other with semisupervised learning. After supervised learning with monolingual data, the machine speech chain is then carried out with unsupervised learning of either the CS text or speech. The results show that the machine speech chain trains ASR and TTS together and improves performance without requiring the pair of CS speech and corresponding CS text. We also integrate language embedding and language identification into the CS machine speech chain in order to handle CS better by giving language information. We demonstrate that our proposed approach can improve the performance on both a single CS language pair and multiple CS language pairs, including the unknown CS excluded from training data.
An Ising Machine-Based Solver for Visiting-Route Recommendation Problems in Amusement Parks
Yosuke MUKASA Tomoya WAKAIZUMI Shu TANAKA Nozomu TOGAWA

PAPER-Computer System

Publicized: 2021/07/08
Vol: E104-D No:10
Page(s): 1592-1600
In an amusement park, an attraction-visiting route that considers the waiting time and traveling time improves visitors' satisfaction and experience. To solve this problem, we focus on Ising machines, which have recently been expected to solve combinatorial optimization problems at high speed by mapping the problems to Ising models or quadratic unconstrained binary optimization (QUBO) models. We propose a mapping of the visiting-route recommendation problem in amusement parks to a QUBO model for solving it using Ising machines. By using an actual Ising machine, we could obtain feasible solutions one order of magnitude faster with almost the same accuracy as the simulated annealing method for the visiting-route recommendation problem.
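As a rough illustration of what mapping a constraint to a QUBO model means, the sketch below encodes a generic "choose exactly one of three options" rule. It is not the mapping proposed in the paper: the variable layout, the helper names `qubo_energy` and `brute_force_minima`, and the exhaustive search (which stands in for an Ising machine) are all assumptions. The penalty (x0 + x1 + x2 - 1)^2, expanded with x_i^2 = x_i for binary variables and with the constant term dropped, gives -1 on each diagonal entry and +2 on each pair.

```python
from itertools import product
from typing import Dict, List, Tuple

Qubo = Dict[Tuple[int, int], float]


def qubo_energy(x: Tuple[int, ...], Q: Qubo) -> float:
    """Evaluate sum over (i, j) of Q[i, j] * x[i] * x[j] for a binary vector x."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())


def brute_force_minima(Q: Qubo, n: int) -> Tuple[List[Tuple[int, ...]], float]:
    """Enumerate all 2**n assignments; only practical for tiny toy problems."""
    energies = {x: qubo_energy(x, Q) for x in product((0, 1), repeat=n)}
    best = min(energies.values())
    return sorted(x for x, e in energies.items() if e == best), best


# One-hot penalty over three binary variables (assumed toy constraint).
n = 3
Q: Qubo = {(i, i): -1.0 for i in range(n)}
Q.update({(i, j): 2.0 for i in range(n) for j in range(i + 1, n)})

minima, best_energy = brute_force_minima(Q, n)
```

The lowest-energy assignments are exactly the three one-hot vectors, so a solver that minimizes this energy also enforces the constraint. A full route model would add objective terms, such as waiting and travel time, to the same matrix.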
Image Emotion Recognition Using Visual and Semantic Features Reflecting Emotional and Similar Objects
Takahisa YAMAMOTO Shiki TAKEUCHI Atsushi NAKAZAWA

PAPER-Image Recognition, Computer Vision

Publicized: 2021/06/24
Vol: E104-D No:10
Page(s): 1691-1701
Visual sentiment analysis has many applications, including image captioning, opinion mining, and advertisement; however, it is still a difficult problem and existing algorithms cannot produce satisfactory results. One of the difficulties in classifying images into emotions is that visual sentiments are evoked by different types of information: visual information, which includes colors or textures, and semantic information, which includes types of objects evoking emotions and/or their combinations. In contrast to the existing methods that use only visual information, this paper presents a novel algorithm for image emotion recognition that uses both types of information simultaneously. For semantic features, we introduce an object vector and a word vector. The object vector is created by an object detection method and reflects existing objects in an image. The word vector is created by transforming the names of detected objects through a word embedding model. This vector will be similar among objects that are semantically similar. These semantic features and a visual feature made by a fine-tuned convolutional neural network (CNN) are concatenated. We perform the classification using the concatenated feature vector. Extensive evaluation experiments using emotional image datasets show that our method achieves the best accuracy among existing methods on all but one dataset. The improvement in accuracy of our method over existing methods is up to 4.54%.
Preparation Copper Sulfide Nanoparticles by Laser Ablation in Liquid and Optical Properties
Kazuki ISODA Ryuga YANAGIHARA Yoshitaka KITAMOTO Masahiko HARA Hiroyuki WADA

BRIEF PAPER-Ultrasonic Electronics

Publicized: 2021/02/08
Vol: E104-C No:8
Page(s): 390-393
Copper sulfide nanoparticles were successfully prepared by laser ablation in liquid. CuS powders in deionized water were irradiated with a nanosecond-pulsed laser (Nd:YAG, SHG) to prepare nanoparticles. The prepared nanoparticles were investigated by scanning electron microscopy (SEM), dynamic light scattering (DLS) and a fluorospectrometer. According to the results of SEM and DLS, the primary and secondary particle sizes decreased as the laser fluence of laser ablation in liquid increased. The ratio of Cu to S in the prepared nanoparticles did not change. The absorbance of the prepared copper sulfide nanoparticles in water increased with the increase in laser fluence.
Classification Functions for Handwritten Digit Recognition
Tsutomu SASAO Yuto HORIKAWA Yukihiro IGUCHI

PAPER-Logic Design

Publicized: 2021/04/01
Vol: E104-D No:8
Page(s): 1076-1082
A classification function maps a set of vectors into several classes. A machine learning problem is treated as a design problem for partially defined classification functions. To realize classification functions for MNIST handwritten digits, three different architectures are considered: single-unit realization, 45-unit realization, and 45-unit ×r realization. The 45-unit realization consists of 45 ternary classifiers, 10 counters, and a max selector. The test accuracies of these architectures are compared using the MNIST data set.
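The 45-unit structure named in this abstract (45 ternary classifiers, 10 counters, a max selector) can be sketched as a pairwise voting scheme. The abstract does not say how the units are assigned, so the following rests on assumptions: that there is one unit per pair of digits (10 choose 2 = 45) and that each unit votes for one of its two digits or abstains. The names `classify` and `make_toy_unit` are hypothetical, and the toy units work on a single number instead of an image.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Tuple

NUM_CLASSES = 10
# Assumed assignment: one unit for every unordered pair of digits.
PAIRS = list(combinations(range(NUM_CLASSES), 2))

# A ternary unit returns one of its two digits, or None to abstain.
PairUnit = Callable[[float], Optional[int]]


def classify(x: float, units: Dict[Tuple[int, int], PairUnit]) -> int:
    """Tally the unit votes in 10 counters and return the digit with most votes."""
    counters = [0] * NUM_CLASSES
    for pair in PAIRS:
        vote = units[pair](x)
        if vote is not None:
            counters[vote] += 1
    # Max selector; ties resolve to the smallest digit.
    return max(range(NUM_CLASSES), key=lambda c: counters[c])


def make_toy_unit(i: int, j: int) -> PairUnit:
    """Toy stand-in for a trained unit: vote for the digit nearer to x."""
    def unit(x: float) -> Optional[int]:
        di, dj = abs(x - i), abs(x - j)
        if di == dj:
            return None  # third output value: no decision
        return i if di < dj else j
    return unit


toy_units = {(i, j): make_toy_unit(i, j) for i, j in PAIRS}
predicted = classify(7.2, toy_units)
```

With real data, each toy unit would be replaced by a classifier trained on the two digits of its pair; the counting and selection logic stays the same.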
Two-Stage Fine-Grained Text-Level Sentiment Analysis Based on Syntactic Rule Matching and Deep Semantic
Weizhi LIAO Yaheng MA Yiling CAO Guanglei YE Dongzhou ZUO

PAPER

Publicized: 2021/04/28
Vol: E104-D No:8
Page(s): 1274-1280
Traditional text-level sentiment analysis methods usually ignore the emotional tendency corresponding to the object or attribute. To address this problem, this paper proposes a novel two-stage fine-grained text-level sentiment analysis model based on syntactic rule matching and deep semantics. Based on an analysis of the characteristics and difficulties of fine-grained sentiment analysis, a two-stage fine-grained sentiment analysis algorithm framework is constructed. In the first stage, objects and their corresponding opinions are extracted by syntactic rule matching to obtain preliminary objects and opinions. In the second stage, a deep semantic network is used to extract more accurate objects and opinions. Because the extraction result contains multiple objects and opinions to be matched, an object-opinion matching algorithm based on the minimum lexical separation distance is proposed to achieve accurate pairwise matching. Finally, the proposed algorithm is evaluated on several public datasets to demonstrate its practicality and effectiveness.
Attention Voting Network with Prior Distance Augmented Loss for 6DoF Pose Estimation
Yong HE Ji LI Xuanhong ZHOU Zewei CHEN Xin LIU

PAPER-Image Recognition, Computer Vision

Publicized: 2021/03/26
Vol: E104-D No:7
Page(s): 1039-1048
6DoF pose estimation from a monocular RGB image is a challenging but fundamental task. The methods based on unit direction vector-field representation and Hough voting strategy achieved state-of-the-art performance. Nevertheless, they apply the smooth l1 loss to learn the two elements of the unit vector separately, and thus do not take into account the prior distance between the pixel and the keypoint, even though the positioning error is significantly affected by the prior distance. In this work, we propose a Prior Distance Augmented Loss (PDAL) to exploit the prior distance for more accurate vector-field representation. Furthermore, we propose a lightweight channel-level attention module for adaptive feature fusion. Embedding this Adaptive Fusion Attention Module (AFAM) into the U-Net, we build an Attention Voting Network to further improve the performance of our method. We conduct extensive experiments to demonstrate the effectiveness and performance improvement of our methods on the LINEMOD, OCCLUSION and YCB-Video datasets. Our experiments show that the proposed methods bring significant performance gains and outperform state-of-the-art RGB-based methods without any post-refinement.
Estimation of Semantic Impressions from Portraits
Mari MIYATA Kiyoharu AIZAWA

PAPER-Image Processing and Video Processing

Publicized: 2021/03/18
Vol: E104-D No:6
Page(s): 863-872
In this paper, we present a novel portrait impression estimation method using nine pairs of semantic impression words: bitter-majestic, clear-pure, elegant-mysterious, gorgeous-mature, modern-intellectual, natural-mild, sporty-agile, sweet-sunny, and vivid-dynamic. In the first part of the study, we analyzed the relationship between the facial features in deformed portraits and the nine semantic impression word pairs over a large dataset, which we collected by a crowdsourcing process. In the second part, we leveraged the knowledge from the results of the analysis to develop a ranking network trained on the collected data and designed to estimate the semantic impression associated with a portrait. Our network demonstrated superior performance in impression estimation compared with current state-of-the-art methods.
A Robust Semidefinite Source Localization TDOA/FDOA Method with Sensor Position Uncertainties
Zhengfeng GU Hongying TANG Xiaobing YUAN

PAPER-Sensing

Publicized: 2020/10/15
Vol: E104-B No:4
Page(s): 472-480
Source localization in a wireless sensor network (WSN) is sensitive to the sensors' positions. In practice, due to mobility, the receivers' positions may be known inaccurately, leading to non-negligible degradation in source localization estimation performance. The goal of this paper is to develop a semidefinite programming (SDP) method using time-difference-of-arrival (TDOA) and frequency-difference-of-arrival (FDOA) measurements that takes the sensor position uncertainties into account. Specifically, we transform the commonly used maximum likelihood estimator (MLE) problem into a convex optimization problem to obtain an initial estimate. To reduce the coupling between the position and velocity estimators, we also propose an iterative method that obtains the velocity and position by using the weighted least squares (WLS) method and the SDP method, respectively. Simulations show that the method can approach the Cramér-Rao lower bound (CRLB) under both mild and high noise levels.
