IEICE global.ieice.org Site

Keyword Search Result

[Keyword] (42807hit)

421-440hit(42807hit)

FOREWORD Open Access
Masahiro YAMAGUCHI

FOREWORD

Vol:
E107-C No:2
Page(s):
22-22
Interdigital and Multi-Via Structures for Mushroom-Type Metasurface Reflectors
Taisei URAKAMI Tamami MARUYAMA Shimpei NISHIYAMA Manato KUSAMIZU Akira ONO Takahiro SHIOZAWA

PAPER-Antennas and Propagation

Vol:
E107-B No:2
Page(s):
309-320
Novel patch element shapes with interdigital and multi-via structures are proposed for mushroom-type metasurface reflectors to control the reflection phase. The interdigital structure provides a wide reflection phase range by changing the depth of the interdigital fingers. In addition, the multi-via structure provides higher positive reflection phases, close to +180°. A sufficient reflection phase range of 360° and low polarization dependence were confirmed by electromagnetic field simulation. A metasurface reflector for a normally incident plane wave was designed, and the desired reflection angles and sharp far-field patterns of the reflected beams were confirmed in the simulation results. The prototype reflectors for the experiments should be designed in the same way as the primary reflector of a reflector antenna. Specifically, a reflector design method based on ray tracing using the incident wave phase is proposed for the prototype. The experimental radiation pattern of the reflector antenna, composed of the transmitting antenna (TX) and the prototype metasurface reflector, was similar to the simulated radiation pattern. These simulation and experimental results confirm the effectiveness of the proposed structures and their design methods.
An Adaptive Energy-Efficient Uneven Clustering Routing Protocol for WSNs
Mingyu LI Jihang YIN Yonggang XU Gang HUA Nian XU

PAPER-Network

Vol:
E107-B No:2
Page(s):
296-308
To address the “energy hole” problem caused by the random distribution of nodes in large-scale wireless sensor networks (WSNs), this paper proposes an adaptive energy-efficient balanced uneven clustering routing protocol (AEBUC) for WSNs. The competition radius is adaptively adjusted based on the node density and the distance from the candidate cluster head (CH) to the base station (BS) to achieve scale-controlled adaptive optimal clustering; among candidate CHs, the energy relative density and candidate CH relative density are jointly considered to achieve dynamic CH selection. For inter-cluster communication, a relay communication cost function is established based on the principle of energy balance and combined with the minimum spanning tree method to realize optimized inter-cluster multi-hop routing, forming an efficient communication routing tree. The experimental results show that the protocol effectively saves network energy, significantly extends network lifetime, and better solves the “energy hole” problem.
A Recommendation-Based Auxiliary Caching for Mapping Record
Zhaolin MA Jiali YOU Haojiang DENG

PAPER-Fundamental Theories for Communications

Vol:
E107-B No:2
Page(s):
286-295
Due to the increase in the volume of data and intensified concurrent requests, distributed caching is commonly used to manage high-concurrency requests and alleviate pressure on databases. However, there is limited research on distributed caching of mapping records, and traditional caching algorithms have suboptimal resolution performance for mapping records, which typically follow a long-tail distribution. To address this issue, we propose a recommendation-based adaptive auxiliary caching method, AC-REC, which delivers the primary cache record along with a list of additional cache records. The method uses request correlations as a basis for recommendations, customizes the number of additional cache entries provided, and dynamically adjusts the time-to-live. We conducted evaluations to compare the performance of our method against various benchmark strategies. The results show that our proposed method increased the cache hit ratio by an average of 20% compared to the conventional LCE method. Moreover, this improvement is achieved while effectively utilizing the cache space. We believe that our strategy will contribute an effective solution to related studies in both traditional network architectures and caching in paradigms like ICN.
Parity-Check Polarization-Adjusted Convolutional Coding
Qingping YU You ZHANG Renze LUO Longye WANG Xingwang LI

LETTER-Coding Theory

Publicized:
2023/07/27
Vol:
E107-A No:2
Page(s):
187-191
Polarization-adjusted convolutional (PAC) codes have better error-correcting performance than polar codes, mostly because of the improved weight distribution brought by the convolutional pre-transformation. In this paper, we propose parity-check PAC (PC-PAC) codes to further improve the error-correcting performance of PAC codes. The design principle is to establish parity-check functions between bits with distinct row weights, such that information bits of lower reliability are re-protected by the PC relation. Moreover, an algorithm for selecting which bits are involved in the parity-check functions is proposed to ensure that the constructed codes have fewer minimum-weight codewords. Simulation results show that the proposed PC-PAC codes can achieve a gain of nearly 0.2 dB over PAC codes at a frame error rate (FER) of about 10⁻³.
Development and Photoluminescence Properties of Dinuclear Eu(III)-β-Diketonates with a Branched Tetraphosphine Tetraoxide Ligand for Potential Use in LEDs as Red Phosphors Open Access
Hiroki IWANAGA Fumihiko AIGA Shin-ichi SASAOKA Takahiro WAZAKI

INVITED PAPER

Publicized:
2023/08/03
Vol:
E107-C No:2
Page(s):
34-41
In the field of micro-LED displays consisting of UV or blue LED arrays and phosphors, where the chips used are very small, the particle size of the phosphors must be small to suppress hue variation between pixels. In particular, there is a strong demand for a red phosphor with a small particle size. However, the quantum yields of inorganic phosphors decrease as their particle size gets smaller. In contrast, the quantum yields of organic phosphors and complexes do not decrease as the particle size gets smaller, because each molecule absorbs and emits light. We focus on Eu(III) complexes as candidate red phosphors for micro-LED displays because the color purity of their photoluminescence spectra is high, and we have tried to enhance their photoluminescence intensity by coordinating non-ionic ligands, specifically newly designed phosphine oxide ligands. Non-ionic ligands generally have less influence on the properties of complexes than ionic ligands, but offer a high degree of flexibility in molecular design. We found a novel molecular design concept for phosphine oxide ligands that enhances the photoluminescence properties of Eu(III) complexes. In this work, novel dinuclear Eu(III)-β-diketonates with a branched tetraphosphine tetraoxide ligand, TDPBPO and TDPPPO, were developed. The ligands are designed to have two different phosphine oxide portions: one has aromatic substituents and the other has no aromatic substituent. The TDPBPO and TDPPPO ligands increase the absolute quantum yields of Eu(III)-β-diketonates. Eu(III)-β-diketonates with branched tetraphosphine tetraoxide ligands have sharp red emissions and excellent quantum yields, and their pure and strong photoluminescence makes them promising candidates for micro-LED displays, security media, and sensing.
Re-Evaluating Syntax-Based Negation Scope Resolution
Asahi YOSHIDA Yoshihide KATO Shigeki MATSUBARA

LETTER-Natural Language Processing

Publicized:
2023/10/16
Vol:
E107-D No:1
Page(s):
165-168
Negation scope resolution is the process of detecting the negated part of a sentence. Unlike the syntax-based approach employed in previous research, state-of-the-art methods perform better without the explicit use of syntactic structure. This work revisits the syntax-based approach and re-evaluates the effectiveness of syntactic structure in negation scope resolution. We replace the parser utilized in the prior works with state-of-the-art parsers and modify the syntax-based heuristic rules. The experimental results demonstrate that these simple modifications raise the performance of the prior syntax-based method to the same level as state-of-the-art end-to-end neural methods.
Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
Peng GAO Xin-Yue ZHANG Xiao-Li YANG Jian-Cheng NI Fei WANG

LETTER-Image Recognition, Computer Vision

Publicized:
2023/10/20
Vol:
E107-D No:1
Page(s):
161-164
Although Siamese trackers have attracted much attention in recent years due to their scalability and efficiency, researchers have ignored the background appearance, which makes the trackers inapplicable to recognizing arbitrary target objects with various variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker in which shifted windows multi-head self-attention is used to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of our proposed tracker, we use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.
Lightweight and Fast Low-Light Image Enhancement Method Based on PoolFormer
Xin HU Jinhua WANG Sunhan XU

LETTER-Image Processing and Video Processing

Publicized:
2023/10/05
Vol:
E107-D No:1
Page(s):
157-160
Images captured in low-light environments have low visibility and high noise, which seriously affects subsequent visual tasks such as target detection and face recognition. Therefore, low-light image enhancement is of great significance in obtaining high-quality images and is a challenging problem in computer vision. A low-light enhancement model, LLFormer, based on the Vision Transformer, uses axis-based multi-head self-attention and a cross-layer attention fusion mechanism to reduce complexity and achieve feature extraction. This algorithm can enhance images well. However, the calculation of the attention mechanism is complex and the number of parameters is large, which limits the application of the model in practice. In response to this problem, a lightweight module, PoolFormer, is used to replace the attention module with spatial pooling, which can increase the parallelism of the network and greatly reduce the number of model parameters. To suppress image noise and improve visual effects, a new loss function is constructed for model optimization. The experimental results show that the proposed method not only reduces the number of parameters by 49%, but also performs better in terms of image detail restoration and noise suppression compared with the baseline model. On the LOL dataset, the PSNR and SSIM were 24.098 dB and 0.8575, respectively. On the MIT-Adobe FiveK dataset, the PSNR and SSIM were 27.060 dB and 0.9490, respectively. The evaluation results on the two datasets are better than those of current mainstream low-light enhancement algorithms.
A CNN-Based Multi-Scale Pooling Strategy for Acoustic Scene Classification
Rong HUANG Yue XIE

LETTER-Speech and Hearing

Publicized:
2023/10/17
Vol:
E107-D No:1
Page(s):
153-156
Acoustic scene classification (ASC) is a fundamental domain within artificial intelligence classification tasks. ASC tasks commonly employ models based on convolutional neural networks (CNNs) that take log-Mel spectrograms as input for gathering acoustic features. In this paper, we design a CNN-based multi-scale pooling (MSP) strategy for ASC. The log-Mel spectrogram used as input to the CNN is partitioned into four segments along the frequency axis, and four CNN channels receive the inputs from these distinct frequency ranges. The high-level features extracted from the outputs in the various frequency bands are integrated through frequency pyramid average pooling layers at multiple levels. Subsequently, a softmax classifier is employed to classify different scenes. Tests on two acoustic datasets show that the designed model leads to a significant improvement in performance.
Shared Latent Embedding Learning for Multi-View Subspace Clustering
Zhaohu LIU Peng SONG Jinshuai MU Wenming ZHENG

LETTER-Artificial Intelligence, Data Mining

Publicized:
2023/10/17
Vol:
E107-D No:1
Page(s):
148-152
Most existing multi-view subspace clustering approaches only capture the inter-view similarities between different views and ignore the optimal local geometric structure of the original data. To this end, in this letter, we put forward a novel method named shared latent embedding learning for multi-view subspace clustering (SLE-MSC), which can efficiently capture a better latent space. To be specific, we introduce a pseudo-label constraint to capture the intra-view similarities within each view. Meanwhile, we utilize a novel optimal graph Laplacian to learn the consistent latent representation, in which the common manifold is considered as the optimal manifold to obtain a more reasonable local geometric structure. Comprehensive experimental results indicate the superiority and effectiveness of the proposed method.
Negative Learning to Prevent Undesirable Misclassification
Kazuki EGASHIRA Atsuyuki MIYAI Qing YU Go IRIE Kiyoharu AIZAWA

LETTER-Artificial Intelligence, Data Mining

Publicized:
2023/10/05
Vol:
E107-D No:1
Page(s):
144-147
We propose a novel classification problem setting where Undesirable Classes (UCs) are defined for each class. A UC is a class that one specifically wants to avoid assigning to a sample by mistake. To address this setting, we propose a framework that reduces the probabilities for UCs while increasing the probability for the correct class.
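The abstract does not give the loss used by this framework. As a rough illustration only, the idea of raising the correct-class probability while lowering the UC probabilities could be sketched as a cross-entropy term plus a penalty of the form -log(1 - p_u) for each undesirable class u. The function names, the penalty form, and the `weight` parameter below are assumptions made for this sketch, not the authors' formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uc_aware_loss(logits, target, undesirable, weight=1.0):
    """Hypothetical loss: cross-entropy on the target class plus a
    penalty that grows as probability mass moves onto undesirable classes.

    logits      -- raw class scores for one sample
    target      -- index of the correct class
    undesirable -- indices of the UCs defined for the target class
    weight      -- assumed trade-off factor between the two terms
    """
    probs = softmax(logits)
    eps = 1e-12  # guards against log(0)
    cross_entropy = -math.log(probs[target] + eps)
    penalty = -sum(math.log(1.0 - probs[u] + eps) for u in undesirable)
    return cross_entropy + weight * penalty
```

With this form, two predictions that give the correct class the same probability are still ranked differently: the one that places its remaining mass on a UC receives the larger loss.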
Inference Discrepancy Based Curriculum Learning for Neural Machine Translation
Lei ZHOU Ryohei SASANO Koichi TAKEDA

PAPER-Natural Language Processing

Publicized:
2023/10/18
Vol:
E107-D No:1
Page(s):
135-143
In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In human learning, if we cannot reproduce something correctly after learning it multiple times, we consider it to be more difficult. Likewise, a training example that causes a large discrepancy between inference and reference implies higher learning difficulty for the NMT model. Therefore, we propose to adopt the inference discrepancy of each training example as the difficulty criterion and to rank training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself. This training scheme can be viewed as a pretrained vanilla model guiding the learning process of a curriculum NMT model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics, and different curriculum schedules. Experimental results on the translation benchmarks WMT14 English ⇒ German, WMT17 Chinese ⇒ English, and Multitarget TED Talks Task (MTTT) English ⇔ German, English ⇔ Chinese, and English ⇔ Russian demonstrate that our proposed method consistently improves translation performance over the advanced Transformer baseline.
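The ranking step described in this abstract can be sketched in a few lines. The abstract does not say how the discrepancy between inference and reference is measured, so the sketch below substitutes a simple bag-of-words F1 score as a stand-in; the `discrepancy` metric, the `translate` callable, and both function names are assumptions for illustration only.

```python
from collections import Counter


def discrepancy(hypothesis, reference):
    """Stand-in difficulty score: 1 minus the bag-of-words F1 between
    the model output and the reference (0 = identical bags, 1 = disjoint)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 1.0
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 1.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 1.0 - 2 * precision * recall / (precision + recall)


def rank_easy_to_hard(examples, translate):
    """Order (source, reference) pairs by ascending discrepancy.

    examples  -- list of (source, reference) string pairs
    translate -- callable mapping a source string to the trained
                 model's output (hypothetical interface)
    """
    scored = [(discrepancy(translate(src), ref), src, ref)
              for src, ref in examples]
    scored.sort(key=lambda item: item[0])  # stable: ties keep input order
    return [(src, ref) for _, src, ref in scored]
```

In practice a sentence-level translation metric would replace the stand-in score, and the sorted list would then be fed to whichever curriculum schedule is being evaluated.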
Multi-Task Learning of Japanese How-to Tip Machine Reading Comprehension by a Generative Model
Xiaotian WANG Tingxuan LI Takuya TAMURA Shunsuke NISHIDA Takehito UTSURO

PAPER-Natural Language Processing

Publicized:
2023/10/23
Vol:
E107-D No:1
Page(s):
125-134
In research on machine reading comprehension for Japanese how-to tip QA tasks, conventional extractive machine reading comprehension methods have difficulty dealing with cases in which the answer string spans multiple locations in the context. Fine-tuning the BERT model for machine reading comprehension tasks is not suitable for such cases. In this paper, we trained a generative machine reading comprehension model for Japanese how-to tips by constructing a generative dataset based on the website “wikihow” as a source of information. We then proposed two methods for multi-task learning to fine-tune the generative model. The first method is multi-task learning with a generative and extractive hybrid training dataset, where a single model is trained simultaneously on both generative and extractive datasets. The second method is multi-task learning with inter-sentence semantic similarity and answer generation, where, in addition to the answer generation task, the model learns the distance between the sentences of the question/context and the answer in the training examples. The evaluation results showed that both multi-task learning methods significantly outperformed the single-task learning method on generative question-and-answer examples. Of the two multi-task learning methods, the one with inter-sentence semantic similarity and answer generation performed best in the manual evaluation. The data and the code are available at https://github.com/EternalEdenn/multitask_ext-gen_sts-gen.
Improved Head and Data Augmentation to Reduce Artifacts at Grid Boundaries in Object Detection
Shinji UCHINOURA Takio KURITA

PAPER-Image Recognition, Computer Vision

Publicized:
2023/10/23
Vol:
E107-D No:1
Page(s):
115-124
We investigated the influence of horizontal shifts of the input images on one-stage object detection methods. We found that the object detector's class scores drop when the target object center is at the grid boundary. Many approaches have focused on reducing the aliasing effect of down-sampling to achieve shift-invariance. However, down-sampling does not completely solve this problem at the grid boundary; it is necessary to suppress the dispersion of features in pixels close to the grid boundary into adjacent grid cells. Therefore, this paper proposes two approaches focused on the grid boundary to improve this weak point of current object detection methods. One is the Sub-Grid Feature Extraction Module, in which the sub-grid features are added to the input of the classification head. The other is Grid-Aware Data Augmentation, where augmented data are generated by grid-level shifts and used in training. The effectiveness of the proposed approaches is demonstrated on the COCO validation set after applying the proposed method to the FCOS architecture.
Efficient Action Spotting Using Saliency Feature Weighting
Yuzhi SHI Takayoshi YAMASHITA Tsubasa HIRAKAWA Hironobu FUJIYOSHI Mitsuru NAKAZAWA Yeongnam CHAE Björn STENGER

PAPER-Image Processing and Video Processing

Publicized:
2023/10/17
Vol:
E107-D No:1
Page(s):
105-114
Action spotting is a key component in high-level video understanding. The large number of similar frames poses a challenge for recognizing actions in videos. In this paper, we use frame saliency to represent the importance of frames, guiding the model to focus on keyframes. We propose a frame saliency weighting module to improve frame saliency and video representation at the same time. Our proposed model contains two encoders, for pre-action and post-action time windows, to encode video context. We validate our design choices and the generality of the proposed method in extensive experiments. On the public SoccerNet-v2 dataset, the method achieves an average mAP of 57.3%, improving over the state of the art. Using embedding features obtained from multiple feature extractors, the average mAP further increases to 75%. We show that reducing the model size by over 90% does not significantly impact performance. Additionally, we use ablation studies to demonstrate the effectiveness of the saliency weighting module. Further, we show that our frame saliency weighting strategy is applicable to existing methods on more general action datasets, such as SoccerNet-v1, ActivityNet v1.3, and UCF101.
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Kenichi FUJITA Atsushi ANDO Yusuke IJIMA

PAPER-Speech and Hearing

Publicized:
2023/10/06
Vol:
E107-D No:1
Page(s):
93-104
This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. We conducted three experiments, speaker embeddings generation, speech synthesis with generated embeddings, and embedding space analysis, to evaluate the performance. The proposed method demonstrated a moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the relation analysis between the closeness indicated that the distribution of embeddings reflects the subjective and objective similarity.
Research on Lightweight Acoustic Scene Perception Method Based on Drunkard Methodology
Wenkai LIU Lin ZHANG Menglong WU Xichang CAI Hongxia DONG

PAPER-Artificial Intelligence, Data Mining

Publicized:
2023/10/23
Vol:
E107-D No:1
Page(s):
83-92
The goal of Acoustic Scene Classification (ASC) is to simulate human analysis of the surrounding environment and make accurate decisions promptly. Extracting useful information from audio signals in real-world scenarios is challenging and can lead to suboptimal performance in acoustic scene classification, especially in environments with relatively homogeneous backgrounds. To address this problem, we model the sobering-up process of “drunkards” in real-life and the guiding behavior of normal people, and construct a high-precision lightweight model implementation methodology called the “drunkard methodology”. The core idea includes three parts: (1) designing a special feature transformation module based on the different mechanisms of information perception between drunkards and ordinary people, to simulate the process of gradually sobering up and the changes in feature perception ability; (2) studying a lightweight “drunken” model that matches the normal model's perception processing process. The model uses a multi-scale class residual block structure and can obtain finer feature representations by fusing information extracted at different scales; (3) introducing a guiding and fusion module of the conventional model to the “drunken” model to speed up the sobering-up process and achieve iterative optimization and accuracy improvement. Evaluation results on the official dataset of DCASE2022 Task1 demonstrate that our baseline system achieves 40.4% accuracy and 2.284 loss under the condition of 442.67K parameters and 19.40M MAC (multiply-accumulate operations). After adopting the “drunkard” mechanism, the accuracy is improved to 45.2%, and the loss is reduced by 0.634 under the condition of 551.89K parameters and 23.6M MAC.
A Novel Double-Tail Generative Adversarial Network for Fast Photo Animation
Gang LIU Xin CHEN Zhixiang GAO

PAPER-Artificial Intelligence, Data Mining

Publicized:
2023/09/28
Vol:
E107-D No:1
Page(s):
72-82
Photo animation transforms photos of real-world scenes into anime-style images, which is a challenging task in AIGC (AI Generated Content). Although previous methods have achieved promising results, they often introduce noticeable artifacts or distortions. In this paper, we propose a novel double-tail generative adversarial network (DTGAN) for fast photo animation. DTGAN is the third version of the AnimeGAN series and is therefore also called AnimeGANv3. The generator of DTGAN has two output tails: a support tail for outputting coarse-grained anime-style images and a main tail for refining them. In DTGAN, we propose a novel learnable normalization technique, termed linearly adaptive denormalization (LADE), to prevent artifacts in the generated images. To improve the visual quality of the generated anime-style images, two novel loss functions suitable for photo animation are proposed: 1) the region smoothing loss function, which weakens the texture details of the generated images to achieve anime effects with abstract details; 2) the fine-grained revision loss function, which eliminates artifacts and noise in the generated anime-style image while preserving clear edges. Furthermore, the generator of DTGAN is a lightweight framework with only 1.02 million parameters in the inference phase. The proposed DTGAN can be easily trained end-to-end with unpaired training data. Extensive experiments have been conducted to demonstrate qualitatively and quantitatively that our method can produce high-quality anime-style images from real-world photos and performs better than state-of-the-art models.
Testing and Delay-Monitoring for the High Reliability of Memory-Based Programmable Logic Device
Xihong ZHOU Senling WANG Yoshinobu HIGAMI Hiroshi TAKAHASHI

PAPER-Dependable Computing

Publicized:
2023/10/03
Vol:
E107-D No:1
Page(s):
60-71
Memory-based Programmable Logic Device (MPLD) is a new type of reconfigurable device constructed using a general SRAM array in a unique interconnect configuration. This research aims to propose approaches to guarantee the long-term reliability of MPLDs, including a test method to identify interconnect defects in the SRAM array during the production phase and a delay monitoring technique to detect aging-caused failures. The proposed test method configures pre-generated test configuration data into SRAMs to create fault propagation paths, applies an external walking-zero/one vector to excite faults, and identifies faults at the external output ports. The proposed delay monitoring method configures a novel ring oscillator logic design into MPLD to measure delay variations when the device is in practical use. The logic simulation results with fault injection confirm the effectiveness of the proposed methods.
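For readers unfamiliar with the walking-zero/one vectors mentioned in this abstract, the generic pattern is easy to show: a walking-one sequence moves a single 1 across an otherwise all-zero word, and a walking-zero sequence is its complement. The sketch below only generates these generic patterns; the function names and the left-to-right bit order are choices made for this example, and how the vectors are applied to the MPLD's ports is not described in the abstract.

```python
def walking_one(width):
    """Return `width` vectors, each with a single 1 that moves one
    position per vector across an all-zero background."""
    return [[1 if bit == pos else 0 for bit in range(width)]
            for pos in range(width)]


def walking_zero(width):
    """Return `width` vectors, each with a single 0 that moves one
    position per vector across an all-one background."""
    return [[0 if bit == pos else 1 for bit in range(width)]
            for pos in range(width)]


if __name__ == "__main__":
    for vector in walking_one(4) + walking_zero(4):
        print("".join(str(bit) for bit in vector))
```

Because each vector differs from the background in exactly one position, a wrong value observed at an output can be traced back to the single line that was driven differently.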
