
Keyword Search Result

[Keyword] EE(4079hit)

321-340hit(4079hit)

  • Deep Metric Learning for Multi-Label and Multi-Object Image Retrieval

    Jonathan MOJOO  Takio KURITA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/03/08
      Vol:
    E104-D No:6
      Page(s):
    873-880

    Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence, we treat similarity as a continuous rather than a discrete quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.
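
    The Jaccard-based similarity described in this abstract can be illustrated with a short sketch. The `jaccard` function follows the standard intersection-over-union definition for two label sets. The function `adaptive_margin_triplet_loss` and its `base_margin` parameter are hypothetical: the abstract does not state how the margin is adapted, so scaling it by the similarity gap is only an assumption made for illustration.

```python
def jaccard(labels_a, labels_b):
    """Jaccard similarity: |A intersect B| / |A union B| for two label sets."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Two empty label sets: define the similarity as 0.0 to avoid 0/0.
        return 0.0
    return len(a & b) / len(union)


def adaptive_margin_triplet_loss(d_ap, d_an, sim_ap, sim_an, base_margin=1.0):
    """Hinge-style triplet loss with a similarity-dependent margin.

    d_ap, d_an: embedding distances anchor-positive and anchor-negative.
    sim_ap, sim_an: Jaccard similarities of the corresponding label sets.
    Assumption (not from the abstract): the margin grows with the gap
    between the two label-set similarities.
    """
    margin = base_margin * (sim_ap - sim_an)
    return max(0.0, d_ap - d_an + margin)


if __name__ == "__main__":
    anchor = {"cat", "dog"}
    positive = {"dog", "car"}
    negative = {"tree"}
    s_ap = jaccard(anchor, positive)
    s_an = jaccard(anchor, negative)
    print(s_ap, s_an, adaptive_margin_triplet_loss(0.5, 0.4, s_ap, s_an))
```

    Because the similarity is a continuous value in [0, 1], two images that share one of three labels are treated as less similar than two images with identical label sets, which a single-common-label rule cannot express.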

  • Uplink Frame Transmission with Functions of Adaptive Triggering and Resource Allocation of OFDMA in Interfering IEEE 802.11ax Wireless LANs

    Ryoichi TAKAHASHI  Yosuke TANIGAWA  Hideki TODE  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2020/12/09
      Vol:
    E104-B No:6
      Page(s):
    664-674

    In recent years, wireless LANs (WLANs) have been deployed in close proximity, so they interfere with one another. The number of mobile stations (MSs), such as smartphones, that connect to such WLANs is also increasing. In such interfering environments, radio interference frequency depends on MS position. In addition, as MSs and their applications become diverse, frame generation rates from MSs are also becoming more varied. Thus, sufficient frame transmission opportunities should be assigned to MSs regardless of their radio interference frequencies and frame generation rates. One key technology to deal with this issue is uplink orthogonal frequency division multiple access (OFDMA) transmission introduced in IEEE 802.11ax. However, existing works do not consider the differences in interference frequencies and frame generation rates among MSs in an integrated manner. This paper proposes an uplink frame transmission method for interfering WLAN environments that effectively uses OFDMA transmission to assign enough transmission opportunities to MSs regardless of their own interference frequencies and frame generation rates, while efficiently using the channel resource. Considering the combined problem, the proposed method allocates resource units (RUs), created by dividing the channel, to MSs. In addition, based on a mathematical analysis of the required frame transmission duration, the proposed method flexibly selects either OFDMA transmission or conventional frame transmission with CSMA/CA, which is also not considered in existing works.

  • An Automatic Detection Approach of Traumatic Bleeding Based on 3D CNN Networks

    Lei YANG  Tingxiao YANG  Hiroki KIMURA  Yuichiro YOSHIMURA  Kumiko ARAI  Taka-aki NAKADA  Huiqin JIANG  Toshiya NAKAGUCHI  

     
    PAPER

      Publicized:
    2021/01/18
      Vol:
    E104-A No:6
      Page(s):
    887-896

    In medical fields, detecting traumatic bleeding has always been a difficult task due to the small size and low contrast of targets and the large number of images. In this work we propose an automatic traumatic bleeding detection approach for contrast-enhanced CT images via deep CNN networks, consisting of a segmentation process and a classification process. CT values of DICOM images are first extracted and processed via three different window settings. Small 3D patches are cropped from the processed images and segmented by a 3D CNN network. The segmentation results are then converted to point cloud data format and classified by a classifier. The proposed pre-processing approach enables the segmentation network to detect small and low-contrast targets and achieve a high sensitivity. The additional classification network solves the boundary problem and the short-sighted problem generated during the segmentation process to further decrease false positives. The proposed approach is tested with 3 CT cases containing 37 bleeding regions. As a result, a total of 34 bleeding regions are correctly detected, and the sensitivity reaches 91.89%. The average number of false positives per test case is 1678. The classification step removes 46.1% of false positive predictions. The proposed method is shown to achieve a high sensitivity and to be able to serve as a reference for medical doctors.
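
    The reported sensitivity can be checked directly from the two counts given in the abstract (34 detected regions out of 37). The snippet below is a minimal sketch; the function name `sensitivity` is ours and only restates the usual definition, true positives divided by all actual positives.

```python
def sensitivity(true_positives, actual_positives):
    """Fraction of actual positive regions that were detected."""
    return true_positives / actual_positives


if __name__ == "__main__":
    # 34 of the 37 bleeding regions were detected, as stated in the abstract.
    print(f"{sensitivity(34, 37):.2%}")
```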

  • Graph Degree Heterogeneity Facilitates Random Walker Meetings

    Yusuke SAKUMOTO  Hiroyuki OHSAKI  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2020/12/14
      Vol:
    E104-B No:6
      Page(s):
    604-615

    Various graph algorithms have been developed with multiple random walks, the movement of several independent random walkers on a graph. Designing an efficient graph algorithm based on multiple random walks requires investigating multiple random walks theoretically to attain a deep understanding of their characteristics. The first meeting time is one of the important metrics for multiple random walks. The first meeting time on a graph is defined as the time it takes for multiple random walkers to meet at the same node in a graph. This time is closely related to the rendezvous problem, a fundamental problem in computer science. The first meeting time of multiple random walks has been analyzed previously, but many of these analyses focused on regular graphs. In this paper, we analyze the first meeting time of multiple random walks in arbitrary graphs and clarify the effects of graph structures on expected values. First, we derive the spectral formula of the expected first meeting time on the basis of spectral graph theory. Then, we examine the principal component of the expected first meeting time using the derived spectral formula. The clarified principal component reveals that (a) the expected first meeting time is almost dominated by $n/(1+d_{\rm std}^2/d_{\rm avg}^2)$ and (b) the expected first meeting time is independent of the starting nodes of random walkers, where $n$ is the number of nodes of the graph, and $d_{\rm avg}$ and $d_{\rm std}$ are the average and the standard deviation of weighted node degrees, respectively. Characteristic (a) is useful for understanding the effect of the graph structure on the first meeting time. According to the revealed effect of graph structures, the coefficient of variation $d_{\rm std}/d_{\rm avg}$ (degree heterogeneity) for weighted degrees facilitates the meeting of random walkers.
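
    To make characteristic (a) concrete, the dominant term n/(1 + d_std^2/d_avg^2) can be evaluated for a given list of weighted node degrees. This is only a sketch of that one expression, not the paper's spectral formula. The function name `meeting_time_scale` is hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator is meant.

```python
from statistics import mean, pstdev


def meeting_time_scale(degrees):
    """Evaluate n / (1 + (d_std / d_avg)^2) for a list of node degrees.

    Assumption: d_std is the population standard deviation of the degrees.
    """
    n = len(degrees)
    d_avg = mean(degrees)
    d_std = pstdev(degrees)
    return n / (1 + (d_std / d_avg) ** 2)


if __name__ == "__main__":
    regular = [3] * 10          # every node has the same degree
    star = [4, 1, 1, 1, 1]      # one hub connected to four leaves
    print(meeting_time_scale(regular))
    print(meeting_time_scale(star))
```

    On a regular graph the standard deviation is zero, so the term equals n. Any spread in the degrees makes the denominator larger than 1 and the term smaller than n, which matches the claim that degree heterogeneity facilitates meetings.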

  • Suppression in Quality Variation for 360-Degree Tile-Based Video Streaming

    Arisa SEKINE  Masaki BANDAI  

     
    PAPER-Network

      Publicized:
    2020/12/17
      Vol:
    E104-B No:6
      Page(s):
    616-623

    For 360-degree video streaming, a 360-degree video is divided into segments temporally (i.e. some seconds). Each segment consists of multiple video tiles spatially. In this paper, we propose a tile quality selection method for tile-based video streaming. The proposed method suppresses the spatial quality variation within the viewport caused by a change of the viewport region due to user head movement. In the proposed method, the client checks whether the difference in quality level between the viewport and the region around the viewport is large, and if so, reduces it when assigning quality levels. Simulation results indicate that when the segment length is long, quality variation can be suppressed without significantly reducing the perceived video quality (in terms of bitrate). In particular, the quality variation within the viewport can be greatly suppressed. Furthermore, we verify that the proposed method is effective in reducing quality variation within the viewport and across segments without changing the total download size.

  • Fabrication of Silicon Nanowires by Metal-Catalyzed Electroless Etching Method and Their Application in Solar Cell Open Access

    Naraphorn TUNGHATHAITHIP  Chutiparn LERTVACHIRAPAIBOON  Kazunari SHINBO  Keizo KATO  Sukkaneste TUNGASMITA  Akira BABA  

     
    BRIEF PAPER

      Publicized:
    2020/12/08
      Vol:
    E104-C No:6
      Page(s):
    180-183

    We fabricated silicon nanowires (SiNWs) using a metal-catalyzed electroless etching method, which is known to be a low-cost and simple technique. SiNW arrays with a length of 540 nm were used as the substrate of a SiNWs/PEDOT:PSS hybrid solar cell. Furthermore, gold nanoparticles (AuNPs) were used to improve the light absorption of the device through localized surface plasmon excitation. The results show that the short-circuit current density and the power conversion efficiency increased from 22.1 mA/cm² to 26.0 mA/cm² and from 6.91% to 8.56%, respectively. Using SiNW arrays provided a larger interface area between the organic and inorganic semiconductors, and incorporating AuNPs increased light absorption; together these improved the performance of the developed solar cell.

  • Two-Sided LPC-Based Speckle Noise Removal for Laser Speech Detection Systems

    Yahui WANG  Wenxi ZHANG  Xinxin KONG  Yongbiao WANG  Hongxin ZHANG  

     
    PAPER-Speech and Hearing

      Publicized:
    2021/03/17
      Vol:
    E104-D No:6
      Page(s):
    850-862

    Laser speech detection uses a non-contact Laser Doppler Vibrometry (LDV)-based acoustic sensor to obtain speech signals by precisely measuring voice-generated surface vibrations. Over long distances, however, the detected signal is very weak and full of speckle noise. To enhance the quality and intelligibility of the detected signal, we designed a two-sided Linear Prediction Coding (LPC)-based locator and interpolator to detect and replace speckle noise. We first studied the characteristics of speckle noise in detected signals and developed a binary-state statistical model for speckle noise generation. A two-sided LPC-based locator was then designed to locate the polluted samples, composed of an inverse decorrelator, nonlinear filter and threshold estimator. This greatly improves the detectability of speckle noise and avoids false/missed detection by improving the noise-to-signal-ratio (NSR). Finally, samples from both sides of the speckle noise were used to estimate the parameters of the interpolator and to code samples for replacing the polluted samples. Real-world speckle noise removal experiments and simulation-based comparative experiments were conducted and the results show that the proposed method is better able to locate speckle noise in laser detected speech and highly effective at replacing it.

  • Differentially Private Neural Networks with Bounded Activation Function

    Kijung JUNG  Hyukki LEE  Yon Dohn CHUNG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2021/03/18
      Vol:
    E104-D No:6
      Page(s):
    905-908

    Deep learning has shown outstanding performance in various fields, and it is increasingly deployed in privacy-critical domains. If sensitive data in a deep learning model are exposed, this can cause serious privacy threats. To protect individual privacy, we propose a novel activation function and stochastic gradient descent for applying differential privacy to deep learning. Through experiments, we show that the proposed method can effectively protect privacy and that its performance is better than that of previous approaches.

  • Preliminary Performance Analysis of Distributed DNN Training with Relaxed Synchronization

    Koichi SHIRAHATA  Amir HADERBACHE  Naoto FUKUMOTO  Kohta NAKASHIMA  

     
    BRIEF PAPER

      Publicized:
    2020/12/01
      Vol:
    E104-C No:6
      Page(s):
    257-260

    Scalability of distributed DNN training can be limited by slowdown of specific processes due to unexpected hardware failures. We propose a dynamic process exclusion technique so that training throughput is maximized. Our evaluation using 32 processes with ResNet-50 shows that our proposed technique reduces slowdown by 12.5% to 50% without accuracy loss through excluding the slow processes.

  • Action Recognition Using Pose Data in a Distributed Environment over the Edge and Cloud

    Chikako TAKASAKI  Atsuko TAKEFUSA  Hidemoto NAKADA  Masato OGUCHI  

     
    PAPER

      Publicized:
    2021/02/02
      Vol:
    E104-D No:5
      Page(s):
    539-550

    With the development of cameras and sensors and the spread of cloud computing, life logs can be easily acquired and stored in general households for the various services that utilize the logs. However, it is difficult to analyze moving images that are acquired by home sensors in real time using machine learning because the data size is too large and the computational complexity is too high. Moreover, collecting and accumulating in the cloud moving images that are captured at home and can be used to identify individuals may invade the privacy of application users. We propose a method of distributed processing over the edge and cloud that addresses the processing latency and the privacy concerns. On the edge (sensor) side, we extract feature vectors of human key points from moving images using OpenPose, which is a pose estimation library. On the cloud side, we recognize actions by machine learning using only the feature vectors. In this study, we compare the action recognition accuracies of multiple machine learning methods. In addition, we measure the analysis processing time at the sensor and the cloud to investigate the feasibility of recognizing actions in real time. Then, we evaluate the proposed system by comparing it with the 3D ResNet model in recognition experiments. The experimental results demonstrate that the action recognition accuracy is the highest when using LSTM and that the introduction of dropout in action recognition using 100 categories alleviates overfitting because the models can learn more generic human actions by increasing the variety of actions. In addition, it is demonstrated that preprocessing using OpenPose on the sensor side can substantially reduce the transfer quantity from the sensor to the cloud.

  • Optimization of Hybrid Energy System Configuration for Marine Diesel Engine Open Access

    Guangmiao ZENG  Rongjie WANG  Ran HAN  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2020/11/11
      Vol:
    E104-A No:5
      Page(s):
    786-796

    Because solar energy is intermittent and a ship's power-system load fluctuates and changes abruptly, in this work, the solar radiation parameters were adjusted according to the latitude and longitude of the ship and the change of the sea environment. An objective function was constructed that accounted for the cost and service life simultaneously to optimize the configuration of the marine diesel engine hybrid energy system. Finally, the improved artificial bee colony algorithm was used to optimize and obtain the optimal system configuration. The feasibility of the method was verified by ship navigation tests. This method exhibited better configuration performance optimization than the traditional methods.

  • A Modified Whale Optimization Algorithm for Pattern Synthesis of Linear Antenna Array

    Wentao FENG  Dexiu HU  

     
    LETTER-Numerical Analysis and Optimization

      Publicized:
    2020/11/09
      Vol:
    E104-A No:5
      Page(s):
    818-822

    A modified whale optimization algorithm (MWOA) with a dynamic leader selection mechanism and a novel population updating procedure is introduced for pattern synthesis of linear antenna arrays. The current best solution is dynamically changed for each whale agent to avoid premature convergence to local optima during iteration. A hybrid crossover operator is embedded in the original algorithm to improve the convergence accuracy of the solution. Moreover, the flow of population updating is optimized to balance the exploitation and exploration abilities. The modified algorithm is tested on a 28-element uniform linear antenna array to reduce its side lobe level and null depth level. The simulation results show that MWOA clearly improves on the performance of WOA and compares favorably with other algorithms.

  • NetworkAPI: An In-Band Signalling Application-Aware Traffic Engineering Using SRv6 and IP Anycast

    Takuya MIYASAKA  Yuichiro HEI  Takeshi KITAHARA  

     
    PAPER

      Publicized:
    2021/02/22
      Vol:
    E104-D No:5
      Page(s):
    617-627

    Application-aware Traffic Engineering (TE) plays a crucial role in ensuring quality of services (QoS) for recently emerging applications such as AR, VR, cloud gaming, and connected vehicles. While a deterministic application-aware TE is required for these mission-critical applications, a negotiation procedure between applications and network operators needs to undergo major simplification to fulfill the scalability of the application based on emerging microservices and container-based architecture. In this paper, we propose a NetworkAPI framework which allows an application to indicate a desired TE behavior inside IP packets by leveraging Segment Routing over IPv6 (SRv6). In the NetworkAPI framework, the TE behavior provided by the network operator is expressed as an SRv6 Segment Identifier (SID) in the form of a 128-bit IPv6 address. Because the IPv6 address of an SRv6 SID is distributed using IP anycast, the application can utilize the unchanged SRv6 SID regardless of the application's location, as if the application controls an API on the transport network. We implement a prototype of the NetworkAPI framework on a Linux kernel. On the prototype implementation, a basic packet forwarding performance is evaluated to demonstrate the feasibility of our framework.
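
    The statement that a TE behavior is expressed as an SRv6 SID in the form of a 128-bit IPv6 address can be sketched with Python's standard `ipaddress` module. The helper `make_sid`, the locator prefix `2001:db8:0:1::/64` (taken from the IPv6 documentation range), and the function ID `0x64` are all hypothetical values chosen for illustration; they are not taken from the paper or its prototype.

```python
import ipaddress


def make_sid(locator, function_id):
    """Combine a locator prefix and a function ID into one IPv6 address.

    Hypothetical layout: the locator occupies the prefix bits and the
    function ID occupies the remaining low-order bits of the 128-bit SID.
    """
    net = ipaddress.IPv6Network(locator)
    free_bits = net.max_prefixlen - net.prefixlen
    if not 0 <= function_id < (1 << free_bits):
        raise ValueError("function_id does not fit in the non-locator bits")
    return ipaddress.IPv6Address(int(net.network_address) | function_id)


if __name__ == "__main__":
    sid = make_sid("2001:db8:0:1::/64", 0x64)
    print(sid, sid.max_prefixlen)
```

    An application would place such an address in its packets to select the corresponding behavior. Because the same address is advertised by anycast, the value does not need to change when the application moves.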

  • Joint Channel Allocation and Routing for ZigBee/Wi-Fi Coexistent Networks

    Yosuke TANIGAWA  Shu NISHIKORI  Kazuhiko KINOSHITA  Hideki TODE  Takashi WATANABE  

     
    PAPER

      Publicized:
    2021/02/16
      Vol:
    E104-D No:5
      Page(s):
    575-584

    With the widespread diffusion of the Internet of Things (IoT), the number of applications using wireless sensor devices is increasing, and the Quality of Service (QoS) required for these applications is diversifying. Thus, it becomes difficult to satisfy a variety of QoS requirements with a single wireless system, and many kinds of wireless systems are working in the same domains: time, frequency, and place. This paper considers coexistence environments of ZigBee and Wi-Fi networks, which use the same frequency band channels, in the same place. In such coexistence environments, ZigBee devices suffer radio interference from Wi-Fi networks, which results in severe ZigBee packet losses because the transmission power of Wi-Fi is much higher than that of ZigBee. Many existing methods to avoid interference from Wi-Fi networks focus on only one of the time, frequency, or space domains. However, such avoidance in one domain is insufficient, particularly in near-future IoT environments where more ZigBee devices and Wi-Fi stations transfer larger amounts of data. Therefore, in this paper, we propose joint channel allocation and routing in both the frequency and space domains. Finally, we show the effectiveness of the proposed method by computer simulation.

  • HAIF: A Hierarchical Attention-Based Model of Filtering Invalid Webpage

    Chaoran ZHOU  Jianping ZHAO  Tai MA  Xin ZHOU  

     
    PAPER

      Publicized:
    2021/02/25
      Vol:
    E104-D No:5
      Page(s):
    659-668

    In Internet applications, when users search for information, the search engines invariably return some invalid webpages that do not contain valid information. These invalid webpages interfere with the users' access to useful information, affect the efficiency of users' information query and occupy Internet resources. Accurate and fast filtering of invalid webpages can purify the Internet environment and provide convenience for netizens. This paper proposes an invalid webpage filtering model (HAIF) based on deep learning and hierarchical attention mechanism. HAIF improves the semantic and sequence information representation of webpage text by concatenating lexical-level embeddings and paragraph-level embeddings. HAIF introduces hierarchical attention mechanism to optimize the extraction of text sequence features and webpage tag features. Among them, the local-level attention layer optimizes the local information in the plain text. By concatenating the input embeddings and the feature matrix after local-level attention calculation, it enriches the representation of information. The tag-level attention layer introduces webpage structural feature information on the attention calculation of different HTML tags, so that HAIF is better applicable to the Internet resource field. In order to evaluate the effectiveness of HAIF in filtering invalid pages, we conducted various experiments. Experimental results demonstrate that, compared with other baseline models, HAIF has improved to various degrees on various evaluation criteria.

  • Non-Invasive Monitoring of Respiratory Rate and Respiratory Status during Sleep Using a Passive Radio-Frequency Identification System

    Kagome NAYA  Toshiaki MIYAZAKI  Peng LI  

     
    PAPER-Biological Engineering

      Publicized:
    2021/02/22
      Vol:
    E104-D No:5
      Page(s):
    762-771

    In recent years, checking sleep quality has become essential from a healthcare perspective. In this paper, we propose a respiratory rate (RR) monitoring system that can be used in the bedroom without wearing any sensor devices directly. To develop the system, passive radio-frequency identification (RFID) tags are introduced and attached to a blanket, instead of attaching them to the human body. The received signal strength indicator (RSSI) and phase values of the passive RFID tags are continuously obtained using an RFID reader through antennas located at the bedside. The RSSI and phase values change depending on the respiration of the person wearing the blanket. Thus, we can estimate the RR using these values. After providing an overview of the proposed system, the RR estimation flow is explained in detail. The processing flow includes noise elimination and irregular breathing period estimation methods. The evaluation demonstrates that the proposed system can estimate the RR and respiratory status without considering the user's body posture, body type, gender, or change in the RR.

  • MTGAN: Extending Test Case set for Deep Learning Image Classifier

    Erhu LIU  Song HUANG  Cheng ZONG  Changyou ZHENG  Yongming YAO  Jing ZHU  Shiqi TANG  Yanqiu WANG  

     
    PAPER-Software Engineering

      Publicized:
    2021/02/05
      Vol:
    E104-D No:5
      Page(s):
    709-722

    During the recent several years, deep learning has achieved excellent results in image recognition, voice processing, and other research areas, which has set off a new upsurge of research and application. Internal defects and external malicious attacks may threaten the safe and reliable operation of a deep learning system and even cause unbearable consequences. The technology of testing deep learning systems is still in its infancy. Traditional software testing technology is not applicable to test deep learning systems. In addition, the characteristics of deep learning such as complex application scenarios, the high dimensionality of input data, and poor interpretability of operation logic bring new challenges to the testing work. This paper focuses on the problem of test case generation and points out that adversarial examples can be used as test cases. Then the paper proposes MTGAN which is a framework to generate test cases for deep learning image classifiers based on Generative Adversarial Network. Finally, this paper evaluates the effectiveness of MTGAN.

  • Efficient Hardware Accelerator for Compressed Sparse Deep Neural Network

    Hao XIAO  Kaikai ZHAO  Guangzhu LIU  

     
    LETTER-Computer System

      Publicized:
    2021/02/19
      Vol:
    E104-D No:5
      Page(s):
    772-775

    This work presents a DNN accelerator architecture specifically designed for performing efficient inference on compressed and sparse DNN models. Leveraging the data sparsity, a runtime processing scheme is proposed to deal with the encoded weights and activations directly in the compressed domain without decompressing. Furthermore, a new data flow is proposed to facilitate the reusage of input activations across the fully-connected (FC) layers. The proposed design is implemented and verified using the Xilinx Virtex-7 FPGA. Experimental results show it achieves 1.99×, 1.95× faster and 20.38×, 3.04× more energy efficient than CPU and mGPU platforms, respectively, running AlexNet.
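
    The idea of operating on encoded weights and activations without decompressing them can be illustrated in software with a sparse dot product. The abstract does not describe the accelerator's actual encoding, so the (index, value) pair format and the function names `compress` and `sparse_dot` below are assumptions used only to show the principle: work is done only where both operands are nonzero.

```python
def compress(dense):
    """Encode a dense vector as a list of (index, value) pairs for nonzeros."""
    return [(i, v) for i, v in enumerate(dense) if v != 0]


def sparse_dot(a, b):
    """Dot product of two compressed vectors with sorted indices.

    Walks both lists in step and multiplies only where the indices match,
    so the dense vectors are never reconstructed.
    """
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        ia, va = a[i]
        ib, vb = b[j]
        if ia == ib:
            total += va * vb
            i += 1
            j += 1
        elif ia < ib:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    weights = [0, 2, 0, 0, 3]
    activations = [1, 4, 0, 5, 0]
    print(sparse_dot(compress(weights), compress(activations)))
```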

  • Deep Network for Parametric Bilinear Generalized Approximate Message Passing and Its Application in Compressive Sensing under Matrix Uncertainty

    Jingjing SI  Wenwen SUN  Chuang LI  Yinbo CHENG  

     
    LETTER-Digital Signal Processing

      Publicized:
    2020/09/29
      Vol:
    E104-A No:4
      Page(s):
    751-756

    Deep learning is playing an increasingly important role in signal processing field due to its excellent performance on many inference problems. Parametric bilinear generalized approximate message passing (P-BiG-AMP) is a new approximate message passing based approach to a general class of structure-matrix bilinear estimation problems. In this letter, we propose a novel feed-forward neural network architecture to realize P-BiG-AMP methodology with deep learning for the inference problem of compressive sensing under matrix uncertainty. Linear transforms utilized in the recovery process and parameters involved in the input and output channels of measurement are jointly learned from training data. Simulation results show that the trained P-BiG-AMP network can achieve higher reconstruction performance than the P-BiG-AMP algorithm with parameters tuned via the expectation-maximization method.

  • Backbone Alignment and Cascade Tiny Object Detecting Techniques for Dolphin Detection and Classification

    Yih-Cherng LEE  Hung-Wei HSU  Jian-Jiun DING  Wen HOU  Lien-Shiang CHOU  Ronald Y. CHANG  

     
    PAPER-Image

      Publicized:
    2020/09/29
      Vol:
    E104-A No:4
      Page(s):
    734-743

    Automatic tracking and classification are essential for studying the behaviors of wild animals. Photos taken dynamically from long distances, occlusion, protective coloration, and background noise act as irregular interference, which makes it difficult to design a computerized algorithm that reduces human labeling effort. Moreover, wild dolphin images are hard to acquire through on-the-spot investigations, which involve long waiting times, and it is hardly possible to set up a fixed camera to monitor dolphins on the ocean automatically for several days. It is challenging to detect and classify a dolphin reliably from noisy photos using a single well-known deep learning method on a small dataset. Therefore, in this study, we propose a generic Cascade Small Object Detection (CSOD) algorithm for dolphin detection to handle small object problems, and we develop visualization to backbone based classification (V2BC) for removing noise, highlighting the features of a dolphin, and classifying the name of the dolphin. The architecture of CSOD consists of the P-net and the F-net. The P-net uses the crude YOLOv3 detector as its core network to predict all the regions of interest (ROIs) in lower-resolution images. Then, the F-net, which is more robust, is applied to capture the ROIs from high-resolution photos to solve the problems of a single detector. Moreover, the V2BC method focuses on extracting significant regions of occluded dolphins and applies post-processing that references the backbone of the dolphins to facilitate classification. We compare the proposed approach with state-of-the-art methods, including Faster R-CNN and YOLOv3 for detection and AlexNet, VGG, and ResNet for classification. All experiments show that the proposed algorithm based on CSOD and V2BC has excellent performance in dolphin detection and classification. Compared to the related works on classification, the accuracy of the proposed design is over 14% higher. Moreover, our proposed CSOD detection system has 42% higher performance than that of the original YOLOv3 architecture.
