
IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • Article Influence

    0.1

  • CiteScore

    1.4

Advance publication (published online immediately after acceptance)

Volume E102-D No.11  (Publication Date: 2019/11/01)

    Special Section on Picture Coding and Image Media Processing
  • FOREWORD Open Access

    Toshiaki FUJII  

     
    FOREWORD

      Page(s):
    2082-2082
  • Depth from Defocus Technique Based on Cross Reblurring

    Kazumi TAKEMURA  Toshiyuki YOSHIDA  

     
    PAPER

      Publicized:
    2019/07/11
      Page(s):
    2083-2092

    This paper proposes a novel Depth From Defocus (DFD) technique based on the property that two images captured with different focus settings coincide if each is reblurred with the other's focus setting, which is referred to as the “cross reblurring” property in this paper. Based on this property, the proposed technique estimates the block-wise depth profile of a target object by minimizing the mean squared error between the cross-reblurred images. Unlike existing DFD techniques, the proposed technique is free of lens parameters and independent of point spread function models. A compensation technique for possible pixel misalignment between images is also proposed to improve the depth estimation accuracy. The experimental results and comparisons with other DFD techniques show the advantages of our technique.
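
    A minimal sketch (not code from the paper) of the cross-reblurring criterion, using a Gaussian blur as a stand-in PSF; the candidate blur pairs `blur_pairs` and the block size are assumptions for illustration, whereas the paper itself is lens-parameter-free and PSF-model-independent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def block_depth_by_cross_reblurring(img1, img2, blur_pairs, block=16):
    """Toy cross-reblurring DFD: for each block, pick the candidate
    (s1, s2) blur pair minimizing the MSE between img1 reblurred with s2
    and img2 reblurred with s1 (Gaussian stand-in PSF)."""
    h, w = img1.shape
    depth = np.zeros((h // block, w // block), dtype=int)
    # Pre-compute the cross-reblurred image pair for every depth hypothesis.
    cross = [(gaussian_filter(img1, s2), gaussian_filter(img2, s1))
             for (s1, s2) in blur_pairs]
    for by in range(h // block):
        for bx in range(w // block):
            ys = slice(by * block, (by + 1) * block)
            xs = slice(bx * block, (bx + 1) * block)
            errs = [np.mean((a[ys, xs] - b[ys, xs]) ** 2) for a, b in cross]
            depth[by, bx] = int(np.argmin(errs))  # index of the best depth hypothesis
    return depth
```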

  • Cauchy Aperture and Perfect Reconstruction Filters for Extending Depth-of-Field from Focal Stack Open Access

    Akira KUBOTA  Kazuya KODAMA  Asami ITO  

     
    PAPER

      Publicized:
    2019/08/16
      Page(s):
    2093-2100

    A pupil function of the aperture in image capturing systems is theoretically derived such that one can perfectly reconstruct an all-in-focus image through linear filtering of the focal stack. Perfect reconstruction filters are also designed based on the derived pupil function. The designed filters are space-invariant; hence the presented method does not require region segmentation. Simulation results using synthetic scenes show the effectiveness of the derived pupil function and the filters.

  • Fast and Robust Disparity Estimation from Noisy Light Fields Using 1-D Slanted Filters

    Gou HOUBEN  Shu FUJITA  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Page(s):
    2101-2109

    Depth (disparity) estimation from a light field (a set of dense multi-view images) is currently attracting much research interest. This paper focuses on how to handle a noisy light field for disparity estimation, because if left as it is, the noise deteriorates the accuracy of the estimated disparity maps. Several researchers have worked on this problem, e.g., by introducing disparity cues that are robust to noise. However, it is not easy to break the trade-off between accuracy and computational speed. To tackle this trade-off, we have integrated a fast denoising scheme into a fast disparity estimation framework that works in the epipolar plane image (EPI) domain. Specifically, we found that a simple 1-D slanted filter is very effective for reducing noise while preserving the underlying structure in an EPI. Moreover, this simple filtering does not require elaborate parameter configurations in accordance with the target noise level. Experimental results including real-world inputs show that our method achieves good accuracy with much less computational time compared to some state-of-the-art methods.
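
    For illustration only (an editor's sketch assuming integer disparities, not the authors' implementation), a 1-D slanted filter on an EPI can be realized by shifting each view row according to a candidate disparity and averaging along the resulting vertical line:

```python
import numpy as np

def slanted_filter_1d(epi, disparity):
    """Average an EPI (shape: n_views x width) along the slant given by an
    integer `disparity`: each row is shifted so that structure with that
    slope becomes vertical, then a mean over the view axis denoises it."""
    n_views, width = epi.shape
    center = n_views // 2
    shifted = np.stack(
        [np.roll(epi[v], (center - v) * disparity) for v in range(n_views)])
    filtered_center_row = shifted.mean(axis=0)  # denoised centre-view row
    return filtered_center_row
```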

  • Light Field Coding Using Weighted Binary Images

    Koji KOMATSU  Kohei ISECHI  Keita TAKAHASHI  Toshiaki FUJII  

     
    PAPER

      Publicized:
    2019/07/03
      Page(s):
    2110-2119

    We propose an efficient coding scheme for a dense light field, i.e., a set of multi-viewpoint images taken with very small viewpoint intervals. The key idea behind our proposal is that a light field is represented using only weighted binary images, where several binary images and corresponding weight values are chosen so as to optimally approximate the light field. The proposed coding scheme is completely different from those of modern image/video coding standards, which involve more complex procedures such as intra-/inter-frame prediction and transforms. One advantage of our method is the extreme simplicity of the decoding process, which will lead to a faster and less power-hungry decoder than those of the standard codecs. Another useful aspect of our proposal is that our coding method can be made scalable, where the accuracy of the decoded light field is improved in a progressive manner as more encoded information is used. Thanks to the divide-and-conquer strategy adopted for the scalable coding, we can also substantially reduce the computational complexity of the encoding process. Although our method is still in the early research phase, experimental results demonstrate that it achieves reasonable rate-distortion performance compared with that of the standard video codecs.
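
    As a hedged illustration of the representation idea only (the paper jointly optimizes the binary images and weights; this sketch instead uses a naive greedy residual-thresholding rule that is an assumption, not the paper's method):

```python
import numpy as np

def greedy_weighted_binary_approx(light_field, n_terms):
    """Approximate `light_field` (any float array) as a sum of
    weight_k * binary_k.  Each step thresholds the current residual to get
    a binary image and picks the least-squares optimal weight for it."""
    residual = light_field.astype(float).copy()
    approx = np.zeros_like(residual)
    terms = []
    for _ in range(n_terms):
        binary = residual > residual.mean()     # naive binary pattern
        if not binary.any():
            break
        weight = residual[binary].mean()        # argmin_w ||residual - w*binary||^2
        approx += weight * binary
        residual -= weight * binary
        terms.append((weight, binary))
    return approx, terms
```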

  • Personalized Food Image Classifier Considering Time-Dependent and Item-Dependent Food Distribution Open Access

    Qing YU  Masashi ANZAWA  Sosuke AMANO  Kiyoharu AIZAWA  

     
    PAPER

      Publicized:
    2019/06/21
      Page(s):
    2120-2126

    Since keeping food diaries could enable people to develop healthy eating habits, food image recognition is in high demand to reduce the effort of food recording. Previous studies have worked on this challenging domain with datasets having fixed numbers of samples and classes. However, in a real-world setting, it is impossible to include all foods in the database because the number of food classes is large and increases continually. In addition, inter-class similarity and intra-class diversity also make recognition difficult. In this paper, we address these problems by using deep convolutional neural network features to build a personalized classifier that incrementally learns the user's data and adapts to the user's eating habits. As a result, we achieved state-of-the-art accuracy in food image recognition through personalization with 300 food records per user.
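
    A minimal sketch of one way such a personalized, incrementally updated classifier over CNN features could work (editor's illustration using a nearest-class-mean rule; the paper's actual classifier may differ):

```python
import numpy as np

class IncrementalFoodClassifier:
    """Nearest-class-mean classifier over pre-extracted CNN features.
    Each confirmed food record updates that class's running mean, so the
    model adapts to a single user's eating habits over time."""
    def __init__(self):
        self.means = {}    # class name -> mean feature vector
        self.counts = {}   # class name -> number of records seen

    def update(self, feature, label):
        n = self.counts.get(label, 0)
        mean = self.means.get(label, np.zeros_like(feature, dtype=float))
        self.means[label] = (mean * n + feature) / (n + 1)
        self.counts[label] = n + 1

    def predict(self, feature):
        if not self.means:
            return None
        return min(self.means,
                   key=lambda c: np.linalg.norm(feature - self.means[c]))
```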

  • Regular Section
  • Mapping a Quantum Circuit to 2D Nearest Neighbor Architecture by Changing the Gate Order Open Access

    Wakaki HATTORI  Shigeru YAMASHITA  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2019/07/25
      Page(s):
    2127-2134

    This paper proposes a new approach to optimizing the number of SWAP gates necessary when we perform a quantum circuit on a two-dimensional (2D) nearest neighbor architecture (NNA). Our new idea is to change the order of quantum gates (where possible) so that each sub-circuit has only gates acting on adjacent qubits. For each sub-circuit, we utilize a SAT solver to find the best qubit placement such that the sub-circuit has only gates on adjacent qubits. Each sub-circuit may have a different qubit placement for which it needs no SWAP gates. Thus, we insert SWAP gates between two sub-circuits to change the qubit placement into one that is desirable for the following sub-circuit. To reduce the number of such SWAP gates between two sub-circuits, we utilize the A* algorithm.
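
    To illustrate only the adjacency condition that the sub-circuits must satisfy (an editor's sketch with a hypothetical gate-list format, not the authors' tool):

```python
def gates_on_adjacent_qubits(gates, placement):
    """Check whether every two-qubit gate acts on qubits placed on adjacent
    cells of a 2D grid.  `gates` is a list of (q1, q2) pairs and `placement`
    maps a qubit index to its (row, col) grid position."""
    for q1, q2 in gates:
        (r1, c1), (r2, c2) = placement[q1], placement[q2]
        if abs(r1 - r2) + abs(c1 - c2) != 1:   # Manhattan distance 1 = adjacent
            return False
    return True

# Example: on a 2x2 grid, the gate (0, 3) acts on diagonal qubits and would
# require SWAP insertion or a different placement.
placement = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}
print(gates_on_adjacent_qubits([(0, 1), (1, 3)], placement))  # True
print(gates_on_adjacent_qubits([(0, 3)], placement))          # False
```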

  • Progressive Forwarding Disaster Backup among Cloud Datacenters

    Xiaole LI  Hua WANG  Shanwen YI  Linbo ZHAI  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2019/08/19
      Page(s):
    2135-2147

    The periodic disaster backup activity among multiple geographically distributed datacenters consumes huge network resources and therefore imposes a heavy burden on datacenters and transmission links. Previous work aims at the least completion time, maximum utility, or minimal cost, without considering load balance for the limited network resources, which is likely to result in an unfair distribution of backup load or a significant impact on daily network services. In this paper, we propose a new progressive forwarding disaster backup strategy for Software Defined Network scenarios to mitigate forwarding burdens on source datacenters and to balance backup loads on backup datacenters and transmission links. We construct a new redundancy-aware time-expanded network model to divide time slots according to the redundancy requirement, and propose a role-switching method over time to utilize the forwarding capability of backup datacenters. In every time slot, we leverage a two-step optimization algorithm to realize capacity-constrained backup datacenter selection and fair backup load distribution. Simulation results show that our strategy achieves good load-balance performance while guaranteeing transmission completion and backup redundancy.

  • Improved LDA Model for Credibility Evaluation of Online Product Reviews

    Xuan WANG  Bofeng ZHANG  Mingqing HUANG  Furong CHANG  Zhuocheng ZHOU  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2019/08/22
      Page(s):
    2148-2158

    When individuals make a purchase from online sources, they may lack first-hand knowledge of the product. In such cases, they judge the quality of the item by the reviews other consumers have posted. Therefore, it is important to determine whether comments about a product are credible. Conventional research on comment credibility has most often employed supervised machine learning methods, which have the disadvantage of needing large quantities of training data. This paper proposes an unsupervised method for judging comment credibility based on the Biterm Sentiment Latent Dirichlet Allocation (BS-LDA) model. Using this approach, we first derive several distributions and use them to calculate each comment's credibility score. A comment's credibility is judged by whether its score reaches a threshold. Our experimental results using comments from Amazon.com demonstrate that our approach can play an important role in determining the credibility of comments in some situations.

  • QSL: A Specification Language for E-Questionnaire, E-Testing, and E-Voting Systems

    Yuan ZHOU  Yuichi GOTO  Jingde CHENG  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2019/08/19
      Page(s):
    2159-2175

    Many kinds of questionnaires, tests, and votes are now carried out entirely electronically as Web applications on the Internet, i.e., by e-questionnaire, e-testing, and e-voting systems. Because there is no unified communication tool among the stakeholders of these systems, until now all e-questionnaire, e-testing, and e-voting systems have been designed, developed, used, and maintained in various ad hoc ways. As a result, it is difficult for the stakeholders to communicate in order to implement the systems, because there is neither an exhaustive requirement list that gives a grasp of e-questionnaire, e-testing, and e-voting systems as a whole nor a standardized terminology for these systems to avoid ambiguity. A general-purpose specification language that provides a unified way of describing various e-questionnaire, e-testing, and e-voting systems can solve these problems: stakeholders can refer to and use the complete requirements and standardized terminology for better communication, can easily and unambiguously specify all the requirements of e-questionnaire, e-testing, and e-voting systems and services, and can even implement the systems. In this paper, we propose the first such specification language, named “QSL,” with a standardized, consistent, and exhaustive list of requirements for specifying various e-questionnaire, e-testing, and e-voting systems, such that the specifications can be used as a precondition for automatically generating these systems. The paper presents our design, showing that QSL can specify all the requirements of various e-questionnaire, e-testing, and e-voting systems in a structured way, evaluates its effectiveness, reports real applications of QSL to e-questionnaire, e-testing, and e-voting systems, and shows various QSL applications that provide convenient QSL services to stakeholders.

  • Peer-to-Peer Video Streaming of Non-Uniform Bitrate with Guaranteed Delivery Hops Open Access

    Satoshi FUJITA  

     
    PAPER-Information Network

      Publicized:
    2019/08/09
      Page(s):
    2176-2183

    In conventional video streaming systems, various kinds of video streams are delivered from a dedicated server (e.g., an edge server) to the subscribers, and a video stream of a higher quality level is encoded at a higher bitrate. In this paper, we consider the problem of delivering those video streams with the assistance of Peer-to-Peer (P2P) technology at as small a server cost as possible while maintaining the streaming performance in terms of throughput and latency. The basic idea of the proposed method is to divide a given video stream into several sub-streams called stripes as evenly as possible and to deliver those stripes to the subscribers through different tree-structured overlays. Such a stripe-based approach averages the load of peers and can effectively resolve the overloading of the overlay for high-quality video streams. The performance of the proposed method is evaluated numerically. The results indicate that the proposed method significantly reduces the server cost necessary to guarantee a designated number of delivery hops, compared with a naive tree-based scheme.
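
    As an editor's illustration of the stripe idea (the exact striping rule used in the paper may differ), a round-robin split of a stream's chunks into nearly equal sub-streams could look like:

```python
def split_into_stripes(chunks, n_stripes):
    """Round-robin assignment of stream chunks to `n_stripes` sub-streams,
    so that stripe sizes differ by at most one chunk."""
    stripes = [[] for _ in range(n_stripes)]
    for i, chunk in enumerate(chunks):
        stripes[i % n_stripes].append(chunk)
    return stripes

# Each stripe would then be pushed through its own tree-structured overlay.
print(split_into_stripes(list(range(10)), 3))  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```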

  • Fast Datapath Processing Based on Hop-by-Hop Packet Aggregation for Service Function Chaining Open Access

    Yuki TAGUCHI  Ryota KAWASHIMA  Hiroki NAKAYAMA  Tsunemasa HAYASHI  Hiroshi MATSUO  

     
    PAPER-Information Network

      Publicized:
    2019/08/22
      Page(s):
    2184-2194

    Many studies have revealed that the performance of software-based Virtual Network Functions (VNFs) is insufficient for mission-critical networks. Scaling-out approaches, such as auto-scaling of VNFs, could handle a huge amount of traffic; however, the exponential traffic growth confronts us with the limits of both the expandability of physical resources and the complexity of their management. In this paper, we propose a fast datapath processing method called Packet Aggregation Flow (PA-Flow) that is based on hop-by-hop packet aggregation for more efficient Service Function Chaining (SFC). PA-Flow extends the notion of existing intra-node packet aggregation toward network-wide packet aggregation, and we introduce the following three novel features. First, packet I/O overheads at intermediate network devices, including NFV nodes, are mitigated by reducing the number of packets. Second, aggregated packets are further aggregated as they pass through the service chain in a hop-by-hop manner. Finally, next-hop-aware packet aggregation is realized using OpenFlow-based flow tables. PA-Flow is designed to be usable with various VNF forms (e.g., VM/container/bare-metal-based) and virtual I/O technologies (e.g., vhost-user/SR-IOV), and its implementation does not introduce noticeable delay for aggregation. We conducted two evaluations: (i) a baseline evaluation for understanding the fundamental performance characteristics of PA-Flow, and (ii) a simulation-based SFC evaluation for proving PA-Flow's effect in a realistic environment. The results showed that the throughput of short-packet forwarding was improved by a factor of four. Moreover, the total number of packets was reduced by 93% in a large-scale SFC.
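
    To illustrate next-hop-aware aggregation in the abstract's terms (an editor's sketch with hypothetical packet fields, not PA-Flow's actual frame format or flow-table API):

```python
from collections import defaultdict

def aggregate_by_next_hop(packets, flow_table):
    """Group packets that share the same next hop (looked up from an
    OpenFlow-like flow table keyed by destination) into one aggregate frame
    per next hop, reducing per-packet I/O at the next device."""
    buckets = defaultdict(list)
    for pkt in packets:                      # pkt: {"dst": ..., "payload": ...}
        next_hop = flow_table[pkt["dst"]]
        buckets[next_hop].append(pkt)
    # One aggregate frame per next hop; member packets are unpacked downstream.
    return [{"next_hop": nh, "members": pkts} for nh, pkts in buckets.items()]

flow_table = {"10.0.0.2": "vnf-a", "10.0.0.3": "vnf-a", "10.0.0.9": "vnf-b"}
pkts = [{"dst": "10.0.0.2", "payload": b"x"}, {"dst": "10.0.0.3", "payload": b"y"}]
print(len(aggregate_by_next_hop(pkts, flow_table)))  # 1 aggregate frame
```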

  • Personalized Trip Planning Considering User Preferences and Environmental Variables with Uncertainty

    Mingu KIM  Seungwoo HONG  Il Hong SUH  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2019/07/24
      Page(s):
    2195-2204

    Personalized trip planning is a challenging problem given that places of interest should be selected according to user preferences and sequentially arranged while satisfying various constraints. In this study, we aimed to model various uncertain aspects that should be considered during trip planning and efficiently generate personalized plans that maximize user satisfaction based on preferences and constraints. Specifically, we propose a probabilistic itinerary evaluation model based on a hybrid temporal Bayesian network that determines suitable itineraries considering preferences, constraints, and uncertain environmental variables. The model returns the sum of time-weighted user satisfaction, and ant colony optimization generates the trip plan that maximizes this objective function. First, the optimization algorithm generates candidate itineraries and evaluates them using the proposed model. Then, candidate itineraries are improved based on the evaluation results of previous itineraries. To validate the proposed trip planning approach, we conducted an extensive user study by asking participants to choose their preferred trip plans from options created by a human planner and our approach. The results show that our approach provides human-like trip plans, as participants selected our generated plans in 57% of the pairs. We also evaluated the efficiency of the employed ant colony optimization algorithm for trip planning through performance comparisons with other optimization methods.

  • A Trend-Shift Model for Global Factor Analysis of Investment Products

    Makoto KIRIHATA  Qiang MA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2019/08/13
      Page(s):
    2205-2213

    Recently, more and more people have started investing. Understanding the factors that affect financial products is important for making investment decisions. However, it is difficult for novices to understand these factors because various factors affect each other. Various techniques have been studied, but conventional factor analysis methods focus on revealing the impact of factors locally over a certain period, and they do not make it easy to predict net asset values. As a reasonable solution for predicting net asset values, in this paper we propose a trend-shift model for the global analysis of factors by introducing trend change points as shift interference variables into state space models. In addition, to realize the trend-shift model efficiently, we propose an effective trend detection method, TP-TBSM (two-phase TBSM), by extending TBSM (the trend-based segmentation method). Compared with TBSM, TP-TBSM can detect trends flexibly by reducing the dependence on parameters. We conduct experiments with eleven investment trust products and show the usefulness and effectiveness of the proposed model and method.

  • Multi-Hypothesis Prediction Scheme Based on the Joint Sparsity Model Open Access

    Can CHEN  Chao ZHOU  Jian LIU  Dengyin ZHANG  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2019/08/05
      Page(s):
    2214-2220

    Distributed compressive video sensing (DCVS) has received considerable attention due to its potential in source-limited communication, e.g., wireless video sensor networks (WVSNs). Multi-hypothesis (MH) prediction, which treats the target block as a linear combination of hypotheses, is a state-of-the-art technique in DCVS. The common approach rests on the supposition that blocks that are dissimilar from the target block are given lower weights than blocks that are more similar. This assumption can yield acceptable reconstruction quality, but it is not suitable for scenarios with more details. In this paper, based on the joint sparsity model (JSM), the authors present a Tikhonov-regularized MH prediction scheme in which the most similar block provides the similar common portion and the other blocks provide their respective unique portions, differing from the common supposition. Specifically, a new scheme for generating hypotheses and a Euclidean distance-based metric for the regularization term are proposed. Compared with several state-of-the-art algorithms, the authors show the effectiveness of the proposed scheme when there are a limited number of hypotheses.
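
    A minimal sketch of generic Tikhonov-regularized multi-hypothesis weighting (an editor's illustration; the diagonal regularization matrix built from Euclidean distances and the parameter `lam` are assumptions, not necessarily the paper's exact formulation):

```python
import numpy as np

def tikhonov_mh_weights(measurement, hyp_measurements, lam=0.1):
    """Solve w = argmin ||y - H w||^2 + lam^2 ||G w||^2, where H stacks the
    hypothesis measurements column-wise and G is a diagonal matrix of the
    Euclidean distances between each hypothesis and the target measurement."""
    H = np.column_stack(hyp_measurements)                 # (m, n_hyp)
    y = measurement
    dists = np.array([np.linalg.norm(y - h) for h in hyp_measurements])
    G = np.diag(dists)
    w = np.linalg.solve(H.T @ H + (lam ** 2) * (G.T @ G), H.T @ y)
    prediction = H @ w                                    # MH prediction of the block
    return w, prediction
```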

  • Synchronized Tracking in Multiple Omnidirectional Cameras with Overlapping View

    Houari SABIRIN  Hitoshi NISHIMURA  Sei NAITO  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2019/07/24
      Page(s):
    2221-2229

    A multi-camera setup for a surveillance system enables a larger coverage area, especially when a single camera has limited monitoring capability due to certain obstacles. Therefore, for large-scale coverage, multiple cameras are the best option. In this paper, we present a method for detecting multiple objects using several cameras with large overlapping views, as this allows synchronization of object identification across a number of views. The proposed method uses a graph structure that is robust enough to represent any detected moving object by defining vertices and edges that determine their relationships. By evaluating these object features, represented as a set of attributes in a graph, we can perform lightweight detection of multiple objects using several cameras, as well as object tracking within each camera's field of view and between two cameras. By evaluating each vertex hierarchically as a subgraph, we can further observe the features of the detected object and perform automatic separation of occluding objects. Experimental results show that the proposed method improves the accuracy of object tracking by reducing the occurrences of incorrect identification compared to individual camera-based tracking.

  • Improving Slice-Based Model for Person Re-ID with Multi-Level Representation and Triplet-Center Loss

    Yusheng ZHANG  Zhiheng ZHOU  Bo LI  Yu HUANG  Junchu HUANG  Zengqun CHEN  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2019/08/19
      Page(s):
    2230-2237

    Person re-identification has received extensive study in the past few years and has achieved impressive progress. Recent outstanding methods extract discriminative features by slicing the feature maps of a deep neural network into several stripes. Still, there is room for improvement in feature fusion and metric learning strategies, which can help promote slice-based methods. In this paper, we propose a novel end-to-end trainable framework, called the Multi-level Slice-based Network (MSN), to capture features at different levels and from different body parts. Our model consists of a dual-branch network architecture, one branch for global feature extraction and the other for local features. Both branches process multi-level features using a pyramid-like feature module. By concatenating the global and local features, distinctive features are exploited and properly compared. Our method also introduces a triplet-center loss to construct the combined loss function, which helps train the joint-learning network. Comprehensive experiments on the mainstream evaluation datasets, including Market-1501, DukeMTMC, and CUHK03, show that our proposed model robustly achieves excellent performance and outperforms many existing approaches. For example, on the DukeMTMC dataset in single-query mode, we obtain Rank-1/mAP = 85.9% (+1.0%) / 74.2% (+4.7%).
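
    A generic triplet-center loss sketch for reference (editor's illustration; the margin value and exact distance form are assumptions and may differ from the combined loss used in the paper):

```python
import numpy as np

def triplet_center_loss(features, labels, centers, margin=0.5):
    """For each feature, pull it toward its own class center and push it at
    least `margin` farther from the nearest other-class center:
    L = mean_i max(0, d(f_i, c_{y_i}) + margin - min_{j != y_i} d(f_i, c_j)).
    Assumes `centers` is a dict label -> center with at least two classes."""
    loss = 0.0
    for f, y in zip(features, labels):
        d_own = np.linalg.norm(f - centers[y])
        d_other = min(np.linalg.norm(f - c) for j, c in centers.items() if j != y)
        loss += max(0.0, d_own + margin - d_other)
    return loss / len(features)
```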

  • A Local Multi-Layer Model for Tissue Classification of in-vivo Atherosclerotic Plaques in Intravascular Optical Coherence Tomography

    Xinbo REN  Haiyuan WU  Qian CHEN  Toshiyuki IMAI  Takashi KUBO  Takashi AKASAKA  

     
    PAPER-Biological Engineering

      Publicized:
    2019/08/15
      Page(s):
    2238-2248

    Clinical research shows that the morbidity of coronary artery disease (CAD) is gradually increasing in many countries every year, and it causes hundreds of thousands of deaths all over the world each year. As optical coherence tomography, with its high resolution and better contrast, is applied to the investigation of lesion tissue in human vessels, many more micro-structures of the vessel become easily and clearly visible to doctors, which helps to improve the effect of CAD treatment. Manual qualitative analysis and classification of vessel lesion tissue are time-consuming for doctors because a single intravascular optical coherence tomography (IVOCT) data set of a patient usually contains hundreds of in-vivo vessel images. To overcome this problem, we focus on the superficial layer of the lesion region and propose a model based on local multi-layer regions for the characterization and extraction of features of vessel lesion components (lipid, fibrous, and calcified plaque). At the pre-processing stage, we applied two novel automatic methods to remove the catheter and the guide-wire, respectively. Based on the detected lumen boundary, the multi-layer model in the proximity lumen boundary region (PLBR) was built. In the multi-layer model, features extracted from the A-line sub-region (ALSR) of each layer were employed to characterize the type of tissue present in the ALSR. We used 7 human datasets containing a total of 490 OCT images to assess our tissue classification method. Validation was obtained by comparing the manual assessment with the automatic results derived by our method. The proposed automatic tissue classification method achieved an average accuracy of 89.53%, 93.81%, and 91.78% for fibrous, calcified, and lipid plaque, respectively.

  • Truth Discovery of Multi-Source Text Data

    Chen CHANG  Jianjun CAO  Qin FENG  Nianfeng WENG  Yuling SHANG  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2019/08/22
      Page(s):
    2249-2252

    Most existing truth discovery approaches are designed for structured data and cannot meet the strong need to extract trustworthy information from raw text data, owing to its unique characteristics such as the multifactorial property of text answers (i.e., an answer may contain multiple key factors) and the diversity of word usage (i.e., different words may have the same semantic meaning). For text answers, there is no absolute correctness or error; most answers may be partially correct, which is quite different from the situation in traditional truth discovery. To address these challenges, we propose an optimization-based text truth discovery model that jointly groups keywords extracted from the answers to a specific question into a set of multiple factors. Then, we select a subset of the multiple factors as the identified truth set for each question by a parallel ant colony synchronization optimization algorithm. After that, the answers to each question can be ranked based on the similarities between the factors an answer provides and the identified truth factors. The experimental results on a real dataset show that, although text data structures are complex, our model can still find reliable answers compared with retrieval-based and state-of-the-art approaches.

  • Accelerating Stochastic Simulations on GPUs Using OpenCL

    Pilsung KANG  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2019/07/23
      Page(s):
    2253-2256

    Since it was first introduced in 2008 with the 1.0 specification, OpenCL has steadily evolved over the decade to increase its support for heterogeneous parallel systems. In this paper, we accelerate stochastic simulation of biochemical reaction networks on modern GPUs (graphics processing units) by means of the OpenCL programming language. In implementing the OpenCL version of the stochastic simulation algorithm, we carefully apply its data-parallel execution model to exploit the hardware parallelism of modern GPUs. To evaluate our OpenCL implementation of the stochastic simulation algorithm, we perform a comparative analysis of performance against a CPU-based cluster implementation and an NVidia CUDA implementation. In addition to this initial report on the performance of OpenCL on GPUs, we also discuss the applicability and programmability of OpenCL in the context of GPU-based scientific computing.
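
    For context only (an editor's sketch, not the paper's OpenCL kernel), the classic Gillespie direct-method stochastic simulation algorithm proceeds as below; in a GPU port, each work-item would typically run one independent trajectory of this loop.

```python
import numpy as np

def gillespie_direct(x0, stoich, propensity, t_end, rng=np.random.default_rng()):
    """Gillespie direct method: sample the time to the next reaction from an
    exponential with rate = total propensity, pick which reaction fired in
    proportion to its propensity, and apply its stoichiometry."""
    t, x = 0.0, np.array(x0, dtype=float)
    while t < t_end:
        a = propensity(x)                  # per-reaction propensities
        a0 = a.sum()
        if a0 <= 0:
            break                          # no reaction can fire any more
        t += rng.exponential(1.0 / a0)     # time to the next reaction
        j = rng.choice(len(a), p=a / a0)   # which reaction fires
        x += stoich[j]
    return t, x

# Toy example: A -> B with rate constant 0.1 (propensity 0.1 * [A]).
stoich = np.array([[-1, 1]])
t, x = gillespie_direct([100, 0], stoich, lambda x: np.array([0.1 * x[0]]), 50.0)
```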

  • Optimal Price-Based Power Allocation Algorithm with Quality of Service Constraints in Non-Orthogonal Multiple Access Networks

    Zheng-qiang WANG  Kun-hao HUANG  Xiao-yu WAN  Zi-fu FAN  

     
    LETTER-Information Network

      Publicized:
    2019/07/29
      Page(s):
    2257-2260

    In this letter, we investigate price-based power allocation for non-orthogonal multiple access (NOMA) networks, where the base station (BS) can admit users to transmit by pricing their power. A Stackelberg game is utilized to model the pricing and power-purchasing strategies between the BS and the users. Based on backward induction, the pricing problem of the BS is recast as a non-convex power allocation problem, which is equivalent to a rate allocation problem by variable replacement. Based on this equivalent problem, an optimal price-based power allocation algorithm is proposed to maximize the revenue of the BS. Simulation results show that the proposed algorithm is superior to the existing pricing algorithm in terms of the revenue of the BS and the number of admitted users.

  • Rootkit inside GPU Kernel Execution

    Ohmin KWON  Hyun KWON  Hyunsoo YOON  

     
    LETTER-Dependable Computing

      Publicized:
    2019/08/19
      Page(s):
    2261-2264

    We propose a rootkit installation method that operates inside a GPU kernel execution process through GPU context manipulation. In GPU-based applications such as deep learning computations and cryptographic operations, the proposed method exploits the fact that the execution flow of the GPU kernel obeys the GPU context information in GPU memory. The proposed method consists of two key ideas. The first is GPU code manipulation, which can hijack the execution flow of the original GPU kernel to execute an injected payload without affecting the original GPU computation result. The second is a self-page-table update, during which the GPU kernel updates its page table to access any location in system memory. After installation, the malicious payload is executed only in the GPU kernel, and no evidence remains in system memory. Thus, it cannot be detected by conventional rootkit detection methods.

  • Discriminative Convolutional Neural Network for Image Quality Assessment with Fixed Convolution Filters

    Motohiro TAKAGI  Akito SAKURAI  Masafumi HAGIWARA  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/08/09
      Page(s):
    2265-2266

    Current image quality assessment (IQA) methods require the original images for evaluation. However, recently, IQA methods that use machine learning have been proposed. These methods learn the relationship between the distorted image and the image quality automatically. In this paper, we propose an IQA method based on deep learning that does not require a reference image. We show that a convolutional neural network with distortion prediction and fixed filters improves the IQA accuracy.

  • Prediction-Based Scale Adaptive Correlation Filter Tracker

    Zuopeng ZHAO  Hongda ZHANG  Yi LIU  Nana ZHOU  Han ZHENG  Shanyi SUN  Xiaoman LI  Sili XIA  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2019/07/30
      Page(s):
    2267-2271

    Although correlation filter-based trackers have demonstrated excellent performance for visual object tracking, there remain several challenges to be addressed. In this work, we propose a novel tracker based on the correlation filter framework. Traditional trackers face difficulty in accurately adapting to changes in the scale of the target when the target moves quickly. To address this, we suggest a scale adaptive scheme based on prediction scales. We also incorporate a speed-based adaptive model update method to further improve overall tracking performance. Experiments with samples from the OTB100 and KITTI datasets demonstrate that our method outperforms existing state-of-the-art tracking algorithms in fast motion scenes.

  • High Noise Tolerant R-Peak Detection Method Based on Deep Convolution Neural Network

    Menghan JIA  Feiteng LI  Zhijian CHEN  Xiaoyan XIANG  Xiaolang YAN  

     
    LETTER-Biological Engineering

      Publicized:
    2019/08/02
      Page(s):
    2272-2275

    An R-peak detection method with high noise tolerance is presented in this paper. The method utilizes a customized deep convolutional neural network (DCNN) to extract morphological and temporal features from sliced electrocardiogram (ECG) signals. The proposed network adopts multiple parallel dilated convolution layers to analyze features from diverse fields of view. A sliding window slices the original ECG signals into segments; the network then processes one segment at a time and outputs every point's probability of belonging to an R-peak region. After a binarization and a deburring operation, the occurrence times of the R-peaks can be located. Experimental results based on the MIT-BIH database show that the R-peak detection accuracy is significantly improved under high-intensity electrode motion artifact or muscle artifact noise, revealing higher performance than state-of-the-art methods.
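
    A small sketch of the post-processing step described in the abstract (editor's illustration; the threshold and minimum run length are assumed values, not the paper's settings):

```python
import numpy as np

def locate_r_peaks(probabilities, threshold=0.5, min_run=3):
    """Binarize per-sample R-peak probabilities, drop runs shorter than
    `min_run` samples (deburring), and return the centre index of each
    remaining run as the detected R-peak occurrence time."""
    binary = probabilities >= threshold
    peaks, start = [], None
    for i, b in enumerate(np.append(binary, False)):   # sentinel closes the last run
        if b and start is None:
            start = i
        elif not b and start is not None:
            if i - start >= min_run:
                peaks.append((start + i - 1) // 2)     # centre of the run
            start = None
    return peaks

print(locate_r_peaks(np.array([0, .2, .9, .95, .8, .1, .6, 0, 0, .7, .9, .8, 0])))
# -> [3, 10]: two runs survive deburring; their centres are the R-peak locations.
```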

  • Estimation of the Matrix Rank of Harmonic Components of a Spectrogram in a Piano Music Signal Based on the Stein's Unbiased Risk Estimator and Median Filter Open Access

    Seokjin LEE  

     
    LETTER-Music Information Processing

      Publicized:
    2019/08/22
      Page(s):
    2276-2279

    The estimation of the matrix rank of the harmonic components of a music spectrogram provides useful information, e.g., for determining the number of basis vectors in matrix-factorization-based algorithms, which is required for automatic music transcription or post-processing. In this work, we develop an algorithm based on Stein's unbiased risk estimator (SURE) with a matrix factorization model. The noise variance required for the SURE algorithm is estimated by suppressing the harmonic component via median filtering. An evaluation performed using the MIDI-aligned piano sounds (MAPS) database revealed an average estimation error of -0.26 (standard deviation: 4.4) for the proposed algorithm.
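
    A rough sketch of the pipeline the abstract describes (editor's illustration only: the median-filter axis and kernel size are assumptions, and the singular-value count against a noise bound is a simple stand-in for the actual SURE-based rank selection):

```python
import numpy as np
from scipy.ndimage import median_filter

def estimate_harmonic_rank(spectrogram, kernel=17):
    """(1) Suppress sustained harmonic lines by median filtering along the
    frequency axis (assumed shape: freq x time) and use the result to
    estimate the noise standard deviation; (2) count the singular values
    above the noise bound sigma * (sqrt(m) + sqrt(n)) as a rough stand-in
    for the SURE-based rank estimate."""
    suppressed = median_filter(spectrogram, size=(kernel, 1))
    sigma = suppressed.std()
    m, n = spectrogram.shape
    singular_values = np.linalg.svd(spectrogram, compute_uv=False)
    return int(np.sum(singular_values > sigma * (np.sqrt(m) + np.sqrt(n))))
```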