
Keyword Search Result

[Keyword] features (84 hits)

Showing hits 1-20 of 84

  • A Channel Contrastive Attention-Based Local-Nonlocal Mutual Block on Super-Resolution Open Access

    Yuhao LIU  Zhenzhong CHU  Lifei WEI  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2024/04/23
      Vol:
    E107-D No:9
      Page(s):
    1219-1227

    In the realm of Single Image Super-Resolution (SISR), the meticulously crafted Nonlocal Sparse Attention-based block demonstrates its efficacy in noise reduction and computational cost reduction for nonlocal (global) features. However, it neglects the traditional Convolutional-based block, which is proficient in handling local features. Thus, merging the Nonlocal Sparse Attention-based block and the Convolutional-based block to concurrently manage local and nonlocal features poses a significant challenge. To tackle these issues, this paper introduces the Channel Contrastive Attention-based Local-Nonlocal Mutual block (CCLN) for Super-Resolution (SR). (1) We introduce the CCLN block, encompassing the Local Sparse Convolutional-based block for local features and the Nonlocal Sparse Attention-based block for nonlocal features. (2) We introduce Channel Contrastive Attention (CCA) blocks, incorporating Sparse Aggregation into the Convolutional-based blocks. Additionally, we introduce a robust framework to fuse these two blocks, ensuring that each branch operates according to its respective strengths. (3) The CCLN block can seamlessly integrate into established network backbones such as the Enhanced Deep Super-Resolution network (EDSR), yielding the Channel Contrastive Attention-based Local-Nonlocal Mutual Network (CCLNN). Experimental results show that our CCLNN effectively leverages both local and nonlocal features, outperforming other state-of-the-art algorithms.
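
    The dual-branch idea described above can be sketched in a few lines. The following is a minimal, hypothetical PyTorch block, not the authors' CCLN: it stands in a plain (dense) self-attention branch for the sparse channel-contrastive attention, pairs it with a small convolutional branch for local features, and fuses the two with a 1x1 convolution.

    ```python
    import torch
    import torch.nn as nn

    class LocalNonlocalBlock(nn.Module):
        """Toy dual-branch block: a convolutional branch for local features and a
        simplified (dense) self-attention branch for nonlocal features, fused by
        a learnable 1x1 convolution (illustrative only)."""
        def __init__(self, channels):
            super().__init__()
            self.local = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.q = nn.Conv2d(channels, channels // 2, 1)
            self.k = nn.Conv2d(channels, channels // 2, 1)
            self.v = nn.Conv2d(channels, channels, 1)
            self.fuse = nn.Conv2d(2 * channels, channels, 1)

        def forward(self, x):
            b, c, h, w = x.shape
            local_feat = self.local(x)
            q = self.q(x).flatten(2).transpose(1, 2)          # (b, hw, c/2)
            k = self.k(x).flatten(2)                          # (b, c/2, hw)
            v = self.v(x).flatten(2).transpose(1, 2)          # (b, hw, c)
            attn = torch.softmax(q @ k / (c // 2) ** 0.5, dim=-1)
            nonlocal_feat = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
            return x + self.fuse(torch.cat([local_feat, nonlocal_feat], dim=1))
    ```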

  • A CNN-Based Feature Pyramid Segmentation Strategy for Acoustic Scene Classification Open Access

    Ji XI  Yue XIE  Pengxu JIANG  Wei JIANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2024/03/26
      Vol:
    E107-D No:8
      Page(s):
    1093-1096

    Currently, a significant portion of acoustic scene classification (ASC) research centers on Convolutional Neural Network (CNN) models. This preference is primarily due to the ability of CNNs to effectively extract time-frequency information from audio recordings of scenes by employing spectrum data as input, since 2D spectrum features can express many dimensions of the signal. Nevertheless, because spectrum properties differ from natural-image properties, the same object appearing at different positions on the spectrum map can carry different interpretations. The lack of distinction between different aspects of the input information in CNN-based ASC networks may result in a decline in system performance. Considering this, a feature pyramid segmentation (FPS) approach based on CNNs is proposed. The proposed approach uses spectrum features as the model input; these features are split according to a preset scale, and each segment-level feature is fed into the CNN for learning. The high-level features from all scales are then fused and fed to a SoftMax classifier to categorize the different scenes. The experiments provide evidence of the efficacy of the FPS strategy and its potential to enhance the performance of the ASC system.
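
    A minimal sketch of the segment-then-fuse idea follows (hypothetical PyTorch code, with an illustrative encoder and segment count rather than the paper's exact architecture): the spectrogram is split along the frequency axis at a preset scale, each segment is encoded by a shared CNN, and the segment-level embeddings are concatenated before classification.

    ```python
    import torch
    import torch.nn as nn

    class FeaturePyramidSegmentationASC(nn.Module):
        def __init__(self, n_segments=4, n_classes=10):
            super().__init__()
            self.n_segments = n_segments
            self.encoder = nn.Sequential(              # shared segment encoder
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.classifier = nn.Linear(32 * n_segments, n_classes)

        def forward(self, spec):                       # spec: (batch, 1, freq, time)
            segments = torch.chunk(spec, self.n_segments, dim=2)
            embeddings = [self.encoder(s) for s in segments]
            fused = torch.cat(embeddings, dim=1)       # fuse segment-level features
            return self.classifier(fused)              # logits for SoftMax / CE loss
    ```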

  • Effect of Return Current Cable in Three Different Calibration Environments on Ringing Damped Oscillations of Contact Discharge Current Waveform from ESD Generator

    Yukihiro TOZAWA  Takeshi ISHIDA  Jiaqing WANG  Osamu FUJIWARA  

     
    PAPER-Electromagnetic Compatibility(EMC)

      Publicized:
    2023/09/06
      Vol:
    E106-B No:12
      Page(s):
    1455-1462

    Contact discharge current waveforms from an ESD generator at a test voltage of 4 kV were measured with the IEC-specified arrangement of a 2 m long return current cable in three different calibration environments, all of which comply with the IEC calibration standard, in order to identify the source of the damped oscillations (ringing) that have remained unexplained since contact discharge testing was first adopted in IEC publication 801-2 in 1989. The frequency spectra of the measured waveforms were compared with the spectrum calculated from the ideal contact discharge current waveform without ringing (the IEC-specified waveform) given in IEC 61000-4-2, and with the spectra derived from a simplified equivalent circuit based on the IEC standard combined with the measured input impedances of the one-end-grounded return current cable in the same arrangement and calibration environment as the current measurements. The results show that the measured contact discharge waveforms exhibit ringing superimposed on the IEC-specified waveform after the falling edge of the initial peak, which produces spectral content from 20 MHz to 200 MHz; the spectra from 40 MHz to 200 MHz differ significantly depending on the calibration environment even for the same cable arrangement, whereas the spectra from 20 MHz to 40 MHz and above 200 MHz are hardly affected. In the calibration environment where the cable is arranged close to the reference ground, the spectral shapes of the measured contact discharge currents, including the frequencies of the multiple peaks and dips, roughly correspond to the spectral distributions calculated from the simplified equivalent circuit using the measured cable input impedances. These findings reveal that the root cause of the ringing is mainly the resonances of the return current cable, and that a calibration environment in which the cable is arranged away from the reference ground tends to mitigate these cable resonances.
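
    For readers reproducing the frequency-domain comparison, the spectra above are simply amplitude spectra of the sampled discharge-current waveforms. The sketch below is a generic, hypothetical helper (illustrative parameters only, not the IEC-specified waveform), shown with a synthetic pulse plus a damped oscillation standing in for cable-resonance ringing.

    ```python
    import numpy as np

    def amplitude_spectrum(current, sample_rate_hz):
        """Single-sided amplitude spectrum of a sampled current waveform,
        for comparing measurements from different calibration environments."""
        n = len(current)
        spectrum = np.abs(np.fft.rfft(current)) / n
        freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate_hz)
        return freqs, spectrum

    # Synthetic example: double-exponential pulse + 60 MHz damped "ringing".
    fs = 20e9                                    # 20 GS/s sampling (assumed)
    t = np.arange(0, 200e-9, 1 / fs)
    pulse = 15 * (np.exp(-t / 30e-9) - np.exp(-t / 1e-9))
    ringing = 2 * np.exp(-t / 40e-9) * np.sin(2 * np.pi * 60e6 * t)
    freqs, spec = amplitude_spectrum(pulse + ringing, fs)
    ```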

  • No Reference Quality Assessment of Contrast-Distorted SEM Images Based on Global Features

    Fengchuan XU  Qiaoyue LI  Guilu ZHANG  Yasheng CHANG  Zixuan ZHENG  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/07/28
      Vol:
    E106-D No:11
      Page(s):
    1935-1938

    This letter presents a global feature-based method for evaluating the no-reference quality of contrast-distorted scanning electron microscopy (SEM) images. Based on the characteristics of SEM images and the human visual system, the global features of SEM images are extracted and used to score image quality. The texture information of SEM images is first extracted using an oriented low-pass filter, and the amount of information in the texture part is calculated from the entropy, which reflects the complexity of the texture. The singular values of the original image are then calculated at four scales, and the amount of structural change between different scales is calculated and averaged. Finally, the amounts of texture information and structural change are pooled to generate the final quality score of the SEM image. Experimental results show that the method can effectively evaluate the quality of contrast-distorted SEM images.
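
    The two global quantities can be prototyped compactly. The sketch below is an assumption-laden simplification (plain histogram entropy instead of the oriented low-pass filtering step, top-10 singular values, and an arbitrary pooling weight), intended only to illustrate how texture entropy and cross-scale singular-value change could be combined into a score.

    ```python
    import numpy as np

    def texture_entropy(gray):
        """Shannon entropy of the gray-level histogram as a proxy for the
        amount of texture information (oriented filtering omitted here)."""
        hist, _ = np.histogram(gray, bins=256, range=(0, 256))
        p = hist / hist.sum()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    def structural_change(gray, scales=(1, 2, 4, 8), k=10):
        """Average change of the normalized top-k singular values between
        successively down-sampled versions of the image."""
        energies = []
        for s in scales:
            small = gray[::s, ::s].astype(float)
            sv = np.linalg.svd(small, compute_uv=False)
            energies.append(sv[:k] / sv.sum())
        return float(np.mean([np.abs(energies[i] - energies[i + 1]).sum()
                              for i in range(len(energies) - 1)]))

    def quality_score(gray, alpha=0.5):
        # Simple weighted pooling of the two quantities (pooling rule assumed).
        return alpha * texture_entropy(gray) + (1 - alpha) * structural_change(gray)
    ```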

  • Large-Scale Gaussian Process Regression Based on Random Fourier Features and Local Approximation with Tsallis Entropy

    Hongli ZHANG  Jinglei LIU  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2023/07/11
      Vol:
    E106-D No:10
      Page(s):
    1747-1751

    With the emergence of large quantities of data in science and industry, it is urgent to improve the prediction accuracy and reduce the high complexity of Gaussian process regression (GPR). However, the traditional global and local approximations have corresponding shortcomings: the global approximation tends to ignore local features, while the local approximation is prone to over-fitting. To solve these problems, a large-scale Gaussian process regression algorithm (RFFLT) combining random Fourier features (RFF) and local approximation is proposed. 1) To speed up training, we use a random Fourier feature map to project the input data into a random low-dimensional feature space for processing. The main innovation of the algorithm is to design features using existing fast linear processing methods, so that the inner product of the transformed data approximately equals the inner product in the feature space of the user-specified shift-invariant kernel. 2) The generalized robust Bayesian committee machine (GRBCM) based on the Tsallis mutual information method is used in the local approximation, which enhances the flexibility of the model and, compared with previous work, generates a sparse representation of the expert weight distribution. The RFFLT algorithm was tested on six real data sets; it greatly shortens regression prediction time and improves prediction accuracy.
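
    The random Fourier feature step is the standard Rahimi-Recht construction: sample random projections so that the inner product of the mapped data approximates a shift-invariant (here RBF) kernel. A minimal NumPy sketch, with illustrative hyperparameters:

    ```python
    import numpy as np

    def random_fourier_features(X, n_features=200, lengthscale=1.0, seed=0):
        """Map X of shape (n, d) into a feature space where Z @ Z.T
        approximates the RBF Gram matrix exp(-||x - y||^2 / (2 l^2))."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_features))
        b = rng.uniform(0.0, 2 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    # GPR (or Bayesian linear regression) on Z = random_fourier_features(X)
    # then costs O(n * n_features^2) instead of O(n^3) on the exact kernel.
    ```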

  • Prior Information Based Decomposition and Reconstruction Learning for Micro-Expression Recognition

    Jinsheng WEI  Haoyu CHEN  Guanming LU  Jingjie YAN  Yue XIE  Guoying ZHAO  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/07/13
      Vol:
    E106-D No:10
      Page(s):
    1752-1756

    Micro-expression recognition (MER) draws intensive research interest as micro-expressions (MEs) can reveal genuine emotions. Prior information can guide the model to learn discriminative ME features effectively. However, most works focus on general models with a stronger representation ability that adaptively aggregate ME movement information in a holistic way, which may ignore the prior information and properties of MEs. To solve this issue, driven by the prior information that the category of an ME can be inferred from the relationship between the actions of different facial components, this work designs a novel model that conforms to this prior information and learns ME movement features in an interpretable way. Specifically, this paper proposes a Decomposition and Reconstruction-based Graph Representation Learning (DeRe-GRL) model to effectively learn high-level ME features. DeRe-GRL includes two modules: the Action Decomposition Module (ADM) and the Relation Reconstruction Module (RRM), where ADM learns action features of facial key components and RRM explores the relationships between these action features. Based on facial key components, ADM divides the geometric movement features extracted by the graph model-based backbone into several sub-features and learns a mapping matrix to map these sub-features into multiple action features; then, RRM learns weights over all action features to build the relationships between them. The experimental results demonstrate the effectiveness of the proposed modules, and the proposed method achieves competitive performance.
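
    A toy version of the decompose-then-reconstruct pipeline is sketched below (hypothetical PyTorch code; the dimensions, per-component linear maps, and softmax weighting are illustrative stand-ins for ADM and RRM).

    ```python
    import torch
    import torch.nn as nn

    class DeReSketch(nn.Module):
        """Split a holistic movement feature into per-component sub-features,
        map each to an action feature (ADM-like), then learn weights over the
        action features and fuse them (RRM-like)."""
        def __init__(self, feat_dim=256, n_components=4, action_dim=64, n_classes=3):
            super().__init__()
            sub_dim = feat_dim // n_components
            self.adm = nn.ModuleList([nn.Linear(sub_dim, action_dim)
                                      for _ in range(n_components)])
            self.rrm = nn.Linear(action_dim, 1)      # scalar weight per action
            self.head = nn.Linear(action_dim, n_classes)

        def forward(self, x):                        # x: (batch, feat_dim)
            subs = torch.chunk(x, len(self.adm), dim=1)
            actions = torch.stack([m(s) for m, s in zip(self.adm, subs)], dim=1)
            weights = torch.softmax(self.rrm(actions), dim=1)   # (batch, n_comp, 1)
            fused = (weights * actions).sum(dim=1)
            return self.head(fused)
    ```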

  • A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition

    Yang LIU  Yuqi XIA  Haoqin SUN  Xiaolei MENG  Jianxiong BAI  Wenbo GUAN  Zhen ZHAO  Yongwei LI  

     
    PAPER-Speech and Hearing

      Publicized:
    2022/12/08
      Vol:
    E106-A No:6
      Page(s):
    876-885

    Speech emotion recognition (SER) has long been a complex and difficult task due to emotional complexity. In this paper, we propose a multitask deep learning approach based on a cascaded attention network and a self-adaption loss for SER. First, non-personalized features are extracted to represent the process of emotion change while reducing the influence of external variables. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, where spatiotemporal attention can effectively locate the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence brought by differences in gender and human perception of external information is alleviated by using a multitask learning strategy, where a self-adaption loss is introduced to determine the weights of different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains absolute improvements of 1.97% and 0.91% over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively.
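
    One common way to realize such dynamic task weighting is with learnable homoscedastic-uncertainty weights; the sketch below (hypothetical PyTorch code) shows that pattern, which may differ from the paper's exact self-adaption loss.

    ```python
    import torch
    import torch.nn as nn

    class SelfAdaptiveMultitaskLoss(nn.Module):
        """Weight each task loss by a learned precision exp(-log_var) plus a
        regularizing log_var term, so task weights adapt during training."""
        def __init__(self, n_tasks=2):
            super().__init__()
            self.log_vars = nn.Parameter(torch.zeros(n_tasks))

        def forward(self, task_losses):
            total = 0.0
            for i, loss in enumerate(task_losses):
                precision = torch.exp(-self.log_vars[i])
                total = total + precision * loss + self.log_vars[i]
            return total
    ```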

  • Epileptic Seizure Prediction Using Convolutional Neural Networks and Fusion Features on Scalp EEG Signals

    Qixin LAN  Bin YAO  Tao QING  

     
    LETTER-Smart Healthcare

      Publicized:
    2022/05/27
      Vol:
    E106-D No:5
      Page(s):
    821-823

    Epileptic seizure prediction is an important research topic in clinical epilepsy treatment, as it provides opportunities for epilepsy patients and medical staff to take precautionary measures. EEG, which records the electrical discharges of the brain, is a commonly used tool for studying brain activity, and many studies based on machine learning algorithms have been proposed to solve this task using EEG signals. In this study, we propose novel seizure prediction models based on convolutional neural networks and scalp EEG for binary classification between preictal and interictal states. The short-time Fourier transform is used to translate raw EEG signals into STFT spectrums, which are applied as the input of the models. Fusion features are obtained through side-output constructions and used to train and test our models. The test results show that our models achieve comparable results in both sensitivity and FPR with fusion features. The proposed patient-specific model can be used in seizure prediction systems for EEG classification.
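
    The STFT preprocessing step can be reproduced with SciPy; the snippet below is a hedged sketch with illustrative window parameters (the paper's exact settings are not reproduced here).

    ```python
    import numpy as np
    from scipy.signal import stft

    def eeg_to_stft_input(eeg, fs=256, nperseg=256, noverlap=128):
        """Convert one EEG channel (1-D array) to a log-magnitude STFT
        spectrum suitable as CNN input."""
        freqs, times, Z = stft(eeg, fs=fs, nperseg=nperseg, noverlap=noverlap)
        spec = np.log1p(np.abs(Z))                  # (n_freqs, n_frames)
        return spec[np.newaxis, ...]                # add a channel axis for the CNN
    ```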

  • Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval

    Longjiao ZHAO  Yu WANG  Jien KATO  Yoshiharu ISHIKAWA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2023/02/14
      Vol:
    E106-D No:5
      Page(s):
    1069-1080

    Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional discriminative capability. Recent research in this field has concentrated on pooling methods that aggregate local features into global features and assess the global similarity of two images. However, pooling methods sacrifice the image's local region information and spatial relationships, which are precisely the keys to robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined directly from local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as the spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms pooling methods thanks to its enhanced robustness. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
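
    As an illustration of what a local similarity tensor can look like, the sketch below computes all pairwise cosine similarities between the local convolutional descriptors of a query map and a gallery map, keeping their spatial positions; this is one plausible form, not necessarily any of the paper's three LST definitions.

    ```python
    import torch
    import torch.nn.functional as F

    def local_similarity_tensor(fq, fg):
        """fq, fg: local convolutional feature maps of shape (C, H, W).
        Returns a (Hq, Wq, Hg, Wg) tensor of cosine similarities that a
        similarity CNN could consume."""
        q = F.normalize(fq.flatten(1), dim=0)       # (C, Hq*Wq), unit columns
        g = F.normalize(fg.flatten(1), dim=0)       # (C, Hg*Wg)
        sim = q.t() @ g                             # (Hq*Wq, Hg*Wg)
        return sim.view(*fq.shape[1:], *fg.shape[1:])
    ```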

  • Comparative Evaluation of Diverse Features in Fluency Evaluation of Spontaneous Speech

    Huaijin DENG  Takehito UTSURO  Akio KOBAYASHI  Hiromitsu NISHIZAKI  

     
    PAPER-Speech and Hearing

      Publicized:
    2022/10/25
      Vol:
    E106-D No:1
      Page(s):
    36-45

    There have been many previous studies on fluency evaluation of spontaneous speech. However, most of them focus on lexical cues, and little emphasis is placed on how diverse acoustic features and deep end-to-end models contribute to improving performance. In this paper, we describe a multi-layer neural network that investigates not only lexical features extracted from transcriptions but also utterance-level acoustic features from audio data. We also conduct experiments to investigate the performance of end-to-end approaches with mel-spectrograms on this task. As the speech fluency evaluation task, we evaluate our proposed method on two binary classification tasks: fluent speech detection and disfluent speech detection. Speech data of around 10 seconds each, annotated with the three classes “fluent,” “neutral,” and “disfluent,” are used for evaluation. According to the two-way splits of these three classes, the task of fluent speech detection is defined as the binary classification of fluent vs. neutral and disfluent, while that of disfluent speech detection is defined as the binary classification of fluent and neutral vs. disfluent. We then conduct experiments for comparative evaluation of the multi-layer neural network with diverse features as well as the end-to-end models. For fluent speech detection, when comparing utterance-level disfluency-based, prosodic, and acoustic features with the multi-layer neural network, using only disfluency-based and prosodic features performs better. More specifically, the performance improves considerably when all acoustic features are removed from the full feature set, while it degrades considerably if filler-related features are removed. Overall, however, the end-to-end Transformer+VGGNet model with mel-spectrograms achieves the best results. For disfluent speech detection, the multi-layer neural network using disfluency-based, prosodic, and acoustic features without fillers achieves the best results. The end-to-end Transformer+VGGNet architecture also obtains high scores, but it is exceeded, with a significant difference, by the best results of the multi-layer neural network. Thus, unlike in fluent speech detection, disfluency-based and prosodic features other than fillers are still necessary for disfluent speech detection.
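
    The multi-layer network itself is a plain feed-forward classifier over concatenated utterance-level features; a minimal hypothetical PyTorch sketch (hidden sizes and dropout are illustrative) is shown below.

    ```python
    import torch
    import torch.nn as nn

    class FluencyMLP(nn.Module):
        """Binary fluent/disfluent classifier over a concatenation of
        utterance-level disfluency-based, prosodic, and acoustic features."""
        def __init__(self, in_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(0.3),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 2),               # two-way decision
            )

        def forward(self, features):                # features: (batch, in_dim)
            return self.net(features)
    ```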

  • Orthogonal Deep Feature Decomposition Network for Cross-Resolution Person Re-Identification

    Rui SUN  Zi YANG  Lei ZHANG  Yiheng YU  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2022/08/23
      Vol:
    E105-D No:11
      Page(s):
    1994-1997

    Person images captured by surveillance cameras in real scenes often have low resolution (LR), which severely degrades recognition performance when they are matched with pre-stored high-resolution (HR) images. Existing methods typically employ super-resolution (SR) techniques to address this resolution discrepancy in person re-identification (re-ID). However, SR techniques are intended to enhance the visual fidelity of images to the human eye, without regard to recovering pedestrian identity information. To cope with this challenge, we propose an orthogonal deep feature decomposition network. We decompose pedestrian features into resolution-related features and identity-related features that are orthogonal to each other, and we design an identity-preserving loss and a resolution-invariant loss to ensure the recovery of pedestrian identity information. Experiments on the MLR-CUHK03 and MLR-VIPeR datasets demonstrate the superiority of our method over the SOTA method.
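
    The orthogonality constraint between the two feature groups can be imposed with a simple penalty; the sketch below (a hypothetical squared-cosine loss, not necessarily the paper's formulation) illustrates the idea.

    ```python
    import torch
    import torch.nn.functional as F

    def orthogonality_loss(f_id, f_res):
        """Encourage identity-related features f_id and resolution-related
        features f_res (both (batch, d)) to be mutually orthogonal."""
        f_id = F.normalize(f_id, dim=1)
        f_res = F.normalize(f_res, dim=1)
        return ((f_id * f_res).sum(dim=1) ** 2).mean()
    ```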

  • Combating Password Vulnerability with Keystroke Dynamics Featured by WiFi Sensing

    Yuanwei HOU  Yu GU  Weiping LI  Zhi LIU  

     
    PAPER-Mobile Information Network and Personal Communications

      Publicized:
    2022/04/01
      Vol:
    E105-A No:9
      Page(s):
    1340-1347

    Fast-evolving credential attacks have posed a great security challenge to current password-based information systems. Recently, biometric factors such as the face, iris, or fingerprint, which are difficult to forge, have risen as key elements for designing passwordless authentication. However, capturing and analyzing such factors usually requires special devices, hindering their feasibility and practicality. To this end, we present WiASK, a device-free WiFi-sensing-enabled Authentication System exploring Keystroke dynamics. More specifically, WiASK captures the keystrokes of a user typing a pre-defined, easy-to-remember string by leveraging the existing WiFi infrastructure. Instead of focusing on the string itself, which is vulnerable to password attacks, WiASK interprets the way it is typed, i.e., the keystroke dynamics, into user identity, based on the biologically validated correlation between them. We prototype WiASK on low-cost off-the-shelf WiFi devices and verify its performance in three real environments. Empirical results show that WiASK achieves on average 93.7% authentication accuracy, 2.5% false accept rate, and 5.1% false reject rate.

  • Deep Learning Based Low Complexity Symbol Detection and Modulation Classification Detector

    Chongzheng HAO  Xiaoyu DANG  Sai LI  Chenghua WANG  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2022/01/24
      Vol:
    E105-B No:8
      Page(s):
    923-930

    This paper presents a deep neural network (DNN) based symbol detection and modulation classification detector (SDMCD) for mixed blind signals detection. Unlike conventional methods that employ symbol detection after modulation classification, the proposed SDMCD can perform symbol recovery and modulation identification simultaneously. A cumulant and moment feature vector is presented in conjunction with a low complexity sparse autoencoder architecture to complete mixed signals detection. Numerical results show that SDMCD scheme has remarkable symbol error rate performance and modulation classification accuracy for various modulation formats in AWGN and Rayleigh fading channels. Furthermore, the proposed detector has robust performance under the impact of frequency and phase offsets.
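
    The cumulant-and-moment features are standard higher-order statistics of the complex baseband symbols; a NumPy sketch of one such feature vector is given below (the paper's exact feature set may differ).

    ```python
    import numpy as np

    def cumulant_features(x):
        """Normalized moments/cumulants of complex symbols x, commonly used
        for modulation classification."""
        x = x / np.sqrt(np.mean(np.abs(x) ** 2))    # power-normalize
        M = lambda p, q: np.mean(x ** (p - q) * np.conj(x) ** q)
        M20, M21 = M(2, 0), M(2, 1)
        M40, M41, M42 = M(4, 0), M(4, 1), M(4, 2)
        C40 = M40 - 3 * M20 ** 2
        C41 = M41 - 3 * M20 * M21
        C42 = M42 - np.abs(M20) ** 2 - 2 * M21 ** 2
        return np.array([np.abs(M20), np.abs(C40), np.abs(C41), np.abs(C42)])
    ```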

  • Coarse-to-Fine Evolutionary Method for Fast Horizon Detection in Maritime Images

    Uuganbayar GANBOLD  Junya SATO  Takuya AKASHI  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/09/08
      Vol:
    E104-D No:12
      Page(s):
    2226-2236

    Horizon detection is useful in maritime image processing for various purposes, such as estimating camera orientation, registering consecutive frames, and restricting the object search region. Existing horizon detection methods are based on edge extraction. For accuracy, they use multiple images filtered with different filter sizes; however, this increases the processing time. In addition, these methods are not robust to blurring. Therefore, we developed a horizon detection method that does not extract candidates from edge information, by formulating horizon detection as a global optimization problem. A horizon line in the image plane is represented by two parameters, which are optimized by an evolutionary algorithm (a genetic algorithm). Thus, the local and global features of a horizon are utilized concurrently in the optimization process, which is accelerated by applying a coarse-to-fine strategy. As a result, we can detect the horizon line in high-resolution maritime images in about 50 ms. The performance of the proposed method was tested on 49 videos of the Singapore marine dataset and the Buoy dataset, which contain over 16000 frames under different scenarios. Experimental results show that the proposed method achieves higher accuracy than state-of-the-art methods.
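
    The two-parameter formulation makes the search easy to prototype. The toy genetic-algorithm sketch below uses an assumed fitness (mean intensity contrast between the regions above and below the candidate line) and no coarse-to-fine schedule, so it only illustrates the parameterization, not the paper's full method.

    ```python
    import numpy as np

    def detect_horizon(image_gray, generations=30, pop_size=40, seed=0):
        """Search for a horizon line (vertical offset at image centre, angle)
        by a simple mutate-and-select evolutionary loop."""
        rng = np.random.default_rng(seed)
        h, w = image_gray.shape
        ys, xs = np.mgrid[0:h, 0:w]

        def fitness(offset, angle):
            line_y = offset + np.tan(angle) * (xs - w / 2)
            above, below = image_gray[ys < line_y], image_gray[ys >= line_y]
            if above.size == 0 or below.size == 0:
                return -np.inf
            return abs(above.mean() - below.mean())

        pop = np.column_stack([rng.uniform(0, h, pop_size),
                               rng.uniform(-0.4, 0.4, pop_size)])
        for _ in range(generations):
            scores = np.array([fitness(o, a) for o, a in pop])
            parents = pop[np.argsort(scores)[-pop_size // 2:]]
            children = parents + rng.normal(0, [h * 0.02, 0.02], parents.shape)
            pop = np.vstack([parents, children])
        return pop[np.argmax([fitness(o, a) for o, a in pop])]
    ```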

  • An Autoencoder Based Background Subtraction for Public Surveillance

    Yue LI  Xiaosheng YU  Haijun CAO  Ming XU  

     
    LETTER-Image

      Publicized:
    2021/04/08
      Vol:
    E104-A No:10
      Page(s):
    1445-1449

    An autoencoder is trained to generate the background from a surveillance image by setting the training label to a shuffled version of the input, instead of the input itself as in a traditional autoencoder. Multi-scale features are then extracted by a sparse autoencoder from the surveillance image and the corresponding background to detect the foreground.

  • Image Emotion Recognition Using Visual and Semantic Features Reflecting Emotional and Similar Objects

    Takahisa YAMAMOTO  Shiki TAKEUCHI  Atsushi NAKAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/06/24
      Vol:
    E104-D No:10
      Page(s):
    1691-1701

    Visual sentiment analysis has many applications, including image captioning, opinion mining, and advertisement; however, it is still a difficult problem, and existing algorithms cannot produce satisfactory results. One of the difficulties in classifying images into emotions is that visual sentiments are evoked by different types of information: visual information, such as colors or textures, and semantic information, such as the types of objects evoking emotions and/or their combinations. In contrast to existing methods that use only visual information, this paper presents a novel algorithm for image emotion recognition that uses both types of information simultaneously. For semantic features, we introduce an object vector and a word vector. The object vector is created by an object detection method and reflects the objects present in an image. The word vector is created by transforming the names of detected objects through a word embedding model; this vector is similar among objects that are semantically similar. These semantic features and a visual feature produced by a fine-tuned convolutional neural network (CNN) are concatenated, and classification is performed on the concatenated feature vector. Extensive evaluation experiments using emotional image datasets show that our method achieves the best accuracy among existing methods on all but one dataset, with an improvement of up to 4.54%.
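
    The fusion itself is a straightforward concatenation followed by a classifier; the sketch below is a hypothetical PyTorch head with illustrative dimensions (e.g., an 80-class object vector and 300-dimensional word embeddings), not the authors' exact network.

    ```python
    import torch
    import torch.nn as nn

    class VisualSemanticEmotion(nn.Module):
        def __init__(self, visual_dim=2048, n_object_classes=80,
                     word_dim=300, n_emotions=8):
            super().__init__()
            self.classifier = nn.Sequential(
                nn.Linear(visual_dim + n_object_classes + word_dim, 512),
                nn.ReLU(),
                nn.Linear(512, n_emotions),
            )

        def forward(self, visual_feat, object_vec, word_vec):
            # Concatenate the CNN visual feature, the object-occurrence vector,
            # and the averaged word embedding of detected object names.
            fused = torch.cat([visual_feat, object_vec, word_vec], dim=1)
            return self.classifier(fused)
    ```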

  • HAIF: A Hierarchical Attention-Based Model of Filtering Invalid Webpage

    Chaoran ZHOU  Jianping ZHAO  Tai MA  Xin ZHOU  

     
    PAPER

      Publicized:
    2021/02/25
      Vol:
    E104-D No:5
      Page(s):
    659-668

    In Internet applications, when users search for information, search engines invariably return some invalid webpages that do not contain valid information. These invalid webpages interfere with users' access to useful information, reduce the efficiency of information queries, and occupy Internet resources. Accurate and fast filtering of invalid webpages can purify the Internet environment and provide convenience for netizens. This paper proposes an invalid webpage filtering model (HAIF) based on deep learning and a hierarchical attention mechanism. HAIF improves the semantic and sequence information representation of webpage text by concatenating lexical-level embeddings and paragraph-level embeddings, and it introduces a hierarchical attention mechanism to optimize the extraction of text sequence features and webpage tag features. The local-level attention layer optimizes the local information in the plain text; by concatenating the input embeddings with the feature matrix produced by local-level attention, it enriches the information representation. The tag-level attention layer introduces webpage structural feature information into the attention calculation over different HTML tags, making HAIF better suited to the Internet resource domain. To evaluate the effectiveness of HAIF in filtering invalid pages, we conducted various experiments. Experimental results demonstrate that, compared with other baseline models, HAIF improves on all evaluation criteria to various degrees.
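
    As an illustration of the tag-level attention idea, the sketch below pools per-HTML-tag embeddings with a generic additive attention layer (a hypothetical stand-in, not HAIF's exact formulation).

    ```python
    import torch
    import torch.nn as nn

    class TagLevelAttention(nn.Module):
        """Additive-attention pooling over per-tag feature vectors."""
        def __init__(self, dim):
            super().__init__()
            self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                       nn.Linear(dim, 1, bias=False))

        def forward(self, tag_embeddings):          # (batch, n_tags, dim)
            weights = torch.softmax(self.score(tag_embeddings), dim=1)
            return (weights * tag_embeddings).sum(dim=1)   # (batch, dim)
    ```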

  • Rethinking the Rotation Invariance of Local Convolutional Features for Content-Based Image Retrieval

    Longjiao ZHAO  Yu WANG  Jien KATO  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2020/10/14
      Vol:
    E104-D No:1
      Page(s):
    174-182

    Recently, local features computed using convolutional neural networks (CNNs) have shown good performance in image retrieval. The local convolutional features obtained by CNNs (LC features) are designed to be translation invariant; however, they are inherently sensitive to rotation perturbations, which leads to misjudgments in retrieval tasks. In this work, our objective is to enhance the robustness of LC features against image rotation. To do this, we conduct a thorough experimental evaluation of three candidate anti-rotation strategies (in-model data augmentation, in-model feature augmentation, and post-model feature augmentation) over two kinds of rotation attack (dataset attack and query attack). In the training procedure, we implement a data augmentation protocol and a network augmentation method. In the test procedure, we develop a local transformed convolutional (LTC) feature extraction method and evaluate it over different network configurations. We end up with a series of good practices, backed by steady quantitative support, that lead to the best strategy for computing LC features with high rotation invariance in image retrieval.
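
    One of the simplest post-model strategies of this kind is to extract features from several rotated copies of an image and pool them over rotations; the sketch below illustrates that baseline (with a hypothetical `backbone` feature extractor), not the paper's LTC method itself.

    ```python
    import torch
    import torchvision.transforms.functional as TF

    def rotation_pooled_features(backbone, image, angles=(0, 90, 180, 270)):
        """Max-pool local convolutional features over rotated copies of the
        input image (image: (C, H, W) tensor; backbone: any CNN feature-map
        extractor taking a batched tensor)."""
        feats = []
        with torch.no_grad():
            for a in angles:
                rotated = TF.rotate(image, a)
                feats.append(backbone(rotated.unsqueeze(0)))
        return torch.stack(feats, dim=0).max(dim=0).values
    ```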

  • ECG Classification with Multi-Scale Deep Features Based on Adaptive Beat-Segmentation

    Huan SUN  Yuchun GUO  Yishuai CHEN  Bin CHEN  

     
    PAPER

      Publicized:
    2020/07/01
      Vol:
    E103-B No:12
      Page(s):
    1403-1410

    Recently, ECG-based diagnosis systems based on wearable devices have attracted more and more attention from researchers. Existing studies have achieved high classification accuracy by using deep neural networks (DNNs), but some problems remain, such as imprecise heartbeat segmentation, inadequate use of medical knowledge, and identical treatment of features with different importance. To address these problems, this paper: 1) proposes an adaptive segmenting-reshaping method to acquire abundant useful samples; 2) builds a set of hand-crafted features and deep features on the inner-beat, beat, and inter-beat scales by integrating sufficient medical knowledge; and 3) introduces a modified channel attention module (CAM) to augment the significant channels in the deep features. Following the Association for the Advancement of Medical Instrumentation (AAMI) recommendation, we classified the dataset into four classes and validated our algorithm on the MIT-BIH database. Experiments show that the accuracy of our model reaches 96.94%, a 3.71% increase over that of a state-of-the-art alternative.
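
    The channel attention idea can be sketched with a squeeze-and-excitation style layer over 1-D ECG feature maps; this is a generic stand-in for the paper's modified CAM, with an illustrative reduction ratio.

    ```python
    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        """Re-weight channels of a (batch, channels, length) deep feature."""
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid(),
            )

        def forward(self, x):
            w = self.fc(x.mean(dim=2))              # global average pool over time
            return x * w.unsqueeze(-1)              # emphasize significant channels
    ```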

  • Inpainting via Sparse Representation Based on a Phaseless Quality Metric

    Takahiro OGAWA  Keisuke MAEDA  Miki HASEYAMA  

     
    PAPER-Image

      Vol:
    E103-A No:12
      Page(s):
    1541-1551

    An inpainting method via sparse representation based on a new phaseless quality metric is presented in this paper. Since power spectra, which are phaseless features of local regions within images, represent texture characteristics more successfully than pixel values, a new quality metric based on these phaseless features is derived for image representation. Specifically, the proposed method enables sparse representation of target signals, i.e., target patches including missing intensities, by monitoring the errors to which phase retrieval converges and using them as the novel phaseless quality metric. This is the main contribution of our study. In this approach, the phase retrieval algorithm used in our method has two important roles: (1) derivation of the new quality metric, which can be computed even for images including missing intensities, and (2) conversion of phaseless features, i.e., power spectra, to pixel values, i.e., intensities. Therefore, this approach solves the existing problem of not being able to use better features or better quality metrics for inpainting. Results of experiments showed that the proposed method using sparse representation based on the new phaseless quality metric outperforms previously reported methods that directly use pixel values for inpainting.
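
    The phaseless metric hinges on a phase retrieval loop; the sketch below is a generic Gerchberg-Saxton style iteration (with an assumed real, non-negative spatial constraint) whose converged magnitude mismatch plays the role of the quality metric, offered only as an illustration of the mechanism.

    ```python
    import numpy as np

    def phase_retrieval_error(target_power_spectrum, init_patch, n_iter=50):
        """Alternate between enforcing the target Fourier magnitude and a
        simple spatial-domain constraint; return the relative magnitude
        mismatch after n_iter iterations."""
        target_mag = np.sqrt(target_power_spectrum)
        x = init_patch.astype(float)
        for _ in range(n_iter):
            X = np.fft.fft2(x)
            X = target_mag * np.exp(1j * np.angle(X))   # magnitude constraint
            x = np.real(np.fft.ifft2(X))
            x = np.clip(x, 0, None)                     # spatial constraint
        err = np.linalg.norm(np.abs(np.fft.fft2(x)) - target_mag)
        return err / np.linalg.norm(target_mag)
    ```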
