Keyword Search Result

[Keyword] video processing (14 hits)

Hits 1-14
  • Feasibility Study for Computer-Aided Diagnosis System with Navigation Function of Clear Region for Real-Time Endoscopic Video Image on Customizable Embedded DSP Cores

    Masayuki ODAGAWA  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    LETTER-VLSI Design Technology and CAD

      Publicized:
    2021/07/08
      Vol:
    E105-A No:1
      Page(s):
    58-62

    This paper examines the feasibility of automatic unclear-region detection in a CAD system for colorectal tumors using real-time endoscopic video images. We confirmed that a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLO2 and classification by AlexNet and SVMs, can be realized on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be built as a low-power ASIC using customizable embedded DSP cores.

  • Classification with CNN features and SVM on Embedded DSP Core for Colorectal Magnified NBI Endoscopic Video Image

    Masayuki ODAGAWA  Takumi OKAMOTO  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized:
    2021/07/21
      Vol:
    E105-A No:1
      Page(s):
    25-34

    In this paper, we present a classification method for a Computer-Aided Diagnosis (CAD) system for colorectal magnified Narrow Band Imaging (NBI) endoscopy. In an endoscopic video image, color shift, blurring, or reflection of light occurs in a lesion area, which affects the discrimination result by a computer. Therefore, to classify lesions robustly and stably despite these artifacts specific to video frames, we implement a CAD system for colorectal endoscopic images with Convolutional Neural Network (CNN) features and Support Vector Machine (SVM) classification on an embedded DSP core. To improve the robustness of the CAD system, we train the SVM on data sets of multiple image sizes so that it adapts to the noise peculiar to video images. We confirmed that the proposed method achieves higher robustness, stability, and classification accuracy on endoscopic video images. The proposed method can also cope with differences in resolution between old and new endoscopes and performs stably on the input endoscopic video.

  • Video Smoke Removal from a Single Image Sequence (Open Access)

    Shiori YAMAGUCHI  Keita HIRAI  Takahiko HORIUCHI  

     
    PAPER

      Publicized:
    2021/01/07
      Vol:
    E104-A No:6
      Page(s):
    876-886

    In this study, we present a novel method for removing smoke from videos based on a single image sequence. Smoke is a significant artifact in images or videos because it can reduce the visibility in disaster scenes. Our proposed method for removing smoke involves two main processes: (1) the development of a smoke imaging model and (2) smoke removal using spatio-temporal pixel compensation. First, we model the optical phenomena in natural scenes including smoke, which is called a smoke imaging model. Our smoke imaging model is developed by extending conventional haze imaging models. We then remove the smoke from a video in a frame-by-frame manner based on the smoke imaging model. Next, we refine the appearance of the smoke-free video by spatio-temporal pixel compensation, where we align the smoke-free frames using the corresponding pixels. To obtain the corresponding pixels, we use SIFT and color features with distance constraints. Finally, in order to obtain a clear video, we refine the pixel values based on the spatio-temporal weightings of the corresponding pixels in the smoke-free frames. We used simulated and actual smoke videos in our validation experiments. The experimental results demonstrated that our method can obtain effective smoke removal results from dynamic scenes. We also quantitatively assessed our method based on a temporal coherence measure.

  • An Efficient Block Assignment Policy in Hadoop Distributed File System for Multimedia Data Processing

    Cheolgi KIM  Daechul LEE  Jaehyun LEE  Jaehwan LEE  

     
    LETTER-Computer System

      Publicized:
    2019/05/21
      Vol:
    E102-D No:8
      Page(s):
    1569-1571

    Hadoop, a distributed processing framework for big-data, is now widely used for multimedia processing. However, when processing video data from a Hadoop distributed file system (HDFS), unnecessary network traffic is generated due to an inefficient HDFS block slice policy for picture frames in video files. We propose a new block replication policy to solve this problem and compare the newly proposed HDFS with the original HDFS via extensive experiments. The proposed HDFS reduces network traffic, and increases locality between processing cores and file locations.

  • Distributed Video Decoding on Hadoop

    Illo YOON  Saehanseul YI  Chanyoung OH  Hyeonjin JUNG  Youngmin YI  

     
    PAPER-Cluster Computing

      Publicized:
    2018/09/18
      Vol:
    E101-D No:12
      Page(s):
    2933-2941

    Video analytics is usually time-consuming, as it not only requires video decoding as a first step but also usually applies complex computer vision and machine learning algorithms to the decoded frames. To achieve high efficiency in video analytics with ever-increasing frame sizes, many studies have been conducted on distributed video processing using Hadoop. However, most approaches focus on processing multiple video files on multiple nodes. Such approaches require a number of video files to achieve any speedup, and can easily result in load imbalance when the video files are reasonably long, since a video file itself is processed sequentially. In contrast, we propose a distributed video decoding method with an extended FFmpeg and VideoRecordReader, by which a single large video file can be processed in parallel across multiple nodes in Hadoop. The experimental results show that case studies of face detection and a SURF system achieve speedups of 40.6 times and 29.1 times, respectively, on a four-node cluster with 12 mappers in each node, showing good scalability.

  • A Study on Quality Metrics for 360 Video Communications

    Huyen T. T. TRAN  Cuong T. PHAM  Nam PHAM NGOC  Anh T. PHAM  Truong Cong THANG  

     
    PAPER

      Publicized:
    2017/10/16
      Vol:
    E101-D No:1
      Page(s):
    28-36

    360 videos have recently become a popular virtual reality content type. However, a good quality metric for 360 videos is still an open issue. In this work, our goal is to identify appropriate objective quality metrics for 360 video communications. In particular, fourteen objective quality measures at different processing phases are considered. A subjective test is also conducted in this study, and the relationship between objective quality and subjective quality is investigated. It is found that most of the PSNR-related quality measures are well correlated with subjective quality. However, for evaluating video quality across different contents, a content-based quality metric is needed.
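
    For readers unfamiliar with the PSNR family mentioned in this abstract, the following is a minimal sketch of plain PSNR between a reference frame and a distorted frame. It is not one of the fourteen measures evaluated in the paper, which the abstract does not specify; the function name `psnr`, the list-of-rows frame layout, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Plain PSNR in dB between two equally sized frames.

    Frames are lists of rows of numeric sample values (an assumed layout).
    `peak` is the maximum possible sample value (255 assumes 8-bit samples).
    """
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of rows")
    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        if len(ref_row) != len(dist_row):
            raise ValueError("rows must have the same length")
        for r, d in zip(ref_row, dist_row):
            squared_error += (r - d) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

    Variants used for 360 video typically change how samples are weighted or which projection they are computed on; this sketch treats every sample equally.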

  • A 197mW 70ms-Latency Full-HD 12-Channel Video-Processing SoC in 16nm CMOS for In-Vehicle Information Systems

    Seiji MOCHIZUKI  Katsushige MATSUBARA  Keisuke MATSUMOTO  Chi Lan Phuong NGUYEN  Tetsuya SHIBAYAMA  Kenichi IWATA  Katsuya MIZUMOTO  Takahiro IRITA  Hirotaka HARA  Toshihiro HATTORI  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2878-2887

    A 197mW 70ms-latency Full-HD 12-channel video-processing SoC for in-vehicle information systems has been implemented in 16nm CMOS. The SoC integrates 17 video processors of 6 types to perform video processing independently of other processing in the CPU/GPU. The synchronous scheme between the video processors achieves a low latency of 70ms for driver assistance. The optimized implementation of lossy and lossless video-data compression reduces memory access data by half and power consumption by 20%.

  • 3D Tracker-Level Fusion for Robust RGB-D Tracking

    Ning AN  Xiao-Guang ZHAO  Zeng-Guang HOU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2017/05/16
      Vol:
    E100-D No:8
      Page(s):
    1870-1881

    In this study, we address the problem of online RGB-D tracking, which is confronted with various challenges caused by deformation, occlusion, background clutter, and abrupt motion. Various trackers have different strengths and weaknesses, and thus a single tracker can perform well only in specific scenarios. We propose a 3D tracker-level fusion algorithm (TLF3D) which enhances the strengths of different trackers and suppresses their weaknesses to achieve robust tracking performance in various scenarios. The fusion result is generated from the outputs of base trackers by optimizing an energy function that considers both the 3D cube attraction and 3D trajectory smoothness. In addition, three complementary base RGB-D trackers with intrinsically different tracking components are proposed for the fusion algorithm. We perform extensive experiments on a large-scale RGB-D benchmark dataset. The evaluation results demonstrate the effectiveness of the proposed fusion algorithm and the superior performance of the proposed TLF3D tracker against state-of-the-art RGB-D trackers.

  • Energy Efficiency Improvement by Dynamic Reconfiguration for Embedded Systems

    Kei KINOSHITA  Yoshiki YAMAGUCHI  Daisuke TAKANO  Tomoyuki OKAMURA  Tetsuhiko YAO  

     
    PAPER-Architecture

      Publicized:
    2014/11/19
      Vol:
    E98-D No:2
      Page(s):
    220-229

    This paper seeks to improve the power-performance efficiency of embedded systems through dynamic reconfiguration. Programmable logic devices (PLDs) can optimize power consumption through partial and/or dynamic reconfiguration. It is a non-exclusive approach that can be combined with other power-reduction techniques, and thus it is applicable to a myriad of systems. The power-performance improvement by dynamic reconfiguration was evaluated through an augmented reality system that translates Japanese into English. It is a wearable and mobile system with a head-mounted display (HMD). In the system, the computing core detects a Japanese word in an input video frame and outputs the translated term to the HMD. It includes various image processing approaches such as pattern recognition and object tracking, and these functions run sequentially. The system does not need to hold all functions simultaneously; it provides each function by reconfiguration only when it is needed. In other words, with dynamic reconfiguration, the spatiotemporal module-based pipeline can reduce its circuit size and power consumption compared to the naive approach. The approach achieved marked improvements: the computational speed was the same, but the power consumption was reduced to around $\frac{1}{6}$.

  • Super-Resolution Reconstruction for Spatio-Temporal Resolution Enhancement of Video Sequences

    Miki HASEYAMA  Daisuke IZUMI  Makoto TAKIZAWA  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E95-D No:9
      Page(s):
    2355-2358

    A method for spatio-temporal resolution enhancement of video sequences based on super-resolution reconstruction is proposed. A new observation model is defined for accurate resolution enhancement, which enables subpixel motion in intermediate frames to be obtained. A modified optimization formula for obtaining a high-resolution sequence is also adopted.

  • A 10T Non-precharge Two-Port SRAM Reducing Readout Power for Video Processing

    Hiroki NOGUCHI  Yusuke IGUCHI  Hidehiro FUJIWARA  Shunsuke OKUMURA  Yasuhiro MORITA  Koji NII  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER

      Vol:
    E91-C No:4
      Page(s):
    543-552

    We propose a low-power non-precharge-type two-port SRAM for video processing that exploits statistical similarity in images. To minimize the charge/discharge power on a read bitline, the proposed memory cell (MC) has ten transistors (10T), comprising the conventional 6T MC, a readout inverter, and a transmission gate for a read port. In addition, to incorporate three wordlines, we propose a shared wordline structure, with which the vertical cell size of the 10T MC is fitted to the same size as the conventional 8T MC. Since the readout inverter fully charges/discharges a read bitline, there is no precharge circuit on the read bitline. Thus, power is not consumed by precharging, but only when a readout datum changes. This feature suits video processing, since image data have spatial correlation and similar data are read out in consecutive cycles. In addition to reducing power, the prechargeless structure shortens the cycle time by 38% compared with the conventional SRAM, because it does not require a precharge period. This, in turn, demonstrates that the proposed SRAM operates at a lower voltage, which achieves further power reduction. Compared to the conventional 8T SRAM, the proposed SRAM reduces the charge/discharge probability on the bitlines to 19% (an 81% saving). In measurements, we confirmed that the proposed 64-kb video memory in a 90-nm process achieves an 85% power saving on the read bitline when used as an H.264 reconstructed image memory. The area overhead is 14.4%.
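
    The observation that a non-precharge read bitline draws power only when the readout datum changes can be illustrated with a toy activity count. The sketch below is not the paper's measurement methodology; the function name `bitline_toggle_rate`, the 8-bit word width, and the sample pixel values are assumptions. It simply measures the fraction of bitlines that change value between consecutive reads.

```python
def bitline_toggle_rate(words, width=8):
    """Fraction of read bitlines that change value between consecutive reads.

    `words` is the sequence of integers read out in consecutive cycles and
    `width` is the number of bitlines per word (both are assumed inputs).
    """
    if len(words) < 2:
        return 0.0
    mask = (1 << width) - 1
    toggles = 0
    for previous, current in zip(words, words[1:]):
        # XOR marks the bit positions whose value differs between the two reads.
        toggles += bin((previous ^ current) & mask).count("1")
    return toggles / ((len(words) - 1) * width)


# Hypothetical pixel values: neighbouring pixels in a smooth region differ
# only in a few low-order bits, so few bitlines toggle.
smooth_region = [0b10000000, 0b10000001, 0b10000001, 0b10000011]
rate = bitline_toggle_rate(smooth_region)
```

    Spatially correlated image data keeps this rate low, which is the property the abstract relies on.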

  • Semi-Automatic Video Object Segmentation Using LVQ with Color and Spatial Features

    Hariadi MOCHAMAD  Hui Chien LOY  Takafumi AOKI  

     
    PAPER-Image Processing and Multimedia Systems

      Vol:
    E88-D No:7
      Page(s):
    1553-1560

    This paper presents a semi-automatic algorithm for video object segmentation. Our algorithm assumes the use of multiple key video frames in which a semantic object of interest is defined in advance with human assistance. For video frames between every two key frames, the specified video object is tracked and segmented automatically using Learning Vector Quantization (LVQ). Each pixel of a video frame is represented by a 5-dimensional feature vector integrating spatial and color information. We introduce a parameter K to adjust the balance of spatial and color information. Experimental results demonstrate that the algorithm can segment the video object consistently with less than 2% average error when the object is moving at a moderate speed.
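
    As a rough illustration of the ideas in this abstract, the sketch below builds a 5-dimensional pixel feature and applies the standard LVQ1 update rule. The abstract does not give the exact feature layout or how K enters it, so scaling the two spatial coordinates by `k` and using three raw color components are assumptions, as are all function names.

```python
def pixel_feature(x, y, color, k):
    """5-D feature: two spatial coordinates scaled by k plus three color values.

    Scaling the position by k is an assumed way to balance spatial and color
    information; a larger k gives position more influence on distances.
    """
    return (k * x, k * y, color[0], color[1], color[2])


def nearest_prototype(prototypes, feature):
    """Index of the prototype closest to `feature` (squared Euclidean distance)."""
    def squared_distance(prototype):
        return sum((p - f) ** 2 for p, f in zip(prototype, feature))
    return min(range(len(prototypes)), key=lambda i: squared_distance(prototypes[i]))


def lvq1_update(prototypes, labels, feature, label, rate):
    """Standard LVQ1 step: move the winning prototype toward the sample if
    its label matches, and away from the sample otherwise."""
    winner = nearest_prototype(prototypes, feature)
    sign = 1.0 if labels[winner] == label else -1.0
    prototypes[winner] = tuple(
        p + sign * rate * (f - p) for p, f in zip(prototypes[winner], feature)
    )
    return winner
```

    After training on pixels from the key frames, a pixel in an intermediate frame would be assigned the label of its nearest prototype, for example `labels[nearest_prototype(prototypes, feature)]`.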

  • A Novel Computationally Adaptive Hardware Algorithm for Video Motion Estimation

    Vasily G. MOSHNYAGA  

     
    PAPER-Imaging Circuits and Algorithms

      Vol:
    E82-C No:9
      Page(s):
    1749-1754

    A new hardware algorithm for block-matching video motion estimation is presented. The algorithm works in full-search fashion but, unlike the Full-Search Block Matching Algorithm (FSBMA), it adjusts the number of computations dynamically to variable picture contents. Owing to an incorporated mechanism of data-driven thresholding, the proposed algorithm performs four times fewer operations than the FSBMA while maintaining the same quality of results. Its hardware implementation is simple and compact. A supporting hardware design as well as simulation results on benchmarks are outlined.
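
    The abstract does not describe its thresholding mechanism in detail. As a loosely related software illustration, the sketch below shows full-search block matching in which the sum of absolute differences (SAD) for a candidate is abandoned as soon as it can no longer beat the best match found so far. This is the well-known partial-distortion early-termination technique, swapped in for illustration; it is not the paper's hardware algorithm, and all names and the frame layout (lists of rows) are assumptions.

```python
def sad_with_cutoff(cur, ref, bx, by, dx, dy, n, cutoff):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Stops after any row once the partial sum
    reaches `cutoff`. Returns (sad_or_partial_sum, operations_performed)."""
    total = 0
    operations = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
            operations += 1
        if total >= cutoff:
            break  # this candidate cannot beat the current best match
    return total, operations


def full_search(cur, ref, bx, by, n, search):
    """Test every displacement in [-search, search]^2 that stays inside `ref`.
    Returns (best_motion_vector, best_sad, total_operations)."""
    height, width = len(ref), len(ref[0])
    best_vector, best_sad = None, float("inf")
    total_operations = 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            sad, operations = sad_with_cutoff(cur, ref, bx, by, dx, dy, n, best_sad)
            total_operations += operations
            if sad < best_sad:
                best_vector, best_sad = (dx, dy), sad
    return best_vector, best_sad, total_operations
```

    Because a candidate is abandoned only when its partial sum already equals or exceeds the best SAD, the returned vector matches an exhaustive search that keeps the first minimum, while the operation count adapts to the picture content.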

  • System Electronics Technologies for Video Processing and Applications

    Tomio KISHIMOTO  Hironori YAMAUCHI  Ryota KASAI  

     
    INVITED PAPER

      Vol:
    E82-A No:2
      Page(s):
    197-205

    Thanks to rapid progress in computer technology and VLSI technology, we are approaching the stage where ordinary PCs will be able to handle real-time video signals as easily as they handle text data. First, features and applications of the video compression standard MPEG2 are surveyed as a typical video processing application. It is clarified that real-time capability becomes more important as applications of MPEG2 spread widely. The trends of video coding in LSIs are summarized. It is also shown that the most advanced encoder/decoder LSI has an improved price-performance ratio that allows it to be adopted in consumer equipment. Finally, future directions of parallel architecture in video processing are surveyed in terms of special-purpose and general-purpose processing. The special-purpose approach has always taken the lead in video processing using sophisticated hardware-oriented parallel architectures. The general-purpose architecture method has gradually evolved in accordance with a software-oriented architecture. Both approaches will continue to evolve into a new stage by selecting possible parallel architectures, such as multimedia instruction sets and process-level parallelism, and applying them in compound use. The so-called super processor architecture will emerge in the near future, and it will be an ideal method that can manage the rapid increase in requirements for capability and applicability in video processing.