Yutaka TAKAGI Takanori FUJISAWA Masaaki IKEHARA
In this paper, we propose a method for removing the block noise that appears in JPEG (Joint Photographic Experts Group) encoded images. We iteratively perform 3D Wiener filtering and correction of the coefficients. In the Wiener filtering step, we perform block matching for each patch to collect the patches that are highly similar to the reference patch. After Wiener filtering, the collected patches are returned to their original positions and aggregated. We compare the proposed method with several conventional methods and show that it achieves excellent performance.
Yibo FAN Leilei HUANG Zheng XIE Xiaoyang ZENG
In the newly finalized video coding standard, high efficiency video coding (HEVC), new concepts such as the coding unit (CU), prediction unit (PU) and transform unit (TU) are introduced to improve coding performance. As a result, the reconstruction loop in intra encoding is heavily burdened with choosing the best partitions or modes for them. To resolve the bottlenecks in cycle and hardware cost, this paper proposes a high-throughput and compact implementation of this reconstruction loop. “High-throughput” means that the design has a fixed throughput of 32 pixels/cycle independent of the TU/PU size (except for 4×4 TUs). “Compact” means that it fully exploits the reusability between the discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT), as well as between quantization (Q) and de-quantization (IQ). Besides these hardware design contributions, this paper also provides a universal formula for analyzing the cycle cost of the reconstruction loop and proposes a parallel-process scheme to further reduce that cost. The design is verified on a Stratix IV FPGA. The basic structure achieves a maximum frequency of 150 MHz with a hardware cost of 64K ALUTs, which can support real-time TU/PU partition decision for 4K×2K@20fps videos.
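The throughput figures above can be turned into a rough clock budget. The following is only a back-of-the-envelope sketch: it assumes that "4K×2K" means 4096×2048 luma samples and ignores chroma and pipeline overhead, none of which is stated in the abstract. It does not reproduce the paper's universal cycle-cost formula.

```python
# Assumptions (not from the abstract): 4K×2K is read as 4096×2048 luma
# samples; chroma, stalls and pipeline overhead are ignored.
WIDTH, HEIGHT, FPS = 4096, 2048, 20
CLOCK_HZ = 150_000_000      # maximum frequency reported in the abstract
PIXELS_PER_CYCLE = 32       # fixed throughput reported in the abstract

pixels_per_second = WIDTH * HEIGHT * FPS
# Cycles per second needed to push every pixel through the loop once.
cycles_per_pass = pixels_per_second / PIXELS_PER_CYCLE
# How many such full passes fit into one second of clock cycles.
passes_budget = CLOCK_HZ / cycles_per_pass

print(f"{pixels_per_second} pixels/s")
print(f"{cycles_per_pass:.0f} cycles/s for one pass over every pixel")
print(f"about {passes_budget:.1f} passes per pixel fit in the clock budget")
```

Under these assumptions, each pixel could go through the loop a few tens of times per frame period, which is the headroom available for trying candidate partitions and modes.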
Ngoc-Giao PHAM Suk-Hwan LEE Ki-Ryong KWON
Nowadays, vector map content is widely used in everyday life, science and the military. Because vector maps are highly valuable and expensive to produce, a large volume of vector map data is attacked, stolen and illegally distributed by pirates. Thus, vector map data must be encrypted before being stored and transmitted in order to secure access and prevent illegal copying. This paper presents a novel perceptual encryption algorithm for the secure storage and transmission of vector map data. Polyline data of vector maps are extracted to interpolate a spline curve, which is represented by an interpolating vector, the curvature degree coefficients, and control points. The proposed algorithm transforms the control points of the spline curve into the frequency domain of the discrete cosine transform and selectively encrypts them there. The encrypted control points are then used in an inverse interpolation to generate the encrypted vector map. Experimental results show that the entire vector map is altered after the encryption process, and that the proposed algorithm is very effective for a large dataset of vector maps.
We present a lifting-based lapped transform (L-LT) and a reversible symmetric extension (RSE) for boundary processing, aimed at more effective lossy-to-lossless image coding, which provides data of various qualities from a single piece of losslessly compressed data. The proposed dual-DCT-lifting-based LT (D2L-LT) processes two identical LTs in parallel and consists of 1-D and 2-D DCT-liftings, which allow the direct use of a DCT matrix in each lifting coefficient. Since the DCT-lifting can utilize any existing DCT software or hardware, it has great potential for elegant implementations tailored to the architecture and DCT algorithm used. In addition, we present an improved RSE (IRSE) that works by recalculating the boundary processing and solves the boundary problem of the DCT-lifting-based L-LT (DL-LT). We show that D2L-LT with IRSE mostly outperforms conventional L-LTs in lossy-to-lossless image coding.
Jin XU Yuansong QIAO Zhizhong FU
Because the perceptual compressive sensing framework can achieve much better performance than the legacy compressive sensing framework, it is very promising for compressive sensing based image compression systems. In this paper, we propose an innovative adaptive perceptual block compressive sensing scheme. First, we devise a new block-based statistical metric that more appropriately measures each block's sparsity and perceptual sensitivity. Then, the approximate theoretical minimum number of measurements for each block is derived from the new metric and used as a weight for adaptive measurement allocation. The experimental results show that our scheme can significantly enhance both the objective and subjective performance of a perceptual compressive sensing framework.
Yazhong ZHANG Jinjian WU Guangming SHI Xuemei XIE Yi NIU Chunxiao FAN
Reduced-reference (RR) image quality assessment (IQA) algorithms aim to automatically evaluate the quality of a distorted image using only partial reference data. The goal of an RR IQA metric is to achieve higher quality prediction accuracy using less reference information. In this paper, we introduce a new RR IQA metric that quantifies the difference in discrete cosine transform (DCT) entropy features between the reference and distorted images. Neurophysiological evidence indicates that the human visual system has different sensitivities to different frequency bands. Moreover, distortions in different bands cause different quality degradations. Therefore, we propose calculating the information degradation in each band separately for quality assessment. The information degradations are first measured by the entropy difference of reorganized DCT coefficients. Then, the entropy differences in all bands are pooled to obtain the quality score. Experimental results on the LIVE, CSIQ, TID2008, Toyama and IVC databases show that the proposed method is highly consistent with human perception while using limited reference data (8 values).
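The band-wise entropy comparison described above can be sketched as follows. This is an illustration only: the histogram entropy estimator, the `bin_width` parameter, the function names and the plain-sum pooling are assumptions made for the example, and the grouping of DCT coefficients into bands is taken as already done. The abstract does not specify the paper's own reorganization or pooling rule.

```python
import math
from collections import Counter

def band_entropy(coeffs, bin_width=1.0):
    """Shannon entropy (bits) of one band's coefficients after uniform binning.

    bin_width is a hypothetical parameter chosen for illustration.
    """
    bins = Counter(round(c / bin_width) for c in coeffs)
    total = len(coeffs)
    return -sum((n / total) * math.log2(n / total) for n in bins.values())

def rr_score(ref_bands, dist_bands, bin_width=1.0):
    """Pool per-band entropy differences; here pooling is a plain sum (assumed).

    ref_bands and dist_bands are lists of coefficient lists, one per band.
    Only the reference entropies would need to be transmitted.
    """
    diffs = [
        abs(band_entropy(r, bin_width) - band_entropy(d, bin_width))
        for r, d in zip(ref_bands, dist_bands)
    ]
    return sum(diffs)
```

In this sketch a score of zero means the per-band entropies match, and larger values mean a larger change in information content in one or more bands.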
Chun-Hung CHEN Yuan-Liang TANG Wen-Shyong HSIEH
Digital watermarking techniques have been used to assert ownership of digital images. The ownership information is embedded in an image as a watermark so that the owner of the image can be identified. However, many types of attacks have been used in attempts to break or remove embedded watermarks. Therefore, the watermark should be very robust against various kinds of attacks. Among them, the print-and-scan (PS) attack is very challenging because it not only alters the pixel values but also changes the positions of the original pixels. In this paper, we propose a watermarking system operating in the discrete cosine transform (DCT) domain. The polarities of the DCT coefficients are modified for watermark embedding, taking into account the properties of DCT coefficients under the PS attack. The proposed system maintains the image quality after watermarking, and the embedded watermark is very robust against the PS attack.
We propose a method for computing linear convolution and linear correlation between sequences using the discrete cosine transform (DCT). As in linear convolution using the discrete Fourier transform (DFT), zero-padding is considered. By analyzing the circular convolution between symmetrically extended sequences, we derive the condition for zero-padding before and after the sequences. The proposed method can calculate linear convolution for any filter and can also calculate linear correlation without reversing one of the input sequences. The computational complexity of the proposed method is lower than that of linear convolution using the DFT.
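For context, the DFT baseline mentioned above can be sketched in a few lines: zero-padding both sequences to length N1 + N2 - 1 makes circular convolution equal to linear convolution. This shows only the well-known DFT approach, not the proposed DCT-based method. The function names are illustrative, and a naive O(N^2) DFT is used in place of an FFT to keep the example self-contained.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); the inverse includes the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def linear_convolution_dft(a, b):
    """Linear convolution via zero-padding to len(a) + len(b) - 1 and the DFT."""
    n = len(a) + len(b) - 1
    a_pad = list(a) + [0.0] * (n - len(a))
    b_pad = list(b) + [0.0] * (n - len(b))
    spectrum = [p * q for p, q in zip(dft(a_pad), dft(b_pad))]
    return [v.real for v in dft(spectrum, inverse=True)]

def linear_convolution_direct(a, b):
    """Reference implementation: direct sum of products."""
    y = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            y[i + j] += ai * bj
    return y
```

Without the padding, the tail of the result would wrap around and add to the first samples, which is why a zero-padding condition is also needed in the DCT setting.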
This paper presents M-channel (M=2^n, n ∈ N) integer discrete cosine transforms (IntDCTs) based on the fast Hartley transform (FHT) for lossy-to-lossless image coding, which offers image quality scalability from lossy data to lossless data. Many IntDCTs with lifting structures have already been presented to achieve lossy-to-lossless image coding. Recently, an IntDCT based on direct-lifting of DCT/IDCT, which means direct use of the DCT and inverse DCT (IDCT) in lifting blocks, has been proposed. Although this IntDCT shows more efficient coding performance than any conventional IntDCT, it incurs a high computational cost because of the extra information that is key to realizing its direct-lifting structure. On the other hand, most conventional IntDCTs without extra information cannot easily be expanded beyond the standard size M=8, and those that do support an arbitrary size still need improved coding performance. The proposed IntDCT does not need any extra information, can be applied to size M=2^n for arbitrary n, and shows better coding performance than the conventional IntDCTs without extra information by applying direct-lifting to the pre- and post-processing blocks of the DCT. Moreover, the proposed IntDCT is implemented with half the computational cost of the IntDCT based on direct-lifting of DCT/IDCT, even though the latter shows the best coding performance.
Phat NGUYEN HUU Vinh TRAN-QUANG Takumi MIYOSHI
This paper proposes two algorithms to balance energy consumption among sensor nodes by distributing the workload of image compression tasks within a cluster in wireless sensor networks. The main point of the proposed algorithms is to adopt an energy threshold, which is used when tasks are exchanged and/or assigned among sensor nodes. The threshold adapts well to the residual energy of sensor nodes, the input image, the compressed output, and the network parameters. In the proposed algorithms, we apply the lapped transform technique, an extended version of the discrete cosine transform, and run-length encoding before Lempel-Ziv-Welch coding to improve both quality and compression rate in the image compression scheme. We conduct extensive computational experiments to verify our methods and find that the proposed algorithms not only balance the total energy consumption among sensor nodes, and thus increase the overall network lifetime, but also reduce block noise in image compression.
This paper presents an integer discrete cosine transform (IntDCT) with only dyadic values such as k/2^n (k, n ∈ N). Although some conventional IntDCTs have been proposed, they are not suitable for lossless-to-lossy image coding with low-bit-word-length coefficients because the frequency decomposition performance of the system degrades. First, the proposed M-channel lossless Walsh-Hadamard transform (LWHT) can be constructed with only a (log2 M)-bit word length and has structural regularity. Then, our 8-channel IntDCT via LWHT keeps good coding performance even when a low bit word length is used, because the LWHT, which is the main part of the IntDCT, can be implemented with only a 3-bit word length. Finally, the validity of our method is demonstrated by the results of lossless-to-lossy image coding with a low bit word length.
Daisuke TAKEDA Yasuhiko TANABE
Channel estimation is a key baseband processing task in wireless systems. Filtering or smoothing algorithms can improve the accuracy of channel estimates, and the Discrete Cosine Transform (DCT) can be used for this purpose. Using the DCT improves performance compared with the straightforward approach of per-subcarrier estimation (PSE). However, the complexity of the DCT is not negligible. This paper proposes a low-complexity channel estimation scheme using the DCT. Simulation results show that the performance is improved by more than 1 dB compared with PSE in a MIMO-OFDM system.
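The general idea of DCT-based smoothing can be sketched as follows: transform the per-subcarrier estimates, keep only the low-order coefficients, and transform back. This is a generic illustration, not the low-complexity scheme proposed in the paper. The names `smooth_channel` and `keep`, and the separate handling of real and imaginary parts, are assumptions made for the example.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence (naive O(N^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def smooth(values, keep):
    """Zero all but the first `keep` DCT coefficients, then transform back."""
    coeffs = dct2(values)
    truncated = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    return idct2(truncated)

def smooth_channel(h_est, keep):
    """Smooth complex per-subcarrier estimates (real and imaginary parts separately)."""
    re = smooth([h.real for h in h_est], keep)
    im = smooth([h.imag for h in h_est], keep)
    return [complex(r, i) for r, i in zip(re, im)]
```

Because a channel that varies slowly across subcarriers concentrates its energy in the low-order DCT coefficients, discarding the rest removes mostly noise. The naive transforms here cost O(N^2) operations.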
Jiang YIWEI Xu DE Liu NA Lang CONGYAN
Moving object completion is the process of completing a moving object's missing information based on local structures. Over the past few years, a number of computable algorithms for video completion have been developed; however, most of these algorithms work in the pixel domain. Little theoretical and computational work on video completion is based on the compressed domain. In this paper, a moving object completion method in the compressed domain is proposed. It is composed of three steps: motion field transferring, thin plate spline interpolation, and combination. Missing space-time blocks are completed by placing new motion vectors on them so that the resulting video sequence has as much global visual coherence as possible with the video portions outside the hole. Experimental results are presented to demonstrate the efficiency and accuracy of the proposed algorithm.
Sung-Chang LIM Dae-Yeon KIM Yung-Lyul LEE
In this paper, an alternative transform based on the correlation of the residual block is proposed to improve the coding efficiency of H.264/AVC. A discrete sine transform is used alternately with a discrete cosine transform in order to greatly compact the energy of the signal when the correlation coefficients of the signal are in the range of -0.5 to 0.5. Therefore, we suggest using the discrete sine transform in conjunction with the discrete cosine transform in H.264/AVC. The alternative transform, which selects the better of the two transforms by rate-distortion optimization, shows a coding gain compared with H.264/AVC. The proposed method achieves a PSNR gain of up to 1.0 dB compared to JM 10.2 at relatively high bitrates.
Yi-Wei JIANG De XU Moon-Ho LEE Cong-Yan LANG
Visual inpainting is an interpolation problem that restores an image or a frame with missing or damaged parts. Over the past decades, a number of computable models of visual inpainting have been developed, but most of these models work in the pixel domain. Little theoretical and computational work on visual inpainting is based on the compressed domain. In this paper, a visual inpainting model in the discrete cosine transform (DCT) domain is proposed. DCT coefficients of the non-inpainting blocks are used to obtain block features, and those block features are propagated to the inpainting region iteratively. Experimental results with I frames of MPEG-4 are presented to demonstrate the efficiency and accuracy of the proposed algorithm.
Kiyotaka WATANABE Yoshio IWAI Hajime NAGAHARA Masahiko YACHIDA Toshiya SUZUKI
We propose a novel strategy to obtain a high spatio-temporal resolution video. To this end, we introduce a dual sensor camera that can capture two video sequences with the same field of view simultaneously. One sequence has high resolution at a low frame rate, and the other has low resolution at a high frame rate. This paper presents an algorithm to synthesize a high spatio-temporal resolution video from these two video sequences by using motion compensation and spectral fusion. We confirm that the proposed method improves the resolution and frame rate of the synthesized video.
In recent years, digital watermarking has become a popular technique for labeling digital images by hiding secret information that can protect the copyright. The goal of this paper is to develop a DCT-based watermarking algorithm for low power and high performance. Our energy-efficient technique focuses on reducing the computation required for block-based permutation. Instead of using the spatial coefficients proposed in Hsu and Wu's algorithm [1], we use DCT coefficients to pair blocks directly. The approach is implemented in C, and its power dissipation is estimated using the Wattch toolset. The experimental results show that our approach not only reduces the energy consumption of the pairing mechanism by 99%, but also increases the PSNR by 0.414 dB in the best case. Moreover, the proposed approach is robust to a variety of signal distortions, such as JPEG compression, image cropping, sharpening, blurring, and intensity adjustment.
In this letter, fast one-dimensional (1-D) and two-dimensional (2-D) algorithms for realizing the low-complexity 4×4 discrete cosine transform (DCT) for H.264 applications are developed. By applying matrix operations with the Kronecker product and direct sum, the efficient fast 2-D 4×4 DCT algorithm is developed from the proposed fast 1-D 4×4 DCT algorithm through matrix decompositions. The fast 1-D and 2-D low-complexity 4×4 DCT algorithms require fewer multiplications and additions than other fast DCT algorithms. Owing to their regular modularity, the proposed fast algorithms can achieve real-time H.264 video signal processing with VLSI implementation.
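The link between the 1-D and 2-D transforms through the Kronecker product can be checked numerically. The sketch below uses the standard H.264 4×4 forward core transform matrix without its scaling stage and shows that the separable form C·X·Cᵀ equals (C ⊗ C) applied to the row-major vectorized block. It illustrates the matrix identity only, not the fast algorithm or the decompositions of the letter. The helper names are illustrative.

```python
# Standard H.264 4x4 forward core transform matrix (scaling stage omitted).
C = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]

def transform_separable(x):
    """2-D transform as two 1-D passes: C * X * C^T."""
    return matmul(matmul(C, x), transpose(C))

def transform_kron(x):
    """Same 2-D transform as one 16x16 matrix acting on the row-major vec(X)."""
    vec = [v for row in x for v in row]
    big = kron(C, C)
    y = [sum(big[r][c] * vec[c] for c in range(16)) for r in range(16)]
    return [y[4 * i:4 * i + 4] for i in range(4)]
```

For a flat block of ones, both functions return 16 in the top-left (DC) position and zeros elsewhere, since every row of C except the first sums to zero.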
The rate-distortion optimization (RDO) method is a technique that improves the coding efficiency of the H.264 encoder but increases its computational complexity. In this letter, a fast macroblock mode determination algorithm is proposed to reduce the computational complexity of the H.264 encoder. The proposed method reduces the encoder complexity by 55% while maintaining the same level of coding efficiency.
In this paper, a new watermarking technique combined with the Joint Photographic Experts Group (JPEG) encoding system is presented. This method operates in the frequency domain by embedding a pseudo-random sequence of real numbers in a selected set of discrete cosine transform (DCT) coefficients. The embedded sequence is extracted without restoring the original image, to fit the trend in digital still camera (DSC) systems. The proposed technique represents a major improvement on methods that rely on a comparison between the watermarked and original images. Experimental results show that the proposed watermarking method is robust to several common image processing techniques, including JPEG compression, noise, and blurring. We also implement the whole design by synthesizing it with a TSMC 1P4M 0.35 µm standard cell library. The chip size is 3.064×3.064 mm² with a gate count of 46374. The simulation speed can reach 50 MHz. The power dissipation is 69 mW at 3.3 V and 50 MHz.