Masayuki HASHIMOTO Kenji MATSUO Atsushi KOIKE Yasuyuki NAKAJIMA
This paper proposes a tile size conversion method for a wavelet image transcoding gateway, together with a set of methods to reduce the tile boundary artifacts caused by the conversion. In wavelet image coding systems such as JPEG 2000, pictures are usually divided into one or more tiles, and each tile is then transformed separately. Decoders on low-memory terminals, such as mobile terminals, may be limited in the tile sizes they can decode. Assuming a system using such limited decoders, we investigated methods for converting the tile size quickly and automatically at the gateway when image data with a non-decodable tile size arrives from another system. Methods for reducing tile boundary artifacts are also investigated. This paper verifies the validity of the proposed scheme by implementing it with the (5, 3) reversible filter and the (9, 7) irreversible filter. In addition, we implemented the tile size conversion gateway and evaluated its processing time. The results demonstrate the validity of the conversion gateway.
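The tile-wise transform mentioned above can be illustrated with the (5, 3) reversible lifting scheme used in JPEG 2000. The sketch below is a minimal 1-D version with simple symmetric boundary extension; the function names and the even-length assumption are illustrative, not taken from the paper. Because the lifting steps use only integer operations, each tile reconstructs exactly, while the boundary extension at each tile edge is what gives rise to the tile boundary artifacts discussed above.

```python
import numpy as np

def fwd53(x):
    """Forward (5,3) reversible lifting on a 1-D integer signal of even length."""
    s = x[0::2].astype(np.int64)          # even samples -> low band
    d = x[1::2].astype(np.int64)          # odd samples  -> high band
    s_ext = np.append(s, s[-1])           # symmetric extension on the right
    d -= (s_ext[:-1] + s_ext[1:]) // 2    # predict step
    d_ext = np.insert(d, 0, d[0])         # symmetric extension on the left
    s += (d_ext[:-1] + d_ext[1:] + 2) // 4  # update step
    return s, d

def inv53(s, d):
    """Exact inverse: undo the update step, then the predict step."""
    s = s.copy()
    d = d.copy()
    d_ext = np.insert(d, 0, d[0])
    s -= (d_ext[:-1] + d_ext[1:] + 2) // 4
    s_ext = np.append(s, s[-1])
    d += (s_ext[:-1] + s_ext[1:]) // 2
    x = np.empty(s.size + d.size, dtype=np.int64)
    x[0::2] = s
    x[1::2] = d
    return x
```

Applying `fwd53` independently to each tile of a row reproduces the per-tile transform structure; converting the tile size then requires inverse-transforming, re-tiling, and re-transforming, which is where the gateway's processing cost arises.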
Hiroshi HASEGAWA Masao KASUGA Shuichi MATSUMOTO Atsushi KOIKE
HRTFs (head-related transfer functions) are useful for sound field reproduction with spatial fidelity, since they contain the acoustic cues used for perceiving the location of a sound image, such as the interaural time difference, the interaural intensity difference, and spectral cues. Generally, FIR filters are used to simulate HRTFs. However, this approach is unsuitable for a simple system, since the required FIR filter orders are high. In this paper, we propose a method using IIR filters for a simple realization of sound image localization. The HRTFs of a dummy head were approximated by the following filters: (A) fourth- to seventh-order IIR filters, and (B) third-order IIR filters. In total, the HRTFs of 24 different directions on the horizontal plane were used as the target characteristics. Sound localization experiments for the direction and the elevation angle of a sound image were carried out with three subjects in a soundproof chamber. Binaural signals generated using the HRTFs simulated by FIR filters and approximated by IIR filters (A) and (B) were reproduced via two loudspeakers, and sound image localization on the horizontal plane was realized. The experiments showed that sound image localization using the HRTFs approximated by IIR filters (A) achieves the same accuracy as that using the FIR filters. This result shows that sound fields with binaural reproduction can be created more simply.
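The efficiency argument above comes down to multiplies per output sample: an N-tap FIR filter needs N, while a low-order IIR difference equation needs only a handful. The sketch below shows both filter forms; the coefficients in the usage are illustrative, not actual HRTF approximations from the paper.

```python
import numpy as np

def fir(h, x):
    """FIR filtering by direct convolution: len(h) multiplies per sample."""
    return np.convolve(x, h)[:len(x)]

def iir(b, a, x):
    """Direct-form I IIR difference equation, a[0] assumed to be 1:
    y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc
    return y
```

For example, a third-order IIR filter (case (B) above) costs at most 7 multiplies per sample regardless of how long the impulse response it models is, which is the motivation for approximating the long FIR HRTF responses with low-order IIR filters.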
Atsushi KOIKE Satoshi KATSUNO Yoshinori HATORI
Hybrid image coding is one of the most promising methods for efficient coding of moving images. It jointly makes use of motion-compensated prediction and an orthogonal transform such as the DCT. This type of coding scheme was adopted as the basic framework of several world standards, such as H.261 and MPEG, in ITU-T and ISO [1], [2]. Most work on motion-compensated prediction has been based on block matching. However, when input moving images include complicated motion such as rotation or enlargement, block matching often causes block distortion in decoded images, especially in very low bit-rate image coding. Recently, as one way of solving this problem, motion-compensated prediction methods based on an affine or bilinear transform have been developed [3]-[8]. These methods, however, cannot always express the apparent motion in the image plane, which is a perspective projection from 3-D space onto a 2-D plane. A motion-compensation method using a perspective transform was discussed in Ref. [6]; however, since its motion detection is defined as an extension of block matching, it cannot always detect motion parameters as accurately as gradient-based motion detection. In this paper, we propose a new motion-compensated prediction method for coding of moving images, especially for very low bit-rate image coding at less than 64 kbit/s. The proposed method is based on a perspective transform and the constraint principle for the temporal and spatial gradients of pixel values; in addition to translational motion, complicated motion in the image plane, including rotation and enlargement due to camera zooming, can also be detected theoretically. A computer simulation was performed using moving test images, and the resulting predicted images were compared with those of conventional methods such as block matching in terms of SNR and entropy.
The results showed that the SNR and entropy of the proposed method are better than those of the conventional methods. The proposed method was also applied to very low bit-rate image coding at 16 kbit/s and compared with a conventional method, H.261. The resulting SNR and decoded images of the proposed method were better than those of H.261. We conclude that the proposed method is effective as a motion-compensated prediction method.
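The gradient constraint principle underlying the method above states that, for small displacements, the spatial and temporal gradients of the pixel values satisfy I_x u + I_y v + I_t = 0 at each pixel. The paper solves for full perspective-transform parameters; the sketch below is only the simplest translational case, solved by least squares over a block, to show how gradient-based detection differs from block matching. The function name and tolerances are illustrative.

```python
import numpy as np

def estimate_translation(prev, curr):
    """Least-squares solution of the gradient constraint
    Ix*u + Iy*v + It = 0 over a block (translational motion only)."""
    Ix = np.gradient(prev, axis=1)   # spatial gradient, x direction
    Iy = np.gradient(prev, axis=0)   # spatial gradient, y direction
    It = curr - prev                 # temporal gradient (frame difference)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

Extending the same least-squares formulation to the eight parameters of a perspective transform is what allows rotation and zoom-induced enlargement to be detected, whereas block matching would have to search that parameter space exhaustively.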
Hiroshi HASEGAWA Miyoshi AYAMA Shuichi MATSUMOTO Atsushi KOIKE Koichi TAKAGI Masao KASUGA
In this paper, the effects of visual information on associated auditory information were investigated when the two were presented simultaneously under dynamic conditions on a wide screen. Auditory-visual stimulus presentation experiments using a computer graphics movie of a moving patrol car and its siren sound, combined in various locations, were performed on 19 subjects. The experimental results showed the following: the visual stimulus at the beginning of the presentation captured the sound image more strongly than that at the end (the beginning effect); the sound image separated from the visual image even when both stimulus locations were exactly the same and when both stimuli moved in opposite directions; the visual stimulus captured the sound image more strongly in the peripheral visual field than in the central visual field; and a visual stimulus moving toward the sound source captured the sound image more strongly than one moving away from it.
Masayuki HASHIMOTO Kenji MATSUO Atsushi KOIKE
This paper proposes an effective JPEG 2000 encoding method for reducing tiling artifacts, which are one of the biggest problems in JPEG 2000 encoders. Symmetric pixel extension is generally thought to be the main cause of these artifacts. However, this paper shows that differences in quantization accuracy between tiles are a more significant cause of tiling artifacts at middle or low bit rates. This paper also proposes an algorithm that predicts, in the rate control process, whether tiling artifacts will occur at a tile boundary, and that locally improves quantization accuracy through an original post-quantization control. This paper further proposes a method for reducing processing time, which is another serious problem in JPEG 2000 encoders: the method predicts truncation points using the entropy of the wavelet transform coefficients prior to arithmetic coding. These encoding methods require no additional processing in the decoder. Experiments confirmed that tiling artifacts were greatly reduced and that the coding process was considerably accelerated.
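The idea of predicting truncation points from coefficient entropy can be sketched as follows: the first-order entropy of the quantized wavelet coefficients estimates the rate the arithmetic coder would produce, so bitplanes can be dropped until the estimate fits the budget, without actually running the coder. This is a hypothetical illustration of the principle, not the paper's rate-control algorithm; the function names and the bitplane-dropping rule are assumptions.

```python
import numpy as np

def entropy_bits(coeffs):
    """First-order entropy (bits/sample) of quantized coefficients."""
    _, counts = np.unique(coeffs, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def predict_truncation(coeffs, budget):
    """Smallest number of dropped bitplanes whose entropy estimate
    fits the per-sample rate budget (illustrative rule)."""
    for shift in range(32):
        if entropy_bits(coeffs >> shift) <= budget:
            return shift
    return 32
```

Because the entropy is computed directly from the coefficient histogram, this prediction runs in a single pass over the subband, which is the source of the speed-up claimed above.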
Ryoichi KAWADA Osamu SUGIMOTO Atsushi KOIKE
As digital television transmission becomes ubiquitous, a method that can remotely monitor the quality of the final and intermediate pictures is urgently needed. In particular, the case where standards conversion is included in the transmission chain is a serious issue, as the input and output cannot simply be compared. This letter proposes a novel method to solve this issue. The combination of skipping fields/pixels and the previously proposed SSSWHT-RR method, using the correlation coefficients and variance of the picture, achieves accurate detection of picture failure.
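The reduced-reference principle behind the scheme above can be sketched in a few lines: the sender transmits only a small feature set drawn from skipped fields/pixels, and the receiver flags failure when the correlation between its own samples and the sender's features collapses. This is a simplified illustration of the reduced-reference idea, not the actual SSSWHT-RR algorithm; the sampling step and threshold are assumptions.

```python
import numpy as np

def rr_features(frame, step=4):
    """Tiny sender-side feature set: subsample pixels on a coarse grid."""
    return frame[::step, ::step].astype(float).ravel()

def failure_detected(sent_feat, received, step=4, corr_thresh=0.9):
    """Receiver side: correlate local samples against the sent features."""
    recv_feat = rr_features(received, step)
    c = np.corrcoef(sent_feat, recv_feat)[0, 1]   # Pearson correlation
    return c < corr_thresh
```

The appeal of this structure is that only the feature vector travels to the monitoring point, so quality can be checked remotely without access to the full reference picture, even when a standards converter sits between input and output.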