1-9hit |
Kenta KURIHARA Masanori KIKUCHI Shoko IMAIZUMI Sayaka SHIOTA Hitoshi KIYA
In many multimedia applications, image encryption has to be conducted prior to image compression. This paper proposes a JPEG-friendly perceptual encryption method, which enables to be conducted prior to JPEG and Motion JPEG compressions. The proposed encryption scheme can provides approximately the same compression performance as that of JPEG compression without any encryption, where both gray scale images and color ones are considered. It is also shown that the proposed scheme consists of four block-based encryption steps, and provide a reasonably high level of security. Most of conventional perceptual encryption schemes have not been designed for international compression standards, but this paper focuses on applying the JPEG and Motion JPEG standards, as one of the most widely used image compression standards. In addition, this paper considers an efficient key management scheme, which enables an encryption with multiple keys to be easy to manage its keys.
Yuma KINOSHITA Sayaka SHIOTA Masahiro IWAHASHI Hitoshi KIYA
A number of successful tone mapping operators (TMOs) for contrast compression have been proposed due to the need to visualize high dynamic range (HDR) images on low dynamic range devices. This paper proposes a novel inverse tone mapping (TM) operation and a new remapping framework with the operation. Existing inverse TM operations require either the store of some parameters calculated in forward TM, or data-depended operations. The proposed inverse TM operation enables to estimate HDR images from LDR ones mapped by the Reinhard's global operator, not only without keeping any parameters but also without any data-depended calculation. The proposed remapping framework with the inverse operation consists of two TM operations. The first TM operation is carried out by the Reinhard's global operator, and then the generated LDR one is stored. When we want different quality LDR ones, the proposed inverse TM operation is applied to the stored LDR one to generate an HDR one, and the second TM operation is applied to the HDR one to generate an LDR one with desirable quality, by using an arbitrary TMO. This framework allows not only to visualize an HDR image on low dynamic range devices at low computing cost, but also to efficiently store an HDR one as an LDR one. In simulations, it is shown that the proposed inverse TM operation has low computational cost, compared to the conventional ones. Furthermore, it is confirmed that the proposed framework allows to remap the stored LDR one to another LDR one whose quality is the same as that of the LDR one remapped by the conventional inverse TMO with parameters.
Sayaka SHIOTA Kei HASHIMOTO Yoshihiko NANKAKU Keiichi TOKUDA
This paper proposes an acoustic modeling technique based on Bayesian framework using multiple model structures for speech recognition. The aim of the Bayesian approach is to obtain good prediction of observation by marginalizing all variables related to generative processes. Although the effectiveness of marginalizing model parameters was recently reported in speech recognition, most of these systems use only “one” model structure, e.g., topologies of HMMs, the number of states and mixtures, types of state output distributions, and parameter tying structures. However, it is insufficient to represent a true model distribution, because a family of such models usually does not include a true distribution in most practical cases. One of solutions of this problem is to use multiple model structures. Although several approaches using multiple model structures have already been proposed, the consistent integration of multiple model structures based on the Bayesian approach has not seen in speech recognition. This paper focuses on integrating multiple phonetic decision trees based on the Bayesian framework in HMM based acoustic modeling. The proposed method is derived from a new marginal likelihood function which includes the model structures as a latent variable in addition to HMM state sequences and model parameters, and the posterior distributions of these latent variables are obtained using the variational Bayesian method. Furthermore, to improve the optimization algorithm, the deterministic annealing EM (DAEM) algorithm is applied to the training process. The proposed method effectively utilizes multiple model structures, especially in the early stage of training and this leads to better predictive distributions and improvement of recognition performance.
Kenta KURIHARA Shoko IMAIZUMI Sayaka SHIOTA Hitoshi KIYA
In many multimedia applications, image encryption has to be conducted prior to image compression. This letter proposes an Encryption-then-Compression system using JPEG XR/JPEG-LS friendly perceptual encryption method, which enables to be conducted prior to the JPEG XR/JPEG-LS standard used as an international standard lossless compression method. The proposed encryption scheme can provides approximately the same compression performance as that of the lossless compression without any encryption. It is also shown that the proposed system consists of four block-based encryption steps, and provides a reasonably high level of security. Existing conventional encryption methods have not been designed for international lossless compression standards, but for the first time this letter focuses on applying the standards.
Chihiro GO Yuma KINOSHITA Sayaka SHIOTA Hitoshi KIYA
This paper proposes a novel multi-exposure image fusion (MEF) scheme for single-shot high dynamic range imaging with spatially varying exposures (SVE). Single-shot imaging with SVE enables us not only to produce images without color saturation regions from a single-shot image, but also to avoid ghost artifacts in the producing ones. However, the number of exposures is generally limited to two, and moreover it is difficult to decide the optimum exposure values before the photographing. In the proposed scheme, a scene segmentation method is applied to input multi-exposure images, and then the luminance of the input images is adjusted according to both of the number of scenes and the relationship between exposure values and pixel values. The proposed method with the luminance adjustment allows us to improve the above two issues. In this paper, we focus on dual-ISO imaging as one of single-shot imaging. In an experiment, the proposed scheme is demonstrated to be effective for single-shot high dynamic range imaging with SVE, compared with conventional MEF schemes with exposure compensation.
Yuma KINOSHITA Sayaka SHIOTA Hitoshi KIYA
This paper proposes a new inverse tone mapping operator (TMO) with estimated parameters. The proposed inverse TMO is based on Reinhard's global operator which is a well-known TMO. Inverse TM operations have two applications: generating an HDR image from an existing LDR one, and reconstructing an original HDR image from the mapped LDR image. The proposed one can be applied to both applications. In the latter application, two parameters used in Reinhard's TMO, i.e. the key value α regarding brightness of a mapped LDR one and the geometric mean $overline{L}_w$ of an original HDR one, are generally required for carrying out the Reinhard based inverse TMO. In this paper, we show that it is possible to estimate $overline{L}_w$ from α under some conditions, while α can be also estimated from $overline{L}_w$, so that a new inverse TMO with estimated parameter is proposed. Experimental results show that the proposed method outperforms conventional ones for both applications, in terms of high structural similarities and low computational costs.
Ayana KAWAMURA Yuma KINOSHITA Takayuki NAKACHI Sayaka SHIOTA Hitoshi KIYA
We propose a privacy-preserving machine learning scheme with encryption-then-compression (EtC) images, where EtC images are images encrypted by using a block-based encryption method proposed for EtC systems with JPEG compression. In this paper, a novel property of EtC images is first discussed, although EtC ones was already shown to be compressible as a property. The novel property allows us to directly apply EtC images to machine learning algorithms non-specialized for computing encrypted data. In addition, the proposed scheme is demonstrated to provide no degradation in the performance of some typical machine learning algorithms including the support vector machine algorithm with kernel trick and random forests under the use of z-score normalization. A number of facial recognition experiments with are carried out to confirm the effectiveness of the proposed scheme.
Ryota KAMINISHI Haruna MIYAMOTO Sayaka SHIOTA Hitoshi KIYA
This study evaluates the effects of some non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and light-weight BWE approach. Other non-learning BWEs have also been developed in recent years. For ASV evaluations, most data available to train ASV systems is narrowband (NB) telephone speech. Meanwhile, wideband (WB) data have been used to train the state-of-the-art ASV systems, such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all datasets are used. In this paper, we investigate the influence of sampling rate mismatches in the x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) on ASV systems based on the x-vector when the mismatches were present. We researched the relationship between objective measurements and EERs. Consequently, the N-BWE method produced the lowest EERs on both ASV systems and obtained the lower RMS-LSD value and the higher STOI score.
Yuma KINOSHITA Sayaka SHIOTA Hitoshi KIYA
This paper proposes a novel pseudo multi-exposure image fusion method based on a single image. Multi-exposure image fusion is used to produce images without saturation regions, by using photos with different exposures. However, it is difficult to take photos suited for the multi-exposure image fusion when we take a photo of dynamic scenes or record a video. In addition, the multi-exposure image fusion cannot be applied to existing images with a single exposure or videos. The proposed method enables us to produce pseudo multi-exposure images from a single image. To produce multi-exposure images, the proposed method utilizes the relationship between the exposure values and pixel values, which is obtained by assuming that a digital camera has a linear response function. Moreover, it is shown that the use of a local contrast enhancement method allows us to produce pseudo multi-exposure images with higher quality. Most of conventional multi-exposure image fusion methods are also applicable to the proposed multi-exposure images. Experimental results show the effectiveness of the proposed method by comparing the proposed one with conventional ones.