This paper discusses design challenges and possible solutions for 3D NAND. A 3D NAND array inherently has a larger parasitic capacitance and therefore a larger critical area in terms of product yield. To mitigate these issues, array control and a divided array architecture are proposed for improving reliability and yield and for reducing area overhead, program time, energy per bit and array noise.
Sheng HU Chuan XIAO Yoshiharu ISHIKAWA
Query autocompletion is an important and practical technique when users want to search for desired information. As mobile devices become more and more popular, one of the main applications is location-aware service, such as Web mapping. In this paper, we propose a new solution to location-aware query autocompletion. We devise a trie-based index structure and integrate spatial information into trie nodes. Our method is able to answer both range and top-k queries. In addition, we discuss the extension of our method to support error-tolerant matching in case users' queries contain typographical errors. Experiments on real datasets show that the proposed method outperforms existing methods in terms of query processing performance.
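The idea of storing spatial information in trie nodes can be sketched as follows. This is only a minimal illustration, not the authors' index: it assumes each trie node keeps a bounding box of all points stored beneath it, so that a range query can skip subtrees whose box does not overlap the query rectangle. The names SpatialTrie, insert and range_complete are hypothetical, and top-k and error-tolerant matching are not covered.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)
    # Bounding box (min_x, min_y, max_x, max_y) of every point stored below this node.
    box: tuple = None
    # Complete strings ending at this node, stored as (text, x, y).
    items: list = field(default_factory=list)


def _extend(box, x, y):
    """Grow a bounding box so that it also covers the point (x, y)."""
    if box is None:
        return (x, y, x, y)
    return (min(box[0], x), min(box[1], y), max(box[2], x), max(box[3], y))


def _overlaps(box, rng):
    """True if two rectangles given as (min_x, min_y, max_x, max_y) intersect."""
    return not (box[2] < rng[0] or box[0] > rng[2] or box[3] < rng[1] or box[1] > rng[3])


class SpatialTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, text, x, y):
        node = self.root
        node.box = _extend(node.box, x, y)
        for ch in text:
            node = node.children.setdefault(ch, Node())
            node.box = _extend(node.box, x, y)
        node.items.append((text, x, y))

    def range_complete(self, prefix, rng):
        """Return completions of prefix whose location lies inside rectangle rng."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        out, stack = [], [node]
        while stack:
            n = stack.pop()
            # Prune any subtree whose bounding box misses the query rectangle.
            if n.box is None or not _overlaps(n.box, rng):
                continue
            for text, x, y in n.items:
                if rng[0] <= x <= rng[2] and rng[1] <= y <= rng[3]:
                    out.append(text)
            stack.extend(n.children.values())
        return sorted(out)
```

In this sketch the pruning step is what the spatial annotation buys: a prefix such as "sta" may match many strings, but only subtrees whose boxes intersect the query rectangle are visited.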
The song-level feature summarization is an essential building block for browsing, retrieval, and indexing of digital music. This paper proposes a local pooling method to aggregate the feature vectors of a song over the universal background model. Two types of local activation patterns of feature vectors are derived; one representation is derived in the form of histogram, and the other is given by a binary vector. Experiments over three publicly-available music datasets show that the proposed local aggregation of the auditory features is promising for music-similarity computation.
Seong-Hyeon SHIN Woo-Jin JANG Ho-Won YUN Hochong PARK
A method for encoding detection and bit rate classification of AMR-coded speech is proposed. For each texture frame, 184 features consisting of the short-term and long-term temporal statistics of speech parameters are extracted, which can effectively measure the amount of distortion due to AMR. The deep neural network then classifies the bit rate of speech after analyzing the extracted features. It is confirmed that the proposed features provide better performance than the conventional spectral features designed for bit rate classification of coded audio.
In this letter, we analyze the performance of frequency offset estimation based on the maximum likelihood criterion and provide a theoretical proof that the mean squared error of the estimation grows as the offset increases. Moreover, we propose a new iterative offset estimation method based on the analysis. By computer simulations, we show that the proposed estimator can achieve the lowest estimation error after a few iterations.
Takuma NAKAJIMA Masato YOSHIMI Celimuge WU Tsutomu YOSHINAGA
Cooperative caching is a key technique to reduce rapidly growing video-on-demand traffic by aggregating multiple cache storages. Existing strategies periodically calculate a sub-optimal allocation of the content caches in the network. Although such a technique can reduce the traffic generated between servers, it comes at the cost of a large computational overhead. This overhead prevents the caches from following rapid changes in the access pattern. In this paper, we propose a light-weight scheme for cooperative caching that groups contents and servers with color tags. In our proposal, we associate servers and caches through a color tag, with the aim of increasing the effective cache capacity by storing different contents among servers. In addition to the color tags, we propose a novel hybrid caching scheme that divides its storage area into colored LFU (Least Frequently Used) and no-color LRU (Least Recently Used) areas. The colored LFU area stores color-matching contents to increase the cache hit rate, and the no-color LRU area follows rapid changes in access patterns by storing popular contents regardless of their tags. On top of the proposed architecture, we also present a new routing algorithm that takes advantage of the color tag information to reduce the traffic by fetching cached contents from the nearest server. Evaluation results, using a backbone network topology, showed that our color-tag based caching scheme could achieve a performance close to the sub-optimal one obtained with a genetic algorithm calculation, with only a few seconds of computational overhead. Furthermore, the proposed hybrid caching could limit the degradation of hit rate from 13.9% in conventional non-colored LFU to only 2.3%, which demonstrates the capability of our scheme to follow rapid insertions of new popular contents. Finally, the color-based routing scheme could reduce the traffic by up to 31.9% when compared with shortest-path routing.
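The split between a colored LFU area and a no-color LRU area can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the scheme evaluated in the paper: the class name HybridCache, the admission rule (a missed content goes to the LFU area when its color matches the server's, otherwise to the LRU area) and the eviction details are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class HybridCache:
    """Toy cache with a color-matching LFU area and a tag-agnostic LRU area."""

    def __init__(self, color, lfu_size, lru_size):
        self.color = color
        self.lfu_size = lfu_size
        self.lru_size = lru_size
        self.lfu = {}             # content id -> access count (color-matching contents only)
        self.lru = OrderedDict()  # content id -> None, least recently used first

    def access(self, content_id, content_color):
        """Return True on a hit; on a miss, cache the content and return False."""
        if content_id in self.lfu:
            self.lfu[content_id] += 1
            return True
        if content_id in self.lru:
            self.lru.move_to_end(content_id)  # mark as most recently used
            return True
        # Miss. Assumed admission rule: matching color -> LFU area, otherwise LRU area.
        if content_color == self.color and self.lfu_size > 0:
            if len(self.lfu) >= self.lfu_size:
                victim = min(self.lfu, key=self.lfu.get)  # least frequently used
                del self.lfu[victim]
            self.lfu[content_id] = 1
        elif self.lru_size > 0:
            if len(self.lru) >= self.lru_size:
                self.lru.popitem(last=False)  # evict least recently used
            self.lru[content_id] = None
        return False
```

The LRU area reacts immediately to newly requested contents, which is the property the abstract relies on for following rapid changes in access patterns, while the LFU area keeps the color-matching contents that are requested most often.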
Jingjing LIU Chao ZHANG Changyong PAN
In the advanced digital terrestrial/television multimedia broadcasting (DTMB-A) standard, a preamble based on distance detection (PBDD) is adopted for robust synchronization and signalling transmission. However, the traditional signalling detection method fails completely under severe frequency selective channels with 0dB echoes of ultra-long delay spread. In this paper, a novel transmission parameter signalling detection method is proposed for the preamble in DTMB-A. Compared with the conventional signalling detection method, the proposed scheme works much better when the maximum channel delay is close to the length of the guard interval (GI). Both theoretical analyses and simulation results demonstrate that the proposed algorithm significantly improves the accuracy and robustness of detecting the transmitted signalling.
Yonghyun BAEK Tegyu LEE Young-cheol PARK
In this letter, we propose an acoustic distance rendering (ADR) algorithm that can efficiently create the proximity effect in virtual reality (VR) systems. By observing the variation of acoustic cues caused by the movement of the sound source in the near field, we develop a model that can closely approximate the near-field transfer function (NFTF). The developed model is used to efficiently compensate for the near-field effect on the head related transfer function (HRTF). The proposed algorithm is implemented and tested in the form of an audio plugin for a VR platform, and the test results confirm the efficiency of the proposed algorithm.
Yang LI Zhuang MIAO Jiabao WANG Yafei ZHANG Hang LI
The latest deep hashing methods perform hash code learning and image feature learning simultaneously by using pairwise or triplet labels. However, generating all possible pairwise or triplet labels from the training dataset can quickly become intractable, and the majority of those samples may produce small costs, resulting in slow convergence. In this letter, we propose a novel deep discriminative supervised hashing method, called DDSH, which directly learns hash codes based on a new combined loss function. Compared to previous methods, our method can take full advantage of the annotated data in terms of pairwise similarity and image identities. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application. Remarkably, our 16-bit binary representation can surpass the performance of existing 48-bit binary representations, which demonstrates that our method can effectively improve the speed and precision of large scale image retrieval systems.
In statistical approaches such as statistical static timing analysis, the distribution of the maximum of plural distributions is computed by repeating a maximum operation of two distributions. Moreover, since each distribution is represented by a linear combination of several explanatory random variables so as to handle correlations efficiently, the sensitivity of the maximum of two distributions to each explanatory random variable, that is, the covariance between the maximum and an explanatory random variable, must be calculated in every maximum operation. Since the distribution of the maximum of two Gaussian distributions is not Gaussian, a Gaussian mixture model is used for representing a distribution. However, if Gaussian mixture models are used, then it is not always possible to make both the variance and the covariance of the maximum correct simultaneously. We propose a new algorithm to determine the covariance without deteriorating the accuracy of the variance of the maximum, and show experimental results to evaluate its performance.
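For orientation, the quantities involved in a single maximum operation can be sketched with Clark's classical moment formulas for the maximum of two jointly Gaussian variables. This is a stand-in for illustration only: it is the well-known single-Gaussian approximation, not the Gaussian-mixture algorithm proposed in the paper, and the function name clark_max and its argument names are hypothetical.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def clark_max(mu1, var1, mu2, var2, cov12, cov1y=0.0, cov2y=0.0):
    """Mean and variance of max(X1, X2), and its covariance with a variable Y.

    X1, X2 and Y are assumed jointly Gaussian; cov1y and cov2y are
    Cov(X1, Y) and Cov(X2, Y) for an explanatory random variable Y.
    """
    a2 = var1 + var2 - 2.0 * cov12  # variance of X1 - X2
    if a2 <= 0.0:
        raise ValueError("degenerate case: X1 - X2 has zero variance")
    a = math.sqrt(a2)
    alpha = (mu1 - mu2) / a
    t, q, p = _cdf(alpha), _cdf(-alpha), _pdf(alpha)
    mean = mu1 * t + mu2 * q + a * p
    second = (mu1 ** 2 + var1) * t + (mu2 ** 2 + var2) * q + (mu1 + mu2) * a * p
    cov_y = cov1y * t + cov2y * q  # sensitivity of the maximum to Y
    return mean, second - mean ** 2, cov_y
```

These formulas give the exact mean, variance and covariance of the maximum, but replacing the true distribution by a single Gaussian with these moments is where accuracy is lost, which is the motivation for the mixture representation discussed in the abstract.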
Takao MURAKAMI Yosuke KAGA Kenta TAKAHASHI
The likelihood-ratio based score level fusion (LR fusion) scheme is known as one of the most promising multibiometric fusion schemes. This scheme verifies a user by computing a log-likelihood ratio (LLR) for each modality, and comparing the total LLR to a threshold. It can happen in practice that genuine LLRs tend to be less than 0 for some modalities (e.g., the user is a “goat”, who is inherently difficult to recognize, for some modalities; the user suffers from temporary physical conditions such as injuries and illness). The LR fusion scheme can handle such cases by allowing the user to select a subset of modalities at the authentication phase and setting LLRs corresponding to missing query samples to 0. A recent study, however, proposed a modality selection attack, in which an impostor inputs only query samples whose LLRs are greater than 0 (i.e., takes an optimal strategy), and proved that this attack degrades the overall accuracy even if the genuine user also takes this optimal strategy. In this paper, we investigate the impact of the modality selection attack in more detail. Specifically, we investigate whether the overall accuracy is improved by eliminating “goat” templates, whose LLRs tend to be less than 0 for genuine users, from the database (i.e., restricting modality selection). As an overall performance measure, we use the KL (Kullback-Leibler) divergence between a genuine score distribution and an impostor's one. First, we prove that the modality restriction hardly increases the KL divergence when a user can select a subset of modalities (i.e., selective LR fusion). Second, we prove that the modality restriction increases the KL divergence when a user needs to input all biometric samples (i.e., non-selective LR fusion). We conduct experiments using three real datasets (NIST BSSR1 Set1, Biosecure DS2, and CASIA-Iris-Thousand), and discuss directions of multibiometric fusion systems.
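The fusion rule and the modality selection strategy described above can be written down in a few lines. This is a minimal sketch under stated assumptions: the per-modality LLRs are taken as given numbers, a missing query sample is represented by None, and acceptance is assumed to mean that the total LLR is at least the threshold. The function names are hypothetical.

```python
def fused_llr(llrs):
    """Total LLR over modalities; None marks a missing sample and contributes 0."""
    return sum(0.0 if v is None else v for v in llrs)


def verify(llrs, threshold):
    """Accept the claimant if the total LLR reaches the threshold (assumed rule)."""
    return fused_llr(llrs) >= threshold


def modality_selection(llrs):
    """Keep only samples whose LLR is greater than 0, dropping the rest."""
    return [v if (v is not None and v > 0) else None for v in llrs]
```

For example, with per-modality LLRs of 1.5, -2.0 and 0.5 the total is 0.0, whereas submitting only the positive samples raises the total to 2.0. The same strategy is available to an impostor, which is why it affects the overall accuracy.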
This study presents the design of a phase correlator for a digital frequency discriminator (DFD) that operates in the 2.0-6.0GHz frequency range. The accuracy of frequency discrimination as determined by the isolation of the correlator mixer was analyzed, and LO-RF isolation was found to have a significant effect on the frequency discrimination error by deriving various analytic equations related to the LO-RF isolation and phase performance. We propose a novel technique (phase sector compensation) to improve the accuracy of frequency discrimination. The phase sector compensation technique improved phase error by canceling the DC offset of the I and Q signals for only the frequency bands where the mixer's LO-RF isolation was below a specified limit. In the 2.0-6.0GHz range, the phase error of the designed phase correlator was decreased from 4.57° to 4.23° (RMS), and the frequency accuracy was improved from 1.02MHz to 0.95MHz (RMS). In the 4.8-6.0GHz range, the RMS phase error was improved from 5.59° to 4.12°, the frequency accuracy was improved from 1.24MHz to 0.92MHz, and the performance of the DFD correlator was improved by 26.3% in the frequency sector where LO-RF isolation was poor. Overall, the DFD correlator performance was improved by LO leakage compensation.
Olav GEIL Stefano MARTIN Umberto MARTÍNEZ-PEÑAS Ryutaroh MATSUMOTO Diego RUANO
Asymptotically good sequences of linear ramp secret sharing schemes have been intensively studied by Cramer et al. in terms of sequences of pairs of nested algebraic geometric codes [4]-[8], [10]. In those works the focus is on full privacy and full reconstruction. In this paper we analyze additional parameters describing the asymptotic behavior of partial information leakage and possibly also partial reconstruction giving a more complete picture of the access structure for sequences of linear ramp secret sharing schemes. Our study involves a detailed treatment of the (relative) generalized Hamming weights of the considered codes.
Xiaoqing YE Jiamao LI Han WANG Xiaolin ZHANG
Accurate stereo matching remains a challenging problem in case of weakly-textured areas, discontinuities and occlusions. In this letter, a novel stereo matching method, consisting of leveraging feature ensemble network to compute matching cost, error detection network to predict outliers and priority-based occlusion disambiguation for refinement, is presented. Experiments on the Middlebury benchmark demonstrate that the proposed method yields competitive results against the state-of-the-art algorithms.
Makoto TAKITA Masanori HIROTOMO Masakatu MORII
Symbol-pair read channels output overlapping pairs of symbols in storage applications. Pair distance and pair error are used in the channels. In this paper, we discuss error-trapping decoding for cyclic codes over symbol-pair read channels. By putting some restrictions on the correctable pair error patterns, we propose a novel error-trapping decoding algorithm over the channels and show a circuit for implementing the decoding algorithm. In addition, we discuss how to modify the restrictions on the correctable pair error patterns.
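The abstract does not define pair distance, so the following sketch assumes the standard definition for symbol-pair read channels: a word is read as the cyclic sequence of overlapping pairs of adjacent symbols, and the pair distance between two words is the Hamming distance between their pair sequences. The function names are hypothetical.

```python
def pair_read(x):
    """Cyclic symbol-pair read vector: (x0,x1), (x1,x2), ..., (x_{n-1},x0)."""
    n = len(x)
    return [(x[i], x[(i + 1) % n]) for i in range(n)]


def pair_distance(x, y):
    """Number of positions where the pair read vectors of x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(pair_read(x), pair_read(y)))
```

Under this definition a single symbol error corrupts two overlapping pairs, so the words 0000 and 0100 are at pair distance 2 although their Hamming distance is 1.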
Kentaro KATO Somsak CHOOMCHUAY
This paper analyzes the time domain Reed Solomon Decoder with FPGA implementation. Data throughput and area are carefully evaluated in comparison with a typical frequency domain Reed Solomon Decoder. In this analysis, three hardware architectures to enhance the data throughput, namely, the pipelined architecture, the parallel architecture, and the truncated arrays, are also evaluated. The evaluation reveals that the number of resources consumed for RS(255, 239) is about 20% smaller than that of the frequency domain decoder, although the data throughput is less than 10% of that of the frequency domain decoder. The number of resources consumed by the pipelined architecture is 28% smaller than that of the parallel architecture when the data throughput is the same. This is because the pipelined architecture requires less extra logic than the parallel architecture. To get higher data throughput, the pipelined architecture is better than the parallel architecture from the viewpoint of consumed resources.
Fumito TAKEUCHI Masaaki NISHINO Norihito YASUDA Takuya AKIBA Shin-ichi MINATO Masaaki NAGATA
This paper deals with the constrained DAG shortest path problem (CDSP), which finds the shortest path on a given directed acyclic graph (DAG) under any logical constraints posed on taken edges. There exists a previous work that uses binary decision diagrams (BDDs) to represent the logical constraints, and traverses the input DAG and the BDD simultaneously. The time and space complexity of this BDD-based method is derived from BDD size, and the method tends to be fast only when BDDs are small. However, since it does not prioritize the search order, there is considerable room for improvement, particularly for large BDDs. We combine the well-known A* search with the BDD-based method synergistically, and implement several novel heuristic functions. The key insight here is that the ‘shortest path’ in the BDD is a solution of a relaxed problem, just as the shortest path in the DAG is. Experiments, particularly on practical machine learning applications, show that the proposed method decreases search time by up to 2 orders of magnitude, with the specific result that it is 2,000 times faster than a commercial solver. Moreover, the proposed method can reduce the peak memory usage by up to 40 times compared with the conventional method.
Tsubasa MIYAUCHI Ayato ONO Hiroki YOSHIMURA Masashi NISHIYAMA Yoshio IWAI
We propose a method for embedding the awareness state and response state in an image-based avatar to smoothly and automatically start an interaction with a user. When both states are not embedded, the image-based avatar can become non-responsive or slow to respond. To consider the beginning of an interaction, we observed the behaviors between a user and receptionist in an information center. Our method replayed the behaviors of the receptionist at appropriate times in each state of the image-based avatar. Experimental results demonstrate that, at the beginning of the interaction, our method for embedding the awareness state and response state increased subjective scores more than not embedding the states.
One of the problems associated with voice conversion from a nonparallel corpus is how to find the best match or alignment between the source and the target vector sequences without linguistic information. In a previous study, alignment was achieved by minimizing the distance between the source vector and the transformed vector. This method, however, yielded a sequence of feature vectors that were not well matched with the underlying speaker model. In this letter, the vectors were selected from the candidates by maximizing the overall likelihood of the selected vectors with respect to the target model in the HMM context. Both objective and subjective evaluations were carried out using the CMU ARCTIC database to verify the effectiveness of the proposed method.
Jun WANG Guoqing WANG Leida LI
A quantized index for evaluating the pattern similarity of two different datasets is designed by calculating the number of correlated dictionary atoms. Guided by this theory, a task-specific biometric recognition model transferred from state-of-the-art DNN models is realized for both face and vein recognition.