In this paper, we present an approach to detecting speech presence in which the decision rule is based on a combination of multiple features using a sigmoid function. Minimum classification error (MCE) training is used to adjust the weights of the combination. The features, consisting of three parameters (the ratio of ZCR, the spectral energy, and the spectral entropy), are combined linearly with weights derived from the sub-band domain. First, the Bark-scale wavelet decomposition (BSWD) is used to split the input speech into 24 critical sub-bands. Next, the feature parameters are derived from the selected frequency sub-bands to form robust voice feature parameters. In order to discard seriously corrupted frequency sub-bands, a strategy of adaptive frequency sub-band extraction (AFSE), dependent on the sub-band SNR, is then applied so that only the useful frequency sub-bands are used. Finally, these three feature parameters, which consider only the useful sub-bands, are combined through a sigmoid-type function incorporating optimal weights based on MCE training to classify each frame as speech-present or speech-absent. Experimental results show that the performance of the proposed algorithm is superior to that of standard methods such as G.729B and AMR2.
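The sigmoid-based combination of features described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the "ratio of ZCR", so a plain zero-crossing rate is used instead, and the function names, weights, bias and the 0.5 decision threshold are all hypothetical. In the paper the weights would come from MCE training, which is not shown here.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_entropy(band_energies):
    """Shannon entropy (in nats) of the normalized sub-band energy distribution."""
    total = sum(band_energies)
    if total <= 0:
        return 0.0
    probs = [e / total for e in band_energies if e > 0]
    return -sum(p * math.log(p) for p in probs)

def speech_presence_score(features, weights, bias):
    """Sigmoid of a weighted linear combination of the feature values."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_speech(features, weights, bias, threshold=0.5):
    """Hypothetical decision rule: speech-present if the score exceeds the threshold."""
    return speech_presence_score(features, weights, bias) > threshold
```

A flat sub-band energy distribution gives the maximum entropy (noise-like), while energy concentrated in a few sub-bands gives a low entropy, which is why entropy is a useful cue alongside energy and ZCR.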
Seung Jun BAEK Daehee KIM Seong-Jun OH Jong-Arm JUN
We consider a queuing model with applications to electric vehicle (EV) charging systems in smart grids. We adopt a scheme where an Electric Service Company (ESCo) broadcasts a one bit signal to EVs, possibly indicating 'on-peak' periods during which electricity cost is high. EVs randomly suspend/resume charging based on the signal. To model the dynamics of EVs we propose an M/M/∞ queue with random interruptions, and analyze the dynamics using time-scale decomposition. There exists a trade-off: one may postpone charging activity to 'off-peak' periods during which electricity cost is cheaper, however this incurs extra delay in completion of charging. Using our model we characterize achievable trade-offs between the mean cost and delay perceived by users. Next we consider a scenario where EVs respond to the signal based on the individual loads. Simulation results show that peak electricity demand can be reduced if EVs carrying higher loads are less sensitive to the signal.
For realistic scale-free networks, we investigate the traffic properties of stochastic routing inspired by a zero-range process known in statistical physics. Through the parameters α and δ, this model controls degree-dependent hopping of packets and forwarding of packets with higher performance at busier nodes. Through a theoretical analysis and numerical simulations, we derive the condition for the concentration of packets at a few hubs. In particular, we show that the optimal α and δ are involved in the trade-off between a detour path for α < 0 and a long wait at hubs for α > 0. In the low-performance regime at a small δ, the wandering path for α < 0 better reduces the mean travel time of a packet with high reachability. In the high-performance regime at a large δ, although the difference between α > 0 and α < 0 is small, neither the wandering long path with a short wait trapped at nodes (α = -1) nor the short hopping path with a long wait trapped at hubs (α = 1) is advisable; a uniformly random walk (α = 0) yields slightly better performance. We also discuss the congestion phenomena in a more complicated situation with packet generation at each time step.
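The role of α in degree-dependent hopping can be sketched as follows. The abstract does not state the hopping rule, so this sketch assumes that a packet moves to a neighbor j with probability proportional to k_j^α, where k_j is the degree of j; the function name and the adjacency-list representation are hypothetical, and the δ-dependent forwarding performance is not modeled.

```python
def hop_probabilities(graph, node, alpha):
    """Probability of hopping from `node` to each neighbor.

    Assumption: the probability is proportional to (neighbor degree) ** alpha.
    `graph` maps each node to the list of its neighbors.
    """
    neighbors = graph[node]
    weights = [len(graph[n]) ** alpha for n in neighbors]
    total = sum(weights)
    return {n: w / total for n, w in zip(neighbors, weights)}

# Small example: "hub" has degree 3, "b" has degree 2.
graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
}
uniform = hop_probabilities(graph, "a", 0)    # uniformly random walk
to_hubs = hop_probabilities(graph, "a", 1)    # prefers high-degree neighbors
avoid_hubs = hop_probabilities(graph, "a", -1)  # prefers low-degree neighbors
```

Under this assumption, α > 0 steers packets toward hubs (short paths but long queues), α < 0 steers them away (detours but short queues), and α = 0 reduces to a uniformly random walk.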
Yoshimitsu TAKAMATSU Ryuichi FUJIMOTO Tsuyoshi SEKINE Takaya YASUDA Mitsumasa NAKAMURA Takuya HIRAKAWA Masato ISHII Motohiko HAYASHI Hiroya ITO Yoko WADA Teruo IMAYAMA Tatsuro OOMOTO Yosuke OGASAWARA Masaki NISHIKAWA Yoshihiro YOSHIDA Kenji YOSHIOKA Shigehito SAIGUSA Hiroshi YOSHIDA Nobuyuki ITOH
This paper presents a single-chip RF tuner/OFDM demodulator for a mobile digital TV application called “1-segment broadcasting.” To achieve the performance required of the single-chip receiver, a tunable technique for a low-noise amplifier (LNA) and spurious suppression techniques are proposed. First, to receive all channels from 470 MHz to 770 MHz and to relax the distortion requirements of the following circuit blocks, such as an RF variable-gain amplifier and a mixer, a tunable technique for the LNA is proposed. Then, to improve the sensitivity, spurious signal suppression techniques are also proposed. The single-chip receiver using the proposed techniques is fabricated in 90 nm CMOS technology, and the total die size is 3.26 mm × 3.26 mm. By using the tunable LNA and suppressing undesired spurious signals, sensitivities of less than -98.6 dBm are achieved for all the channels.
Xincun JI Fuqing HUANG Jianhui WU Longxing SHI
A 1.8 V, 5 GHz low power frequency synthesizer for Wireless Sensor Networks is presented in 0.18 µm CMOS technology. A low power phase-switching prescaler is designed, and the current mode phase rotator is merged into the first divide-by-2 circuit of the prescaler to reduce power and propagation delay. An improved charge pump circuit is proposed to compensate for the dynamic effects with the charge pump. By a divide-by-2 circuit, the frequency synthesizer can provide a 2.324-2.714 GHz quadrature output frequency in 1 MHz steps with a 4 MHz reference frequency. The measured output phase noise is -110 dBc/Hz at 1-MHz offset frequency. The power consumption of the PLL is 11.2 mW at 1.8 V supply voltage.
Fumiaki INOUE Yongbing ZHANG Yusheng JI
We propose a distributed data management approach in this paper for a large-scale position-tracking system composed of multiple small systems based on wireless tag technologies such as RFID and Wi-Fi tags. Each of these small systems is called a domain, and a domain server manages the position data of the users belonging to its managing domain and also to the other domains but temporarily residing in its domain. The domain servers collaborate with each other to globally manage the position data, realizing the global position tracking. Several domains can be further grouped to form a larger domain, called a higher-domain, so that the whole system is constructed in a hierarchical structure. We implemented the proposed approach in an experimental environment, and conducted a performance evaluation on the proposed approach and compared it with an existing approach wherein a central server is used to manage the position data of all the users. The results showed that the position data processing load is distributed among the domain servers and the traffic for position data transmission over the backbone network can be significantly restrained.
Zhenyu LIU Dongsheng WANG Takeshi IKENAGA
Variable block size motion estimation, adopted in the latest video coding standard H.264/AVC, is an efficient approach to reducing temporal redundancies. The intensive computational complexity of the variable block size technique makes a hardwired accelerator essential for real-time applications. The propagate partial sums of absolute differences (Propagate Partial SAD) and SAD Tree hardwired engines outperform their counterparts, especially considering the impact of supporting the variable block size technique. In this paper, the authors apply architecture-level and circuit-level approaches to improve the maximum operating frequency and reduce the hardware overhead of Propagate Partial SAD and SAD Tree, while the other metrics of the original architectures, in terms of latency, memory bandwidth and hardware utilization, are maintained. Experiments demonstrate that, by using the proposed approaches at a 110.8 MHz operating frequency, 14.7% and 18.0% of the gate count can be saved for Propagate Partial SAD and SAD Tree, respectively, compared with the original architectures. With TSMC 0.18 µm 1P6M CMOS technology, the proposed Propagate Partial SAD architecture achieves a 231.6 MHz operating frequency at a cost of 84.1 k gates. Correspondingly, the maximum operating frequency of the optimized SAD Tree architecture is improved to 204.8 MHz, which is almost twice that of the original one, while its hardware overhead is merely 88.5 k gates.
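To make the cost of variable block size matching concrete, the following Python sketch computes the sixteen 4x4 SADs of one 16x16 macroblock once and reuses them for every larger partition, giving the 41 partition SADs of H.264/AVC for a single candidate position. This is a software illustration of SAD reuse only; it does not model the Propagate Partial SAD or SAD Tree hardware architectures, and the function names are hypothetical.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid

def block_sad(grid, top, left, height, width):
    """SAD of a partition; position and size are in units of 4x4 sub-blocks."""
    return sum(grid[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))

# Partition name (width x height in pixels) -> (height, width) in 4x4 units.
PARTITION_SHAPES = {
    "16x16": (4, 4), "16x8": (2, 4), "8x16": (4, 2), "8x8": (2, 2),
    "8x4": (1, 2), "4x8": (2, 1), "4x4": (1, 1),
}

def all_partition_sads(grid):
    """SADs of all 41 partitions, built only from the sixteen 4x4 SADs."""
    sads = {}
    for name, (h, w) in PARTITION_SHAPES.items():
        sads[name] = [block_sad(grid, top, left, h, w)
                      for top in range(0, 4, h)
                      for left in range(0, 4, w)]
    return sads
```

Because every larger partition is a sum of 4x4 SADs, the pixel-level absolute differences are computed only once per candidate position, which is the property that hardwired engines exploit.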
Yosuke HIMURA Kensuke FUKUDA Patrice ABRY Kenjiro CHO Hiroshi ESAKI
In this paper, we discuss the validity of the multi-scale gamma model and characterize the differences in host-level application traffic with this model by using a real traffic trace collected on a 150-Mbps transpacific link. First, we investigate the dependency of the model (parameters α and β, and fitting accuracy ε) on the time scale Δ, and then find suitable time scales for the model. Second, we inspect the relations among α, β, and ε in order to characterize the differences among the types of applications. The main findings of the paper are as follows. (1) Different types of applications show different dependencies of α, β, and ε on Δ, and have different suitable values of Δ for the model. The model is more accurate if the traffic consists of intermittently sent packets than otherwise. (2) More appropriate models are obtained with specific α and β values (e.g., 0.1 < α < 1 and β < 2 for Δ = 500 ms). Also, application-specific traffic presents specific ranges of α, β, and ε for each Δ, so these characteristics can be used in application identification methods such as anomaly detection and other machine learning methods.
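The dependence of α and β on the time scale Δ can be explored with a short Python sketch. It aggregates a packet trace into bins of width Δ and fits a gamma distribution by the method of moments, taking α as the shape and β as the scale parameter. These are assumptions: the abstract does not say which estimator or parameterization the authors use, the fitting accuracy ε is not computed, and the function names and the toy trace are hypothetical.

```python
def aggregate_bytes(packets, delta):
    """Sum packet sizes in consecutive bins of width `delta` (seconds).

    `packets` is a list of (timestamp_seconds, size_bytes) pairs.
    """
    if not packets:
        return []
    start = min(t for t, _ in packets)
    end = max(t for t, _ in packets)
    bins = [0] * (int((end - start) / delta) + 1)
    for t, size in packets:
        bins[int((t - start) / delta)] += size
    return bins

def fit_gamma_moments(samples):
    """Method-of-moments gamma fit: returns (alpha, beta) = (shape, scale)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    if mean <= 0 or var <= 0:
        raise ValueError("samples need positive mean and variance")
    return mean * mean / var, var / mean

# Hypothetical toy trace, aggregated at delta = 0.5 s.
trace = [(0.0, 100), (0.1, 50), (0.6, 200), (1.2, 10)]
counts = aggregate_bytes(trace, 0.5)
```

Repeating the fit for several values of Δ on the same trace gives the α(Δ) and β(Δ) curves whose shapes the paper uses to distinguish application types.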
Woong-Kee LOH Yang-Sae MOON Jun-Gyu KANG
In this paper, we emphasize the need for data cleansing when clustering large-scale transaction databases and propose a new data cleansing method that improves clustering quality and performance. We evaluate our data cleansing method through a series of experiments. As a result, the clustering quality and performance were significantly improved by up to 165% and 330%, respectively.
Recent studies investigating the Internet topology have reported that the inter-Autonomous System (AS) topology exhibits a power-law degree distribution, which is known as the scale-free property. Although there are many models to generate scale-free topologies, no game theoretic approaches have been proposed yet. In this paper, we propose a new dynamic game theoretic model for AS-level Internet topology formation. Through numerical simulations, we show that our process tends to give rise to topologies that have the scale-free property, especially in the case of large decay parameters and large random link costs. The significance of our study is summarized in the following three points. Firstly, we show that scale-free topologies can also emerge from a game theoretic model. Secondly, we propose a new dynamic process of the network formation game for modeling the process of AS topology formation, and show that our model is appropriate in both the micro and macro senses. In the micro sense, our topology formation process is appropriate because it represents the competitive and distributed situation observed in the real AS-level Internet topology formation process. In the macro sense, some of the statistical properties of the topologies emerging from our process are similar to those observed in the real AS-level Internet topology. Finally, we demonstrate numerical simulations of our process, which is a deterministic variation of the dynamic process of the network formation game with transfers. This is also a new result in the field of game theory.
In this paper, a human detection method is developed. An appearance-based detector and a motion-based detector are proposed. A multi-scale block histogram of template feature (MB-HOT) is used to detect humans by appearance. It integrates gray value information and gradient value information, and represents the relationship among three blocks. An experiment on the INRIA dataset shows that this feature is more discriminative than other features, such as the histogram of oriented gradients (HOG). A motion-based feature is also proposed to capture the relative motion of the human body. This feature is calculated in the optical flow domain, and experimental results on our dataset show that it outperforms other motion-based features. The detection responses obtained by the two features are combined to reduce false detections. A graphics processing unit (GPU) based implementation is proposed to accelerate the calculation of the two features and make the method suitable for real-time applications.
Aram KAWEWONG Sirinart TANGRUAMSUB Osamu HASEGAWA
A novel Position-Invariant Robust Feature, designated as PIRF, is presented to address the problem of highly dynamic scene recognition. The PIRF is obtained by identifying existing local features (i.e., SIFT) that have a wide baseline visibility within a place (one place contains more than one sequential image). These wide-baseline visible features are then represented as a single PIRF, which is computed as an average of all descriptors associated with the PIRF. In particular, PIRFs are robust against highly dynamic changes in a scene: a single PIRF can be matched correctly against many features from many dynamic images. This paper also describes an approach to using these features for scene recognition. Recognition proceeds by matching an individual PIRF to a set of features from test images, with subsequent majority voting to identify the place with the highest number of matched PIRFs. The PIRF system is trained and tested on 2000+ outdoor omnidirectional images and on the COLD datasets. Despite its simplicity, PIRF offers a markedly better recognition rate for dynamic outdoor scenes (ca. 90%) than other features. Additionally, a robot navigation system based on PIRF (PIRF-Nav) can outperform other incremental topological mapping methods in terms of time (70% less) and memory. The number of PIRFs can be reduced further to reduce the time while retaining high accuracy, which makes the approach suitable for long-term recognition and localization.
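The averaging step that turns wide-baseline visible features into a single PIRF can be sketched in Python as follows. The sketch assumes that feature matching across the sequential images of a place has already been done, so each track is simply a list of descriptors of the same local feature; the matching procedure, the function name and the `min_length` visibility criterion are hypothetical and not taken from the paper.

```python
def pirf_from_tracks(tracks, min_length):
    """Average the descriptors of each sufficiently long feature track.

    `tracks` is a list of tracks; each track is a list of equal-length
    descriptor vectors of one local feature seen in sequential images.
    Tracks shorter than `min_length` are treated as not wide-baseline
    visible and are discarded (a hypothetical criterion).
    """
    pirfs = []
    for track in tracks:
        if len(track) < min_length:
            continue
        dim = len(track[0])
        pirfs.append([sum(d[i] for d in track) / len(track) for i in range(dim)])
    return pirfs

# Two toy tracks with 2-D descriptors (real SIFT descriptors have 128 dimensions).
tracks = [
    [[1.0, 2.0], [3.0, 4.0]],  # seen in two images: kept and averaged
    [[5.0, 5.0]],              # seen in one image only: discarded
]
pirfs = pirf_from_tracks(tracks, min_length=2)
```

Averaging keeps one descriptor per persistent feature, which is consistent with the abstract's point that a single PIRF can stand in for many features from many images.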
Yoshifumi KAWAMURA Takashi HIKAGE Toshio NOJIMA
The aim of this study is to develop a new whole-body averaged specific absorption rate (SAR) estimation method based on the external-cylindrical field scanning technique. This technique is adopted with the goal of simplifying the dosimetry estimation of human phantoms that have different postures or sizes. An experimental scaled model system is constructed. In order to examine the validity of the proposed method for realistic human models, we discuss the pros and cons of measurements and numerical analyses based on the finite-difference time-domain (FDTD) method. We consider the anatomical European human phantoms and plane-wave in the 2 GHz mobile phone frequency band. The measured whole-body averaged SAR results obtained by the proposed method are compared with the results of the FDTD analyses.
This paper describes an efficient image enhancement method based on the Multi-Scale Retinex (MSR) approach for pre-processing of video applications. The processing amount is drastically reduced to 4 orders less than that of the original MSR, and 1 order less than the latest fast MSR method. For the efficient processing, our proposed method employs multi-stage and multi-rate filter processing which is constructed by a x-y separable and polyphase structure. In addition, the MSR association is effectively implemented during the above multi-stage processing. The method also modifies a weighting function for enhancement to improve color rendition of bright areas in an image. A variety of evaluation results show that the performance of our simplified method is similar to those of the original MSR, in terms of visual perception, contrast enhancement effects, and hue changes. Moreover, experimental results show that pre-processing of the proposed method contributes to clear foreground object separation.
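For reference, the baseline Multi-Scale Retinex computation that the paper accelerates can be sketched in one dimension: the output is the average, over several Gaussian scales, of log(signal) minus log(smoothed signal). This Python sketch shows only that standard formulation with equal scale weights; it does not implement the multi-stage, multi-rate separable polyphase filtering or the modified weighting function proposed in the paper, and the function names, the 3σ kernel radius and the `eps` offset are assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Gaussian smoothing with edge samples repeated at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def multi_scale_retinex(signal, sigmas, eps=1e-6):
    """Equal-weight average over scales of log(signal) - log(smoothed signal)."""
    result = [0.0] * len(signal)
    for sigma in sigmas:
        blurred = smooth(signal, sigma)
        for i, (v, b) in enumerate(zip(signal, blurred)):
            result[i] += (math.log(v + eps) - math.log(b + eps)) / len(sigmas)
    return result
```

A flat input maps to zero, while an edge produces a negative response on its dark side and a positive response on its bright side, which is the local contrast enhancement the method relies on. Each scale requires a full Gaussian filtering pass, which is the cost the paper's filter structure is designed to reduce.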
Ken NAKAOKA Mamoru YOKOTA Kunihiko SASAKI Tetsuo HORIMATSU
This paper studies the feasibility of a 700 MHz band inter-vehicle communication system when it is put into practical use in urban areas. To verify the system, a large-scale demonstration experiment is performed on a quasi-street test course. In the experiment, a number of vehicles equipped with communication devices conforming to the ITS FORUM RC-006 specifications are employed. A simulation method that is applicable to a large-scale communication model is also designed, and the validity of the method is verified by utilizing the results derived from the experiment. Based on this model, the quality of the inter-vehicle communication system in an urban communication environment is estimated. The results show that the system's performance satisfies the requirements of representative traffic accident prevention scenarios, and the feasibility of the 700 MHz band inter-vehicle communication system specified in RC-006 is verified for practical use in an urban communication environment.
Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulty of accurately estimating the local noise spectrum. In this paper, a simple method of noise estimation employing a voice activity detector (VAD) is proposed. The output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts can be improved according to the VAD decision. The noisy speech is first preprocessed using Bark-scale wavelet packet decomposition (BSWPD) to convert the noisy signal into wavelet coefficients (WCs). It is found that the VAD using the Bark-scale spectral entropy parameter, called BS-Entropy, is superior to other energy-based approaches, especially at variable noise levels. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the VAD result. In a speech-dominated frame, the speech is categorized as either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in the lower subbands, so the WCs of the lower bands must be preserved. In contrast, the WCT tends to increase in the lower bands if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. Objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid under various noise conditions, especially colored noise and non-stationary noise conditions.
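The frame-dependent threshold adjustment described above can be sketched in Python. The sketch assumes soft thresholding of the wavelet coefficients, which the abstract does not specify, and the scale factors, frame labels and function names are hypothetical placeholders chosen only to reproduce the stated direction of each adjustment (lower threshold for voiced low bands, higher for unvoiced low bands, highest for noise-dominated frames).

```python
import math

# Hypothetical scale factors; the paper's actual values are not given in the abstract.
NOISE_SCALE = 3.0         # noise-dominated frame: raise the threshold strongly
VOICED_LOW_SCALE = 0.5    # voiced frame, lower band: preserve coefficients
UNVOICED_LOW_SCALE = 1.5  # unvoiced frame, lower band: raise the threshold

def frame_threshold(base, frame_type, is_low_band):
    """Adjust the base threshold of one subband for the current frame class."""
    if frame_type == "noise":
        return base * NOISE_SCALE
    if frame_type == "voiced" and is_low_band:
        return base * VOICED_LOW_SCALE
    if frame_type == "unvoiced" and is_low_band:
        return base * UNVOICED_LOW_SCALE
    return base

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (soft thresholding)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise_subband(coeffs, base, frame_type, is_low_band):
    """Apply the frame-dependent threshold to one subband's coefficients."""
    return soft_threshold(coeffs, frame_threshold(base, frame_type, is_low_band))
```

The frame class would come from the VAD and a voiced/unvoiced decision, neither of which is modeled here.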
Adjustment of a certain parameter in the course of performing a trajectory task such as drawing or gesturing is a common manipulation in pen-based interaction. Since pen tip information is confined to x-y coordinate data, such concurrent parameter adjustment is not easily accomplished in devices using only a pen tip. This paper comparatively investigates the performance of inherent pen input modalities (Pressure, Tilt, Azimuth, and Rolling) and Key Pressing with the non-preferred hand used for precision parameter manipulation during pen sliding actions. We elaborate our experimental design framework here and conduct experimentation to evaluate the effect of the five techniques. Results show that Pressure enabled the fastest performance along with the lowest error rate, while Azimuth exhibited the worst performance. Tilt showed slightly faster performance and achieved a lower error rate than Rolling. However, Rolling achieved the most significant learning effect on Selection Time and was favored over Tilt in subjective evaluations. Our experimental results afford a general understanding of the performance of inherent pen input modalities in the course of a trajectory task in HCI (human computer interaction).
Miki HASEYAMA Makoto TAKIZAWA Takashi YAMAMOTO
In this paper, a new video frame interpolation method based on image morphing for frame rate up-conversion is proposed. In this method, image features are extracted by Scale-Invariant Feature Transform in each frame, and their correspondence in two contiguous frames is then computed separately in foreground and background regions. By using the above two functions, the proposed method accurately generates interpolation frames and thus achieves frame rate up-conversion.
Vera SHEINMAN Takenobu TOKUNAGA
In this study we introduce AdjScales, a method for scaling similar adjectives by their strength. It combines existing Web-based computational linguistic techniques in order to automatically differentiate between similar adjectives that describe the same property by strength. Though this kind of information is rarely present in most of the lexical resources and dictionaries, it may be useful for language learners that try to distinguish between similar words. Additionally, learners might gain from a simple visualization of these differences using unidimensional scales. The method is evaluated by comparison with annotation on a subset of adjectives from WordNet by four native English speakers. It is also compared against two non-native speakers of English. The collected annotation is an interesting resource in its own right. This work is a first step toward automatic differentiation of meaning between similar words for language learners. AdjScales can be useful for lexical resource enhancement.
Chowdhury Farhan AHMED Syed Khairuzzaman TANBEER Byeong-Soo JEONG Young-Koo LEE
Traditional frequent pattern mining algorithms do not consider the different semantic significances (weights) of items. By considering different item weights, weighted frequent pattern (WFP) mining has become an important research issue in the data mining and knowledge discovery area. However, the existing state-of-the-art WFP mining algorithms consider all the data from the very beginning of a database to discover the resultant weighted frequent patterns. Therefore, their approaches may not be suitable for large-scale data environments such as data streams, where the volume of data is huge and unbounded. Moreover, they cannot adaptively extract recent changes of knowledge in a data stream because they also consider old information that may not be interesting in the current time period. Another major limitation of the existing algorithms is that they scan a database multiple times to find the resultant weighted frequent patterns. In this paper, we propose a novel large-scale algorithm, WFPMDS (Weighted Frequent Pattern Mining over Data Streams), for sliding window-based WFP mining over data streams. Using a single scan of the data stream, the WFPMDS algorithm can discover important knowledge from the recent data elements. Extensive performance analyses show that our proposed algorithm is very efficient for sliding window-based WFP mining over data streams.
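The idea of weighted frequent patterns over a sliding window can be illustrated with a brute-force Python sketch. It is not the WFPMDS algorithm, whose data structures are not described in the abstract: it simply keeps the most recent transactions in a bounded window and enumerates candidate patterns. It also assumes that the weighted support of a pattern is the average weight of its items multiplied by its support count in the window, a definition common in the WFP literature; the class name, item weights and `max_len` limit are hypothetical.

```python
from collections import deque
from itertools import combinations

class SlidingWindowWFP:
    """Brute-force weighted frequent pattern mining over the latest transactions."""

    def __init__(self, weights, window_size):
        self.weights = weights                   # item -> weight
        self.window = deque(maxlen=window_size)  # oldest transaction drops out automatically

    def add(self, transaction):
        """Append one transaction from the stream (each one is read once)."""
        self.window.append(frozenset(transaction))

    def weighted_support(self, pattern):
        """Assumed definition: average item weight times support count in the window."""
        pattern = frozenset(pattern)
        support = sum(1 for t in self.window if pattern <= t)
        avg_weight = sum(self.weights[i] for i in pattern) / len(pattern)
        return avg_weight * support

    def frequent_patterns(self, min_wsupport, max_len=3):
        """All patterns up to `max_len` items whose weighted support meets the threshold."""
        items = sorted({i for t in self.window for i in t})
        result = {}
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                ws = self.weighted_support(combo)
                if ws >= min_wsupport:
                    result[combo] = ws
        return result

# Toy stream: the window holds the 3 most recent transactions.
miner = SlidingWindowWFP({"a": 0.6, "b": 0.4, "c": 0.9}, window_size=3)
for tx in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]):
    miner.add(tx)
found = miner.frequent_patterns(min_wsupport=1.4)
```

After the fourth transaction arrives, the first one has left the window, so only recent data contribute to the result. The enumeration is exponential in the number of items, which is the kind of cost a dedicated single-scan stream algorithm is meant to avoid.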