In this paper, we propose a new buffer map notification scheme for Peer-to-Peer Video-on-Demand systems (P2P VoDs) which support VCR operations such as fast-forward, fast-backward, and seek. To make such VCR operations more fluid, each piece must be made as small as possible. However, such a refinement significantly degrades the performance of buffer map notification schemes with respect to overhead, piece availability, and the efficiency of resource utilization. The basic idea behind our proposed scheme is to use a piece-based buffer map and a segment-based buffer map in a complementary manner. Simulation results indicate that the proposed scheme increases the accuracy of the information on piece availability in the neighborhood at a sufficiently low cost, which reduces the intermittent waiting time of each peer by more than 40% even when 50% of peers conduct the fast-forward operation over a range of 30% of the entire video.
Xiao XIA Xinye LIN Xiaodong WANG Xingming ZHOU Deke GUO
To facilitate the discovery of mobile apps on personal devices, we present the personalized live homescreen system. The system mines the usage patterns of mobile apps, generates personalized predictions, and then puts apps at users' fingertips whenever they want them. Evaluations confirm the effectiveness of our system.
Huawei TIAN Yao ZHAO Zheng WANG Rongrong NI Lunming QIN
With the rapid development of multi-view video coding (MVC) and light field rendering (LFR), Free-View Television (FTV) has emerged as a new form of entertainment equipment that can give TV viewers a more immersive and realistic experience. In an FTV broadcasting system, the viewer can freely watch a realistic arbitrary view of a scene generated from a number of original views. In such a scenario, the ownership of the multi-view video should be verified not only on the original views, but also on any virtual view. However, the capacity of existing watermarking schemes used as copyright protection methods for LFR-based FTV is only one bit, i.e., presence or absence of the watermark, which seriously limits their usage in practical scenarios. In this paper, we propose a robust multi-bit watermarking scheme for LFR-based free-view video. The direct-sequence code division multiple access (DS-CDMA) watermark is constructed according to the multi-bit message and embedded into the DCT domain of each view frame. The message can be extracted bit-by-bit from a virtual frame generated at an arbitrary viewpoint with a correlation detector. Furthermore, we mathematically prove that the watermark can be detected from any virtual view. Experimental results also show that the watermark in FTV can be successfully detected from a virtual view. Moreover, the proposed watermarking method is robust against common signal processing attacks, such as Gaussian filtering, salt-and-pepper noise, JPEG compression, and center cropping.
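The spread-spectrum idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' scheme: it works on a generic list of coefficients rather than the DCT coefficients of a view frame, it ignores light field rendering entirely, and the function names (`make_codes`, `embed`, `extract`), the embedding strength, and the seeded pseudo-random codes are all assumptions made for illustration.

```python
import random


def make_codes(num_bits, length, seed):
    """One pseudo-random +/-1 spreading code per message bit (assumed construction)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(length)] for _ in range(num_bits)]


def embed(coeffs, bits, codes, strength):
    """Add the sum of the bit-modulated spreading codes to the host coefficients."""
    marked = list(coeffs)
    for bit, code in zip(bits, codes):
        sign = 1 if bit else -1  # map bit {0, 1} to antipodal symbol {-1, +1}
        for j, chip in enumerate(code):
            marked[j] += strength * sign * chip
    return marked


def extract(coeffs, codes):
    """Correlation detector: the sign of the correlation with each code gives one bit."""
    bits = []
    for code in codes:
        corr = sum(c * chip for c, chip in zip(coeffs, code))
        bits.append(1 if corr > 0 else 0)
    return bits
```

With a host signal of a few thousand coefficients and a handful of message bits, the correlation with the matching code dominates the interference from the host and from the other codes, so `extract(embed(host, msg, codes, strength), codes)` is expected to return `msg`.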
Xinyuan CAI Chunheng WANG Baihua XIAO Yunxue SHAO
Face verification is the task of determining whether two given face images represent the same person or not. It is a very challenging task, as face images captured in uncontrolled environments may have large variations in illumination, expression, pose, background, etc. The crucial problem is how to compute the similarity of two face images. Metric learning has provided a viable solution to this problem. Until now, many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method, which learns an explicit mapping from the original space to an optimal subspace using a deep Independent Subspace Analysis (ISA) network. Compared to linear or kernel-based metric learning methods, the proposed deep ISA network is a deep and local learning architecture, and therefore is better able to learn the nature of highly variable datasets. We evaluate our method on the Labeled Faces in the Wild dataset, and the results show superior performance over some state-of-the-art methods.
Ervianto ABDULLAH Satoshi FUJITA
Recently, Peer-to-Peer Content Delivery Networks (P2P CDNs) have attracted considerable attention as a cost-effective way to disseminate digital contents to paid users in a scalable and dependable manner. However, due to their peer-to-peer nature, they face a threat from “colluders” who paid for the contents but illegally share them with unauthorized peers. This means that the detection of colluders is a crucial task for P2P CDNs to preserve the rights of content holders and paid users. In this paper, we propose two colluder detection schemes for P2P CDNs. The first scheme is based on the reputation collected from all peers participating in the network, and the second scheme improves the quality of colluder identification by using a technique which is well known in the field of system-level diagnosis. The performance of the schemes is evaluated by simulation. The simulation results indicate that even when 10% of authorized peers are colluders, our schemes identify all colluders without causing misidentifications.
The robot covering problem has gained attention because of its promising real-life applications. The previous spanning tree coverage algorithm addresses this problem well in a static environment, but not in a dynamic one. In this paper, we present and analyze an algorithm that works in a dynamic environment with fewer shadow areas.
Tomoya OHTA Satoshi DENNO Masahiro MORIKURA
This paper proposes a novel heterodyne multiband multiple-input multiple-output (MIMO) receiver with baseband automatic gain control (AGC) for cognitive radios. The proposed receiver uses heterodyne reception implemented with a wide-passband band-pass filter in the radio frequency (RF) stage to be able to receive signals in arbitrary frequency bands. Even when an RF Hilbert transformer is utilized in the receiver, image-band interference occurs due to the imperfection of the Hilbert transformer. In the receiver, analog baseband AGC is introduced to prevent the baseband signals from exceeding the voltage reference of the analog-to-digital converters (ADCs). This paper proposes a novel technique to estimate the imperfection of the Hilbert transformer in the heterodyne multiband MIMO receiver with baseband AGC. The proposed technique estimates not only the imperfection of the Hilbert transformer but also the AGC gain ratio and the analog device imperfections in the feedback loop, which makes it possible to offset the imperfection of the Hilbert transformer. The performance of the proposed receiver is verified by computer simulations. The results show that the required ADC resolution in the proposed receiver is 9 bits. Moreover, the proposed receiver has lower computational complexity than a receiver with baseband interference cancellation unless the frequency band is changed every 9 packets or fewer.
Manato FUJIMOTO Hayato OZAKI Takuya SUZUKI Hiroaki KOYAMASHITA Tomotaka WADA Kouichi MUTSUURA Hiromi OKADA
Recently, border security systems have attracted attention as large-scale monitoring systems built on wireless sensor networks (WSNs). A border security system, whose aim is to monitor illegal immigrants and manage information over a long period, deploys many sensor nodes with communication and sensing functions in the detection area. Hence, such a system must reduce the power consumption of the whole system in order to extend the system lifetime while accurately tracking illegal immigrants. In this paper, we propose two effective barrier coverage construction methods that dynamically switch the operation modes of sensor nodes to reduce the operating time of the sensing function, which consumes a lot of power. We carry out performance evaluations by computer simulations to show the effectiveness of the two proposed methods and show that they are suitable for border security systems.
Young-Duk KIM Won-Seok KANG Kookrae CHO Dongkyun KIM
In general, a sensor network has a many-to-one communication architecture wherein each node transmits its data to a sink. As a result, nodes near the sink suffer from significant traffic concentration, and the congested nodes die early. In this paper, we propose a cross-layer based routing and MAC protocol which is compatible with the IEEE 802.15.4 standard without additional overhead. The key mechanism is to provide dynamic route discovery and route maintenance operations that avoid and relieve the most congested nodes by monitoring link status such as link delay, buffer occupancy, and residual energy. In addition, the proposed protocol also provides dynamic tuning of the BE (Binary Exponent) and of frame retransmission opportunities according to the hop distance to the sink node to mitigate funnel effects. We conducted simulations to verify its performance gains over existing protocols.
Kohei OGAWA Masahiro MORIKURA Koji YAMAMOTO Tomoyuki SUGIHARA
As a promising wireless access standard for machine-to-machine (M2M) networks, the IEEE 802.11 task group ah has been discussing a new standard which is based on the wireless local area network (WLAN) standard. This new standard will support an enormous number of stations (STAs), for example 6,000 STAs. To mitigate degradation of the throughput and delay performance in WLANs that employ a carrier sense multiple access with collision avoidance (CSMA/CA) protocol, this paper proposes a virtual grouping method which exploits the random arbitration interframe space number scheme. This method complies with the CSMA/CA protocol, which employs distributed medium access control. Moreover, power saving is another important issue for M2M networks, where most STAs are operated by primary or secondary batteries. This paper proposes a new power saving method for the IEEE 802.11ah based M2M network employing the proposed virtual grouping method. With the proposed virtual grouping and power saving methods, the STAs can reduce their power consumption by as much as 90% and maintain good throughput and delay performance.
Md. Ezharul ISLAM Nobuo FUNABIKI Toru NAKANISHI Kan WATANABE
Nowadays, with the spread of inexpensive small communication devices, a number of wireless local area networks (WLANs) have been deployed even in the same building for Internet access services. Their wireless access points (APs) are often independently installed and managed by different groups such as departments or laboratories in a university or a company. As a result, a user host can access multiple WLANs by detecting signals from their APs, which increases the energy consumption and the operational cost. It may also degrade the communication performance by increasing interference. In this paper, we present an AP aggregation approach to solve these problems in multiple WLAN environments by aggregating deployed APs of different groups into a limited number of APs using virtual APs. First, we formulate the AP aggregation problem as a combinatorial optimization problem and prove the NP-completeness of its decision problem. Then, we propose a heuristic algorithm for it composed of five phases. We verify the effectiveness through extensive simulations using the WIMNET simulator.
Inwoong LEE Jincheol PARK Seonghyun KIM Taegeun OH Sanghoon LEE
We seek a resource allocation algorithm, based on carrier allocation and modulation mode selection, that improves the quality of service (QoS) and can adapt to various screen sizes and dynamic channel variations. In terms of visual quality, the expected visual entropy (EVE) is defined to quantify the visual information contained in each layer of scalable video coding (SVC). Fairness optimization is conducted to maximize the EVE using an objective function for given constraints of radio resources. To conduct the fairness optimization, we propose a novel approximation algorithm for resource allocation for the maximal EVE. Simulations confirm that the QoS in terms of the EVE or peak signal-to-noise ratio (PSNR) is significantly improved by using the novel algorithm.
There is a well-known Steiner tree algorithm called the minimum-cost paths heuristic (MPH), which is used for many multicast network operations and is considered a benchmark for other Steiner tree algorithms. MPH's average case time complexity is O(m(l+nlog n)), where m is the number of end nodes, n is the number of nodes, and l is the number of links in the network, because MPH has to run Dijkstra's algorithm as many times as the number of end nodes. The author recently proposed a Steiner tree algorithm called branch-based multi-cast (BBMC), which produces exactly the same multicast tree as MPH in a constant processing time irrespective of the number of multicast end nodes. However, the theoretical result for the average case time complexity of BBMC was expressed as O(log m(l+nlog n)) and could not accurately reflect the above experimental result. This paper proves that the average case time complexity of BBMC can be shortened to O(l+nlog n), which is independent of the number of end nodes, when there is an upper limit on the node degree, which is the number of links connected to a node. In addition, a new parameter β is applied to BBMC, so that the multicast tree created by BBMC has fewer links. Even though the tree costs increase due to this parameter, the tree cost increase rates are much smaller than the link decrease rates.
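To make the cost of MPH concrete, the following is a minimal sketch of the heuristic as it is usually described: start from one end node and repeatedly attach the end node nearest to the current tree along a shortest path. It is an illustration under assumptions, not the author's implementation and not BBMC. The graph is assumed to be an undirected dictionary of dictionaries with non-negative costs, the function name `mph_steiner_tree` is hypothetical, and node labels must be mutually comparable because they are used to break ties in the heap.

```python
import heapq


def mph_steiner_tree(graph, terminals):
    """Return (edges, cost) of a tree connecting all terminals (MPH-style sketch)."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    while remaining:
        # Multi-source Dijkstra: every node already in the tree is a source.
        dist = {v: 0 for v in tree_nodes}
        parent = {}
        heap = [(0, v) for v in tree_nodes]
        heapq.heapify(heap)
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            if u in remaining:
                target = u  # nearest end node not yet attached
                break
            for v, w in graph[u].items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    parent[v] = u
                    heapq.heappush(heap, (nd, v))
        if target is None:
            raise ValueError("some end nodes are unreachable")
        # Walk back along the shortest path and merge it into the tree.
        node = target
        while node not in tree_nodes:
            prev = parent[node]
            tree_edges.add((min(prev, node), max(prev, node)))
            tree_nodes.add(node)
            remaining.discard(node)
            node = prev
        remaining.discard(target)
    cost = sum(graph[u][v] for u, v in tree_edges)
    return tree_edges, cost
```

The outer loop runs once per end node and each pass is one Dijkstra search, which is where the factor m in the MPH bound comes from. This sketch uses the binary heap from `heapq`, so each search costs O(l log n) rather than the O(l+nlog n) quoted above.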
Yoshikazu MIYANAGA Wataru TAKAHASHI Shingo YOSHIZAWA
This paper introduces the noise-robust speech communication techniques we have developed and describes their implementation in a smart info-media system, i.e., a small robot. Our speech communication system consists of automatic speech detection, recognition, and rejection. By using automatic speech detection and recognition, an observed speech waveform can be recognized without a manual trigger. In addition, using speech rejection, the system accepts only registered speech phrases and rejects any other words. In other words, although an arbitrary input speech waveform can be fed into this system and recognized, the system responds only to the registered speech phrases. The developed noise-robust speech processing can reduce various noises in many environments. In addition to the design of noise-robust speech recognition, the LSI design of this system is introduced. By using a speech recognition application-specific IC (ASIC), we can simultaneously realize low power consumption and real-time processing. This paper describes the LSI architecture of this system and its performance in some field experiments. The system currently achieves a speech recognition accuracy of 85-99% under 0-20dB SNR and echo environments.
Trung-Nghia PHUNG Thanh-Son PHAN Thang Tat VU Mai Chi LUONG Masato AKAGI
The most important advantage of HMM-based TTS is its high intelligibility. However, speech synthesized by HMM-based TTS is muffled and far from natural, especially under limited data conditions, which is mainly caused by over-smoothing. Therefore, the motivation for this paper is to improve the naturalness of HMM-based TTS trained under limited data conditions while preserving its intelligibility. To this end, we propose a hybrid TTS that combines HMM-based TTS with the modified restricted Temporal Decomposition (MRTD), named HTD in this paper. Here, TD is an interpolation model that decomposes a spectral or prosodic sequence of speech into sparse event targets and dynamic event functions, and MRTD is a simplified version of TD. Because its event functions are determined in a way close to the concept of co-articulation in speech, MRTD can synthesize smooth speech, and the smoothness of the synthesized speech can be adjusted by manipulating the event targets of MRTD. Previous studies have also found that the event functions of MRTD can represent the linguistic information of speech, which is important for perceived intelligibility, while the sparse event targets can convey the non-linguistic information, which is important for perceived naturalness. Therefore, the prosodic trajectories and the MRTD event functions of the spectral trajectory generated by HMM-based TTS were kept unchanged to preserve the high and stable intelligibility of HMM-based TTS, whereas the MRTD event targets of the spectral trajectory generated by HMM-based TTS were rendered with an original speech database to enhance the naturalness of the synthesized speech. Experimental results with small Vietnamese datasets revealed that the proposed HTD was equivalent to HMM-based TTS in terms of intelligibility but was superior to it in terms of naturalness. Further discussions show that HTD had a small footprint. Therefore, the proposed HTD proved highly efficient under limited data conditions.
Asahi TAKAOKA Satoshi TAYU Shuichi UENO
We consider the minimum feedback vertex set problem for some bipartite graphs and degree-constrained graphs. We show that the problem is linear time solvable for bipartite permutation graphs and NP-hard for grid intersection graphs. We also show that the problem is solvable in O(n^2 log^6 n) time for n-vertex graphs with maximum degree at most three.
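For readers unfamiliar with the problem, a feedback vertex set is a set of vertices whose removal leaves a graph with no cycles. The sketch below only illustrates that definition with an exponential-time brute-force search; it is not the algorithm of the abstract above. The adjacency format (a dictionary mapping each vertex to a set of neighbours, simple undirected graph) and both function names are assumptions made for illustration.

```python
from itertools import combinations


def is_forest(adj, removed):
    """True if the graph without the removed vertices contains no cycle."""
    parent = {v: v for v in adj if v not in removed}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    seen = set()
    for u in parent:
        for v in adj[u]:
            if v in removed or (v, u) in seen:
                continue  # skip deleted vertices and edges already processed
            seen.add((u, v))
            ru, rv = find(u), find(v)
            if ru == rv:
                return False  # this edge closes a cycle
            parent[ru] = rv
    return True


def min_feedback_vertex_set(adj):
    """Smallest vertex set whose removal leaves a forest (brute force)."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_forest(adj, set(candidate)):
                return set(candidate)
```

For example, the complete graph on four vertices has maximum degree three, and removing any single vertex still leaves a triangle, so its minimum feedback vertex set has two vertices.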
We consider an output feedback control problem of a chain of integrators under sensor noise. The sensor noise enters the output feedback channel in an additive form. A similar problem has been addressed most recently in [9], but their result has been developed only under AC sensor noise. We generalize the result of [9] by allowing the sensor noise to include both AC and DC components. With our new output feedback controller, we show that the ultimate bounds of all states can be made arbitrarily small. We show the generality of our result over [9] via an example.
Xin LI Jielin PAN Qingwei ZHAO Yonghong YAN
Morphemes, which are obtained from morphological parsing, and statistical sub-words, which are derived from data-driven splitting, are commonly used as the recognition units for speech recognition of agglutinative languages. In this letter, we propose a discriminative approach to select, for each distinct word type, the splitting result that is more likely to improve the recognizer's performance. An objective function which involves the unigram language model (LM) probability and the count of misrecognized phones on the acoustic training data is defined and minimized. After determining the splitting result for each word in the text corpus, we select the frequent units to build a hybrid vocabulary including morphemes and statistical sub-words. Compared to a statistical sub-word based system, the hybrid system achieves a 0.8% reduction in letter error rate (LER) on the test set.
The link structure of the Web is generally viewed as a webgraph. One of the main objectives of web structure mining is to find hidden communities on the Web based on the webgraph, and one of its approaches tries to enumerate substructures, each of which corresponds to a set of web pages of a community or its core. Research has shown that certain substructures can find sets of pages that are inherently irrelevant to communities. In this paper, we propose a model, which we call contracted webgraphs, where such substructures are contracted into single nodes to hide useless information. We then try structure mining iteratively on those contracted webgraphs since we can expect to find further hidden information once irrelevant information is eliminated. We also explore the structural properties of contracted webgraphs from the viewpoint of scale-freeness, and we observe that they exhibit novel and extreme self-similarities.
Chih-Ming CHEN Ying-ping CHEN Tzu-Ching SHEN John K. ZAO
LT codes are the first practical rateless codes, and their reception overhead depends entirely on the degree distribution adopted. The capability of LT codes with a particular degree distribution named robust soliton has been theoretically analyzed; it asymptotically approaches the optimum when the message length approaches infinity. However, real applications making use of LT codes have a finite number of input symbols. It is quite important to refine degree distributions because there are distributions whose performance can exceed that of the robust soliton distribution for short message lengths. In this work, a practical framework that employs evolutionary algorithms is proposed to search for better degree distributions. Our experiments empirically show that the proposed framework is robust and can customize degree distributions for LT codes with different message lengths. The decoding error probabilities of the distributions found in the experiments compare well with those of robust soliton distributions. The significant improvement of LT codes with the optimized degree distributions is demonstrated in the paper.
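As a point of reference for the baseline mentioned above, the following sketch computes the robust soliton distribution from its standard definition (the ideal soliton distribution plus an extra term with a spike near k/R, normalized to sum to one). It does not implement the evolutionary search of the paper. The default values of `c` and `delta` and the rounding of k/R to an integer are assumptions chosen for illustration.

```python
import math


def ideal_soliton(k):
    """rho[d] for d = 1..k; index 0 is unused and left at 0."""
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho


def robust_soliton(k, c=0.1, delta=0.5):
    """mu[d] for d = 1..k, where k is the number of input symbols."""
    R = c * math.log(k / delta) * math.sqrt(k)
    # k/R is rarely an integer; rounding and clamping it is an implementation choice.
    pivot = max(1, min(k, int(round(k / R))))
    tau = [0.0] * (k + 1)
    for d in range(1, pivot):
        tau[d] = R / (d * k)
    tau[pivot] = R * math.log(R / delta) / k  # the spike of the distribution
    rho = ideal_soliton(k)
    beta = sum(rho) + sum(tau)  # normalization constant
    return [(rho[d] + tau[d]) / beta for d in range(k + 1)]
```

A candidate distribution produced by any search procedure can be stored in the same list form, which makes it straightforward to compare it against this baseline for a given finite k.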