Automatic pain recognition and pain intensity estimation is an emerging research area with promising applications in medicine and healthcare. The task plays a crucial role in the diagnosis and treatment of patients who have limited ability to communicate verbally, and it remains a challenge in pattern recognition. Recently, deep learning has achieved impressive results in many domains. However, deep architectures require a significant amount of labeled data for training, and they may fail to outperform conventional handcrafted features when data are insufficient, a problem that pain detection also faces. Furthermore, the latest studies show that handcrafted features may provide complementary information to deep-learned features; hence, combining these features may result in improved performance. Motivated by the above considerations, in this paper, we propose an innovative method based on the combination of deep spatiotemporal and handcrafted features for pain intensity estimation. We use C3D, a deep 3-dimensional convolutional network that takes a continuous sequence of video frames as input, to extract spatiotemporal facial features. C3D models the appearance and motion of videos simultaneously. For handcrafted features, we extract geometric information by computing, for each frame, the distance between the normalized facial landmarks and those of the mean face shape, and we extract appearance information using histogram of oriented gradients (HOG) features around the normalized facial landmarks of each frame. Two levels of SVRs are trained using spatiotemporal, geometric and appearance features to obtain estimation results. We tested our proposed method on the UNBC-McMaster shoulder pain expression archive database and obtained experimental results that outperform the current state-of-the-art.
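The geometric feature described above (per-frame distance between normalized landmarks and the mean face shape) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the normalization, so centering and unit-RMS scaling are assumed here, and the function names `normalize_landmarks` and `geometric_features` are hypothetical.

```python
import numpy as np

def normalize_landmarks(pts):
    """Center one frame's landmarks and scale them to unit RMS radius.

    Assumed normalization; the paper may use a different alignment.
    """
    pts = np.asarray(pts, dtype=float)
    centered = pts - pts.mean(axis=0)
    scale = np.sqrt((centered ** 2).sum(axis=1).mean())
    return centered / scale

def geometric_features(frames, mean_shape=None):
    """frames: array of shape (T, K, 2) holding K landmarks for T frames.

    Returns a (T, K) array with the Euclidean distance of every
    normalized landmark to the corresponding mean-shape landmark.
    """
    normed = np.stack([normalize_landmarks(f) for f in frames])
    if mean_shape is None:
        # Assumption: the mean face shape is the average over the given frames.
        mean_shape = normed.mean(axis=0)
    return np.linalg.norm(normed - mean_shape, axis=2)

# Two frames showing the same shape at different position and scale.
frames = np.array([
    [[0, 0], [2, 0], [0, 2]],
    [[1, 1], [5, 1], [1, 5]],
])
features = geometric_features(frames)
```

Because both frames normalize to the same shape, every distance in `features` is zero up to rounding; a change of expression would show up as nonzero entries.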
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Akinori ITO
This paper proposes a novel domain adaptation method that can utilize out-of-domain text resources and partially domain-matched text resources in language modeling. A major problem in domain adaptation is that it is hard to obtain adequate adaptation effects from out-of-domain text resources. To tackle the problem, our idea is to carry out model merger in a latent variable space created from latent words language models (LWLMs). The latent variables in the LWLMs are represented as specific words selected from the observed word space, so LWLMs can share a common latent variable space. This enables us to perform flexible mixture modeling with consideration of the latent variable space. This paper presents two types of mixture modeling, i.e., LWLM mixture models and LWLM cross-mixture models. The LWLM mixture models perform mixture modeling in the latent word space to mitigate the domain mismatch problem. Furthermore, in the LWLM cross-mixture models, LMs that are individually constructed from partially matched text resources are split into two element models, each of which can be subjected to mixture modeling. For both approaches, this paper also describes methods to optimize mixture weights using a validation data set. Experiments show that mixture in the latent word space achieves performance improvements for both the target domain and out-of-domain data compared with mixture in the observed word space.
Shaojun ZHANG Julong LAN Chao QI Penghao SUN
Distributed control plane architecture has been employed in software-defined data center networks to improve the scalability of the control plane. However, since the flow space is partitioned by assigning switches to different controllers, the network topology is also partitioned and the rule setup process has to invoke multiple controllers. Besides, control load balancing based on switch migration is heavyweight. In this paper, we propose a lightweight load partition method which decouples the flow space from the network topology. The flow space is partitioned with hosts rather than switches as carriers, which supports fine-grained and lightweight load balancing. Moreover, switches no longer need to be assigned to different controllers; every controller controls all of the switches, so each flow request can be processed by exactly one controller in a centralized style. Evaluations show that our scheme reduces rule setup costs and achieves lightweight load balancing.
Takahiro KODAMA Gabriella CINCOTTI
A novel adaptive code division multiplexing system with hybrid electrical and optical codes is proposed for flexible and dynamic resource allocation in next generation asynchronous optical access networks. We analyze the performance of a 10Gbps × 12 optical node unit, using hierarchical 8-level optical and 4-level electrical phase shift keying codes.
Takashi MAEHATA Suguru KAMEDA Noriharu SUEMATSU
The 1-bit band-pass delta-sigma modulator (BP-DSM) achieves high resolution if it uses an oversampling technique. This method can generate concurrent dual-band RF signals from a digitally modulated signal using a 1-bit digital pulse train. It was previously reported that the adjacent channel leakage ratio (ACLR) deteriorates owing to the asymmetrical waveform created by the pulse transition mismatch error of the rising and falling waveforms in the time domain and that the ACLR can be improved by distortion compensation. However, the reported distortion compensation method can only be performed for single-band transmission, and it fails to support multi-band transmission because the asymmetrical waveform compensated signal extends over a wide frequency range and is itself a harmful distortion outside the target band. Moreover, the increase in out-of-band power makes the BP-DSM unstable. We therefore propose a distortion compensator for a concurrent dual-band 1-bit BP-DSM that consists of a noise transfer function with a quasi-elliptic filter that can control the out-of-band gain frequency response against out-of-band oscillation. We demonstrate that dual-band LTE signals, each with 40MHz (2×20MHz) bandwidth, at 1.5 and 3.0GHz can be compensated concurrently for spurious distortion under various combinations of rising and falling times, and that an ACLR of up to 48dB, each with 120MHz bandwidth including the double-sided adjacent channels and next adjacent channels, is achieved.
Taishi OGAWA Atsushi NAKAZAWA Toyoaki NISHIDA
We present a human point of gaze estimation system that uses corneal surface reflection and omnidirectional images taken by spherical panorama cameras, which have become popular in recent years. Our system finds where a user is looking in a 360° surrounding scene image from an eye image alone, and thus does not need the gaze mapping from partial scene images to a whole scene image that is necessary in conventional eye gaze tracking systems. We first generate multiple perspective scene images from an omnidirectional (equirectangular) image and perform registration between the corneal reflection and perspective images using a corneal reflection-scene image registration technique. We then compute the point of gaze using a corneal imaging technique leveraged by a 3D eye model, and project the point to an omnidirectional image. The 3D eye pose is estimated using a particle-filter-based tracking algorithm. In experiments, we evaluated the accuracy of the 3D eye pose estimation, the robustness of registration and the accuracy of point of gaze (PoG) estimation using two indoor and five outdoor scenes, and found that the gaze mapping error was 5.546 [deg] on average.
Chang-Hee KANG Sung-Soon PARK Young-Hwan YOU Hyoung-Kyu SONG
In wireless communication systems, orthogonal frequency division multiplexing (OFDM) is a communication method that can yield high data rates. However, OFDM systems suffer from a high peak-to-average power ratio (PAPR) due to the use of many subcarriers. The selected mapping (SLM) and partial transmit sequence (PTS) techniques were proposed to solve the PAPR problem in OFDM systems. However, these approaches have the disadvantage of high complexity. This paper proposes a method that has lower complexity than the conventional PTS method with little performance degradation.
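To make the PAPR problem concrete, the sketch below computes the PAPR of one OFDM symbol and applies a basic SLM search over random phase sequences. It illustrates the conventional SLM idea only, not the lower-complexity method this paper proposes; the function names, the 4x oversampling factor and the QPSK phase alphabet are assumptions made for this example.

```python
import numpy as np

def papr_db(freq_symbols, oversample=4):
    """PAPR in dB of the time-domain OFDM symbol for the given subcarrier values."""
    X = np.asarray(freq_symbols, dtype=complex)
    n = X.size
    # Zero-pad the middle of the spectrum so the IFFT output is oversampled.
    zeros = np.zeros((oversample - 1) * n, dtype=complex)
    padded = np.concatenate([X[: n // 2], zeros, X[n // 2:]])
    power = np.abs(np.fft.ifft(padded)) ** 2
    return 10 * np.log10(power.max() / power.mean())

def slm_select(freq_symbols, num_candidates=8, seed=0):
    """Return the phase sequence with the lowest PAPR among the candidates.

    The all-ones sequence (the unmodified symbol) is always a candidate,
    so the result is never worse than the original symbol.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(freq_symbols, dtype=complex)
    best_phases = np.ones(X.size, dtype=complex)
    best_papr = papr_db(X)
    alphabet = np.array([1, -1, 1j, -1j])
    for _ in range(num_candidates - 1):
        phases = rng.choice(alphabet, size=X.size)
        candidate_papr = papr_db(X * phases)
        if candidate_papr < best_papr:
            best_papr, best_phases = candidate_papr, phases
    return best_phases, best_papr

# Worst case: 64 identical subcarriers add up coherently at one time instant.
worst_case = np.ones(64)
original = papr_db(worst_case)
phases, reduced = slm_select(worst_case)
```

In this worst case the PAPR equals 10·log10(64) dB, which shows how the peak grows with the number of subcarriers. Each SLM candidate costs one additional IFFT, which is the complexity that lower-complexity schemes aim to reduce.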
Hongbin LIN Xiuping PENG Chao FENG Qisheng TONG Kai LIU
The concept of Gaussian integer sequence pair is generalized from a single Gaussian integer sequence. In this letter, by adopting cyclic difference set pairs, a new construction method for perfect Gaussian integer sequence pairs is presented. Furthermore, the necessary and sufficient conditions for constructing perfect Gaussian integer sequence pairs are given. Through the research in this paper, a large number of perfect Gaussian integer sequence pairs can be obtained, which can greatly extend the existence of perfect sequence pairs.
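The perfectness property can be checked numerically. The sketch below assumes the usual definition: a pair (x, y) is perfect when its periodic cross-correlation is nonzero at shift 0 and zero at every other shift. The function names are hypothetical, and the example pair is a trivial one (a scaled binary perfect sequence paired with itself), not an output of the cyclic difference set pair construction.

```python
import numpy as np

def periodic_crosscorr(x, y):
    """R(tau) = sum_n x[n] * conj(y[(n + tau) mod N]) for tau = 0..N-1."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    # np.roll(y, -tau)[n] equals y[(n + tau) mod N].
    return np.array([np.sum(x * np.conj(np.roll(y, -tau)))
                     for tau in range(x.size)])

def is_perfect_pair(x, y, tol=1e-9):
    """True if the correlation is nonzero at shift 0 and zero elsewhere."""
    r = periodic_crosscorr(x, y)
    return bool(abs(r[0]) > tol and np.all(np.abs(r[1:]) < tol))

# Gaussian integers have integer real and imaginary parts.
x = (1 + 1j) * np.array([1, 1, 1, -1])
corr = periodic_crosscorr(x, x)
```

Here `corr` has the value 8 at shift 0 and zeros at the other three shifts, so `is_perfect_pair(x, x)` holds; an all-ones sequence paired with itself fails the check.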
Channel capacity is a useful numerical index not only for grasping the upper limit of the transmission bit rate but also for comparing the abilities of various digital transmission schemes commonly used in radio-wave propagation environments because the channel capacity does not depend on specific communication methods such as modulation/demodulation schemes or error correction schemes. In this paper, modeling of the noncoherent capacity in a highly underspread WSSUS channel is investigated using a new approach. Unlike the conventional method, namely, the information theoretic method, a very straightforward formula can be obtained in a statistical manner. Although the modeling in the present study is carried out using a somewhat less rigorous approach, the result obtained is useful for roughly understanding the channel capacity in doubly selective fading environments. We clarify that the radio wave propagation parameter of the spread factor, which is the product of the Doppler spread and the delay spread, can be related quantitatively to the effective maximum signal-to-interference ratio by a simple formula. Using this model, the physical limit of wireless digital transmission is discussed from a radio wave propagation perspective.
Yinan LIU Qingbo WU Liangzhi TANG Linfeng XU
In this paper, we propose a novel self-supervised method for learning video representations that is capable of anticipating the video category from only a short clip. The key idea is that we employ a Siamese convolutional network to model the self-supervised feature learning as two different image matching problems. By using frame encoding, the proposed video representation can be extracted at different temporal scales. We refine the training process via a motion-based temporal segmentation strategy. The learned video representations can be applied not only to action anticipation but also to action recognition. We verify the effectiveness of the proposed approach on both action anticipation and action recognition using two datasets, namely UCF101 and HMDB51. The experiments show that we achieve results comparable with state-of-the-art self-supervised learning methods on both tasks.
Nawfal AL-ZUBAIDI R-SMITH Lubomír BRANČÍK
Numerical inverse Laplace transform (NILT) methods are potential methods for time domain simulations, for instance the analysis of transient phenomena in systems with lumped and/or distributed parameters. This paper proposes a numerical inverse Laplace transform method based originally on hyperbolic relations. The method is further enhanced by properly adapting several convergence acceleration techniques, namely, the epsilon algorithm of Wynn, the quotient-difference algorithm of Rutishauser and the Euler transform. The resulting accelerated models are compared in terms of their accuracy and computational efficiency. Moreover, an expansion to two dimensions is presented for the first time in the context of the accelerated hyperbolic NILT method, followed by the error analysis. The expansion is done by repeated application of one-dimensional partial numerical inverse Laplace transforms. A detailed static error analysis of the resulting 2D NILT is performed to prove the effectiveness of the method. The work is followed by a practical application of the 2D NILT method to simulate voltage/current distributions along a transmission line. The method and application are programmed using the Matlab language.
In recent years, since Turbo and LDPC codes are very close to the Shannon limit, a great deal of attention has been placed on the capacity of AWGN and fading channels with arbitrary inputs. However, no closed-form solution has been developed due to the complicated Gaussian integrations. In this paper, we investigate the capacity of AWGN and fading channels with BPSK/QPSK modulation. First, a simple series representation with fast-convergence for the capacity of AWGN is developed. Further, based on the series expression, the capacity of fading channels including Rayleigh, Nakagami and Rice fading can be obtained through some special functions. Numerical results verify the accuracy and convergence speed of the proposed expressions for the capacity of AWGN and fading channels.
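For reference, the Gaussian integral that has no closed form can be evaluated by plain numerical quadrature, which gives a baseline against which a series expression could be compared. The sketch below is not the series representation developed in the paper. It assumes real-valued BPSK symbols ±1 with equal probability in noise of standard deviation `sigma`, and uses Gauss-Hermite quadrature; the function name and the node count are choices made for this example.

```python
import numpy as np

def bpsk_awgn_capacity(sigma, num_nodes=64):
    """Capacity in bits per channel use of y = x + n, x in {+1, -1}, n ~ N(0, sigma^2).

    Uses C = 1 - E[log2(1 + exp(-2*y / sigma^2))] with y ~ N(1, sigma^2).
    """
    t, w = np.polynomial.hermite.hermgauss(num_nodes)
    # Substituting y = 1 + sqrt(2)*sigma*t turns the Gaussian expectation
    # into an integral against the weight exp(-t^2).
    y = 1.0 + np.sqrt(2.0) * sigma * t
    # logaddexp(0, z) = log(1 + exp(z)), evaluated without overflow.
    log_term = np.logaddexp(0.0, -2.0 * y / sigma ** 2) / np.log(2.0)
    return 1.0 - np.sum(w * log_term) / np.sqrt(np.pi)

capacities = [bpsk_awgn_capacity(s) for s in (0.1, 0.5, 1.0, 2.0, 100.0)]
```

The values in `capacities` fall from nearly 1 bit at low noise toward 0 as `sigma` grows. QPSK can be treated as two independent BPSK channels, one per quadrature component, under the same per-dimension noise assumption.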
Hongyan WANG Quan CHENG Bingnan PEI
The issue of robust multi-input multi-output (MIMO) radar waveform design is investigated in the presence of imperfect clutter prior knowledge to improve the worst-case detection performance of space-time adaptive processing (STAP). Robust design is needed because waveform design is often sensitive to uncertainties in the initial parameter estimates. Following the min-max approach, a robust waveform covariance matrix (WCM) design is formulated in this work with the criterion of maximization of the worst-case output signal-interference-noise-ratio (SINR) under the constraint of the initial parameter estimation errors to ease this sensitivity systematically and thus improve the robustness of the detection performance to the uncertainties in the initial parameter estimates. To tackle the resultant complicated and nonlinear robust waveform optimization issue, a new diagonal loading (DL) based iterative approach is developed, in which the inner and outer optimization problems can be relaxed to convex problems by using DL method, and hence both of them can be solved very effectively. As compared to the non-robust method and uncorrelated waveforms, numerical simulations show that the proposed method can improve the robustness of the detection performance of STAP.
Chaiwat BUAJONG Chanon WARISARN
In this paper, we demonstrate how to subtract the intertrack interference (ITI) before the decoding process in a multi-track multi-head bit-patterned media recording (BPMR) system, which yields better bit error rate (BER) performance. We focus on the three-track/three-head BPMR channel and propose an ITI subtraction technique that works together with a rate-5/6 two-dimensional (2D) modulation code. The coded system can provide the estimated recorded bit sequence with high reliability for the center track. However, the upper and lower data sequences still suffer interference from their sidetracks, which results in low reliability. Therefore, we propose to feed back the data from the center and upper tracks to subtract the ITI effect of the lower track. Meanwhile, the feedback data from the center and lower tracks are used to subtract the ITI effect of the upper track. The proposed technique effectively reduces the severity of the ITI caused by the two sidetracks. Computer simulation results in the presence of position and size fluctuations show that the proposed system yields better BER performance than a conventional coded system, especially when the areal density (AD) is ultra high.
This paper presents a self-calibrating dynamic latched comparator with a stochastic offset voltage detector that can be realized by using simple digital circuitry. An offset voltage of the comparator is compensated by using a statistical calibration scheme, and the offset voltage detector uses the uncertainty in the comparator output. Thanks to the simple offset detection technique, all the calibration circuitry can be synthesized using only standard logic cells. This paper also gives a design methodology that can provide the optimal design parameters for the detector on the basis of fundamental statistics, and the correctness of the design methodology was statistically validated through measurement. The proposed self-calibrating comparator system was fabricated in a 180 nm 1P6M CMOS process. The prototype achieved a 38 times improvement in the three-sigma of the offset voltage from 6.01 mV to 158 µV.
Kenya KONDO Koichi TANNO Hiroki TAMURA Shigetoshi NAKATAKE
In this paper, we propose a novel low-voltage CMOS current-mode reference circuit. It reduces the minimum supply voltage by using a subthreshold two-stage operational amplifier (OPAMP), which is regarded as the combination of the proportional to absolute temperature (PTAT) and complementary to absolute temperature (CTAT) current generators. This makes it possible to implement the circuit without an extra OPAMP. The proposed circuit has been designed and evaluated by SPICE simulation using the TSMC 65nm CMOS process with the 3.3V (2.5V over-drive) transistor option. In simulation, the line sensitivity is as good as 0.196%/V over a wide supply voltage (VDD) range of 0.6V to 3.0V. The temperature coefficient is 71ppm/°C over the temperature range from -40°C to 125°C at VDD=0.6V. The power supply rejection ratio (PSRR) is -47.7dB when VDD=0.6V and the noise frequency is 100Hz. Comparing the proposed circuit with prior current-mode circuits, we confirmed that the proposed circuit performs better than the prior circuits.
Sae IWATA Tomoyuki NITTA Toshinori TAKAYAMA Masao YANAGISAWA Nozomu TOGAWA
Cell phones with a GPS function as well as GPS loggers are widely used, and users' geographic information can be easily obtained. However, battery consumption in these mobile devices is still a main concern, so GPS positioning data cannot be obtained very frequently. In this paper, a stayed location estimation method for sparse GPS positioning information is proposed. After generating initial clusters from a sequence of measured positions, the effective radius is set for every cluster based on positioning accuracy and the clusters are merged effectively using it. After that, short-time clusters are removed temporarily but measured positions included in them are not removed. Then the clusters are merged again, taking all the measured positions into consideration. This process is performed twice; in other words, two-stage short-time cluster removal is performed, so that accurate stayed location estimation is realized even when the GPS positioning interval is five minutes or more. Experiments demonstrate that the total distance error between the estimated stayed location and the true stayed location is reduced by more than 33% and that the proposed method considerably improves the F1 measure compared to conventional state-of-the-art methods.
Ziwei DENG Yilin HOU Xina CHENG Takeshi IKENAGA
3D ball tracking is of great significance in ping-pong game analysis and can be applied to TV content and tactical analysis, some of which require real-time implementation. This paper proposes a CPU-GPU platform-based particle filter for multi-view ball tracking that includes 4 proposals. The multi-peak estimation and the ball-like observation model are proposed in the algorithm design. The multi-peak estimation aims at obtaining a precise ball position in case the particles' likelihood distribution has multiple peaks under complex circumstances. The ball-like observation model, with 4 different likelihood evaluations, utilizes the ball's unique features to evaluate the particle's similarity with the target. In the GPU implementation, the double-queue structure and the vectorized data combination are proposed. The double-queue structure aims at achieving task parallelism between some data-independent tasks. The vectorized data combination reduces the time cost of memory access by combining 3 different image data into 1 vector data. Experiments are based on ping-pong videos of an official match recorded by 4 cameras located at the 4 corners of the court. The tracking success rate reaches 99.59% on CPU. With GPU acceleration, the time consumption is 8.8 ms/frame, a speedup by a factor of 98 over the CPU version.
Given a sequence of k convex polygons in the plane, a start point s, and a target point t, we seek a shortest path that starts at s, visits in order each of the polygons, and ends at t. We revisit this touring polygons problem, which was introduced by Dror et al. (STOC 2003), by describing a simple method to compute the so-called last step shortest path maps, one per polygon. We obtain an O(kn)-time solution to the problem for a sequence of pairwise disjoint convex polygons and an O(k^2 n)-time solution for possibly intersecting convex polygons, where n is the total number of vertices of all polygons. A major simplification is made on the operation of locating query points in the last step shortest path maps. Our results improve upon the previous time bounds roughly by a factor of log n.
Guiping JIN Dan LIU Miaolan LI Yuehui CUI
In this paper, a simple pattern reconfigurable antenna with broadband circular polarization is proposed. The proposed antenna consists of four rectangular loops, a feeding network and four reflectors. Circular polarization is achieved by cutting two slots on opposite sides of the loops. By controlling the states of the four PIN diodes present in the feeding network, the proposed antenna can achieve four different pattern modes at the same frequency. Experiments show that the antenna has a bandwidth of 47.6% covering 1.73-2.81GHz for reflection coefficient (|S11|)<-10dB and a bandwidth of 55% covering 1.62-2.85GHz for axial ratio <3dB. The average gain is 8.5dBi and the radiation patterns are stable.
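The two bandwidth percentages can be reproduced from the quoted band edges, assuming they are fractional bandwidths relative to the center frequency of each band. The helper name is hypothetical.

```python
def fractional_bandwidth(f_low, f_high):
    """Bandwidth divided by the center frequency (f_low + f_high) / 2."""
    return 2 * (f_high - f_low) / (f_high + f_low)

# Band edges in GHz, as quoted in the abstract.
impedance_bw = fractional_bandwidth(1.73, 2.81)    # |S11| < -10dB band
axial_ratio_bw = fractional_bandwidth(1.62, 2.85)  # axial ratio < 3dB band
```

Rounded to one decimal place, these give 47.6% and 55.0%, matching the reported figures.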