Shu JIANG Rui WANG Zuchao LI Masao UTIYAMA Kehai CHEN Eiichiro SUMITA Hai ZHAO Bao-liang LU
Standard neural machine translation (NMT) is based on the assumption that document-level context is independent. Most existing document-level NMT approaches settle for a coarse sense of global document-level information, whereas this work focuses on exploiting detailed document-level context by means of a memory network. The capacity of the memory network to detect the part of memory most relevant to the current sentence offers a natural way to model rich document-level context. In this work, the proposed document-aware memory network is implemented to enhance the Transformer NMT baseline. Experiments on several tasks show that the proposed method significantly improves NMT performance over strong Transformer baselines and other related studies.
Takahisa YAMAMOTO Shiki TAKEUCHI Atsushi NAKAZAWA
Visual sentiment analysis has many applications, including image captioning, opinion mining, and advertisement; however, it is still a difficult problem, and existing algorithms cannot produce satisfactory results. One of the difficulties in classifying images into emotions is that visual sentiments are evoked by two different types of information: visual information, which includes colors and textures, and semantic information, which includes the types of objects that evoke emotions and/or their combinations. In contrast to existing methods that use only visual information, this paper presents a novel algorithm for image emotion recognition that uses both types of information simultaneously. For semantic features, we introduce an object vector and a word vector. The object vector is created by an object detection method and reflects the objects present in an image. The word vector is created by transforming the names of the detected objects through a word embedding model, so that it is similar among objects that are semantically similar. These semantic features and a visual feature produced by a fine-tuned convolutional neural network (CNN) are concatenated, and we perform the classification using the concatenated feature vector. Extensive evaluation experiments using emotional image datasets show that our method achieves the best accuracy among existing methods on all but one dataset. The improvement in accuracy over existing methods is up to 4.54%.
We propose a new method for improving the recognition performance of phonemes, speech emotions, and music genres using multi-task learning. When tasks are closely related, multi-task learning can improve the performance of each task by learning common feature representation for all the tasks. However, the recognition tasks considered in this study demand different input signals of speech and music at different time scales, resulting in input features with different characteristics. In addition, a training dataset with multiple labels for all information sources is not available. Considering these issues, we conduct multi-task learning in a sequential training process using input features with a single label for one information source. A comparative evaluation confirms that the proposed method for multi-task learning provides higher performance for all recognition tasks than individual learning for each task as in conventional methods.
Applications of continuous-time (CT) comparators include relaxation oscillators, pulse width modulators, and so on. A CT comparator receives a differential input and ideally outputs a strobe when the differential input crosses zero. Unlike discrete-time (DT) comparators with a positive feedback circuit, CT comparators must employ amplifiers that consume static power to amplify the input signal. Therefore, minimizing the comparator delay under a power consumption constraint often becomes an issue. This paper analyzes the transient behavior of a CT comparator. Using a “constant delay approximation”, the comparator delay is derived as a function of the input slew rate, the number of preamplifier stages, and the device parameters in each block. This paper also discusses the optimum design of the CT comparator. The condition for minimum comparator delay is derived while keeping power consumption constant. The results show that the optimum DC gain of the preamplifier is e to e^3 per stage, depending on the element that dominates the load capacitance of the preamplifier.
Zhe LYU Changjun YU Di YAO Aijun LIU Xuguang YANG
Observations of gravity waves based on High Frequency Surface Wave Radar (HFSWR) can contribute to a better understanding of the energy transfer process between the ocean and the ionosphere. In this paper, by processing ionospheric clutter data observed by HFSWR during Typhoon Rumbia with the short-time Fourier transform method, we show that HFSWR is capable of detecting gravity waves.
Yue YIN Haoze CHEN Zongdian LI Tao YU Kei SAKAGUCHI
Communication systems operating in the millimeter-wave (mmWave) band have the potential to realize ultra-high throughput and ultra-low latency vehicle-to-vehicle (V2V) communications in 5G and beyond wireless networks. Moreover, because mmWave signals penetrate obstacles poorly, one mmWave channel can be reused in all V2V links, which improves the spectrum efficiency. Although this outstanding performance of mmWave has been widely acknowledged, there are still some shortcomings. One of the unavoidable defects is multipath interference. Even though the direct interference link cannot penetrate vehicle bodies, other interference degrades the throughput of mmWave V2V communication. In this paper, we focus on the multipath interference caused by signal reflections from roads and surroundings, where the interference strength varies with the road scenario. First, we analyze the multipath channel models of mmWave V2V with relay in three typical road scenarios (single straight roads, horizontal curves, and slopes) and clarify the differences in their interference. Based on the analysis, a novel ZigZag antenna configuration method is proposed to guarantee the required data rate. Second, the performance of the proposed method is evaluated by simulation. The results show that, compared with the conventional antenna configuration, the ZigZag antenna configuration with an optimal antenna height can significantly suppress the destructive interference and ensure a throughput of over 1 Gbps in the 60 GHz band. Furthermore, the effectiveness of the ZigZag antenna configuration is demonstrated on a single straight road by outdoor experiments.
We consider an asymptotic stabilization problem for a chain of integrators by using an event-triggered controller. The times required between event-triggered executions and controller updates are uncertain, time-varying, and not necessarily small. We show that the considered system can be asymptotically stabilized by an event-triggered gain-scaling controller. Also, we show that the interexecution times are lower bounded and their lower bounds can be manipulated by a gain-scaling factor. Some future extensions are also discussed. An example is given for illustration.
Hao ZHOU Zhuangzhuang ZHANG Yun LIU Meiyan XUAN Weiwei JIANG Hailing XIONG
The single-image dehazing algorithm based on the Dark Channel Prior (DCP) is widely known, and a growing number of DCP-based image dehazing algorithms have been proposed. However, we found that it is more effective to apply DCP to RAW images before the ISP pipeline. In addition, to address the failure of DCP in sky areas, we propose an algorithm that segments the sky region and compensates the transmission. Extensive experimental results from both subjective and objective evaluations demonstrate that the modified DCP (MDCP) performs substantially better than DCP and is competitive with state-of-the-art methods.
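For readers unfamiliar with DCP, the following is a minimal sketch of the standard dark channel and transmission estimate on an ordinary RGB image. It is not the modified method (MDCP) described above: it does not operate on RAW data and does no sky segmentation. The function names, the patch size, and the value of `omega` are illustrative assumptions.

```python
def dark_channel(image, patch=3):
    """Dark channel of an image given as an H x W list of (r, g, b) tuples.

    For each pixel, take the minimum over the colour channels and over a
    square patch centred on the pixel (clipped at the image border).
    """
    h, w = len(image), len(image[0])
    half = patch // 2
    # Per-pixel minimum over the three colour channels.
    min_rgb = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - half), min(h, y + half + 1))
            xs = range(max(0, x - half), min(w, x + half + 1))
            dark[y][x] = min(min_rgb[j][i] for j in ys for i in xs)
    return dark


def estimate_transmission(image, airlight, omega=0.95, patch=3):
    """Standard DCP transmission estimate: t = 1 - omega * dark(I / A).

    `airlight` is an (r, g, b) tuple with the assumed atmospheric light;
    `omega` (hypothetical default 0.95) keeps a small amount of haze.
    """
    normalized = [
        [tuple(c / a for c, a in zip(px, airlight)) for px in row]
        for row in image
    ]
    dark = dark_channel(normalized, patch)
    return [[1.0 - omega * d for d in row] for row in dark]
```

Pixels in bright sky regions have a large dark channel, so this estimate drives their transmission toward zero, which is the failure case that the sky compensation mentioned above is meant to address.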
Hiroyuki SHINBO Kousuke YAMAZAKI Yoji KISHI
To achieve highly efficient spectrum usage, dynamic sharing of scarce spectrum resources has recently become the subject of intense discussion. The technologies of dynamic spectrum sharing (DSS) have already been adopted or are scheduled to be adopted in a number of countries, and Japan is no exception. The authors and organizations collaborating in the research and development project being undertaken in Japan have studied a novel DSS system positioned between the fifth-generation mobile communication system (5G system) and different incumbent radio systems. Our DSS system has three characteristics. (1) It detects dynamically unused sharable spectrums (USSs) of incumbent radio systems for the space axis by using novel propagation models and estimation of the transmitting location with radio sensor information. (2) It manages USSs for the time axis by interference calculation with propagation parameters, fair assignment and future usage of USSs. (3) It utilizes USSs for the spectrum axis by using methods that decrease interference for lower separation distances. In this paper, we present an overview and the technologies of our DSS system and its applications in Japan.
In this paper, to clarify the ITS information and communication systems that are desirable when both safety and social feasibility are considered and overengineering is to be prevented, we use a microscopic traffic flow simulator to discuss the required information acquisition rate of three types of safe-driving support systems: the sensor type, the communication type, and the sensor and communication fusion type. Performance is evaluated from the viewpoint of preventing overengineering using the “TsRm evaluation method”, which regards a vehicle that approaches within a range of R meters within T seconds as a vehicle with a high possibility of collision and evaluates only those vehicles. The results show that, with regard to the communication radius and the sensing range, overengineered performance may be estimated when all vehicles in the evaluation area are used for evaluation without considering each vehicle's location, velocity, and acceleration, as in conventional evaluations. In addition, it is clarified that the sensor and communication fusion type system is advantageous because it effectively compensates for the shortcomings of the sensor type and communication type systems.
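The selection rule behind the TsRm evaluation can be sketched as a simple geometric test. This is a minimal sketch under a constant-relative-velocity assumption; the abstract also mentions acceleration, which is ignored here, and the function name and argument layout are hypothetical.

```python
import math


def approaches_within(rel_pos, rel_vel, t_seconds, r_meters):
    """Return True if another vehicle comes within r_meters in t_seconds.

    rel_pos: (x, y) position of the other vehicle relative to ours, in m.
    rel_vel: (vx, vy) velocity of the other vehicle relative to ours, in m/s.
    Assumes the relative velocity stays constant over the horizon.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # No relative motion: the distance never changes.
        t_closest = 0.0
    else:
        # Time of closest approach, clamped to the horizon [0, T].
        t_closest = min(max(-(px * vx + py * vy) / speed_sq, 0.0), t_seconds)
    dx = px + vx * t_closest
    dy = py + vy * t_closest
    return math.hypot(dx, dy) <= r_meters


# Keep only the vehicles that a "T = 3 s, R = 25 m" evaluation would count.
vehicles = [((50.0, 0.0), (-10.0, 0.0)), ((50.0, 0.0), (10.0, 0.0))]
evaluated = [v for v in vehicles if approaches_within(v[0], v[1], 3.0, 25.0)]
```

Restricting the evaluation to `evaluated` rather than to every vehicle in the area is what distinguishes this style of evaluation from the conventional one described above.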
Yuya KAMATAKI Yusuke KAMEDA Yasuyo KITA Ichiro MATSUDA Susumu ITOH
This paper proposes a lossless coding method for HDR color images stored in a floating-point format called Radiance RGBE. In this method, the three mantissa parts and the common exponent part, each of which is represented with an 8-bit depth, are encoded using a block-adaptive prediction technique with some modifications that take the data structure into account.
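To make the data structure concrete, the sketch below converts between one RGBE pixel (three 8-bit mantissas plus a shared 8-bit exponent) and floating-point RGB. It follows a commonly used convention in which a channel value is mantissa × 2^(exponent − 136); some implementations differ in rounding details. It illustrates the pixel format only, not the proposed coding method, and the function names are illustrative.

```python
import math


def rgbe_to_float(r, g, b, e):
    """Decode one RGBE pixel (four integers in 0..255) to float RGB."""
    if e == 0:
        # A zero exponent byte denotes black.
        return (0.0, 0.0, 0.0)
    # 128 is the exponent bias; the extra 8 divides the mantissa by 256.
    scale = math.ldexp(1.0, e - (128 + 8))
    return (r * scale, g * scale, b * scale)


def float_to_rgbe(red, green, blue):
    """Encode non-negative float RGB as one RGBE pixel (truncating mantissas)."""
    brightest = max(red, green, blue)
    if brightest < 1e-32:
        return (0, 0, 0, 0)
    # frexp gives brightest = fraction * 2**exponent with 0.5 <= fraction < 1.
    fraction, exponent = math.frexp(brightest)
    scale = fraction * 256.0 / brightest
    return (int(red * scale), int(green * scale), int(blue * scale),
            exponent + 128)
```

Because the exponent is shared, the three mantissas of a pixel are strongly tied to one another, which is one reason a coder may treat them differently from ordinary 8-bit colour planes.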
Chen CHEN Maojun ZHANG Hanlin TAN Huaxin XIAO
Pedestrian detection is an essential but challenging task in computer vision, especially in crowded scenes, because of heavy intra-class occlusion. In the human visual system, head information can be used to locate a pedestrian in a crowd because the head is more stable and less likely to be occluded. Inspired by this cue, we propose a dual-task detector that detects the head and the human body simultaneously. Concretely, we estimate human body candidates from head regions using a statistical head-body ratio. A head-body alignment map is proposed to perform relational learning between human bodies and heads based on their inherent correlation. We leverage the head information as a strict detection criterion to suppress common false positives of pedestrian detection via a novel pull-push loss. We validate the effectiveness of the proposed method on the CrowdHuman and CityPersons benchmarks. Experimental results demonstrate that the proposed method achieves impressive performance in detecting heavily occluded pedestrians with little additional computation cost.
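The step of deriving a body candidate from a head region can be illustrated as follows. This is only a sketch: the ratio values, the top-aligned and horizontally centred placement, and the function name are assumptions for illustration, not the statistics or geometry used in the paper.

```python
def body_from_head(head_box, width_ratio=3.0, height_ratio=7.0):
    """Estimate a body box from a head box using fixed head-body ratios.

    head_box: (x1, y1, x2, y2) in pixels, with y increasing downward.
    width_ratio / height_ratio: hypothetical body-to-head size ratios.
    The body box shares the head's top edge and is centred on the head.
    """
    x1, y1, x2, y2 = head_box
    head_w = x2 - x1
    head_h = y2 - y1
    center_x = (x1 + x2) / 2.0
    body_w = head_w * width_ratio
    body_h = head_h * height_ratio
    return (center_x - body_w / 2.0, y1, center_x + body_w / 2.0, y1 + body_h)


candidate = body_from_head((10.0, 10.0, 20.0, 20.0))
```

In a detector, candidates of this kind would be refined by the network rather than used as final boxes.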
Chao WANG Michihiko OKUYAMA Ryo MATSUOKA Takahiro OKABE
Water detection is important for machine vision applications such as visual inspection and robot motion planning. In this paper, we propose an approach to per-pixel water detection on unknown surfaces with a hyperspectral image. Our proposed method is based on the spectral characteristics of water: water is transparent to visible light but translucent/opaque to near-infrared light, and therefore the apparent near-infrared spectral reflectance of a surface is smaller than the original one when water is present on it. Specifically, we use a linear combination of a small number of basis vectors to approximate the spectral reflectance and estimate the original near-infrared reflectance from the visible reflectance (which does not depend on the presence or absence of water) to detect water. We conducted a number of experiments using real images and show that our method, which estimates the near-infrared spectral reflectance from the visible spectral reflectance, performs better than existing techniques.
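The estimation step described above can be sketched as a small least-squares problem: fit basis coefficients on the visible bands, predict the dry near-infrared reflectance with the same coefficients, and compare it with the observation. The decision rule (mean drop above a threshold), the threshold value, and all names are assumptions for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b for a small square system by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_coefficients(basis_vis, refl_vis):
    """Least-squares coefficients of the visible reflectance in the basis.

    basis_vis: list of k basis vectors, each sampled on the visible bands.
    Solves the normal equations (B B^T) c = B r.
    """
    k = len(basis_vis)
    gram = [[sum(u * v for u, v in zip(basis_vis[i], basis_vis[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(u * r for u, r in zip(basis_vis[i], refl_vis))
           for i in range(k)]
    return solve_linear(gram, rhs)


def predict_nir(basis_nir, coeffs):
    """Expected dry-surface NIR reflectance from the fitted coefficients."""
    n_bands = len(basis_nir[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis_nir))
            for i in range(n_bands)]


def is_wet(refl_vis, refl_nir, basis_vis, basis_nir, threshold=0.05):
    """Flag a pixel as wet if observed NIR falls well below the prediction."""
    expected = predict_nir(basis_nir, fit_coefficients(basis_vis, refl_vis))
    drop = sum(e - o for e, o in zip(expected, refl_nir)) / len(expected)
    return drop > threshold
```

In practice the basis vectors would come from measured reflectance spectra of dry surfaces, with each vector split into its visible and near-infrared parts.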
Yasunori SUZUKI Shoichi NARAHASHI
This paper presents linearization technologies for high efficiency power amplifiers of cellular base stations. These technologies are important to actualizing highly efficient power amplifiers that reduce power consumption of the base station equipment and to achieving a sufficient non-linear distortion compensation level. It is well known that it is very difficult for a power amplifier using linearization technologies to achieve simultaneously high efficiency and a sufficient non-linear distortion compensation level. This paper presents two approaches toward addressing this technical issue. The first approach is a feed-forward power amplifier using the Doherty amplifier as the main amplifier. The second approach is a digital predistortion linearizer that compensates for frequency dependent intermodulation distortion components. Experimental results validate these approaches as effective for providing power amplification for base stations.
This paper presents an analytical model that yields the unavailability of a network function when each backup server can protect two functions and can recover one of them. Previous work describes a model for the case in which each function can be protected by only one server. In our model, we allow each function to be protected by multiple servers to ensure function availability. This requires us to know the feasible states of a connected component and its state transitions. By adopting the divide-and-conquer method, we enumerate the feasible states of a connected component. We then classify its state transitions. Based on the obtained feasible states and the classification of the state transitions, we enumerate the feasible states incoming to and outgoing from a general state, the transfer rates, and the conditions. With this information, we generate multiple equations about the state transitions. Finally, by solving them, we obtain the probabilities that a connected component is in each state and calculate the unavailability of a function. Numerical results show that the average unavailability of a function is reduced by 18% and 5.7% in our two examined cases by allowing each function to be protected by multiple servers.
Erik DAHLMAN Gunnar MILDH Stefan PARKVALL Patrik PERSSON Gustav WIKSTRÖM Hideshi MURAI
The paper provides an overview of the current status of the 5G evolution as well as a research outlook on the future wireless-access evolution towards 6G.
Jiafeng MAO Qing YU Kiyoharu AIZAWA
A well-annotated dataset is crucial to the training of object detectors. However, producing finely annotated datasets for object detection tasks is extremely labor-intensive. Crowdsourcing is therefore often used to create datasets, which then tend to contain incorrect annotations such as inaccurately localized bounding boxes. In this study, we highlight the problem of object detection with noisy bounding box annotations and show that these noisy annotations are harmful to the performance of deep neural networks. To solve this problem, we further propose a framework that allows the network to correct the noisy datasets by alternating refinement. The experimental results demonstrate that our proposed framework can significantly alleviate the influence of noise on model performance.
Kakeru MATSUBARA Shun KUROKI Koki ITO Kazushi SHIMADA Kazuki MARUTA Chang-Jun AHN
This letter expands the previously proposed High Time Resolution Carrier Interferometry (HTRCI) to estimate a larger amount of channel state information (CSI). HTRCI is based on a comb-type pilot symbol in OFDM, and the CSI for null subcarriers is interpolated by time-domain signal processing. In order to utilize such null pilot subcarriers for increasing the estimable CSI, they should generally be separated in the frequency domain prior to the estimation and interpolation processes. The main proposal is a separation scheme that works in conjunction with the HTRCI treatment of the temporal domain. Its effectiveness is verified by pilot de-contamination in a downlink two-cell MIMO transmission scenario. Bit error rate (BER) performance is improved in comparison with conventional HTRCI and zero padding (ZP), which replaces the impulse response alias with zeros.
Natsuki UENO Shoichi KOYAMA Hiroshi SARUWATARI
We propose a useful formulation for ill-posed inverse problems in Hilbert spaces with nonlinear clipping effects. Ill-posed inverse problems are often formulated as optimization problems, and nonlinear clipping effects may cause nonconvexity or nondifferentiability of the objective functions in the case of commonly used regularized least squares. To overcome these difficulties, we present a tractable formulation in which the objective function is convex and differentiable with respect to optimization variables, on the basis of the Bregman divergence associated with the primitive function of the clipping function. By using this formulation in combination with the representer theorem, we need only to deal with a finite-dimensional, convex, and differentiable optimization problem, which can be solved by well-established algorithms. We also show two practical examples of inverse problems where our theory can be applied, estimation of band-limited signals and time-harmonic acoustic fields, and evaluate the validity of our theory by numerical simulations.
In this letter, we prove that chaotic binary sequences generated by the tent map and Walsh functions are i.i.d. (independent and identically distributed) and orthogonal to each other.
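As a rough illustration of how a binary sequence can be drawn from the tent map, the sketch below iterates the map and thresholds each orbit value at 1/2. The threshold rule is the simplest possible choice and is an assumption; the letter's Walsh-function-based construction is not reproduced here. Exact rational arithmetic is used because a double-precision orbit of this map typically collapses to 0 within a few dozen iterations.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def tent_map(x):
    """One step of the tent map on [0, 1]: 2x below 1/2, 2(1 - x) above."""
    return 2 * x if x < HALF else 2 * (1 - x)


def tent_binary_sequence(x0, n):
    """Threshold the orbit x0, f(x0), f(f(x0)), ... at 1/2 to get n bits.

    x0 should be a Fraction strictly between 0 and 1. An odd denominator
    keeps the orbit from ever reaching 0, although the orbit is then
    periodic, so this is a toy example rather than a usable generator.
    """
    bits = []
    x = x0
    for _ in range(n):
        bits.append(1 if x >= HALF else 0)
        x = tent_map(x)
    return bits


bits = tent_binary_sequence(Fraction(12345, 1000003), 32)
```

Statistical properties such as independence refer to initial values drawn from the map's invariant distribution, which a single rational starting point like the one above cannot demonstrate.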