Kazuki OMI Jun KIMATA Toru TAMAKI
In this paper, we propose a multi-domain learning model for action recognition. The proposed method inserts domain-specific adapters between the domain-independent layers of a backbone network. Unlike a multi-head network, which switches only the classification heads, our model also switches the adapters, which facilitates learning feature representations universal to multiple domains. Unlike prior works, the proposed method is model-agnostic and does not assume a particular model structure. Experimental results on three popular action recognition datasets (HMDB51, UCF101, and Kinetics-400) demonstrate that the proposed method is more effective than a multi-head architecture and more efficient than separately training models for each domain.
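The switching idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy layers, the residual form of the adapter, the gain values, and the names `make_adapter`, `make_head`, and `MultiDomainModel` are all assumptions made for illustration. Only the structure follows the abstract: shared layers are reused for every domain, while the adapters and the classification head are selected per domain.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_adapter(gain: float) -> Layer:
    # Hypothetical residual adapter: output = x + gain * x.
    return lambda x: [v + gain * v for v in x]


def make_head(num_classes: int) -> Layer:
    # Toy linear head: class c receives the score (c + 1) * sum(x).
    return lambda x: [(c + 1) * sum(x) for c in range(num_classes)]


class MultiDomainModel:
    """Shared backbone layers with per-domain adapters and heads."""

    def __init__(self, shared_layers: List[Layer],
                 adapters: Dict[str, List[Layer]],
                 heads: Dict[str, Layer]) -> None:
        self.shared_layers = shared_layers
        self.adapters = adapters
        self.heads = heads

    def forward(self, x: Vector, domain: str) -> Vector:
        # Each shared layer is followed by the adapter of the chosen domain.
        for layer, adapter in zip(self.shared_layers, self.adapters[domain]):
            x = adapter(layer(x))
        # The classification head is switched together with the adapters.
        return self.heads[domain](x)


# Two toy domain-independent layers (assumed, for illustration only).
shared = [lambda x: [2.0 * v for v in x], lambda x: [v + 1.0 for v in x]]

model = MultiDomainModel(
    shared_layers=shared,
    adapters={
        "hmdb51": [make_adapter(0.5), make_adapter(0.5)],
        "ucf101": [make_adapter(0.25), make_adapter(0.25)],
    },
    heads={"hmdb51": make_head(51), "ucf101": make_head(101)},
)
```

Calling `model.forward([1.0, 2.0], "hmdb51")` and `model.forward([1.0, 2.0], "ucf101")` runs the same shared layers but yields different features and a different number of class scores, which is the difference from a multi-head network that would switch only the final head.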
Xina CHENG Yiming ZHAO Takeshi IKENAGA
Real-time 3D player tracking plays an important role in sports analysis, especially for live sports broadcasting services, which impose a strict limit on processing time. For these applications, the 3D trajectories of players contribute to high-level game analysis such as tactical analysis and to commercial applications such as TV content. A real-time implementation of 3D player tracking is therefore expected. To achieve real-time processing of 60fps videos with high accuracy, the processing time must be less than 16.67ms per frame. The factors that limit the processing time of the target algorithm include: 1) the large image area of each player; 2) the repeated processing of multiple players in multiple views; 3) the complex calculation of the observation algorithm. To deal with these challenges, this paper proposes a real-time implementation of multi-view volleyball player tracking on a GPU device, based on representative spatial selection and temporal combination. First, the representative spatial pixel selection, which detects the pixels that best represent an image region in order to scale down the image spatially, reduces the number of pixels to be processed. Second, the representative temporal likelihood combination shares the observation calculation by exploiting the temporal correlation between images, so that the number of complex calculations is reduced. The experiments are based on videos of the Final and Semi-Final Games of the 2014 Japan Inter High School Games of Men's Volleyball at the Tokyo Metropolitan Gymnasium. On a GeForce GTX 1080Ti GPU, the tracking system achieves real-time processing of 60fps videos while keeping the tracking accuracy above 97%.
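The 16.67ms figure quoted above follows directly from the frame rate. The short sketch below reproduces that arithmetic; the function names `frame_budget_ms` and `is_real_time` are hypothetical helpers introduced for illustration and are not part of the tracking system.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def is_real_time(per_frame_ms: float, fps: float) -> bool:
    """True if the measured per-frame time fits within the frame budget."""
    return per_frame_ms <= frame_budget_ms(fps)


# At 60 fps the budget is 1000 / 60 ms, about 16.67 ms per frame.
budget_60fps = frame_budget_ms(60.0)
```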
Cheng LUO Wei CAO Lingli WANG Philip H. W. LEONG
With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented based on QR-DNN with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture to perform shift-and-accumulate operations instead of multiplication operations. QR-DNN is shown to achieve a 1∼2% improvement in accuracy over existing techniques, and RNA over previous best fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS and 91.41% top-5 accuracy for the ResNet-50 benchmark, and state-of-the-art results are also reported for AlexNet and VGG.
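The reason a logarithmic weight representation removes multipliers can be shown with a small sketch. It assumes base-2 logarithmic quantization and non-negative integer activations; the rounding rule and the names `quantize_log2`, `shift`, and `shift_accumulate` are illustrative assumptions and do not describe the actual QR-DNN or RNA design.

```python
import math
from typing import List, Tuple

LogWeight = Tuple[int, int]  # (sign, exponent), meaning sign * 2**exponent


def quantize_log2(w: float) -> LogWeight:
    """Round a weight to the nearest power of two in the log domain."""
    if w == 0:
        return (0, 0)
    sign = 1 if w > 0 else -1
    return (sign, round(math.log2(abs(w))))


def shift(x: int, exponent: int) -> int:
    """Multiply a non-negative integer by 2**exponent using shifts only."""
    return x << exponent if exponent >= 0 else x >> -exponent


def shift_accumulate(activations: List[int], weights: List[LogWeight]) -> int:
    """Dot product in which every multiplication is replaced by a shift."""
    acc = 0
    for x, (sign, exponent) in zip(activations, weights):
        acc += sign * shift(x, exponent)
    return acc


# Weights 0.5, -0.25 and 2.0 are exact powers of two, so nothing is lost here.
qweights = [quantize_log2(w) for w in (0.5, -0.25, 2.0)]
result = shift_accumulate([8, 16, 3], qweights)  # 8*0.5 - 16*0.25 + 3*2.0
```

Weights that are not exact powers of two are rounded (for example, 0.3 becomes 0.25), and right shifts discard low-order bits, which is where the accuracy cost of such a scheme would come from.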
Kunihiko AKAHANE Takahiro ISHINABE Yosei SHIBATA Hideo FUJIKAKE
We show that the light leakage that occurs in reflective polarizers at large angles of incidence can be suppressed by using anisotropic dielectric multilayers with larger refractive indices in the thickness direction, and that the interference-included 2×2 Jones matrix method is useful for investigating the optical propagation properties of the dielectric multilayers. The thickness of the reflective polarizer can also be reduced by optimizing the distribution of the multilayers in the stack while taking visual sensitivity into account. These results indicate that it is possible to realize a high-quality liquid crystal display with wide viewing angles and high light utilization efficiency.
Jin-Woo BAE Seung-Hyun LEE Ji-Sang YOO
In this paper, we propose a wavelet-based fast motion estimation algorithm for video sequence encoding at a low bit-rate. By using one of the properties of the wavelet transform, multi-resolution analysis (MRA), together with the spatial interpolation of an image, we can simultaneously reduce the prediction error and the computational complexity inherent in video sequence encoding. In addition, by defining a significant block (SB) based on the differential information of wavelet coefficients between successive frames, the proposed algorithm compensates for the increase in the number of motion vectors that occurs when the multi-resolution motion estimation (MRME) algorithm is used. As a result, we are able not only to improve the peak signal-to-noise ratio (PSNR), but also to reduce the computational complexity by up to 67%.
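For reference, the PSNR metric named above is conventionally computed from the mean squared error between the original and reconstructed frames. The sketch below uses the standard definition; the peak value of 255 assumes 8-bit samples, and frames are represented as flat lists of pixel values purely for illustration.

```python
import math
from typing import Sequence


def psnr(original: Sequence[float], reconstructed: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)
```

A lower prediction error gives a lower mean squared error and therefore a higher PSNR.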
Lachlan B. MICHAEL Ryuji KOHNO
To reduce the bandwidth needed for data transmission in an ad-hoc communication network, such as an Intelligent Transportation System (ITS) inter-vehicle communication network, a broadcast scheme is proposed in which the data to be transmitted is arranged into several classes. Each successive class contains more specific and detailed information. Since the information to be transmitted often has geographical relevance, the classes can be structured to represent this relationship. As data is routed through the ad-hoc network, the total amount of transmitted data is reduced by removing the data contained in one class on each hop. The class structure is adaptive, so that in unforeseen situations the relative importance of transmitted data can be dynamically adjusted. Furthermore, different manufacturers can implement different class structures, and the total length of the data may differ. Computer simulation shows that the proposed system can reduce the bandwidth required to achieve data reception rates similar to those of conventional non-structured data schemes to less than one third, resulting in a more efficient transmission scheme. In addition, a packet structure similar to that of IP packets is proposed, which will enable easy integration of multimedia transmissions into vehicle-to-vehicle communications.
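The per-hop reduction can be sketched as follows. The abstract does not say which class is removed at each hop, so this sketch assumes the most detailed class is dropped first; the class labels, payload sizes, and the names `relay` and `bytes_per_hop` are hypothetical.

```python
from typing import List, Tuple

DataClass = Tuple[str, bytes]  # (label, payload), ordered from general to detailed


def relay(classes: List[DataClass]) -> List[DataClass]:
    """Forward a message one hop, dropping its most detailed class (assumed rule)."""
    return classes[:-1]


def bytes_per_hop(classes: List[DataClass], hops: int) -> List[int]:
    """Payload size at the source and after each of the given number of hops."""
    sizes = []
    current = list(classes)
    for _ in range(hops + 1):
        sizes.append(sum(len(payload) for _, payload in current))
        current = relay(current)
    return sizes


# Toy message with three classes of increasing detail (sizes are made up).
message = [("warning", b"W" * 10), ("position", b"P" * 30), ("details", b"D" * 60)]
sizes = bytes_per_hop(message, 3)
```

With these made-up sizes, the payload shrinks from 100 bytes at the source to 40, 10, and finally 0 bytes on successive hops, so vehicles far from the source relay only the most general information.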
Hironori WAKANA Masaki FUJIBAYASHI Noriyoshi FUSHIMI Osamu MICHIKAMI
Depositing insulating layers on oxide superconducting films generally causes the films to deteriorate. When an insulating multilayer of CeO2(50 )
This paper describes wafer-level, three-dimensional packaging for MEMS in which sensors, actuators, electronic circuits and other functions are combined in one integrated block. Si wafers with built-in MEMS functions were integrated with no change in thickness to ensure mechanical strength and improve heat dissipation. Throughout the three-dimensional integration process, the Si wafers were processed at temperatures below 400°C to prevent degradation of their built-in functions. We describe the low-temperature oxidation technology we developed, which makes through-holes of high density and high aspect ratio in Si wafers with built-in functions by the Optical Excitation Electropolishing Method (OEEM) and forms an oxide film on the hole walls simply by replacing the electrolyte. We then describe the bumpless interconnection method, which fills the through-holes of stacked layers with metal by the molten metal suction method, and the use of the electrocapillary effect as a countermeasure to prevent the filler metal from dropping out of the holes under its own weight.
Kiyomi NAKAMURA Shingo MIYAMOTO
Although artificial neural networks have been actively applied to object shape recognition, little attention has been paid to the recognition of spatial elements (e.g. position, rotation and size). In the present study, a rotation and size spreading associative neural network (RS-SAN net) is proposed, and its efficacy in recognizing object orientation (rotation), size and shape is shown. The RS-SAN net is motivated by the fact that the spatial recognition system in the brain (parietal cortex) is involved in both the spatial (e.g. position, rotation and size) and shape recognition of an object. The RS-SAN net uses spatial spreading by spreading layers, generalized inverse learning and population vector methods to recognize the object. The orientation and size information of the object is spread by double spreading layers whose tuning characteristics are similar to those of spatial discrimination neurons (e.g. axis orientation neurons and size discrimination neurons) in the parietal cortex. The RS-SAN net simultaneously recognizes the size of the object irrespective of its orientation and shape, the orientation irrespective of its size and shape, and the shape irrespective of its size and orientation.
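The population vector method mentioned above can be illustrated generically: each neuron votes for its preferred orientation with a weight equal to its response, and the angle of the summed vector is the estimate. The sketch below is not the RS-SAN net itself. The cosine tuning curve with a baseline of 1, the 12 evenly spaced preferred orientations, and the function names are assumptions made for illustration.

```python
import math
from typing import List


def tuning_response(stimulus_deg: float, preferred_deg: float) -> float:
    """Assumed cosine tuning: response peaks when stimulus equals preference."""
    return 1.0 + math.cos(math.radians(stimulus_deg - preferred_deg))


def population_vector_deg(responses: List[float],
                          preferred_degs: List[float]) -> float:
    """Decode an orientation as the angle of the response-weighted vector sum."""
    x = sum(r * math.cos(math.radians(p)) for r, p in zip(responses, preferred_degs))
    y = sum(r * math.sin(math.radians(p)) for r, p in zip(responses, preferred_degs))
    return math.degrees(math.atan2(y, x)) % 360.0


# Twelve hypothetical neurons with preferred orientations every 30 degrees.
preferred = [30.0 * i for i in range(12)]
responses = [tuning_response(50.0, p) for p in preferred]
decoded = population_vector_deg(responses, preferred)
```

Although no neuron prefers exactly 50 degrees, the decoded orientation recovers the stimulus angle up to floating-point error, because the estimate interpolates between the preferred orientations.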
Carla Denise CASTANHO Wei CHEN Koichi WADA Akihiro FUJIWARA
P-complete problems seem to have no parallel algorithm which runs in polylogarithmic time using a polynomial number of processors. A P-complete problem is in the class EP (Efficient and Polynomially fast) if and only if there exists a cost optimal algorithm to solve it in $T(n) = O(t(n)^{\varepsilon})$ ($\varepsilon < 1$) using $P(n)$ processors such that $T(n) \cdot P(n) = O(t(n))$, where $t(n)$ is the time complexity of the fastest sequential algorithm which solves the problem. The goal of our research is to find EP parallel algorithms for some P-complete problems. In this paper first we consider the convex layers problem. We give an algorithm for computing the convex layers of a set $S$ of $n$ points in the plane. Let $k$ be the number of the convex layers of $S$. When $1 \le k \le n^{\varepsilon/2}$ ($0 \le \varepsilon < 1$) our algorithm runs in $O((n \log n)/p)$ time using $p$ processors, where $1 \le p \le n^{1-\varepsilon/2}$, and it is cost optimal. Next, we consider the envelope layers problem of a set $S$ of $n$ line segments in the plane. Let $k$ be the number of the envelope layers of $S$. When $1 \le k \le n^{\varepsilon/2}$ ($0 \le \varepsilon < 1$), we propose an algorithm for computing the envelope layers of $S$ in $O((n \alpha(n) \log^3 n)/p)$ time using $p$ processors, where $1 \le p \le n^{1-\varepsilon/2}$, and $\alpha(n)$ is the functional inverse of Ackermann's function which grows extremely slowly. The computational model we use in this paper is the CREW-PRAM. Our first algorithm, for the convex layers problem, belongs to EP, and the second one, for the envelope layers problem, belongs to the class EP if a small factor of $\log n$ is ignored.
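To make the convex layers problem concrete, the following sketch computes the layers sequentially by repeatedly peeling off the convex hull. It is a simple illustration of the problem definition, not the parallel CREW-PRAM algorithm of the paper. It uses Andrew's monotone chain for each hull, takes O(k n log n) time for k layers, and assumes points in general position (a point lying strictly inside a hull edge is treated as interior and falls to a later layer).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of OA and OB; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def convex_layers(points: List[Point]) -> List[List[Point]]:
    """Peel hulls off the point set until no points remain."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


# Two nested squares around a centre point give k = 3 layers.
sample = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (3, 1), (3, 3), (1, 3), (2, 2)]
layers = convex_layers(sample)
```

Each layer depends on the points removed by the previous one, which gives an intuition for why the problem is hard to parallelize and why the bounds above are stated in terms of the number of layers k.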
Yoshihiro NAKA Hiroyoshi IKUNO Masahiko NISHIMOTO Akira YATA
We present a finite-difference time-domain (FD-TD) method with the perfectly matched layer (PML) absorbing boundary condition (ABC) based on multidimensional wave digital filters (MD-WDFs) for discrete-time modelling of Maxwell's equations and show its effectiveness. First, we propose modified forms of Maxwell's equations in the PMLs and their MD-WDF representation using current-controlled voltage sources. To estimate the lower bound of the numerical errors that come from the discretization of Maxwell's equations, we examine the numerical dispersion relation and show the advantage of the FD-TD method based on the MD-WDFs over the Yee algorithm. We also estimate the numerical errors in practical problems as a function of grid cell size and show that the MD-WDFs obtain highly accurate numerical solutions in comparison with the Yee algorithm. We then analyze several typical dielectric optical waveguide problems, such as the tapered waveguide and the grating filter, and confirm that the FD-TD method based on the MD-WDFs can also treat radiation and reflection phenomena, which is commonly done using the Yee algorithm.
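As background for the dispersion comparison, the numerical dispersion of the reference Yee scheme can be evaluated in closed form in one dimension, where it satisfies sin(ωΔt/2) = S sin(kΔx/2) with Courant number S = cΔt/Δx. The sketch below covers only this 1D Yee case and says nothing about the MD-WDF formulation; the function name and the example parameters are assumptions made for illustration.

```python
import math


def yee_phase_velocity_ratio(cells_per_wavelength: float, courant: float) -> float:
    """Numerical phase velocity divided by c for the 1D Yee scheme.

    cells_per_wavelength is the free-space wavelength divided by the cell
    size, and courant is S = c * dt / dx with 0 < S <= 1. The wave is assumed
    to be well resolved, so that the asin argument stays within [-1, 1].
    """
    omega_dt = 2.0 * math.pi * courant / cells_per_wavelength
    # Solve sin(omega * dt / 2) = S * sin(k * dx / 2) for the numerical k * dx.
    k_dx = 2.0 * math.asin(math.sin(omega_dt / 2.0) / courant)
    # Exact k * dx is 2 * pi / N; the ratio of the two gives v_numerical / c.
    return (2.0 * math.pi / cells_per_wavelength) / k_dx


# Ten cells per wavelength at S = 0.5: the numerical wave is slightly slow.
ratio = yee_phase_velocity_ratio(10.0, 0.5)
```

At S = 1 the ratio is exactly 1 in one dimension, while at S = 0.5 with ten cells per wavelength the numerical wave travels roughly 1.3% slower than c, which is the kind of grid-dependent error a dispersion analysis quantifies.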
Shigeki NAKAGAWA Masahiko NAOE
Co-Zr and Co-Zr-Ta amorphous films were prepared by the Kr sputtering method for use as the backlayers of Co-Cr perpendicular magnetic recording tape media. The effect of the addition of Ta to Co-Zr thin films was also investigated. A lower substrate temperature was required to prepare amorphous Co-Zr films with excellent soft magnetic properties. The relationships among Ta content X, magnetostriction constant λ and magnetic characteristics such as coercivity Hc and relative permeability µr were clarified. A method of evaluating λ of soft magnetic thin films deposited on a polymer sheet substrate is presented. Films with a composition of (Co95.7Zr4.3)100-XTaX at X = 10 at.% possessed sufficiently soft magnetic properties, such as a low Hc below 80 A/m and a high µr above 600. The addition of Ta was effective in changing the sign of λ from positive to negative. It was found that the negative magnetoelastic energy and the smaller λ caused the soft magnetism.