Marcus BARKOWSKY Enrico MASALA Glenn VAN WALLENDAEL Kjell BRUNNSTRÖM Nicolas STAELENS Patrick LE CALLET
The current development of video quality assessment algorithms suffers from the lack of available video sequences for training, verification and validation to determine and enhance the algorithm's application scope. The Joint Effort Group of the Video Quality Experts Group (VQEG-JEG) is currently driving efforts towards the creation of large-scale, reproducible, and easy-to-use databases. These databases will contain bitstreams of recent video encoders (H.264, H.265), packet loss impairment patterns and impaired bitstreams, bitstream information pre-parsed into XML files, and well-known objective video quality measurement outputs. The database is continuously updated and enlarged using reproducible processing chains. Currently, more than 70,000 sequences are available for statistical analysis of video quality measurement algorithms. New research questions are posed as the database is designed to verify and validate models on a very large scale, testing and validating various scopes of application, while subjective assessment has to be limited to a comparably small subset of the database. Special focus is given to the principles guiding the database development, and some results are given to illustrate the practical usefulness of such a database with respect to the detailed new research questions.
Jiayi ZHU Dajiang ZHOU Shinji KIMURA Satoshi GOTO
High efficiency video coding (HEVC) is the new-generation video compression standard. Sample adaptive offset (SAO) is a new compression tool adopted in HEVC which reduces the distortion between original samples and reconstructed samples. SAO estimation is the process of determining SAO parameters in video encoding. It is divided into two phases: statistics collection and parameter determination. There are two difficulties for VLSI implementation of SAO estimation. The first is that there is a huge number of samples to deal with in the statistics collection phase. The other is that the complexity of rate-distortion optimization (RDO) in the parameter determination phase is very high. In this article, a fast SAO estimation algorithm and its corresponding VLSI architecture are proposed. For the first difficulty, we use bitmaps to collect statistics of all 16 samples in one 4×4 block simultaneously. For the second difficulty, we simplify a series of complicated procedures in HM to balance algorithm complexity and BD-rate performance. Experimental results show that the proposed algorithm maintains the picture quality improvement. The VLSI design based on this algorithm can be implemented using 156.32 K gates and 8,832 bits of single-port RAM for the 8-bit depth case. It can be synthesized at 400 MHz in 65 nm technology and is capable of 8K×4K @ 120 fps encoding.
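The statistics collection step described above can be illustrated in software. The sketch below is a simplified, sequential illustration of HEVC SAO edge-offset statistics over one 1-D sample row (horizontal EO class), not the authors' bitmap-based 4×4-parallel VLSI design; the function names and toy samples are ours.

```python
# Minimal SAO edge-offset (EO) statistics sketch. For each reconstructed
# sample, the edge category is derived from its signs against the two
# neighbors; an encoder accumulates per-category counts and the sums of
# (original - reconstructed) differences, from which offsets are derived.

def sign(x):
    return (x > 0) - (x < 0)

def collect_eo_stats(orig, recon):
    """Return {category: [count, diff_sum]} for a 1-D sample row."""
    stats = {c: [0, 0] for c in range(5)}  # categories 0..4; 0 = no offset
    for i in range(1, len(recon) - 1):
        edge = sign(recon[i] - recon[i - 1]) + sign(recon[i] - recon[i + 1])
        # HEVC maps the edge index {-2,-1,0,1,2} to categories {1,2,0,3,4}
        cat = {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[edge]
        stats[cat][0] += 1
        stats[cat][1] += orig[i] - recon[i]
    return stats

def best_offsets(stats):
    """Distortion-minimizing offset per category: round(diff_sum / count)."""
    return {c: (round(s / n) if n else 0) for c, (n, s) in stats.items()}
```

In the real encoder these statistics feed the RDO stage that the paper simplifies; here they only show what the collection phase has to accumulate per block.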
Jun YAO Yasuhiko NAKASHIMA Naveen DEVISETTI Kazuhiro YOSHIMURA Takashi NAKADA
General purpose many-core architectures (MCAs) such as GPGPUs have recently been used widely to continue performance scaling as the continuous increase in working frequency has approached manufacturing limitations. However, both the general purpose MCA and its building block, the general purpose processor (GPP), lack a tuning capability to boost energy efficiency for individual applications, especially computation-intensive applications. As an alternative to the above MCA platforms, we propose in this paper our LAPP (Linear Array Pipeline) architecture, which takes a special-purpose reconfigurable structure for optimal MIPS/W while keeping backward binary compatibility, a feature absent from most special-purpose hardware. More specifically, we used a general purpose VLIW processor, interpreting a commercial VLIW ISA, as the baseline frontend to provide backward binary compatibility. We also extended the functional unit (FU) stage into an FU array to form the reconfigurable backend for efficient execution of program hotspots to exploit parallelism. The hardware modules in this general purpose reconfigurable architecture have been locally zoned into several groups so that low-power techniques suited to each module's hardware features can be applied. Our results show that under comparable performance, the tightly coupled general/special purpose hardware, based on a 180 nm cell library, can achieve 10.8 times the MIPS/W of an MCA with the same technology features. When a 65 nm technology node is assumed, a similar 9.4× gain in MIPS/W can be achieved by the LAPP without changing program binaries.
Hiroaki KONOURA Dawood ALNAJJAR Yukio MITSUYAMA Hajime SHIMADA Kazutoshi KOBAYASHI Hiroyuki KANBARA Hiroyuki OCHI Takashi IMAGAWA Kazutoshi WAKABAYASHI Masanori HASHIMOTO Takao ONOYE Hidetoshi ONODERA
This paper proposes a mixed-grained reconfigurable architecture consisting of fine-grained and coarse-grained fabrics, each of which can be configured for different levels of reliability depending on the reliability requirement of target applications, ranging from mission-critical applications to consumer products. Thanks to the fine-grained fabrics, the architecture can accommodate a state machine, which is indispensable for exploiting C-based behavioral synthesis to trade latency for resource usage through multi-step processing using dynamic reconfiguration. In implementing the architecture, the strategy of dynamic reconfiguration, the assignment of configuration storage and the number of implementable states are key factors that determine the achievable trade-off between used silicon area and latency. We thus split the configuration bits into two classes, state-wise configuration bits and state-invariant configuration bits, to minimize the area overhead of configuration bit storage. Through a case study, we experimentally explore the appropriate number of implementable states. A proof-of-concept VLSI chip was fabricated in a 65 nm process. Measurement results show that applications on the chip can operate in a harsh radiation environment. Irradiation tests also show the correlation between the number of sensitive bits and the mean time to failure. Furthermore, the temporal error rate of an example application due to soft errors in the datapath was measured and demonstrated for reliability-aware mapping.
Masanori HIROTOMO Masakatu MORII
In this paper, we propose an efficient method for computing the weight spectrum of LDPC convolutional codes based on the circulant matrices of quasi-cyclic codes. In the proposed method, we reduce the memory size of their parity-check matrices while keeping the same distance profile as the original codes, and apply a forward and backward tree search algorithm to the parity-check matrices of reduced memory. We show numerical results of computing the free distance and the low-part weight spectrum of LDPC convolutional codes with memory of about 130.
Ye GAO Masayuki SATO Ryusuke EGAWA Hiroyuki TAKIZAWA Hiroaki KOBAYASHI
Vector processors have significant advantages for next generation multimedia applications (MMAs). One of the advantages is that vector processors can achieve high data transfer performance by using a high bandwidth memory sub-system, resulting in a high sustained computing performance. However, the high bandwidth memory sub-system usually leads to enormous costs in terms of chip area, power and energy consumption. These costs are too expensive for commodity computer systems, which are the main execution platform of MMAs. This paper proposes a new multi-banked cache memory for commodity computer systems called MVP-cache in order to expand the potential of vector architectures on MMAs. Unlike conventional multi-banked cache memories, which employ one tag array and one data array in a sub-cache, MVP-cache associates one tag array with multiple independent data arrays of small-sized cache lines. In this way, MVP-cache achieves lower static power consumption in its tag arrays. MVP-cache can also achieve high efficiency on short vector data transfers because the flexibility of data transfers can be improved by independently controlling the data transfers of each data array.
Yao ZHENG Limin XIAO Wenqi TANG Lihong SHANG Guangchao YAO Li RUAN
The dynamic time warping (DTW) algorithm is widely used in time series similarity search. As DTW has quadratic time complexity, the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms. In this paper, we present a parallel approach for DTW on a heterogeneous platform with a graphics processing unit (GPU). In order to exploit fine-grained data-level parallelism, we propose a specific parallel decomposition in DTW. Furthermore, we introduce an optimization technique called diamond tiling to improve the utilization of threads. Results show that our approach substantially reduces computational time.
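For reference, the quadratic-time DTW recurrence at the heart of the approach can be sketched as below. This is a sequential baseline, not the GPU version; the anti-diagonals of the cost matrix are the independent cells that a data-parallel decomposition such as diamond tiling can process concurrently.

```python
# Classic O(n*m) DTW: D[i][j] holds the minimum cumulative alignment cost
# of the first i samples of a against the first j samples of b. Each cell
# depends only on its left, upper, and upper-left neighbors, so all cells
# on one anti-diagonal can be computed in parallel.

import math

def dtw(a, b):
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

For example, `dtw([1, 2, 3], [1, 2, 2, 3])` is 0, since warping absorbs the repeated sample.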
Jun-Sang PARK Sung-Ho YOON Youngjoon WON Myung-Sup KIM
Internet traffic classification is an essential step for stable service provision. The payload signature classifier is considered a reliable method for Internet traffic classification but is prohibitively computationally expensive for real-time handling of large amounts of traffic on high-speed networks. In this paper, we describe several design techniques to minimize the search space of traffic classification and improve the processing speed of the payload signature classifier. Our suggestions are (1) selective matching algorithms based on signature type, (2) signature reorganization using hierarchical structure and traffic locality, and (3) early packet sampling within a flow. Each can be applied individually, or in any combination in sequence. The feasibility of our techniques is demonstrated via experimental evaluation on traffic traces from our campus and a commercial ISP. We observe a 2- to 5-fold improvement in processing speed over the untuned classification system and the Snort engine, while maintaining the same level of accuracy.
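Two of the listed ideas, selective matching by signature type and early packet sampling, can be sketched as follows. The signature sets, flow representation, and constants below are hypothetical illustrations, not the paper's actual signature database.

```python
# Sketch of (3) early packet sampling -- only the first few packets of a
# flow are inspected -- and (1) selective matching, where cheap exact-prefix
# signatures are tried before expensive substring signatures.

EARLY_PACKETS = 3  # inspect only the first N packets of each flow (assumed)

PREFIX_SIGS = {b"GET ": "HTTP", b"SSH-": "SSH"}          # cheap: payload head
SUBSTRING_SIGS = {b"BitTorrent protocol": "BitTorrent"}  # costly: body scan

def classify_flow(packets):
    """packets: list of payload byte strings, in arrival order."""
    for payload in packets[:EARLY_PACKETS]:
        for prefix, app in PREFIX_SIGS.items():      # selective: prefixes first
            if payload.startswith(prefix):
                return app
        for needle, app in SUBSTRING_SIGS.items():   # fall back to substrings
            if needle in payload:
                return app
    return "unknown"
```

A flow whose identifying payload appears only after the sampled packets stays "unknown", which is the accuracy-versus-speed trade-off that the paper's evaluation quantifies.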
Jie ZHANG Chuan XIAO Toyohide WATANABE Yoshiharu ISHIKAWA
Presentation slide composition is an important job for knowledge workers. Instead of starting from scratch, users tend to make new presentation slides by reusing existing ones. A primary challenge in slide reuse is to select desired materials from a collection of existing slides. The state-of-the-art solution utilizes texts and images in slides as well as file names to help users to retrieve the materials they want. However, it only allows users to choose an entire slide as a query but does not support the search for a single element such as a few keywords, a sentence, an image, or a diagram. In this paper, we investigate content-based search for a variety of elements in presentation slides. Users may freely choose a slide element as a query. We propose different query processing methods to deal with various types of queries and improve the search efficiency. A system with a user-friendly interface is designed, based on which experiments are performed to evaluate the effectiveness and the efficiency of the proposed methods.
Qing DU Yu LIU Dongping HUANG Haoran XIE Yi CAI Huaqing MIN
With the development of the Internet, there are more and more shared resources on the Web. Personalized search becomes increasingly important as users demand higher retrieval quality. Personalized search needs to take users' personalized profiles and information needs into consideration. Collaborative tagging (also known as folksonomy) systems allow users to annotate resources with their own tags (features) and thus provide a powerful way for organizing, retrieving and sharing different types of social resources. To capture and understand user preferences, a user is typically modeled as a vector of tag: value pairs (i.e., a tag-based user profile) in collaborative tagging systems. In such a tag-based user profile, a user's preference degree on a group of tags (i.e., a combination of several tags) mainly depends on the preference degree on every individual tag in the group. However, the preference degree on a combination of tags (a tag-group) cannot simply be obtained by linearly combining the preference on each tag: the combination of a user's two favorite tags may not be a favorite of the user. In this article, we examine the limitations of previous tag-based personalized search. To overcome these problems, we model a user profile based on combinations of tags (tag-groups) and then apply it to personalized search. By comparing it with the state-of-the-art methods, experimental results on a real data set show the effectiveness of our proposed user profile method.
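The contrast drawn above can be made concrete. In the sketch below, a plain tag-based profile scores a resource by summing per-tag preferences, while a tag-group profile keeps separate weights for tag combinations, so two individually liked tags can still carry a low joint weight. All weights are illustrative, not derived from the paper's data set.

```python
# Linear tag-based profile vs. a tag-group profile with per-combination
# weights. frozenset keys let a group's weight be looked up regardless of
# tag order.

from itertools import combinations

def tag_profile_score(resource_tags, tag_weights):
    """Linear model: sum of individual tag preferences."""
    return sum(tag_weights.get(t, 0.0) for t in resource_tags)

def tag_group_score(resource_tags, group_weights):
    """Tag-group model: weights keyed by frozensets of tags."""
    score = 0.0
    for r in range(1, len(resource_tags) + 1):
        for group in combinations(sorted(resource_tags), r):
            score += group_weights.get(frozenset(group), 0.0)
    return score

# Example: a user likes "jazz" and "live" separately but not together.
tag_w = {"jazz": 1.0, "live": 1.0}
group_w = {frozenset(["jazz"]): 1.0, frozenset(["live"]): 1.0,
           frozenset(["jazz", "live"]): -1.5}
```

Under the linear model the resource {"jazz", "live"} scores 2.0; under the tag-group model the negative joint weight pulls it down, capturing the non-linear preference the abstract describes.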
Tian LIANG Wei HENG Chao MENG Guodong ZHANG
In this paper, we consider multi-source multi-relay power allocation in cooperative wireless networks. A new intelligent optimization algorithm, multi-objective free search (MOFS), is proposed to efficiently allocate cooperative relay power to better support transmission from multiple sources. The existence of Pareto optimal solutions is analyzed for the proposed multi-objective power allocation model when the objectives conflict with each other, and the MOFS algorithm is validated using several test functions and metrics taken from the standard literature on evolutionary multi-objective optimization. Simulation results show that the proposed scheme can effectively find the potentially optimal solutions of the multi-objective power allocation problem and can optimize the tradeoff between network sum-rate and fairness in different applications by selecting the corresponding solution.
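The Pareto-optimality notion underlying the analysis above can be sketched minimally: among candidate power allocations scored on several objectives (e.g., sum-rate and fairness, both to be maximized), only the non-dominated ones are kept. The candidate vectors below are illustrative, not MOFS output.

```python
# Pareto dominance and front extraction for maximization objectives.

def dominates(a, b):
    """a dominates b if a >= b in every objective and > in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep each point not dominated by any other (brute force, O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Selecting one solution from the resulting front is exactly the per-application sum-rate/fairness trade-off the abstract mentions.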
Tao WANG Huaimin WANG Gang YIN Cheng YANG Xiang LI Peng ZOU
The large amount of freely available open source software on the Internet is fundamentally changing the traditional paradigms of software development. Efficient categorization of the massive number of projects for retrieving relevant software is of vital importance for Internet-based software development activities such as solution searching and best-practice learning. Many previous works have addressed software categorization by mining source code or byte code, but were verified on only relatively small collections of projects with coarse-grained categories or clusters. However, Internet-based software development requires finer-grained, more scalable and language-independent categorization approaches. In this paper, we propose a novel approach to hierarchically categorize software projects based on their online profiles. We design an SVM-based categorization framework and adopt a weighted combination strategy to aggregate different types of profile attributes from multiple repositories. Different basic classification algorithms and feature selection techniques are employed and compared. Extensive experiments are carried out on more than 21,000 projects across five repositories. The results show that our approach achieves significant improvements by using weighted combination. Compared to previous work, our approach presents competitive results with a finer-grained, multi-layered category hierarchy of more than 120 categories. Unlike approaches that use source code or byte code, our approach is more effective for large-scale and language-independent software categorization. In addition, experiments suggest that hierarchical categorization combined with general keyword-based searching improves retrieval efficiency and accuracy.
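The weighted combination strategy described above can be sketched as follows: per-attribute classifier scores (e.g., from a project's name, description, and tags) are aggregated with attribute weights before a category is chosen. The scores, weights, and category names are illustrative; the paper's underlying classifiers are SVM-based.

```python
# Weighted combination of per-attribute category scores. Each profile
# attribute contributes its classifier's scores, scaled by the attribute's
# weight; the category with the highest combined score wins.

def combine(scores_by_attr, attr_weights):
    """scores_by_attr: {attr: {category: score}} -> best category."""
    combined = {}
    for attr, scores in scores_by_attr.items():
        w = attr_weights.get(attr, 0.0)
        for cat, s in scores.items():
            combined[cat] = combined.get(cat, 0.0) + w * s
    return max(combined, key=combined.get)
```

Here a description-heavy weighting can override a misleading project name, which is the kind of gain the weighted combination experiments measure.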
This paper proposes a robust and fast lyric search method for music information retrieval (MIR). The effectiveness of lyric search systems based on full-text retrieval engines or web search engines is highly compromised when the queries of lyric phrases contain incorrect parts due to mishearing. To improve the robustness of the system, the authors introduce acoustic distance, which is computed based on a confusion matrix of an automatic speech recognition experiment, into Dynamic-Programming (DP)-based phonetic string matching to identify the songs that the misheard lyric phrases refer to. An evaluation experiment verified that the search accuracy is increased by 4.4% compared with the conventional method. Furthermore, in this paper a two-pass search algorithm is proposed to realize real-time execution. The algorithm pre-selects the probable candidates using a rapid index-based search in the first pass and executes a DP-based search process with an adaptive termination strategy in the second pass. Experimental results show that the proposed search method reduced processing time by more than 86.2% compared with the conventional methods for the same search accuracy.
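The DP-based phonetic matching described above amounts to a weighted edit distance in which the substitution cost between two phonemes is their acoustic distance. The sketch below uses a tiny illustrative cost table in place of a confusion-matrix-derived one.

```python
# Weighted edit distance over phoneme sequences: acoustically confusable
# phonemes get a low substitution cost, so misheard lyrics still match
# the intended song at small distance.

def acoustic_distance(query, target, sub_cost, ins_del=1.0):
    """DP over an (n+1) x (m+1) table of cumulative costs."""
    n, m = len(query), len(target)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * ins_del
    for j in range(1, m + 1):
        D[0][j] = j * ins_del
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = 0.0 if query[i - 1] == target[j - 1] else sub_cost.get(
                (query[i - 1], target[j - 1]), 1.0)
            D[i][j] = min(D[i - 1][j] + ins_del,      # deletion
                          D[i][j - 1] + ins_del,      # insertion
                          D[i - 1][j - 1] + s)        # substitution
    return D[n][m]

# Illustrative: nasals /m/ and /n/ are easily confused, so their cost is low.
SUB = {("m", "n"): 0.2, ("n", "m"): 0.2}
```

The paper's two-pass algorithm would run a cheap index-based pre-selection before applying this DP, with adaptive termination, to the surviving candidates.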
For break arcs occurring between Ag and Ag/SnO$_2$ (12 wt%) electrical contact pairs, the electrical conductivity, viscosity and specific heat at constant pressure are calculated as thermodynamic and transport properties. Mixture rates of contact material vapor are 0%, 1%, 10% and 100%. The influence of the contact material on the properties is investigated. The temperature for the calculation ranges from 2,000 K to 20,000 K. The following results are shown. When the mixture rate is changed, the electrical conductivity varies at lower temperatures (< 10,000 K), while the viscosity and specific heat vary widely over the whole temperature range. The electrical conductivity is independent of the mixture rate when the temperature exceeds 10,000 K. The thermodynamic and transport properties are independent of the kind of contact material.
Chen LI Zhenbiao LI Qian WANG Du LIU Makoto HASEGAWA Lingling LI
To clarify the dependence of arc duration on atmosphere, experiments were conducted in air, N$_{2}$, Ar, He and CO$_{2}$ at a pressure of 0.1 MPa in 14 V/28 V/42 V circuits. A quantitative relationship between arc duration and gas parameters such as ionization potential and thermal conductivity was obtained from the experimental data. In addition, the inherent mechanism of the influence of atmosphere on arc duration is discussed.
Yoshiki KAYANO Kazuaki MIYANAGA Hiroshi INOUE
Since electromagnetic (EM) noise resulting from an arc discharge disturbs other electric devices, parameters related to electromagnetic compatibility, as well as lifetime and reliability, are important properties for electrical contacts. To clarify the characteristics and the mechanism of the generation of the EM noise, the arc column and voltage fluctuations generated by slowly breaking contacts under an external direct current (DC) magnetic field of up to 20 mT were investigated experimentally using an Ag (90.7 wt%)/SnO$_2$ (9.3 wt%) material. First, the motion of the arc column was measured by a high-speed camera. Second, the distribution of the motion of the arc and the contact voltage are discussed. It was revealed that the contact voltage fluctuation during the arc duration is related to the arc column motion.
Akimitsu DOI Takao HINAMOTO Wu-Sheng LU
For two-dimensional IIR digital filters described by the Fornasini-Marchesini second model, the problem of jointly optimizing high-order error feedback and realization to minimize the effects of roundoff noise at the filter output subject to l2-scaling constraints is investigated. The problem at hand is converted into an unconstrained optimization problem by using linear-algebraic techniques. The unconstrained optimization problem is then solved iteratively by applying an efficient quasi-Newton algorithm with closed-form formulas for key gradient evaluation. Finally, a numerical example is presented to illustrate the validity and effectiveness of the proposed technique.
A global tree local X-net network (GTLX) is introduced to realize high-performance data transfer in a multiple-valued fine-grain reconfigurable VLSI (MVFG-RVLSI). A global pipelined tree network is utilized to realize high-performance long-distance bit-parallel data transfer. Moreover, a logic-in-memory architecture is employed to solve the data transfer bottleneck between a block data memory and a cell. A local X-net network is utilized to realize simple interconnections and compact switch blocks for eight-near neighborhood data transfer. Moreover, multiple-valued signaling is utilized to improve the utilization of the X-net network, where two binary data can be transferred from two adjacent cells to one common adjacent cell simultaneously at each “X” intersection. To evaluate the MVFG-RVLSI, a fast Fourier transform (FFT) operation is mapped onto a previous MVFG-RVLSI using only the X-net network and the MVFG-RVLSI using the GTLX. As a result, the computation time, the power consumption and the transistor count of the MVFG-RVLSI using the GTLX are reduced by 25%, 36% and 56%, respectively, in comparison with those of the MVFG-RVLSI using only the X-net network.
Xue ZHOU Mo CHEN Xinglei CUI Guofu ZHAI
High voltage DC contactors, for operation at voltage levels up to at least about 300 V, are finding increasing markets in applications such as electric vehicles and aircraft, in which the size and weight of cables are of extreme importance. The copper bridge-type contact, combined with a magnetic field provided by permanent magnets and sealed in an arc chamber filled with high-pressure gases, is a commonly used structure to interrupt the DC arc rapidly. Arc characteristics in different gases at different pressures vary greatly. This paper focuses on the arc characteristics of the bridge-type contact system when a magnetic field is applied, with nitrogen and other gases at different pressures. The pressure of the gases varies from 1 atm to 2.5 atm. Arc characteristics, such as arc durations at different stages and arc motions in these gases, are comparatively studied. The results are instructive for choosing a suitable arcing atmosphere for the DC bridge-type arc chamber of a contactor.
Daniel MADRIGAL Gustavo TORRES Felix RAMOS
In this paper we present a cognitive architecture inspired on the biological functioning of the motor system in humans. To test the model, we built a robotic hand with a Lego Mindstorms™ kit. Then, through communication between the architecture and the robotic hand, the latter was able to perform the movement of the fingers, which therefore allowed it to perform grasping of some objects. In order to obtain these results, the architecture performed a conversion of the activation of motor neuron pools into specific degrees of servo motor movement. In this case, servo motors acted as muscles, and degrees of movement as exerted muscle force. Finally, this architecture will be integrated with high-order cognitive functions towards getting automatic motor commands generation, through planning and decision making mechanisms.