IEICE global.ieice.org Site

Keyword Search Result

[Keyword] (42807hit)

10541-10560hit(42807hit)

Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access
Lei GUO Yuhua TANG Yong DOU Yuanwu LEI Meng MA Jie ZHOU

PAPER-Computer System

Vol:
E96-D No:12
Page(s):
2765-2775
The effective bandwidth of dynamic random-access memory (DRAM) is very low in the alternate row-wise/column-wise matrix access (AR/CMA) mode, which is a basic characteristic of scientific and engineering applications. Therefore, we propose the window memory layout scheme (WMLS), a matrix layout scheme that does not require transposition, for AR/CMA applications. This scheme maps one row of a logical matrix into a rectangular memory window of the DRAM to balance the bandwidth of the row- and column-wise matrix access and to increase the DRAM IO bandwidth. The optimal window configuration is theoretically analyzed to minimize the total number of no-data-visit operations of the DRAM. Different WMLS implementations are presented according to the memory structure of field-programmable gate array (FPGA), CPU, and GPU platforms. Experimental results show that the proposed WMLS can significantly improve DRAM bandwidth for AR/CMA applications. Speedup factors of 1.6× and 2.0× are achieved for the general-purpose CPU and GPU platforms, respectively. For the FPGA platform, the WMLS DRAM controller is custom-designed. The maximum bandwidth for the AR/CMA mode reaches 5.94 GB/s, which is a 73.6% improvement compared with that of the traditional row-wise access mode. Finally, we apply the WMLS scheme to a Chirp Scaling SAR application; compared with the traditional access approach, maximum speedup factors of 4.73×, 1.33×, and 1.56× are achieved for the FPGA, CPU, and GPU platforms, respectively.
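The abstract does not give the exact WMLS address mapping, so the following is only a rough sketch of the general idea, assuming a generic tiled layout in which each DRAM page holds one small rectangular window of the matrix. The function names and the sizes `N`, `PAGE`, `H`, and `W` are hypothetical. The sketch counts how many distinct DRAM pages are opened by one row-wise and one column-wise traversal under a plain row-major layout and under the windowed layout.

```python
def row_major_page(r, c, n, page):
    """DRAM page index of element (r, c) under a plain row-major layout."""
    return (r * n + c) // page


def window_page(r, c, n, h, w):
    """DRAM page index when each page stores one h-by-w window (tile)."""
    tiles_per_row = n // w
    return (r // h) * tiles_per_row + (c // w)


def pages_touched(page_of, coords):
    """Number of distinct DRAM pages opened while visiting coords."""
    return len({page_of(r, c) for r, c in coords})


# Hypothetical sizes for illustration only, chosen so that H * W == PAGE.
N, PAGE, H, W = 64, 16, 4, 4
row = [(0, c) for c in range(N)]   # one row-wise traversal
col = [(r, 0) for r in range(N)]   # one column-wise traversal


def rm(r, c):
    return row_major_page(r, c, N, PAGE)


def wm(r, c):
    return window_page(r, c, N, H, W)


print(pages_touched(rm, row), pages_touched(rm, col))  # 4 64
print(pages_touched(wm, row), pages_touched(wm, col))  # 16 16
```

With these made-up sizes, the row-major layout opens 4 pages for a row but 64 for a column, while the windowed layout opens 16 for each, which illustrates the kind of balance between row- and column-wise access that the abstract describes.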
Apps at Hand: Personalized Live Homescreen Based on Mobile App Usage Prediction
Xiao XIA Xinye LIN Xiaodong WANG Xingming ZHOU Deke GUO

LETTER-Information Network

Vol:
E96-D No:12
Page(s):
2860-2864
To facilitate the discovery of mobile apps in personal devices, we present the personalized live homescreen system. The system mines the usage patterns of mobile apps, generates personalized predictions, and then makes apps available at users' hands whenever they want them. Evaluations have verified the promising effectiveness of our system.
An Efficient O(1) Contrast Enhancement Algorithm Using Parallel Column Histograms
Yan-Tsung PENG Fan-Chieh CHENG Shanq-Jang RUAN

LETTER

Vol:
E96-D No:12
Page(s):
2724-2725
Display devices show image files, for which contrast enhancement methods are usually employed to bring out visual details and achieve better visual quality. However, when applied to high-resolution images, contrast enhancement entails high computation costs, mostly due to histogram computations. Therefore, this letter proposes a parallel histogram calculation algorithm that uses column histograms and difference histograms to reduce histogram computations. Experimental results show that the proposed algorithm is effective for histogram-based image contrast enhancement.
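The letter's algorithm is not spelled out in the abstract. As a hedged illustration of how column histograms can reduce repeated histogram work, the sketch below uses the generic sliding-window technique: one histogram per image column, and a window histogram that is updated by adding the entering column and subtracting the leaving one. Treating that update as the "difference histogram" is an assumption, and all function names are hypothetical.

```python
def column_histograms(img, levels):
    """One histogram per image column (each could be built in parallel)."""
    height, width = len(img), len(img[0])
    hists = [[0] * levels for _ in range(width)]
    for y in range(height):
        for x in range(width):
            hists[x][img[y][x]] += 1
    return hists


def sliding_window_histograms(img, levels, win):
    """Histogram of every win-column-wide window, updated incrementally."""
    cols = column_histograms(img, levels)
    current = [sum(cols[x][g] for x in range(win)) for g in range(levels)]
    out = [current[:]]
    for x in range(win, len(cols)):
        # Assumed "difference histogram": entering column minus leaving column.
        diff = [cols[x][g] - cols[x - win][g] for g in range(levels)]
        current = [a + d for a, d in zip(current, diff)]
        out.append(current[:])
    return out


# Tiny 2x4 image with 4 gray levels, for illustration only.
img = [[0, 1, 2, 3],
       [0, 1, 2, 3]]
result = sliding_window_histograms(img, levels=4, win=2)
print(result)  # [[2, 2, 0, 0], [0, 2, 2, 0], [0, 0, 2, 2]]
```

Each window update costs a number of operations proportional to the number of gray levels, independent of the window size, which is the usual sense in which such schemes are called O(1) per pixel.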
Teachability of a Subclass of Simple Deterministic Languages
Yasuhiro TAJIMA

PAPER-Fundamentals of Information Systems

Vol:
E96-D No:12
Page(s):
2733-2742
We show teachability of a subclass of simple deterministic languages. The subclass we define is called stack uniform simple deterministic languages. Teachability is derived by showing a query learning algorithm for this language class. Our learning algorithm uses membership, equivalence, and superset queries, and it terminates in polynomial time. It is already known that simple deterministic languages are polynomial-time query learnable by context-free grammars. In contrast, our algorithm guesses a hypothesis by a stack uniform simple deterministic grammar; thus, our result is strict teachability of the subclass of simple deterministic languages. In addition, we discuss parameters of the polynomial for teachability. The “thickness” is an important parameter for parsing, and it should be one of the parameters used to evaluate the time complexity.
Improving Text Categorization with Semantic Knowledge in Wikipedia
Xiang WANG Yan JIA Ruhua CHEN Hua FAN Bin ZHOU

PAPER-Artificial Intelligence, Data Mining

Vol:
E96-D No:12
Page(s):
2786-2794
Text categorization, especially short text categorization, is a difficult and challenging task since the text data is sparse and multidimensional. In traditional text classification methods, document texts are represented with “Bag of Words (BOW)” text representation schema, which is based on word co-occurrence and has many limitations. In this paper, we mapped document texts to Wikipedia concepts and used the Wikipedia-concept-based document representation method to take the place of traditional BOW model for text classification. In order to overcome the weakness of ignoring the semantic relationships among terms in document representation model and utilize rich semantic knowledge in Wikipedia, we constructed a semantic matrix to enrich Wikipedia-concept-based document representation. Experimental evaluation on five real datasets of long and short text shows that our approach outperforms the traditional BOW method.
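The abstract does not state how the semantic matrix is applied. One common way to enrich a concept-based document vector is to multiply it by a concept-to-concept relatedness matrix, and the sketch below assumes exactly that. The concept names, the relatedness values, and the function `enrich` are made up for illustration.

```python
def enrich(doc_vec, sem):
    """Return doc_vec multiplied by the semantic matrix sem (row vector times matrix)."""
    n = len(doc_vec)
    return [sum(doc_vec[i] * sem[i][j] for i in range(n)) for j in range(n)]


# Hypothetical Wikipedia concepts and relatedness scores (not from the paper).
concepts = ["Apple Inc.", "IPhone", "Banana"]
sem = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

doc = [1.0, 0.0, 0.0]        # a short text that mentions only "Apple Inc."
enriched = enrich(doc, sem)
print(enriched)              # [1.0, 0.8, 0.0]
```

After enrichment the short text also carries weight on the related concept, which is one way sparse short-text vectors can be made less sparse before classification.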
A Robust Signal Recognition Method for Communication System under Time-Varying SNR Environment
Jing-Chao LI Yi-Bing LI Shouhei KIDERA Tetsuo KIRIMOTO

PAPER-Pattern Recognition

Vol:
E96-D No:12
Page(s):
2814-2819
As a consequence of recent developments in communications, the parameters of communication signals, such as the modulation parameter values, are becoming unstable because of time-varying SNR under electromagnetic conditions. In general, it is difficult to classify target signals that have time-varying parameters using traditional signal recognition methods. To overcome this problem, this study proposes a novel recognition method that works well even for such time-dependent communication signals. This method is mainly composed of feature extraction and classification processes. In the feature extraction stage, we adopt Shannon entropy and index entropy to obtain the stable features of modulated signals. In the classification stage, the interval gray relation theory is employed because it is suitable for signals with time-varying parameter spaces. The advantage of our method is that it can deal with time-varying SNR situations, which cannot be handled by existing methods. The results from numerical simulation show that the proposed feature extraction algorithm, based on entropy characteristics in time-varying SNR situations, offers accurate clustering performance, and the classifier, based on interval gray relation theory, can achieve a recognition rate of up to 82.9%, even when the SNR varies from -10 to -6 dB.
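As a small illustration of the Shannon entropy feature mentioned above, the sketch below computes the entropy of a list of non-negative values after normalizing them to a probability distribution. Which quantities the authors actually normalize (for example, spectrum magnitudes) is not stated in the abstract, so the inputs here are hypothetical, and index entropy is not covered.

```python
import math


def shannon_entropy(values):
    """Shannon entropy (in bits) of non-negative values normalized to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


flat = shannon_entropy([1, 1, 1, 1])      # energy spread evenly over 4 bins
peaked = shannon_entropy([4, 0, 0, 0])    # all energy in a single bin
print(flat)  # 2.0
```

A flat distribution over four bins gives 2 bits and a fully concentrated one gives zero, so the feature summarizes how spread out the signal's energy is in a single number.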
An Auction Based Distribute Mechanism for P2P Adaptive Bandwidth Allocation
Fang ZUO Wei ZHANG

PAPER

Vol:
E96-D No:12
Page(s):
2704-2712
In P2P applications, networks are formed by devices belonging to independent users. Routing hotspots and congestion are therefore typically created when an unanticipated event triggers a surge of users requesting streaming service from particular nodes. A challenging problem is how to provide incentive mechanisms that allocate bandwidth more fairly in order to avoid congestion and other drawbacks for P2P QoS. In this paper, we study the P2P bandwidth game, that is, bandwidth allocation in P2P networks. Unlike previous works, which focus either on routing or on forwarding, this paper investigates a game-theoretic mechanism that incentivizes nodes to reveal their real bandwidth demands and proposes a novel method that avoids congestion proactively, that is, prior to a congestion event. More specifically, we define an incentive-compatible pricing vector explicitly and give theoretical proofs to demonstrate that our mechanism can provide incentives for nodes to tell the true bandwidth demand. In order to apply this mechanism to P2P distribution applications, we evaluate our mechanism by NS-2 simulations. The simulation results show that the incentive pricing mechanism can distribute the bandwidth fairly and effectively and can also avoid routing hotspots and congestion effectively.
A Practical and Optimal Path Planning for Autonomous Parking Using Fast Marching Algorithm and Support Vector Machine
Quoc Huy DO Seiichi MITA Keisuke YONEDA

PAPER-Artificial Intelligence, Data Mining

Vol:
E96-D No:12
Page(s):
2795-2804
This paper proposes a novel practical path planning framework for autonomous parking in cluttered environments with narrow passages. The proposed global path planning method is based on an improved Fast Marching algorithm to generate a path while considering forward and backward maneuvers. In addition, the Support Vector Machine is utilized to provide the maximum clearance from obstacles, considering the vehicle dynamics, to provide a safe and feasible path. The algorithm considers the most critical points in the map, and the complexity of the algorithm is not affected by the shape of the obstacles. We also propose an autonomous parking scheme for different parking situations. The method is implemented on an autonomous vehicle platform and validated in a real environment with narrow passages.
Robust Multi-Bit Watermarking for Free-View Television Using Light Field Rendering
Huawei TIAN Yao ZHAO Zheng WANG Rongrong NI Lunming QIN

PAPER-Image Processing and Video Processing

Vol:
E96-D No:12
Page(s):
2820-2829
With the rapid development of multi-view video coding (MVC) and light field rendering (LFR), Free-View Television (FTV) has emerged as new entertainment equipment, which can bring more immersive and realistic feelings to TV viewers. In an FTV broadcasting system, the TV viewer can freely watch a realistic arbitrary view of a scene generated from a number of original views. In such a scenario, the ownership of the multi-view video should be verified not only on the original views, but also on any virtual view. However, the capacities of existing watermarking schemes as copyright protection methods for LFR-based FTV are only one bit, i.e., presence or absence of the watermark, which seriously limits their usage in practical scenarios. In this paper, we propose a robust multi-bit watermarking scheme for LFR-based free-view video. The direct-sequence code division multiple access (DS-CDMA) watermark is constructed according to the multi-bit message and embedded into the DCT domain of each view frame. The message can be extracted bit-by-bit from a virtual frame generated at an arbitrary viewpoint with a correlation detector. Furthermore, we mathematically prove that the watermark can be detected from any virtual view. Experimental results also show that the watermark in FTV can be successfully detected from a virtual view. Moreover, the proposed watermark method is robust against common signal processing attacks, such as Gaussian filtering, salt-and-pepper noise, JPEG compression, and center cropping.
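To make the DS-CDMA embedding and correlation detection more concrete, here is a minimal sketch under strong simplifying assumptions: each message bit is spread with its own orthogonal code (Walsh-Hadamard rows stand in for the spreading sequences), the host is a flat list of numbers standing in for DCT coefficients, and no view rendering or attack is modeled. The function names, the strength `alpha`, and the choice of codes are illustrative and not taken from the paper.

```python
def hadamard(n):
    """Sylvester construction of an n-by-n Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def embed(host, bits, codes, alpha):
    """Add alpha * sum_i b_i * code_i to the host, with b_i in {-1, +1}."""
    marked = list(host)
    for bit, code in zip(bits, codes):
        sign = 1 if bit else -1
        for k, chip in enumerate(code):
            marked[k] += alpha * sign * chip
    return marked


def extract(marked, codes):
    """Recover each bit from the sign of its correlation with the code."""
    return [1 if sum(m * c for m, c in zip(marked, code)) > 0 else 0
            for code in codes]


codes = hadamard(8)[1:4]      # three mutually orthogonal codes (assumption)
host = [10.0] * 8             # flat stand-in for DCT coefficients
message = [1, 0, 1]
marked = embed(host, message, codes, alpha=0.5)
recovered = extract(marked, codes)
print(recovered)              # [1, 0, 1]
```

Because the chosen codes are orthogonal to each other and to the flat host, each correlation isolates exactly one bit. With real image data the host interferes with the correlation, which is why practical schemes rely on long sequences and a tuned embedding strength.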
A Trusted Network Access Protocol for WLAN Mesh Networks
Yuelei XIAO Yumin WANG Liaojun PANG Shichong TAN

LETTER-Information Network

Vol:
E96-D No:12
Page(s):
2865-2869
To solve the problems of the existing trusted network access protocols for Wireless Local Area Network (WLAN) mesh networks, we propose a new trusted network access protocol for WLAN mesh networks, which is abbreviated as WMN-TNAP. This protocol implements mutual user authentication and Platform-Authentication between the supplicant and Mesh Authenticator (MA), and between the supplicant and Authentication Server (AS) of a WLAN mesh network, establishes the key management system for the WLAN mesh network, and effectively prevents the platform configuration information of the supplicant, MA and AS from leaking out. Moreover, this protocol is proved secure based on the extended Strand Space Model (SSM) for trusted network access protocols.
Nonlinear Metric Learning with Deep Independent Subspace Analysis Network for Face Verification
Xinyuan CAI Chunheng WANG Baihua XIAO Yunxue SHAO

PAPER-Image Recognition, Computer Vision

Vol:
E96-D No:12
Page(s):
2830-2838
Face verification is the task of determining whether two given face images represent the same person or not. It is a very challenging task, as face images captured in uncontrolled environments may have large variations in illumination, expression, pose, background, etc. The crucial problem is how to compute the similarity of two face images. Metric learning has provided a viable solution to this problem. Until now, many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method, which learns an explicit mapping from the original space to an optimal subspace using a deep Independent Subspace Analysis (ISA) network. Compared to linear or kernel-based metric learning methods, the proposed deep ISA network is a deep and local learning architecture, and therefore exhibits a more powerful ability to learn the nature of highly variable datasets. We evaluate our method on the Labeled Faces in the Wild dataset, and results show superior performance over some state-of-the-art methods.
Accelerating Range Query Processing on R-Tree Using Graphics Processing Units
Boseon YU Hyunduk KIM Wonik CHOI Dongseop KWON

PAPER-Data Engineering, Web Information Systems

Vol:
E96-D No:12
Page(s):
2776-2785
Recently, various research efforts have been conducted to develop strategies for accelerating multi-dimensional query processing using the graphics processing units (GPUs). However, well-known multi-dimensional access methods such as the R-tree, B-tree, and their variants are hardly applicable to GPUs in practice, mainly due to the characteristics of a hierarchical index structure. More specifically, the hierarchical structure not only causes frequent transfers of small volumes of data but also provides limited opportunity to exploit the advanced data parallelism of GPUs. To address these problems, we propose an approach that uses GPUs as a buffer. The main idea is that object entries in recently visited leaf nodes are buffered in the global memory of GPUs and processed by massive parallel threads of the GPUs. Through extensive performance studies, we observed that the proposed approach achieved query performance up to five times higher than that of the original R-tree.
A Characterization of Optimal FF Coding Rate Using a New Optimistically Optimal Code
Mitsuharu ARIMURA Hiroki KOGA Ken-ichi IWATA

LETTER-Source Coding

Vol:
E96-A No:12
Page(s):
2443-2446
In this letter, we first introduce a stronger notion of the optimistic achievable coding rate and discuss a coding theorem. Next, we give a necessary and sufficient condition under which the coding rates of all the optimal FF codes asymptotically converge to a constant.
Floorplan Driven Architecture and High-Level Synthesis Algorithm for Dynamic Multiple Supply Voltages
Shin-ya ABE Youhua SHI Kimiyoshi USAMI Masao YANAGISAWA Nozomu TOGAWA

PAPER-High-Level Synthesis and System-Level Design

Vol:
E96-A No:12
Page(s):
2597-2611
In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture), which integrates dynamic multiple supply voltages and interconnection delay into high-level synthesis. In AVHDR architecture, voltages can be dynamically assigned for energy reduction. In other words, low supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Next, an AVHDR-based high-level synthesis algorithm is proposed. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, the modules in each huddle can be placed close to each other and the corresponding AVHDR architecture can be generated and optimized with floorplanning information. Experimental results show that on average our algorithm achieves 43.9% energy-saving compared with conventional algorithms.
Standard Cell Structure with Flexible P/N Well Boundaries for Near-Threshold Voltage Operation
Shinichi NISHIZAWA Tohru ISHIHARA Hidetoshi ONODERA

PAPER-Physical Level Design

Vol:
E96-A No:12
Page(s):
2499-2507
This paper proposes a structure of standard cells where the P/N boundary ratio of each cell can be independently customized for near-threshold operation. Lowering the supply voltage is one of the most promising approaches for reducing the power consumption of VLSI circuits; however, this increases the imbalance between rise and fall delays for cells having transistor stacks. A conventional cell library with a fixed P/N boundary cannot efficiently compensate for this delay imbalance. The proposed structure achieves individual P/N boundary ratio optimization for each standard cell; therefore, it cancels the imbalance between rise and fall delays at the expense of cell area. The proposed structure is verified using measured results of Ring Oscillator circuits and simulation results of benchmark circuits in 65 nm CMOS. The experiments with ISCAS'85 benchmark circuits demonstrate that the standard cell library consisting of the proposed cells reduces the power consumption of the benchmark circuits by 16% on average without increasing the circuit area, compared to that of the same circuit synthesized with a library which is not optimized for near-threshold operation.
Amperometric Biosensor with Composites of Carbon Nanotube, Hexaamineruthenium(III)chloride, and Plasma-Polymerized Film
Tatsuya HOSHINO Takahiro INOUE Hitoshi MUGURUMA

PAPER-Organic Molecular Electronics

Vol:
E96-C No:12
Page(s):
1536-1540
A novel fabrication approach for an amperometric biosensor composed of carbon nanotubes (CNT), a plasma-polymerized film (PPF), hexaamineruthenium(III)chloride (RU), and the enzyme glucose oxidase (GOD) is reported. The electrochemical electrode is configured as multilayer films containing, in sequence, sputtered gold, lower acetonitrile PPF, CNT, RU, GOD, and upper acetonitrile PPF. First, PPF deposited on Au acts as a permselective membrane and as a scaffold for CNT layer formation. Second, PPF directly deposited on GOD acts as a matrix for enzyme immobilization. To facilitate the electrochemical communication between the CNT layer and GOD, CNT was treated with nitrogen plasma. The electron transfer mediator RU transports the electrons generated by the enzymatic reaction to the electrode. The synergy between the electron transfer mediator and CNT provides benefits in terms of lowering the operational potential and enhancing the sensitivity (current). The optimized glucose biosensor showed a sensitivity of 3.4 µA mM⁻¹ cm⁻² at +0.4 V vs. Ag/AgCl, a linear dynamic range of 2.5-19 mM, and a response time of 6 s.
Offline Permutation Algorithms on the Discrete Memory Machine with Performance Evaluation on the GPU
Akihiko KASAGI Koji NAKANO Yasuaki ITO

PAPER

Vol:
E96-D No:12
Page(s):
2617-2625
The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of the shared memory access of GPUs. Bank conflicts should be avoided to maximize the bandwidth of the shared memory access. Offline permutation of an array is the task of copying all elements in array a into array b according to a permutation given in advance. The main contribution of this paper is to implement a conflict-free permutation algorithm on the DMM in a GPU. We have also implemented straightforward permutation algorithms on the GPU. The experimental results for 1024 double (64-bit) numbers on an NVIDIA GeForce GTX-680 show that the straightforward permutation algorithm takes 247.8 ns for the random permutation and 1684 ns for the worst permutation, which involves the maximum bank conflicts. Our conflict-free permutation algorithm runs in 167 ns for any permutation, including the random permutation and the worst permutation, although it performs more memory accesses. It follows that our conflict-free permutation is 1.48 times faster for the random permutation and 10.0 times faster for the worst permutation.
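As a rough illustration of why bank conflicts matter, the sketch below counts the congestion of one group of simultaneous accesses, assuming a simple banked memory in which address `a` lives in bank `a % width` and requests to the same address are merged. The width of 32 and the two access patterns are hypothetical examples, and the conflict-free permutation algorithm itself is not reproduced here.

```python
from collections import Counter


def congestion(addresses, width):
    """Largest number of distinct addresses that fall into a single bank."""
    banks = Counter(a % width for a in set(addresses))
    return max(banks.values())


WIDTH = 32                                        # assumed number of banks
contiguous = [t for t in range(WIDTH)]            # thread t reads address t
strided = [t * WIDTH for t in range(WIDTH)]       # thread t reads address 32 * t

best = congestion(contiguous, WIDTH)
worst = congestion(strided, WIDTH)
print(best, worst)  # 1 32
```

A congestion of 1 means all requests can be served at once, while a congestion of 32 means they are serialized. This is the kind of gap between the random and the worst permutation that the straightforward algorithm's timings reflect.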
Battery-Aware Task Mapping for Coarse-Grained Reconfigurable Architecture
Shouyi YIN Rui SHI Leibo LIU Shaojun WEI

PAPER

Vol:
E96-D No:12
Page(s):
2524-2535
Coarse-grained Reconfigurable Architecture (CGRA) is a parallel computing platform that provides both the high performance of hardware and the high flexibility of software. It is becoming a promising platform for embedded and mobile applications. Since embedded and mobile devices are usually battery-powered, improving battery lifetime becomes one of the primary design issues in using CGRAs. In this paper, we propose a battery-aware task-mapping method to optimize energy consumption and improve battery lifetime. The proposed method mainly addresses two problems: task partitioning and task scheduling when mapping applications onto CGRA. The task partitioning and scheduling are formulated as a joint optimization problem of minimizing the energy consumption. The nonlinear effects of a real battery are taken into account in the problem formulation. Using the insights from the problem formulation, we design the task-mapping algorithm. We have used several real-world benchmarks to test the effectiveness of the proposed method. Experimental results show that our method can dramatically lower the energy consumption and prolong the battery life.
Efficient Proofs for CNF Formulas on Attributes in Pairing-Based Anonymous Credential System
Nasima BEGUM Toru NAKANISHI Nobuo FUNABIKI

PAPER-Information Security

Vol:
E96-A No:12
Page(s):
2422-2433
To enhance user privacy, anonymous credential systems allow the user to anonymously convince a verifier of the possession of a certificate issued by the issuing authority. In these systems, the user can prove relations on his/her attributes embedded into the certificate. Previously, a pairing-based anonymous credential system with proofs of constant size in the number of attributes of the user was proposed. This system supports proofs of inner product relations on attributes, and thus can handle complex logical relations on attributes such as CNF and DNF formulas. However, this system suffers from a high computational cost: the proof generation needs exponentiations depending on the number of literals in OR relations. In this paper, we propose a pairing-based anonymous credential system with constant-size proofs for CNF formulas and more efficient proof generation. In the proposed system, the proof generation needs only multiplications depending on the number of literals, and thus it is more efficient than the previously proposed system. The key to our construction is the use of an extended accumulator, by which we can verify that multiple attributes are included in multiple sets, all at once. This leads to the verification of CNF formulas on attributes. Since the accumulator is mainly calculated by multiplications, we achieve better computational costs.
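The statement that is proved, a CNF formula on attributes, can be illustrated without any cryptography. In the sketch below each OR clause is modeled as a set of acceptable attributes, and the formula holds when the user's attribute set intersects every clause, which mirrors the "multiple attributes are included in multiple sets" view in the abstract. The attribute names and the function are hypothetical, and nothing here is anonymous or zero-knowledge.

```python
def satisfies_cnf(user_attrs, clauses):
    """True if every OR clause (a set) shares at least one attribute with the user."""
    return all(bool(user_attrs & clause) for clause in clauses)


# Hypothetical policy: (student OR staff) AND (dept_cs OR dept_ee)
policy = [{"student", "staff"}, {"dept_cs", "dept_ee"}]

alice = {"student", "dept_cs", "year_3"}
bob = {"staff", "dept_math"}

print(satisfies_cnf(alice, policy))  # True
print(satisfies_cnf(bob, policy))    # False
```

In the credential system the verifier would learn only that the predicate holds, not which attributes satisfied each clause; the sketch shows only the plain predicate.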
A High Performance HEVC De-Blocking Filter and SAO Architecture for UHDTV Decoder
Jiayi ZHU Dajiang ZHOU Satoshi GOTO

PAPER-High-Level Synthesis and System-Level Design

Vol:
E96-A No:12
Page(s):
2612-2622
High Efficiency Video Coding (HEVC) is the next-generation video compression standard. The in-loop filter, an important component of HEVC, is composed of two parts: the deblocking filter (DBF) and sample adaptive offset (SAO). In this article, we propose a high-performance in-loop filter architecture for HEVC that integrates both the deblocking filter and SAO. To achieve this, several ideas are adopted. First, SAO is processed based on drifted blocks, which suits the output pattern of the deblocking filter and eases the coupling of the deblocking filter and SAO. Second, the luma and chroma samples of each 4×4 block are organized in the same memory storage unit and processed simultaneously to raise parallelism. Third, in both the deblocking filter and SAO, the calculation core is implemented in combinational logic and data storage is implemented in register groups. The calculation core keeps processing data continually, which greatly raises the utilization of the DBF and SAO cores. Fourth, a task-level pipeline for processing 8×8 blocks is employed between the deblocking filter and SAO. By these means, a high-performance in-loop filter including both the deblocking filter and SAO is achieved without any intermediate storage or circuit. It takes only four cycles to finish the deblocking filter and SAO of one 8×8 block. The implementation results show that the proposed solution can be synthesized to 240 MHz with 65 nm technology. Thus, this solution can process 3.84 Gpixels/s at maximum. UHDTV 4320p (7680×4320) at 60 fps decoding can be realized with a 124.4 MHz working frequency by the proposed architecture.
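The throughput figures in this abstract follow from the stated four cycles per 8×8 block. The short calculation below reproduces them, assuming one 8×8 block corresponds to 64 pixels of throughput. The variable names are illustrative.

```python
PIXELS_PER_BLOCK = 8 * 8      # one 8x8 block
CYCLES_PER_BLOCK = 4          # stated in the abstract
MAX_CLOCK_HZ = 240e6          # synthesized frequency from the abstract

pixels_per_cycle = PIXELS_PER_BLOCK / CYCLES_PER_BLOCK
peak_pixels_per_s = MAX_CLOCK_HZ * pixels_per_cycle

# UHDTV 4320p at 60 frames per second
uhd_pixels_per_s = 7680 * 4320 * 60
required_mhz = uhd_pixels_per_s / pixels_per_cycle / 1e6

print(pixels_per_cycle)                  # 16.0
print(round(peak_pixels_per_s / 1e9, 2)) # 3.84
print(round(required_mhz, 1))            # 124.4
```

Sixteen pixels per cycle at 240 MHz gives the quoted 3.84 Gpixels/s, and the 4320p at 60 fps pixel rate divided by sixteen gives about 124.4 MHz.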

