Keyword Search Result

[Keyword] MPU(1519hit)

1-20hit(1519hit)

  • Efficient Realization of an SC Circuit with Feedback and Its Applications Open Access

    Yuto ARIMURA  Shigeru YAMASHITA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized:
    2023/10/26
      Vol:
    E107-A No:7
      Page(s):
    958-965

    Stochastic Computing (SC) allows additions and multiplications to be realized with lower power than conventional binary operations if some error is tolerated. However, for many complex functions that cannot be realized with additions and multiplications alone, no generic efficient method for computing the function with an SC circuit is known; such a function must be realized with a generic method such as polynomial approximation, which may lose the advantage of SC. Thus, much research has considered efficient SC realizations of specific functions; an efficient SC square root circuit with a feedback circuit was recently proposed by D. Wu et al. This paper generalizes the SC square root circuit with a feedback circuit; we identify the situation in which a function can be implemented efficiently by an SC circuit with a feedback circuit. As examples of our generalization, we propose SC circuits for n-th root calculation and division. We also analyze the accuracy and hardware costs of our SC circuits; the results show the effectiveness of our method compared to conventional SC designs, and our framework may be able to implement an SC circuit that is better than the existing methods in terms of hardware cost or calculation error.
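
    As a rough illustration of why SC additions and multiplications are cheap, the sketch below simulates the common unipolar encoding, in which a value in [0, 1] is the fraction of 1s in a random bitstream. The encoding, the function names, and the stream length are assumptions made for this example; the abstract does not specify them, and the feedback circuits proposed in the paper are not modeled here.

```python
import random

def to_stream(p, n, rng):
    """Encode probability p in [0, 1] as a unipolar bitstream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def value(stream):
    """Decode a unipolar bitstream: the fraction of 1s."""
    return sum(stream) / len(stream)

def sc_multiply(a, b):
    """Bitwise AND of two independent streams estimates the product a * b."""
    return [x & y for x, y in zip(a, b)]

def sc_scaled_add(a, b, select):
    """A multiplexer driven by a p = 0.5 select stream estimates (a + b) / 2."""
    return [x if s else y for x, y, s in zip(a, b, select)]

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 100_000                     # longer streams give smaller error
a = to_stream(0.6, n, rng)
b = to_stream(0.5, n, rng)
sel = to_stream(0.5, n, rng)

product = value(sc_multiply(a, b))          # estimate of 0.6 * 0.5
half_sum = value(sc_scaled_add(a, b, sel))  # estimate of (0.6 + 0.5) / 2
print(product, half_sum)
```

    Each operation is a single gate per bit, and the results are only approximate, which matches the trade-off of lower power for some error described in the abstract.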

  • A Ranking Information Based Network for Facial Beauty Prediction Open Access

    Haochen LYU  Jianjun LI  Yin YE  Chin-Chen CHANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2024/01/26
      Vol:
    E107-D No:6
      Page(s):
    772-780

    The purpose of Facial Beauty Prediction (FBP) is to automatically assess facial attractiveness based on human aesthetics. Most neural network-based prediction methods do not consider the ranking information in the task. For scoring tasks like facial beauty prediction, there is abundant ranking information both between images and within images. Reasonable utilization of this information during training can greatly improve the performance of the model. In this paper, we propose a novel end-to-end Convolutional Neural Network (CNN) model based on ranking information of images, incorporating a Rank Module and an Adaptive Weight Module. We also design pairwise ranking loss functions to fully leverage the ranking information of images. Considering training efficiency and model inference capability, we choose ResNet-50 as the backbone network. We conduct experiments on the SCUT-FBP5500 dataset and the results show that our model achieves a new state-of-the-art performance. Furthermore, ablation experiments show that our approach greatly contributes to improving the model performance. Finally, the Rank Module with the corresponding ranking loss is plug-and-play and can be extended to any CNN model and any task with ranking information. Code is available at https://github.com/nehcoah/Rank-Info-Net.

  • Federated Deep Reinforcement Learning for Multimedia Task Offloading and Resource Allocation in MEC Networks Open Access

    Rongqi ZHANG  Chunyun PAN  Yafei WANG  Yuanyuan YAO  Xuehua LI  

     
    PAPER-Network

      Vol:
    E107-B No:6
      Page(s):
    446-457

    With the maturation of 5G technology in recent years, multimedia services such as live video streaming and online games on the Internet have flourished. These multimedia services frequently require low latency, which poses a significant challenge for computing multimedia tasks with strict latency requirements. Mobile edge computing (MEC) is considered a key technology for addressing these challenges. It offloads computation-intensive tasks to edge servers by sinking mobile nodes, which reduces task execution latency and relieves computing pressure on multimedia devices. To use the MEC paradigm reasonably and efficiently, resource allocation has become a new challenge. In this paper, we focus on the multimedia tasks that need to be uploaded and processed in the network. We set the optimization problem with the goal of minimizing the latency and energy consumption required to perform tasks in multimedia devices. To solve the complex and non-convex problem, we formulate the optimization problem as a distributed deep reinforcement learning (DRL) problem and propose a federated Dueling deep Q-network (DDQN) based multimedia task offloading and resource allocation algorithm (FDRL-DDQN). In the algorithm, DRL is trained on the local device, while federated learning (FL) is responsible for aggregating and updating the parameters from the trained local models. Further, to solve the non-independent and identically distributed (non-IID) data problem of multimedia devices, we develop a method for selecting participating federated devices. The simulation results show that the FDRL-DDQN algorithm can reduce the total cost by 31.3% compared to the DQN algorithm when the task data is 1000 kbit, and the maximum reduction can be 35.3% compared to the traditional baseline algorithm.

  • Reservoir-Based 1D Convolution: Low-Training-Cost AI Open Access

    Yuichiro TANAKA  Hakaru TAMUKOH  

     
    LETTER-Neural Networks and Bioengineering

      Publicized:
    2023/09/11
      Vol:
    E107-A No:6
      Page(s):
    941-944

    In this study, we introduce a reservoir-based one-dimensional (1D) convolutional neural network that processes time-series data at a low computational cost, and investigate its performance and training time. Experimental results show that the proposed network incurs a lower training computational cost and that it outperforms conventional reservoir computing in a sound-classification task.

  • Data-Quality Aware Incentive Mechanism Based on Stackelberg Game in Mobile Edge Computing Open Access

    Shuyun LUO  Wushuang WANG  Yifei LI  Jian HOU  Lu ZHANG  

     
    PAPER-Mobile Information Network and Personal Communications

      Publicized:
    2023/09/14
      Vol:
    E107-A No:6
      Page(s):
    873-880

    Crowdsourcing has become a popular data-collection method to relieve the burden of high cost and latency in data gathering. Since the users involved in crowdsourcing are volunteers, incentives are needed to encourage them to provide data. However, current incentive mechanisms mostly pay attention to data quantity while ignoring data quality. In this paper, we design a Data-quality awaRe IncentiVe mEchanism (DRIVE) for collaborative tasks based on the Stackelberg game to motivate users with high quality, the highlight of which is the dynamic reward allocation scheme based on the proposed data quality evaluation method. To guarantee that the data quality evaluation responds in real time, we introduce the mobile edge computing framework. Finally, one case study is given and its real-data experiments demonstrate the superior performance of DRIVE.

  • App-Level Multi-Surface Framework for Supporting Cross-Platform User Interface Distribution Open Access

    Yeongwoo HA  Seongbeom PARK  Jieun LEE  Sangeun OH  

     
    LETTER-Information Network

      Publicized:
    2023/12/19
      Vol:
    E107-D No:4
      Page(s):
    564-568

    With the recent advances in IoT, there is a growing interest in multi-surface computing, where a mobile app can cooperatively utilize multiple devices' surfaces. We propose a novel framework that seamlessly augments mobile apps with multi-surface computing capabilities. It enables various apps to employ multiple surfaces with acceptable performance.

  • A Trie-Based Authentication Scheme for Approximate String Queries Open Access

    Yu WANG  Liangyong YANG  Jilian ZHANG  Xuelian DENG  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2023/12/20
      Vol:
    E107-D No:4
      Page(s):
    537-543

    Cloud computing has become the mainstream computing paradigm nowadays. More and more data owners (DO) choose to outsource their data to a cloud service provider (CSP), which is responsible for data management and query processing on behalf of the DO, so as to cut down operational costs for the DO. However, in real-world applications, the CSP may be untrusted, hence it is necessary to authenticate the query result returned from the CSP. In this paper, we consider the problem of approximate string query result authentication in the context of database outsourcing. Based on the Merkle Hash Tree (MHT) and the Trie, we propose an authenticated tree structure named MTrie for authenticating approximate string query results. We design efficient algorithms for query processing and query result authentication. To verify the effectiveness of our method, we have conducted extensive experiments on real datasets and the results show that our proposed method can effectively authenticate approximate string query results.
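
    To make the authentication idea concrete, here is a minimal sketch of a plain Merkle Hash Tree, the building block named in the abstract. It is not the paper's MTrie; the helper names, the use of SHA-256, and the rule of duplicating the last node on odd-sized levels are assumptions chosen for this example. The data owner publishes the root, the provider returns a record with its sibling hashes, and the client recomputes the root.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the root."""
    levels = [[h(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                 # odd level: duplicate the last node
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1              # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(leaf, proof, root):
    """Recompute the root from a returned record and its proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

records = [b"apple", b"apply", b"ample", b"angle", b"ankle"]
levels = build_levels(records)
root = levels[-1][0]                     # published by the data owner
proof = prove(levels, 4)                 # produced by the service provider
print(verify(b"ankle", proof, root))     # genuine record
print(verify(b"uncle", proof, root))     # tampered record
```

    The proof grows with the tree height rather than with the number of records, which is why hash trees are a common basis for query result authentication.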

  • Research on Building an ARM-Based Container Cloud Platform Open Access

    Lin CHEN  Xueyuan YIN  Dandan ZHAO  Hongwei LU  Lu LI  Yixiang CHEN  

     
    PAPER-General Fundamentals and Boundaries

      Publicized:
    2023/08/07
      Vol:
    E107-A No:4
      Page(s):
    654-665

    ARM chips, with their low energy consumption and low cost, have been rapidly applied to smart office and smart entertainment scenarios, including cloud mobile phones and cloud games. This paper first summarizes the key technologies and development status of these scenarios, including CPU, memory, and IO hardware virtualization characteristics, ARM hypervisors and containers, GPU virtualization, network virtualization, resource management, and remote transmission technologies. Then, in view of the current lack of publicly available reference solutions for constructing an ARM cloud, this paper proposes and constructs an implementation framework for building an ARM cloud, and successively focuses on the formal definition of the virtualization framework, the Android container system and resource quota management methods, GPU virtualization based on API remoting and GPU pass-through, and the remote transmission technology. Finally, the experimental results show that the proposed model and corresponding component implementation methods are effective; in particular, the pass-through mode for virtualizing GPU resources has higher performance and higher parallelism.

  • ILP Based Approaches for Optimizing Early Decompute in Two Level Adiabatic Logic Circuits

    Yuya USHIODA  Mineo KANEKO  

     
    PAPER-VLSI Design Technology and CAD

      Publicized:
    2023/09/04
      Vol:
    E107-A No:3
      Page(s):
    600-609

    Adiabatic logic circuits are regarded as one of the most attractive solutions for low-power circuit design. This study is dedicated to optimizing the design of the Two-Level Adiabatic Logic (2LAL) circuit, which boasts a relatively simple structure and superior low-power performance among many asymptotically adiabatic or quasi-adiabatic logic families, but suffers from a large number of timing buffers for “decompute”. Our focus is on the “early decompute” technique for fully pipelined 2LAL, and we propose two ILP approaches for minimizing hardware cost through optimization of early decompute. In the first approach, the problem is formulated as a kind of scheduling problem, while in the second it is reformulated as a node selection problem (stable set problem). The performance of the proposed methods is evaluated using several benchmark circuits from ISCAS-85, and a maximum hardware reduction of 70% is observed compared with an existing method.

  • Identification of Redundant Flip-Flops Using Fault Injection for Low-Power Approximate Computing Circuits

    Jiaxuan LU  Yutaka MASUDA  Tohru ISHIHARA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized:
    2023/08/31
      Vol:
    E107-A No:3
      Page(s):
    540-548

    Approximate computing (AC) saves energy and improves performance by introducing approximation into computation in error-tolerant applications. This work focuses on an AC strategy that accurately performs important computations and approximates others. To make AC circuits practical, we need to carefully determine how important each computation is, and then appropriately approximate the redundant computations while maintaining the required computational quality. In this paper, we focus on the importance of computations at the flip-flop (FF) level and propose a novel importance evaluation methodology. The key idea of the proposed methodology is a two-step fault injection (FI) algorithm to extract the near-optimal set of redundant FFs in the circuit. In the first step, the proposed methodology performs the FI simulation for each FF and extracts the candidates of redundant FFs. Then, in the second step, the proposed methodology extracts the set of redundant FFs in a binary search manner. Thanks to the two-step strategy, the proposed algorithm reduces the complexity of architecture exploration from an exponential order to a linear order without understanding the functionality and behavior of the target application program. Experimental results show that the proposed methodology identifies the candidates of redundant FFs depending on the given constraints. In a case study of an image processing accelerator, the truncation for identified redundant FFs reduces the circuit area by 29.6% and saves power dissipation by 44.8% under the ASIC implementation while satisfying the PSNR constraint. Similarly, the dynamic power dissipation is saved by 47.2% under the FPGA implementation.

  • Efficient Construction of Encoding Polynomials in a Distributed Coded Computing Scheme

    Daisuke HIBINO  Tomoharu SHIBUYA  

     
    PAPER-Cryptography and Information Security

      Publicized:
    2023/08/10
      Vol:
    E107-A No:3
      Page(s):
    476-485

    Distributed computing is one of the powerful solutions for computational tasks that involve massive datasets. Lagrange coded computing (LCC), proposed by Yu et al. [15], realizes private and secure distributed computing in the presence of stragglers, malicious workers, and colluding workers by using an encoding polynomial. Since the encoding polynomial depends on the dataset, it must be updated each time a new dataset arrives. Therefore, an efficient algorithm for constructing the encoding polynomial is necessary. In this paper, we propose Newton coded computing (NCC), which is based on Newton interpolation to construct the encoding polynomial. Let K, L, and T be the number of data, the length of each data, and the number of colluding workers, respectively. Then, the computational complexity of constructing an encoding polynomial is improved from O(L(K+T) log^2(K+T) log log(K+T)) for LCC to O(L(K+T) log(K+T)) for the proposed method. Furthermore, by applying the proposed method, the computational complexity of updating the encoding polynomial is improved from O(L(K+T) log^2(K+T) log log(K+T)) for LCC to O(L) for the proposed method.
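
    The property of Newton interpolation that makes updates attractive is that adding a data point appends one coefficient and leaves the earlier ones untouched. The sketch below shows only this textbook property for scalar data over the rationals; the class name and the use of exact fractions are assumptions for illustration, and it does not reproduce the paper's NCC construction or its O(L) update cost.

```python
from fractions import Fraction

class NewtonPoly:
    """Interpolating polynomial in Newton form:
    p(x) = c0 + c1 (x - x0) + c2 (x - x0)(x - x1) + ...
    """

    def __init__(self):
        self.xs = []       # interpolation nodes, in insertion order
        self.coeffs = []   # Newton coefficients c0, c1, ...

    def evaluate(self, x):
        """Horner-style evaluation of the Newton form."""
        result = Fraction(0)
        for c, xi in zip(reversed(self.coeffs), reversed(self.xs)):
            result = result * (x - xi) + c
        return result

    def add_point(self, x, y):
        """Append one coefficient so that p(x) == y; nodes must be distinct."""
        x, y = Fraction(x), Fraction(y)
        basis = Fraction(1)
        for xi in self.xs:
            basis *= (x - xi)          # (x - x0)(x - x1)... at the new node
        self.coeffs.append((y - self.evaluate(x)) / basis)
        self.xs.append(x)

p = NewtonPoly()
p.add_point(0, 1)
p.add_point(1, 3)
before = list(p.coeffs)                # coefficients after two points
p.add_point(2, 11)                     # a new data point arrives
print(before, p.coeffs, p.evaluate(3))
```

    In the Lagrange form, by contrast, every basis polynomial depends on all nodes, so a new point changes every term.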

  • Information-Theoretic Perspectives for Simulation-Based Security in Multi-Party Computation

    Mitsugu IWAMOTO  

     
    INVITED PAPER-Cryptography and Information Security

      Publicized:
    2023/12/01
      Vol:
    E107-A No:3
      Page(s):
    360-372

    Information-theoretic security and computational security are fundamental paradigms of security in the theory of cryptography. The two paradigms interact with each other but have shown different progress, which motivates us to explore the intersection between them. In this paper, we focus on Multi-Party Computation (MPC) because the security of MPC is formulated by simulation-based security, which originates from computational security, even if it requires information-theoretic security. We provide several equivalent formalizations of the security of MPC under a semi-honest model from the viewpoints of information theory and statistics. The interpretations of these variants are so natural that they support the other aspects of simulation-based security. Specifically, the variants based on conditional mutual information and sufficient statistics are interesting because security proofs for those variants can be given by information measures and the factorization theorem, respectively. To exemplify this, we show several security proofs of BGW (Ben-Or, Goldwasser, Wigderson) protocols, which are basically proved by constructing a simulator.

  • Correlated Randomness Reduction in Domain-Restricted Secure Two-Party Computation

    Keitaro HIWATASHI  Koji NUIDA  

     
    PAPER

      Publicized:
    2023/10/04
      Vol:
    E107-A No:3
      Page(s):
    283-290

    Secure two-party computation is a cryptographic tool that enables two parties to compute a function jointly without revealing their inputs. It is known that any function can be realized in the correlated randomness (CR) model, where a trusted dealer distributes input-independent CR to the parties beforehand. Sometimes a more efficient secure two-party protocol can be constructed for a function g than for a function f, where g is a restriction of f. However, it is not known in which cases a more efficient protocol can be constructed for a domain-restricted function. In this paper, we focus on the size of CR. We prove that a more efficient protocol can be constructed for a domain-restricted function when there is a “good” structure in the CR space of a protocol for the original function, and show a unified way to construct a more efficient protocol in such a case. In addition, we show two applications of the above result: the first shows that some known techniques for reducing CR size for domain-restricted functions can be derived in a unified way, and the second shows that our result yields a more efficient protocol than an existing one.

  • Precoder Optimization Using Data Correlation for Wireless Data Aggregation

    Ayano NAKAI-KASAI  Naoyuki HAYASHI  Tadashi WADAYAMA  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E107-B No:3
      Page(s):
    330-338

    In this paper, we consider precoder design for wireless data aggregation in sensor networks. The precoder optimization problem can be formulated as minimization of mean squared error under transmit power and block diagonal constraints. We include the statistical correlation of data in the optimization problem, which appears in typical applications but is ignored in conventional design methods. We propose precoder optimization algorithms based on projected gradient descent with projection onto the constraint sets. The proposed method can achieve better performance than the conventional methods that do not incorporate data correlation, especially when data are highly correlated. We also extend the proposed approach to the context of over-the-air computation.

  • Lightweight and Fast Low-Light Image Enhancement Method Based on PoolFormer

    Xin HU  Jinhua WANG  Sunhan XU  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2023/10/05
      Vol:
    E107-D No:1
      Page(s):
    157-160

    Images captured in low-light environments have low visibility and high noise, which will seriously affect subsequent visual tasks such as target detection and face recognition. Therefore, low-light image enhancement is of great significance in obtaining high-quality images and is a challenging problem in computer vision tasks. A low-light enhancement model, LLFormer, based on the Vision Transformer, uses axis-based multi-head self-attention and a cross-layer attention fusion mechanism to reduce the complexity and achieve feature extraction. This algorithm can enhance images well. However, the calculation of the attention mechanism is complex and the number of parameters is large, which limits the application of the model in practice. In response to this problem, a lightweight module, PoolFormer, is used to replace the attention module with spatial pooling, which can increase the parallelism of the network and greatly reduce the number of model parameters. To suppress image noise and improve visual effects, a new loss function is constructed for model optimization. The experiment results show that the proposed method not only reduces the number of parameters by 49%, but also performs better in terms of image detail restoration and noise suppression compared with the baseline model. On the LOL dataset, the PSNR and SSIM were 24.098dB and 0.8575 respectively. On the MIT-Adobe FiveK dataset, the PSNR and SSIM were 27.060dB and 0.9490. The evaluation results on the two datasets are better than the current mainstream low-light enhancement algorithms.

  • Node-to-Set Disjoint Paths Problem in Cross-Cubes

    Rikuya SASAKI  Hiroyuki ICHIDA  Htoo Htoo Sandi KYAW  Keiichi KANEKO  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2023/10/06
      Vol:
    E107-D No:1
      Page(s):
    53-59

    The increasing demand for high-performance computing in recent years has led to active research on massively parallel systems. The interconnection network in a massively parallel system interconnects hundreds of thousands of processing elements so that they can process large tasks while communicating with one another. By regarding the processing elements as nodes and the links between processing elements as edges, we can discuss various problems of interconnection networks in the framework of graph theory. Many topologies have been proposed for interconnection networks of massively parallel systems. The hypercube is a very popular topology and it has many variants. The cross-cube is one such topology, which can be obtained by adding one extra edge to each node of the hypercube. The cross-cube reduces the diameter of the hypercube and allows cycles of odd lengths. Therefore, we focus on the cross-cube and propose an algorithm that constructs disjoint paths from a node to a set of nodes. We give a proof of correctness of the algorithm. Also, we show that the time complexity and the maximum path length of the algorithm are O(n^3 log n) and 2n - 3, respectively. Moreover, we estimate that the average execution time of the algorithm is O(n^2) based on a computer experiment.

  • A Coded Aperture as a Key for Information Hiding Designed by Physics-in-the-Loop Optimization

    Tomoki MINAMATA  Hiroki HAMASAKI  Hiroshi KAWASAKI  Hajime NAGAHARA  Satoshi ONO  

     
    PAPER

      Publicized:
    2023/09/28
      Vol:
    E107-D No:1
      Page(s):
    29-38

    This paper proposes a novel application of coded apertures (CAs) for visual information hiding. CA is one of the representative computational photography techniques, in which a patterned mask is attached to a camera as an alternative to a conventional circular aperture. With image processing in the post-processing phase, various functions such as omnifocal image capturing and depth estimation can be performed. In general, a watermark embedded as high-frequency components is difficult to extract if captured outside the focal length, where defocus blur occurs. Installing a CA in the camera is a simple way to mitigate the difficulty, and several attempts have been made to design better CAs for stable extraction. In contrast, our motivation is to design a specific CA as well as an information hiding scheme; the secret information can only be decoded if an image with hidden information is captured with the key aperture at a certain distance outside the focus range. The proposed technique designs the key aperture patterns and information hiding scheme through evolutionary multi-objective optimization so as to minimize the decryption error of a hidden image when using the key aperture while minimizing the accuracy when using other apertures. During the optimization process, solution candidates, i.e., key aperture patterns and information hiding schemes, are evaluated on actual devices to account for disturbances that cannot be considered in optical simulations. Experimental results have shown that decoding can be performed with the designed key aperture and similar ones, that decrypted image quality deteriorates as the similarity between the key and the aperture used for decryption decreases, and that the proposed information hiding technique works on actual devices.

  • Resource Allocation for Mobile Edge Computing System Considering User Mobility with Deep Reinforcement Learning

    Kairi TOKUDA  Takehiro SATO  Eiji OKI  

     
    PAPER-Network

      Publicized:
    2023/10/06
      Vol:
    E107-B No:1
      Page(s):
    173-184

    Mobile edge computing (MEC) is a key technology for providing services that require low latency by migrating cloud functions to the network edge. The potential low quality of the wireless channel should be noted when mobile users with limited computing resources offload tasks to an MEC server. To improve the transmission reliability, it is necessary to perform resource allocation in an MEC server, taking into account the current channel quality and the resource contention. There are several works that take a deep reinforcement learning (DRL) approach to address such resource allocation. However, these approaches consider a fixed number of users offloading their tasks, and do not assume a situation where the number of users varies due to user mobility. This paper proposes Deep reinforcement learning model for MEC Resource Allocation with Dummy (DMRA-D), an online learning model that addresses the resource allocation in an MEC server under the situation where the number of users varies. By adopting dummy state/action, DMRA-D keeps the state/action representation. Therefore, DMRA-D can continue to learn one model regardless of variation in the number of users during the operation. Numerical results show that DMRA-D improves the success rate of task submission while continuing learning under the situation where the number of users varies.

  • An Anomalous Behavior Detection Method Utilizing IoT Power Waveform Shapes

    Kota HISAFURU  Kazunari TAKASAKI  Nozomu TOGAWA  

     
    PAPER

      Publicized:
    2023/08/16
      Vol:
    E107-A No:1
      Page(s):
    75-86

    In recent years, with the widespread adoption of Internet of Things (IoT) devices, security issues for hardware devices have been increasing, and detecting their anomalous behaviors has become quite important. One of the effective methods for detecting anomalous behaviors of IoT devices is to utilize consumed energy and operation duration time extracted from their power waveforms. However, the existing methods do not consider the shape of time-series data and cannot distinguish between power waveforms with similar consumed energy and duration time but different shapes. In this paper, we propose a method for detecting anomalous behaviors based on the shape of time-series data by incorporating a shape-based distance (SBD) measure. The proposed method first obtains the entire power waveform of the target IoT device and extracts several application power waveforms. After that, we give the invariances to them so that the SBD between every two application power waveforms can be obtained effectively. Based on the SBD values, the local outlier factor (LOF) method can finally distinguish between normal application behaviors and anomalous application behaviors. Experimental results demonstrate that the proposed method successfully detects anomalous application behaviors, while the existing state-of-the-art method fails to detect them.
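
    The sketch below illustrates a shape-based distance in its commonly used form, 1 minus the maximum normalized cross-correlation over all shifts. This formulation, the function names, and the toy waveforms are assumptions for illustration; the abstract does not give the paper's exact definition, and the invariance processing and the LOF step are not shown.

```python
import math

def cross_correlation(x, y, shift):
    """Sum of x[i + shift] * y[i] over the indices where both are defined."""
    return sum(x[i + shift] * y[i]
               for i in range(len(y)) if 0 <= i + shift < len(x))

def sbd(x, y):
    """Shape-based distance: 0 for identical shapes, larger for different ones."""
    norm = math.sqrt(sum(v * v for v in x)) * math.sqrt(sum(v * v for v in y))
    if norm == 0:
        raise ValueError("SBD is undefined for an all-zero waveform")
    best = max(cross_correlation(x, y, s)
               for s in range(-(len(y) - 1), len(x)))
    return 1.0 - best / norm

# Toy "power waveforms" (made-up numbers, not measured data).
pulse = [0, 1, 2, 1, 0, 0]
delayed_pulse = [0, 0, 1, 2, 1, 0]     # same shape, shifted by one sample
double_spike = [0, 2, 0, 2, 0, 0]      # same total energy use, other shape

print(sbd(pulse, delayed_pulse))       # near zero: shapes match after a shift
print(sbd(pulse, double_spike))        # clearly larger: shapes differ
```

    Here `pulse` and `double_spike` have the same sum and the same duration, so features based only on total energy and time would not separate them, while the shape-based distance does.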

  • D2EcoSys: Decentralized Digital Twin EcoSystem Empower Co-Creation City-Level Digital Twins Open Access

    Kenji KANAI  Hidehiro KANEMITSU  Taku YAMAZAKI  Shintaro MORI  Aram MINE  Sumiko MIYATA  Hironobu IMAMURA  Hidenori NAKAZATO  

     
    INVITED PAPER

      Publicized:
    2023/10/26
      Vol:
    E107-B No:1
      Page(s):
    50-62

    A city-level digital twin is a critical enabling technology to construct a smart city that helps improve citizens' living conditions and quality of life. Currently, research and development regarding the digital replica city are pursued worldwide. However, many research projects only focus on creating the 3D city model. A mechanism to involve key players, such as data providers, service providers, and application developers, is essential for constructing the digital replica city and producing various city applications. Based on this motivation, the authors of this paper are pursuing a research project, namely Decentralized Digital Twin EcoSystem (D2EcoSys), to create an ecosystem to advance (and self-grow) the digital replica city regarding time and space directions, city services, and values. This paper introduces an overview of the D2EcoSys project: vision, problem statement, and approach. In addition, the paper discusses the recent research results regarding networking technologies and demonstrates an early testbed built in the Kashiwa-no-ha smart city.