Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Tsutomu SASAO Takashi MATSUBARA Katsufumi TSUJI Yoshiaki KOGA
A universal interconnection network implements arbitrary interconnections among n terminals. This paper considers a problem to realize such a network using contact switches. When n=2, it can be implemented with a single switch. The number of different connections among n terminals is given by the Bell number B(n). The Bell number shows the total number of methods to partition n distinct elements. For n=2, 3, 4, 5 and 6, the corresponding Bell numbers are 2, 5, 15, 52, and 203, respectively. This paper shows a method to realize an n-terminal universal interconnection network with $\frac{3}{8}(n^2-1)$ contact switches when n=2m+1≥5, and $\frac{n}{8}(3n+2)$ contact switches when n=2m≥6. Also, it shows that a lower bound on the number of contact switches to realize an n-terminal universal interconnection network is $\lceil \log_2 B(n) \rceil$, where B(n) is the Bell number.
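The quantities in this abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the Bell numbers are computed with the standard Bell triangle, and the two bounds are simply the closed-form expressions stated above.

```python
def bell_numbers(limit):
    """Return [B(0), ..., B(limit)] computed with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(limit):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


def switch_upper_bound(n):
    """Number of contact switches used by the construction in the abstract."""
    if n % 2 == 1 and n >= 5:
        return 3 * (n * n - 1) // 8      # (3/8)(n^2 - 1), n odd
    if n % 2 == 0 and n >= 6:
        return n * (3 * n + 2) // 8      # (n/8)(3n + 2), n even
    raise ValueError("formulas are stated only for odd n >= 5 and even n >= 6")


def switch_lower_bound(n):
    """ceil(log2(B(n))), computed exactly with integer arithmetic."""
    b = bell_numbers(n)[n]
    return (b - 1).bit_length()


if __name__ == "__main__":
    for n in range(5, 9):
        print(n, switch_lower_bound(n), switch_upper_bound(n))
```

Integer arithmetic is used for the logarithm so that the ceiling is not affected by floating-point rounding.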
Tsutomu SASAO Yuto HORIKAWA Yukihiro IGUCHI
A classification function maps a set of vectors into several classes. A machine learning problem is treated as a design problem for partially defined classification functions. To realize classification functions for MNIST handwritten digits, three different architectures are considered: Single-unit realization, 45-unit realization, and 45-unit ×r realization. The 45-unit realization consists of 45 ternary classifiers, 10 counters, and a max selector. The test accuracies of these architectures are compared using the MNIST data set.
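One way to read the 45-unit realization is as pairwise voting. This is an assumption, not something the abstract states: since C(10, 2) = 45, each ternary classifier is taken here to handle one pair of digit classes and to output one of three values (vote for the first class, vote for the second, or abstain). The sketch below uses hypothetical toy units in place of trained classifiers and only illustrates the counter and max-selector stage.

```python
from itertools import combinations

DIGITS = range(10)
PAIRS = list(combinations(DIGITS, 2))  # C(10, 2) = 45 class pairs (assumed pairing)


def make_toy_unit(i, j):
    """Hypothetical stand-in for a trained ternary classifier for classes i and j.

    It returns i, j, or None (abstain). Here the "input" x is just the true
    digit, so the unit is trivially correct; a real unit would inspect pixels.
    """
    def unit(x):
        if x == i:
            return i
        if x == j:
            return j
        return None
    return unit


def classify(x, units):
    """Feed x to every unit, count votes per class, and select the maximum."""
    counters = [0] * 10                # one counter per digit class
    for unit in units:
        vote = unit(x)
        if vote is not None:
            counters[vote] += 1
    return max(DIGITS, key=lambda d: counters[d])   # max selector


TOY_UNITS = [make_toy_unit(i, j) for i, j in PAIRS]

if __name__ == "__main__":
    print([classify(d, TOY_UNITS) for d in DIGITS])
```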
Akira ITO Rei UENO Naofumi HOMMA
This study presents a formal verification method for Galois-field (GF) arithmetic circuits with the characteristics of more than two values. The proposed method formally verifies the correctness of circuit functionality (i.e., the input-output relations given as GF-polynomials) by checking the equivalence between a specification and a gate-level netlist. We represent a netlist using simultaneous algebraic equations and solve them based on a novel polynomial reduction method that can be efficiently applied to arithmetic over extension fields $\mathbb{F}_{p^m}$, where the characteristic p is larger than two. By using the reverse topological term order to derive the Gröbner basis, our method can complete the verification, even when a target circuit includes bugs. In addition, we introduce an extension of the Galois-Field binary moment diagrams to perform the polynomial reductions faster. Our experimental results show that the proposed method can efficiently verify practical $\mathbb{F}_{p^m}$ arithmetic circuits, including those used in modern cryptography. Moreover, we demonstrate that the extended polynomial reduction technique can enable verification that is up to approximately five times faster than the original one.
Radomir S. STANKOVIĆ Milena STANKOVIĆ Claudio MORAGA Jaakko T. ASTOLA
Binary bent functions have a strictly specified number of non-zero values. In the same way, ternary bent functions satisfy certain requirements on the elements of their value vectors. These requirements can be used to specify six classes of ternary bent functions. Classes are mutually related by encoding of function values. Given a basic ternary bent function, other functions in the same class can be constructed by permutation matrices having a block structure similar to that of the factor matrices appearing in the Good-Thomas decomposition of Cooley-Tukey Fast Fourier transform and related algorithms.
Miloš M. RADMANOVIĆ Radomir S. STANKOVIĆ
Multiple-valued bent functions are functions with the highest nonlinearity, which makes them interesting for multiple-valued cryptography. Since the general structure of bent functions is still unknown, methods for the construction of bent functions are often based on some deterministic criteria. For practical applications, it is often necessary to be able to construct a bent function that does not belong to any specific class of functions. Thus, the criteria for construction are combined with an exhaustive search over all possible functions, which can be very time-consuming. A solution is to restrict the search space by some conditions that should be satisfied by the produced bent functions. In this paper, we propose a construction method based on spectral subsets of multiple-valued bent functions satisfying certain appropriately formulated restrictions in the Galois field (GF) and Reed-Muller-Fourier (RMF) domains. Experimental results show that the proposed method efficiently constructs ternary and quaternary bent functions by using these restrictions.
A nonvolatile field-programmable gate array (NV-FPGA), in which the circuit-configuration information is retained without a power supply, offers a powerful solution to the standby power issue. In this paper, an NV-FPGA is proposed where the programmable logic and interconnect function blocks are described in a hardware description language and are pushed through a standard-cell-based design flow with nonvolatile flip-flops. The use of the standard-cell-based design flow makes it possible to migrate to an arbitrary process technology and to perform architecture-level simulation with physical information. As a typical example, the proposed NV-FPGA is designed under 55nm CMOS/100nm magnetic tunnel junction (MTJ) technologies, and the performance of the proposed NV-FPGA is evaluated in comparison with that of a CMOS-only volatile FPGA.
Naoto SOGA Shimpei SATO Hiroki NAKAHARA
Advancements in portable electrocardiographs have allowed electrocardiogram (ECG) signals to be recorded in everyday life. Machine-learning techniques, including deep learning, have been used in numerous studies to analyze ECG signals because they exhibit superior performance to conventional methods. A mobile ECG analysis device is needed so that abnormal ECG waves can be detected anywhere. Such a mobile device requires real-time performance and low power consumption. However, deep-learning-based models often have too many parameters to implement on mobile hardware: the amount of hardware becomes too large and dissipates too much power. We propose a design flow to implement an outlier detector using an autoencoder on a low-end FPGA. To shorten the preparation time of the ECG data used in training the autoencoder, an unsupervised learning technique is applied. Additionally, to minimize the volume of the weight parameters, a weight sparseness technique is applied, and all the parameters are converted into fixed-point values. We show that even if the parameters are reduced and converted into fixed-point values, the degradation in outlier detection performance is only 0.83 points. By reducing the volume of the weight parameters, all the parameters can be stored in on-chip memory. We design the architecture according to the CRS format, which is a well-known data structure for sparse matrices, minimizing the hardware size and reducing the power consumption. We use weight sharing to further reduce the weight-parameter volume. By using weight sharing, we could reduce the bit width of the memories by 60% while maintaining the outlier detection performance. We implemented the autoencoder on a Digilent Inc. ZedBoard and compared the results with those for an ARM mobile CPU for a built-in device. The results indicated that our FPGA implementation of the outlier detector was 12 times faster and 106 times more energy-efficient.
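The CRS (compressed row storage) format mentioned above stores only the non-zero weights of a sparse matrix. The sketch below is a generic illustration of that data structure and of a matrix-vector product over it, as would be needed for one autoencoder layer. It is not the paper's hardware architecture, and the function names are hypothetical.

```python
def to_crs(dense):
    """Convert a dense matrix (list of rows) into CRS arrays.

    values  : non-zero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[r]..row_ptr[r+1] delimits row r inside `values`
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for col, weight in enumerate(row):
            if weight != 0:
                values.append(weight)
                col_idx.append(col)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def crs_matvec(values, col_idx, row_ptr, x):
    """Compute y = W x while touching only the stored non-zero weights."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    weights = [[0, 2, 0],
               [1, 0, 3],
               [0, 0, 0]]
    vals, cols, ptrs = to_crs(weights)
    print(vals, cols, ptrs)
    print(crs_matvec(vals, cols, ptrs, [1, 2, 3]))
```

Because only `values`, `col_idx`, and `row_ptr` are stored, the memory footprint scales with the number of non-zero weights rather than with the full matrix size.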
Takao WAHO Tomoaki KOIZUMI Hitoshi HAYASHI
A feedforward (FF) network using ΔΣ modulators is investigated to implement a non-binary analog-to-digital (A/D) converter. Weighting coefficients in the network are determined to suppress the generation of quantization noise. A moving average is adopted to prevent the analog signal amplitude from increasing beyond the allowable input range of the modulators. The noise transfer function is derived and used to estimate the signal-to-noise ratio (SNR). The FF network output is a non-uniformly distributed multi-level signal, which results in a better SNR than a uniformly distributed one. Also, the effect of the characteristic mismatch in analog components on the SNR is analyzed. Our behavioral simulations show that the SNR is improved by more than 30 dB, or equivalently a bit resolution of 5 bits, compared with a conventional first-order ΔΣ modulator.
Yosuke IIJIMA Keigo TAYA Yasushi YUMINAKA
To meet the increasing demand for high-speed communication in VLSI (very large-scale integration) systems, next-generation high-speed data transmission standards (e.g., IEEE 802.3bs and PCIe 6.0) will adopt four-level pulse amplitude modulation (PAM-4) for data coding. Although PAM-4 is spectrally efficient to mitigate inter-symbol interference caused by bandwidth-limited wired channels, it is more sensitive than conventional non-return-to-zero line coding. To evaluate the received signal quality when using adaptive coefficient settings for a PAM-4 equalizer during data transmission, we propose an eye-opening monitor technique based on machine learning. The proposed technique uses a Gaussian mixture model to classify the received PAM-4 symbols. Simulation and experimental results demonstrate the feasibility of adaptive equalization for PAM-4 coding.
Ryoichi MIYAUCHI Akio YOSHIDA Shuya NAKANO Hiroki TAMURA Koichi TANNO Yutaka FUKUCHI Yukio KAWAMURA Yuki KODAMA Yuichi SEKIYA
This paper describes a fractional-N all-digital frequency-locked loop (ADFLL) that is robust to PVT variation, and its application to a microcontroller unit. With a conventional FLL, it is difficult to achieve the required specification in a fine CMOS process. In particular, the conventional FLL suffers from problems such as unexpected operation and long lock time, which are caused by PVT variation. To overcome these problems, we propose a new ADFLL that dynamically selects its digital filter coefficients. The proposed ADFLL was evaluated through HSPICE simulation and through chips fabricated using a 0.13 µm CMOS process. From these results, we observed that the proposed ADFLL is robust to PVT variation owing to the dynamic selection of digital filter coefficients, that the lock time is improved by up to 57%, and that the clock jitter is 0.85 nsec.
Ryosuke NISHIHARA Hidehiko MATSUBAYASHI Tomomoto ISHIKAWA Kentaro MORI Yutaka HATA
The frequency of uterine peristalsis is closely related to the success rate of pregnancy. Ultrasonic imaging is almost always employed to measure this frequency, and the physician evaluates it subjectively from the ultrasound image with the naked eye. This paper aims to measure the frequency of uterine peristalsis from the ultrasound image. The ultrasound image consists of relative brightness values, and the contour of the uterus is not clear. It was not possible to measure the frequency using inter-frame difference or optical flow, which are the representative methods of motion detection, since the uterine peristaltic movement is too small for them to be applied. This paper proposes a method for measuring the frequency of uterine peristalsis from the ultrasound image in the implantation phase. First, the uterine peristalsis is traced semi-automatically from the images along the location axis and the time axis. Second, frequency analysis of the uterine peristalsis is performed by applying the Fourier transform over 3 minutes. The frequency of uterine peristalsis is then taken as the dominant frequency component, i.e., the one with the maximum value in the frequency spectrum. In this way, we evaluate the frequency of uterine peristalsis quantitatively from the ultrasound image. Finally, the success rate of pregnancy is calculated from the frequency based on fuzzy logic. This enabled us to evaluate the success rate of pregnancy by measuring the uterine peristalsis from the ultrasound image.
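The frequency-analysis step can be illustrated with a small sketch. It assumes the tracing step has already produced a one-dimensional displacement signal sampled at a fixed rate; the function name, the sampling rate, and the synthetic test signal are all hypothetical and not taken from the paper. A plain discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the largest non-DC spectral component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]   # remove the DC offset

    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):           # positive-frequency bins only
        coeff = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate_hz / n


if __name__ == "__main__":
    # Synthetic 3-minute trace sampled at 2 Hz with a 0.05 Hz oscillation
    # (3 cycles per minute); these numbers are illustrative only.
    rate = 2.0
    trace = [math.sin(2 * math.pi * 0.05 * (t / rate)) for t in range(360)]
    print(dominant_frequency(trace, rate) * 60, "cycles per minute")
```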
Xiaoping ZHOU Peng LI Yulong ZENG Xuepeng FAN Peng LIU Toshiaki MIYAZAKI
Blockchain-based voting, including liquid voting, has been extensively studied in recent years. However, it remains challenging to implement liquid voting on blockchain using Ethereum smart contract. The challenge comes from the gas limit, which is that the number of instructions for processing a ballot cannot exceed a certain amount. This restricts the application scenario with respect to algorithms whose time complexity is linear to the number of voters, i.e., O(n). As the blockchain technology can well share and reuse the resources, we study a model of liquid voting on blockchain and propose a fast algorithm, named Flash, to eliminate the restriction. The key idea behind our algorithm is to shift some on-chain process to off-chain. In detail, we first construct a Merkle tree off-chain which contains all voters' properties. Second, we use Merkle proof and interval tree to process each ballot with O(log n) on-chain time complexity. Theoretically, the algorithm can support up to 21000 voters with respect to the current gas limit on Ethereum. Experimentally, the result implies that the consumed gas fee remains at a very low level when the number of voters increases. This means our algorithm makes liquid voting on blockchain practical even for massive voters.
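The O(log n) on-chain cost comes from the fact that a Merkle proof contains one sibling hash per tree level. The sketch below shows only this generic building block, under stated assumptions: it is written in Python rather than as a smart contract, SHA-256 from the standard library stands in for whatever hash the contract would use, the leaf encoding is made up, and the interval-tree part of Flash is not modelled.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Build all tree levels off-chain; the last level holds the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2 == 1:
            current.append(current[-1])      # pad odd levels by duplication
        levels.append(
            [h(current[i] + current[i + 1]) for i in range(0, len(current), 2)]
        )
    return levels


def make_proof(levels, index):
    """Collect one (sibling_hash, sibling_is_right) pair per level."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf; this is the part that would run on-chain."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


if __name__ == "__main__":
    voters = [f"voter-{i}".encode() for i in range(5)]   # hypothetical leaf data
    levels = build_levels(voters)
    root = levels[-1][0]
    proof = make_proof(levels, 3)
    print(len(proof), verify(voters[3], proof, root))
```

Only the root needs to be stored on-chain; each ballot then carries its own proof, so verification work grows with the tree depth rather than with the number of voters.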
Ahmed Salih AL-KHALEEFA Rosilah HASSAN Mohd Riduan AHMAD Faizan QAMAR Zheng WEN Azana Hafizah MOHD AMAN Keping YU
Machine learning is becoming an attractive topic for researchers and industrial firms in the area of computational intelligence because of its proven effectiveness and performance in resolving real-world problems. However, some challenges such as precise search, intelligent discovery and intelligent learning need to be addressed and solved. One of the most important challenges is the non-steady performance of various machine learning models during online learning and operation. Online learning is the ability of a machine-learning model to update its knowledge without being retrained from scratch when new information becomes available. To address this challenge, we evaluate and analyze four widely used online machine learning models: Online Sequential Extreme Learning Machine (OSELM), Feature Adaptive OSELM (FA-OSELM), Knowledge Preserving OSELM (KP-OSELM), and Infinite Term Memory OSELM (ITM-OSELM). Specifically, we provide a testbed for the models by building a framework and configuring various evaluation scenarios given different factors in the topological and mathematical aspects of the models. Furthermore, we generate different characteristics of the time series to be learned. The results demonstrate the real impact of the tested parameters and scenarios on the models. In terms of accuracy, KP-OSELM and ITM-OSELM are superior to OSELM and FA-OSELM. With regard to time efficiency related to the percentage of decreases in active features, ITM-OSELM is superior to KP-OSELM.
Mingrui ZHU Yangjian JI Wenjun JU Xinjian GU Chao LIU Zhifang XU
With the development of power market demand response capability, load aggregators play a more important role in the coordination between power grid and users. They have a wealth of user side business data resources related to user demand, load management and equipment operation. By building a business model of business data resource utilization and innovating the content and mode of intelligent power service, it can guide the friendly interaction between power supply, power grid and load, effectively improve the flexibility of power grid regulation, speed up demand response and refine load management. In view of the current situation of insufficient utilization of business resources, low user participation and imperfect business model, this paper analyzes the process of home appliance enterprises participating in peak shaving and valley filling (PSVF) as load aggregators, and expounds the relationship between the participants in the power market; a business service model of smart home appliance participating in PSVF based on cloud platform is put forward; the market value created by home appliance business resources for each participant under the joint action of market-oriented means, information technology and power consumption technology is discussed, and typical business scenarios are listed; taking Haier business resource analysis as an example, the feasibility of the proposed business model in innovating the content and value realization of intelligent power consumption services is proved.
Thi Thu HIEN NGUYEN Thai BINH NGUYEN Ngoc PHUONG PHAM Quoc TRUONG DO Tu LUC LE Chi MAI LUONG
Speech recognition is a technique that recognizes words and sentences in audio form and converts them into text sentences. Currently, with the advancement of deep learning technologies, speech recognition has achieved very satisfactory results close to human abilities. However, there are still limitations in identification results such as lack of punctuation, capitalization, and standardized numerical data. Vietnamese also contains local words, homonyms, etc, which make it difficult to read and understand the identification results for users as well as to perform the next tasks in Natural Language Processing (NLP). In this paper, we propose to combine the transformer decoder with conditional random field (CRF) to restore punctuation and capitalization for the Vietnamese automatic speech recognition (ASR) output. By chunking input sentences and merging output sequences, it is possible to handle longer strings with greater accuracy. Experiments show that the method proposed in the Vietnamese post-speech recognition dataset delivers the best results.
Ye TAO Fang KONG Wenjun JU Hui LI Ruichun HOU
As an important type of science and technology service resource, energy consumption data play a vital role in the process of value chain integration between home appliance manufacturers and the state grid. Accurate electricity consumption prediction is essential for demand response programs in smart grid planning. The vast majority of existing prediction algorithms only exploit data belonging to a single domain, i.e., historical electricity load data. However, dependencies and correlations may exist among different domains, such as the regional weather condition and local residential/industrial energy consumption profiles. To take advantage of cross-domain resources, a hybrid energy consumption prediction framework is presented in this paper. This framework combines the long short-term memory model with an encoder-decoder unit (ED-LSTM) to perform sequence-to-sequence forecasting. Extensive experiments are conducted with several of the most commonly used algorithms over integrated cross-domain datasets. The results indicate that the proposed multistep forecasting framework outperforms most of the existing approaches.
Weizhi LIAO Mingtong HUANG Pan MA Yu WANG
There are many knowledge entities in sci-tech intelligence resources. Extracting these knowledge entities is of great importance for building knowledge networks, exploring the relationship between knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement, but still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach for knowledge entity extraction based on BiLSTM and conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and CRF is then employed to integrate global label information to achieve optimal labels. This approach does not require the manual construction of features, and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and corresponding knowledge networks are established with a further elaboration on the correlation of two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
Ying KANG Aiqin HOU Zimin ZHAO Daguang GAN
Paper recommendation has become an increasingly important yet challenging task due to the rapidly expanding volume and scope of publications in the broad research community. Due to the lack of user profiles in public digital libraries, most existing methods for paper recommendation rely on paper similarity measurements based on citations or contents, and still suffer from various performance issues. In this paper, we construct a graphical form of citation relations to identify relevant papers and design a hybrid recommendation model that combines both citation- and content-based approaches to measure paper similarities. Considering that citations at different locations in one article are likely of different significance, we define a concept of citation similarity with varying weights according to the sections of citations. We evaluate the performance of our recommendation method using Spearman correlation on real publication data from public digital libraries such as CiteSeer and Wanfang. Extensive experimental results show that the proposed hybrid method exhibits better performance than state-of-the-art techniques, and achieves 40% higher recommendation accuracy on average in comparison with citation-based approaches.
Zimin ZHAO Ying KANG Aiqin HOU Daguang GAN
Differentiable neural architecture search (DARTS) is now a widely disseminated weight-sharing neural architecture search method, and it consists of two stages: search and evaluation. However, the original DARTS suffers from some well-known shortcomings. Firstly, the width and depth of the network, as well as the operations, are discontinuous between the two stages, which causes a performance collapse. Secondly, DARTS has a high computational overhead. In this paper, we propose a synchronous progressive approach to solve the discontinuity problem for network depth and width, and we use the 0-1 loss function to alleviate the discontinuity problem caused by the discretization of operations. The computational overhead is reduced by using the partial channel connection. Besides, we also discuss and propose a solution to the aggregation of skip operations during the search process of DARTS. We conduct extensive experiments on the CIFAR-10 and WANFANG datasets. Specifically, our approach reduces search time significantly (from 1.5 to 0.1 GPU days) and improves the accuracy of image recognition.
Pengtao JIA Qi ZHAO Boze LI Jing ZHANG
Gait recognition distinguishes one individual from others according to the natural patterns of human gaits. Gait recognition is a challenging signal processing technology for biometric identification due to the ambiguity of contours and the complex feature extraction procedure. In this work, we proposed a new model - the convolutional neural network (CNN) joint attention mechanism (CJAM) - to classify the gait sequences and conduct person identification using the CASIA-A and CASIA-B gait datasets. The CNN model has the ability to extract gait features, and the attention mechanism continuously focuses on the most discriminative area to achieve person identification. We present a comprehensive transformation from gait image preprocessing to final identification. The results from 12 experiments show that the new attention model leads to a lower error rate than others. The CJAM model improved the 3D-CNN, CNN-LSTM (long short-term memory), and the simple CNN by 8.44%, 2.94% and 1.45%, respectively.
Xueqing ZHANG Xiaoxia LIU Jun GUO Wenlei BAI Daguang GAN
As scientific and technological resources are experiencing information overload, it is quite expensive to find exactly the resources that users are interested in. The personalized recommendation system is a good candidate to solve this problem, but data sparseness and the cold-start problem still hinder the application of recommendation systems. Sparse data affects the quality of the similarity measurement and consequently the quality of the recommender system. In this paper, we propose a matrix factorization recommendation algorithm based on similarity calculation (SCMF), which introduces potential similarity relationships to solve the problem of data sparseness. Furthermore, a penalty factor is adopted in the latent item similarity matrix calculation to capture more real relationships. We compared our approach with six other recommendation algorithms and conducted experiments on five public data sets. According to the experimental results, the recommendation precision improves by 2% to 9% over the best traditional algorithm. For sparse data sets, the prediction accuracy also improves by 0.17% to 18%. Besides, our approach was applied to patent resource exploitation provided by the Wanfang patents retrieval system. Experimental results show that our method performs better than commonly used algorithms, especially under the cold-start condition.
Wenlei BAI Jun GUO Xueqing ZHANG Baoying LIU Daguang GAN
Finding the exact items that users need among massive patent resources is a matter of great urgency. Although recommender systems have addressed this problem to a certain extent, there are still some challenging problems, such as tracking user interests and improving the recommendation quality when the rating matrix is extremely sparse. In this paper, we propose a novel method called Collaborative Filtering Auto-Encoder for the top-N recommendation. This method employs Auto-Encoders to extract the item's features, converts a high-dimensional sparse vector into a low-dimensional dense vector, and then uses the dense vector for similarity calculation. At the same time, to make the recommendation list closer to the user's recent interests, we divide the recommendation weight into time-based and recent similarity-based weights. In fact, the proposed method is an improved, item-based collaborative filtering model with more flexible components. Experimental results show that the method consistently outperforms state-of-the-art top-N recommendation methods by a significant margin on standard evaluation metrics.
Weizhi LIAO Guanglei YE Weijun YAN Yaheng MA Dongzhou ZUO
An efficient feature selection strategy is important in the dimension reduction of data. Existing research efforts can be summarized into three classes: Filter methods, Wrapper methods, and Embedded methods. In this work, we propose an integrated two-stage feature extraction method, referred to as FWS, which combines the Filter and Wrapper methods to efficiently extract important features in an innovative hybrid mode. FWS conducts a first level of selection to filter out non-related features using correlation analysis and a second level of selection to find the near-optimal subset that captures valuable discrete features by evaluating the performance of a predictive model trained on that subset. Compared with technologies such as mRMR and Relief-F, FWS significantly improves the detection performance through an integrated optimization strategy. Results show the performance superiority of the proposed solution over several well-known methods for feature selection.
Weizhi LIAO Yaheng MA Yiling CAO Guanglei YE Dongzhou ZUO
Traditional text-level sentiment analysis methods usually ignore the emotional tendency corresponding to the object or attribute. To address this problem, a novel two-stage fine-grained text-level sentiment analysis model based on syntactic rule matching and deep semantics is proposed in this paper. Based on an analysis of the characteristics and difficulties of fine-grained sentiment analysis, a two-stage fine-grained sentiment analysis algorithm framework is constructed. In the first stage, the objects and their corresponding opinions are extracted based on syntactic rule matching to obtain preliminary objects and opinions. In the second stage, a deep semantic network is used to extract more accurate objects and opinions. To address the problem that the extraction result contains multiple objects and opinions to be matched, an object-opinion matching algorithm based on the minimum lexical separation distance is proposed to achieve accurate pairwise matching. Finally, the proposed algorithm is evaluated on several public datasets to demonstrate its practicality and effectiveness.
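The matching step can be illustrated with a toy sketch. The abstract does not define the distance precisely, so this example makes an assumption: the lexical separation distance is taken to be the absolute difference between token positions, and each opinion is paired with its nearest object. The function name and the example sentence are hypothetical.

```python
def match_opinions_to_objects(tokens, object_positions, opinion_positions):
    """Pair each opinion word with the object word closest to it in the text.

    Distance is the absolute difference of token indices (an assumption made
    for this sketch); ties go to the object that appears first.
    """
    pairs = []
    for opinion in opinion_positions:
        nearest = min(object_positions, key=lambda obj: abs(obj - opinion))
        pairs.append((tokens[nearest], tokens[opinion]))
    return pairs


if __name__ == "__main__":
    sentence = "the screen is bright but the battery is weak".split()
    objects = [1, 6]     # "screen", "battery" (e.g. from the extraction stages)
    opinions = [3, 8]    # "bright", "weak"
    print(match_opinions_to_objects(sentence, objects, opinions))
```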
Fanying ZHENG Yangjian JI Fu GU Xinjian GU Jin ZHANG
To address slow response and scattered resources in patent service, this paper proposes a one-stop service business model based on scientific and technological resource bundles. The proposed one-stop model is composed of a project model, a resource bundle model and a service product model through Web Service integration. This paper describes the patent resource bundle model from the aspects of content and context, and designs the configuration of patent service products and patent resource bundles. The model is then applied to the patent service of the Yangtze River Delta urban agglomeration in China, where the monthly agent volume increased by 38.8% and the average response time decreased by 14.3%. Besides, it is conducive to improving user satisfaction and the resource sharing efficiency of the urban agglomeration.
Fanying ZHENG Fu GU Yangjian JI Jianfeng GUO Xinjian GU Jin ZHANG
In the context of Web 2.0, users and resources interact more and more frequently in the process of resource sharing and consumption. However, current research on resource pricing mainly focuses on the attributes of the resource itself and does not weigh the interests of the participants in resource sharing. To deal with these problems, this paper establishes a pricing mechanism for resource-user interaction evaluation based on multi-agent game theory. The model also incorporates user similarity, evaluation bias based on link analysis, and punishment of academic group cheating. Based on data on 181 scholars and 509 articles from the Wanfang database, this paper conducts 5483 pricing experiments over 13 months. The results show that this model is more effective than other pricing models: the pricing accuracy for resources is 94.2%, and the accuracy of user value evaluation is 96.4%. In addition, the model can intuitively show the relationships among users and among resources. The case study also shows that a user's knowledge level is not positively correlated with his or her authority. Discovering and punishing academic group cheating is conducive to the objective evaluation of researchers and resources. The pricing mechanism for scientific and technological resources and users proposed in this paper is a prerequisite for fair trade in scientific and technological resources.
Yangshengyan LIU Fu GU Yangjian JI Yijie WU Jianfeng GUO Xinjian GU Jin ZHANG
Resource sharing ensures that required resources are available to those who need them. However, owing to the lack of a proper sharing model, the current sharing rate of scientific and technological resources is low, which impedes technological innovation and value chain development. Here we propose a novel method for sharing scientific and technological resources that stores resources as nodes and correlations as links to form a complex network. We present a few-shot relational learning model to solve the cold-start and long-tail problems induced by newly added resources. In experiments on the NELL-One and Wiki-One datasets, our one-shot results outperform the baseline framework, metaR, by 40.2% and 4.1% on MRR in the Pre-Train setting. We also show two practical applications, a resource graph and a resource map, to demonstrate how the complex network helps resource sharing.
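For readers unfamiliar with the MRR metric cited above, mean reciprocal rank is the average of 1/rank of the correct answer over all test queries. The snippet below is a minimal sketch of that standard definition; the function name and the sample ranks are illustrative and are not data from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Return the mean of 1/rank, where each rank is the 1-based position
    of the correct entity in the model's ranked candidate list."""
    if not ranks:
        raise ValueError("at least one rank is required")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be positive")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Three hypothetical queries whose correct answers were ranked 1st, 2nd and 4th.
mrr = mean_reciprocal_rank([1, 2, 4])  # (1 + 0.5 + 0.25) / 3
```

An MRR of 1.0 means the correct entity is always ranked first; values closer to 0 mean it tends to appear far down the list.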
Yida HONG Yanlei YIN Cheng GUO Xiaobao LIU
Many scientific and technological resources (STR) cannot meet the needs of real demand-based industrial services. To address this issue, the characteristics of scientific and technological resource services (STRS) are analyzed, and a method for the optimal combination of demand-based STR based on multi-community collaborative search is put forward. An evaluation system for optimal combination, comprising the indexes of response time, innovation, composability, and correlation, is developed for multi-services of STR, and a hybrid optimal combination model for STR is constructed. An evolutionary algorithm of multi-community collaborative search is used to study the interactions between general communities and model communities, thereby improving the ability of the algorithm to adapt to random dynamic resource services. The convergence measurement function yields an average convergence value of CMCCSA=0.00274, which outperforms the compared algorithms. The findings of this study indicate that the proposed methods can better achieve the maximum efficiency of demand-based STR, and they provide new ideas and methods for implementing demand-based real industrial services for STR.
Kazuei HIRONAKA Kensuke IIZUKA Miho YAMAKURA Akram BEN AHMED Hideharu AMANO
Multi-FPGA systems have been receiving a lot of attention as low-cost and energy-efficient systems for Multi-access Edge Computing (MEC). For this purpose, a bare-metal multi-FPGA system called FiC (Flow-in-Cloud) is under development. In this paper, we introduce the FiC multi-FPGA cluster, to which a partial reconfiguration (PR) FPGA design flow is applied to support online replacement of user-defined accelerators while the FPGA interconnection network keeps running, and its low-level multi-FPGA management software, called remote PR manager. With the remote PR manager, the user can define the FiC FPGA cluster setup in JSON and control the cluster from a user application through the cooperation of a simple cluster management tool/library called ficmgr on the client host and a REST API service provider called ficwww on the Raspberry Pi 3 (RPi3) of each node. According to evaluation results on a prototype FiC FPGA cluster system with 12 nodes, using online application replacement by PR and on-the-fly FPGA bitstream compression, the time for FPGA bitstream distribution was reduced to 1/17 and the total cluster setup time was reduced by 21∼57% compared with cluster setup using full-configuration FPGA bitstreams.
Thao-Nguyen TRUONG Ryousei TAKANO
Data parallelism is the dominant method used to train deep learning (DL) models on High-Performance Computing systems such as large-scale GPU clusters. When training a DL model on a large number of nodes, inter-node communication becomes a bottleneck because of its relatively higher latency and lower link bandwidth compared with intra-node communication. Although some communication techniques have been proposed to cope with this problem, all of these approaches aim to deal with the large message size issue while diminishing the effect of the limitations of the inter-node network. In this study, we investigate the benefit of increasing inter-node link bandwidth by using hybrid switching systems, i.e., Electrical Packet Switching and Optical Circuit Switching. We found that the typical data transfer of synchronous data-parallel training is long-lived and rarely changes, so it can be sped up with optical switching. Simulation results on the Simgrid simulator show that our approach speeds up the training of deep learning applications, especially at large scale.
Guang-Hua SONG Xin-Feng LI Zhe-Ming LU
Recently, the controllability of complex networks has become a hot topic in the field of network science, where the driver nodes play a key and central role. Therefore, studying their structural characteristics is of great significance for understanding the underlying mechanism of network controllability. In this paper, we systematically investigate the nodal centrality of driver nodes in controlling complex networks. We find that driver nodes tend to be low in-degree but high out-degree nodes, and that most driver nodes tend to have low betweenness centrality but relatively high closeness centrality. We also find that the tendencies of driver nodes with respect to eigenvector centrality and Katz centrality are very similar: driver nodes avoid both high eigenvector centrality and high Katz centrality. Finally, we find that the distribution of driver nodes with respect to PageRank centrality is polarized, i.e., the vast majority of driver nodes tend to be low PageRank nodes, whereas only a few driver nodes tend to be high PageRank nodes.
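The abstract does not say how driver nodes are identified. A common approach in the structural controllability literature, assumed here and not confirmed by the source, is to compute a maximum matching of the directed network and take the nodes with no matched incoming edge as one valid driver node set. The sketch below follows that assumption with a simple augmenting-path matching; the function name `driver_nodes` is hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def driver_nodes(
    nodes: Sequence[Hashable], edges: Sequence[Tuple[Hashable, Hashable]]
) -> List[Hashable]:
    """Return one driver node set of a directed graph via maximum matching.

    Assumption: drivers are the nodes left unmatched on the "incoming" side.
    The set is generally not unique, but its size is.
    """
    successors: Dict[Hashable, List[Hashable]] = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)

    matched_from: Dict[Hashable, Hashable] = {}  # v -> u for matched edge u -> v

    def try_augment(u: Hashable, seen: Set[Hashable]) -> bool:
        # Search for an augmenting path that starts at the out-copy of u.
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_from or try_augment(matched_from[v], seen):
                matched_from[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())

    unmatched = [v for v in nodes if v not in matched_from]
    # With a perfect matching, one (arbitrary) node must still be driven.
    return unmatched if unmatched else [nodes[0]]


# A directed path needs one driver; a star with three leaves needs three.
path_drivers = driver_nodes([1, 2, 3], [(1, 2), (2, 3)])
star_drivers = driver_nodes([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
```

In the star example the hub has a high out-degree and zero in-degree, and it always appears in the driver set, which is consistent with the degree tendency reported above.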
Yusuke HARA Xueting WANG Toshihiko YAMASAKI
Video inpainting is the task of filling in missing regions in videos. In this task, it is important to use information from other frames efficiently and to generate plausible results with sufficient temporal consistency. In this paper, we present a video inpainting method that jointly uses affine transformation and deformable convolutions for frame alignment. The former is responsible for rough frame-scale alignment, and the latter performs fine pixel-level alignment. Our model does not depend on 3D convolutions, which limit the temporal window, or on troublesome flow estimation. The proposed method achieves improved object removal results and better PSNR and SSIM values than previous learning-based methods.
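As a reminder of what the PSNR figure measures, the sketch below computes the standard peak signal-to-noise ratio, 10 log10(MAX^2 / MSE), between a reference frame and a reconstructed frame. It assumes 8-bit intensities (peak value 255) stored as flat sequences of pixel values; the function name and the sample data are illustrative only.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[float], distorted: Sequence[float], max_value: float = 255.0
) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by one intensity level, so MSE = 1.
value = psnr([0, 0, 0, 0], [1, 1, 1, 1])  # 20 * log10(255), about 48.13 dB
```

Higher PSNR means the inpainted frame is numerically closer to the ground truth; SSIM complements it by comparing local structure rather than per-pixel error.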
Yu SONG Xu QIAO Yutaro IWAMOTO Yen-Wei CHEN Yili CHEN
Accurate and automatic quantitative cephalometric analysis is of great importance in orthodontics. The fundamental step in cephalometric analysis is to annotate anatomical landmarks of interest on X-ray images. Computer-aided automatic annotation remains an open problem. In this paper, we propose an efficient deep learning-based coarse-to-fine approach to achieve accurate landmark detection. In the coarse detection step, we train a deep learning-based deformable transformation model on training samples. We register test images to the reference image (one training image) using the trained model to predict the coarse locations of landmarks on the test images. Thus, regions of interest (ROIs) that include landmarks can be located. In the fine detection step, we use trained deep convolutional neural networks (CNNs) to detect landmarks in ROI patches. For each landmark, there is one corresponding neural network, which directly regresses the landmark's coordinates. The fine step can be considered a refinement or fine-tuning step based on the coarse detection step. We validated the proposed method on the public dataset from the 2015 International Symposium on Biomedical Imaging (ISBI) Grand Challenge. Compared with the state-of-the-art method, we not only achieved comparable detection accuracy (the mean radial error is about 1.0-1.6mm), but also greatly shortened the computation time (4 seconds per image).
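The mean radial error quoted above is, in its usual definition, the average Euclidean distance between predicted and ground-truth landmark positions, converted to millimetres by the pixel spacing. The following is a minimal sketch of that definition; the function name, the `mm_per_pixel` parameter, and the sample coordinates are assumptions for illustration and are not values from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_radial_error(
    predicted: Sequence[Point], ground_truth: Sequence[Point], mm_per_pixel: float = 1.0
) -> float:
    """Average Euclidean distance between matching landmarks, in millimetres."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    )
    return mm_per_pixel * total / len(predicted)


# Two hypothetical landmarks: one exact, one off by a 3-4-5 triangle (5 pixels).
mre = mean_radial_error([(0, 0), (3, 4)], [(0, 0), (0, 0)], mm_per_pixel=0.1)
```

Here the two radial errors are 0 and 5 pixels, so the mean is 2.5 pixels, or 0.25 mm at the assumed spacing of 0.1 mm per pixel.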
Jiabao GAO Yuchen YAO Zhengjie LI Jinmei LAI
A series of Binarized Neural Networks (BNNs) show acceptable accuracy in image classification tasks and achieve excellent performance on field programmable gate arrays (FPGAs). Nevertheless, we observe that existing BNN designs are quite time-consuming when changing the target BNN and accelerating a new BNN. Therefore, this paper presents FCA-BNN, a flexible and configurable accelerator that employs a layer-level configurable technique to seamlessly execute each layer of the target BNN. First, to save resources and improve energy efficiency, hardware-oriented optimal formulas are introduced to design an energy-efficient computing array for different sizes of padded-convolution and fully-connected layers. Moreover, to accelerate the target BNNs efficiently, we use an analytical model to explore the optimal design parameters for FCA-BNN. Finally, our proposed mapping flow changes the target network by entering an order, and accelerates a new network by compiling and loading the corresponding instructions, without generating and loading a bitstream. Evaluations on three major BNN structures show that the differences between the inference accuracy of FCA-BNN and that of a GPU are just 0.07%, 0.31% and 0.4% for LFC, VGG-like and Cifar-10 AlexNet, respectively. Furthermore, our energy efficiency is 0.8× that of existing customized FPGA accelerators for LFC and 2.6× for VGG-like. For Cifar-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than the CPU and GPU, respectively. To the best of our knowledge, FCA-BNN is the most efficient design for changing the target BNN and accelerating a new BNN, while maintaining competitive performance.
A neural network that outputs reconstructed images based on projection data containing scattered X-rays is presented, and the proposed scheme exhibits better accuracy than conventional computed tomography (CT), in which the scatter information is removed. In medical X-ray CT, it is common practice to remove scattered X-rays using a collimator placed in front of the detector. In this study, the scattered X-rays were assumed to carry useful information, and a method was devised to utilize this information effectively using a neural network. To this end, we generated 70,000 sets of projection data by Monte Carlo simulations, using as the target object a cube comprising 216 (6 × 6 × 6) smaller cubes with random density parameters. For each projection simulation, the densities of the smaller cubes were reset to different values, and detectors were deployed around the target object to capture the scattered X-rays from all directions. Then, a neural network was trained on these projection data to output the densities of the smaller cubes. We confirmed through numerical evaluations that the neural-network approach that utilized scattered X-rays reconstructed images with higher accuracy than the conventional method, in which the scattered X-rays were removed. The results of this study suggest that utilizing the scattered X-ray information can help significantly reduce the patient dose during imaging.
Cache prefetching brings huge performance benefits, but it comes at the cost of microarchitectural security in processors. In this letter, we take a deep dive into the internal workings of the DCUIP prefetcher, one of the prefetchers in Intel processors. We discover that the DCUIP table is shared among different execution contexts in hyperthreading-enabled processors, which leads to another microarchitectural vulnerability. By exploiting this vulnerability, we propose a DCUIP poisoning attack. We demonstrate that an AES encryption key can be extracted from an AES-NI implementation by mounting the proposed attack.
Dongni HU Chengxin CHEN Pengyuan ZHANG Junfeng LI Yonghong YAN Qingwei ZHAO
Recently, automated recognition and analysis of human emotion has attracted increasing attention from multidisciplinary communities. However, it is challenging to utilize the emotional information simultaneously from multiple modalities. Previous studies have explored different fusion methods, but they mainly focused on either inter-modality interaction or intra-modality interaction. In this letter, we propose a novel two-stage fusion strategy named modality attention flow (MAF) to model the intra- and inter-modality interactions simultaneously in a unified end-to-end framework. Experimental results show that the proposed approach outperforms the widely used late fusion methods, and achieves even better performance when the number of stacked MAF blocks increases.
Kohei SAKAI Keita TAKAHASHI Toshiaki FUJII
Coded-aperture imaging has been utilized for compressive light field acquisition; several images are captured using different aperture patterns, and from those images, an entire light field is computationally reconstructed. This method has been extended to dynamic light fields (moving scenes). However, this method assumed that the patterns were gray-valued and of arbitrary shapes. Implementing such patterns required a special device such as a liquid crystal on silicon (LCoS) display, which made the imaging system costly and prone to noise. To address this problem, we propose the use of a binary aperture pattern that rotates over time, which can be implemented with a rotating plate with a hole. We demonstrate that although using such a pattern limits the design space, our method can still achieve a high reconstruction quality comparable to that of the original method.