Takao WAHO Tomoaki KOIZUMI Hitoshi HAYASHI
A feedforward (FF) network using ΔΣ modulators is investigated to implement a non-binary analog-to-digital (A/D) converter. Weighting coefficients in the network are determined to suppress the generation of quantization noise. A moving average is adopted to prevent the analog signal amplitude from increasing beyond the allowable input range of the modulators. The noise transfer function is derived and used to estimate the signal-to-noise ratio (SNR). The FF network output is a non-uniformly distributed multi-level signal, which results in a better SNR than a uniformly distributed one. Also, the effect of the characteristic mismatch in analog components on the SNR is analyzed. Our behavioral simulations show that the SNR is improved by more than 30 dB, or equivalently a bit resolution of 5 bits, compared with a conventional first-order ΔΣ modulator.
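The stated equivalence between the 30 dB improvement and 5 bits can be checked with the usual rule of thumb for an ideal quantizer, under which each additional bit of resolution adds roughly 6.02 dB of SNR. Assuming that rule applies here:

```latex
\Delta N \approx \frac{\Delta \mathrm{SNR}}{6.02\ \mathrm{dB/bit}}
         = \frac{30\ \mathrm{dB}}{6.02\ \mathrm{dB/bit}}
         \approx 4.98 \approx 5\ \text{bits}
```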
Yosuke IIJIMA Keigo TAYA Yasushi YUMINAKA
To meet the increasing demand for high-speed communication in VLSI (very large-scale integration) systems, next-generation high-speed data transmission standards (e.g., IEEE 802.3bs and PCIe 6.0) will adopt four-level pulse amplitude modulation (PAM-4) for data coding. Although PAM-4 is spectrally efficient to mitigate inter-symbol interference caused by bandwidth-limited wired channels, it is more sensitive than conventional non-return-to-zero line coding. To evaluate the received signal quality when using adaptive coefficient settings for a PAM-4 equalizer during data transmission, we propose an eye-opening monitor technique based on machine learning. The proposed technique uses a Gaussian mixture model to classify the received PAM-4 symbols. Simulation and experimental results demonstrate the feasibility of adaptive equalization for PAM-4 coding.
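As a rough illustration of how a Gaussian mixture model can classify received PAM-4 symbols, the sketch below fits a four-component one-dimensional mixture with plain expectation-maximization and assigns each sample to its most likely level. It is a minimal sketch under stated assumptions, not the authors' monitor: the function names (`fit_gmm_1d`, `classify`), the nominal levels of -3, -1, 1, 3, and the synthetic sample offsets are all hypothetical.

```python
import math

def gauss(x, mean, var):
    """Gaussian probability density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(samples, k=4, iters=50, var_floor=1e-4):
    """Fit a k-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    lo, hi = min(samples), max(samples)
    # Start the means evenly spaced across the observed amplitude range.
    means = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    variances = [((hi - lo) / (2 * k)) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, samples)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, samples)) / nj
            variances[j] = max(var, var_floor)  # keep components from collapsing
            weights[j] = nj / len(samples)
    return weights, means, variances

def classify(x, model):
    """Return the index of the most likely mixture component for sample x."""
    weights, means, variances = model
    scores = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    return scores.index(max(scores))

# Synthetic, noise-free-ish PAM-4 samples (assumed levels and offsets).
levels = [-3, -1, 1, 3]
samples = [lv + d for lv in levels for d in (-0.2, -0.1, 0.0, 0.1, 0.2)]
model = fit_gmm_1d(samples)
decisions = [classify(x, model) for x in (-2.7, -1.2, 0.9, 3.1)]
```

The fitted means and variances summarize where each symbol level sits and how wide it is, which is the kind of information an eye-opening monitor could draw on when judging equalizer settings.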
Xueqing ZHANG Xiaoxia LIU Jun GUO Wenlei BAI Daguang GAN
As scientific and technological resources suffer from information overload, it is costly for users to find exactly the resources they are interested in. A personalized recommendation system is a good candidate for solving this problem, but data sparseness and the cold-start problem still hinder its application. Sparse data degrades the quality of the similarity measurement and consequently the quality of the recommender system. In this paper, we propose a matrix factorization recommendation algorithm based on similarity calculation (SCMF), which introduces potential similarity relationships to address data sparseness. In addition, a penalty factor is adopted in the latent item similarity matrix calculation to capture more of the real relationships. We compared our approach with six other recommendation algorithms in experiments on five public data sets. According to the experimental results, the recommendation precision improves by 2% to 9% over the best traditional algorithm. For sparse data sets, the prediction accuracy also improves by 0.17% to 18%. Furthermore, our approach was applied to patent resource exploitation provided by the Wanfang patents retrieval system. Experimental results show that our method performs better than commonly used algorithms, especially under the cold-start condition.
Jiabao GAO Yuchen YAO Zhengjie LI Jinmei LAI
A series of Binarized Neural Networks (BNNs) show acceptable accuracy in image classification tasks and achieve excellent performance on field programmable gate arrays (FPGAs). Nevertheless, we observe that existing BNN designs make it quite time-consuming to change the target BNN or to accelerate a new BNN. Therefore, this paper presents FCA-BNN, a flexible and configurable accelerator that employs a layer-level configurable technique to seamlessly execute each layer of the target BNN. First, to save resources and improve energy efficiency, hardware-oriented optimal formulas are introduced to design an energy-efficient computing array for different sizes of padded-convolution and fully-connected layers. Moreover, to accelerate the target BNNs efficiently, we exploit an analytical model to explore the optimal design parameters for FCA-BNN. Finally, our proposed mapping flow changes the target network by entering an order, and accelerates a new network by compiling and loading the corresponding instructions, without generating and loading a bitstream. Evaluations on three major BNN structures show that the differences in inference accuracy between FCA-BNN and a GPU are just 0.07%, 0.31% and 0.4% for LFC, VGG-like and Cifar-10 AlexNet, respectively. Furthermore, our energy efficiency reaches 0.8× that of existing customized FPGA accelerators for LFC and 2.6× for VGG-like. For Cifar-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than a CPU and a GPU, respectively. To the best of our knowledge, FCA-BNN is the most efficient design for changing the target BNN and accelerating a new BNN, while maintaining competitive performance.
Keiichiro SATO Ryoichi SHINKUMA Takehiro SATO Eiji OKI Takanori IWAI Takeo ONISHI Takahiro NOBUKIYO Dai KANETOMO Kozo SATODA
Predictive spatial-monitoring, which predicts spatial information such as road traffic, has attracted much attention in the context of smart cities. Machine learning enables predictive spatial-monitoring by using a large amount of aggregated sensor data. Since the capacity of mobile networks is strictly limited, serious transmission delays occur when communication traffic loads are heavy. If some of the data used for predictive spatial-monitoring do not arrive on time, prediction accuracy degrades because the prediction has to be done using only the received data, which implies that data for prediction are ‘delay-sensitive’. A utility-based allocation technique has suggested modeling the temporal characteristics of such delay-sensitive data for prioritized transmission. However, no study has addressed a temporal model for prioritized transmission in predictive spatial-monitoring. Therefore, this paper proposes a scheme that enables the creation of a temporal model for predictive spatial-monitoring. The scheme is roughly composed of two steps: the first involves creating training data from original time-series data and a machine learning model that can use the data, while the second involves building a temporal model using feature selection in the learning model. Feature selection enables the importance of data to be estimated from the machine learning model in terms of how much the data contribute to prediction accuracy. This paper considers road-traffic prediction as a scenario and shows that the temporal models created with the proposed scheme can handle real spatial datasets. A numerical study demonstrated that our temporal model works effectively in prioritized transmission for predictive spatial-monitoring in terms of prediction accuracy.
Toshihisa NABETANI Masahiro SEKIYA
With the development of the IEEE 802.11 standard for wireless LANs, there has been an enormous increase in the usage of wireless LANs in factories, plants, and other industrial environments. In industrial applications, wireless LAN systems require high reliability for stable real-time communications. In this paper, we propose a multi-access-point (AP) diversity method that contributes to robust data transmission toward the realization of ultra-reliable low-latency communications (URLLC) in wireless LANs. The proposed method obtains a diversity effect from multiple paths with independent transmission errors and collisions without modifying the IEEE 802.11 standard or increasing the overhead of communication resources. We evaluate the effects of the proposed method by numerical analysis, develop a prototype to demonstrate its feasibility, and perform experiments using the prototype in a factory wireless environment. The numerical evaluations and experiments show that the proposed method increases reliability and decreases transmission delay.
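The benefit of paths with independent errors can be illustrated with an idealized calculation. This is a sketch under assumptions, not the paper's analysis: it supposes a frame is sent over n paths, that path i loses the frame with a hypothetical probability p_i, and that losses are fully independent, so the frame is lost only when every path fails.

```latex
P_{\mathrm{loss}} = \prod_{i=1}^{n} p_i ,
\qquad \text{e.g. } n = 2,\ p_1 = p_2 = 0.1
\ \Rightarrow\ P_{\mathrm{loss}} = 0.1 \times 0.1 = 0.01
```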
Haitong YANG Guangyou ZHOU Tingting HE Maoxi LI
Current approaches to semantic role classification usually first define a representation vector for a candidate role and then feed the vector into a deep neural network to perform classification. The representation vector contains lexicalized features such as word embeddings and lemma embeddings. Linguistically, the semantic role frame of a sentence is a joint structure with strong dependencies between arguments, which current deep semantic role labeling (SRL) systems do not consider. Therefore, this paper proposes a global deep reranking model to exploit these strong dependencies. Evaluation experiments on the CoNLL 2009 shared task show that our system significantly outperforms a strong local system that does not consider role dependency relations.
Rui LI Ruqi XIAO Hong GU Weimin SU
A novel direction of arrival (DOA) estimation method for coherent signals is presented in this paper. The proposed method uses the eigenvector associated with the maximum eigenvalue, which contains the DOAs of all signals, to form a Toeplitz matrix, yielding an unconstrained optimization problem. The DOAs are then obtained by peak searching of the pseudo power spectrum without knowledge of the number of signals. It is shown that the method offers good performance and low computational complexity for coherent signals. Simulation results verify the effectiveness of the method.
Ruilin ZHANG Xingyu WANG Hirofumi SHINOHARA
In this paper, we describe a post-processing technique with high extraction efficiency (ExE) for de-biasing and de-correlating a random bitstream generated by true random number generators (TRNGs). This research is based on the N-bit von Neumann (VN_N) post-processing method, which, with a large N value, improves the ExE of the original von Neumann method to close to the Shannon entropy bound. However, as the N value increases, the mapping table complexity increases exponentially (2^N), which makes VN_N unsuitable for low-power TRNGs. To overcome this problem, at the algorithm level, we propose a waiting strategy to achieve high ExE with a small N value. At the architectural level, a Hamming weight mapping-based hierarchical structure is used to reconstruct the large mapping table using smaller tables. The hierarchical structure also decreases the correlation factor in the raw bitstream. To develop a technique with high ExE and low cost, we designed and fabricated an 8-bit von Neumann with waiting strategy (VN_8W) in 130-nm CMOS. The maximum ExE of VN_8W is 62.21%, which is 2.49 times the ExE of the original von Neumann method. NIST SP 800-22 randomness test results confirmed the de-biasing and de-correlation abilities of VN_8W. Compared with the state-of-the-art optimized 7-element iterated von Neumann, VN_8W achieved more than 20% energy reduction with higher ExE. At 0.45 V and 1 MHz, VN_8W achieved a minimum energy of 0.18 pJ/bit, which is suitable for sub-pJ low-energy TRNGs.
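For reference, the original von Neumann method that serves as the baseline here can be sketched in a few lines. The code below is a minimal illustration of that classic extractor only, not of VN_N or VN_8W; the function names and the convention of emitting the first bit of each unequal pair are choices made for this example.

```python
def von_neumann(bits):
    """Classic von Neumann extractor.

    Reads non-overlapping pairs of input bits, emits the first bit of each
    unequal pair (01 -> 0, 10 -> 1) and discards equal pairs (00, 11).
    A trailing unpaired bit is ignored.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out

def expected_exe(p_one):
    """Expected output bits per input bit for independent bits with P(1) = p_one.

    A pair yields one output bit with probability 2 * p * (1 - p) and consumes
    two input bits, so the efficiency is p * (1 - p).
    """
    return p_one * (1 - p_one)

raw = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
extracted = von_neumann(raw)        # pairs 01, 10, 11, 00, 10
baseline_exe = expected_exe(0.5)    # best case: unbiased input
```

For unbiased input the efficiency peaks at 25%, which is consistent with the abstract's figures: 62.21% divided by 25% is about 2.49.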
Yuta UKON Shimpei SATO Atsushi TAKAHASHI
Advanced information-processing services such as computer vision require a high-performance digital circuit to perform high-load processing at high speed. To achieve high-speed processing, several image-processing applications use an approximate computing technique to reduce the idle time of the circuit. However, it is difficult to design a high-speed image-processing circuit while controlling the error rate so as not to degrade service quality, so this technique is used in only a few applications. In this paper, we propose a method that achieves high-speed processing effectively by changing the processing time for each task according to a rough detection of its completion. Using this method, a high-speed processing circuit with a low error rate can be designed. The error rate is controllable, and a circuit design method to minimize the error rate is also presented. To confirm the effectiveness of our proposal, a ripple-carry adder (RCA), a 2-dimensional discrete cosine transform (2D-DCT) circuit, and a histogram of oriented gradients (HOG) feature calculation circuit are evaluated. The effective clock periods of these circuits obtained by our method at an error rate of around 1% are improved by about 64%, 6%, and 12%, respectively, compared with circuits without error. Furthermore, the impact of miscalculation on a video monitoring service using an object detection application is investigated. As a result, more than 99% of the required detection points are detected, confirming that miscalculation hardly degrades the service quality.
Dongzhen WANG Daqing HUANG Cheng XU
Reconnaissance with two cooperating unmanned aerial vehicles (UAVs) equipped with airborne visual tracking platforms is a common practice for localizing a target. Apart from random sensor noise, the localization performance depends heavily on the cooperative trajectories of the UAVs. In our previous work, we proposed a cooperative trajectory generating method that proved better than an extended Kalman filter (EKF)-based method. In this letter, an improved online trajectory generating method is proposed to enhance the previous one. First, the least squares estimation method is replaced with a geometric-optimization-based estimation method, which achieves better estimation performance than the least squares method used in our previous work. Second, in the trajectory optimization phase, the position error caused by the estimation method is also considered, which further improves the optimization of the next waypoints of the two UAVs. The improved method is well suited to two-UAV trajectory planning for cooperative target localization, and simulation results confirm that it achieves clearly better localization performance than both our previous method and the EKF-based method.
Pan TAN Zhengchun ZHOU Haode YAN Yong WANG
Locally repairable codes (LRCs) with availability have received considerable attention in recent years since they can solve many problems in distributed storage systems, such as repairing multiple node failures and managing hot data. Constructing LRCs with locality r and availability t (also called (r, t)-LRCs) with new parameters has become an interesting research subject in coding theory. The objective of this paper is to propose two generic constructions of cyclic (r, t)-LRCs via linearized polynomials over finite fields. These two constructions include two earlier constructions of cyclic LRCs from trace functions and truncated trace functions as special cases and lead to LRCs with new parameters that cannot be produced by the earlier ones.
Junhao ZHANG Masafumi KAZUNO Mizuki MOTOYOSHI Suguru KAMEDA Noriharu SUEMATSU
In this paper, we propose a direct digital RF transmitter with a 1-bit band-pass delta-sigma modulator (BP-DSM) that uses the high-order image components in the 7th Nyquist zone with Manchester coding for microwave and millimeter-wave applications. Compared with conventional non-return-to-zero (NRZ) coding, in which the high-order image components of the 1-bit BP-DSM are severely attenuated following a sinc function, the proposed 1-bit direct digital RF transmitter with Manchester coding can improve the output power and signal-to-noise ratio (SNR) of the image components in the specific (4n-1)th and (4n-2)th Nyquist zones, which is confirmed by calculating the power spectral density. Measurements are made to compare the output power and SNR of three types of 1-bit digital-to-analog converter (DAC) signals: NRZ, 50% duty return-to-zero (RZ), and Manchester coding. With a 1 Vpp/8 Gbps DAC output, 1-bit signals in Manchester coding show the highest output power of -20.3 dBm and SNR of 40.3 dB in the 7th Nyquist zone (26 GHz) under the CW condition. Compared with NRZ and RZ coding, the output power in the 7th Nyquist zone is improved by 8.1 dB and 6 dB, respectively, and the SNR is improved by 7.6 dB and 4.9 dB, respectively. Under the 5 Mbps QPSK condition, 1-bit signals in Manchester coding show the lowest error vector magnitude (EVM) of 2.4% and the highest adjacent channel leakage ratio (ACLR) of 38.2 dB, with the highest output power of -18.5 dBm in the 7th Nyquist zone (26 GHz), compared with NRZ and 50% duty RZ coding. The measurement and simulation results for the image component of the 1-bit signals in the 7th Nyquist zone (26 GHz) are consistent with the calculation results.
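The advantage of Manchester coding at a high-order image frequency can be illustrated by comparing the spectra of idealized single-bit pulses. The sketch below is a simplification made for this example, not the paper's power spectral density calculation: it assumes perfectly rectangular pulses of equal peak amplitude and ignores DAC bandwidth, the delta-sigma noise shaping, and any filtering. The function names are hypothetical; the 8 Gbps bit rate and 26 GHz frequency are taken from the abstract.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def pulse_magnitude(code, f_hz, bit_period_s):
    """|P(f)| / T for an ideal unit-amplitude pulse of one bit period T."""
    x = f_hz * bit_period_s
    if code == "NRZ":          # full-width rectangle
        return abs(sinc(x))
    if code == "RZ":           # rectangle occupying the first half of the bit
        return 0.5 * abs(sinc(x / 2))
    if code == "Manchester":   # +1 for the first half, -1 for the second half
        return abs(sinc(x / 2)) * abs(math.sin(math.pi * x / 2))
    raise ValueError(f"unknown line code: {code}")

def gain_db(code_a, code_b, f_hz, bit_period_s):
    """Spectral magnitude of code_a relative to code_b, in dB."""
    ratio = pulse_magnitude(code_a, f_hz, bit_period_s) / pulse_magnitude(
        code_b, f_hz, bit_period_s
    )
    return 20 * math.log10(ratio)

T = 1 / 8e9   # bit period at 8 Gbps
f = 26e9      # image frequency in the 7th Nyquist zone

gain_vs_nrz = gain_db("Manchester", "NRZ", f, T)
gain_vs_rz = gain_db("Manchester", "RZ", f, T)
```

Under these assumptions the Manchester pulse is stronger at 26 GHz by roughly 7.7 dB relative to NRZ and 5.3 dB relative to 50% duty RZ, which is in the same range as the measured output power improvements of 8.1 dB and 6 dB.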
Toshiki YAMADA Yoshihiro TAKAGI Chiyumi YAMADA Akira OTOMO
The optical properties of new tricyanopyrroline (TCP)-based chromophores with a benzyloxy group bound to the aminobenzene donor unit were characterized by hyper-Rayleigh scattering (HRS), absorption spectrum, and 1H-NMR measurements, and the influence of the benzyloxy group on TCP-based chromophores is discussed based on the data. A positive effect on nonlinear optical (NLO) properties was found in TCP-based NLO chromophores with a benzyloxy group compared with benchmark NLO chromophores without the benzyloxy group, suggesting the influence of an intra-molecular hydrogen bond. Furthermore, we propose the formation of double intra-molecular hydrogen bonds in the TCP chromophore with monoene as the π-conjugation bridge and aminobenzene with a benzyloxy group as the donor unit.
It is found that the electrical resistance-length characteristic of an electroactive supercoiled polymer artificial muscle depends strongly on temperature. This may result from the thermal expansion of coils in the artificial muscle, which increases the contact area of neighboring coils and results in a lower electrical resistance at a higher temperature. On the other hand, the electrical resistance-length characteristic collected during electrical driving deviates considerably from those collected at constant temperatures. Inhomogeneous heating during electrical driving appears to be a key cause of the deviation.
Rizal Setya PERDANA Yoshiteru ISHIDA
Automatic generation of textual stories from visual data, known as visual storytelling, is a recent advancement in the image-to-text problem. Instead of using a single image as input, visual storytelling processes a sequence of images into coherent sentences. A story contains non-visual concepts as well as descriptions of literal objects. While previous approaches have applied external knowledge, our approach regards the non-visual concept as the semantic correlation between the visual and textual modalities. This paper therefore presents a new feature representation based on a canonical correlation analysis between the two modalities. An attention mechanism is adopted as the underlying architecture of the image-to-text problem, rather than standard encoder-decoder models. The proposed end-to-end architecture, Canonical Correlation Attention Mechanism (CAAM), extracts time series correlation by maximizing the cross-modal correlation. Extensive experiments on the VIST dataset ( http://visionandlanguage.net/VIST/dataset.html ) were conducted to demonstrate the effectiveness of the architecture in terms of automatic metrics, and additional experiments show the impact of the modality fusion strategy.
Cuffless blood pressure (BP) monitors are noninvasive devices that measure systolic and diastolic BP without an inflatable cuff. They are easy to use, safe, and relatively accurate for resting-state BP measurement. Although commercially available from online retailers, BP monitors must be approved or certified by medical regulatory bodies for clinical use. Cuffless BP monitoring devices also need to be approved; however, only the Institute of Electrical and Electronics Engineers (IEEE) certifies these devices. In this paper, the principles of cuffless BP monitors are described, and the current situation regarding BP monitor standards and approval for medical use is discussed.
As NAND flash-based storage has become established, a flash translation layer (FTL) has been in charge of mapping data addresses on NAND flash memory. Many FTLs implement various mapping schemes, and the amount of mapping data depends on the mapping level. However, the FTL should maintain mapping consistency irrespective of how much mapping data resides in the storage. Furthermore, the cost of recovering from inconsistency needs to be considered for a faster storage reboot time. This letter proposes a novel method that enhances consistency for a page-mapping level FTL running a legacy logging policy. Moreover, the recovery cost of page mappings also decreases. The novel method adopts a virtually-shrunk segment and deactivates page-mapping logs by assembling and storing the segments. In our previous study, this segment scheme already helped embedded NAND flash-based storage improve its response time. In addition to that improvement, the novel method maximizes page-mapping consistency and therefore reduces the recovery cost compared with the legacy page-mapping FTL.
Emerging byte-addressable non-volatile memory devices are attracting much attention. A non-volatile main memory (NVMM) built on them enables larger memory size and lower power consumption than a traditional DRAM main memory. To fully utilize an NVMM, both software and hardware must be cooperatively optimized. At the same time, even for the memory module alone, the micro architecture is still being developed, although real non-volatile memory modules, such as Intel Optane DC persistent memory (DCPMM), are already on the market. Among existing NVMM evaluation environments, software simulators can evaluate various micro architectures but require long simulation times. Emulators can evaluate the whole system quickly but with less flexibility in their configuration than simulators. Thus, an NVMM emulator that can realize flexible and fast system evaluation still has an important role in exploring the optimal system. In this paper, we introduce an NVMM emulator for embedded systems and use it to explore a direction of optimization techniques for NVMMs. It is implemented on an SoC-FPGA board employing three NVMM behaviour models: coarse-grain, fine-grain and DCPMM-based. The coarse-grain and fine-grain models enable NVMM performance evaluations based on extensions of traditional DRAM behaviour. The DCPMM-based model emulates the behaviour of a real DCPMM. A whole evaluation environment is also provided, including Linux kernel modifications and several runtime functions. We first validate the developed emulator against an existing NVMM emulator, a cycle-accurate NVMM simulator and a real DCPMM. Then, the differences in program behaviour among the three models are evaluated with SPEC CPU programs. As a result, the fine-grain model reveals that the program execution time is affected by the frequency of NVMM memory requests rather than the cache hit ratio. When the fine-grain and coarse-grain models are compared under the condition that the total write latency of the former is longer than that of the latter, the fine-grain model shows lower execution time for four of fourteen programs because it exploits bank-level parallelism and row-buffer access locality.
Shota ISHIMURA Kosuke NISHIMURA Yoshiaki NAKANO Takuo TANEMURA
Coherent transceivers are now regarded as promising candidates for upgrading the current 400 Gigabit Ethernet (400GbE) transceivers to 800G. However, due to the complicated structure of a dual-polarization IQ modulator (DP-IQM) with its bulky polarization-beam splitter/combiner (PBS/PBC), an increase in the transmitter size and cost is inevitable. In this paper, we propose a compact PBS/PBC-free transmitter structure with a straight-line configuration. By using the concept of polarization differential modulation, the proposed transmitter is capable of generating a DP phase-shift-keyed (DP-PSK) signal, which makes it directly applicable to current coherent systems. A detailed analysis of the system performance reveals that imperfect equalization and the bandwidth limitation at the receiver are the dominant penalty factors. Although such a penalty is usually unacceptable in long-haul applications, the proposed transmitter can be attractive for short-reach applications, where cost and footprint are the primary concerns, owing to its simplicity and compactness.