Riaz-ul-haque MIAN Tomoki NAKAMURA Masuo KAJIYAMA Makoto EIKI Michihiro SHINTANI
Wafer-level performance prediction techniques have been gaining increasing attention in production LSI testing due to their ability to reduce measurement costs without compromising test quality. Despite the availability of several efficient methods, the site-to-site variation commonly observed in multi-site testing of radio frequency circuits remains inadequately addressed. In this manuscript, we propose a wafer-level performance prediction approach for multi-site testing that takes site-to-site variation into account. Our proposed method is built on the Gaussian process, a widely utilized wafer-level spatial correlation modeling technique, and enhances prediction accuracy by extending hierarchical modeling to leverage the test-site information provided by test engineers. Additionally, we propose a test-site sampling method that maximizes cost reduction while maintaining sufficient estimation accuracy. Our experimental results, which employ industrial production test data, demonstrate that our proposed method can decrease the estimation error to 1/19 of that achieved by a conventional method. Furthermore, our sampling method can reduce the required measurements by 97% while ensuring satisfactory estimation accuracy.
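As an illustration of the underlying spatial-prediction idea, the following sketch fits a Gaussian process to a few measured dies, using die coordinates plus a test-site index as input features; the feature layout and data are our own assumptions for illustration and do not reproduce the paper's hierarchical, site-aware model.

```python
# Illustrative sketch: GP-based wafer-level prediction from a few measured dies.
# The (x, y, site_id) feature layout and synthetic data are assumptions for
# illustration; the paper's hierarchical, site-aware model is more elaborate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic "wafer": die coordinates, the test site that measured each die,
# and a performance value with a smooth spatial trend plus a site offset.
n_dies = 400
xy = rng.uniform(-1.0, 1.0, size=(n_dies, 2))
site = rng.integers(0, 4, size=n_dies)             # 4 test sites
perf = (0.5 * xy[:, 0] ** 2 + 0.3 * xy[:, 1]
        + 0.05 * site + 0.02 * rng.standard_normal(n_dies))

X = np.column_stack([xy, site])                    # site index used as an extra feature

# Measure only a small sample of dies; predict the rest.
measured = rng.choice(n_dies, size=40, replace=False)
mask = np.zeros(n_dies, dtype=bool)
mask[measured] = True

kernel = RBF(length_scale=[0.5, 0.5, 1.0]) + WhiteKernel(noise_level=1e-3)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gpr.fit(X[mask], perf[mask])

pred, std = gpr.predict(X[~mask], return_std=True)
print("mean absolute prediction error:", np.abs(pred - perf[~mask]).mean())
```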
The very high path loss caused by molecular absorption is the most serious problem in terahertz (THz) wireless communications. Recently, the multi-band ultra-massive multi-input multi-output (UM-MIMO) system has been proposed to overcome this distance problem. In UM-MIMO systems, the impact of mutual coupling among antennas on system performance cannot be ignored because of the dense array. In this letter, a channel model for UM-MIMO communication systems that considers the mutual-coupling effect is developed. The effect of mutual coupling within a subarray on system performance is investigated through simulation studies, and reliable results are derived.
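The following toy sketch shows one common way mutual coupling is folded into a MIMO channel model, by pre- and post-multiplying the ideal channel with receive- and transmit-side coupling matrices; the banded coupling matrices and Rayleigh channel here are placeholders rather than the letter's THz-specific model.

```python
# Toy sketch: embedding mutual coupling into a MIMO channel model.
# H_eff = C_rx @ H @ C_tx, with simple banded coupling matrices as placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_rx, n_tx = 8, 8

# Ideal (coupling-free) narrowband channel: i.i.d. Rayleigh entries as a stand-in.
H = (rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

def coupling_matrix(n, c1=0.3 + 0.1j):
    """Simple nearest-neighbor coupling model (placeholder for a full EM model)."""
    C = np.eye(n, dtype=complex)
    idx = np.arange(n - 1)
    C[idx, idx + 1] = c1
    C[idx + 1, idx] = c1
    return C

C_rx = coupling_matrix(n_rx)
C_tx = coupling_matrix(n_tx)
H_eff = C_rx @ H @ C_tx

def capacity(H, snr=10.0):
    """Ergodic-style capacity for one realization, equal power allocation."""
    gram = np.eye(H.shape[0]) + (snr / H.shape[1]) * H @ H.conj().T
    return float(np.log2(np.linalg.det(gram).real))

print("capacity w/o coupling:", capacity(H))
print("capacity w/  coupling:", capacity(H_eff))
```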
Takuma NAGAO Tomoki NAKAMURA Masuo KAJIYAMA Makoto EIKI Michiko INOUE Michihiro SHINTANI
Statistical wafer-level characteristic variation modeling offers an attractive method for reducing the measurement cost in large-scale integrated (LSI) circuit testing while maintaining test quality. In this method, the performance of unmeasured LSI circuits fabricated on a wafer is statistically predicted based on a few measured LSI circuits. Conventional statistical methods model spatially smooth variations in the wafers. However, actual wafers can exhibit discontinuous variations that are systematically caused by the manufacturing environment, such as shot dependence. In this paper, we propose a modeling method that considers discontinuous variations in wafer characteristics by applying the knowledge of manufacturing engineers to a model estimated using Gaussian process regression. In the proposed method, the process variation is decomposed into systematic discontinuous and global components to improve estimation accuracy. An evaluation performed using an industrial production test dataset indicates that the proposed method effectively reduces the estimation error for an entire wafer by over 36% compared with conventional methods.
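A minimal sketch of the decomposition idea, under our own simplifying assumptions: the systematic discontinuous component is approximated by per-shot means computed from the measured dies, and a Gaussian process models the remaining smooth global component; the shot layout and data are synthetic, and the authors' procedure is more elaborate.

```python
# Illustrative sketch: split wafer variation into a discontinuous shot component
# (per-shot means) and a smooth global component modeled with a GP.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic dies: coordinates, a shot index (4x4 dies per exposure shot), and a
# characteristic = smooth global trend + shot offset + noise.
xy = rng.uniform(0.0, 16.0, size=(500, 2))
shot = (xy[:, 0] // 4).astype(int) * 4 + (xy[:, 1] // 4).astype(int)
value = (0.1 * xy[:, 0] + 0.05 * np.sin(xy[:, 1])
         + 0.2 * rng.standard_normal(shot.max() + 1)[shot]
         + 0.02 * rng.standard_normal(len(xy)))

measured = rng.choice(len(xy), size=80, replace=False)

# 1) Systematic discontinuous component: per-shot mean over measured dies.
shot_mean = np.full(shot.max() + 1, value[measured].mean())
for s in np.unique(shot[measured]):
    shot_mean[s] = value[measured][shot[measured] == s].mean()

# 2) Global component: GP on the residual after removing the shot component.
resid = value[measured] - shot_mean[shot[measured]]
gpr = GaussianProcessRegressor(RBF(length_scale=4.0) + WhiteKernel(1e-3), normalize_y=True)
gpr.fit(xy[measured], resid)

# Prediction for every die = shot component + global component.
pred = shot_mean[shot] + gpr.predict(xy)
print("mean absolute error:", np.abs(pred - value).mean())
```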
Yuya DEGAWA Toru KOIZUMI Tomoki NAKAMURA Ryota SHIOYA Junichiro KADOMOTO Hidetsugu IRIE Shuichi SAKAI
One of the performance bottlenecks of a processor is the front-end that supplies instructions. Various techniques, such as cache replacement algorithms and hardware prefetching, have been investigated to facilitate a smooth instruction supply at the front-end and to improve processor performance. In these approaches, one of the most important factors has been the reduction in the number of instruction cache misses. Using the number of instruction cache misses or derived factors, previous studies have explained the performance improvements achieved by their proposed methods. However, we found that the number of instruction cache misses does not always explain performance changes well in modern processors. This is because the front-end in modern processors handles subsequent instruction cache misses in overlap with earlier ones. Based on this observation, we propose a novel factor: the number of miss regions. We define a region as a sequence of instructions from one branch misprediction to the next, and a miss region as a region that contains one or more instruction cache misses. At the boundary of each region, the pipeline is flushed owing to a branch misprediction. Thus, cache misses after this boundary are not handled in overlap with cache misses before the boundary. As a result, the number of miss regions is equal to the number of cache misses that are processed without overlap. In this paper, we use mathematical models and simulation results to demonstrate that the number of miss regions explains the variation in performance well. The results show that the model explains cycles per instruction with an average error of 1.0% and a maximum error of 4.1% when applying an existing prefetcher to the instruction cache. The idea of miss regions highlights that instruction cache misses and branch mispredictions interact with each other in processors with a decoupled front-end. We hope that considering this interaction will motivate the development of fast performance estimation methods and new microarchitectural methods.
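The following small sketch illustrates how the miss-region count of the definition above could be extracted from an event trace; the trace format is a hypothetical stand-in, not the authors' tooling.

```python
# Illustrative sketch: count "miss regions" in an execution trace.
# A region is the instruction sequence between consecutive branch mispredictions;
# a miss region is a region containing at least one instruction cache miss.
# The trace format (list of event dicts) is a hypothetical stand-in.

def count_miss_regions(trace):
    miss_regions = 0
    region_has_miss = False
    for event in trace:
        if event.get("icache_miss"):
            region_has_miss = True
        if event.get("branch_mispredict"):
            # Region boundary: the pipeline is flushed here, so later cache
            # misses cannot overlap with earlier ones.
            miss_regions += region_has_miss
            region_has_miss = False
    return miss_regions + region_has_miss  # account for the trailing region

# Tiny example: 3 regions, 2 of which contain at least one I-cache miss.
trace = [
    {"icache_miss": True},
    {"branch_mispredict": True},
    {"icache_miss": True}, {"icache_miss": True},
    {"branch_mispredict": True},
    {},
]
print(count_miss_regions(trace))  # -> 2
```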
Kensuke IIZUKA Haruna TAKAGI Aika KAMEI Kazuei HIRONAKA Hideharu AMANO
FPGA clusters are a promising platform for future computing, not only in the cloud but also in 5G wireless base stations with a limited power supply, owing to their significant advantage in power efficiency. However, almost no power analyses with real systems have been reported. This work reports detailed power-consumption analyses of two FPGA clusters, the FiC and M-KUBOS clusters, by introducing power-measurement tools and running real applications. From the detailed analyses, we find that the number of activated links mainly determines the total power consumption of the systems, regardless of whether those links are actually used. To improve application performance while reducing power consumption, we should increase the clock frequency of the applications, use the minimum number of links, and apply link aggregation. We also propose a power model for both clusters based on the results of the analyses; this model can estimate the total power consumption of both FPGA clusters at the design step with a maximum error of 15%.
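As a rough illustration of such a link-count-based power model, the sketch below fits P_total = P_base + a·n_links + b·f_clock by least squares to a handful of invented measurement tuples; the numbers are not measurements from FiC or M-KUBOS.

```python
# Illustrative sketch: fit a simple FPGA-cluster power model of the form
#   P_total = P_base + a * n_active_links + b * f_clock
# from a few (n_links, f_clock_MHz, measured_power_W) samples.  The samples
# below are invented for illustration only.
import numpy as np

samples = np.array([
    # n_links, f_MHz, power_W
    [ 2, 100, 21.0],
    [ 4, 100, 24.1],
    [ 8, 100, 30.2],
    [ 8, 200, 33.9],
    [16, 200, 46.5],
])

A = np.column_stack([np.ones(len(samples)), samples[:, 0], samples[:, 1]])
coef, *_ = np.linalg.lstsq(A, samples[:, 2], rcond=None)
p_base, per_link, per_mhz = coef
print(f"P_base={p_base:.2f} W, per-link={per_link:.2f} W, per-MHz={per_mhz:.3f} W")

# Design-time estimate for a configuration with 12 active links at 150 MHz.
print("estimated power:", p_base + per_link * 12 + per_mhz * 150, "W")
```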
We thank Kamata et al. (2023) [1] for their interest in our work [2], and for providing an explanation of the quasi-linear kernel from a viewpoint of multiple kernel learning. In this letter, we first give a summary of the quasi-linear SVM. Then we provide a discussion on the novelty of quasi-linear kernels against multiple kernel learning. Finally, we explain the contributions of our work [2].
Hong LI Wenjun CAO Chen WANG Xinrui ZHU Guisheng LIAO Zhangqing HE
The configurable ring-oscillator physical unclonable function (CRO PUF) is a recently proposed strong PUF based on the classic RO PUF; it can generate exponentially many challenge-response pairs (CRPs) and has good uniqueness and reliability. However, existing proposals suffer from low hardware utilization and vulnerability to modeling attacks. In this paper, we propose a novel configurable dual-state (CDS) PUF with lower overhead and higher resistance to modeling attacks. This structure can be flexibly transformed into an RO PUF or a TERO PUF within the same topology according to the parity of the Hamming weight (HW) of the challenge, which achieves 100% utilization of the inverters and improves hardware efficiency. A feedback obfuscation mechanism (FOM) is also proposed, which uses the stable count value of the ring oscillator in the PUF as an updated mask to confuse and hide the original challenge, significantly improving resistance to modeling attacks. The proposed FOM-CDS PUF is analyzed by building a mathematical model and finally implemented on a Xilinx Artix-7 FPGA. The test results show that the FOM-CDS PUF can effectively resist several popular modeling attack methods, with a prediction accuracy below 60%. Meanwhile, the FOM-CDS PUF shows good performance, with a uniformity of 53.68%, a bit error rate across temperatures of 7.91%, a bit error rate across voltages of 5.64%, and a uniqueness of 50.33%.
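The following behavioral sketch (not RTL) illustrates the two ideas described above, mode selection by the parity of the challenge's Hamming weight and feedback obfuscation of the challenge with a mask derived from a previous oscillation count; the bit widths and the response function are placeholders, not the hardware design.

```python
# Behavioral sketch (not RTL): challenge handling of a CDS-style PUF with a
# feedback obfuscation mechanism.  Hamming-weight parity picks the oscillation
# mode (RO vs. TERO), and the previous stable counter value is fed back as a
# mask that hides the externally applied challenge.  Widths and the response
# function are illustrative placeholders.
import random

N_BITS = 64

def hamming_weight(x):
    return bin(x).count("1")

class FomCdsPufModel:
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.mask = 0                               # updated after every evaluation

    def _oscillation_count(self, challenge, mode):
        # Placeholder for the silicon-specific count; deterministic per instance.
        self.rng.seed((challenge << 1) | mode)
        return self.rng.getrandbits(16)

    def respond(self, challenge):
        obfuscated = challenge ^ self.mask          # feedback obfuscation (FOM)
        mode = hamming_weight(obfuscated) & 1       # 0: RO mode, 1: TERO mode
        count = self._oscillation_count(obfuscated, mode)
        self.mask = count & ((1 << N_BITS) - 1)     # next-round mask
        return count & 1                            # 1-bit response

puf = FomCdsPufModel()
print([puf.respond(random.getrandbits(N_BITS)) for _ in range(8)])
```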
Takehiro TAKAYANAGI Kiyoshi IZUMI
Personalized stock recommendations aim to suggest stocks tailored to individual investor needs, significantly aiding investors' financial decision-making. This study shows the advantages of incorporating context into personalized stock recommendation systems. We embed item contextual information such as technical indicators, fundamental factors, and business activities of individual stocks. Simultaneously, we consider user contextual information such as investors' personality traits, behavioral characteristics, and attributes to create a comprehensive investor profile. Our model, validated on novel stock recommendation tasks, demonstrated a notable improvement over baseline models when these contextual features were incorporated. Consistent outperformance across various hyperparameters further underscores the robustness and utility of our model in integrating stocks' features and investors' traits into personalized stock recommendations.
This paper proposes a deep neural network named BayesianPUFNet that can achieve high prediction accuracy even with few challenge-response pairs (CRPs) available for training. Generally, modeling attacks are a vulnerability that could compromise the authenticity of physically unclonable functions (PUFs); thus, various machine learning methods, including deep neural networks, have been proposed to assess the vulnerability of PUFs. However, conventional modeling attacks have not considered the cost of CRP collection and have analyzed attacks under the assumption that sufficient CRPs were available for training; therefore, previous studies may have underestimated the vulnerability of PUFs. Herein, we show that Bayesian deep neural networks, which incorporate Bayesian statistics, can provide accurate response prediction even in situations where sufficient CRPs are not available for learning. Numerical experiments show that the proposed model uses only half the CRPs to achieve the same response-prediction accuracy as conventional methods. Our code is openly available at https://github.com/bayesian-puf-net/bayesian-puf-net.git.
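For illustration, the sketch below predicts PUF responses with Monte Carlo dropout, one common approximation to a Bayesian neural network; it is not the BayesianPUFNet architecture itself, which is available in the linked repository.

```python
# Illustrative sketch: predicting PUF responses from challenges with a small
# network and Monte Carlo dropout as a stand-in for full Bayesian inference.
# Architecture and data are placeholders; the authors' BayesianPUFNet differs.
import torch
import torch.nn as nn

N_STAGES = 64

class DropoutPufModel(nn.Module):
    def __init__(self, p=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_STAGES, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, challenges, n_samples=50):
    """Average n_samples stochastic forward passes (dropout kept active)."""
    model.train()                        # keep dropout on at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.sigmoid(model(challenges)) for _ in range(n_samples)]
        )
    return probs.mean(0), probs.std(0)   # predictive mean and spread

model = DropoutPufModel()
challenges = torch.randint(0, 2, (4, N_STAGES)).float()
mean, std = predict_with_uncertainty(model, challenges)
print(mean.squeeze(), std.squeeze())
```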
This research develops a new automatic path-following control method for a car model based on just-in-time modeling. The idea is to accumulate a large amount of basic driving data for various situations in a database and to realize automatic path following on unknown roads using only the data in that database. In particular, just-in-time modeling is applied repeatedly in order to follow the desired points on a given road. Numerical simulation results show that the proposed method makes the car follow the desired points on a given road with small error and high computational efficiency.
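A minimal sketch of the just-in-time (lazy-learning) modeling step: for each query, the nearest neighbors are retrieved from the database and a small local model is fitted and used only for that query; the database contents and features below are synthetic placeholders, not the paper's vehicle model.

```python
# Illustrative sketch of just-in-time (lazy) modeling: no global model is fit;
# for every query a local model is built from its nearest neighbors in the
# database.  The "driving database" here is synthetic and only for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Database: state features -> observed control input (placeholder data),
# e.g., lateral error, heading error, and road curvature.
states = rng.uniform(-1, 1, size=(2000, 3))
controls = (1.5 * states[:, 0] + 0.8 * states[:, 1] - 0.4 * states[:, 2]
            + 0.05 * rng.standard_normal(len(states)))

def jit_predict(query, k=20):
    """Fit a local linear model on the k nearest database entries."""
    dist = np.linalg.norm(states - query, axis=1)
    idx = np.argsort(dist)[:k]
    A = np.column_stack([states[idx], np.ones(k)])
    coef, *_ = np.linalg.lstsq(A, controls[idx], rcond=None)
    return np.append(query, 1.0) @ coef

query = np.array([0.2, -0.1, 0.05])
print("predicted control input:", jit_predict(query))
```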
This study develops a new automatic hovering control method based on just-in-time modeling for a multicopter. In particular, the main aim is to compute the gains of a feedback control law such that the multicopter hovers at a desired height at a desired time without overshoot or undershoot. First, a database containing various hovering data is constructed; the proposed method then computes the gains for a query input from this database. Simulation results show that the multicopter achieves the control objectives, demonstrating that the new method is effective.
Shogo UMEYAMA Yoshinori DOBASHI
We present an interactive modeling system for Japanese castles. We develop a user interface that can generate the fundamental structure of a castle tower consisting of stone walls, turrets, and roofs. When the user clicks on the screen displaying the 3D space, the relevant parameters are calculated automatically to generate 3D models of Japanese-style castles. We use characteristic curves that often appear in ancient Japanese architecture for realistic modeling of the castles. We evaluate the effectiveness of our method by comparing a castle generated by our method with a commercially available 3D model of a castle.
Ryosuke MISHIMA Kunihiko HIRAISHI
In 2015, the Ministry of Land, Infrastructure, Transport and Tourism started to provide information on aircraft flying over Japan, called CARATS Open Data, and to actively promote research on aviation systems. The airspace is divided into sectors, which are used to limit air traffic so that it can be controlled safely and efficiently. Since the demand for air transportation is increasing, new optimization techniques and efficient control are required to predict and resolve demand-capacity imbalances in the airspace. In this paper, we aim to construct mathematical models of the inter-sector air traffic flow from CARATS Open Data. In addition, we develop methods to predict future sector demand. The accuracy of the prediction is evaluated by comparing the predicted sector demand with the actual data.
This study considered an extension of a sparse regularization method with scaling, especially in thresholding methods, which are simple and typical examples of sparse modeling. In the setting of a non-parametric orthogonal regression problem, we developed and analyzed a thresholding method in which soft thresholding estimators are independently expanded by empirical scaling values. The scaling values have a common hyper-parameter that is the order of expansion of an ideal scaling value that achieves hard thresholding. We simply refer to this estimator as a scaled soft thresholding estimator. The scaled soft thresholding method is a bridge between the soft and hard thresholding methods. This new estimator is indeed consistent with an adaptive LASSO estimator in the orthogonal case; it is thus another derivation of an adaptive LASSO estimator. It is a general method that includes soft thresholding and the non-negative garrote as special cases. We subsequently derived the degrees of freedom of the scaled soft thresholding for calculating Stein's unbiased risk estimate. We found that it decomposes into the degrees of freedom of soft thresholding and a remainder term connecting it to hard thresholding. As the degrees of freedom reflect the degree of over-fitting, this implies that the scaled soft thresholding has another source of over-fitting in addition to the number of un-removed components. The theoretical result was verified by a simple numerical example. In this process, we also focused on the non-monotonicity of the above remainder term of the degrees of freedom and found that, in a sparse and large-sample setting, it is mainly caused by useless components that are not related to the target function.
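For reference, the sketch below implements the standard estimators mentioned as special cases, soft thresholding, hard thresholding, and the non-negative garrote, together with one illustrative soft-to-hard bridge obtained by expanding the soft-thresholding estimator; the exact form of the paper's scaled soft thresholding and its hyper-parameter should be taken from the paper itself, so the bridge shown is only an assumed interpolation.

```python
# Reference sketch (orthogonal setting): classical thresholding rules applied
# componentwise to observed coefficients z with threshold lam.  The "bridge"
# below, which expands soft thresholding toward hard thresholding by an
# exponent a in [0, 1], is an illustrative interpolation and not necessarily
# the paper's exact scaled soft thresholding formula.
import numpy as np

def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def hard(z, lam):
    return z * (np.abs(z) > lam)

def nn_garrote(z, lam):
    # Non-negative garrote shrinkage factor (1 - lam^2 / z^2)_+ applied to z.
    with np.errstate(divide="ignore", invalid="ignore"):
        factor = np.maximum(1.0 - lam**2 / z**2, 0.0)
    return np.nan_to_num(factor) * z

def soft_to_hard_bridge(z, lam, a=0.5):
    # a = 0 recovers soft thresholding; a = 1 rescales it back to hard
    # thresholding via the "ideal" scaling |z| / (|z| - lam) on kept terms.
    s = soft(z, lam)
    kept = np.abs(z) > lam
    scale = np.ones_like(z)
    scale[kept] = (np.abs(z[kept]) / (np.abs(z[kept]) - lam)) ** a
    return s * scale

z = np.array([-2.0, -0.5, 0.3, 1.2, 3.0])
lam = 1.0
for name, f in [("soft", soft), ("hard", hard), ("garrote", nn_garrote),
                ("bridge a=0.5", soft_to_hard_bridge)]:
    print(f"{name:>12}: {np.round(f(z, lam), 3)}")
```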
Van Hung PHAM Tuan Hung NGUYEN Duc Minh NGUYEN Hisashi MORISHITA
In this paper, we propose a new method based on copula theory to evaluate the detection performance of a distributed-processing multistatic radar system (DPMRS). By applying the Gaussian copula to model the dependence of local decisions in a DPMRS, together with the AND, OR, and K/N data fusion rules, the performance of a DPMRS in detecting Swerling fluctuating targets can be easily evaluated even under non-Gaussian clutter with a nonuniform dependence matrix. The reliability and flexibility of this method are validated by applying it to a problem previously studied by other authors, and our further investigation results indicate its high potential for evaluating DPMRS performance in various cases involving different target and clutter models.
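The Monte Carlo sketch below illustrates the copula idea: correlated local decisions are generated by thresholding jointly Gaussian variates (a Gaussian copula with a given dependence matrix), and the system-level detection probability is then evaluated under the OR, AND, and K/N fusion rules; the local detection probabilities and correlation matrix are made-up inputs, not values from the paper.

```python
# Monte Carlo sketch: system-level detection probability of a distributed
# multistatic radar when local decisions are dependent.  Dependence is modeled
# with a Gaussian copula (correlation matrix R); the numbers are illustrative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

pd_local = np.array([0.80, 0.70, 0.75])      # local detection probabilities
R = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.0, 0.3],
              [0.2, 0.3, 1.0]])              # copula dependence matrix

n_trials = 200_000
z = rng.multivariate_normal(np.zeros(3), R, size=n_trials)
u = norm.cdf(z)                              # Gaussian copula -> uniform margins
decisions = u < pd_local                     # correlated Bernoulli local decisions

k = 2                                        # K/N rule with K = 2 of N = 3
print("OR  fusion Pd:", decisions.any(axis=1).mean())
print("AND fusion Pd:", decisions.all(axis=1).mean())
print("K/N fusion Pd:", (decisions.sum(axis=1) >= k).mean())
```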
Bitcoin is one of the most popular cryptocurrencies, widely used around the world, and its blockchain technology has attracted considerable attention. In the Bitcoin system, it has been reported that transactions are prioritized according to transaction fees, and that transactions with high priorities are likely to be confirmed faster than those with low priorities. In this paper, we consider performance modeling of the Bitcoin blockchain system in order to characterize the transaction-confirmation time. We first introduce the Bitcoin system, focusing on proof-of-work, the consensus mechanism of the Bitcoin blockchain. Then, we present some queueing models and their analytical results, discussing the implications and insights obtained from these models.
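As a toy counterpart to such queueing models, the following simulation generates Poisson transaction arrivals, exponentially distributed block intervals, and batch service of the highest-fee pending transactions in each block, showing how fee priority shapes confirmation times; the parameters are illustrative, and the paper's models are analytical rather than this simulation.

```python
# Toy simulation of fee-priority transaction confirmation in a blockchain:
# Poisson arrivals, exponential block intervals (~10 min), batch service of
# the b highest-fee pending transactions per block.  Parameters are illustrative.
import heapq
import numpy as np

rng = np.random.default_rng(0)
lam = 1.9          # transaction arrivals per minute
block_mean = 10.0  # mean block interval in minutes
b = 20             # max transactions confirmed per block
horizon = 10_000.0 # simulated minutes

# Pre-generate arrivals: (arrival time, fee).
n_arr = rng.poisson(lam * horizon)
arrivals = np.sort(rng.uniform(0, horizon, n_arr))
fees = rng.exponential(1.0, n_arr)

pending = []                  # max-heap by fee: (-fee, arrival_time)
delays_high, delays_low = [], []
t, i = 0.0, 0
while t < horizon:
    t += rng.exponential(block_mean)          # next block
    while i < n_arr and arrivals[i] <= t:     # enqueue arrivals before the block
        heapq.heappush(pending, (-fees[i], arrivals[i]))
        i += 1
    for _ in range(min(b, len(pending))):     # confirm highest-fee transactions
        neg_fee, t_arr = heapq.heappop(pending)
        (delays_high if -neg_fee > 1.0 else delays_low).append(t - t_arr)

print("mean confirmation time, high fee:", np.mean(delays_high))
print("mean confirmation time, low  fee:", np.mean(delays_low))
```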
Sooyong JEONG Sungdeok CHA Woo Jin LEE
Embedded software often interacts with multiple inputs from various sensors whose dependencies are often complex or only partially known to developers. With incomplete information on dependencies, testing is likely to be insufficient for detecting errors. We propose a method to enhance the testing coverage of embedded software by identifying subtle and often neglected dependencies using information contained in usage logs. Usage logs, traditionally used primarily for investigative purposes following accidents, can also make a useful contribution during the testing of embedded software. Our approach relies on first developing a behavioral model for each environmental input individually, performing compositional analysis while identifying feasible but untested dependencies from the usage log, and generating additional test cases that correspond to untested or insufficiently tested dependencies. Experimental evaluation was performed on an Android application named Gravity Screen as well as an Arduino-based wearable glove app. Whereas a conventional CTM-based testing technique achieved average branch coverage of 26% and 68% on these applications, respectively, the proposed technique achieved 100% coverage on both.
Lin YAN Mingyong ZENG Shuai REN Zhangkai LUO
Traffic categorization aims to classify network traffic into major service types. A modern deep neural network based on temporal sequence modeling is proposed for encrypted traffic categorization. Contemporary techniques such as dilated convolutions and residual connections are adopted as the basic building blocks. The raw traffic files are pre-processed to generate 1-dimensional flow byte sequences, which are fed into our specially devised network. The proposed approach greatly outperforms existing methods on a public traffic dataset.
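A compact sketch of the kind of building block described, a 1-D dilated convolution with a residual connection applied to flow byte sequences, is given below; the layer sizes, sequence length, and number of classes are placeholders rather than the paper's configuration.

```python
# Illustrative sketch: 1-D dilated residual blocks for classifying flow byte
# sequences into service types.  Sizes and class count are placeholders.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3,
                      padding=dilation, dilation=dilation),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3,
                      padding=dilation, dilation=dilation),
        )

    def forward(self, x):
        return torch.relu(x + self.conv(x))   # residual connection

class TrafficClassifier(nn.Module):
    def __init__(self, n_classes=6, channels=64):
        super().__init__()
        self.embed = nn.Embedding(256, channels)   # one embedding per byte value
        self.blocks = nn.Sequential(*[DilatedResidualBlock(channels, d) for d in (1, 2, 4, 8)])
        self.head = nn.Linear(channels, n_classes)

    def forward(self, bytes_seq):                  # (batch, seq_len) of byte values
        x = self.embed(bytes_seq).transpose(1, 2)  # -> (batch, channels, seq_len)
        x = self.blocks(x).mean(dim=2)             # global average pooling
        return self.head(x)

model = TrafficClassifier()
logits = model(torch.randint(0, 256, (8, 784)))    # 8 flows of 784 bytes each
print(logits.shape)                                # -> torch.Size([8, 6])
```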
Takaaki SAEKI Yuki SAITO Shinnosuke TAKAMICHI Hiroshi SARUWATARI
This paper proposes two high-fidelity and computationally efficient neural voice conversion (VC) methods based on a direct waveform modification using spectral differentials. The conventional spectral-differential VC method with a minimum-phase filter achieves high-quality conversion for narrow-band (16 kHz-sampled) VC but requires heavy computational cost in filtering. This is because the minimum phase obtained using a fixed lifter of the Hilbert transform often results in a long-tap filter. Furthermore, when we extend the method to full-band (48 kHz-sampled) VC, the computational cost is heavy due to the increased number of sampling points, and the converted-speech quality degrades due to large fluctuations in the high-frequency band. To construct a short-tap filter, we propose a lifter-training method for data-driven phase reconstruction that trains a lifter of the Hilbert transform by taking filter truncation into account. We also propose a frequency-band-wise modeling method based on sub-band multi-rate signal processing (sub-band modeling method) for full-band VC. It enhances computational efficiency by reducing the sampling points of the signals converted with filtering and improves the converted-speech quality by modeling only the low-frequency band. We conducted several objective and subjective evaluations to investigate the effectiveness of the proposed methods through an implementation of the real-time, online, full-band VC system we developed on the basis of the proposed methods. The results indicate that 1) the proposed lifter-training method for narrow-band VC can shorten the tap length to 1/16 without degrading the converted-speech quality, 2) the proposed sub-band modeling method for full-band VC can improve the converted-speech quality while reducing the computational cost, and 3) our real-time, online, full-band VC system can convert 48 kHz-sampled speech in real time, attaining converted speech with a mean opinion score for naturalness of 3.6 out of 5.0.
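For context, the sketch below shows the conventional fixed-lifter step that the lifter-training method improves on: a minimum-phase spectrum is reconstructed from an amplitude spectrum through the real cepstrum with the fixed lifter [1, 2, ..., 2, 1, 0, ..., 0]; the paper instead learns this lifter so that the resulting filter remains accurate after truncation to a short tap length.

```python
# Conventional fixed-lifter minimum-phase reconstruction (the baseline that the
# lifter-training method replaces).  Given an amplitude spectrum |D(w)| of a
# spectral differential, build a minimum-phase filter via the real cepstrum.
import numpy as np

def min_phase_filter(amplitude, n_fft):
    """amplitude: one-sided magnitude spectrum of length n_fft // 2 + 1."""
    full = np.concatenate([amplitude, amplitude[-2:0:-1]])     # mirror to full spectrum
    cepstrum = np.fft.ifft(np.log(np.maximum(full, 1e-10))).real

    lifter = np.zeros(n_fft)                                   # fixed lifter:
    lifter[0] = 1.0                                            # [1, 2, ..., 2, 1, 0, ..., 0]
    lifter[1:n_fft // 2] = 2.0
    lifter[n_fft // 2] = 1.0

    min_phase_spec = np.exp(np.fft.fft(cepstrum * lifter))
    return np.fft.ifft(min_phase_spec).real                    # time-domain filter taps

n_fft = 1024
amp = np.abs(np.random.default_rng(0).standard_normal(n_fft // 2 + 1)) + 0.1
h = min_phase_filter(amp, n_fft)
print(h[:8])   # energy concentrates at the head; training the lifter aims to
               # keep accuracy when h is truncated to a short tap length
```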
Yuan HE Yasutaka WADA Wenchao LUO Ryuichi SAKAMOTO Guanqin PAN Thang CAO Masaaki KONDO
Due to the slowdown of Moore's Law, power limitation has become one of the most critical issues for current and future HPC systems. To utilize HPC systems more efficiently when power budgets or deadlines are given, it is very desirable to accurately estimate the performance or power consumption of applications before conducting their tuned production runs on any specific system. To ease such estimations, we showcase a straightforward yet effective method, based on the enhanced power management framework and DSL we developed, to help HPC users clarify the performance and power relationships of their applications. This method provides an easy process of profiling, modeling, and management of both the performance and power of HPC systems and applications. In our evaluations, only a few (up to three) profiled runs are necessary to obtain very precise models of HPC applications through this method, which dramatically improves the efficiency of, and lowers the difficulty in, utilizing HPC systems under limited power budgets.
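As a small illustration of the kind of models such a method can produce, the sketch below fits execution time as t(f) = w/f + c and power as an affine function of frequency from a few profiled runs, then estimates time, power, and energy at an unmeasured frequency; the numbers are invented, and the paper's DSL-based framework is richer than this simple fit.

```python
# Illustrative sketch: fit simple performance/power models from a few profiled
# runs at different CPU frequencies, then estimate behavior at other settings.
#   time(f)  ~ w / f + c      (compute-bound part + frequency-insensitive part)
#   power(f) ~ p0 + p1 * f
# The profiled numbers below are invented for illustration.
import numpy as np

profiles = np.array([
    # f_GHz, time_s, power_W
    [1.2, 95.0, 120.0],
    [1.8, 70.0, 165.0],
    [2.4, 58.0, 210.0],
])
f, t, p = profiles.T

# Least-squares fits of the two 2-parameter models.
w, c = np.linalg.lstsq(np.column_stack([1 / f, np.ones_like(f)]), t, rcond=None)[0]
p1, p0 = np.linalg.lstsq(np.column_stack([f, np.ones_like(f)]), p, rcond=None)[0]

f_new = 2.0
t_new = w / f_new + c
p_new = p0 + p1 * f_new
print(f"estimated at {f_new} GHz: time={t_new:.1f} s, power={p_new:.1f} W, "
      f"energy={t_new * p_new / 1e3:.2f} kJ")
```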