Yoichi HINAMOTO Shotaro NISHIMURA
A state-space approach for adaptive second-order IIR notch digital filters is explored. A simplified iterative algorithm is derived from the gradient-descent method to minimize the mean-squared output of an adaptive notch digital filter. The stability and parameter-estimation bias are then analyzed by employing a first-order linear dynamical system. As a consequence, it is clarified that the resulting parameter estimate is unbiased. Finally, a numerical example is presented to demonstrate the validity and effectiveness of the adaptive state-space notch digital filter and bias analysis of parameter estimation.
Yoshinori ITOTAGAWA Koma ATSUMI Hikaru SEBE Daisuke KANEMOTO Tetsuya HIROSE
This paper describes a programmable differential bandgap reference (PD-BGR) for ultra-low-power IoT (Internet-of-Things) edge node devices. The PD-BGR consists of a current generator (CG) and a differential voltage generator (DVG). The CG is based on a bandgap reference (BGR) and generates an operating current and a voltage, while the DVG generates another voltage from the current. A differential voltage reference is obtained as the difference between these two voltages. The PD-BGR can produce a programmable differential output voltage by changing the multipliers of the MOSFETs in a differential pair and the resistance with digital codes. Simulation results showed that the proposed PD-BGR can generate 25- to 200-mV reference voltages in 25-mV steps with a temperature inaccuracy within ±0.7% over a temperature range from -20 to 100°C. A Monte Carlo simulation showed that the coefficient of variation of the reference voltage was within 1.1%. Measurement results demonstrated that our prototype chips can generate stable programmable differential output voltages that closely match the simulation results. The average power consumption was only 88.4 nW, with a voltage error of -4/+3 mV across 5 samples.
Ayumu YAMADA Zhiyuan HUANG Naoko MISAWA Chihiro MATSUI Ken TAKEUCHI
In this work, fluctuation patterns of ReRAM current are classified automatically by the proposed fluctuation pattern classifier (FPC). The FPC is trained with an artificially created dataset to overcome the difficulties of working with measured current signals, including annotation cost and imbalanced data amounts. Using the FPC, fluctuation occurrence under different write conditions is analyzed for both HRS and LRS currents. Based on the measurement and classification results, physical models of the fluctuations are established.
Fuyuki KIHARA Chihiro MATSUI Ken TAKEUCHI
In this work, we propose a 1T1R ReRAM CiM architecture for Hyperdimensional Computing (HDC). The number of Source Lines and Bit Lines is reduced by introducing memory cells that are connected in series, which is especially advantageous when using a 3D implementation. The results of CiM operations contain errors, but HDC is robust against them, so that even if the XNOR operation has an error of 25%, the inference accuracy remains above 90%.
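The robustness claim can be illustrated with a toy experiment. The sketch below is not the paper's architecture or dataset: it assumes random binary hypervectors, a single hypothetical binding key, and nearest-prototype classification by Hamming similarity, and it flips each bit of the XNOR result independently with probability 0.25. All names and parameter values (`dim`, `n_classes`, `trials`) are illustrative assumptions.

```python
import random


def random_hv(dim, rng):
    """Draw a random binary hypervector as a list of 0/1 bits."""
    return [rng.getrandbits(1) for _ in range(dim)]


def xnor(a, b):
    """Element-wise XNOR of two binary hypervectors."""
    return [1 - (x ^ y) for x, y in zip(a, b)]


def flip_bits(hv, error_rate, rng):
    """Flip each bit independently with probability error_rate."""
    return [bit ^ 1 if rng.random() < error_rate else bit for bit in hv]


def hamming_similarity(a, b):
    """Fraction of positions at which the two hypervectors agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def classify(query, prototypes):
    """Return the index of the prototype most similar to the query."""
    return max(range(len(prototypes)),
               key=lambda i: hamming_similarity(query, prototypes[i]))


def noisy_xnor_accuracy(dim=2048, n_classes=10, trials=100,
                        error_rate=0.25, seed=0):
    """Toy accuracy when the XNOR result is corrupted at error_rate."""
    rng = random.Random(seed)
    key = random_hv(dim, rng)  # hypothetical binding key
    classes = [random_hv(dim, rng) for _ in range(n_classes)]
    # Reference prototypes are computed with an error-free XNOR.
    prototypes = [xnor(c, key) for c in classes]
    correct = 0
    for _ in range(trials):
        label = rng.randrange(n_classes)
        noisy = flip_bits(xnor(classes[label], key), error_rate, rng)
        correct += classify(noisy, prototypes) == label
    return correct / trials


if __name__ == "__main__":
    print(noisy_xnor_accuracy())
```

In this toy setting a corrupted result still agrees with its own prototype on about 75% of bits but with unrelated prototypes on only about 50%, so the high dimensionality keeps the classes separable. The accuracy figure quoted in the abstract refers to the authors' own task, not to this sketch.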
Wenxia BAO An LIN Hua HUANG Xianjun YANG Hemu CHEN
Recent years have seen remarkable progress in human pose estimation. However, manual annotation of keypoints remains tedious and imprecise. To alleviate this problem, this paper proposes a novel method called Multi-Scale Contrastive Learning (MSCL). This method uses a siamese network structure with upper and lower branches that capture different views of the same image. Each branch uses a backbone network to extract image representations, employing multi-scale feature vectors to capture information. These feature vectors are then passed through an enhanced feature pyramid for fusion, producing more robust feature representations. The feature vectors are then further encoded by mapping and prediction heads to predict the feature vector of the other view. Using negative cosine similarity between vectors as a loss function, the backbone network is pre-trained on a large-scale unlabeled dataset, enhancing its capacity to extract visual representations. Finally, transfer learning is performed on a small amount of labelled data for the pose estimation task. Experiments on the COCO dataset show significant improvements in Average Precision (AP) of 1.8%, 0.9%, and 1.2% with 1%, 5%, and 10% labelled data, respectively. In addition, the Percentage of Correct Keypoints (PCK) improves by 0.5% on MPII&AIC, outperforming mainstream contrastive learning methods.
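As a minimal sketch of the loss named above, the negative cosine similarity between a predicted vector `p` from one branch and a target vector `z` from the other can be written as follows. The symmetric averaging over both view orderings is an assumption borrowed from common siamese pre-training setups, not something the abstract states, and the function names are illustrative.

```python
import math


def negative_cosine_similarity(p, z):
    """Return -cos(p, z): -1 when aligned, +1 when opposite."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def symmetric_loss(p1, z2, p2, z1):
    """Average the loss over both prediction directions (assumed form)."""
    return 0.5 * negative_cosine_similarity(p1, z2) \
        + 0.5 * negative_cosine_similarity(p2, z1)


if __name__ == "__main__":
    # Parallel vectors give the minimum loss; orthogonal vectors give zero.
    print(negative_cosine_similarity([1.0, 0.0], [2.0, 0.0]))
    print(negative_cosine_similarity([1.0, 0.0], [0.0, 3.0]))
```

Minimizing this quantity pulls the prediction from one view toward the representation of the other view, regardless of vector magnitude.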
Batnasan LUVAANJALBA Elaine Yi-Ling WU
Emergency Medical Services (EMS) play a crucial role in healthcare systems, managing pre-hospital or out-of-hospital emergencies from the onset of an emergency call to the patient’s arrival at a healthcare facility. The design of an efficient ambulance location model is pivotal in enhancing survival rates, controlling morbidity, and preventing disability. Key factors in the classical models typically include travel time, demand zones, and the number of stations. While urban EMS systems have received extensive examination due to their centralized populations, rural areas pose distinct challenges. These include lower population density and longer response distances, contributing to a higher fatality rate due to sparse population distribution, limited EMS stations, and extended travel times. To address these challenges, we introduce a novel mathematical model that aims to optimize coverage and equity. A distinctive feature of our model is the integration of equity within the objective function, coupled with a focus on practical response time that includes the period required for personal protective equipment procedures, ensuring the model’s applicability and realism in emergency response scenarios. We tackle the proposed problem using a tailored genetic algorithm and propose a greedy algorithm for solution construction. The tailored genetic algorithm promises efficient and effective EMS solutions, potentially enhancing emergency care and health outcomes in rural communities.
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU
Detecting key frames in videos has garnered substantial attention in recent years. It is a point-level task with considerable research value and broad application prospects in daily life. For instance, video surveillance systems, video cover generation, and highlight moment flashback all require key frame detection. However, the task is beset by challenges such as the sparsity of key frame instances, the imbalance between target frames and background frames, and the absence of a suitable post-processing method. In response to these problems, we introduce a novel and effective Temporal Interval Guided (TIG) framework to precisely localize specific frames. The framework incorporates a proposed Point-Level-Soft non-maximum suppression (PLS-NMS) post-processing algorithm, which is suited to point-level tasks and relies on a well-designed confidence score decay function. Furthermore, we propose a TIG-loss, which is sensitive to the temporal interval from the target frame, to optimize the two-stage framework. The proposed method can be broadly applied to key frame detection in video understanding, including action start detection and static video summarization. Extensive experiments validate the efficacy of our approach on the action start detection benchmark datasets THUMOS’14 and ActivityNet v1.3, where it reaches state-of-the-art performance. Competitive results are also demonstrated on the SumMe and TVSum datasets for deep-learning-based static video summarization.
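The abstract does not give the PLS-NMS decay function, so the sketch below only illustrates the general idea of soft suppression on point-level detections. It assumes a Gaussian decay in temporal distance, adapted from the usual Soft-NMS formulation for boxes; the function name, `sigma`, and `score_threshold` are hypothetical.

```python
import math


def point_soft_nms(detections, sigma=4.0, score_threshold=0.05):
    """Soft suppression over (frame_index, score) detections.

    Repeatedly keeps the highest-scoring frame and decays the scores of
    the remaining frames according to their temporal distance from it.
    The Gaussian decay used here is an assumption, not the paper's function.
    """
    remaining = list(detections)
    kept = []
    while remaining:
        best = max(remaining, key=lambda d: d[1])
        remaining.remove(best)
        if best[1] < score_threshold:
            break  # every remaining score is below the threshold
        kept.append(best)
        # Frames close to the kept frame are decayed strongly,
        # distant frames are left almost unchanged.
        remaining = [
            (f, s * (1.0 - math.exp(-((f - best[0]) ** 2) / (2 * sigma ** 2))))
            for f, s in remaining
        ]
    return kept


if __name__ == "__main__":
    candidates = [(10, 0.9), (11, 0.8), (50, 0.7)]
    print([frame for frame, _ in point_soft_nms(candidates)])
```

In this example frame 11 sits next to the stronger detection at frame 10 and is suppressed, while the distant frame 50 survives. Unlike hard NMS, a neighbouring frame with a sufficiently high score could still be retained after decay.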
This letter introduces an innovation for the heterogeneous storage architecture of AI chips, specifically focusing on the integration of six-transistor (6T) and eight-transistor (8T) SRAM into a hybrid SRAM. Traditional approaches to reducing SRAM power consumption typically involve lowering the operating voltage, a method that often substantially diminishes the recognition rate of neural networks. However, the design detailed in this letter combines the strengths of both SRAM types. It operates at a voltage lower than conventional SRAM, thereby significantly reducing the power consumption in neural networks without compromising performance.
Haiyang LIU Xiaopeng JIAO Lianrong MA
In this letter, we investigate the application of the subgradient method to the design of an efficient algorithm for linear programming (LP) decoding of binary linear codes. A major drawback of the original formulation of LP decoding is that the description complexity of the feasible region is exponential in the check node degrees of the code. To tackle this problem, we propose a processing technique for LP decoding with the subgradient method, whose complexity is linear in the check node degrees. Consequently, a message-passing type decoding algorithm can be obtained, whose per-iteration complexity is extremely low. Moreover, if the algorithm converges to a valid codeword, it is guaranteed to be a maximum likelihood codeword. Simulation results on several binary linear codes with short lengths suggest that the performance of LP decoding based on the subgradient method is comparable to that of the state-of-the-art LP decoding implementation approach.
Shinobu NAGAYAMA Tsutomu SASAO Jon T. BUTLER
This paper proposes a decomposition method for symmetric multiple-valued functions. It decomposes a given symmetric multiple-valued function into three parts. By using suitable decision diagrams for the three parts, we can represent symmetric multiple-valued functions compactly. By deriving theorems on sizes of the decision diagrams, this paper shows that space complexity of the proposed representation is low. This paper also presents algorithms to construct the decision diagrams for symmetric multiple-valued functions with low time complexity. Experimental results show that the proposed method represents randomly generated symmetric multiple-valued functions more compactly than the conventional representation method using standard multiple-valued decision diagrams. Symmetric multiple-valued functions are a basic class of functions, and thus, their compact representation benefits many applications where they appear.
Ji XI Yue XIE Pengxu JIANG Wei JIANG
Currently, a significant portion of acoustic scene categorization (ASC) research is centered around utilizing Convolutional Neural Network (CNN) models. This preference is primarily due to CNN’s ability to effectively extract time-frequency information from audio recordings of scenes by employing spectrum data as input. Two-dimensional spectrum features can express information along several dimensions. Nevertheless, because spectrum features differ in nature from natural images, the same object can carry different meanings when it appears at different positions on the spectrum map. Failing to distinguish between different aspects of the input information in CNN-based ASC networks may degrade system performance. Considering this, a feature pyramid segmentation (FPS) approach based on CNN is proposed. The proposed approach uses spectrum features as the input for the model. These features are split based on a preset scale, and each segment-level feature is then fed into the CNN network for learning. The high-level features output at all feature scales are then fused and fed to a softmax classifier to categorize different scenes. Experiments support the efficacy of the FPS strategy and its potential to enhance the performance of the ASC system.
Pengxu JIANG Yang YANG Yue XIE Cairong ZOU Qingyun WANG
Convolutional neural networks (CNNs) are widely used in acoustic scene classification (ASC) tasks. In most cases, local convolution is utilized to gather time-frequency information between spectrum nodes. It is challenging to adequately express the non-local link between frequency domains in a finite convolution region. In this paper, we propose a dual-path convolutional neural network based on band interaction block (DCNN-bi) for ASC, with the mel-spectrogram as the model’s input. We build two parallel CNN paths to learn the high-frequency and low-frequency components of the input feature. Additionally, we design three band interaction blocks (bi-blocks), connected between the two paths, to explore the pertinent nodes between various frequency bands. Combining the time-frequency information from the two paths, the bi-blocks with three distinct designs acquire non-local information and send it back to the respective paths. The experimental results indicate that the bi-block has the potential to improve the initial performance of the CNN substantially. Specifically, when applied to the DCASE 2018 and DCASE 2020 datasets, the CNN exhibited performance improvements of 1.79% and 3.06%, respectively.
Risheng QIN Hua KUANG He JIANG Hui YU Hong LI Zhuan LI
This paper proposes a method for determining the cascaded number for lumped parameter models (LPMs) of transmission lines. The LPM is used to simulate long-distance transmission lines, and the cascaded number significantly impacts the simulation results. Currently, a system-level method for determining the cascaded number of LPMs is lacking. Based on theoretical analysis and eigenvalue decomposition of the network matrix, this paper discusses the error in resonance characteristics between the distributed parameter model and LPMs. Moreover, it is deduced that the optimal cascaded numbers of the cascaded π-type and T-type LPMs are the same, and that the Γ-type LPM has the lowest simulation accuracy. The principle that the maximum simulation frequency must be less than the first resonance frequency of each segment is presented. According to this principle, the optimal cascaded numbers of the cascaded π-type, T-type, and Γ-type LPMs are obtained. The effectiveness of the proposed determination method is verified by simulation.
Beibei LI Xun RAN Yiran LIU Wensheng LI Qingling DUAN
Fish skin color detection plays a critical role in aquaculture. However, challenges arise from image color cast and the limited dataset, impacting the accuracy of the skin color detection process. To address these issues, we propose a novel fish skin color detection method, termed VH-YOLOv5s. Specifically, we construct a dataset for fish skin color detection to tackle the limitation posed by the scarcity of available datasets. Additionally, we propose a Variance Gray World Algorithm (VGWA) to correct the image color cast. Moreover, the designed Hybrid Spatial Pyramid Pooling (HSPP) module effectively performs multi-scale feature fusion, thereby enhancing the feature representation capability. Extensive experiments demonstrate that VH-YOLOv5s achieves excellent detection results on the Plectropomus leopardus skin color dataset, with a precision of 91.7%, recall of 90.1%, mAP@0.5 of 95.2%, and mAP@0.5:0.95 of 57.5%. When compared to other models such as CenterNet, AutoAssign, and YOLOX-s, VH-YOLOv5s exhibits superior detection performance, surpassing them by 2.5%, 1.8%, and 1.7%, respectively. Furthermore, our model can be deployed directly on mobile phones, making it highly suitable for practical applications.
Guang LI Ren TOGO Takahiro OGAWA Miki HASEYAMA
In this study, we propose a novel dataset distillation method based on parameter pruning. The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process. Experimental results on two benchmark datasets show the superiority of the proposed method.
We address a path planning problem for heterogeneous multi-robot systems under specifications consisting of temporal constraints and routing tasks such as package delivery services. The robots are partitioned into several groups based on their dynamics and specifications. We introduce a concise description of such tasks, called a work, and extend counting LTL to represent such specifications. We convert the problem into an ILP problem. We show that the number of variables in the ILP problem is smaller than that of the existing method using cLTL+. By simulation, we show that the computation time of the proposed method is shorter than that of the existing method.
Daichi MINAMIDE Tatsuhiro TSUCHIYA
In interdependent systems, such as electric power systems, entities or components mutually depend on each other. Due to these interdependencies, a small number of initial failures can propagate throughout the system, resulting in catastrophic system failures. This paper addresses the problem of finding the set of entities whose failures will have the worst effects on the system. To this end, a two-phase algorithm is developed. In the first phase, a tight bound on the number of failure propagation steps is computed using a Boolean Satisfiability (SAT) solver. In the second phase, the problem is formulated as an Integer Linear Programming (ILP) problem using the obtained step bound and solved with an ILP solver. Experimental results show that the algorithm scales to large problem instances and outperforms a single-phase algorithm that uses a loose step bound.
Fuma MOTOYAMA Koichi KOBAYASHI Yuh YAMASHITA
Control of complex networks such as gene regulatory networks is one of the fundamental problems in control theory. A Boolean network (BN) is one of the mathematical models of complex networks and represents dynamic behavior by Boolean functions. In this paper, a solution method for the finite-time control problem of BNs is proposed using a BDD (binary decision diagram). In this problem, we find all combinations of the initial state and the control input sequence such that a certain control specification is satisfied. The use of BDDs enables us to solve this problem for BNs to which the conventional method cannot be applied. First, after BNs and BDDs are outlined, the problem studied in this paper is formulated. Next, a solution method using BDDs is proposed. Finally, a numerical example on a 67-node BN is presented.
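To make the problem statement concrete, the sketch below enumerates every pair of initial state and control input sequence for a hypothetical 3-node BN and keeps those that reach a target state at the final time. It uses explicit enumeration, not BDDs, and both the update rules in `step` and the reachability specification are illustrative assumptions rather than the paper's example.

```python
from itertools import product


def step(state, u):
    """One synchronous update of a hypothetical 3-node Boolean network."""
    x1, x2, x3 = state
    return (x2 & u, x1 | x3, 1 - x1)


def solve_finite_time_control(target, horizon):
    """List all (initial state, input sequence) pairs that reach target."""
    solutions = []
    for x0 in product((0, 1), repeat=3):
        for inputs in product((0, 1), repeat=horizon):
            state = x0
            for u in inputs:
                state = step(state, u)
            if state == target:
                solutions.append((x0, inputs))
    return solutions


if __name__ == "__main__":
    for x0, inputs in solve_finite_time_control(target=(1, 1, 0), horizon=1):
        print(x0, inputs)
```

The enumeration visits 2^(n + T) candidates for n nodes and horizon T, which is workable for 3 nodes but not for a network with 67 nodes. That growth is the motivation for a symbolic representation such as a BDD.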
Xiaoyong SONG Zhichuan GUO Xinshuo WANG Mangu SONG
In software-defined networking (SDN), packet processing is commonly implemented using the match-action model, where packets are processed according to the actions matched in a match-action table. Due to limited FPGA on-board resources, achieving large-scale, high-throughput exact matching (EM) while resolving hash conflicts and out-of-order problems is an important challenge. To address these issues, this study proposes an FPGA-based EM table that shares rule tables across multiple pipelines to eliminate memory replication and enhance overall throughput. A reordering function is used to preserve packet sequencing within the pipelines. Moreover, to handle collisions and increase the load factor of the hash table, multiple hash table blocks are combined and an auxiliary CAM-based EM table is integrated in each pipeline. To the best of our knowledge, this is the first design that considers the recovery of out-of-order operations in a multi-channel EM table for high-speed network packet processing applications. The design is implemented on a Xilinx Alveo U250 field programmable gate array, holds one million rules, and achieves a processing speed of 200 million operations per second, theoretically enabling throughput exceeding 100 Gbps for 64-byte packets.
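The throughput figure can be checked with simple arithmetic, assuming one table operation per packet (an assumption, since the abstract does not state the ratio). The helper name and the optional `overhead_bytes` parameter are illustrative; the 20-byte value corresponds to the standard Ethernet preamble, start-of-frame delimiter, and inter-frame gap, and whether the authors count it is not stated.

```python
def throughput_gbps(ops_per_second, packet_bytes, overhead_bytes=0):
    """Bits per second carried at one operation per packet, in Gbps."""
    return ops_per_second * (packet_bytes + overhead_bytes) * 8 / 1e9


if __name__ == "__main__":
    # Frame bits only: 200e6 packets/s * 64 bytes * 8 bits = 102.4 Gbps.
    print(throughput_gbps(200e6, 64))
    # Including 20 bytes of per-frame Ethernet overhead on the wire.
    print(throughput_gbps(200e6, 64, overhead_bytes=20))
```

Counting frame bits alone gives 102.4 Gbps, which is consistent with the "exceeding 100 Gbps" claim for minimum-size packets.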
A method was previously reported for determining the mathematical representation of the microwave oscillator admittance by numerical calculation. When this formula is used to analyze load characteristics and synchronization phenomena, the analysis results agree with the experimental results. This paper describes a method to determine the mathematical representation manually.