The search functionality is under construction.

Keyword Search Result

[Keyword] MEC(222hit)

1-20hit(222hit)

  • Power Peak Load Forecasting Based on Deep Time Series Analysis Method Open Access

    Ying-Chang HUNG  Duen-Ren LIU  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2024/03/21
      Vol:
    E107-D No:7
      Page(s):
    845-856

    The prediction of peak power load is a critical factor directly impacting the stability of power supply, characterized significantly by its time series nature and intricate ties to the seasonal patterns in electricity usage. Despite its crucial importance, the current landscape of power peak load forecasting remains a multifaceted challenge in the field. This study aims to contribute to this domain by proposing a method that leverages a combination of three primary models - the GRU model, self-attention mechanism, and Transformer mechanism - to forecast peak power load. To contextualize this research within the ongoing discourse, it’s essential to consider the evolving methodologies and advancements in power peak load forecasting. By delving into additional references addressing the complexities and current state of the power peak load forecasting problem, this study aims to build upon the existing knowledge base and offer insights into contemporary challenges and strategies adopted within the field. Data preprocessing in this study involves comprehensive cleaning, standardization, and the design of relevant functions to ensure robustness in the predictive modeling process. Additionally, recognizing the necessity to capture temporal changes effectively, this research incorporates features such as “Weekly Moving Average” and “Monthly Moving Average” into the dataset. To evaluate the proposed methodologies comprehensively, this study conducts comparative analyses with established models such as LSTM, Self-attention network, Transformer, ARIMA, and SVR. The outcomes reveal that the models proposed in this study exhibit superior predictive performance compared to these established models, showcasing their effectiveness in accurately forecasting electricity consumption. The significance of this research lies in two primary contributions. Firstly, it introduces an innovative prediction method combining the GRU model, self-attention mechanism, and Transformer mechanism, aligning with the contemporary evolution of predictive modeling techniques in the field. Secondly, it introduces and emphasizes the utility of “Weekly Moving Average” and “Monthly Moving Average” methodologies, crucial in effectively capturing and interpreting seasonal variations within the dataset. By incorporating these features, this study enhances the model’s ability to account for seasonal influencing factors, thereby significantly improving the accuracy of peak power load forecasting. This contribution aligns with the ongoing efforts to refine forecasting methodologies and addresses the pertinent challenges within power peak load forecasting.

  • Real-Time Video Matting Based on RVM and Mobile ViT Open Access

    Chengyu WU  Jiangshan QIN  Xiangyang LI  Ao ZHAN  Zhengqiang WANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2024/01/29
      Vol:
    E107-D No:6
      Page(s):
    792-796

    Real-time matting is a challenging research in deep learning. Conventional CNN (Convolutional Neural Networks) approaches are easy to misjudge the foreground and background semantic and have blurry matting edges, which result from CNN’s limited concentration on global context due to receptive field. We propose a real-time matting approach called RMViT (Real-time matting with Vision Transformer) with Transformer structure, attention and content-aware guidance to solve issues above. The semantic accuracy improves a lot due to the establishment of global context and long-range pixel information. The experiments show our approach exceeds a 30% reduction in error metrics compared with existing real-time matting approaches.

  • Analysis of Blood Cell Image Recognition Methods Based on Improved CNN and Vision Transformer Open Access

    Pingping WANG  Xinyi ZHANG  Yuyan ZHAO  Yueti LI  Kaisheng XU  Shuaiyin ZHAO  

     
    PAPER-Neural Networks and Bioengineering

      Pubricized:
    2023/09/15
      Vol:
    E107-A No:6
      Page(s):
    899-908

    Leukemia is a common and highly dangerous blood disease that requires early detection and treatment. Currently, the diagnosis of leukemia types mainly relies on the pathologist’s morphological examination of blood cell images, which is a tedious and time-consuming process, and the diagnosis results are highly subjective and prone to misdiagnosis and missed diagnosis. This research suggests a blood cell image recognition technique based on an enhanced Vision Transformer to address these problems. Firstly, this paper incorporate convolutions with token embedding to replace the positional encoding which represent coarse spatial information. Then based on the Transformer’s self-attention mechanism, this paper proposes a sparse attention module that can select identifying regions in the image, further enhancing the model’s fine-grained feature expression capability. Finally, this paper uses a contrastive loss function to further increase the intra-class consistency and inter-class difference of classification features. According to experimental results, The model in this study has an identification accuracy of 92.49% on the Munich single-cell morphological dataset, which is an improvement of 1.41% over the baseline. And comparing with sota Swin transformer, this method still get greater performance. So our method has the potential to provide reference for clinical diagnosis by physicians.

  • FA-YOLO: A High-Precision and Efficient Method for Fabric Defect Detection in Textile Industry Open Access

    Kai YU  Wentao LYU  Xuyi YU  Qing GUO  Weiqiang XU  Lu ZHANG  

     
    PAPER-Neural Networks and Bioengineering

      Pubricized:
    2023/09/04
      Vol:
    E107-A No:6
      Page(s):
    890-898

    The automatic defect detection for fabric images is an essential mission in textile industry. However, there are some inherent difficulties in the detection of fabric images, such as complexity of the background and the highly uneven scales of defects. Moreover, the trade-off between accuracy and speed should be considered in real applications. To address these problems, we propose a novel model based on YOLOv4 to detect defects in fabric images, called Feature Augmentation YOLO (FA-YOLO). In terms of network structure, FA-YOLO adds an additional detection head to improve the detection ability of small defects and builds a powerful Neck structure to enhance feature fusion. First, to reduce information loss during feature fusion, we perform the residual feature augmentation (RFA) on the features after dimensionality reduction by using 1×1 convolution. Afterward, the attention module (SimAM) is embedded into the locations with rich features to improve the adaptation ability to complex backgrounds. Adaptive spatial feature fusion (ASFF) is also applied to output of the Neck to filter inconsistencies across layers. Finally, the cross-stage partial (CSP) structure is introduced for optimization. Experimental results based on three real industrial datasets, including Tianchi fabric dataset (72.5% mAP), ZJU-Leaper fabric dataset (0.714 of average F1-score) and NEU-DET steel dataset (77.2% mAP), demonstrate the proposed FA-YOLO achieves competitive results compared to other state-of-the-art (SoTA) methods.

  • Data-Quality Aware Incentive Mechanism Based on Stackelberg Game in Mobile Edge Computing Open Access

    Shuyun LUO  Wushuang WANG  Yifei LI  Jian HOU  Lu ZHANG  

     
    PAPER-Mobile Information Network and Personal Communications

      Pubricized:
    2023/09/14
      Vol:
    E107-A No:6
      Page(s):
    873-880

    Crowdsourcing becomes a popular data-collection method to relieve the burden of high cost and latency for data-gathering. Since the involved users in crowdsourcing are volunteers, need incentives to encourage them to provide data. However, the current incentive mechanisms mostly pay attention to the data quantity, while ignoring the data quality. In this paper, we design a Data-quality awaRe IncentiVe mEchanism (DRIVE) for collaborative tasks based on the Stackelberg game to motivate users with high quality, the highlight of which is the dynamic reward allocation scheme based on the proposed data quality evaluation method. In order to guarantee the data quality evaluation response in real-time, we introduce the mobile edge computing framework. Finally, one case study is given and its real-data experiments demonstrate the superior performance of DRIVE.

  • Development of a Coanda-Drone with Built-in Propellers

    Zejing ZHAO  Bin ZHANG  Hun-ok LIM  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2023/11/10
      Vol:
    E107-D No:2
      Page(s):
    180-190

    In this study, a Coanda-drone with length, width, and height of 121.6, 121.6, and 191[mm] was designed, and its total mass was 1166.7[g]. Using four propulsion devices, it could produce a maximum of 5428[g] thrust. Its structure is very different from conventional drones because in this study it combines the design of the jet engine of a jet fixed-wing drone with the fuselage structure layout of a rotary-wing drone. The advantage of jet drone's high propulsion is kept so that it can output greater thrust under the same variation of PWM waveform output. In this study, the propulsion device performs high-speed jetting, and the airflow around the propulsion device will also be jetted downward along the direction of the airflow.

  • CASEformer — A Transformer-Based Projection Photometric Compensation Network

    Yuqiang ZHANG  Huamin YANG  Cheng HAN  Chao ZHANG  Chaoran ZHU  

     
    PAPER

      Pubricized:
    2023/09/29
      Vol:
    E107-D No:1
      Page(s):
    13-28

    In this paper, we present a novel photometric compensation network named CASEformer, which is built upon the Swin module. For the first time, we combine coordinate attention and channel attention mechanisms to extract rich features from input images. Employing a multi-level encoder-decoder architecture with skip connections, we establish multiscale interactions between projection surfaces and projection images, achieving precise inference and compensation. Furthermore, through an attention fusion module, which simultaneously leverages both coordinate and channel information, we enhance the global context of feature maps while preserving enhanced texture coordinate details. The experimental results demonstrate the superior compensation effectiveness of our approach compared to the current state-of-the-art methods. Additionally, we propose a method for multi-surface projection compensation, further enriching our contributions.

  • Statistical-Mechanical Analysis of Adaptive Volterra Filter for Nonwhite Input Signals

    Koyo KUGIYAMA  Seiji MIYOSHI  

     
    PAPER

      Pubricized:
    2023/07/13
      Vol:
    E107-A No:1
      Page(s):
    87-95

    The Volterra filter is one of the digital filters that can describe nonlinearity. In this paper, we analyze the dynamic behaviors of an adaptive signal processing system with the Volterra filter for nonwhite input signals by a statistical-mechanical method. Assuming the self-averaging property with an infinitely long tapped-delay line, we derive simultaneous differential equations that describe the behaviors of macroscopic variables in a deterministic and closed form. We analytically solve the derived equations to reveal the effect of the nonwhiteness of the input signal on the adaptation process. The results for the second-order Volterra filter show that the nonwhiteness decreases the mean-square error (MSE) in the early stages of the adaptation process and increases the MSE in the later stages.

  • Power Analysis and Power Modeling of Directly-Connected FPGA Clusters

    Kensuke IIZUKA  Haruna TAKAGI  Aika KAMEI  Kazuei HIRONAKA  Hideharu AMANO  

     
    PAPER

      Pubricized:
    2023/07/20
      Vol:
    E106-D No:12
      Page(s):
    1997-2005

    FPGA cluster is a promising platform for future computing not only in the cloud but in the 5G wireless base stations with limited power supply by taking significant advantage of power efficiency. However, almost no power analyses with real systems have been reported. This work reports the detailed power consumption analyses of two FPGA clusters, namely FiC and M-KUBOS clusters with introducing power measurement tools and running the real applications. From the detailed analyses, we find that the number of activated links mainly determines the total power consumption of the systems regardless they are used or not. To improve the performance of applications while reducing power consumption, we should increase the clock frequency of the applications, use the minimum number of links and apply link aggregation. We also propose the power model for both clusters from the results of the analyses and this model can estimate the total power consumption of both FPGA clusters at the design step with 15% errors at maximum.

  • A Driver Fatigue Detection Algorithm Based on Dynamic Tracking of Small Facial Targets Using YOLOv7

    Shugang LIU  Yujie WANG  Qiangguo YU  Jie ZHAN  Hongli LIU  Jiangtao LIU  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2023/08/21
      Vol:
    E106-D No:11
      Page(s):
    1881-1890

    Driver fatigue detection has become crucial in vehicle safety technology. Achieving high accuracy and real-time performance in detecting driver fatigue is paramount. In this paper, we propose a novel driver fatigue detection algorithm based on dynamic tracking of Facial Eyes and Yawning using YOLOv7, named FEY-YOLOv7. The Coordinate Attention module is inserted into YOLOv7 to enhance its dynamic tracking accuracy by focusing on coordinate information. Additionally, a small target detection head is incorporated into the network architecture to promote the feature extraction ability of small facial targets such as eyes and mouth. In terms of compution, the YOLOv7 network architecture is significantly simplified to achieve high detection speed. Using the proposed PERYAWN algorithm, driver status is labeled and detected by four classes: open_eye, closed_eye, open_mouth, and closed_mouth. Furthermore, the Guided Image Filtering algorithm is employed to enhance image details. The proposed FEY-YOLOv7 is trained and validated on RGB-infrared datasets. The results show that FEY-YOLOv7 has achieved mAP of 0.983 and FPS of 101. This indicates that FEY-YOLOv7 is superior to state-of-the-art methods in accuracy and speed, providing an effective and practical solution for image-based driver fatigue detection.

  • FOM-CDS PUF: A Novel Configurable Dual State Strong PUF Based on Feedback Obfuscation Mechanism against Modeling Attacks

    Hong LI  Wenjun CAO  Chen WANG  Xinrui ZHU  Guisheng LIAO  Zhangqing HE  

     
    PAPER-Cryptography and Information Security

      Pubricized:
    2023/03/29
      Vol:
    E106-A No:10
      Page(s):
    1311-1321

    The configurable Ring oscillator Physical unclonable function (CRO PUF) is the newly proposed strong PUF based on classic RO PUF, which can generate exponential Challenge-Response Pairs (CRPs) and has good uniqueness and reliability. However, existing proposals have low hardware utilization and vulnerability to modeling attacks. In this paper, we propose a Novel Configurable Dual State (CDS) PUF with lower overhead and higher resistance to modeling attacks. This structure can be flexibly transformed into RO PUF and TERO PUF in the same topology according to the parity of the Hamming Weight (HW) of the challenge, which can achieve 100% utilization of the inverters and improve the efficiency of hardware utilization. A feedback obfuscation mechanism (FOM) is also proposed, which uses the stable count value of the ring oscillator in the PUF as the updated mask to confuse and hide the original challenge, significantly improving the effect of resisting modeling attacks. The proposed FOM-CDS PUF is analyzed by building a mathematical model and finally implemented on Xilinx Artix-7 FPGA, the test results show that the FOM-CDS PUF can effectively resist several popular modeling attack methods and the prediction accuracy is below 60%. Meanwhile it shows that the FOM-CDS PUF has good performance with uniformity, Bit Error Rate at different temperatures, Bit Error Rate at different voltages and uniqueness of 53.68%, 7.91%, 5.64% and 50.33% respectively.

  • iLEDGER: A Lightweight Blockchain Framework with New Consensus Method for IoT Applications

    Veeramani KARTHIKA  Suresh JAGANATHAN  

     
    PAPER-Cryptography and Information Security

      Pubricized:
    2023/03/06
      Vol:
    E106-A No:9
      Page(s):
    1251-1262

    Considering the growth of the IoT network, there is a demand for a decentralized solution. Incorporating the blockchain technology will eliminate the challenges faced in centralized solutions, such as i) high infrastructure, ii) maintenance cost, iii) lack of transparency, iv) privacy, and v) data tampering. Blockchain-based IoT network allows businesses to access and share the IoT data within their organization without a central authority. Data in the blockchain are stored as blocks, which should be validated and added to the chain, for this consensus mechanism plays a significant role. However, existing methods are not designed for IoT applications and lack features like i) decentralization, ii) scalability, iii) throughput, iv) faster convergence, and v) network overhead. Moreover, current blockchain frameworks failed to support resource-constrained IoT applications. In this paper, we proposed a new consensus method (WoG) and a lightweight blockchain framework (iLEDGER), mainly for resource-constrained IoT applications in a permissioned environment. The proposed work is tested in an application that tracks the assets using IoT devices (Raspberry Pi 4 and RFID). Furthermore, the proposed consensus method is analyzed against benign failures, and performance parameters such as CPU usage, memory usage, throughput, transaction execution time, and block generation time are compared with state-of-the-art methods.

  • A Lightweight and Efficient Infrared Pedestrian Semantic Segmentation Method

    Shangdong LIU  Chaojun MEI  Shuai YOU  Xiaoliang YAO  Fei WU  Yimu JI  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2023/06/13
      Vol:
    E106-D No:9
      Page(s):
    1564-1571

    The thermal imaging pedestrian segmentation system has excellent performance in different illumination conditions, but it has some drawbacks(e.g., weak pedestrian texture information, blurred object boundaries). Meanwhile, high-performance large models have higher latency on edge devices with limited computing performance. To solve the above problems, in this paper, we propose a real-time thermal infrared pedestrian segmentation method. The feature extraction layers of our method consist of two paths. Firstly, we utilize the lossless spatial downsampling to obtain boundary texture details on the spatial path. On the context path, we use atrous convolutions to improve the receptive field and obtain more contextual semantic information. Then, the parameter-free attention mechanism is introduced at the end of the two paths for effective feature selection, respectively. The Feature Fusion Module (FFM) is added to fuse the semantic information of the two paths after selection. Finally, we accelerate method inference through multi-threading techniques on the edge computing device. Besides, we create a high-quality infrared pedestrian segmentation dataset to facilitate research. The comparative experiments on the self-built dataset and two public datasets with other methods show that our method also has certain effectiveness. Our code is available at https://github.com/mcjcs001/LEIPNet.

  • An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification

    Pengxu JIANG  Yue XIE  Cairong ZOU  Li ZHAO  Qingyun WANG  

     
    LETTER-Engineering Acoustics

      Pubricized:
    2023/02/06
      Vol:
    E106-A No:8
      Page(s):
    1057-1061

    In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.

  • Parallel Implementation of CNN on Multi-FPGA Cluster

    Yasuyu FUKUSHIMA  Kensuke IIZUKA  Hideharu AMANO  

     
    PAPER-Computer System

      Pubricized:
    2023/04/12
      Vol:
    E106-D No:7
      Page(s):
    1198-1208

    We developed a PYNQ cluster that consists of economical Zynq boards, called M-KUBOS, that are interconnected through low-cost high-performance GTH serial links. For the software environment, we employed the PYNQ open-source software platform. The PYNQ cluster is anticipated to be a multi-access edge computing (MEC) server for 5G mobile networks. We implemented the ResNet-50 inference accelerator on the PYNQ cluster for image recognition of MEC applications. By estimating the execution time of each ResNet-50 layer, layers of ResNet-50 were divided into multiple boards so that the execution time of each board would be as equal as possible for efficient pipeline processing. Owing to the PYNQ cluster in which FPGAs were directly connected by high-speed serial links, stream processing without network bottlenecks and pipeline processing between boards were readily realized. The implementation on 4 boards achieved 292 GOPS performance, 75.1 FPS throughput, and 7.81 GOPS/W power efficiency. It achieved 17 times faster speed and 130 times more power efficiency compared to the implementation on the CPU, and 5.8 times more power efficiency compared to the implementation on the GPU.

  • Time-Series Prediction Based on Double Pyramid Bidirectional Feature Fusion Mechanism

    Na WANG  Xianglian ZHAO  

     
    PAPER-Digital Signal Processing

      Pubricized:
    2022/12/20
      Vol:
    E106-A No:6
      Page(s):
    886-895

    The application of time-series prediction is very extensive, and it is an important problem across many fields, such as stock prediction, sales prediction, and loan prediction and so on, which play a great value in production and life. It requires that the model can effectively capture the long-term feature dependence between the output and input. Recent studies show that Transformer can improve the prediction ability of time-series. However, Transformer has some problems that make it unable to be directly applied to time-series prediction, such as: (1) Local agnosticism: Self-attention in Transformer is not sensitive to short-term feature dependence, which leads to model anomalies in time-series; (2) Memory bottleneck: The spatial complexity of regular transformation increases twice with the sequence length, making direct modeling of long time-series infeasible. In order to solve these problems, this paper designs an efficient model for long time-series prediction. It is a double pyramid bidirectional feature fusion mechanism network with parallel Temporal Convolution Network (TCN) and FastFormer. This network structure can combine the time series fine-grained information captured by the Temporal Convolution Network with the global interactive information captured by FastFormer, it can well handle the time series prediction problem.

  • GazeFollowTR: A Method of Gaze Following with Reborn Mechanism

    Jingzhao DAI  Ming LI  Xuejiao HU  Yang LI  Sidan DU  

     
    PAPER-Vision

      Pubricized:
    2022/11/30
      Vol:
    E106-A No:6
      Page(s):
    938-946

    Gaze following is the task of estimating where an observer is looking inside a scene. Both the observer and scene information must be learned to determine the gaze directions and gaze points. Recently, many existing works have only focused on scenes or observers. In contrast, revealed frameworks for gaze following are limited. In this paper, a gaze following method using a hybrid transformer is proposed. Based on the conventional method (GazeFollow), we conduct three developments. First, a hybrid transformer is applied for learning head images and gaze positions. Second, the pinball loss function is utilized to control the gaze point error. Finally, a novel ReLU layer with the reborn mechanism (reborn ReLU) is conducted to replace traditional ReLU layers in different network stages. To test the performance of our developments, we train our developed framework with the DL Gaze dataset and evaluate the model on our collected set. Through our experimental results, it can be proven that our framework can achieve outperformance over our referred methods.

  • Photochemical Stability of Organic Electro-Optic Polymer at 1310-nm Wavelength Open Access

    Yukihiro TOMINARI  Toshiki YAMADA  Takahiro KAJI  Akira OTOMO  

     
    BRIEF PAPER

      Pubricized:
    2022/11/10
      Vol:
    E106-C No:6
      Page(s):
    228-231

    We investigated the photochemical stability of an electro-optic (EO) polymer under laser irradiation at 1310nm to reveal photodegradation mechanisms. It was found that one-photon absorption excitation assisted with the thermal energy at the temperature is involved in the photodegradation process, in contrast to our previous studies at a wavelength of 1550nm where two-photon absorption excitation is involved in the photodegradation process. Thus, both the excitation wavelength and the thermal energy strongly affect to the degradation mechanism. In any cases, the photodegradation of EO polymers is mainly related to the generation of exited singlet oxygen.

  • Microneedle of Biodegradable Polyacid Anhydride with a Capillary Open Groove for Reagent Transfer

    Satomitsu IMAI  Kazuki CHIDAISYO  Kosuke YASUDA  

     
    BRIEF PAPER

      Pubricized:
    2022/11/28
      Vol:
    E106-C No:6
      Page(s):
    248-252

    Incorporating a tool for administering medication, such as a syringe, is required in microneedles (MNs) for medical use. This renders it easier for non-medical personnel to administer medication. Because it is difficult to fabricate a hollow MN, we fabricated a capillary groove on an MN and its substrate to enable the administration of a higher dosage. MN grooving is difficult to accomplish via the conventional injection molding method used for polylactic acid. Therefore, biodegradable polyacid anhydride was selected as the material for the MN. Because polyacid anhydride is a low-viscosity liquid at room temperature, an MN can be grooved using a processing method similar to vacuum casting. This study investigated the performance of the capillary force of the MN and the optimum shape and size of the MN by a puncture test.

  • A Visual Question Answering Network Merging High- and Low-Level Semantic Information

    Huimin LI  Dezhi HAN  Chongqing CHEN  Chin-Chen CHANG  Kuan-Ching LI  Dun LI  

     
    PAPER-Core Methods

      Pubricized:
    2022/01/06
      Vol:
    E106-D No:5
      Page(s):
    581-589

    Visual Question Answering (VQA) usually uses deep attention mechanisms to learn fine-grained visual content of images and textual content of questions. However, the deep attention mechanism can only learn high-level semantic information while ignoring the impact of the low-level semantic information on answer prediction. For such, we design a High- and Low-Level Semantic Information Network (HLSIN), which employs two strategies to achieve the fusion of high-level semantic information and low-level semantic information. Adaptive weight learning is taken as the first strategy to allow different levels of semantic information to learn weights separately. The gate-sum mechanism is used as the second to suppress invalid information in various levels of information and fuse valid information. On the benchmark VQA-v2 dataset, we quantitatively and qualitatively evaluate HLSIN and conduct extensive ablation studies to explore the reasons behind HLSIN's effectiveness. Experimental results demonstrate that HLSIN significantly outperforms the previous state-of-the-art, with an overall accuracy of 70.93% on test-dev.

1-20hit(222hit)