
Keyword Search Result

[Keyword] LSTM (32 hits)

Showing results 1-20 of 32

  • MuSRGM: A Genetic Algorithm-Based Dynamic Combinatorial Deep Learning Model for Software Reliability Engineering Open Access

    Ning FU  Duksan RYU  Suntae KIM  

     
    PAPER-Software Engineering

    Publicized: 2024/02/06  Vol: E107-D No:6  Page(s): 761-771

    In the software testing phase, software reliability growth models (SRGMs) are commonly used to evaluate the reliability of software systems. Traditional SRGMs are restricted by their assumption of a continuous growth pattern for the failure detection rate (FDR) throughout the testing phase. However, this assumption is undermined by the change-point phenomenon, in which FDR fluctuations stem from variations in testing personnel or procedural modifications, leading to reduced prediction accuracy and compromised software reliability assessments. The objective of this study is therefore to improve software reliability prediction with a novel approach that combines a genetic algorithm (GA) with deep learning-based SRGMs to account for the change-point phenomenon. The proposed approach uses a GA to dynamically combine activation functions from various deep learning-based SRGMs into a new mutated SRGM called MuSRGM. MuSRGM captures the advantages of both concave and S-shaped SRGMs, is better suited to capturing the change-point phenomenon during testing, and more accurately reflects actual testing situations. Additionally, failure data are treated as a time series and analyzed with a combination of Long Short-Term Memory (LSTM) and attention mechanisms. To assess the performance of MuSRGM, we conducted experiments on three distinct failure datasets. The results indicate that MuSRGM outperformed the baseline method, exhibiting low prediction error (MSE) on all three datasets. Furthermore, MuSRGM demonstrated remarkable generalization ability on these datasets, remaining unaffected by uneven data distribution. MuSRGM therefore represents a highly promising solution that offers increased accuracy and applicability for software reliability assessment during the testing phase.
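
    As a rough illustration of the GA component, the sketch below (Python/PyTorch, not the authors' MuSRGM code) evolves the choice of activation functions for the head of a small LSTM-based SRGM by mutation only, scoring each genome by its training MSE on a synthetic concave growth curve. The candidate activation set, layer sizes, and data are all invented assumptions.

    import random
    import torch
    import torch.nn as nn

    ACTS = {"sigmoid": nn.Sigmoid(), "tanh": nn.Tanh(), "softplus": nn.Softplus()}

    class SRGMNet(nn.Module):
        """LSTM over cumulative failure counts; head activations set by a GA genome."""
        def __init__(self, genome):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
            self.head = nn.Sequential(
                nn.Linear(16, 16), ACTS[genome[0]],
                nn.Linear(16, 1), ACTS[genome[1]],
            )
        def forward(self, x):
            out, _ = self.lstm(x)
            return self.head(out[:, -1])

    def fitness(genome, x, y, epochs=30):
        """Lower training MSE = fitter genome (a crude stand-in for validation)."""
        model, loss_fn = SRGMNet(genome), nn.MSELoss()
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        return loss.item()

    def mutate(genome):
        g = genome[:]
        g[random.randrange(len(g))] = random.choice(list(ACTS))
        return g

    # Toy failure-count series: windows of 5 normalized counts -> next count.
    t = torch.linspace(0, 1, 40)
    counts = 1 - torch.exp(-3 * t)            # concave growth curve
    x = torch.stack([counts[i:i+5] for i in range(34)]).unsqueeze(-1)
    y = counts[5:39].unsqueeze(-1)

    pop = [[random.choice(list(ACTS)) for _ in range(2)] for _ in range(4)]
    for gen in range(3):                      # tiny GA: keep the best, mutate it
        pop.sort(key=lambda g: fitness(g, x, y))
        pop = [pop[0]] + [mutate(pop[0]) for _ in range(3)]
    print("best genome:", pop[0])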

  • LSTM Neural Network Algorithm for Handover Improvement in a Non-Ideal Network Using O-RAN Near-RT RIC Open Access

    Baud Haryo PRANANTO   ISKANDAR   HENDRAWAN  Adit KURNIAWAN  

     
    PAPER-Network Management/Operation

    Vol: E107-B No:6  Page(s): 458-469

    Handover is an important feature of cellular communication that enables a user to move from one cell to another without losing the connection. It is crucial to the quality of the user's experience because it may interrupt data transmission, so good handover management is very important in current and future cellular systems. Several techniques have been employed to improve handover performance, usually by increasing the probability of a successful handover. One such technique is predictive handover, which predicts the target cell using methods other than the traditional measurement-based algorithm, including machine learning. Several studies have implemented predictive handover, most of them by modifying the internal algorithms of existing network elements such as the base station. We implemented a predictive handover algorithm in an intelligent node outside the existing network elements, minimizing modification of the network and keeping the system modular. Using the recently standardized Open Radio Access Network (O-RAN) Near Real-time RAN Intelligent Controller (Near-RT RIC), we created a modular application that improves handover performance by selecting the target cell with machine learning. In our previous research, we modified the original Near-RT RIC software to select the target cell using vector autoregression to predict the throughput of each neighboring cell, and later replaced this method with a Multi-Layer Perceptron (MLP) neural network. In this paper, we redesign the neural network using Long Short-Term Memory (LSTM), which better handles time-series data. We show that our proposed LSTM-based machine learning algorithm running in the Near-RT RIC improves handover performance compared to the traditional measurement-based algorithm.
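
    The core idea, predicting each neighboring cell's next throughput with an LSTM and handing over to the best one, can be sketched in a few lines of PyTorch. This is a hypothetical illustration, not the authors' xApp: the cell count, window length, and layer sizes are invented, and a real Near-RT RIC application would consume live measurements rather than random tensors.

    import torch
    import torch.nn as nn

    class CellThroughputLSTM(nn.Module):
        """Predicts next-step throughput for each neighbouring cell from a
        window of past measurements; shapes are illustrative."""
        def __init__(self, n_cells, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_cells, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, n_cells)
        def forward(self, x):              # x: (batch, window, n_cells)
            out, _ = self.lstm(x)
            return self.fc(out[:, -1])     # (batch, n_cells) predicted throughput

    n_cells, window = 4, 8
    model = CellThroughputLSTM(n_cells)
    history = torch.rand(1, window, n_cells)   # past throughput per neighbour cell
    with torch.no_grad():
        pred = model(history)
    target_cell = int(pred.argmax(dim=1))      # hand over to best predicted cell
    print("predicted throughputs:", pred.numpy().round(3), "-> target cell", target_cell)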

  • GNSS Spoofing Detection Using Multiple Sensing Devices and LSTM Networks

    Xin QI  Toshio SATO  Zheng WEN  Yutaka KATSUYAMA  Kazuhiko TAMESUE  Takuro SATO  

     
    PAPER

    Publicized: 2023/08/03  Vol: E106-B No:12  Page(s): 1372-1379

    The rise of next-generation logistics systems featuring autonomous vehicles and drones has highlighted the severe problem of Global Navigation Satellite System (GNSS) location-data spoofing. While signal-based anti-spoofing techniques have been studied, they are often difficult to apply to current commercial GNSS modules. In this study, we explore the use of multiple sensing devices and machine learning techniques, such as decision tree classifiers and Long Short-Term Memory (LSTM) networks, to detect GNSS location-data spoofing. We acquire sensing data from six trajectories and generate spoofing data based on software-defined radio (SDR) behavior for evaluation. We define multiple features using GNSS, beacon, and inertial measurement unit (IMU) data and develop models to detect spoofing. Our experimental results indicate that LSTM networks using ten sequential past samples achieve higher performance, with accuracy and F1 scores above 0.92 when appropriate features, including beacons, are used, and generalize well to untrained test data. Our results also suggest that distance from beacons is a valuable metric for detecting GNSS spoofing and demonstrate the potential for beacon installation along future drone highways.
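
    A minimal sketch of the windowing and classification step, assuming PyTorch; the feature set, window builder, and model sizes are illustrative stand-ins for the paper's GNSS/beacon/IMU features, with only the ten-step window length taken from the abstract.

    import numpy as np
    import torch
    import torch.nn as nn

    SEQ_LEN = 10   # "ten sequential past samples" from the abstract

    def make_windows(features, labels, seq_len=SEQ_LEN):
        """Slice a per-timestep feature matrix into overlapping sequences."""
        xs = [features[i:i+seq_len] for i in range(len(features) - seq_len + 1)]
        ys = labels[seq_len-1:]          # label of the window's last step
        return torch.tensor(np.stack(xs), dtype=torch.float32), torch.tensor(ys)

    class SpoofDetector(nn.Module):
        def __init__(self, n_features, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, 2)   # genuine vs. spoofed
        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1])

    # Hypothetical features: GNSS position error, beacon distance, IMU residuals...
    feats = np.random.rand(200, 5).astype(np.float32)
    labels = np.random.randint(0, 2, 200)
    x, y = make_windows(feats, labels)
    logits = SpoofDetector(n_features=5)(x)
    print(logits.shape)   # torch.Size([191, 2])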

  • A Novel Method for Lightning Prediction by Direct Electric Field Measurements at the Ground Using Recurrent Neural Network

    Masamoto FUKAWA  Xiaoqi DENG  Shinya IMAI  Taiga HORIGUCHI  Ryo ONO  Ikumi RACHI  Sihan A  Kazuma SHINOMURA  Shunsuke NIWA  Takeshi KUDO  Hiroyuki ITO  Hitoshi WAKABAYASHI  Yoshihiro MIYAKE  Atsushi HORI  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2022/06/08  Vol: E105-D No:9  Page(s): 1624-1628

    A method to predict lightning by machine-learning analysis of atmospheric electric fields is proposed for the first time. In this study, we calculated an anomaly score with long short-term memory (LSTM), a recurrent neural network method, using electric-field data recorded every second at the ground. A threshold on the anomaly score was defined, and a lightning alarm at the observation point was issued or canceled accordingly. With this method, 88.9% of lightning events occurred while an alarm was active. These results suggest that a lightning prediction system based on an electric-field sensor and machine learning can be developed in the future.
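
    The alarm mechanism can be illustrated as follows: an LSTM predicts the next electric-field sample, the absolute prediction error serves as the anomaly score, and an alarm is issued or canceled against a threshold. This is a hedged PyTorch sketch with invented sizes and an untrained model; in practice the predictor would first be fitted to fair-weather field data.

    import numpy as np
    import torch
    import torch.nn as nn

    class FieldPredictor(nn.Module):
        """One-step-ahead predictor for the 1 Hz electric-field signal."""
        def __init__(self, hidden=16):
            super().__init__()
            self.lstm = nn.LSTM(1, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, 1)
        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1])

    def alarm_states(signal, model, window=30, threshold=0.5):
        """Anomaly score = |prediction - observation|; alarm while above threshold."""
        states = []
        for i in range(window, len(signal)):
            past = torch.tensor(signal[i-window:i], dtype=torch.float32).view(1, -1, 1)
            with torch.no_grad():
                pred = model(past).item()
            score = abs(pred - signal[i])
            states.append(score > threshold)   # True = alarm issued, False = cancelled
        return states

    # Toy 10-minute field trace; a trained model would replace the fresh one here.
    field = np.sin(np.linspace(0, 20, 600)) + 0.05 * np.random.randn(600)
    print(sum(alarm_states(field, FieldPredictor())), "alarmed seconds")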

  • A Learning-Based Service Function Chain Early Fault Diagnosis Mechanism Based on In-Band Network Telemetry

    Meiming FU  Qingyang LIU  Jiayi LIU  Xiang WANG  Hongyan YANG  

     
    PAPER-Information Network

    Publicized: 2021/10/27  Vol: E105-D No:2  Page(s): 344-354

    Network virtualization has become a promising paradigm for supporting diverse vertical services in Software Defined Networks (SDNs). Each vertical service is carried by a virtual network (VN), which normally has a chaining structure: a Service Function Chain (SFC) composed of an ordered set of virtual network functions (VNFs) that provides a tailored network service. These new programmable flexibilities for future networks also bring new network management challenges: how can we collect and analyze network measurement data, and further predict and diagnose the performance of SFCs? This is a fundamental problem for SFC management, because VNFs can be migrated when SFC performance degrades to avoid Service Level Agreement (SLA) violations. Despite its importance, SFC performance analysis has attracted little research attention in the literature. In this paper, enabled by In-band Network Telemetry (INT), a novel fine-grained network measurement technology, we propose a learning-based framework for early SFC fault prediction and diagnosis. Based on the SFC traffic-flow measurement data provided by INT, the framework first extracts SFC performance features. Long Short-Term Memory (LSTM) networks then predict the values of these features in the next time slot, and finally a Support Vector Machine (SVM) classifies the predicted features into possible SFC faults. We also discuss the practical applicability of the proposed framework and conduct a set of network emulations to validate its performance.
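
    A sketch of the two-stage pipeline, assuming PyTorch and scikit-learn: an LSTM forecasts the next slot's SFC feature vector, and an SVM maps the forecast to a fault class. The feature count, window length, and the random placeholder training data are all assumptions, not the paper's INT features.

    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.svm import SVC

    class FeatureForecaster(nn.Module):
        """Predicts the next time slot's SFC performance features (e.g. per-hop
        latency or queue depth from INT) from a window of past slots."""
        def __init__(self, n_features, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, n_features)
        def forward(self, x):
            out, _ = self.lstm(x)
            return self.fc(out[:, -1])

    n_feat = 6
    forecaster = FeatureForecaster(n_feat)

    # Stage 2: an SVM labels the *predicted* feature vector with a fault class,
    # trained here on random placeholders for historical (features, fault) pairs.
    clf = SVC(kernel="rbf").fit(np.random.rand(100, n_feat),
                                np.random.randint(0, 3, 100))

    window = torch.rand(1, 12, n_feat)          # last 12 slots of INT measurements
    with torch.no_grad():
        next_slot = forecaster(window).numpy()
    print("predicted fault class:", clf.predict(next_slot)[0])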

  • Detecting Depression from Speech through an Attentive LSTM Network

    Yan ZHAO  Yue XIE  Ruiyu LIANG  Li ZHANG  Li ZHAO  Chengyu LIU  

     
    LETTER-Speech and Hearing

    Publicized: 2021/08/24  Vol: E104-D No:11  Page(s): 2019-2023

    As a mental disorder, depression endangers people's health and affects the social order. As an efficient route to diagnosis, automatic depression detection has attracted considerable research interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressed and non-depressed speech across timeframes. The proposed model uses frame-level features, which capture the temporal information of depressive speech, in place of traditional statistical features as the input to the LSTM layers. To achieve richer multi-dimensional deep feature representations, the LSTM output is then passed to attention layers along both the time and the feature dimensions. The outputs of the attention layers are concatenated, the fused feature representation is fed into a fully connected layer, and its output is finally passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy of 90.2%, outperforming the traditional LSTM network and an LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.
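
    A minimal PyTorch sketch of the described architecture: LSTM over frame-level features, attention along the time axis and along the feature axis, concatenation of the two contexts, then a fully connected layer and softmax. The exact attention formulation and all sizes are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DualAttentionLSTM(nn.Module):
        """LSTM over frame-level features, then attention applied along the time
        axis and the feature axis; the two context vectors are concatenated."""
        def __init__(self, n_features, hidden=64, n_classes=2):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.time_score = nn.Linear(hidden, 1)        # one weight per frame
            self.feat_score = nn.Linear(hidden, hidden)   # one weight per feature dim
            self.fc = nn.Linear(2 * hidden, n_classes)
        def forward(self, x):                            # x: (B, T, F)
            h, _ = self.lstm(x)                          # (B, T, H)
            a_t = F.softmax(self.time_score(h), dim=1)   # attention over frames
            time_ctx = (a_t * h).sum(dim=1)              # (B, H)
            m = h.mean(dim=1)                            # (B, H)
            a_f = F.softmax(self.feat_score(m), dim=1)   # attention over feature dims
            feat_ctx = a_f * m                           # (B, H)
            fused = torch.cat([time_ctx, feat_ctx], dim=1)
            return F.log_softmax(self.fc(fused), dim=1)  # depressed / non-depressed

    model = DualAttentionLSTM(n_features=40)
    out = model(torch.rand(8, 120, 40))   # 120 frames of 40-dim frame-level features
    print(out.shape)                      # torch.Size([8, 2])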

  • Research on a Prediction Method for Carbon Dioxide Concentration Based on an Optimized LSTM Network of Spatio-Temporal Data Fusion

    Jun MENG  Gangyi DING  Laiyang LIU  

     
    LETTER-Data Engineering, Web Information Systems

    Publicized: 2021/07/08  Vol: E104-D No:10  Page(s): 1753-1757

    Given the differing spatial and temporal resolutions of observed multi-source heterogeneous carbon dioxide data and the uncertain quality of the observations, we study a data fusion prediction model for multi-scale carbon dioxide concentration data. First, a wireless carbon sensor network is created, gross errors are removed from the original dataset, and the remaining valid data are combined with the kriging method to generate a series of continuous surfaces that express specific features and provide spatio-temporally normalized data for the subsequent prediction model. A long short-term memory network then processes these time- and space-normalized data to obtain a carbon dioxide concentration prediction model at arbitrary scales. Finally, experimental results illustrate that the proposed method with spatio-temporal features is more accurate than single-sensor monitoring without spatio-temporal features.
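
    The kriging step can be sketched with the pykrige package (an assumption; the paper does not name its implementation): scattered sensor readings for one time slice are interpolated onto a regular grid, and a stack of such surfaces over time would then feed the LSTM.

    import numpy as np
    from pykrige.ok import OrdinaryKriging   # pip install pykrige

    # Scattered CO2 readings (lon, lat, ppm) after gross-error removal;
    # coordinates and values here are random placeholders.
    lons = np.random.uniform(116.30, 116.40, 30)
    lats = np.random.uniform(39.90, 39.99, 30)
    ppm = 410 + 5 * np.random.randn(30)

    ok = OrdinaryKriging(lons, lats, ppm, variogram_model="spherical")
    grid_lon = np.linspace(116.30, 116.40, 50)
    grid_lat = np.linspace(39.90, 39.99, 50)
    surface, variance = ok.execute("grid", grid_lon, grid_lat)
    print(surface.shape)   # (50, 50) continuous surface for one time slice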

  • Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field

    Weizhi LIAO  Mingtong HUANG  Pan MA  Yu WANG  

     
    PAPER

    Publicized: 2021/04/22  Vol: E104-D No:8  Page(s): 1214-1221

    Sci-tech intelligence resources contain many knowledge entities. Extracting these entities is of great importance for building knowledge networks, exploring relationships between knowledge, and optimizing search engines. Many existing methods, based mainly on rules or traditional machine learning, require significant human involvement yet still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach to knowledge entity extraction based on BiLSTM and conditional random fields (CRF). A BiLSTM neural network obtains the contextual information of sentences, and a CRF then integrates global label information to achieve optimal labeling. This approach does not require manually constructed features and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the sci-tech literature are processed, from which 50,243 entries are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and the corresponding knowledge networks are established, with a further elaboration on the correlation between two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
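
    A compact sketch of the BiLSTM+CRF tagger, assuming PyTorch plus the pytorch-crf package; the vocabulary, tag set (a single BIO entity type), and sizes are invented.

    import torch
    import torch.nn as nn
    from torchcrf import CRF   # pip install pytorch-crf

    class BiLSTMCRF(nn.Module):
        def __init__(self, vocab_size, n_tags, emb=100, hidden=128):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb, padding_idx=0)
            self.lstm = nn.LSTM(emb, hidden // 2, bidirectional=True, batch_first=True)
            self.emit = nn.Linear(hidden, n_tags)    # per-token tag scores
            self.crf = CRF(n_tags, batch_first=True)
        def loss(self, tokens, tags, mask):
            emissions = self.emit(self.lstm(self.emb(tokens))[0])
            return -self.crf(emissions, tags, mask=mask)   # negative log-likelihood
        def decode(self, tokens, mask):
            emissions = self.emit(self.lstm(self.emb(tokens))[0])
            return self.crf.decode(emissions, mask=mask)   # optimal label paths

    # Toy batch: 2 sentences, 7 tokens, BIO tags over one entity type (O/B/I = 0/1/2)
    tokens = torch.randint(1, 1000, (2, 7))
    tags = torch.randint(0, 3, (2, 7))
    mask = torch.ones(2, 7, dtype=torch.bool)
    model = BiLSTMCRF(vocab_size=1000, n_tags=3)
    print(model.loss(tokens, tags, mask).item(), model.decode(tokens, mask))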

  • Image Captioning Algorithm Based on Multi-Branch CNN and Bi-LSTM

    Shan HE  Yuanyao LU  Shengnan CHEN  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2021/04/19  Vol: E104-D No:7  Page(s): 941-947

    The development of deep learning and neural networks has brought broad prospects to computer vision and natural language processing. The image captioning task combines cutting-edge methods from both fields, and building an end-to-end encoder-decoder model can greatly improve description performance. In this paper, a multi-branch deep convolutional neural network is used as the encoder to extract image features, and a recurrent neural network generates descriptive text matching the input image. We conducted experiments on the Flickr8k, Flickr30k, and MSCOCO datasets. Analysis of the results on standard evaluation metrics shows that the proposed model performs image captioning effectively and outperforms classic image captioning models such as neural image annotation models.

  • FiC-RNN: A Multi-FPGA Acceleration Framework for Deep Recurrent Neural Networks

    Yuxi SUN  Hideharu AMANO  

     
    PAPER-Computer System

    Publicized: 2020/09/24  Vol: E103-D No:12  Page(s): 2457-2462

    Recurrent neural networks (RNNs) have proven effective for sequence-based tasks thanks to their ability to process temporal information. In real-world systems, deep RNNs are widely used to solve complicated tasks such as large-scale speech recognition and machine translation. However, implementing deep RNNs on traditional hardware platforms is inefficient due to their long-range temporal dependences and irregular computation patterns. This inefficiency manifests as inference latency that grows proportionally with the number of layers on CPUs and GPUs. Previous work has focused mostly on optimizing and accelerating individual RNN cells. To make deep RNN inference fast and efficient, we propose an accelerator based on a multi-FPGA platform called Flow-in-Cloud (FiC). In this work, we show that the parallelism provided by the multi-FPGA system can be exploited to scale up deep RNN inference by partitioning a large model across several FPGAs, so that latency stays close to constant as the number of RNN layers increases. For single-layer and four-layer RNNs, our implementation achieves 31× and 61× speedups over an Intel CPU.

  • Citation Count Prediction Based on Neural Hawkes Model

    Lisha LIU  Dongjin YU  Dongjing WANG  Fumiyo FUKUMOTO  

     
    PAPER-Biocybernetics, Neurocomputing

    Publicized: 2020/08/03  Vol: E103-D No:11  Page(s): 2379-2388

    With the rapid development of scientific research, the number of publications, such as scientific papers and patents, has grown rapidly. It is increasingly important to identify high-quality, high-impact publications within such a large volume. Citation count is one of the best-known indicators of a publication's future impact. However, how to interpret a large number of uncertain publication factors as relevant features and use them to capture the impact of publications over time remains a challenging problem. This paper presents an approach that effectively leverages a variety of factors in a neural citation prediction model. Specifically, the proposed model is based on the Neural Hawkes Process (NHP) with continuous-time Long Short-Term Memory (cLSTM), which captures the aging effect and the 'sleeping beauty' phenomenon more effectively from publication covariates as well as citation counts. Experimental results on two datasets show that the proposed approach outperforms state-of-the-art baselines. In addition, the contribution of the covariates to the performance improvement is also verified.
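
    The continuous-time ingredient can be sketched directly from the published NHP equations: between events, the cell state decays toward a target value, and the intensity is a softplus read-out of the decayed hidden state. The numbers below are invented, and this shows the cell dynamics only, not the authors' citation model.

    import torch

    def clstm_decay(c, c_bar, delta, o, dt):
        """Continuous-time LSTM decay between events (Neural Hawkes Process):
        the cell state relaxes from c toward its target c_bar at rate delta,
        and the hidden state is read out after elapsed time dt."""
        c_t = c_bar + (c - c_bar) * torch.exp(-delta * dt)
        h_t = o * torch.tanh(c_t)
        return c_t, h_t

    def intensity(h_t, w, s=1.0):
        """Event intensity lambda(t) = s * softplus(w . h(t) / s); always positive."""
        return s * torch.nn.functional.softplus(h_t @ w / s)

    # One cell with hypothetical post-event states; intensity decays as time passes.
    c, c_bar = torch.tensor([1.5]), torch.tensor([0.2])
    delta, o, w = torch.tensor([0.8]), torch.tensor([0.9]), torch.tensor([1.0])
    for dt in [0.0, 1.0, 5.0]:
        _, h = clstm_decay(c, c_bar, delta, o, torch.tensor(dt))
        print(f"dt={dt}: lambda={intensity(h, w).item():.3f}")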

  • Contextualized Character Embedding with Multi-Sequence LSTM for Automatic Word Segmentation

    Hyunyoung LEE  Seungshik KANG  

     
    PAPER-Natural Language Processing

    Publicized: 2020/08/19  Vol: E103-D No:11  Page(s): 2371-2378

    Contextual information is a crucial factor in natural language processing tasks such as sequence labeling. Previous studies on contextualized embedding and word embedding have explored the context of word-level tokens to obtain useful language features. However, unlike in English, the fundamental unit in East Asian languages is the character-level token. In this paper, we propose a contextualized character embedding method that uses n-gram multi-sequence information with long short-term memory (LSTM). The hypothesis is that contextualized embeddings over multiple sequences help each other handle long-term contextual information such as the spans and boundaries of segmentation. Our analysis shows that the contextualized embedding of bigram character sequences encodes the spans and boundaries for word segmentation better than that of unigram character sequences. We also find that combining the contextualized embeddings of unigram and bigram character sequences at the output layer of the LSTMs, rather than at the input layer, improves word segmentation performance. Comparisons show that our proposed method outperforms previous models.

  • New Word Detection Using BiLSTM+CRF Model with Features

    Jianyong DUAN  Zheng TAN  Mei ZHANG  Hao WANG  

     
    PAPER-Natural Language Processing

    Publicized: 2020/07/14  Vol: E103-D No:10  Page(s): 2228-2236

    With the widespread popularity of social platforms, an increasing number of new words appear. Such new words make NLP tasks like word segmentation more challenging, so new word detection has always been an important and difficult task in NLP. This paper extracts new words using a BiLSTM+CRF model augmented with several features we selected: word length, part of speech (POS), contextual entropy, and degree of word coagulation. Compared to traditional new word detection methods, our method can exploit both the features extracted by the model and the features we select. Experimental results demonstrate that our model performs better than the benchmark models.
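
    Of the selected features, contextual (branching) entropy is easy to illustrate: a string that can be followed by many different characters is more likely a free-standing word. A small self-contained Python sketch with a toy corpus:

    import math
    from collections import Counter

    def branching_entropy(corpus, candidate):
        """Entropy of the characters that follow `candidate` in the corpus:
        high entropy suggests the candidate is a free-standing word."""
        followers = Counter()
        start = corpus.find(candidate)
        while start != -1:
            nxt = start + len(candidate)
            if nxt < len(corpus):
                followers[corpus[nxt]] += 1
            start = corpus.find(candidate, start + 1)
        total = sum(followers.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log2(c / total) for c in followers.values())

    corpus = "the model learns, a model, models are good models"
    for cand in ["model", "mode"]:
        # "model" is followed by ' ', ',', 's' (high entropy); "mode" only by 'l'.
        print(cand, round(branching_entropy(corpus, cand), 3))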

  • A Highly Configurable 7.62GOP/s Hardware Implementation for LSTM

    Yibo FAN  Leilei HUANG  Kewei CHEN  Xiaoyang ZENG  

     
    PAPER-Integrated Electronics

    Publicized: 2019/11/27  Vol: E103-C No:5  Page(s): 263-273

    Neural networks have been among the most useful techniques in speech recognition, language translation, and image analysis in recent years. Long Short-Term Memory (LSTM), a popular type of recurrent neural network (RNN), has been widely implemented on CPUs and GPUs. However, those software implementations offer poor parallelism, while existing hardware implementations lack configurability. To bridge this gap, a highly configurable 7.62 GOP/s hardware implementation for LSTM is proposed in this paper. To achieve this goal, the workflow is carefully arranged to make the design compact and high-throughput; the structure is carefully organized to make the design configurable; the data buffering and compression strategy is carefully chosen to lower the bandwidth without increasing structural complexity; and the data type, logistic sigmoid (σ) function, and hyperbolic tangent (tanh) function are carefully optimized to balance hardware cost and accuracy. This work achieves 7.62 GOP/s at 238 MHz on an XCZU6EG FPGA using only 3K look-up tables (LUTs). Compared with an implementation on an Intel Xeon E5-2620 CPU at 2.10 GHz, this work achieves about a 90× speedup for small networks and a 25× speedup for large ones. Resource consumption is also much lower than that of state-of-the-art works.
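
    Optimizing the sigmoid/tanh functions for hardware typically means replacing the exponential with a piecewise-linear form and quantizing to fixed point. The numpy sketch below shows the standard hard-sigmoid approximation (an illustrative choice; the paper's exact scheme may differ) and reuses it for tanh via tanh(x) = 2σ(2x) - 1.

    import numpy as np

    def hard_sigmoid(x):
        """Piecewise-linear sigmoid, a common hardware-friendly approximation:
        exact at x=0, clipped outside [-2.5, 2.5], no exponentials required."""
        return np.clip(0.2 * x + 0.5, 0.0, 1.0)

    def hard_tanh_from_sigmoid(x):
        """tanh(x) = 2*sigmoid(2x) - 1, so one approximation serves both gates."""
        return 2.0 * hard_sigmoid(2.0 * x) - 1.0

    def to_fixed(x, frac_bits=8):
        """Quantize to a signed fixed-point grid with 1/2^frac_bits steps."""
        return np.round(x * (1 << frac_bits)) / (1 << frac_bits)

    x = np.linspace(-4, 4, 9)
    err = np.abs(1.0 / (1.0 + np.exp(-x)) - to_fixed(hard_sigmoid(x)))
    print("max |sigmoid - approx| on grid:", round(float(err.max()), 4))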

  • Patient-Specific ECG Classification with Integrated Long Short-Term Memory and Convolutional Neural Networks

    Jiaquan WU  Feiteng LI  Zhijian CHEN  Xiaoyan XIANG  Yu PU  

     
    PAPER-Biological Engineering

    Publicized: 2020/02/13  Vol: E103-D No:5  Page(s): 1153-1163

    This paper presents an automated patient-specific ECG classification algorithm that integrates long short-term memory (LSTM) and convolutional neural networks (CNN). The LSTM extracts temporal features, such as heart rate variability (HRV) and beat-to-beat correlation, from sequential heartbeats, while the CNN captures detailed morphological characteristics of the current heartbeat. To further improve classification performance, adaptive segmentation and re-sampling are applied to align the heartbeats of patients with various heart rates. In addition, a novel clustering method is proposed to identify the most representative patterns from the common training data. Evaluated on the MIT-BIH arrhythmia database, our algorithm shows superior accuracy for both ventricular ectopic beat (VEB) and supraventricular ectopic beat (SVEB) recognition. In particular, the sensitivity and positive predictive rate for SVEB increase by more than 8.2% and 8.8%, respectively, compared with prior works. Since our patient-specific classification does not require manual feature extraction, it is potentially applicable to embedded devices for automatic and accurate arrhythmia monitoring.
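
    A sketch of the two ideas in PyTorch: adaptive re-sampling that stretches each beat to a fixed length, and a dual-branch network whose LSTM sees a beat sequence while a CNN sees the current beat's morphology. Layer sizes, beat length, and class count are assumptions rather than the paper's configuration.

    import numpy as np
    import torch
    import torch.nn as nn

    def resample_beat(beat, target_len=128):
        """Adaptive re-sampling: stretch/compress a heartbeat of any duration to a
        fixed length so patients with different heart rates are aligned."""
        src = np.linspace(0.0, 1.0, len(beat))
        dst = np.linspace(0.0, 1.0, target_len)
        return np.interp(dst, src, beat).astype(np.float32)

    class LSTMCNNClassifier(nn.Module):
        """LSTM branch sees a sequence of past beats (temporal context such as
        beat-to-beat variation); CNN branch sees the current beat's morphology."""
        def __init__(self, beat_len=128, n_classes=5, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(beat_len, hidden, batch_first=True)
            self.cnn = nn.Sequential(
                nn.Conv1d(1, 8, kernel_size=7, padding=3), nn.ReLU(),
                nn.AdaptiveAvgPool1d(16), nn.Flatten(),
            )
            self.fc = nn.Linear(hidden + 8 * 16, n_classes)
        def forward(self, beat_seq, current_beat):
            temporal = self.lstm(beat_seq)[0][:, -1]         # (B, hidden)
            morph = self.cnn(current_beat.unsqueeze(1))      # (B, 128)
            return self.fc(torch.cat([temporal, morph], dim=1))

    beats = [resample_beat(np.random.randn(n)) for n in (180, 150, 210, 140)]
    seq = torch.tensor(np.stack(beats)).unsqueeze(0)         # (1, 4 beats, 128)
    logits = LSTMCNNClassifier()(seq, seq[:, -1])
    print(logits.shape)                                      # (1, 5) beat classes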

  • Software Development Effort Estimation from Unstructured Software Project Description by Sequence Models

    Tachanun KANGWANTRAKOOL  Kobkrit VIRIYAYUDHAKORN  Thanaruk THEERAMUNKONG  

     
    PAPER

    Publicized: 2020/01/14  Vol: E103-D No:4  Page(s): 739-747

    Most existing methods of effort estimation in software development are manual, labor-intensive, and subjective, resulting in overestimation that loses bids and underestimation that loses money. This paper investigates the effectiveness of sequence models for estimating development effort, in man-months, from software project data. Four architectures are compared in terms of man-month error: (1) averaged word vectors with a Multi-Layer Perceptron (MLP), (2) averaged word vectors with Support Vector Regression (SVR), (3) a Gated Recurrent Unit (GRU) sequence model, and (4) a Long Short-Term Memory (LSTM) sequence model. The approach is evaluated on two datasets: ISEM (1,573 English software project descriptions, raw text) and ISBSG (9,100 software project records, a structured table describing project characteristics). The LSTM sequence model achieves the lowest and second-lowest mean absolute errors, 0.705 and 14.077 man-months on the ISEM and ISBSG datasets, respectively. The MLP model achieves the lowest mean absolute error, 14.069, on the ISBSG dataset.
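
    The LSTM variant reduces to a standard sequence-regression setup: embed the description tokens, run an LSTM, and regress a single man-month value, trained against mean absolute error as in the paper. A minimal PyTorch sketch with invented vocabulary and sizes:

    import torch
    import torch.nn as nn

    class EffortRegressor(nn.Module):
        """Reads a project description token-by-token and regresses man-months."""
        def __init__(self, vocab_size=5000, emb=64, hidden=64):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb, padding_idx=0)
            self.lstm = nn.LSTM(emb, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, 1)
        def forward(self, tokens):
            out, _ = self.lstm(self.emb(tokens))
            return self.fc(out[:, -1]).squeeze(-1)   # predicted man-months

    model = EffortRegressor()
    tokens = torch.randint(1, 5000, (2, 40))   # two tokenized project descriptions
    target = torch.tensor([6.5, 18.0])         # placeholder man-month labels
    loss = nn.L1Loss()(model(tokens), target)  # MAE matches the paper's metric
    loss.backward()
    print(float(loss))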

  • A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features

    Yoonhee KIM  Deokgyu YUN  Hannah LEE  Seung Ho CHOI  

     
    LETTER-Speech and Hearing

    Publicized: 2019/12/11  Vol: E103-D No:3  Page(s): 714-715

    This paper presents a deep learning-based non-intrusive speech intelligibility estimation method using the bottleneck features of an autoencoder. The conventional standard non-intrusive estimation method, P.563, performs poorly in various noise environments. We propose a more accurate speech intelligibility estimation method based on a long short-term memory (LSTM) neural network whose input is autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score, where STOI is a standard tool for measuring intrusive speech intelligibility against reference speech signals. We show that the proposed method outperforms the conventional standard P.563 and mel-frequency cepstral coefficient (MFCC) feature-based estimation methods for speech signals in various noise environments.
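
    A sketch of the feature path, assuming PyTorch: an autoencoder compresses each spectral frame to a bottleneck vector, and an LSTM regresses a STOI-like score from the bottleneck sequence. The frame dimension, bottleneck size, and the random input stand in for real speech features; the sigmoid output is a design choice reflecting STOI's roughly [0, 1] range.

    import torch
    import torch.nn as nn

    class Autoencoder(nn.Module):
        def __init__(self, n_in=257, bottleneck=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(n_in, 128), nn.ReLU(),
                                     nn.Linear(128, bottleneck))
            self.dec = nn.Sequential(nn.Linear(bottleneck, 128), nn.ReLU(),
                                     nn.Linear(128, n_in))
        def forward(self, x):
            z = self.enc(x)
            return self.dec(z), z          # reconstruction loss trains the encoder

    class StoiEstimator(nn.Module):
        """LSTM maps a sequence of bottleneck vectors to one intelligibility score."""
        def __init__(self, bottleneck=32, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(bottleneck, hidden, batch_first=True)
            self.fc = nn.Linear(hidden, 1)
        def forward(self, z_seq):
            out, _ = self.lstm(z_seq)
            return torch.sigmoid(self.fc(out[:, -1]))

    frames = torch.rand(1, 200, 257)       # 200 spectral frames of one utterance
    ae, est = Autoencoder(), StoiEstimator()
    _, z = ae(frames)                      # bottleneck features, shape (1, 200, 32)
    print("estimated intelligibility:", float(est(z)))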

  • Attentive Sequences Recurrent Network for Social Relation Recognition from Video Open Access

    Jinna LV  Bin WU  Yunlei ZHANG  Yunpeng XIAO  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2019/09/02  Vol: E102-D No:12  Page(s): 2568-2576

    Social relation analysis has recently received increasing attention, moving from text to image data. However, social relation analysis from video, an important problem, is largely missing from the current literature. Several challenges remain: 1) it is hard to learn a satisfactory mapping function from low-level pixels to the high-level social relation space; 2) it is difficult to efficiently select the most relevant information from noisy and unsegmented video. In this paper, we present an Attentive Sequences Recurrent Network model, called ASRN, to address these challenges. First, to explore multiple clues, we design a Multiple Feature Attention (MFA) mechanism that fuses multiple visual features (image, motion, body, and face), yielding an appropriate mapping from low-level video pixels to the high-level social relation space. Second, we design a sequence recurrent network based on a Global and Local Attention (GLA) mechanism. Specifically, the attention in GLA integrates the global feature with local sequence features to select the sequences most relevant to the recognition task, so the GLA module better handles noisy and unsegmented video. Finally, extensive experiments on the SRIV dataset demonstrate the performance of our ASRN model.

  • Tweet Stance Detection Using Multi-Kernel Convolution and Attentive LSTM Variants

    Umme Aymun SIDDIQUA  Abu Nowshed CHY  Masaki AONO  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2019/09/25  Vol: E102-D No:12  Page(s): 2493-2503

    Stance detection in Twitter aims at mining user stances expressed in tweets towards single or multiple target entities. Detecting and analyzing user stances from massive opinion-oriented Twitter posts provides enormous opportunities for journalists, governments, companies, and other organizations. Most prior studies have explored traditional deep learning models, e.g., long short-term memory (LSTM) and gated recurrent units (GRU), for detecting stance in tweets. Compared to these traditional approaches, the recently proposed densely connected bidirectional LSTM and nested LSTM architectures effectively address the vanishing-gradient and overfitting problems and better handle long-term dependencies. In this paper, we propose a neural network model that adopts the strengths of these two LSTM variants to learn better long-term dependencies, with each module coupled with an attention mechanism that amplifies the contribution of important elements in the final representation. We also employ a multi-kernel convolution on top of them to extract higher-level tweet representations. Results of extensive experiments on single- and multi-target benchmark stance detection datasets show that our proposed method achieves substantial improvement over current state-of-the-art deep learning based methods.
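
    The multi-kernel convolution on top of recurrent representations can be sketched as parallel Conv1d branches with different kernel widths, max-pooled over time and concatenated. A PyTorch illustration with assumed sizes (a plain bidirectional LSTM stands in for the densely connected/nested variants):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiKernelConv(nn.Module):
        """Parallel 1-D convolutions with several kernel widths over a sequence of
        hidden states (e.g. LSTM outputs), max-pooled and concatenated."""
        def __init__(self, in_dim, n_filters=32, kernel_sizes=(3, 4, 5)):
            super().__init__()
            self.convs = nn.ModuleList(
                [nn.Conv1d(in_dim, n_filters, k, padding=k // 2) for k in kernel_sizes]
            )
        def forward(self, h):                    # h: (B, T, in_dim)
            h = h.transpose(1, 2)                # Conv1d expects (B, C, T)
            pooled = [F.relu(conv(h)).max(dim=2).values for conv in self.convs]
            return torch.cat(pooled, dim=1)      # (B, n_filters * len(kernel_sizes))

    lstm = nn.LSTM(100, 64, batch_first=True, bidirectional=True)
    h, _ = lstm(torch.rand(8, 30, 100))          # 30-token tweets, 100-dim embeddings
    print(MultiKernelConv(in_dim=128)(h).shape)  # torch.Size([8, 96])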

  • Multi-Level Attention Based BLSTM Neural Network for Biomedical Event Extraction

    Xinyu HE  Lishuang LI  Xingchen SONG  Degen HUANG  Fuji REN  

     
    PAPER-Natural Language Processing

    Publicized: 2019/04/26  Vol: E102-D No:9  Page(s): 1842-1850

    Biomedical event extraction is an important and challenging task in information extraction, playing a key role in medical research and disease prevention. Most existing event detection methods are based on shallow machine learning, relying heavily on domain knowledge and elaborately designed features. Another challenge is that crucial information, as well as interactions among words or arguments, may be ignored because most works treat words and sentences equally. We therefore employ a Bidirectional Long Short-Term Memory (BLSTM) neural network for event extraction, which avoids handcrafted, complex feature engineering. Furthermore, we propose a multi-level attention mechanism, including word-level attention, which determines the importance of words in a sentence, and sentence-level attention, which determines the importance of relevant arguments. Finally, we train dependency word embeddings and add sentence vectors to enrich the semantic information. Experimental results show that our model achieves an F-score of 59.61% on the commonly used biomedical event extraction dataset (MLEE), outperforming other state-of-the-art methods.
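
    Word-level attention over BLSTM states can be sketched as a learned scoring of each hidden state followed by a weighted sum; sentence-level attention would apply the same pattern over sentence vectors. A PyTorch sketch with invented dimensions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WordAttention(nn.Module):
        """Scores each BLSTM hidden state and pools them into one sentence vector,
        so trigger-relevant words contribute more than function words."""
        def __init__(self, hidden):
            super().__init__()
            self.proj = nn.Linear(hidden, hidden)
            self.query = nn.Linear(hidden, 1, bias=False)
        def forward(self, h):                               # h: (B, T, hidden)
            scores = self.query(torch.tanh(self.proj(h)))   # (B, T, 1)
            alpha = F.softmax(scores, dim=1)                # word-level weights
            return (alpha * h).sum(dim=1), alpha.squeeze(-1)

    blstm = nn.LSTM(100, 50, batch_first=True, bidirectional=True)
    h, _ = blstm(torch.rand(4, 25, 100))     # 25-token biomedical sentences
    sent_vec, alpha = WordAttention(hidden=100)(h)
    print(sent_vec.shape, alpha.shape)       # (4, 100) and (4, 25) word weights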

Showing results 1-20 of 32