1-3hit |
Lin YAN Mingyong ZENG Shuai REN Zhangkai LUO
Traffic categorization aims to classify network traffic into major service types. A modern deep neural network based on temporal sequence modeling is proposed for encrypted traffic categorization. The contemporary techniques such as dilated convolution and residual connection are adopted as the basic building block. The raw traffic files are pre-processed to generate 1-dimensional flow byte sequences and are feed into our specially-devised network. The proposed approach outperforms other existing methods greatly on a public traffic dataset.
Tachanun KANGWANTRAKOOL Kobkrit VIRIYAYUDHAKORN Thanaruk THEERAMUNKONG
Most existing methods of effort estimations in software development are manual, labor-intensive and subjective, resulting in overestimation with bidding fail, and underestimation with money loss. This paper investigates effectiveness of sequence models on estimating development effort, in the form of man-months, from software project data. Four architectures; (1) Average word-vector with Multi-layer Perceptron (MLP), (2) Average word-vector with Support Vector Regression (SVR), (3) Gated Recurrent Unit (GRU) sequence model, and (4) Long short-term memory (LSTM) sequence model are compared in terms of man-months difference. The approach is evaluated using two datasets; ISEM (1,573 English software project descriptions) and ISBSG (9,100 software projects data), where the former is a raw text and the latter is a structured data table explained the characteristic of a software project. The LSTM sequence model achieves the lowest and the second lowest mean absolute errors, which are 0.705 and 14.077 man-months for ISEM and ISBSG datasets respectively. The MLP model achieves the lowest mean absolute errors which is 14.069 for ISBSG datasets.
Hansjorg HOFMANN Sakriani SAKTI Chiori HORI Hideki KASHIOKA Satoshi NAKAMURA Wolfgang MINKER
The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme to phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. Hereby, the whole word sequence and its effect on the alternation of the phonemes will be taken into consideration. Moreover, the system not only learns the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, first the phonemes will be recognized with the present recognition system and afterwards the pronunciation variation model based on the noisy channel approach will map from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: Joint-sequence models and statistical machine translation. Both of them are applied and various experiments are conducted using microphone and telephone of spontaneous speech.