IEICE global.ieice.org Site

Keyword Search Result

[Keyword] BERT(66hit)

1-20hit(66hit)

Automated Labeling of Entities in CVE Vulnerability Descriptions with Natural Language Processing Open Access
Kensuke SUMOTO Kenta KANAKOGI Hironori WASHIZAKI Naohiko TSUDA Nobukazu YOSHIOKA Yoshiaki FUKAZAWA Hideyuki KANUKA

PAPER

Publicized:
2024/02/09
Vol:
E107-D No:5
Page(s):
674-682
Security-related issues have become more significant due to the proliferation of IT. Collating security-related information in a database improves security. For example, Common Vulnerabilities and Exposures (CVE) is a security knowledge repository containing descriptions of vulnerabilities about software or source code. Although the descriptions include various entities, there is not a uniform entity structure, making security analysis difficult using individual entities. Developing a consistent entity structure will enhance the security field. Herein we propose a method to automatically label select entities from CVE descriptions by applying the Named Entity Recognition (NER) technique. We manually labeled 3287 CVE descriptions and conducted experiments using a machine learning model called BERT to compare the proposed method to labeling with regular expressions. Machine learning using the proposed method significantly improves the labeling accuracy. It has an f1 score of about 0.93, precision of about 0.91, and recall of about 0.95, demonstrating that our method has potential to automatically label select entities from CVE descriptions.
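
The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below is only an arithmetic illustration; the function name `f1_score` is a placeholder chosen here, not something from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract above: precision ~0.91, recall ~0.95.
reported_f1 = f1_score(0.91, 0.95)
print(round(reported_f1, 2))  # 0.93
```

With the rounded inputs the harmonic mean is about 0.9296, which is consistent with the stated F1 of about 0.93.
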
PSDSpell: Pre-Training with Self-Distillation Learning for Chinese Spelling Correction Open Access
Li HE Xiaowu ZHANG Jianyong DUAN Hao WANG Xin LI Liang ZHAO

PAPER

Publicized:
2023/10/25
Vol:
E107-D No:4
Page(s):
495-504
Chinese spelling correction (CSC) models detect and correct a text typo based on the misspelled character and its context. Recently, BERT-based models have dominated the research of Chinese spelling correction. However, these methods only focus on the semantic information of the text during the pretraining stage, neglecting the learning of correcting spelling errors. Moreover, when multiple incorrect characters are in the text, the context introduces noisy information, making it difficult for the model to accurately detect the positions of the incorrect characters, leading to false corrections. To address these limitations, we apply the multimodal pre-trained language model ChineseBert to the task of spelling correction. We propose a self-distillation learning-based pretraining strategy, where a confusion set is used to construct text containing erroneous characters, allowing the model to jointly learn how to understand language and correct spelling errors. Additionally, we introduce a single-channel masking mechanism to mitigate the noise caused by the incorrect characters. This mechanism masks the semantic encoding channel while preserving the phonetic and glyph encoding channels, reducing the noise introduced by incorrect characters during the prediction process. Finally, experiments are conducted on widely used benchmarks. Our model outperforms state-of-the-art methods by a remarkable margin.
Hilbert Series for Systems of UOV Polynomials
Yasuhiko IKEMATSU Tsunekazu SAITO

PAPER

Publicized:
2023/09/11
Vol:
E107-A No:3
Page(s):
275-282
Multivariate public key cryptosystems (MPKC) are constructed based on the problem of solving multivariate quadratic equations (MQ problem). Among various multivariate schemes, UOV is an important signature scheme since it underlies signature schemes such as MAYO, QR-UOV, and Rainbow, the last of which was a finalist in the NIST PQC standardization project. To analyze the security of a multivariate scheme, it is necessary to analyze the first fall degree or solving degree for the system of polynomial equations used in specific attacks. It is known that the first fall degree or solving degree often relates to the Hilbert series of the ideal generated by the system. In this paper, we study the Hilbert series of the UOV scheme, and more specifically, we study the Hilbert series of ideals generated by quadratic polynomials used in the central map of UOV. In particular, we derive a prediction formula of the Hilbert series by using some experimental results. Moreover, we apply it to the analysis of the reconciliation attack for MAYO.
Effective Language Representations for Danmaku Comment Classification in Nicovideo
Hiroyoshi NAGAO Koshiro TAMURA Marie KATSURAI

PAPER

Publicized:
2023/01/16
Vol:
E106-D No:5
Page(s):
838-846
Danmaku commenting has become popular for co-viewing on video-sharing platforms, such as Nicovideo. However, many irrelevant comments usually contaminate the quality of the information provided by videos. Such an information pollution problem can be solved by a comment classifier trained with an abstention option, which detects comments whose video categories are unclear. To improve the performance of this classification task, this paper presents Nicovideo-specific language representations. Specifically, we used sentences from Nicopedia, a Japanese online encyclopedia of entities that possibly appear in Nicovideo contents, to pre-train a Bidirectional Encoder Representations from Transformers (BERT) model. The resulting model, named Nicopedia BERT, is then fine-tuned so that it can determine whether a given comment falls into any of the predefined categories. The experiments conducted on Nicovideo comment data demonstrated the effectiveness of Nicopedia BERT compared with existing BERT models pre-trained using Wikipedia or tweets. We also evaluated the performance of each model in an additional sentiment classification task, and the obtained results implied the applicability of Nicopedia BERT as a feature extractor of other social media text.
Auxiliary Loss for BERT-Based Paragraph Segmentation
Binggang ZHUO Masaki MURATA Qing MA

PAPER-Natural Language Processing

Publicized:
2022/10/20
Vol:
E106-D No:1
Page(s):
58-67
Paragraph segmentation is a text segmentation task. Iikura et al. achieved excellent results on paragraph segmentation by introducing focal loss to Bidirectional Encoder Representations from Transformers. In this study, we investigated paragraph segmentation on Daily News and Novel datasets. Based on the approach proposed by Iikura et al., we used auxiliary loss to train the model to improve paragraph segmentation performance. As a result, the average F1-score obtained by the approach of Iikura et al. was 0.6704 on the Daily News dataset, whereas that of our approach was 0.6801. Our approach thus improved the performance by approximately 1 percentage point (0.0097 in F1-score). The performance improvement was also confirmed on the Novel dataset. Furthermore, the results of two-tailed paired t-tests indicated that the difference in performance between the two approaches was statistically significant.
Convex and Differentiable Formulation for Inverse Problems in Hilbert Spaces with Nonlinear Clipping Effects Open Access
Natsuki UENO Shoichi KOYAMA Hiroshi SARUWATARI

PAPER-Nonlinear Problems

Publicized:
2021/02/25
Vol:
E104-A No:9
Page(s):
1293-1303
We propose a useful formulation for ill-posed inverse problems in Hilbert spaces with nonlinear clipping effects. Ill-posed inverse problems are often formulated as optimization problems, and nonlinear clipping effects may cause nonconvexity or nondifferentiability of the objective functions in the case of commonly used regularized least squares. To overcome these difficulties, we present a tractable formulation in which the objective function is convex and differentiable with respect to optimization variables, on the basis of the Bregman divergence associated with the primitive function of the clipping function. By using this formulation in combination with the representer theorem, we need only to deal with a finite-dimensional, convex, and differentiable optimization problem, which can be solved by well-established algorithms. We also show two practical examples of inverse problems where our theory can be applied, estimation of band-limited signals and time-harmonic acoustic fields, and evaluate the validity of our theory by numerical simulations.
Real-Time Detection of Global Cyberthreat Based on Darknet by Estimating Anomalous Synchronization Using Graphical Lasso
Chansu HAN Jumpei SHIMAMURA Takeshi TAKAHASHI Daisuke INOUE Jun'ichi TAKEUCHI Koji NAKAO

PAPER-Information Network

Publicized:
2020/06/25
Vol:
E103-D No:10
Page(s):
2113-2124
With the rapid evolution and increase of cyberthreats in recent years, it is necessary to detect and understand them promptly and precisely to reduce their impact. A darknet, which is an unused IP address space, has a high signal-to-noise ratio, so it is easier to understand the global tendency of malicious traffic in cyberspace than other observation networks. In this paper, we aim to capture global cyberthreats in real time. Since multiple hosts infected with similar malware tend to behave similarly, we propose a system that estimates the degree of synchronization from the patterns of packet transmission time among the source hosts observed in unit time of the darknet and detects anomalies in real time. In our evaluation, we run a proof-of-concept implementation of the proposed engine to demonstrate its feasibility and effectiveness, and we detect cyberthreats with an accuracy of 97.14%. This work is the first practical trial that detects cyberthreats from in-the-wild darknet traffic regardless of new types and variants in real time, and it quantitatively evaluates the result.
Sphere Packing Bound and Gilbert-Varshamov Bound for b-Symbol Read Channels
Seunghoan SONG Toru FUJIWARA

PAPER-Coding Theory

Vol:
E101-A No:11
Page(s):
1915-1924
A b-symbol read channel is a channel model in which b consecutive symbols are read at once. As special cases, it includes a symbol-pair read channel (b=2) and an ordinary channel (b=1). The sphere packing bound, the Gilbert-Varshamov (G-V) bound, and the asymptotic G-V bound for symbol-pair read channels are known for b=1 and 2. In this paper, we derive these three bounds for b-symbol read channels with b≥1. From analysis of the proposed G-V bound, it is confirmed that the achievable rate is higher for b-symbol read channels compared with those for ordinary channels based on the Hamming metric. Furthermore, it is shown that the optimal value of b that maximizes the asymptotic G-V bound is finitely determined depending on the fractional minimum distance.
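
For orientation, the classical b=1 (ordinary channel, Hamming metric) versions of the two finite-length bounds can be computed directly. The sketch below covers only that textbook special case and does not reproduce the b-symbol bounds derived in the paper; all function names are placeholders chosen here.

```python
from math import comb


def hamming_ball_volume(n: int, radius: int, q: int = 2) -> int:
    """Number of q-ary words of length n within Hamming distance `radius` of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(radius + 1))


def sphere_packing_upper_bound(n: int, d: int, q: int = 2) -> int:
    """Upper bound on the size of a q-ary code of length n and minimum distance d."""
    t = (d - 1) // 2  # number of correctable errors
    return q ** n // hamming_ball_volume(n, t, q)


def gilbert_varshamov_lower_bound(n: int, d: int, q: int = 2) -> int:
    """Lower bound: a code with at least this many codewords is guaranteed to exist."""
    volume = hamming_ball_volume(n, d - 1, q)
    return -((-(q ** n)) // volume)  # ceiling of q^n / volume


print(sphere_packing_upper_bound(7, 3))     # 16
print(gilbert_varshamov_lower_bound(7, 3))  # 5
```

For n=7, d=3 over the binary alphabet, the sphere packing bound gives 16, which the [7,4] Hamming code meets, while the G-V bound guarantees at least 5 codewords.
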
TCP Network Coding with Adapting Parameters for Bursty and Time-Varying Loss
Nguyen VIET HA Kazumi KUMAZOE Masato TSURU

PAPER-Fundamental Theories for Communications

Publicized:
2017/07/27
Vol:
E101-B No:2
Page(s):
476-488
The Transmission Control Protocol (TCP) with Network Coding (TCP/NC) was proposed to introduce packet loss recovery ability at the sink without TCP retransmission, which is realized by proactively sending redundant combination packets encoded at the source. Although TCP/NC is expected to mitigate the goodput degradation of TCP over lossy networks, the original TCP/NC does not work well in burst loss and time-varying channels. No apparent scheme was provided to decide and change the network coding-related parameters (NC parameters) to suit the diverse and changeable loss conditions. In this paper, a solution to support TCP/NC in adapting to these conditions is proposed, called TCP/NC with Loss Rate and Loss Burstiness Estimation (TCP/NCwLRLBE). Both the packet loss rate and burstiness are estimated by observing transmitted packets to adapt to burst loss channels. Appropriate NC parameters are calculated from the estimated probability of successful recoverable transmission based on a mathematical model of packet losses. Moreover, a new mechanism for coding window handling is developed to update NC parameters in the coding system promptly. The proposed scheme is implemented and validated in Network Simulator 3 with two different types of burst loss models. The results suggest the potential of TCP/NCwLRLBE to mitigate the TCP goodput degradation in both the random loss and burst loss channels with time-varying conditions.
Theoretical Analyses on 2-Norm-Based Multiple Kernel Regressors
Akira TANAKA Hideyuki IMAI

PAPER-Neural Networks and Bioengineering

Vol:
E100-A No:3
Page(s):
877-887
The solution of the standard 2-norm-based multiple kernel regression problem and the theoretical limit of the considered model space are discussed in this paper. We prove that 1) The solution of the 2-norm-based multiple kernel regressor constructed by a given training data set does not generally attain the theoretical limit of the considered model space in terms of the generalization errors, even if the training data set is noise-free, 2) The solution of the 2-norm-based multiple kernel regressor is identical to the solution of the single kernel regressor under a noise free setting, in which the adopted single kernel is the sum of the same kernels used in the multiple kernel regressor; and it is also true for a noisy setting with the 2-norm-based regularizer. The first result motivates us to develop a novel framework for the multiple kernel regression problems which yields a better solution close to the theoretical limit, and the second result implies that it is enough to use the single kernel regressors with the sum of given multiple kernels instead of the multiple kernel regressors as long as the 2-norm based criterion is used.
A Novel Lambertian-RBFNN for Office Light Modeling
Wa SI Xun PAN Harutoshi OGAI Katsumi HIRAI

PAPER-Fundamentals of Information Systems

Publicized:
2016/04/18
Vol:
E99-D No:7
Page(s):
1742-1752
In lighting control systems, accurate data of artificial light (lighting coefficients) are essential for the illumination control accuracy and energy saving efficiency. This research proposes a novel Lambertian-Radial Basis Function Neural Network (L-RBFNN) to realize modeling of both lighting coefficients and the illumination environment for an office. By adding a Lambertian neuron to represent the rough theoretical illuminance distribution of the lamp and modifying RBF neurons to regulate the distribution shape, L-RBFNN successfully solves the instability problem of conventional RBFNN and achieves higher modeling accuracy. Simulations of both single-light modeling and multiple-light modeling are made and compared with other methods such as Lambertian function, cubic spline interpolation and conventional RBFNN. The results prove that: 1) L-RBFNN is a successful modeling method for artificial light with imperceptible modeling error; 2) Compared with other existing methods, L-RBFNN can provide better performance with lower modeling error; 3) The number of training sensors can be reduced to equal the number of lamps, thus making the modeling method easier to apply in real-world lighting systems.
A New Class of Hilbert Pairs of Almost Symmetric Orthogonal Wavelet Bases
Daiwei WANG Xi ZHANG

PAPER-Digital Signal Processing

Vol:
E99-A No:5
Page(s):
884-891
This paper proposes a new class of Hilbert pairs of almost symmetric orthogonal wavelet bases. For two wavelet bases to form a Hilbert pair, the corresponding scaling lowpass filters are required to satisfy the half-sample delay condition. In this paper, we simultaneously design two scaling lowpass filters with arbitrarily specified flat group delay responses at ω=0, which satisfy the half-sample delay condition. In addition to specifying the number of vanishing moments, we apply the Remez exchange algorithm to minimize the difference of frequency responses between two scaling lowpass filters, in order to improve the analyticity of complex wavelets. The equiripple behavior of the error function can be obtained through a few iterations. Therefore, the resulting complex wavelets are orthogonal and almost symmetric, and have improved analyticity. Finally, some examples are presented to demonstrate the effectiveness of the proposed design method.
Ensemble and Multiple Kernel Regressors: Which Is Better?
Akira TANAKA Hirofumi TAKEBAYASHI Ichigaku TAKIGAWA Hideyuki IMAI Mineichi KUDO

PAPER-Neural Networks and Bioengineering

Vol:
E98-A No:11
Page(s):
2315-2324
For the last few decades, learning with multiple kernels, represented by the ensemble kernel regressor and the multiple kernel regressor, has attracted much attention in the field of kernel-based machine learning. Although their efficacy was investigated numerically in many works, their theoretical grounds have not been investigated sufficiently, since we do not have a theoretical framework to evaluate them. In this paper, we introduce a unified framework for evaluating kernel regressors with multiple kernels. On the basis of the framework, we analyze the generalization errors of the ensemble kernel regressor and the multiple kernel regressor, and give a sufficient condition for the ensemble kernel regressor to outperform the multiple kernel regressor in terms of the generalization error in the noise-free case. We also show that each kernel regressor can be better than the other without the sufficient condition by giving examples, which supports the importance of the sufficient condition.
Far-Field Pattern Reconstruction Using an Iterative Hilbert Transform
Fan FAN Tapan K. SARKAR Changwoo PARK Jinhwan KOH

PAPER-Antennas and Propagation

Vol:
E98-B No:6
Page(s):
1032-1039
A new approach to reconstructing antenna far-field patterns from the missing part of the pattern is presented in this paper. The antenna far-field pattern can be reconstructed by utilizing the iterative Hilbert transform, which is based on the relationship between the real and imaginary part of the Hilbert transform. A moving average filter is used to reduce the errors in the restored signal as well as the computation load. Under the constraint of the causality of the current source in space, we could successfully reconstruct the data. Several examples dealing with line source antennas and antenna arrays are simulated to illustrate the applicability of this approach.
Hilbert Transform Based Time-of-Flight Estimation of Multi-Echo Ultrasonic Signals and Its Resolution Analysis
Zhenkun LU Cui YANG Gang WEI

LETTER-Ultrasonics

Vol:
E97-A No:9
Page(s):
1962-1965
In non-destructive testing (NDT), the ultrasonic echo is often an overlapping multi-echo signal with noise. However, the accurate estimation of ultrasonic time-of-flight (TOF) is essential in NDT. In this letter, a novel method for TOF estimation through the envelope is proposed. Firstly, the wavelet denoising technique is applied to the noisy echo to improve the estimation accuracy. Then, the Hilbert transform (HT) is used in ultrasonic signal processing in order to extract the envelope of the echo. Finally, the TOF of each component of the multi-echo signal is estimated from the local maxima of the signal envelope. Furthermore, the time resolution of time-overlapping ultrasonic echoes is discussed. Numerical simulation has been carried out to show the performance of the proposed method in estimating the TOF of ultrasonic signals.
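
The envelope step described above can be illustrated with a minimal, noise-free sketch: form the analytic signal by zeroing negative frequencies, take its magnitude as the envelope, and read the TOFs off the local maxima. This is not the authors' implementation. The wavelet denoising stage is omitted, a naive O(N^2) DFT stands in for an FFT library, and every name and parameter here (`synthetic_echoes`, the 0.2 cycles/sample carrier, the Gaussian pulse width) is an assumption made for illustration.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo signal."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2j * math.pi * ((k * t) % n) / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def envelope(signal):
    """Magnitude of the analytic signal obtained via the Hilbert transform."""
    n = len(signal)
    spectrum = dft(signal)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero negative ones.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        gain[k] = 2.0
    analytic = dft([g * s for g, s in zip(gain, spectrum)], inverse=True)
    return [abs(z) for z in analytic]


def envelope_peaks(env, threshold):
    """Indices of local maxima of the envelope that reach the threshold."""
    return [i for i in range(1, len(env) - 1)
            if env[i] >= threshold and env[i - 1] < env[i] >= env[i + 1]]


def synthetic_echoes(n, arrivals, carrier=0.2, width=10.0):
    """Hypothetical test signal: Gaussian-windowed cosine bursts at the given sample delays."""
    return [sum(amp * math.exp(-((t - delay) / width) ** 2)
                * math.cos(2 * math.pi * carrier * (t - delay))
                for delay, amp in arrivals)
            for t in range(n)]
```

For example, `envelope_peaks(envelope(synthetic_echoes(256, [(80, 1.0), (150, 0.6)])), 0.3)` recovers the two arrival samples in this well-separated case. Strongly overlapping echoes are exactly where the resolution analysis mentioned in the abstract becomes relevant.
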
Analytic and Numerical Modeling of Normal Penetration of Early-Time (E1) High Altitude Electromagnetic Pulse (HEMP) into Dispersive Underground Multilayer Structures
Hee-Do KANG Il-Young OH Tong-Ho CHUNG Jong-Gwan YOOK

PAPER-Antennas and Propagation

Vol:
E96-B No:10
Page(s):
2625-2632
In this paper, the penetration phenomenon of an early-time (E1) high altitude electromagnetic pulse (HEMP) into dispersive underground multilayer structures is analyzed using electromagnetic modeling of wave propagation in frequency dependent lossy media. The electromagnetic pulse is dealt with in the power spectrum ranging from 100kHz to the 100MHz band, considering the fact that the power spectrum of the E1 HEMP rapidly decreases 30dB below its maximum value beyond the 100MHz band. In addition, the propagation channel consisting of several dielectric materials is modeled with the dispersive relative permittivity of each medium. Based on source and channel models, the propagation phenomenon is analyzed in the frequency and time domains. The attenuation levels at a 100m underground point are observed to be about 15 and 20dB at 100kHz and 1MHz, respectively, and the peak level of the penetrating electric field is found to be 5.6kV/m. To ensure the causality of the result, we utilize the Hilbert transform.
On Kernel Parameter Selection in Hilbert-Schmidt Independence Criterion
Masashi SUGIYAMA Makoto YAMADA

LETTER-Artificial Intelligence, Data Mining

Vol:
E95-D No:10
Page(s):
2564-2567
The Hilbert-Schmidt independence criterion (HSIC) is a kernel-based statistical independence measure that can be computed very efficiently. However, it requires us to determine the kernel parameters heuristically because no objective model selection method is available. Least-squares mutual information (LSMI) is another statistical independence measure that is based on direct density-ratio estimation. Although LSMI is computationally more expensive than HSIC, LSMI is equipped with cross-validation, and thus the kernel parameter can be determined objectively. In this paper, we show that HSIC can actually be regarded as an approximation to LSMI, which allows us to utilize cross-validation of LSMI for determining kernel parameters in HSIC. Consequently, both computational efficiency and cross-validation can be achieved.
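
To make the role of the kernel parameters concrete, here is a minimal sketch of the commonly used biased empirical HSIC estimate, tr(KHLH)/n^2, with Gaussian kernels on scalar samples. It does not implement LSMI or the cross-validation procedure from the letter. The 1/n^2 normalization (some references use 1/(n-1)^2), the function names, and the default widths are assumptions made here.

```python
import math


def gaussian_gram(samples, sigma):
    """Gram matrix of the Gaussian kernel exp(-(a - b)^2 / (2 sigma^2)) on scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in samples]
            for a in samples]


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: tr(K H L H) / n^2 for paired samples xs, ys."""
    n = len(xs)
    centered_k = center(gaussian_gram(xs, sigma_x))
    gram_l = gaussian_gram(ys, sigma_y)
    trace = sum(centered_k[i][j] * gram_l[j][i] for i in range(n) for j in range(n))
    return trace / (n * n)
```

The arguments `sigma_x` and `sigma_y` are the kernel parameters that, per the abstract, HSIC alone offers no objective way to choose. The value returned changes with them, which is why a model selection criterion is needed.
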
Movement-Imagery Brain-Computer Interface: EEG Classification of Beta Rhythm Synchronization Based on Cumulative Distribution Function
Teruyoshi SASAYAMA Tetsuo KOBAYASHI

PAPER-Human-computer Interaction

Vol:
E94-D No:12
Page(s):
2479-2486
We developed a novel movement-imagery-based brain-computer interface (BCI) for untrained subjects without employing machine learning techniques. The development of BCI consisted of several steps. First, spline Laplacian analysis was performed. Next, time-frequency analysis was applied to determine the optimal frequency range and latencies of the electroencephalograms (EEGs). Finally, trials were classified as right or left based on β-band event-related synchronization using the cumulative distribution function of pretrigger EEG noise. To test the performance of the BCI, EEGs during the execution and imagination of right/left wrist-bending movements were measured from 63 locations over the entire scalp using eight healthy subjects. The highest classification accuracies were 84.4% and 77.8% for real movements and their imageries, respectively. The accuracy is significantly higher than that of previously reported machine-learning-based BCIs in the movement imagery task (paired t-test, p < 0.05). It has also been demonstrated that the highest accuracy was achieved even though subjects had never participated in movement imageries.
Evaluation of GPU-Based Empirical Mode Decomposition for Off-Line Analysis
Pulung WASKITO Shinobu MIWA Yasue MITSUKURA Hironori NAKAJO

PAPER

Vol:
E94-D No:12
Page(s):
2328-2337
In off-line analysis, the demand for high precision signal processing has introduced a new method called Empirical Mode Decomposition (EMD), which is used for analyzing a complex set of data. Unfortunately, EMD is highly compute-intensive. In this paper, we show a parallel implementation of Empirical Mode Decomposition on a GPU. We propose the use of a “partial+total” switching method to increase performance while keeping the precision. We also focus on reducing the computation complexity in the above method from O(N) on a single CPU to O(N/P log (N)) on a GPU. Evaluation results show our single GPU implementation using Tesla C2050 (Fermi architecture) achieves a 29.9x speedup partially, and an 11.8x speedup totally when compared to a single Intel dual core CPU.
Hilbert Scan Based Bag-of-Features for Image Retrieval
Pengyi HAO Sei-ichiro KAMATA

PAPER-Image Processing and Video Processing

Vol:
E94-D No:6
Page(s):
1260-1268
Generally, two problems of bag-of-features in image retrieval are still considered unsolved: one is that spatial information about descriptors is not employed well, which affects the accuracy of retrieval; the other is the trade-off between vocabulary size and precision, which determines the storage and retrieval performance. In this paper, we propose a novel approach called Hilbert scan based bag-of-features (HS-BoF) for image retrieval. Firstly, Hilbert scan based tree representation (HSBT) is studied, which is built based on the local descriptors while spatial relationships are added into the nodes by a novel grouping rule, resulting in a tree structure for each image. Further, we give two ways of codebook production based on HSBT: multi-layer codebook and multi-size codebook. Owing to the properties of Hilbert scanning and the merits of our grouping method, sub-regions of the tree are not only flexible to the distribution of local patches but also have hierarchical relations. Extensive experiments on Caltech-256, 13-scene and 1 million ImageNet images show that HS-BoF obtains higher accuracy with less memory usage.
