Uraiwan BUATOOM Waree KONGPRAWECHNON Thanaruk THEERAMUNKONG
The outcome of document clustering depends on the scheme used to assign a weight to each term in a document. While recent works have tried to use class-related distributions to enhance discrimination ability, it is worth exploring whether a deviation approach or an entropy approach is more effective. This paper presents a comparison between deviation-based distribution and entropy-based distribution as constraints in term weighting. In addition, their potential combinations are investigated to find optimal solutions in guiding the clustering process. In the experiments, the seeded k-means method is used for clustering, and the performances of the deviation-based, entropy-based, and hybrid approaches are analyzed using two English text datasets and one Thai text dataset. The results show that the deviation-based distribution outperforms the entropy-based distribution, and that a suitable combination of these distributions increases the clustering accuracy by 10%.
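As a rough illustration of the two kinds of class-related distribution compared in this abstract, the sketch below computes, for a single term, the standard deviation and the Shannon entropy of its frequency across classes. The function names and the toy counts are assumptions made for illustration; the abstract does not give the paper's actual weighting formulas.

```python
import math

def class_deviation(freqs):
    """Population standard deviation of a term's frequency across classes."""
    mean = sum(freqs) / len(freqs)
    return math.sqrt(sum((f - mean) ** 2 for f in freqs) / len(freqs))

def class_entropy(freqs):
    """Shannon entropy (in bits) of a term's normalized class distribution."""
    total = sum(freqs)
    if total == 0:
        return 0.0
    probs = [f / total for f in freqs if f > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical counts of one term in three classes.
focused = [9, 0, 0]  # term concentrated in a single class
spread = [3, 3, 3]   # term spread evenly over all classes
```

A term concentrated in one class has high deviation and zero entropy, while an evenly spread term has zero deviation and maximal entropy, which is why either quantity could plausibly serve as a discrimination signal.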
Double modular redundancy (DMR) executes operations twice and detects soft errors by comparing the operation results. An error is corrected by executing the necessary operations again. For the DMR design of conditional processing, a method is proposed that makes the secondary executions of the duplicated operations dependent on the primary execution of the condition operation, thereby widening the schedule solution space and allowing better results to be derived. Energy minimization with the proposed method is formulated as ILP models, and the optimum solution is obtained using an ILP solver.
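The basic DMR idea described above (execute twice, compare, re-execute on mismatch) can be sketched in software as follows. This is only an illustration of the detect-and-retry principle; the helper names and the fault-injection closure are hypothetical, and the sketch does not model the paper's scheduling or ILP formulation.

```python
def run_dmr(operation, *args, max_retries=3):
    """Run `operation` twice; if the results differ, execute both again."""
    for _ in range(max_retries + 1):
        primary = operation(*args)
        secondary = operation(*args)
        if primary == secondary:
            return primary
    raise RuntimeError("results still disagree after retries")

def make_flaky_add(fault_on_call):
    """Hypothetical adder that flips one result bit on a chosen call."""
    calls = {"n": 0}

    def add(a, b):
        calls["n"] += 1
        result = a + b
        if calls["n"] == fault_on_call:
            result ^= 1  # simulate a soft error as a single bit flip
        return result

    return add, calls

add, calls = make_flaky_add(fault_on_call=1)
value = run_dmr(add, 2, 3)
```

Here the first execution is corrupted, the comparison detects the mismatch, and the second pair of executions agrees.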
Van-Hai VU Quang-Phuoc NGUYEN Kiem-Hieu NGUYEN Joon-Choul SHIN Cheol-Young OCK
Since deep learning was introduced, a series of achievements has been published in the field of automatic machine translation (MT). However, Korean-Vietnamese MT systems face many challenges because of a lack of data, multiple meanings of individual words, and grammatical diversity that depends on context. Therefore, the quality of Korean-Vietnamese MT systems is still sub-optimal. This paper discusses a method for applying Named Entity Recognition (NER) and Part-of-Speech (POS) tagging to Vietnamese sentences to improve the performance of Korean-Vietnamese MT systems. In terms of implementation, we used a tool to tag NER and POS in Vietnamese sentences. In addition, we had access to a Korean-Vietnamese parallel corpus with more than 450K paired sentences from our previous research paper. The experimental results indicate that tagging NER and POS in Vietnamese sentences can improve the quality of Korean-Vietnamese Neural MT (NMT) in terms of the Bi-Lingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) scores. On average, our MT system improved by 1.21 BLEU points or 2.33 TER points after applying both NER and POS tagging to the Vietnamese corpus. Due to the structural features of the languages, the MT systems in the Korean-to-Vietnamese direction always give better BLEU and TER results than the MT systems in the reverse direction.
Rachasak SOMYANONTHANAKUL Thanaruk THEERAMUNKONG
Objective interestingness measures play a vital role in association rule mining of large-scale databases because they are used for extracting, filtering, and ranking patterns. Several measures have been proposed in the past, but their similarities and relations have not been sufficiently explored. This work investigates sixty-one objective interestingness measures on the pattern A → B to analyze their similarity and dissimilarity as well as their relationships. Three-probability patterns, P(A), P(B), and P(AB), are enumerated on both linear and exponential scales, and each measure's value under those conditions is calculated, forming synthetic data for investigation. The behavior of each measure is explored by pairwise comparison based on these three-probability patterns. The relationships among the sixty-one interestingness measures are characterized with correlation analysis and association rule mining. In the experiment, the relationships are summarized using a heat map and the mined association rules. As a result, an appropriate interestingness measure can be selected using the generated heat map and association rules.
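The procedure outlined above (enumerate P(A), P(B), P(AB), evaluate each measure, then compare measures pairwise) can be sketched for two well-known measures, confidence and lift. The grid resolution, the helper names, and the use of Pearson correlation are assumptions for illustration; the abstract does not list the sixty-one measures or the exact correlation statistic.

```python
import math

def confidence(pa, pb, pab):
    return pab / pa

def lift(pa, pb, pab):
    return pab / (pa * pb)

def enumerate_patterns(steps):
    """Feasible (P(A), P(B), P(AB)) triples on a linear grid with `steps` levels."""
    patterns = []
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            for k in range(1, steps + 1):
                # P(AB) is bounded by max(0, P(A)+P(B)-1) and min(P(A), P(B)).
                if max(0, i + j - steps) <= k <= min(i, j):
                    patterns.append((i / steps, j / steps, k / steps))
    return patterns

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

patterns = enumerate_patterns(4)
conf_values = [confidence(*p) for p in patterns]
lift_values = [lift(*p) for p in patterns]
similarity = pearson(conf_values, lift_values)
```

Repeating the last step for every pair of measures would yield the kind of correlation matrix that a heat map visualizes.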
Taketoshi TANAKA Norikazu ITO Shinya TAKADO Masaaki KUZUHARA Ken NAKAHARA
TCAD simulation was performed to investigate the material properties of an AlGaN/GaN structure in Deep Acceptor (DA)-rich and Deep Donor (DD)-rich GaN cases. DD-rich semi-insulating GaN generated a positively charged region that prevented the electron concentration in the 2DEG from decreasing, while its DA-rich counterpart caused electron depletion, which was the origin of the current collapse in AlGaN/GaN HFETs. These simulation results were verified experimentally using three nitride samples including buffer-GaN layers with carbon concentrations ([C]) of 5×10^17, 5×10^18, and 4×10^19 cm^-3. DD-rich behaviors were observed for the sample with [C]=4×10^19 cm^-3, and a DD energy level of E_DD=0.6 eV was estimated from the Arrhenius plot of the temperature-dependent I_DS. This E_DD value coincided with the previously estimated E_DD. The backgate experiments revealed that the DD-rich semi-insulating GaN suppressed both current collapse and buffer leakage, thus providing characteristics desirable for practical usage.
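Estimating an energy level from an Arrhenius plot amounts to fitting a straight line to ln(I) versus 1/T. The sketch below assumes a thermally activated current of the form I = A·exp(-E/(k_B·T)); the prefactor, the temperature points, and the data are synthetic assumptions, not values from the paper.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def activation_energy_ev(temperatures_k, currents):
    """Least-squares slope of ln(I) versus 1/T, converted to an energy in eV."""
    xs = [1.0 / t for t in temperatures_k]
    ys = [math.log(i) for i in currents]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope * BOLTZMANN_EV_PER_K

# Synthetic currents generated from an assumed 0.6 eV level (not measured data).
temps = [300.0, 325.0, 350.0, 375.0, 400.0]
synthetic = [1e-3 * math.exp(-0.6 / (BOLTZMANN_EV_PER_K * t)) for t in temps]
estimated = activation_energy_ev(temps, synthetic)
```

With noise-free synthetic data the fit recovers the assumed level; with measured data the scatter around the line indicates how well a single activation energy describes the current.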
Tachanun KANGWANTRAKOOL Kobkrit VIRIYAYUDHAKORN Thanaruk THEERAMUNKONG
Most existing methods of effort estimation in software development are manual, labor-intensive, and subjective, resulting in overestimation that causes failed bids and underestimation that causes financial loss. This paper investigates the effectiveness of sequence models in estimating development effort, in man-months, from software project data. Four architectures are compared in terms of man-month difference: (1) average word-vector with Multi-layer Perceptron (MLP), (2) average word-vector with Support Vector Regression (SVR), (3) Gated Recurrent Unit (GRU) sequence model, and (4) Long Short-Term Memory (LSTM) sequence model. The approach is evaluated using two datasets, ISEM (1,573 English software project descriptions) and ISBSG (data on 9,100 software projects), where the former is raw text and the latter is a structured data table describing the characteristics of each software project. The LSTM sequence model achieves the lowest mean absolute error on ISEM (0.705 man-months) and the second lowest on ISBSG (14.077 man-months). The MLP model achieves the lowest mean absolute error on ISBSG, 14.069 man-months.
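The models above are compared by mean absolute error in man-months. A minimal sketch of that metric follows; the effort values are made up for illustration and are not drawn from ISEM or ISBSG.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted effort."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical efforts in man-months for three projects.
actual_effort = [3.0, 5.0, 10.0]
predicted_effort = [2.5, 6.0, 9.0]
mae = mean_absolute_error(actual_effort, predicted_effort)
```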
This paper presents a compromising strategy based on constraint relaxation for automated negotiating agents in the nonlinear utility domain. Automated negotiating agents have been studied widely and are one of the key technologies for a future society in which multiple heterogeneous agents act collaboratively and competitively in order to help humans perform daily activities. A pressing issue is that most of the proposed negotiating agents use an ad-hoc compromising process, in which they simply lower a threshold until they are forced to accept their opponents' offers. Because the threshold is merely lowered and the agent accepts an offer once its value exceeds the threshold, it is very difficult to show how and what the agent conceded, even after an agreement has been reached. To address this issue, we describe an explainable concession process using a constraint relaxation process. In this process, an agent changes its belief by relaxing constraints, i.e., removing constraints, so that it can accept the opponent's offer. We also propose three types of compromising strategies. Experimental results demonstrate that these strategies are efficient.
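One way to picture concession by constraint relaxation is shown below: the agent keeps the constraints an offer satisfies and removes the ones it violates, so the removed set explains exactly what was conceded. The constraint representation (name, predicate, value), the function name, and the example offer are all assumptions for illustration; the abstract does not specify the paper's utility model or its three strategies.

```python
def relax_constraints(constraints, offer):
    """Split constraints into those the offer satisfies and those removed."""
    kept, removed = [], []
    for name, predicate, value in constraints:
        target = kept if predicate(offer) else removed
        target.append((name, value))
    return kept, removed

# Hypothetical constraints: (label, test on the offer, utility value).
constraints = [
    ("price <= 100", lambda o: o["price"] <= 100, 30),
    ("days <= 7", lambda o: o["days"] <= 7, 20),
]
offer = {"price": 120, "days": 5}

kept, removed = relax_constraints(constraints, offer)
conceded_utility = sum(value for _, value in removed)
```

Unlike a bare threshold, the `removed` list names the specific constraint that was given up and how much utility it carried.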
Shi-Chei HUNG Da-Chun WU Wen-Hsiang TSAI
The two issues of art image creation and data hiding are integrated into one and solved by a single approach in this study. An automatic method for generating a new type of computer art, called stained glass image, which imitates the stained-glass window picture, is proposed. The method is based on the use of a tree structure for region growing to construct the art image. Also proposed is a data hiding method which utilizes a general feature of the tree structure, namely, the number of tree nodes, to encode the data to be embedded. The method can be modified for use in three information protection applications, namely, covert communication, watermarking, and image authentication. Besides the artistic stego-image content, which may distract the hacker's attention from the hidden data, data security is also considered by randomizing both the input data and the seed locations for region growing, yielding a stego-image which is robust against the hacker's attacks. Good experimental results proving the feasibility of the proposed methods are also included.
Yoshiki TAKAI Mamoru FUKUCHI Chihiro MATSUI Reika KINOSHITA Ken TAKEUCHI
This paper analyzes the optimal SSD configuration including emerging non-volatile memories such as quadruple-level cell (QLC) NAND flash memory [1] and storage class memories (SCMs). First, the SSD performance and SSD endurance lifetime of hybrid SSDs are evaluated in four configurations: 1) single-level cell (SLC)/QLC NAND flash, 2) SCM/QLC NAND flash, 3) SCM/triple-level cell (TLC)/QLC NAND flash and 4) SCM/TLC NAND flash. Furthermore, these four configurations are compared under a limited cost. In the case of cold workloads or a high total SSD cost assumption, the SCM/TLC NAND flash hybrid configuration is recommended in terms of both SSD performance and endurance lifetime. For hot workloads with a low total SSD cost assumption, however, the SLC/QLC NAND flash hybrid configuration is recommended with emphasis on SSD endurance lifetime. Under the same conditions, the SCM/TLC/QLC NAND flash tri-hybrid is the best configuration in terms of SSD performance considering cost. In particular, for prxy_0 (write-hot workload), the SCM/TLC/QLC NAND flash tri-hybrid achieves 67% higher IOPS/cost than the SCM/TLC NAND flash hybrid. Moreover, the configurations with the highest IOPS/cost for each workload and cost limit are selected and analyzed with various types of SCMs. For all cases except prxy_1 with a high total SSD cost assumption, middle-end SCM (write latency: 1 µs, read latency: 1 µs) is recommended in terms of performance considering cost. However, for prxy_1 (read-hot workload) with a high total SSD cost assumption, high-end SCM (write latency: 100 ns, read latency: 100 ns) achieves the best performance.
Hikari KOREMURA Haruhiko KANEKO
This paper presents a successive cancellation (SC) decoding of polar codes modified for insertion/deletion/substitution (IDS) error channels, in which insertions and deletions are described by drift values. The recursive calculation of the original SC decoding is modified to include the drift values as stochastic variables. The computational complexity of the modified SC decoding is O(D^3) with respect to the maximum drift value D, and O(N log N) with respect to the code length N. The symmetric capacity of polar bit channel is estimated by computer simulations, and frozen bits are determined according to the estimated symmetric capacity. Simulation results show that the decoded error rate of polar code with the modified SC list decoding is lower than that of existing IDS error correction codes, such as marker-based code and spatially-coupled code.
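To make the notion of a drift value concrete, the sketch below simulates a simple IDS channel and records the drift (insertions minus deletions so far) before each transmitted bit. The channel model, parameter names, and the way the drift is bounded are assumptions for illustration; the abstract does not define the paper's exact channel, and this is not the decoder itself.

```python
import random

def ids_channel(bits, p_ins, p_del, p_sub, max_drift, rng):
    """Pass bits through an insertion/deletion/substitution channel.

    Returns the received bits and the drift observed before each input bit,
    followed by the final drift. Drift is kept within [-max_drift, max_drift].
    """
    received, drifts = [], []
    drift = 0
    for bit in bits:
        drifts.append(drift)
        # Random bits may be inserted ahead of the current input bit.
        while drift < max_drift and rng.random() < p_ins:
            received.append(rng.randint(0, 1))
            drift += 1
        # The input bit may be deleted.
        if drift > -max_drift and rng.random() < p_del:
            drift -= 1
            continue
        # Otherwise it is transmitted, possibly flipped.
        if rng.random() < p_sub:
            bit ^= 1
        received.append(bit)
    drifts.append(drift)
    return received, drifts

rng = random.Random(0)
sent = [rng.randint(0, 1) for _ in range(64)]
received, drifts = ids_channel(sent, 0.05, 0.05, 0.02, max_drift=3, rng=rng)
```

The final drift always equals the difference between the received and transmitted lengths, which is the quantity a drift-aware decoder has to track as a hidden variable.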
Picross 3D is a popular single-player puzzle video game for the Nintendo DS. It presents a rectangular parallelepiped (i.e., rectangular box) made of unit cubes, some of which must be removed to construct an object in three dimensions. Each row or column has at most one integer on it, and the integer indicates how many cubes in the corresponding 1D slice remain when the object is complete. Kusano et al. showed that Picross 3D is NP-complete, and Kimura et al. showed that the counting version, the another solution problem, and the fewest clues problem of Picross 3D are #P-complete, NP-complete, and Σ_2^P-complete, respectively, where those results are shown for the restricted input in which the rectangular parallelepiped is of height four. On the other hand, Igarashi showed that Picross 3D is NP-complete even if the height of the input rectangular parallelepiped is one. Extending the result by Igarashi, we show in this paper that the counting version, the another solution problem, and the fewest clues problem of Picross 3D are #P-complete, NP-complete, and Σ_2^P-complete, respectively, even if the height of the input rectangular parallelepiped is one. Since the height of the rectangular parallelepiped of any instance of Picross 3D is at least one, our hardness results are best possible in terms of height.
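For a box of height one, the puzzle as described above reduces to a grid in which each row, each column, and each single-cell vertical slice may carry one count. The brute-force sketch below checks a candidate against such clues and counts solutions, which is the quantity the counting version asks for. The data layout (`None` for a missing clue) and function names are assumptions, and the clue semantics follow only the simplified description in this abstract.

```python
from itertools import product

def satisfies(grid, row_clues, col_clues, cell_clues):
    """Check that every given clue equals the number of remaining cubes."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        if row_clues[r] is not None and sum(grid[r]) != row_clues[r]:
            return False
    for c in range(cols):
        column_sum = sum(grid[r][c] for r in range(rows))
        if col_clues[c] is not None and column_sum != col_clues[c]:
            return False
    for r in range(rows):
        for c in range(cols):
            clue = cell_clues[r][c]
            if clue is not None and grid[r][c] != clue:
                return False
    return True

def count_solutions(rows, cols, row_clues, col_clues, cell_clues):
    """Exhaustively count grids consistent with the clues (exponential time)."""
    count = 0
    for bits in product((0, 1), repeat=rows * cols):
        grid = [list(bits[r * cols:(r + 1) * cols]) for r in range(rows)]
        if satisfies(grid, row_clues, col_clues, cell_clues):
            count += 1
    return count

no_cell_clues = [[None, None], [None, None]]
two = count_solutions(2, 2, [1, 1], [1, 1], no_cell_clues)
one = count_solutions(2, 2, [1, 1], [1, 1], [[1, None], [None, None]])
```

The exhaustive search is exponential in the grid size, which is consistent with the hardness results stated above rather than a practical solver.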
Yifan WEI Wanchun LI Yuning GUO Hongshu LIAO
This paper presents a three-dimensional (3D) spatial localization algorithm by using multiple one-dimensional uniform linear arrays (ULA). We first discuss geometric features of the angle-of-arrival (AOA) measurements of the array and present the corresponding principle of spatial cone angle intersection positioning with an angular measurement model. Then, we propose a new positioning method with an analytic study on the geometric dilution of precision (GDOP) of target location in different cases. The results of simulation show that the estimation accuracy of this method can attain the Cramér-Rao Bound (CRB) under low measurement noise.
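The geometric idea behind cone angle intersection is that a one-dimensional array measures only the angle between its own axis and the direction to the target, so each array constrains the target to a cone. A minimal sketch of that measurement model follows; the function name and coordinates are assumptions, and this is not the paper's positioning algorithm or its GDOP analysis.

```python
import math

def cone_angle(array_center, array_axis, target):
    """Angle (radians) between a linear array's axis and the target direction."""
    direction = [t - c for t, c in zip(target, array_center)]
    dot = sum(a * d for a, d in zip(array_axis, direction))
    norm_axis = math.sqrt(sum(a * a for a in array_axis))
    norm_dir = math.sqrt(sum(d * d for d in direction))
    cosine = dot / (norm_axis * norm_dir)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))

# Hypothetical array at the origin, aligned with the x-axis.
origin = (0.0, 0.0, 0.0)
x_axis = (1.0, 0.0, 0.0)
broadside = cone_angle(origin, x_axis, (0.0, 1.0, 0.0))
diagonal = cone_angle(origin, x_axis, (1.0, 1.0, 0.0))
```

Any target on the same cone around the x-axis yields the same angle, which is why several arrays with different axes are needed to pin down a 3D position.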
Kouji HIRATA Hiroshi YAMAMOTO Shohei KAMAMURA Toshiyuki OKA Yoshihiko UEMATSU Hideki MAEDA Miki YAMAMOTO
This paper proposes a traveling maintenance method based on the resource pool concept, as a new network maintenance model. For failure recovery, the proposed method utilizes permissible time that is ensured by shared resource pools. In the proposed method, even if a failure occurs in a communication facility, maintenance staff wait for occurrence of successive failures in other communication facilities during the permissible time instead of immediately tackling the failure. Then, the maintenance staff successively visit the communication facilities that have faulty devices and collectively repair them. Therefore, the proposed method can reduce the amount of time that the maintenance staff take for fault recovery. Furthermore, this paper provides a system design that optimizes the proposed traveling maintenance according to system requirements determined by the design philosophy of telecommunication networks. Through simulation experiments, we show the effectiveness of the proposed method.
In this paper, hierarchical interference coordination is proposed that suppresses both intra- and inter-cluster interference (ICI) in clustered wireless networks. Assuming transmitters and receivers are equipped with multiple antennas and complete channel state information is shared among all transmitters within the same cluster, interference alignment (IA) is performed that uses nulls to suppress intra-cluster interference. For ICI mitigation, we propose a null-steering precoder designed on the nullspace of a principal eigenvector of the correlated ICI channels, which eliminates a significant amount of ICI power given the exchange of cluster geometry between neighboring clusters. However, as ICI is negligible in systems where the distance between clusters is large enough, the proposed scheme may not improve the system performance compared with the pure IA scheme that exploits all spatial degrees of freedom (DoF) to increase multiplexing gain without ICI mitigation. For efficient management of intra- and inter-cluster interference, we analyze the decision criterion that provides an adaptive transmission mode selection between pure IA and the proposed ICI reduction in given network environments. Moreover, a low-complexity transmission mode switching algorithm is proposed for irregularly distributed networks.
Junesang LEE Hosang LEE Jungrae HA Minho KIM Sangwon YUN Yeongsik KIM Wansoo NAH
This paper presents a methodology with which to construct an equivalent simulation model of closed-loop bulk current injection (BCI) testing for a vehicle component. The proposed model comprehensively takes the transfer impedance of the test configuration into account. The methodology used in this paper relies on both circuit modeling and EM modeling. The BCI test probes are modeled as equivalent circuits, and the frequency-dependent loss characteristics of the probe's ferrite are derived using a PSO algorithm. The measurement environments involving the harness cable, load simulator, DUT, and ground plane are designed through three-dimensional EM simulation. The developed circuit model and EM model are completely integrated in a commercial EM simulation tool, EMC Studio of EMCoS Ltd. The simulated results are validated through comparison with measurements. The simulated and measured results are consistent in the range of 1 MHz to 400 MHz.
Kentaro KOJIMA Kodai YAMADA Jun FURUTA Kazutoshi KOBAYASHI
Cross sections that cause single event upsets by heavy ions are sensitive to the doping concentration in the source and drain regions and to the structure of the raised source and drain regions, especially in FDSOI. Due to the parasitic bipolar effect (PBE), radiation-hardened flip-flops with stacked transistors in FDSOI tend to have soft errors, which is consistent with measurement results by heavy-ion irradiation. Device-simulation results in this study show that the cross section is proportional to the silicon thickness of the raised layer and inversely proportional to the doping concentration in the drain. Increasing the doping concentration in the source and drain regions enhances the Auger recombination of carriers there and suppresses the parasitic bipolar effect. PBE is also suppressed by decreasing the silicon thickness of the raised layer. The Cgg-Vgs and Ids-Vgs characteristics change less than the soft error tolerance does. Soft error tolerance can therefore be effectively optimized by using these two determinants with only a small impact on transistor characteristics.
Siyang YU Kazuaki KONDO Yuichi NAKAMURA Takayuki NAKAJIMA Masatake DANTSUJI
This article introduces our investigation of learning state estimation in e-learning under the condition that visual observation and recording of a learner's behaviors are possible. In this research, we examined methods of adaptation to a new learner for whom only a small amount of ground-truth data can be obtained.
Ping DU Akihiro NAKAO Satoshi MIKI Makoto INOUE
In the coming smart-home era, more and more household electrical appliances are generating sensor data and transmitting them over home networks, which are often connected to the Internet through Point-to-Point Protocol over Ethernet (PPPoE) for authentication and accounting. However, to our knowledge, no high-speed commercial PPPoE router is yet available for the home network environment. In this paper, we first introduce and evaluate our programmable platform FLARE-DPDK, which eases the programming of network functions. Then we introduce our effort to build a compact 10Gbps software FLARE PPPoE router on a commercial mini-PC. In our implementation, the control plane is implemented with Linux PPPoE software for authentication-like signaling control. The data plane is implemented over the FLARE-DPDK platform, where we get packets from physical network interfaces directly, bypassing the Linux kernel, and distribute packets to multiple CPU cores for parallel data processing. We verify our software PPPoE router in both lab and production network environments. The experimental results show that our FLARE software PPPoE router can achieve much higher throughput than a commercial PPPoE router tested in a production environment.
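One common way to spread packets over several cores while keeping each flow on a single core is to hash a per-flow key. The sketch below does this with the 16-bit PPPoE session ID and the source MAC address; the choice of key, the CRC32 hash, and the function name are assumptions for illustration and do not describe the FLARE-DPDK implementation or any DPDK API.

```python
import zlib

def select_core(session_id, src_mac, num_cores):
    """Map a flow, identified by PPPoE session ID and source MAC, to a core."""
    key = session_id.to_bytes(2, "big") + src_mac
    return zlib.crc32(key) % num_cores

# Hypothetical flow identifiers.
mac = bytes.fromhex("02004c4f4f50")
core_a = select_core(0x1234, mac, num_cores=4)
core_b = select_core(0x1234, mac, num_cores=4)
```

Because the mapping is deterministic, packets of one session are always handled by the same core, which avoids reordering within a flow.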
Ye PENG Wentao ZHAO Wei CAI Jinshu SU Biao HAN Qiang LIU
Due to its superior performance, deep learning has been widely applied to various applications, including image classification, bioinformatics, and cybersecurity. Nevertheless, research on deep learning in adversarial environments is still at a preliminary stage. Emerging adversarial learning methods, e.g., generative adversarial networks, have raised two vital questions: how secure deep learning is in the presence of adversarial examples, and how to evaluate the performance of deep learning models in adversarial environments so as to provide security advice that makes a selected deep-learning-based application system resistant to adversarial examples. To answer these questions, we use image classification as an example application scenario and propose a framework, Evaluating Deep Learning for Image Classification (EDLIC), for comprehensive quantitative analysis. Moreover, we introduce a set of evaluation metrics to measure the performance of different attack and defense techniques. We then conduct extensive experiments on the performance of deep learning for image classification under different adversarial environments to validate the scalability of EDLIC. Finally, we give some advice on the selection of deep learning models for image classification based on these comparative results.
Angular Momentum (AM) has been considered as a new dimension of wireless transmissions as well as an intrinsic property of Electro-Magnetic (EM) waves. So far, AM has been utilized as a discrete mode not only in quantum states, but also in statistical beam forming. Traditionally, the continuous value of AM is ignored and only the quantized mode number is identified. However, the recent discovery that electrons in spiral motion produce twisted radiation with AM, including Spin Angular Momentum (SAM) and Orbital Angular Momentum (OAM), proves that the continuous value of AM is available in the statistical EM wave beam. This is also revealed by the so-called fractional OAM, which is reported in optical OAM beams. Then, as a new dimension over the continuous real-number field, AM should form a spectrum, similar to the frequency spectrum commonly used in wireless signal processing. In this letter, we mathematically define the AM spectrum and show its applications in information theory analysis; it is expected to be an efficient tool for future wireless communications with AM.