
Keyword Search Results

[Keyword] language (282 hits)

Showing hits 21-40 of 282

  • Model Checking in the Presence of Schedulers Using a Domain-Specific Language for Scheduling Policies

    Nhat-Hoa TRAN  Yuki CHIBA  Toshiaki AOKI  

     
    PAPER-Software System

    Publicized: 2019/03/29
    Vol: E102-D No:7
    Page(s): 1280-1295

    A concurrent system consists of multiple processes that run simultaneously. The execution order of these processes is defined by a scheduler. In model checking techniques, the scheduling policy is closely related to the search algorithm that explores all of the system states. To ensure the correctness of the system, the scheduling policy needs to be taken into account during verification. Current approaches, which use fixed strategies, can handle only a limited range of policies and are difficult to extend to variations of the schedulers. To address these problems, we propose a method using a domain-specific language (DSL) for the succinct specification of different scheduling policies. Necessary artifacts are automatically generated from the specification to analyze the behaviors of the system. We also propose a search algorithm for exploring the state space. Based on this method, we develop a tool to verify the system together with its scheduler. Our experiments show that our method accommodates variations of the schedulers easily and verifies the systems accurately.
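
    A minimal Python sketch of the general idea, not the authors' DSL or tool: a pluggable policy function restricts which enabled processes the state-space search may run next. The priority field and helper names are illustrative assumptions.

    import collections

    def priority_policy(enabled):
        # Among the enabled processes, only those with the highest priority may run.
        if not enabled:
            return []
        top = max(p["priority"] for p in enabled)
        return [p for p in enabled if p["priority"] == top]

    def explore(initial_state, enabled_processes, step, policy):
        # Breadth-first search over the states reachable under the given policy.
        seen, frontier = {initial_state}, collections.deque([initial_state])
        while frontier:
            state = frontier.popleft()
            for proc in policy(enabled_processes(state)):
                successor = step(state, proc)
                if successor not in seen:
                    seen.add(successor)
                    frontier.append(successor)
        return seen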

  • Feature Based Domain Adaptation for Neural Network Language Models with Factorised Hidden Layers

    Michael HENTSCHEL  Marc DELCROIX  Atsunori OGAWA  Tomoharu IWATA  Tomohiro NAKATANI  

     
    PAPER-Speech and Hearing

    Publicized: 2018/12/04
    Vol: E102-D No:3
    Page(s): 598-608

    Language models are a key technology in various tasks, such as speech recognition and machine translation. They are usually used on texts covering various domains, and as a result, domain adaptation has been a long-standing challenge in language model research. With the rising popularity of neural network based language models, many adaptation methods have been proposed in recent years. These methods can be separated into two categories: model-based and feature-based adaptation methods. Compared with model-based domain adaptation, feature-based domain adaptation has the advantage that it does not require domain labels in the corpus. Most existing feature-based adaptation methods are based on bias adaptation. We propose a novel feature-based domain adaptation technique using hidden layer factorisation. This method is fundamentally different from existing methods because we use the domain features to calculate a linear combination of linear layers. These linear layers can capture domain-specific information as well as information common to different domains. In the experiments, we compare our proposed method with existing adaptation methods. The compared adaptation techniques are based on two different ideas, namely bias-based adaptation and gating of hidden units. All language models in our comparison use state-of-the-art long short-term memory based recurrent neural networks. We demonstrate the effectiveness of the proposed method with perplexity results on the well-known Penn Treebank and speech recognition results on a corpus of TED talks.
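
    A minimal numpy sketch of the core mechanism described above: the effective weight matrix of a hidden layer is a linear combination of several linear layers, with combination weights computed from the domain features. The softmax weighting and the shapes are assumptions, not the paper's exact formulation.

    import numpy as np

    def factorised_layer(x, domain_feature, W_list, V):
        # x: hidden input (d_in,), domain_feature: (d_dom,),
        # W_list: K component matrices of shape (d_out, d_in), V: (K, d_dom).
        logits = V @ domain_feature                      # one score per component layer
        alpha = np.exp(logits) / np.exp(logits).sum()    # domain-dependent mixture weights
        W_eff = sum(a * W for a, W in zip(alpha, W_list))
        return np.tanh(W_eff @ x)

    # toy usage with random parameters
    x, d = np.random.randn(8), np.random.randn(3)
    W_list = [np.random.randn(4, 8) for _ in range(2)]
    V = np.random.randn(2, 3)
    h = factorised_layer(x, d, W_list, V)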

  • Automatic Speech Recognition System with Output-Gate Projected Gated Recurrent Unit

    Gaofeng CHENG  Pengyuan ZHANG  Ji XU  

     
    PAPER-Speech and Hearing

    Publicized: 2018/11/19
    Vol: E102-D No:2
    Page(s): 355-363

    The long short-term memory recurrent neural network (LSTM) has achieved tremendous success in automatic speech recognition (ASR). However, the complicated gating mechanism of the LSTM introduces a massive computational cost and limits its application in some scenarios. In this paper, we describe our work on accelerating decoding speed and improving decoding accuracy. First, we propose an architecture, called the Projected Gated Recurrent Unit (PGRU), for ASR tasks, and show that the PGRU consistently outperforms the standard GRU. Second, to improve the generalization of the PGRU, particularly on large-scale ASR tasks, we propose the Output-gate PGRU (OPGRU). In addition, the time delay neural network (TDNN) and normalization methods are found to be beneficial for the OPGRU. In this paper, we apply the OPGRU to both the acoustic model and the recurrent neural network language model (RNN-LM). Finally, we evaluate the PGRU on the full Eval2000 / RT03 test sets, and the proposed OPGRU single ASR system achieves 0.9% / 0.9% absolute (8.2% / 8.6% relative) reductions in word error rate (WER) compared with our previous best LSTM single ASR system. Furthermore, the OPGRU ASR system achieves a significant speed-up for both the acoustic model and language model rescoring.
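
    For reference, a numpy sketch of a plain gated recurrent unit followed by a linear projection of its hidden state; the specific output-gate arrangement that defines the OPGRU is the paper's contribution and is not reproduced here. Parameter names and shapes are assumptions.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def projected_gru_step(x, h, p):
        # One step of a GRU whose hidden state is projected to a lower dimension.
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)              # update gate
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)              # reset gate
        h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))  # candidate state
        h_new = (1.0 - z) * h + z * h_tilde
        projected = p["Wp"] @ h_new                         # the projection in "PGRU"
        return h_new, projected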

  • Improvement of Anomaly Detection Performance Using Packet Flow Regularity in Industrial Control Networks Open Access

    Kensuke TAMURA  Kanta MATSUURA  

     
    PAPER

    Vol: E102-A No:1
    Page(s): 65-73

    Since cyber attacks such as cyberterrorism against Industrial Control Systems (ICSs) and cyber espionage against the companies managing them have increased, techniques to detect anomalies at an early stage are required. To this end, several studies have developed anomaly detection methods for ICSs. In particular, some techniques using packet flow regularity in industrial control networks have achieved highly accurate detection of attacks that disrupt the regularity, i.e. the normal behaviour, of ICSs. However, these methods cannot identify scanning attacks employed in cyber espionage because the probing packets assimilate into the large number of normal ones. For example, the malware called Havex is customised to clandestinely acquire information from targeted ICSs using general request packets. Techniques to detect such scanning attacks using widespread packets await further investigation. Therefore, the goal of this study was to examine high-performance methods to identify anomalies even if packets elaborated to avoid alert systems were employed in attacks against industrial control networks. In this paper, a novel detection model for anomalous packets concealed behind normal traffic in industrial control networks is proposed. For this detection method, we took particular note of packet flow regularity and employed a Markov-chain model to detect anomalies. Moreover, we regarded not only original packets but also packets similar to them as normal, in order to reduce false alerts, because it has been indicated that an anomaly detection model using a Markov chain suffers from ample false positives caused by a number of normal but irregular packets, namely noise. To calculate the similarity between packets based on packet flow regularity, a vector representation tool called word2vec was employed. Whilst word2vec is utilised for the calculation of word similarity in natural language processing tasks, we applied the technique to packets in ICSs to calculate packet similarity. As a result, the Markov-chain with word2vec model identified scanning packets assimilating into normal packets with higher performance than the conventional Markov-chain model. In conclusion, employing both packet flow regularity and packet similarity in industrial control networks contributes to improving the performance of anomaly detection in ICSs.
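
    A simplified Python sketch of the two ingredients named above: a first-order Markov chain over packet symbols, and a similarity lookup that maps an unseen packet to its most similar known packet before scoring. The vector dictionary, the similarity threshold, and the smoothing constant are illustrative assumptions.

    import numpy as np
    from collections import defaultdict

    def train_transition_model(sequences):
        # Estimate first-order transition probabilities over packet symbols.
        counts = defaultdict(lambda: defaultdict(int))
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                counts[a][b] += 1
        return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
                for a, nxt in counts.items()}

    def canonicalise(pkt, vectors, known, threshold=0.9):
        # Map an unseen packet to its most similar known packet (cosine similarity).
        candidates = [k for k in known if k in vectors]
        if pkt in known or pkt not in vectors or not candidates:
            return pkt
        v = vectors[pkt]
        def cos(k):
            w = vectors[k]
            return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w)))
        best = max(candidates, key=cos)
        return best if cos(best) >= threshold else pkt

    def anomaly_score(seq, transitions, vectors):
        # Average negative log transition probability; unseen transitions are penalised.
        known = set(transitions)
        seq = [canonicalise(p, vectors, known) for p in seq]
        probs = [transitions.get(a, {}).get(b, 1e-6) for a, b in zip(seq, seq[1:])]
        return -float(np.mean(np.log(probs)))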

  • In-Vehicle Voice Interface with Improved Utterance Classification Accuracy Using Off-the-Shelf Cloud Speech Recognizer

    Takeshi HOMMA  Yasunari OBUCHI  Kazuaki SHIMA  Rintaro IKESHITA  Hiroaki KOKUBO  Takuya MATSUMOTO  

     
    PAPER-Speech and Hearing

    Publicized: 2018/08/31
    Vol: E101-D No:12
    Page(s): 3123-3137

    For voice-enabled car navigation systems that use a multi-purpose cloud speech recognition service (cloud ASR), utterance classification that is robust against speech recognition errors is needed to realize a user-friendly voice interface. The purpose of this study is to improve the accuracy of utterance classification for voice-enabled car navigation systems when the inputs to the classifier are error-prone speech recognition results obtained from a cloud ASR. The role of utterance classification is to predict which car navigation function a user wants to execute from a spontaneous utterance. A cloud ASR causes speech recognition errors due to the noise that occurs when traveling in a car, and these errors degrade the accuracy of utterance classification. There are many methods for reducing the number of speech recognition errors by modifying the internals of a speech recognizer. However, application developers cannot apply these methods to cloud ASRs because they cannot customize them. In this paper, we propose a system that improves the accuracy of utterance classification by modifying both the speech-signal inputs to a cloud ASR and the recognized-sentence outputs from the ASR. First, our system performs speech enhancement on a user's utterance and then sends both the enhanced and non-enhanced speech signals to a cloud ASR. The speech recognition results from both signals are merged to reduce the number of recognition errors. Second, to reduce the number of utterance classification errors, we propose a data augmentation method, which we call “optimal doping,” where not only accurate transcriptions but also error-prone recognized sentences are added to the training data. An evaluation with real user utterances spoken to car navigation products showed that our system reduces the number of utterance classification errors by 54% from a baseline condition. Finally, we propose a semi-automatic upgrading approach for classifiers to benefit from the improved performance of cloud ASRs.
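
    The "optimal doping" idea lends itself to a very small sketch: the classifier's training set is built from both the accurate transcription and the error-prone recognised sentence of each utterance. The tuple layout and function name are assumptions for illustration.

    def build_doped_training_set(examples):
        # examples: list of (reference_text, asr_hypothesis_text, intent_label) tuples.
        doped = []
        for reference, hypothesis, label in examples:
            doped.append((reference, label))       # accurate transcription
            if hypothesis and hypothesis != reference:
                doped.append((hypothesis, label))  # error-prone recognised sentence
        return doped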

  • Automatically Generating Malware Analysis Reports Using Sandbox Logs

    Bo SUN  Akinori FUJINO  Tatsuya MORI  Tao BAN  Takeshi TAKAHASHI  Daisuke INOUE  

     
    PAPER-Network Security

    Publicized: 2018/08/22
    Vol: E101-D No:11
    Page(s): 2622-2632

    Analyzing a malware sample requires much more time and cost than creating it. To understand the behavior of a given malware sample, security analysts often make use of API call logs collected by dynamic malware analysis tools such as sandboxes. As the log generated for a malware sample can become tremendously large, inspecting it is a time-consuming effort. Meanwhile, antivirus vendors usually publish malware analysis reports (vendor reports) on their websites. These malware analysis reports are the results of careful analysis done by security experts. The problem is that, even though such analyzed examples exist for malware samples, associating the vendor reports with the sandbox logs is difficult. This prevents security analysts from retrieving the useful information described in vendor reports. To address this issue, we developed a system called AMAR-Generator that aims to automate the generation of malware analysis reports based on sandbox logs by making use of existing vendor reports. Aiming at a convenient assistant tool for security analysts, our system employs techniques including template matching, API behavior mapping, and a malicious behavior database to produce concise human-readable reports that describe the malicious behaviors of malware programs. Through the performance evaluation, we first demonstrate that AMAR-Generator can generate human-readable reports that a security analyst can use as the first step of malware analysis. We also demonstrate that AMAR-Generator can identify the malicious behaviors conducted by malware from the sandbox logs; the detection rates are up to 96.74%, 100%, and 74.87% on the sandbox logs collected in 2013, 2014, and 2015, respectively. We also show that it can detect malicious behaviors from unknown types of sandbox logs.
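
    A toy Python sketch of the behaviour-mapping step: sandbox API-call logs are matched against a small rule base that maps API combinations to report sentences. The rules shown are illustrative examples, not AMAR-Generator's actual malicious behavior database.

    BEHAVIOUR_RULES = [
        ({"RegCreateKeyExW", "RegSetValueExW"}, "Modifies registry keys for persistence."),
        ({"WriteProcessMemory", "CreateRemoteThread"}, "Injects code into another process."),
        ({"InternetOpenUrlA", "HttpSendRequestA"}, "Communicates with a remote server."),
    ]

    def summarise_sandbox_log(api_calls):
        # Emit a report sentence for every rule whose API set appears in the log.
        observed = set(api_calls)
        return [sentence for apis, sentence in BEHAVIOUR_RULES if apis <= observed]

    print(summarise_sandbox_log(["RegCreateKeyExW", "RegSetValueExW", "Sleep"]))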

  • Efficient Reusable Collections

    Davud MOHAMMADPUR  Ali MAHJUR  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2018/08/20
    Vol: E101-D No:11
    Page(s): 2710-2719

    The efficiency and flexibility of collections have a significant impact on the overall performance of applications. Current approaches to implementing collections have two main drawbacks: (i) they limit the efficiency of collections, and (ii) they do not adequately support collection composition. Consequently, when the efficiency and flexibility of collections are important, programmers need to implement them themselves, which leads to a loss of reusability. This article presents neoCollection, a novel approach to encapsulating collections. neoCollection has several distinguishing features: (i) it can be applied to data elements efficiently and flexibly, and (ii) collections can be composed efficiently and flexibly, a feature that does not exist in current approaches. In order to demonstrate its effectiveness, neoCollection is implemented as an extension to Java and C++.

  • A Unified Neural Network for Quality Estimation of Machine Translation

    Maoxi LI  Qingyu XIANG  Zhiming CHEN  Mingwen WANG  

     
    LETTER-Natural Language Processing

    Publicized: 2018/06/18
    Vol: E101-D No:9
    Page(s): 2417-2421

    The state-of-the-art neural quality estimation (QE) model for machine translation consists of two sub-networks that are tuned separately: a bidirectional recurrent neural network (RNN) encoder-decoder trained for neural machine translation, called the predictor, and an RNN trained for sentence-level QE tasks, called the estimator. We propose to combine the two sub-networks into a single network, called the unified neural network. During training, the bidirectional RNN encoder-decoder is initialized and pre-trained with the bilingual parallel corpus, and then the networks are trained jointly to minimize the mean absolute error over the QE training samples. Compared with the predictor-estimator approach, the unified neural network helps to train neural network parameters that are better suited to the QE task. Experimental results on the benchmark data set of the WMT17 sentence-level QE shared task show that the proposed unified neural network approach consistently outperforms the predictor-estimator approach and significantly outperforms the other baseline QE approaches.

  • Construction of Spontaneous Emotion Corpus from Indonesian TV Talk Shows and Its Application on Multimodal Emotion Recognition

    Nurul LUBIS  Dessi LESTARI  Sakriani SAKTI  Ayu PURWARIANTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

    Publicized: 2018/05/10
    Vol: E101-D No:8
    Page(s): 2092-2100

    As interaction between humans and computers continues to develop toward the most natural form possible, it becomes increasingly urgent to incorporate emotion into the equation. This paper describes a step toward extending research on emotion recognition to Indonesian. The field continues to develop, yet exploration of the subject in Indonesian is still lacking. In particular, this paper highlights two contributions: (1) the construction of the first emotional audio-visual database in Indonesian, and (2) the first multimodal emotion recognizer in Indonesian, built from the aforementioned corpus. In constructing the corpus, we aim at natural emotions corresponding to real-life occurrences. However, the collection of emotional corpora is notably labor-intensive and expensive. To diminish the cost, we collect the emotional data from recordings of television programs, eliminating the need for an elaborate recording setup and experienced participants. In particular, we choose television talk shows because of their natural conversational content, which yields spontaneous emotion occurrences. To cover a broad range of emotions, we collected three episodes in different genres: politics, humanity, and entertainment. In this paper, we report points of analysis of the data and annotations. The acquired emotion corpus serves as a foundation for further research on emotion. Subsequently, in the experiment, we employ the support vector machine (SVM) algorithm to model the emotions in the collected data. We perform multimodal emotion recognition utilizing the predictions of three modalities: acoustic, semantic, and visual. Compared with the unimodal results, the multimodal feature combination attains identical accuracy for arousal at 92.6% and a significant improvement for the valence classification task at 93.8%. We hope to continue this work and move towards a finer-grained, more precise quantification of emotion.
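
    A compact scikit-learn sketch of one way to fuse the three modalities: one SVM per modality, with a second SVM trained on the stacked decision scores. This late-fusion layout and the kernels are assumptions; the paper's exact feature sets and fusion scheme are not reproduced here.

    import numpy as np
    from sklearn.svm import SVC

    def late_fusion_svm(train_feats, y_train, test_feats):
        # train_feats / test_feats: dicts {"acoustic": X, "semantic": X, "visual": X}.
        modalities = sorted(train_feats)
        base = {m: SVC(kernel="rbf").fit(train_feats[m], y_train) for m in modalities}

        def stack(feats):
            # Per-modality decision scores become the features of the fusion SVM.
            return np.column_stack([base[m].decision_function(feats[m]) for m in modalities])

        fusion = SVC(kernel="linear").fit(stack(train_feats), y_train)
        return fusion.predict(stack(test_feats))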

  • Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition Open Access

    Ryo MASUMURA  Taichi ASAMI  Takanobu OBA  Hirokazu MASATAKI  Sumitaka SAKAUCHI  Akinori ITO  

     
    PAPER-Speech and Hearing

    Publicized: 2018/02/26
    Vol: E101-D No:6
    Page(s): 1581-1590

    This paper proposes a novel domain adaptation method that can utilize out-of-domain text resources and partially domain-matched text resources in language modeling. A major problem in domain adaptation is that it is hard to obtain adequate adaptation effects from out-of-domain text resources. To tackle this problem, our idea is to carry out model merging in a latent variable space created from latent words language models (LWLMs). The latent variables in the LWLMs are represented as specific words selected from the observed word space, so LWLMs can share a common latent variable space. This enables us to perform flexible mixture modeling that takes the latent variable space into consideration. This paper presents two types of mixture modeling, i.e., LWLM mixture models and LWLM cross-mixture models. The LWLM mixture models perform mixture modeling in the latent word space to mitigate the domain mismatch problem. Furthermore, in the LWLM cross-mixture models, LMs individually constructed from partially matched text resources are split into two element models, each of which can be subjected to mixture modeling. For both approaches, this paper also describes methods to optimize the mixture weights using a validation data set. Experiments show that mixing in the latent word space achieves performance improvements for both the target domain and out-of-domain data compared with mixing in the observed word space.
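
    For intuition, a small sketch of mixture-weight tuning on a validation set; it interpolates two language models in the observed word space, whereas the paper's mixtures are formed in the latent word space. The grid search and variable names are assumptions.

    import numpy as np

    def mixture_logprob(probs_a, probs_b, lam):
        # Log-likelihood of held-out tokens under a two-component interpolated LM.
        return float(np.sum(np.log(lam * probs_a + (1.0 - lam) * probs_b)))

    def tune_mixture_weight(probs_a, probs_b, grid=np.linspace(0.05, 0.95, 19)):
        # Pick the weight that maximises validation log-likelihood (minimises perplexity).
        scores = [mixture_logprob(probs_a, probs_b, lam) for lam in grid]
        return float(grid[int(np.argmax(scores))])

    # probs_*: per-token probabilities of the validation text under each LM
    best_lambda = tune_mixture_weight(np.array([0.01, 0.2, 0.05]),
                                      np.array([0.02, 0.1, 0.15]))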

  • Submodular Based Unsupervised Data Selection

    Aiying ZHANG  Chongjia NI  

     
    PAPER-Speech and Hearing

    Publicized: 2018/03/14
    Vol: E101-D No:6
    Page(s): 1591-1604

    Automatic speech recognition (ASR) and keyword search (KWS) have increasingly found their way into our everyday lives, and their success can be attributed to many factors. Among these, the large amount of speech data used for acoustic modeling is the key factor. However, it is difficult and time-consuming to acquire a large amount of transcribed speech data for some languages, especially for low-resource languages. Thus, under low-resource conditions, the choice of which data to transcribe for acoustic modeling becomes important for improving the performance of ASR and KWS. There are two ways of using acoustic data for acoustic modeling: using target-language data, and using a large amount of data from other source languages for cross-lingual transfer. In this paper, we propose approaches for efficiently selecting acoustic data for acoustic modeling. For target-language data, a submodular based unsupervised data selection approach is proposed, which selects more informative and representative utterances for manual transcription. For other source languages' data, we propose a submodular multilingual data selection approach based on utterances highly misclassified as the target language, and a knowledge based group multilingual data selection approach. Using the selected multilingual data for multilingual deep neural network training for cross-lingual transfer improves the performance of ASR and KWS in the target language. Compared with a language-identification based multilingual data selection approach, our proposed approach also obtains better results. In this paper, we also analyze and compare the influence of the language factor and the acoustic factor on the performance of ASR and KWS. The influence of different amounts of target-language data on the performance of ASR and KWS under mono-lingual and cross-lingual conditions is also compared and analyzed, and some significant conclusions are drawn.
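
    A generic sketch of submodular data selection using the classic greedy algorithm over a facility-location objective; the paper's actual objective, features, and multilingual criteria are not reproduced, and the similarity matrix is assumed to be precomputed.

    import numpy as np

    def greedy_facility_location(similarity, budget):
        # similarity: (n, n) pairwise utterance similarities.
        # F(S) = sum_i max_{j in S} similarity[i, j] is monotone submodular,
        # so greedy selection carries a (1 - 1/e) approximation guarantee.
        n = similarity.shape[0]
        selected, coverage = [], np.zeros(n)
        for _ in range(budget):
            gains = np.maximum(similarity, coverage[:, None]).sum(axis=0) - coverage.sum()
            gains[selected] = -np.inf                    # never pick an utterance twice
            best = int(np.argmax(gains))
            selected.append(best)
            coverage = np.maximum(coverage, similarity[:, best])
        return selected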

  • On Implementing an Automatic Headline Generation for Discussion BBS Systems —Cases of Citizens' Deliberations for Communities—

    Katsuhide FUJITA  Ryosuke WATANABE  

     
    PAPER-Creativity Support Systems and Decision Support Systems

    Publicized: 2018/01/19
    Vol: E101-D No:4
    Page(s): 865-873

    Recently, opportunities to discuss topics on a variety of online discussion bulletin boards have been increasing. However, it can be difficult to understand the contents of each discussion as the number of posts increases. Therefore, it is important to automatically generate headlines that summarize each post so that the contents of each discussion can be understood at a glance. In this paper, we propose a method to automatically extract and generate post headlines for online discussion bulletin boards. We propose templates with multiple patterns to extract important sentences from the posts, and a method to generate headlines by matching these templates and patterns against the posts. Then, we evaluate the effectiveness of our proposed method using questionnaires.
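
    A small sketch of template-based headline extraction; the regular-expression patterns below are illustrative stand-ins, not the templates proposed in the paper.

    import re

    TEMPLATES = [
        (re.compile(r"I (?:propose|suggest) (.+?)[.!?]"), "Proposal: {}"),
        (re.compile(r"The problem is (.+?)[.!?]"), "Problem: {}"),
        (re.compile(r"We should (.+?)[.!?]"), "Suggestion: {}"),
    ]

    def generate_headline(post, max_len=40):
        # Return the first template match; fall back to the post's first sentence.
        for pattern, template in TEMPLATES:
            match = pattern.search(post)
            if match:
                return template.format(match.group(1))[:max_len]
        return post.split(".")[0][:max_len]

    print(generate_headline("I suggest building a community center near the station."))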

  • Modeling Storylines in Lyrics

    Kento WATANABE  Yuichiroh MATSUBAYASHI  Kentaro INUI  Satoru FUKAYAMA  Tomoyasu NAKANO  Masataka GOTO  

     
    PAPER-Natural Language Processing

    Publicized: 2017/12/22
    Vol: E101-D No:4
    Page(s): 1167-1179

    This paper addresses the issue of modeling the discourse nature of lyrics and presents the first study aiming at capturing two common discourse-related notions: storylines and themes. We assume that a storyline is a chain of transitions over the topics of segments and that a song has at least one overall theme. We then hypothesize that transitions over the topics of lyric segments can be captured by a probabilistic topic model which incorporates a distribution over transitions of latent topics, and that such a distribution of topic transitions is affected by the theme of the lyrics. To test these hypotheses, this study conducts experiments on word prediction and segment order prediction tasks, exploiting a large-scale corpus of popular music lyrics in both English and Japanese (around 100 thousand songs). The findings gained from these experiments can be summarized in two respects. First, the models with topic transitions significantly outperformed the model without topic transitions in word prediction. This result indicates that typical storylines included in our lyrics datasets were effectively captured as a probabilistic distribution of transitions over the latent topics of segments. Second, the model incorporating a latent theme variable on top of topic transitions outperformed the models without such a variable in both word prediction and segment order prediction. From this result, we conclude that considering the notion of theme does contribute to the modeling of the storylines of lyrics.
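
    A brute-force sketch of the segment order prediction task under a topic-transition model: each candidate ordering of a song's segments is scored by the likelihood of its topic sequence. The exhaustive search over permutations is an illustrative simplification, feasible only for short songs, and the variable names are assumptions.

    from itertools import permutations

    def best_segment_order(segment_topics, log_trans, log_init):
        # segment_topics[s]: topic id of segment s;
        # log_trans[i][j] = log P(topic j | topic i); log_init[i] = log P(first topic is i).
        def score(order):
            topics = [segment_topics[s] for s in order]
            return log_init[topics[0]] + sum(
                log_trans[a][b] for a, b in zip(topics, topics[1:]))
        return max(permutations(range(len(segment_topics))), key=score)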

  • A Survey of Thai Knowledge Extraction for the Semantic Web Research and Tools Open Access

    Ponrudee NETISOPAKUL  Gerhard WOHLGENANNT  

     
    SURVEY PAPER

    Publicized: 2018/01/18
    Vol: E101-D No:4
    Page(s): 986-1002

    As the manual creation of domain models and also of linked data is very costly, the extraction of knowledge from structured and unstructured data has been one of the central research areas in the Semantic Web field in the last two decades. Here, we look specifically at the extraction of formalized knowledge from natural language text, which is the most abundant source of human knowledge available. While many tools are on hand for information and knowledge extraction from English natural language, the situation is different for written Thai. The goal of this work is to assess the state of the art of research on formal knowledge extraction specifically from Thai language text, and then to give suggestions and practical research ideas on how to improve that state of the art. To address this goal, we first distinguish nine knowledge extraction tasks for the Semantic Web defined in the literature on knowledge extraction from English text, for example taxonomy extraction, relation extraction, or named entity recognition. For each of the nine tasks, we analyze the publications and tools available for Thai text in the form of a comprehensive literature survey. In addition to our assessment, we measure the self-assessment of the Thai research community with the help of a questionnaire-based survey on each of the tasks. Furthermore, the structure and size of the Thai community are analyzed using complex literature database queries. Combining all the collected information, we finally identify research gaps in knowledge extraction from Thai language. An extensive list of practical research ideas is presented, focusing on concrete suggestions for every knowledge extraction task that can be implemented and evaluated with reasonable effort. Besides the task-specific hints for improving the state of the art, we also include general recommendations on how to raise the efficiency of the respective research community.

  • PROVIT-CI: A Classroom-Oriented Educational Program Visualization Tool

    Yu YAN  Kohei HARA  Takenobu KAZUMA  Yasuhiro HISADA  Aiguo HE  

     
    PAPER-Educational Technology

    Publicized: 2017/11/01
    Vol: E101-D No:2
    Page(s): 447-454

    Studies have shown that program visualization (PV) is effective for supporting students' programming exercises and self-study. However, very few instructors actively use PV tools in programming lectures. This article discusses the impediments instructors face when integrating PV tools into classroom lectures and proposes a C programming classroom instruction support tool based on program visualization, PROVIT-CI (PROgram VIsualization Tool for Classroom Instruction). PROVIT-CI has been actively and continuously used by instructors at the authors' university to enhance their lectures since 2015. An evaluation of its application in an introductory C programming course shows that PROVIT-CI is effective and helpful for instructors' classroom use.

  • ArchHDL: A Novel Hardware RTL Modeling and High-Speed Simulation Environment

    Shimpei SATO  Ryohei KOBAYASHI  Kenji KISE  

     
    PAPER-Design Methodology and Platform

    Publicized: 2017/11/17
    Vol: E101-D No:2
    Page(s): 344-353

    LSIs are generally designed through four stages: architectural design, logic design, circuit design, and physical design. In architectural design and logic design, designers describe their target hardware at RTL. However, they generally use different languages for each phase: typically, a general-purpose programming language such as C or C++ for architectural design and a hardware description language such as Verilog HDL or VHDL for logic design. This is a time-consuming way to design hardware, and a more efficient design environment is required. In this paper, we propose a new hardware modeling and high-speed simulation environment for architectural design and logic design. Our environment allows hardware to be written and verified in a single language. The environment consists of (1) a new hardware description language called ArchHDL, which enables hardware simulation faster than Verilog HDL simulation, and (2) a source code translation tool from ArchHDL code to Verilog HDL code. ArchHDL is a new language for hardware RTL modeling based on C++. The key features of this language are that (1) designers describe a combinational circuit as a function and (2) the ArchHDL library realizes non-blocking assignment in C++. Using these features, designers can write hardware seamlessly from abstract-level descriptions to RTL descriptions in a Verilog HDL-like style. Source code in ArchHDL is converted to Verilog HDL code by the translation tool and used for synthesis for FPGAs or ASICs. To evaluate our environment, we implemented a practical many-core processor in ArchHDL and measured the simulation speed on an Intel CPU and an Intel Xeon Phi processor. On the Intel CPU, ArchHDL achieves a simulation speed about 4.5 times faster than that of Synopsys VCS. We also confirmed that RTL simulation with ArchHDL is efficiently parallelized on the Intel Xeon Phi processor. We converted the ArchHDL code to Verilog HDL code and estimated the hardware utilization on an FPGA: implementing a 48-node many-core processor consumes 71% of the resources of a Virtex-7 FPGA.
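
    The non-blocking assignment mentioned above can be illustrated with a tiny two-phase register model (written here in Python rather than ArchHDL's C++, so the syntax is only an analogy): next values are computed from the current state and committed together, as at a clock edge.

    class Reg:
        # A register with a current value and a pending "next" value.
        def __init__(self, value=0):
            self.value, self._next = value, value
        def assign(self, value):   # plays the role of "<=" in Verilog HDL
            self._next = value
        def tick(self):            # clock edge: commit the pending value
            self.value = self._next

    def simulate(cycles):
        a, b = Reg(1), Reg(2)
        for _ in range(cycles):
            a.assign(b.value)      # combinational phase: read only current values
            b.assign(a.value)
            for r in (a, b):       # commit all registers simultaneously
                r.tick()
        return a.value, b.value

    print(simulate(1))             # (2, 1): the registers swap, as with non-blocking assignment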

  • A Comparative Study of Rule-Based Inference Engines for the Semantic Web

    Thanyalak RATTANASAWAD  Marut BURANARACH  Kanda Runapongsa SAIKAEW  Thepchai SUPNITHI  

     
    PAPER

    Publicized: 2017/09/15
    Vol: E101-D No:1
    Page(s): 82-89

    With the Semantic Web data standards defined, more applications demand inference engines to provide support for intelligent processing of Semantic Web data. Rule-based inference engines, or rule-based reasoners, are used in many domains, such as clinical support and e-commerce recommender system development. This article reviews and compares key features of three freely available rule-based reasoners: the Jena inference engine, the Euler YAP Engine, and BaseVISor. A performance evaluation study was conducted to assess the scalability and efficiency of these systems using data and rule sets adapted from the Berlin SPARQL Benchmark. We describe our methodology for assessing rule-based reasoners based on the benchmark. The results show the efficiency of the systems in performing reasoning tasks over different data sizes and rules involving various rule properties. The review and comparison results can provide a basis for users in choosing appropriate rule-based inference engines to match their application requirements.
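
    To make the reasoning task concrete, here is a naive forward-chaining sketch over RDF-like triples with a single rdfs:subClassOf transitivity rule; it illustrates the kind of computation such engines perform and is not the implementation of Jena, the Euler YAP Engine, or BaseVISor.

    def forward_chain(triples, rules):
        # Apply the rules until no new triples can be derived (a fixed point).
        facts = set(triples)
        changed = True
        while changed:
            changed = False
            for rule in rules:
                derived = set(rule(facts)) - facts
                if derived:
                    facts |= derived
                    changed = True
        return facts

    def subclass_transitivity(facts):
        # (?a subClassOf ?b) and (?b subClassOf ?c) => (?a subClassOf ?c)
        sub = [(a, c) for (a, p, c) in facts if p == "rdfs:subClassOf"]
        for a, b in sub:
            for b2, c in sub:
                if b == b2:
                    yield (a, "rdfs:subClassOf", c)

    facts = {("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
             ("ex:Vehicle", "rdfs:subClassOf", "ex:Thing")}
    print(forward_chain(facts, [subclass_transitivity]))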

  • Fraud Analysis and Detection for Real-Time Messaging Communications on Social Networks Open Access

    Liang-Chun CHEN  Chien-Lung HSU  Nai-Wei LO  Kuo-Hui YEH  Ping-Hsien LIN  

     
    INVITED PAPER

    Publicized: 2017/07/21
    Vol: E100-D No:10
    Page(s): 2267-2274

    With the successful development and rapid advancement of social networking technology, people tend to exchange and share information via online social networks, such as Facebook and LINE. Massive amounts of information are aggregated promptly and circulated quickly among people. However, along with this enormous volume of human interaction, various types of swindles via online social networks have been launched in recent years. Effectively detecting fraudulent activities on social networks has therefore taken on increased importance and is a topic of ongoing interest. In this paper, we develop a fraud analysis and detection system based on real-time messaging communications, one of the most common interaction services of online social networks. An integrated platform consisting of various text-mining techniques, such as natural language processing, matrix processing, and content analysis via a latent semantic model, is proposed. In the system implementation, we first collect a series of fraud events, all of which happened in Taiwan, to construct analysis modules for detecting such fraud events. An Android-based application is then built to issue alert notifications when dubious logs or fraud events are detected.
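
    A rough scikit-learn sketch of the latent-semantic matching step: known fraud messages are projected into a latent space and an incoming message is scored by its highest cosine similarity to them. Tokenisation of Chinese text, the component count, and the alert threshold policy are assumptions left out of the sketch.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    def build_fraud_matcher(fraud_messages, n_topics=50):
        # Project known fraud messages into a latent semantic space.
        vectoriser = TfidfVectorizer()
        tfidf = vectoriser.fit_transform(fraud_messages)
        svd = TruncatedSVD(n_components=max(1, min(n_topics, tfidf.shape[1] - 1)))
        latent = svd.fit_transform(tfidf)
        return vectoriser, svd, latent

    def fraud_score(message, vectoriser, svd, latent):
        # Highest similarity between the incoming message and any known fraud text.
        vec = svd.transform(vectoriser.transform([message]))
        return float(cosine_similarity(vec, latent).max())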

  • Synthesizing Pareto Efficient Intelligible State Machines from Communication Diagram

    Toshiyuki MIYAMOTO  

     
    PAPER-Formal tools

    Publicized: 2017/03/07
    Vol: E100-D No:6
    Page(s): 1200-1209

    For a system based on a service-oriented architecture, the problem of synthesizing a concrete model, i.e., a behavioral model, for each service composing the system from an abstract specification, referred to as a choreography, is known as the choreography realization problem. In this paper, we assume that the choreography is given by an acyclic relation. We have already shown that the condition for the behavioral model is given by lower and upper bounds of acyclic relations. This increases the degree of freedom for behavioral models and makes it possible to develop algorithms that synthesize models that are intelligible to users. In this paper, we introduce several metrics for the intelligibility of state machines and study an algorithm for synthesizing Pareto efficient state machines.
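
    A minimal sketch of the Pareto filtering that underlies such a synthesis: given candidate state machines scored on several intelligibility metrics (lower is better), keep only the non-dominated ones. The metrics shown are hypothetical examples, not the paper's.

    def pareto_front(candidates, metrics):
        # candidates: list of state machines; metrics: functions where lower is better.
        scores = [[m(c) for m in metrics] for c in candidates]
        def dominated(i):
            return any(all(sj <= si for sj, si in zip(scores[j], scores[i])) and
                       any(sj < si for sj, si in zip(scores[j], scores[i]))
                       for j in range(len(candidates)) if j != i)
        return [c for i, c in enumerate(candidates) if not dominated(i)]

    # toy usage: machines scored by number of states and number of transitions
    machines = [{"states": 5, "transitions": 9}, {"states": 7, "transitions": 6},
                {"states": 8, "transitions": 10}]
    front = pareto_front(machines, [lambda m: m["states"], lambda m: m["transitions"]])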

  • A Minimalist's Reversible While Language

    Robert GLÜCK  Tetsuo YOKOYAMA  

     
    PAPER-Software System

    Publicized: 2017/02/06
    Vol: E100-D No:5
    Page(s): 1026-1034

    The paper presents a small reversible language R-CORE, a structured imperative programming language with symbolic tree-structured data (S-expressions). The language is reduced to the core of a reversible language, with a single command for reversibly updating the store, a single reversible control-flow operator, a limited number of variables, and data with a single atom and a single constructor. Despite its extreme simplicity, the language is reversibly universal, which means that it is as powerful as any reversible language can be, while it is linear-time self-interpretable, and it allows reversible programming with dynamic data structures. The four-line program inverter for R-CORE is among the shortest existing program inverters, which demonstrates the conciseness of the language. The translator to R-CORE, which is used to show the formal properties of the language, is clean and modular, and it may serve as a model for related reversible translation problems. The goal is to provide a language that is sufficiently concise for theoretical investigations. Owing to its simplicity, the language may also be used for educational purposes.
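
    A toy Python illustration of the flavour of reversible programs (not R-CORE's actual syntax or semantics): every step is an information-preserving update, so the inverse program, obtained by inverting each step and reversing their order, restores the initial store.

    def run(program, store):
        # A program is a list of ("add", variable, constant) steps.
        for op, var, const in program:
            assert op == "add"
            store[var] = store[var] + const      # reversible: nothing is overwritten blindly
        return store

    def invert(program):
        # Invert each step and apply them in reverse order.
        return [("add", var, -const) for op, var, const in reversed(program)]

    prog = [("add", "x", 3), ("add", "y", 5), ("add", "x", -1)]
    forward = run(prog, {"x": 0, "y": 0})        # {'x': 2, 'y': 5}
    restored = run(invert(prog), dict(forward))  # back to {'x': 0, 'y': 0}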

Showing hits 21-40 of 282