IEICE global.ieice.org Site

Keyword Search Result

[Keyword] language(282hit)

41-60hit(282hit)

Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition
Xiaoyun WANG Tsuneo KATO Seiichi YAMAMOTO

PAPER-Speech and Hearing

Publicized: 2016/12/29
Vol: E100-D No:4
Page(s): 857-864
Recognition of second language (L2) speech is a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because pronunciation by L2 speakers is usually significantly influenced by the mother tongue of the speakers. Considering that the expressions of non-native speakers are usually simpler than those of native ones, and that second language speech usually includes mispronunciation and less fluent pronunciation, we propose a novel method that maximizes a unified acoustic and linguistic objective function to derive a phoneme set for second language speech recognition. The authors verify the efficacy of the proposed method using second language speech collected with a translation game type dialogue-based computer assisted language learning (CALL) system. In this paper, the authors examine the performance based on acoustic likelihood, linguistic discrimination ability and the integrated objective function for second language speech. Experiments demonstrate the validity of the phoneme set derived by the proposed method.
Improving Question Retrieval in cQA Services Using a Dependency Parser
Kyoungman BAE Youngjoong KO

LETTER

Publicized: 2017/01/17
Vol: E100-D No:4
Page(s): 807-810
The translation-based language model (TRLM) is a state-of-the-art method for solving the lexical gap problem of question retrieval in community-based question answering (cQA). Some researchers have tried to find methods for solving the lexical gap and improving the TRLM. In this paper, we propose a new dependency-based model (DM) for question retrieval. We explore how to utilize the results of a dependency parser for cQA. Dependency bigrams are extracted from the output of the dependency parser, and the language model is transformed using the dependency bigrams as bigram features. As a result, we obtain significantly improved performance when the TRLM and DM approaches are effectively combined.
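To make the notion of a dependency bigram concrete, here is a minimal sketch in Python. It is an illustration only and not the authors' implementation: the example question, the hand-written head indices (standing in for real parser output) and the function names are all assumptions. It shows how head-dependent word pairs differ from ordinary adjacent-word bigrams.

```python
def dependency_bigrams(tokens, heads):
    """Return (head_word, dependent_word) pairs.

    heads[i] is the index of the head of tokens[i], or -1 for the root.
    The head indices are assumed to come from some dependency parser.
    """
    return [(tokens[h], tokens[i]) for i, h in enumerate(heads) if h >= 0]


def adjacent_bigrams(tokens):
    """Return ordinary bigrams of neighbouring words."""
    return list(zip(tokens, tokens[1:]))


# Hypothetical question with a hand-written parse (not real parser output).
tokens = ["how", "to", "fix", "slow", "laptop"]
heads = [2, 2, -1, 4, 2]  # "fix" is the root; "laptop" depends on "fix"

dep = dependency_bigrams(tokens, heads)
adj = adjacent_bigrams(tokens)

# Word pairs linked by a dependency but not adjacent in the sentence.
only_dep = [b for b in dep if b not in adj and (b[1], b[0]) not in adj]
```

In this toy example the pair ("fix", "laptop") is captured by the dependency bigrams even though the two words are not adjacent, which is the kind of long-distance relation that plain bigram features miss.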
CLCMiner: Detecting Cross-Language Clones without Intermediates
Xiao CHENG Zhiming PENG Lingxiao JIANG Hao ZHONG Haibo YU Jianjun ZHAO

PAPER-Software Engineering

Publicized: 2016/11/21
Vol: E100-D No:2
Page(s): 273-284
The proliferation of diverse kinds of programming languages and platforms makes it a common need to have the same functionality implemented in different languages for different platforms, such as Java for Android applications and C# for Windows phone applications. Although versions of code written in different languages appear syntactically quite different from each other, they are intended to implement the same software and typically contain many code snippets that implement similar functionalities, which we call cross-language clones. When the version of code in one language evolves according to changing functionality requirements and/or bug fixes, its cross-language clones may also need to be changed to maintain consistent implementations of the same functionality. Thus, automated ways to locate and track cross-language clones within the evolving software are needed. In the literature, approaches for detecting cross-language clones are only for languages that share a common intermediate language (such as the .NET language family) because they are built on techniques for detecting single-language clones. To extend the capability of cross-language clone detection to more diverse kinds of languages, we propose a novel automated approach, CLCMiner, without the need of an intermediate language. It mines such clones from revision histories, based on our assumption that revisions to different versions of code implemented in different languages may naturally reflect how programmers change cross-language clones in practice, and that similarities among the revisions (referred to as clones in diffs or diff clones) may indicate actual similar code. We have implemented a prototype and applied it to ten open source projects that have implementations in both Java and C#. The reported clones that occur in revision histories have high precision (89% on average) and recall (95% on average). Compared with token-based code clone detection tools that can treat code as plain text, our tool can detect significantly more cross-language clones. All the evaluation results demonstrate the feasibility of revision-history based techniques for detecting cross-language clones without intermediates and point to promising future work.
Automatically Extracting Parallel Sentences from Wikipedia Using Sequential Matching of Language Resources
Juryong CHEON Youngjoong KO

LETTER-Natural Language Processing

Publicized: 2016/11/11
Vol: E100-D No:2
Page(s): 405-408
In this paper, we propose a method to find similar sentences based on language resources for building a parallel corpus between English and Korean from Wikipedia. We use a Wiki-dictionary consisting of document titles from Wikipedia and bilingual example sentence pairs from a Web dictionary instead of a traditional machine-readable dictionary. In this way, we perform similarity calculation between sentences using sequential matching of the language resources, and evaluate the extracted parallel sentences. In the experiments, the proposed parallel sentence extraction method achieves an F1-score of 65.4%.
Synthesis and Automatic Layout of Resistive Digital-to-Analog Converter Based on Mixed-Signal Slice Cell
Mitsutoshi SUGAWARA Kenji MORI Zule XU Masaya MIYAHARA Kenichi OKADA Akira MATSUZAWA

PAPER

Vol: E99-A No:12
Page(s): 2435-2443
We propose a synthesis and automatic layout method for mixed-signal circuits with high regularity. As the first step of this research, a resistive digital-to-analog converter (RDAC) is presented. With a size calculation routine, the area of this RDAC is minimized while satisfying the required matching precision without any optimization loops. We propose to partition the design into slices comprising both analog and digital cells. These cells are programmed to be synthesized similarly to custom P-Cells based on the calculation above, and automatically laid out to form one slice cell. To synthesize digital circuits without using a digital standard cell library, we propose a versatile unit digital block consisting of 8 transistors. With one or several blocks, the transistors' interconnections are programmed in the units to realize various logic gates. By using this block, the slice shapes are aligned so that the layout space between the slices is minimized. The proposed mixed-signal slice-based partition facilitates the place-and-route of the whole RDAC. The post-layout simulation shows that the generated 9-bit RDAC achieves 1 GHz sampling frequency, -0.11/0.09 and -0.30/0.75 DNL and INL, respectively, 3.57 mW power consumption, and 0.0038 mm² active area.
A Morpheme-Based Weighting for Chinese-Mongolian Statistical Machine Translation
Zhenxin YANG Miao LI Lei CHEN Kai SUN

LETTER-Natural Language Processing

Publicized: 2016/08/18
Vol: E99-D No:11
Page(s): 2843-2846
In this paper, a morpheme-based weighting and its integration method are proposed as a smoothing method to alleviate the data sparseness in Chinese-Mongolian statistical machine translation (SMT). Besides, we present source-side reordering as the pre-processing model to verify the extensibility of our method. Experimental results show that the morpheme-based weighting can substantially improve the translation quality.
N-gram Approximation of Latent Words Language Models for Domain Robust Automatic Speech Recognition (Open Access)
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Satoshi TAKAHASHI

PAPER-Language modeling

Publicized: 2016/07/19
Vol: E99-D No:10
Page(s): 2462-2470
This paper aims to improve the domain robustness of language modeling for automatic speech recognition (ASR). To this end, we focus on applying the latent words language model (LWLM) to ASR. LWLMs are generative models whose structure is based on Bayesian soft class-based modeling with vast latent variable space. Their flexible attributes help us to efficiently realize the effects of smoothing and dimensionality reduction and so address the data sparseness problem; LWLMs constructed from limited domain data are expected to robustly cover unknown multiple domains in ASR. However, the attribute flexibility seriously increases computation complexity. If we rigorously compute the generative probability for an observed word sequence, we must consider the huge quantities of all possible latent word assignments. Since this is computationally impractical, some approximation is inevitable for ASR implementation. To solve the problem and apply this approach to ASR, this paper presents an n-gram approximation of LWLM. The n-gram approximation is a method that approximates LWLM as a simple back-off n-gram structure, and offers LWLM-based robust one-pass ASR decoding. Our experiments verify the effectiveness of our approach by evaluating perplexity and ASR performance in not only in-domain data sets but also out-of-domain data sets.
Investigation of Combining Various Major Language Model Technologies including Data Expansion and Adaptation (Open Access)
Ryo MASUMURA Taichi ASAMI Takanobu OBA Hirokazu MASATAKI Sumitaka SAKAUCHI Akinori ITO

PAPER-Language modeling

Publicized: 2016/07/19
Vol: E99-D No:10
Page(s): 2452-2461
This paper aims to investigate the performance improvements made possible by combining various major language model (LM) technologies and to reveal the interactions between LM technologies in spontaneous automatic speech recognition tasks. While it is clear that recent practical LMs have several problems, isolated use of major LM technologies does not appear to offer sufficient performance. In consideration of this fact, combining various LM technologies has also been examined. However, previous works only focused on modeling technologies with limited text resources, and did not consider other important technologies in practical language modeling, i.e., use of external text resources and unsupervised adaptation. This paper, therefore, employs not only manual transcriptions of target speech recognition tasks but also external text resources. In addition, unsupervised LM adaptation based on multi-pass decoding is added to the combination. We divide LM technologies into three categories and employ key ones including recurrent neural network LMs or discriminative LMs. Our experiments show the effectiveness of combining various LM technologies in not only in-domain tasks, the subject of our previous work, but also out-of-domain tasks. Furthermore, we reveal the relationships between the technologies in both tasks.
Detecting Logical Inconsistencies by Clustering Technique in Natural Language Requirements
Satoshi MASUDA Tohru MATSUODANI Kazuhiko TSUDA

PAPER

Publicized: 2016/07/06
Vol: E99-D No:9
Page(s): 2210-2218
In the early phases of the system development process, stakeholders exchange ideas and describe requirements in natural language. Requirements described in natural language tend to be vague and include logical inconsistencies, whereas logical consistency is the key to raising the quality and lowering the cost of system development. Hence, it is important to find logical inconsistencies across the whole set of requirements at this early stage. In verification and validation of the requirements, there are techniques to derive logical formulas from natural language requirements and evaluate their inconsistencies automatically. Users manually chunk the requirements by paragraphs. However, paragraphs do not always represent logical chunks. A single logical chunk can span several paragraphs, and conversely one paragraph can contain several logical chunks. In this paper, we present a practical approach to detecting logical inconsistencies in natural language requirements using a clustering technique. Software requirements specifications (SRSs) are the target document type. We use k-means clustering to cluster chunks of requirements and develop semantic role labeling rules to derive “conditions” and “actions” as semantic roles from the requirements by using natural language processing. We also construct an abstraction grammar to transform the conditions and actions into logical formulas. By evaluating the logical formulas with input data patterns, we can find logical inconsistencies. We implemented our approach and conducted experiments on three case studies of requirements written in natural English. The results indicate that our approach can find logical inconsistencies.
Sentence Similarity Computational Model Based on Information Content
Hao WU Heyan HUANG

PAPER-Natural Language Processing

Publicized: 2016/03/14
Vol: E99-D No:6
Page(s): 1645-1652
Sentence similarity computation is an increasingly important task in applications of natural language processing such as information retrieval, machine translation and text summarization. From the viewpoint of information theory, the essential attribute of natural language is that the carrier of information and the capacity of information can be measured by information content, which is already successfully used for word similarity computation in simple ways. Existing sentence similarity methods don't emphasize the information contained in the sentence, and the complicated models they employ often require empirical or trained parameters. This paper presents a fully unsupervised computational model of sentence semantic similarity. It is also a simple and straightforward model that neither needs any empirical parameters nor relies on other NLP tools. The method can obtain state-of-the-art experimental results which show that sentence similarity evaluated by the model is closer to human judgment than multiple competing baselines. The paper also tests the proposed model on the influence of external corpus, the performance of various sizes of the semantic net, and the relationship between efficiency and accuracy.
A Method for Extraction of Future Reference Sentences Based on Semantic Role Labeling
Yoko NAKAJIMA Michal PTASZYNSKI Hirotoshi HONMA Fumito MASUI

PAPER-Natural Language Processing

Publicized: 2015/11/18
Vol: E99-D No:2
Page(s): 514-524
In everyday life, people use past events and their own knowledge to predict the probable unfolding of events. To obtain the necessary knowledge for such predictions, newspapers and the Internet provide a general source of information. Newspapers contain various expressions describing past events, but also current and future events, and opinions. In our research we focused on automatically obtaining sentences that make reference to the future. Such sentences can contain expressions that not only explicitly refer to future events, but could also refer to past or current events. For example, if people read a news article that states “In the near future, there will be an upward trend in the price of gasoline,” they may be likely to buy gasoline now. However, if the article says “The cost of gasoline has just risen 10 yen per liter,” people will not rush to buy gasoline, because they accept this as reality and may expect the cost to decrease in the future. In the following study we first investigate future reference sentences in newspapers and Web news. Next, we propose a method for automatic extraction of such sentences by using semantic role labels, without typical approaches (temporal expressions, etc.). In a series of experiments, we extract semantic role patterns from future reference sentences and examine the validity of the extracted patterns in classification of future reference sentences.
Unsupervised Learning of Continuous Density HMM for Variable-Length Spoken Unit Discovery
Meng SUN Hugo VAN HAMME Yimin WANG Xiongwei ZHANG

LETTER-Speech and Hearing

Publicized: 2015/10/21
Vol: E99-D No:1
Page(s): 296-299
Unsupervised spoken unit discovery or zero-source speech recognition is an emerging research topic which is important for spoken document analysis of languages or dialects with little human annotation. In this paper, we extend our earlier joint training framework for unsupervised learning of discrete density HMM to continuous density HMM (CDHMM) and apply it to spoken unit discovery. In the proposed recipe, we first cluster a group of Gaussians which then act as initializations to the joint training framework of nonnegative matrix factorization and semi-continuous density HMM (SCDHMM). In SCDHMM, all the hidden states share the same group of Gaussians but with different mixture weights. A CDHMM is subsequently constructed by tying the top-N activated Gaussians to each hidden state. Baum-Welch training is finally conducted to update the parameters of the Gaussians, mixture weights and HMM transition probabilities. Experiments were conducted on word discovery from TIDIGITS and phone discovery from TIMIT. For TIDIGITS, units were modeled by 10 states which turn out to be strongly related to words; while for TIMIT, units were modeled by 3 states which are likely to be phonemes.
Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution
Surachai THONGKAEW Tsuyoshi ISSHIKI Dongju LI Hiroaki KUNIEDA

PAPER-High-Level Synthesis and System-Level Design

Vol: E98-A No:12
Page(s): 2505-2518
The Process Virtual Machine (VM) is typical software that runs applications inside operating systems. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware and operating system and allows bytecodes (portable code) to be executed in the same way on any platform. Process VMs are implemented using an interpreter that interprets bytecode instead of directly executing host machine code. Thus, bytecode execution is slower than execution of compiled programs. Several techniques, including the “Fetch/Decode Hardware Extension” from our previous paper, have been proposed to speed up the interpretation of Process VMs. In this paper, we propose an additional methodology, the “Hardware Extension with Hybrid Execution”, to further enhance the performance of Process VM interpretation, focusing on the register-based model. This new technique provides an additional decoder which can classify bytecodes into either simple or complex instructions. With “Hybrid Execution”, simple instructions are executed directly on the hardware of the native processor, while complex instructions are emulated by the “extra optimized bytecode software handler” of the native processor. In order to eliminate the overhead of retrieving and storing operands in memory, we utilize physical registers instead of (low address) virtual registers. Moreover, the combination of three techniques (Delay scheduling, Mode predictor HW and Branch/goto controller) can eliminate all of the mode-switching overheads between native mode and bytecode mode. The experimental results show that execution speed improves by up to 16.9x, 16.1x and 3.1x on arithmetic instructions, loop & conditional instructions and method invocation & return instructions, respectively. The approximate size of the proposed hardware extension is 0.04 mm² (equivalent to 14.81k gates), and it consumes additional power of only 0.24 mW. The stated results are obtained from logic synthesis using the TSMC 90nm technology @ 200MHz.
Speech Recognition of English by Japanese Using Lexicon Represented by Multiple Reduced Phoneme Sets
Xiaoyun WANG Seiichi YAMAMOTO

PAPER-Speech and Hearing

Publicized: 2015/09/10
Vol: E98-D No:12
Page(s): 2271-2279
Recognition of second language (L2) speech is still a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because pronunciation by L2 speakers is usually significantly influenced by the mother tongue of the speakers. The authors previously proposed using a reduced phoneme set (RPS) instead of the canonical one of L2 when the mother tongue of speakers is known, and demonstrated that this reduced phoneme set improved the recognition performance through experiments using English utterances spoken by Japanese. However, the proficiency of L2 speakers varies widely, as does the influence of the mother tongue on their pronunciation. As a result, the effect of the reduced phoneme set is different depending on the speakers' proficiency in L2. In this paper, the authors examine the relation between proficiency of speakers and a reduced phoneme set customized for them. The experimental results are then used as the basis of a novel speech recognition method using a lexicon in which the pronunciation of each lexical item is represented by multiple reduced phoneme sets, and the implementation of a language model most suitable for that lexicon is described. Experimental results demonstrate the high validity of the proposed method.
One-Step Error Detection and Correction Approach for Voice Word Processor
Junhwi CHOI Seonghan RYU Kyusong LEE Gary Geunbae LEE

PAPER-Artificial Intelligence, Data Mining

Publicized: 2015/05/20
Vol: E98-D No:8
Page(s): 1517-1525
We propose a one-step error detection and correction interface for a voice word processor. This correction interface performs analysis region detection, user intention understanding and error correction utterance recognition, all from a single user utterance input. We evaluate the performance of each component first, and then compare the effectiveness of our interface to two previous interfaces. Our evaluation demonstrates that each component is technically superior to the baselines and that our one-step error detection and correction method yields an error correction interface that is more convenient and natural than the two previous interfaces.
A Linguistics-Driven Approach to Statistical Parsing for Low-Resourced Languages
Prachya BOONKWAN Thepchai SUPNITHI

PAPER

Publicized: 2015/01/21
Vol: E98-D No:5
Page(s): 1045-1052
Developing a practical and accurate statistical parser for low-resourced languages is a hard problem, because it requires large-scale treebanks, which are expensive and labor-intensive to build from scratch. Unsupervised grammar induction theoretically offers a way to overcome this hurdle by learning hidden syntactic structures from raw text automatically. The accuracy of grammar induction is still impractically low because frequent collocations of non-linguistically associable units are commonly found, resulting in dependency attachment errors. We introduce a novel approach to building a statistical parser for low-resourced languages by using language parameters as a guide for grammar induction. The intuition of this paper is: most dependency attachment errors are frequently used word orders which can be captured by a small prescribed set of linguistic constraints, while the rest of the language can be learned statistically by grammar induction. We then show that covering the most frequent grammar rules via our language parameters has a strong impact on the parsing accuracy in 12 languages.
Phoneme Set Design for Speech Recognition of English by Japanese
Xiaoyun WANG Jinsong ZHANG Masafumi NISHIDA Seiichi YAMAMOTO

PAPER-Speech and Hearing

Publicized: 2014/10/01
Vol: E98-D No:1
Page(s): 148-156
This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, the authors propose using a reduced phoneme set generated by a phonetic decision tree (PDT)-based top-down sequential splitting method instead of the canonical one of the second language. The authors verify the efficacy of the proposed method using second language speech collected with a translation game type dialogue-based English CALL system. Experiments show that a speech recognizer achieved higher recognition accuracy with the reduced phoneme set than with the canonical phoneme set.
An Approach for Synthesizing Intelligible State Machine Models from Choreography Using Petri Nets
Toshiyuki MIYAMOTO Yasuwo HASEGAWA Hiroyuki OIMURA

PAPER-Formal Construction

Vol: E97-D No:5
Page(s): 1171-1180
A service-oriented architecture builds the entire system using a combination of independent software components. Such an architecture can be applied to a wide variety of computer systems. The problem of synthesizing service implementation models from choreography representing the overall specifications of service interaction is known as the choreography realization problem. In automatic synthesis, software models should be simple enough to be easily understood by software engineers. In this paper, we discuss a semi-formal method for synthesizing hierarchical state machine models for the choreography realization problem. The proposed method is evaluated using metrics for intelligibility.
Cross-Lingual Phone Mapping for Large Vocabulary Speech Recognition of Under-Resourced Languages
Van Hai DO Xiong XIAO Eng Siong CHNG Haizhou LI

PAPER-Speech and Hearing

Vol: E97-D No:2
Page(s): 285-295
This paper presents a novel acoustic modeling technique for large vocabulary automatic speech recognition of under-resourced languages that leverages well-trained acoustic models of other languages (called source languages). The idea is to use a source language acoustic model to score the acoustic features of the target language, and then map these scores to the posteriors of the target phones using a classifier. The target phone posteriors are then used for decoding in the usual way of hybrid acoustic modeling. The motivation for such a strategy is that human languages usually share similar phone sets and hence it may be easier to predict the target phone posteriors from the scores generated by source language acoustic models than to train an under-resourced language acoustic model from scratch. The proposed method is evaluated on the Aurora-4 task with less than 1 hour of training data. Two types of source language acoustic models are considered, i.e. hybrid HMM/MLP and conventional HMM/GMM models. In addition, we also use triphone tied states in the mapping. Our experimental results show that by leveraging well-trained Malay and Hungarian acoustic models, we achieved 9.0% word error rate (WER) given 55 minutes of English training data. This is close to the WER of 7.9% obtained by using the full 15 hours of training data and much better than the WER of 14.4% obtained by conventional acoustic modeling techniques with the same 55 minutes of training data.
A Note on Pcodes of Partial Words
Tetsuo MORIYA Itaru KATAOKA

LETTER-Fundamentals of Information Systems

Vol: E97-D No:1
Page(s): 139-141
In this paper, we study partial words in relation to pcodes, compatibility, and containment. First, we introduce C⊂(L), the set of all partial words contained by elements of L, and C⊃(L), the set of all partial words containing elements of L, for a set L of partial words. We discuss the relation between C(L), the set of all partial words compatible with elements of the set L, C⊂(L), and C⊃(L). Next, we consider the condition for C(L), C⊂(L), and C⊃(L) to be a pcode when L is a pcode. Furthermore, we introduce some classes of pcodes. An infix pcode and a comma-free pcode are defined, and the inclusion relation among these classes is established.
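The containment and compatibility relations mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration under the standard definitions for partial words (words in which some positions are undefined "holes"), not code from the paper. The hole marker `?`, the function names, and the reading of "contained by" as "every defined position of u is defined in v with the same symbol" are assumptions.

```python
from itertools import product

HOLE = "?"  # assumed marker for an undefined position (a "hole")


def is_contained(u, v):
    """u is contained in v: same length, and v agrees with u wherever u is defined."""
    return len(u) == len(v) and all(a == HOLE or a == b for a, b in zip(u, v))


def is_compatible(u, v):
    """u and v are compatible: same length, and they agree wherever both are defined."""
    return len(u) == len(v) and all(
        a == HOLE or b == HOLE or a == b for a, b in zip(u, v)
    )


def contained_words(v):
    """All partial words contained in v: each defined symbol is kept or turned into a hole."""
    choices = [(c,) if c == HOLE else (c, HOLE) for c in v]
    return {"".join(p) for p in product(*choices)}


def c_sub(words):
    """Sketch of the set written C⊂(L) above, for a finite set of partial words."""
    result = set()
    for v in words:
        result |= contained_words(v)
    return result
```

For example, `is_contained("a?b", "acb")` holds while `is_contained("acb", "a?b")` does not, and `"a?b"` and `"?cb"` are compatible because they agree on the only position where both are defined.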
