IEICE global.ieice.org Site

Keyword Search Result

[Keyword] SPE(2504hit)

2281-2300hit(2504hit)

Speech Recognition Using Function-Word N-Grams and Content-Word N-Grams
Ryosuke ISOTANI Shoichi MATSUNAGA Shigeki SAGAYAMA

PAPER

Vol:
E78-D No:6
Page(s):
692-697
This paper proposes a new stochastic language model for speech recognition based on function-word N-grams and content-word N-grams. The conventional word N-gram models are effective for speech recognition, but they represent only local constraints within a few successive words and lack the ability to capture global syntactic or semantic relationships between words. To represent more global constraints, the proposed language model gives the N-gram probabilities of word sequences, with attention given only to function words or to content words. The sequences of function words and of content words are expected to represent syntactic and semantic constraints, respectively. Probabilities of function-word bigrams and content-word bigrams were estimated from a 10,000-sentence text database, and analysis using an information-theoretic measure showed that the expected constraints were extracted appropriately. As an application of this model to speech recognition, a post-processor was constructed to select the optimum sentence candidate from a phrase lattice obtained by a phrase recognition system. The phrase candidate sequence with the highest total acoustic and linguistic score was sought by dynamic programming. The results of experiments carried out on the utterances of 12 speakers showed that the proposed method is more accurate than a CFG-based method, thus demonstrating its effectiveness in improving speech recognition performance.
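As a rough illustration of the idea in this abstract, the sketch below splits each sentence into a function-word stream and a content-word stream and estimates maximum-likelihood bigram probabilities for each. The word list, the toy corpus, and the helper names `split_streams` and `bigram_probs` are assumptions made for illustration; the paper's own word classes, smoothing, and data are not given in the abstract.

```python
from collections import Counter

# Hypothetical function-word list (assumption; not taken from the paper).
FUNCTION_WORDS = {"the", "a", "of", "to", "in", "is"}

def split_streams(sentence):
    """Split a tokenized sentence into function-word and content-word subsequences."""
    func = [w for w in sentence if w in FUNCTION_WORDS]
    cont = [w for w in sentence if w not in FUNCTION_WORDS]
    return func, cont

def bigram_probs(sequences):
    """Maximum-likelihood estimates of P(w2 | w1), with sentence-boundary markers."""
    pair_counts, ctx_counts = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        for w1, w2 in zip(padded, padded[1:]):
            pair_counts[(w1, w2)] += 1
            ctx_counts[w1] += 1
    return {pair: c / ctx_counts[pair[0]] for pair, c in pair_counts.items()}

# Toy corpus (assumption), standing in for the 10,000-sentence database.
corpus = [
    "the cat is in the garden".split(),
    "the dog is in the house".split(),
]
func_seqs, cont_seqs = zip(*(split_streams(s) for s in corpus))
func_bigrams = bigram_probs(func_seqs)   # e.g. P("is" | "the") over function words only
cont_bigrams = bigram_probs(cont_seqs)   # e.g. P("garden" | "cat") over content words only
```

Because each stream skips the other word class, a bigram in either stream can link words that are several positions apart in the original sentence, which is the kind of longer-range constraint the abstract describes.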
Relationship among Recognition Rate, Rejection Rate and False Alarm Rate in a Spoken Word Recognition System
Atsuhiko KAI Seiichi NAKAGAWA

PAPER

Vol:
E78-D No:6
Page(s):
698-704
Detection of an unknown word or non-vocabulary word uttered by the user is necessary in realizing a practical spoken language user-interface. This paper describes the evaluation of an unknown word processing method for a subword-unit-based spoken word recognizer. We have assessed the relationship between the word recognition accuracy of a system and the detection rate of unknown words, both by simulation and by experiments with the unknown word processing method. We found that the detection accuracies obtained with unknown word processing are significantly influenced by the original word recognition accuracy, while the degree of this effect depends on the vocabulary size.
Performance of Spread Spectrum Medical Telemetry System in a Sharing Frequency Band with Current Telemetry System
Masaki KYOSO Toshiaki TAKANE Akihiko UCHIYAMA

LETTER

Vol:
E78-B No:6
Page(s):
862-865
To make medical telemetry systems more reliable in severe electromagnetic environments, we applied spread spectrum communication to ECG data transmission. Spread spectrum communication systems have shown performance superior to other systems, especially with respect to anti-jamming, which allows them to share the frequency band with current telemetry systems. In this study, we show the characteristics of a spread spectrum transmitter when it is used in the same frequency band as a narrow-band transmitter. The results show that the spread spectrum telemetry system can use the same frequency band permitted for medical telemetry systems.
Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech
Thanh Tung LE John MASON Tadashi KITAMURA

PAPER

Vol:
E78-D No:6
Page(s):
744-750
A multi-layer perceptron (MLP) acting directly in the time-domain is applied as a speech signal enhancer, and the performance examined in the context of three common classes of degradation, namely non-linear system degradation (low bit-rate CELP coding), additive noise, and convolution by a linear system. The investigation focuses on two topics: (i) the influence of non-linearities within the network and (ii) network topology, comparing single and multiple output structures. The objective is to examine how these characteristics influence network performance and whether this depends on the class of degradation. Experimental results show the importance of matching the enhancer to the class of degradation. In the case of the CELP coder the standard MLP with its inherently non-linear characteristics is shown to be consistently better than any equivalent linear structure (up to 3.2 dB compared with 1.6 dB SNR improvement). In contrast, when the degradation is from additive noise, a linear enhancer is always superior.
All-Optical Timing Clock Extraction Using Multiple Wavelength Pumped Brillouin Amplifier
Hiroto KAWAKAMI Yutaka MIYAMOTO Tomoyoshi KATAOKA Kazuo HAGIMOTO

PAPER

Vol:
E78-B No:5
Page(s):
694-701
This paper discusses an all-optical tank circuit that uses the comb-shaped gain spectrum generated by a Brillouin amplifier. The theory of timing clock extraction is shown for two cases: with two gains and with three gains. In both cases, the waveform of the extracted timing clock is simulated. According to the simulation, unlike an ordinary tank circuit, the amplitude of the extracted clock is not constant even though the quality factor (Q) is infinite. The extracted clock is clearly influenced by the pattern of the original data stream if the Brillouin gain is finite. The ratio of the maximum extracted clock amplitude to the minimum extracted amplitude is calculated as a function of Brillouin gain. The detuning of the pump light frequency is also discussed. It induces not only changes in the Brillouin gain, but also a phase shift in the amplified light. The relation between the frequency drift of the pump lights and the jitter of the extracted timing clock is shown for both cases, with two pump lights and with three pump lights. It is numerically shown that when all the pump lights have the same frequency drift, i.e., their frequency separation is constant, the phase of the extracted clock is not influenced by the frequency drift of the pump lights. The operation principle is demonstrated at 5 Gbit/s, 2.5 Gbit/s, and 2 Gbit/s using two pumping techniques. The parameters of quality factor and the suppression ratio in the baseband domain are measured. Q and the suppression ratio are found to be 160 and 28 dB, respectively.
Identifying Strategies Using Decision Lists from Trace Information
Satoshi KOBAYASHI

PAPER-Machine Learning and Its Applications

Vol:
E78-D No:5
Page(s):
545-552
This paper concerns the issue of learning strategies for problem solvers from trace data. Many works on Explanation Based Learning have proposed methods for speeding up a given problem solver (or a Prolog program) by optimizing it on some subspace of problem instances with a high probability of occurrence. In the current paper, however, we discuss the issue of identifying a target strategy exactly from trace data. The learning criterion used in this paper is identification in the limit, as proposed by Gold. Further, we use the tree pattern language to represent preconditions of operators, and propose a class of strategies, called decision list strategies. One of the interesting features of our learning algorithm is the coupled use of state and operator sequence information of traces. Theoretically, we show that the proposed algorithm identifies some subclass of decision list strategies in the limit with the conjectures updated in polynomial time. Further, an experimental result on the N-puzzle domain is presented.
A Formal Verification Algorithm for Pipelined Processors
Toru SHONAI Tsuguo SHIMIZU

PAPER-VLSI Design Technology and CAD

Vol:
E78-A No:5
Page(s):
618-631
We describe a formal verification algorithm for pipelined processors. This algorithm proves the equivalence between a processor's design and its specifications by using rewriting of recursive functions and a new type of mathematical induction: extended recursive induction. After the user indicates only selectors in the design, this algorithm can automatically prove processors having more than 10^(10^10) states. The algorithm is manually applied to benchmark processors with pipelined control, and we discuss how data width, memory size, and the numbers of pipeline stages and instructions influence the computation cost of proving the correctness of the processors. Further, this algorithm can be used to generate a pipeline invariant.
Passive Sonar-Ranging System Based on Adaptive Filter Technique
Chang-Yu SUN Qi-Hu LI Takashi SOMA

PAPER-Digital Signal Processing

Vol:
E78-A No:5
Page(s):
594-599
This paper introduces a noise-cancelling sonar-ranging system based on the adaptive filtering technique, which can automatically adapt itself to changes in the environmental noise field and improve passive sonar-ranging/goniometric precision. The software and hardware design principles of the system, which uses high-speed VLSI (Very Large Scale Integrated) DSP (Digital Signal Processing) chips, and practical test results are also presented. In comparison with the traditional ranging system, the system not only clearly improves ranging precision but also offers further advantages such as a simple structure, rapid operation, large data-storage volume, easy programming, and high reliability.
Dynamic Terminations for Low-Power High-Speed Chip Interconnection in Portable Equipment
Takayuki KAWAHARA Masakazu AOKI Katsutaka KIMURA

PAPER-Digital Circuits

Vol:
E78-C No:4
Page(s):
404-413
Two types of dynamic termination, latch-type and RC-type, are useful for low-power high-speed chip interconnection where the transmission line is terminated only if the signal is changed. The gate of the termination MOS in the latch-type is driven by a feedback inverter, and that in the RC-type is driven by a differentiating signal through the resistor and capacitor. The power dissipation is 13% for the latch-type, and 11% for the RC-type in a DC termination scheme, and the overshoot is 32% for the latch-type, and 16% for the RC-type in an open scheme, both at a signal amplitude of 2 V. The RC-type is superior for signal swings as low as 1 V. On the other hand, RC termination requires large capacitance, and thus high power. Diode termination is not effective for a small swing because of the large ON voltage of diodes.
High-Speed and Low-Power n⁺-p⁺ Double-Gate SOI CMOS
Kunihiro SUZUKI Tetsu TANAKA Yoshiharu TOSAKA Hiroshi HORIE Toshihiro SUGII

PAPER-Device Technology

Vol:
E78-C No:4
Page(s):
360-367
We propose and fabricate n⁺-p⁺ double-gate SOI MOSFETs for which threshold voltage is controlled by interaction between the two gates. Devices have excellent short channel immunity, despite a low channel doping concentration of 10¹⁵ cm⁻³, and enable us to design a threshold voltage below 0.3 V while maintaining an almost ideal subthreshold swing. We demonstrated 27 ps CMOS inverter delay with a gate length of 0.19 µm, which is, to our knowledge, the lowest delay for this gate length despite a rather thick 9 nm gate oxide. This high performance is a result of the low threshold voltage and negligible drain capacitance. We also showed theoretically that we can design a 0.1 µm gate length device with an ideal subthreshold swing, and that we can expect less than 10 ps inverter delay at a supply voltage of 1 V.
High-Speed High-Density Self-Aligned PNP Technology for Low-Power Complementary Bipolar ULSIs
Katsuyoshi WASHIO Hiromi SHIMAMOTO Tohru NAKAMURA

PAPER-Device Technology

Vol:
E78-C No:4
Page(s):
353-359
A high-speed high-density self-aligned pnp technology for complementary bipolar ULSIs has been developed to achieve high-speed and low-power performance simultaneously. It is fully compatible with the npn process. A low sheet-resistance p⁺ buried layer and a low sheet-resistance extrinsic n⁺ polysilicon layer with U-grooved isolation enable the transistor size to be scaled down to about 20 µm². Current gain of 85 with 4-V collector-emitter breakdown voltage was obtained without any leakage current arising from emitter-base forward tunneling or recombination, which indicates no extrinsic base encroachment problem. A shallow emitter junction depth of 45 nm and narrow base width of 30 nm, obtained by utilizing an optimized retrograded p-well, an arsenic-implanted intrinsic base, and emitter diffusion from BF₂-implanted polysilicon, improve the maximum cutoff frequency to 35 GHz. The power dissipation of the pnp pull-down complementary emitter-follower ECL circuit with load capacitances is calculated to be reduced to 20-40% of a conventional ECL circuit.
A Fair and Wasteless Channel Assignment Protocol for Optical Dual Bus Networks
Shu LI Yasumitsu MIYAZAKI

PAPER

Vol:
E78-B No:4
Page(s):
539-545
The Distributed Queue Dual Bus protocol (DQDB) has been adopted as the metropolitan area network (MAN) standard by the IEEE 802.6 committee. Recently, the unfairness problem in the DQDB protocol, by which head stations benefit, has been pointed out. Although a fair bandwidth distribution among the stations is obtained by adding the so-called bandwidth balancing mechanism into the DQDB protocol (DQDB/BB), the DQDB/BB protocol leaves a portion of the available bandwidth unused, and it takes a considerable amount of time to converge to fair channel assignment. In this study, to overcome the drawbacks in DQDB and DQDB/BB, we introduce a new media access control protocol which is based on assigning each station a level according to some traffic information such as the queueing length, delay time, etc. Only the station with the highest level is allowed to transmit. Through the operation of level assignment or the choice of level function, the transmission can be easily controlled in a distributed manner. This protocol is simple compared with DQDB/BB and can be implemented in the DQDB architecture. The simulation results show that the new protocol obtains not only fair throughput regardless of the distance between the stations, but also fair delay performance. In addition, the new protocol can easily provide preemptive priority service through level assignment. The new protocol converges to fair distribution of the channel in the time required for only one or two round-trips. This is very fast compared with the DQDB/BB protocol.
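The rule that only the station with the highest level transmits can be illustrated with a toy slot-by-slot simulation. This is a minimal sketch under stated assumptions, not the protocol from the paper: the level function (a weighted sum of queue length and waiting time), the tie-breaking by station index, and the names `level` and `simulate` are all hypothetical, and the dual-bus signalling itself is not modelled.

```python
def level(queue_length, waiting_slots, w_queue=1.0, w_delay=1.0):
    """Hypothetical level function: weighted sum of queue length and waiting time."""
    return w_queue * queue_length + w_delay * waiting_slots

def simulate(queues, slots):
    """In each slot, the backlogged station with the highest level sends one segment."""
    queues = list(queues)
    waiting = [0] * len(queues)   # slots each station has waited since its last send
    sent = [0] * len(queues)      # segments transmitted per station
    for _ in range(slots):
        candidates = [i for i, q in enumerate(queues) if q > 0]
        if not candidates:
            break
        # Ties are broken in favour of the lower station index (an assumption).
        winner = max(candidates, key=lambda i: (level(queues[i], waiting[i]), -i))
        queues[winner] -= 1
        sent[winner] += 1
        for i in candidates:
            waiting[i] = 0 if i == winner else waiting[i] + 1
    return sent

sent = simulate([3, 3, 3], slots=9)
```

With equal initial queues, the waiting-time term rotates access among the stations, so no station is favoured by its index in this toy setting. Changing the weights in `level` is one way to picture how a choice of level function could shape priority.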
A New Approach of Parsing and Search Based on the Divide and Conquer Strategy for Continuous Speech Recognition
Ming-Sheng WANG Satoshi IMAI

PAPER-Speech Processing and Acoustics

Vol:
E78-D No:4
Page(s):
455-465
In this paper, we report a new approach to the parsing and search problem for a given phonetic lattice. The approach is based on the Divide and Conquer (DC) strategy. By dividing the phonetic lattice, we first construct a PD-tree to represent the lattice; we then parse this PD-tree to identify the candidate sentence assumed to correspond to the speech utterance. Next, we propose a new search scheme called the Downward Request (DR) search model to decrease the computation cost; this search model gives the optimal or N-best solutions. Experiments on Chinese speech recognition show good results.
Design and Construction of an Advisory Dialogue Database
Tadahiko KUMAMOTO Akira ITO Tsuyoshi EBINA

PAPER-Databases

Vol:
E78-D No:4
Page(s):
420-427
We are aiming to develop a computer-based consultant system which helps novice computer users to achieve their task goals on computers through natural language dialogues. Our target is spoken Japanese. To develop effective methods for processing spoken Japanese, it is essential to analyze real dialogues and to find the characteristics of spoken Japanese. In this paper, we discuss the design problems associated with constructing a spoken dialogue database from the viewpoint of advisory dialogue collection, describe XMH (X-window-based electronic mail handling program) usage experiments made to collect advisory dialogues between novice XMH users and an expert consultant, and show the dialogue database we constructed from these dialogues. The main features of our database are as follows: (1) Our target dialogues were advisory ones. (2) The advisory dialogues were all related to the use of XMH, which has a visual interface operated by a keyboard and a mouse. (3) The primary objective of the users was not to engage in dialogues but to achieve specific task goals using XMH. (4) Not only what the users said but also XMH operations performed by the users are included as dialogue elements. This kind of dialogue database is a very effective source for developing new methods for processing spoken language in multimodal consultant systems, and we have therefore made it available to the public. Based on our analysis of the database, we have already developed several effective methods such as a method for recognizing a user's communicative intention from a transcript of spoken Japanese, and a method for controlling dialogues between a novice XMH user and the computer-based consultant system which we are developing. Also, we have proposed several response generation rules as the response strategy for the consultant system. We have developed an experimental consultant system by implementing the above methods and strategy.
A New Emitter-Follower Circuit for High-Speed and Low-Power ECL
Nagisa SASAKI Hisayasu SATO Kimio UEDA Koichiro MASHIKO Hiroshi SHIBATA

PAPER-Digital Circuits

Vol:
E78-C No:4
Page(s):
374-380
We propose a directly controlled emitter-follower circuit with a feedback-type level stabilizer for low-voltage, low-power and high-speed bipolar ECL circuits. The emitter-follower circuit employs a current source structure that compensates speed and power across supply voltages and temperatures. The feedback-controlled circuit with a small current source stabilizes the 'High' level. At a power consumption of 1 mW/gate, the new circuit is 45% faster under the loaded condition (FO = 1, CL = 0.5 pF) and has 47% better load driving capability than conventional ECL gates.
Decomposable Termination of Composable Term Rewriting Systems
Masahito KURIHARA Azuma OHUCHI

PAPER-Algorithm and Computational Complexity

Vol:
E78-D No:4
Page(s):
314-320
We extend the theorem of Gramlich on modular termination of term rewriting systems, by relaxing the disjointness condition and introducing the composability instead. More precisely, we prove that if $R_1$, $R_{-1}$ are composable, terminating term rewriting systems such that their union is nonterminating, then for some $\alpha \in \{1, -1\}$, $R_\alpha \cup OR$ is nonterminating and $R_{-\alpha} \setminus R_\alpha$ is $F_\alpha$-lifting. Here, $OR$ is defined to be the special system $\{or(x, y) \to x,\ or(x, y) \to y\}$, $F_\alpha$ is the set of function symbols associated with $R_\alpha$, and an $F_\alpha$-lifting system contains a rule which has either a variable or a symbol from $F_\alpha$ at the leftmost position of its right-hand side. The extended theorem is stronger than the original one in that it relaxes the disjointness and constructor-sharing conditions and allows the two systems to share defined symbols under the restriction of composability. The corollaries of the theorem show several sufficient conditions for decomposability of termination, which are useful for proving termination of term rewriting systems defined by combination of several composable modules.
Group Communications Algorithm for Dynamically Updating in Distributed Systems
Hiroaki HIGAKI

PAPER-Computer Networks

Vol:
E78-D No:4
Page(s):
444-454
This paper proposes a novel updating technique, dynamically updating, for achieving extension or modification of functions in a distributed system. The usual updating technique requires synchronous suspension of multiple processes to avoid unspecified reception caused by conflicts between different versions of processes. Thus, this technique incurs very high overhead, and the types of distributed systems to which it can be applied are restricted to the RPC (remote procedure call) type or the client-server type. Using the proposed dynamically updating technique, updating management can be invoked asynchronously by each process with assurance of correct execution of the system, i.e., the system can cope with the effect of unspecified reception caused by a mixture of processes of different versions. Therefore, low-overhead updating can be achieved in partner-type distributed systems, a more general type that includes communications systems and computer networks. The dynamically updating technique is implemented by using a novel distributed algorithm that consists of group communication, checkpoint setting, and rollback recovery. By using the algorithm proposed in this paper, rollback recovery can be achieved with the lowest overhead, i.e., a set of checkpoints determines the last global state for consistent rollback recovery and the set of processes that need to roll back simultaneously is the smallest one. This paper also proves the correctness of the proposed algorithm.
On the Edge Importance Using Its Traffic Based on a Distribution Function along Shortest Paths in a Network
Peng CHENG Shigeru MASUYAMA

LETTER-Graphs, Networks and Matroids

Vol:
E78-A No:3
Page(s):
440-443
We model a road network as a directed graph G(V,E) with a source s and a sink t, where each edge e has a positive length l(e) and each vertex v has a distribution function αv with respect to the traffic entering and leaving v. This paper proposes a polynomial time algorithm for evaluating the importance of each edge e ∈ E, which is defined to be the traffic f(e) passing through e in order to assign the required traffic Fst(0) from s to t along only shortest s-t paths in accordance with the distribution function αv at each vertex v.
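One way to picture the quantity f(e) is the sketch below, which routes a given amount of traffic from s to t along shortest s-t paths only and records how much passes through each edge. It is not the algorithm from the paper: the abstract does not define the distribution function, so the sketch assumes that every vertex splits its incoming traffic equally among its outgoing shortest-path edges. The function names, the example graph, and the requirement that t is reachable from s are likewise assumptions.

```python
import heapq
from collections import defaultdict

def dijkstra(adj, source):
    """Shortest distances from source over adjacency lists of (neighbor, length)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in adj[u]:
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def edge_traffic(edges, s, t, total):
    """Traffic on each edge when `total` units flow from s to t on shortest paths only.

    Assumption: each vertex splits its traffic equally among its outgoing
    shortest-path edges (a stand-in for the paper's distribution function).
    """
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v, length in edges:
        fwd[u].append((v, length))
        rev[v].append((u, length))
    ds, dt = dijkstra(fwd, s), dijkstra(rev, t)
    # An edge lies on some shortest s-t path iff the distances add up exactly.
    out = defaultdict(list)
    for u, v, length in edges:
        if u in ds and v in dt and ds[u] + length + dt[v] == ds[t]:
            out[u].append(v)
    inflow = defaultdict(float)
    inflow[s] = total
    traffic = {}
    # Positive lengths make increasing distance from s a topological order.
    for u in sorted(out, key=lambda x: ds[x]):
        share = inflow[u] / len(out[u])
        for v in out[u]:
            traffic[(u, v)] = share
            inflow[v] += share
    return traffic

edges = [("s", "a", 1), ("s", "b", 1), ("a", "c", 1),
         ("b", "c", 1), ("a", "t", 2), ("c", "t", 1)]
traffic = edge_traffic(edges, "s", "t", total=12)
```

Integer edge lengths are used so that the exact equality test on distances is safe; with floating-point lengths a tolerance would be needed.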
Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs
Kazuyuki TAKAGI Shuichi ITAHASHI

PAPER-Speech Processing

Vol:
E78-D No:3
Page(s):
269-276
There are various difficulties in processing spoken dialogs because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 task-oriented simulated dialogs from the ASJ continuous speech corpus, conducted by 13 different participants, with a total length of 66 minutes. Speech data was segmented into utterance units, each of which is a speech interval delimited by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts and chiming utterances. Each dialog consisted of 6 to 15 topic segments, in each of which participants exchange specific information about the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. It was found that the durations of turn-taking interjections and fillers, including the preceding silent pause, were significantly longer at topic boundaries than in other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. Response time of both client and agent role speakers became shorter as the dialog proceeded. This indicates that interactions between the participants become active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts, where most utterances are fixed expressions, and slow in topic segments of the body part of the dialog, where both client and agent participants stalled while speaking in order to retrieve task knowledge.
The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.
Signature Pairs for Direct-Sequence Spread-Spectrum Multiple Access Communication Systems
Guu-Chang YANG

LETTER-Radio Communication

Vol:
E78-B No:3
Page(s):
420-423
A key element in CDMA transmission is DS spreading. Spreading in a DS/SSMA system is provided in two categories: synchronization and data. For synchronization sequences, good auto-correlation and cross-correlation properties are required in order to guarantee fast acquisition with a minimum false alarm probability. On the other hand, the auto-correlation property may not be so important in data spreading since synchronization is obtained by synchronization spreading. In this paper we provide a set of synchronization sequences and a set of data sequences, each a set of binary N-tuples, that have the necessary correlation constraints.
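The correlation properties mentioned in this abstract can be checked numerically for any candidate pair of sequences. The sketch below computes periodic auto-correlation and cross-correlation of binary sequences mapped to ±1. The example sequences are an assumption chosen for illustration (a length-7 maximal-length sequence and its reverse), not the signature pairs constructed in the paper, and the function names are hypothetical.

```python
def to_bipolar(bits):
    """Map binary symbols {0, 1} to {-1, +1}."""
    return [1 if b else -1 for b in bits]

def periodic_correlation(x, y):
    """Periodic correlation of two equal-length sequences at every cyclic shift."""
    n = len(x)
    return [sum(x[i] * y[(i + k) % n] for i in range(n)) for k in range(n)]

# Illustrative sequences (assumption): a length-7 m-sequence and its reverse.
sync = to_bipolar([1, 1, 1, 0, 1, 0, 0])
data = to_bipolar([0, 0, 1, 0, 1, 1, 1])

auto = periodic_correlation(sync, sync)    # peak at shift 0, low off-peak values
cross = periodic_correlation(sync, data)   # should stay well below the auto peak
peak_cross = max(abs(c) for c in cross)
```

A sharp auto-correlation peak with low sidelobes is what makes a sequence suitable for acquisition, while a small `peak_cross` limits interference between the synchronization and data sequences.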
