The search functionality is under construction.

Keyword Search Result

[Keyword] voice(140hit)

101-120hit(140hit)

  • A Context Clustering Technique for Average Voice Models

    Junichi YAMAGISHI  Masatsune TAMURA  Takashi MASUKO  Keiichi TOKUDA  Takao KOBAYASHI  

     
    PAPER-Speech Synthesis and Prosody

      Vol:
    E86-D No:3
      Page(s):
    534-542

    This paper describes a new context clustering technique for an average voice model, which is a set of speaker independent speech synthesis units. In the technique, we first train speaker dependent models using a multi-speaker speech database, and then construct a decision tree common to these speaker dependent models for context clustering. When a node of the decision tree is split, only the context related questions which are applicable to all speaker dependent models are adopted. As a result, every node of the decision tree always has training data from all speakers. After construction of the decision tree, all speaker dependent models are clustered using the common decision tree, and a speaker independent model, i.e., an average voice model, is obtained by combining the speaker dependent models. From the results of subjective tests, we show that the average voice models trained using the proposed technique can generate more natural sounding speech than the conventional average voice models.

  • Comparative Assessment of Test Signals Used for Measuring Residual Echo Characteristics

    Nobuhiko KITAWAKI  Takeshi YAMADA  Futoshi ASANO  

     
    PAPER-Network

      Vol:
    E86-B No:3
      Page(s):
    1102-1108

    Appropriate test signals, defined by formula or generated by algorithm, are used for measuring objective QoS (Quality of Service) for voice-operated telecommunication devices such as telephones and speech codecs (coder-decoders). However, the test signal for measuring residual echo characteristics in hands-free telecommunications equipped with an acoustic echo canceller is still under study in ITU-T Recommendation G.167. This paper describes a comparative assessment of test signals for measuring residual echo characteristics. In hands-free telecommunications, the acoustic echo canceller has been developed to remove the room echo signal that passes from the loudspeaker to the microphone at the receiving end. The performance of the echo canceller system is evaluated by residual echo characteristics expressed as echo return loss enhancement (ERLE). The ERLE is conventionally measured by feeding white noise into the echo canceller system. However, white noise is not adequate as a test signal for measuring the performance of the echo canceller, since the performance may depend on the characteristics of the input test signal, and the characteristics of white noise differ from those of real voice. Therefore, this paper discusses the characteristics of real voice required for objective quality evaluation of an echo canceller system. The test signals used in the verification tests were real voice (RV), white noise (WN), frequency weighted noise (FWN), artificial voice (AV), and composite source signal (CSS), which differ in how closely they approximate real voice characteristics. The comparative assessment shows that the ERLE characteristics measured with artificial voice conforming to ITU-T Recommendation P.50, which has the average characteristics of real voices in the time and frequency domains, are almost equivalent to those of real voice and are the best among the test signals. It is concluded that the P.50 artificial voice is satisfactory for measuring residual echo characteristics.
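    As a rough illustration of the ERLE metric mentioned in this abstract, the sketch below computes ERLE in dB as the ratio of echo power before cancellation to residual power after cancellation. This is the commonly used definition, not code from the paper; the function name `erle_db` and the toy signals are assumptions made for this example.

```python
import math


def erle_db(echo, residual):
    """Return ERLE in dB for two equal-length sample sequences.

    echo:     microphone echo samples before cancellation (assumed input)
    residual: samples remaining after the echo canceller (assumed input)
    """
    if not echo or len(echo) != len(residual):
        raise ValueError("echo and residual must be non-empty and equal in length")
    # Mean power of each signal over the measurement window.
    p_echo = sum(x * x for x in echo) / len(echo)
    p_res = sum(x * x for x in residual) / len(residual)
    if p_echo == 0:
        raise ValueError("echo signal has zero power; ERLE is undefined")
    if p_res == 0:
        return math.inf  # perfect cancellation
    return 10.0 * math.log10(p_echo / p_res)


# Toy example: the canceller attenuates the echo amplitude by a factor of 10.
echo = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
print(round(erle_db(echo, residual), 3))
```

    In a measurement such as the one described, `echo` would be produced by driving the loudspeaker with the chosen test signal (RV, WN, FWN, AV, or CSS), so the resulting ERLE depends on the excitation as well as on the canceller.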

  • Decreasing Suspension Time for Fast Moving Data Calls in an Integrated Micro-Cellular Network with Preemption

    Gaute LAMBERTSEN  Takahiko YAMADA  

     
    PAPER

      Vol:
    E85-B No:10
      Page(s):
    2012-2020

    In this paper, we propose and evaluate a new channel assignment scheme for a micro-cellular network integrating data and conversational services. The channel assignment scheme combines handover processing depending on terminal speed with a preemptive scheme. High-speed terminals take over the channels of data terminals upon entering a full cell, while the data terminals are put in a queue until new resources are available. Simulating several variations of the scheme, allowing both fast moving data and voice terminals to preempt data terminals yielded the best result. Suspension time for fast moving data terminals was reduced dramatically, reducing the disadvantage caused by a high number of handovers. The cost was a small increase in blocking probability for new terminals.

  • QoS Evaluation of VoIP Communication Employing Self-Organizing Neural Network

    Masao MASUGI  

     
    LETTER-Internet

      Vol:
    E85-B No:9
      Page(s):
    1867-1871

    This paper describes a QoS evaluation method for VoIP communications using a self-organizing neural network. Based on measurements in real environments, evaluation results confirmed that our method can effectively display total QoS level composed of several QoS-related factors such as PSQM+ and end-to-end delay.

  • Performance Modeling and Analysis of SIP-T Signaling System in Carrier Class Packet Telephony Network for Next Generation Networks

    Peir-Yuan WANG  Jung-Shyr WU  

     
    PAPER-Network

      Vol:
    E85-B No:8
      Page(s):
    1572-1584

    This paper presents the performance modeling, analysis, and simulation of SIP-T (Session Initiation Protocol for Telephones) signaling system in carrier class packet telephony network for NGN (Next Generation Networks). Until recently, one of the greatest challenges in the migration from existing PSTN (Public Switched Telephone Network) toward NGN is to build a carrier class packet telephony network that preserves the ubiquity, quality, and reliability of PSTN services while allowing the greatest flexibility for use of new packet telephony technology. The SIP-T signaling system defined in IETF (Internet Engineering Task Force) draft is a mechanism that uses SIP (Session Initiation Protocol) to facilitate the interconnection of PSTN with carrier class packet telephony network. Based on IETF, the SIP-T signaling system not only promises scalability, flexibility, and interoperability with PSTN but also provides call control function of MGC (Media Gateway Controller) to set up, tear down, and manage VoIP (Voice over IP) calls in carrier class packet telephony network. In this paper, we derive the buffer size, the mean of queueing delay, and the variance of queueing delay of SIP-T signaling system that are the major performance evaluation parameters for improving QoS (Quality of Service) and system performance of MGC in carrier class packet telephony network focused on toll by-pass or tandem by-pass of PSTN. First, we assume a mathematical model of the M/G/1 queue with non-preemptive priority assignment to represent SIP-T signaling system. Second, we derive the formulas of buffer size, queueing delay, and delay variation for the non-preemptive priority queue using queueing theory. In addition, some numerical examples of buffer size, queueing delay, and delay variation are presented. Finally, the theoretical estimates are shown to be in excellent agreement with simulation results.
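    To make the M/G/1 non-preemptive priority model named in this abstract more concrete, the sketch below evaluates the standard textbook formula for the mean queueing delay of each priority class. It is not the paper's derivation (which also covers buffer size and delay variance); the function name and the example traffic values are assumptions chosen only for illustration.

```python
def nonpreemptive_priority_waits(arrival_rates, mean_service, second_moment_service):
    """Mean queueing delay per class in an M/G/1 non-preemptive priority queue.

    Classes are listed from highest to lowest priority. For class k:
        W_k = W0 / ((1 - sigma_{k-1}) * (1 - sigma_k))
    where W0 = sum_i(lambda_i * E[S_i^2]) / 2 is the mean residual service time
    and sigma_k is the cumulative utilization of classes 1..k.
    """
    rhos = [lam * es for lam, es in zip(arrival_rates, mean_service)]
    if sum(rhos) >= 1.0:
        raise ValueError("total utilization must be below 1 for a stable queue")
    w0 = sum(lam * es2 for lam, es2 in zip(arrival_rates, second_moment_service)) / 2.0
    waits = []
    sigma_prev = 0.0
    for rho in rhos:
        sigma = sigma_prev + rho
        waits.append(w0 / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


# Hypothetical example: two message classes with exponential service of
# mean 1 (so E[S^2] = 2), arriving at rates 0.2 and 0.3 per unit time.
print(nonpreemptive_priority_waits([0.2, 0.3], [1.0, 1.0], [2.0, 2.0]))
```

    The higher-priority class waits less than the lower-priority class even though both share the same server, which is the effect a priority assignment for signaling messages relies on.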

  • Voice Conversion Using Low Dimensional Vector Mapping

    Ki-Seung LEE  Won DOH  Dae-Hee YOUN  

     
    PAPER-Speech and Hearing

      Vol:
    E85-D No:8
      Page(s):
    1297-1305

    In this paper, a new voice personality transformation algorithm which uses the vocal tract characteristics and pitch period as feature parameters is proposed. The vocal tract transfer function is divided into time-invariant and time-varying parts. Conversion rules for the time-varying part are constructed by the classified-linear transformation matrix based on soft-clustering techniques for LPC cepstrum expressed in KL (Karhunen-Loève) coefficients. An excitation signal containing prosodic information is transformed by average pitch ratio. In order to improve the naturalness, transformation on the excitation signal is separately applied to voiced and unvoiced bands to preserve the overall spectral structure. Objective tests show that the distance between the LPC cepstrum of a target speaker and that of the speech synthesized using the proposed method is reduced by about 70% compared with the distance between the target speaker's LPC cepstrum and the source speaker's. Also, subjective listening tests show that 60-70% of listeners identify the transformed speech as the target speaker's.

  • Performance Analysis of Bulk Handoff in Integrated Voice/Data Wireless Networks

    Youl-Kyu SUH  Sung-Hong WIE  Hyun-Ho CHOI  Dong-Ho CHO  

     
    LETTER-Wireless Communication Technology

      Vol:
    E85-B No:7
      Page(s):
    1396-1401

    In this paper, we analyze the performance of a bulk handoff scheme for mixed traffic in integrated voice/data wireless mobile networks in which new and handoff voice/data calls are accepted based on prioritization of handoff requests. If fewer channels than handoff calls are available in the target cell, some handoff calls are terminated without queuing. A higher priority is given to voice handoff calls than to data handoff calls. A multidimensional birth-death process model is presented to analyze the bulk handoff performance of mixed traffic. A numerical analysis of system performance is presented to evaluate the blocking probabilities of new voice and data calls, handoff failure probabilities, and the forced termination probabilities of voice/data handoff calls.

  • Channel Assignment Scheme for Integrated Voice and Data Traffic in Reservation-Type Packet Radio Networks

    Hideyuki UEHARA  Masato FUJIHARA  Mitsuo YOKOYAMA  Hiro ITO  

     
    PAPER

      Vol:
    E85-B No:1
      Page(s):
    191-198

    In this paper, we propose a channel assignment scheme for integrated voice and data traffic in reservation multiple access protocol. In the proposed scheme, a voice packet never contends with a data packet and takes over the slot which is previously assigned to a data packet. Thus, a larger number of voice terminals can be accommodated without degradation of quality and throughput even in the situation that data were integrated. We evaluate the voice packet dropping probability, throughput and packet delay through computer simulation. The results show that the proposed scheme has better performance than the conventional PRMA and DQRUMA systems.

  • Improvement of PSRR Characteristics of a SCF Using a Leapfrog Filter and an Equal Level Diagram Design

    Katsuhiro FURUKAWA  

     
    LETTER-Analog Signal Processing

      Vol:
    E84-A No:10
      Page(s):
    2600-2605

    The power supply rejection ratio (PSRR) characteristics of a switched capacitor filter (SCF) are improved by using an equal level diagram design of a leapfrog type filter. With this design method, the measured PSRR of an SCF is shown to improve by about 20 dB.

  • A New Effective Analysis for Wireless CSMA/CA LANs Supporting Real-Time Voice and Data Services

    Wuyi YUE  Yutaka MATSUMOTO  

     
    PAPER

      Vol:
    E84-A No:7
      Page(s):
    1660-1669

    Wireless LANs have been used for realizing fully-distributed users in a multimedia environment that has the ability to provide real-time bursty traffic (such as voice or video) and data traffic. In this paper, we present a new realistic and detailed system model and a new effective analysis for the performance of wireless LANs which support multimedia communication with non-persistent carrier sense multiple access with collision avoidance (CSMA/CA) protocol. In this CSMA/CA model, a user with a packet ready to transmit initially sends some pulse signals with random intervals within a collision avoidance period before transmitting the packet to verify a clear channel. The system model consists of a finite number of users to efficiently share a common channel. Each user can be a source of both voice traffic and data traffic. The time axis is slotted, and a frame has a large number of slots and includes two parts: the collision avoidance period and the packet transmission period. A discrete-time Markov process is used to model the system operation. The number of slots in a frame can be arbitrary, dependent on the chosen lengths of the collision avoidance period and packet transmission period. Numerical results are shown in terms of channel utilization and average packet delay for different packet generation rates. They indicate that the network performance can be improved by adequate choice of ratios between the collision avoidance period and transmission period, and the pulse transmission probability.

  • An Improved Voice Activity Detection Algorithm Employing Speech Enhancement Preprocessing

    Yoon-Chang LEE  Sang-Sik AHN  

     
    PAPER

      Vol:
    E84-A No:6
      Page(s):
    1401-1405

    In this paper, we first propose a new speech enhancement preprocessing algorithm by combining power subtraction method and maximal ratio combining technique, then apply it to both energy-based and statistical model-based VAD algorithm to improve the performance even in low SNR conditions. We also perform extensive computer simulations to demonstrate the performance improvement of the proposed VAD algorithm employing the proposed speech enhancement preprocessing algorithm under various background noise environments.

  • Intuitive Sound Design Using Vocal Mimicking

    Sanae H. WAKE  Toshiyuki ASAHI  

     
    LETTER-Man-Machine Systems, Multimedia Processing

      Vol:
    E84-D No:6
      Page(s):
    749-750

    Our aim is to develop an intuitive sound designing interface for non-expert users. We propose editing sound by sound, which means using vocal mimicking as a "master" to transform the pitch and amplitude envelope. Our technique allows any user to easily and intuitively design sound because it requires no knowledge of acoustic features.

  • QoS and Capacity Comparison of CDMA ALOHA Protocols in Multimedia Networks

    Abbas SANDOUK  Takaya YAMAZATO  Masaaki KATAYAMA  Akira OGAWA  

     
    LETTER

      Vol:
    E84-B No:6
      Page(s):
    1588-1595

    In this letter, a performance evaluation of a system that combines Code-Division Multiple Access (CDMA) with the ALOHA protocol in multimedia networks is presented. In our analysis, we compare the performance of the two basic ALOHA techniques, i.e., Slotted-ALOHA (S-ALOHA) and Unslotted-ALOHA (U-ALOHA), when combined with a CDMA scheme to support voice and data users operating in the same CDMA channel. The quality of service (QoS) required for voice and data media is fully taken into account. We obtain the throughput of data media and the outage probability for voice, considering both voice and data offered loads. The throughput of the S-ALOHA technique is almost twice that of U-ALOHA. However, we show in this letter that when CDMA is combined with the two basic ALOHA techniques for multimedia transmission, both techniques have almost the same performance. Thus, CDMA U-ALOHA can be a good candidate for multimedia networks.
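    The statement that S-ALOHA throughput is about twice that of U-ALOHA can be checked against the classical single-channel ALOHA formulas. The sketch below assumes Poisson offered load G and no CDMA spreading, so it reproduces only the textbook baseline and not the letter's CDMA analysis; the function names are chosen for this example.

```python
import math


def slotted_aloha_throughput(g):
    """Classical slotted ALOHA: S = G * exp(-G), vulnerable period of one slot."""
    return g * math.exp(-g)


def unslotted_aloha_throughput(g):
    """Classical unslotted (pure) ALOHA: S = G * exp(-2G), vulnerable period of two packet times."""
    return g * math.exp(-2.0 * g)


# Peak throughput occurs at G = 1 (slotted) and G = 0.5 (unslotted).
peak_slotted = slotted_aloha_throughput(1.0)
peak_unslotted = unslotted_aloha_throughput(0.5)
print(peak_slotted, peak_unslotted, peak_slotted / peak_unslotted)
```

    Under these assumptions the peak ratio is exactly 2, which is the baseline gap that, according to the letter, largely disappears once both techniques are combined with CDMA.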

  • Voice over IP Enabling Telephony and IP Network Convergence

    Tohru HOSHI  Koji TSUKADA  Kazuma YUMOTO  Keiko TANIGAWA  Yoshiyuki NAKAYAMA  

     
    PAPER

      Vol:
    E84-D No:5
      Page(s):
    548-559

    Voice over IP (VoIP) is a generic name for services, systems and technology for telephony over an IP network. It is also referred to as Internet telephony and IP (Internet Protocol) telephony. Internet telephone client software attracted attention when it first appeared in 1995. Since then, VoIP has rapidly matured into a practical technology, propelled by the popularization and rapid development of the Internet. IP network traffic already exceeds telephone network traffic and is expected to further increase several-fold in the next few years. In the future, the telephone network will be integrated into the IP network and telephony will become entirely VoIP. There are three expectations for VoIP. The first is inexpensive telephone service. The second expectation is for integrated telephony and IP network services such as a CTI (Computer Telephony Integration) system in which there is interworking with various Internet applications, such as e-mail and Web call-back for communication services of greater convenience rather than simple replacement of the telephone. The third expectation is for a platform for providing high-quality voice communication, multicast communication, and other such enhanced voice services that have a high degree of freedom. However, many problems remain to be overcome before the VoIP System is realized. The main problems are real-time transmission of voice that allows a smooth conversation, session control for providing a variety of services, and the proposal of new services. In this paper, we give an overview of VoIP and the problems that must be solved in order to realize it and propose some solutions regarding stream control and applications. We also describe session control and other topics that are being discussed in standardization forums.

  • A Joint Packet Reservation and Status Sensing Multiple Access for Voice/Data Integrated CDMA Networks

    In-Taek LIM  

     
    PAPER-Terrestrial Radio Communications

      Vol:
    E84-B No:4
      Page(s):
    975-983

    In this paper, a medium access control protocol is proposed for the integrated voice and data services in the local CDMA communication systems. Based on WB-TD/CDMA, uplink channels for the proposed protocol are composed of time slots with multiple spreading codes for each slot. During a talkspurt, a voice terminal transmits its entire packets over a reserved code. On the other hand, a data terminal transmits its packet after sensing the spreading code status. The base station broadcasts the status of spreading codes for each slot. In this protocol, voice packets never contend with data packets. The numerical results show that this protocol increases the system capacity for voice service by applying the reservation scheme. Despite the low access priority of data terminal, the data traffic performance can be increased in proportion to the number of spreading codes.

  • Erlang Capacity of Voice/Data DS-CDMA Systems with Prioritized Services

    Insoo KOO  Eunchan KIM  Kiseon KIM  

     
    PAPER

      Vol:
    E84-B No:4
      Page(s):
    716-726

    In this paper, we propose a Call Admission Control (CAC) scheme for the Direct Sequence-Code Division Multiple Access (DS-CDMA) systems supporting voice and data services and analyze the Erlang capacity under the proposed CAC scheme. Service groups are classified by Quality of Service (QoS) requirements such as the required Bit Error Rate (BER) and information bit rate, and Grade of Service (GoS) requirement such as required call blocking probability. Different traffics require different system resources based on their QoS requirements. In the proposed CAC scheme, some system resources are reserved exclusively for handoff calls to have high priority over new calls. Additionally, the queueing is allowed for both new and handoff data traffics that are not sensitive to delay. As a performance measure of the suggested CAC scheme, Erlang capacity is introduced. For the performance analysis, a four-dimensional Markov chain model is developed. As a numerical example, Erlang capacity of an IS-95B type system is depicted, and optimum values of system parameters such as the number of reservation channels and queue lengths are found. Finally, it is observed that Erlang capacity is improved more than 2 times by properly selecting the system parameters with the proposed CAC scheme. Also, the effect of handoff parameters on the Erlang capacity is observed.

  • Analysis of Erlang Capacity for Voice/Data DS-CDMA Systems with the Limited Number of Channel Elements

    Insoo KOO  Jeongrok YANG  Kiseon KIM  

     
    PAPER

      Vol:
    E84-B No:3
      Page(s):
    527-538

    In this paper, we propose an analytical procedure for the Erlang capacity in the reverse link of the DS-CDMA systems supporting voice and data services with the limited number of channel elements. In particular, we focus on IS-95B type systems with sectorized directional antenna that support the medium data rates for data traffic by aggregating multiple codes in both directions, to and from the mobile device. For the performance analysis, a 6-dimensional Markov chain model is developed, and the Erlang capacity is depicted as a function of the offered traffic loads of voice and data. The call blocking is caused not only by the scarcity of channel elements that perform the baseband spread spectrum signal processing for the given channel in the base station, but also by insufficient available channels per sector. The effect of the different Grade of Service (GoS) requirements on the Erlang capacity is observed, and the optimum values of the system parameters such as channel elements are selected with respect to the Erlang capacity. Furthermore, we extend our approach to the multi-FA systems that support more than one CDMA carrier. It is observed that Erlang capacity for a high FA can be well estimated through the linear regression with the Erlang capacity results of low FAs.

  • An Access Control Protocol for a Heterogeneous Traffic with a Multi-Code CDMA Scheme

    Abbas SANDOUK  Takaya YAMAZATO  Masaaki KATAYAMA  Akira OGAWA  

     
    PAPER

      Vol:
    E83-A No:11
      Page(s):
    2085-2092

    In this paper, we discuss the access control in multimedia CDMA ALOHA protocol. We introduce a new algorithm for the access control based on Modified Channel Load Sensing Protocol (MCLSP) for a multi-code CDMA Slotted ALOHA system in which voice users and two different classes of data users, high bit rate and low bit rate, coexist. With our new algorithm, we show that the throughput of high bit rate data users, as well as the total throughput of the data medium, can be optimized and take a maximum value even at high values of offered loads. We also investigate the performance when voice activity detection (VAD) is considered in voice transmission.

  • Non-Collision Packet Reservation Multiple Access with Random Transmission to Idle Slots

    Mioko TADENUMA  Iwao SASASE  

     
    PAPER-Information Network

      Vol:
    E83-A No:10
      Page(s):
    1945-1954

    The non-collision packet reservation multiple access (NC-PRMA) protocol has been proposed for wireless voice communications. Although that protocol can avoid any collision by using control minislots, a terminal which generates its talkspurt in the current frame has to wait until the next frame to transmit an asking packet to obtain a reservation. Furthermore, under integrated voice and data traffic, the voice packet dropping probability of the conventional NC-PRMA becomes worse, because the number of slots that voice terminals can access is limited. In this paper, we propose the NC-PRMA with random transmission to idle slots. First, we evaluate the mean access delay and the voice packet dropping probability under only voice traffic by the theoretical analysis and the computer simulation. It is shown that the proposed scheme attains lower mean access delay than the conventional NC-PRMA. Next, we evaluate the data packet delay and the voice packet dropping probability under integrated voice and data traffic by the computer simulation. It is shown that the proposed scheme attains lower packet dropping probability than the PRMA and the conventional NC-PRMA.

  • Modeling CDPD Channel Holding Times

    Yi-Bing LIN  Phone LIN  Yu-Min CHUANG  

     
    PAPER-Wireless Communication Technology

      Vol:
    E83-B No:9
      Page(s):
    2051-2055

    Cellular Digital Packet Data (CDPD) provides wireless data communication services to mobile users by sharing unused RF channels with AMPS on a non-interfering basis. To prevent interference with voice activity, CDPD forces a channel stream to hop when a voice request is about to use the RF channel occupied by that channel stream. The number of forced hops is affected by the voice channel selection policy. We propose analytic models to investigate the CDPD channel holding time for the least-idle and random voice channel selection policies. Under various system parameters and voice channel selection policies, we provide guidelines to reduce the number of forced hops.
