Author Search Result

[Author] Pao-Chi CHANG(9hit)

  • A Low-Complexity Down-Mixing Structure on Quadraphonic Headsets for Surround Audio

    Tai-Ming CHANG  Yi-Ming SHIU  Pao-Chi CHANG  

     
    PAPER-Digital Signal Processing

      Vol:
    E96-A No:7
      Page(s):
    1526-1533

    This work presents a four-channel headset achieving a 5.1-channel-like hearing experience using a low-complexity head-related transfer function (HRTF) model and a simplified reverberator. The proposed down-mixing architecture enhances the sound localization capability of a headset using the HRTF and by simulating multiple sound reflections in a room using Moorer's reverberator. Since the HRTF has large memory and computation requirements, the common-acoustical-pole and zero (CAPZ) model can be used to reshape the lower-order HRTF model. From a power consumption viewpoint, the CAPZ model reduces computation complexity by approximately 40%. The subjective listening tests in this study show that the proposed four-channel headset performs much better than stereo headphones. In addition, the four-channel headset, which can be implemented with off-the-shelf components, preserves privacy at low cost.

  • Fast Gated Recurrent Network for Speech Synthesis

    Bima PRIHASTO  Tzu-Chiang TAI  Pao-Chi CHANG  Jia-Ching WANG  

     
    LETTER-Speech and Hearing

      Publicized:
    2022/06/10
      Vol:
    E105-D No:9
      Page(s):
    1634-1638

    The recurrent neural network (RNN) has been used in audio and speech processing, such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, the long computing time remains the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture, for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. Our MGU-based architecture is about twice as fast as the other MGU-based architectures, with equally good sound quality.
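
The abstract above builds on the minimal gated unit. As a rough illustration only, the following sketch shows one time step of a baseline MGU as it is commonly formulated (a single forget gate shared by the candidate state and the state update). It is not the authors' fast variant: the abstract does not say which equations drop the state history, so that modification is not reproduced here. The function and parameter names (`mgu_step`, `w_f`, `u_f`, and so on) are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used for the forget gate."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(row, vec):
    """Inner product of two equal-length lists of floats."""
    return sum(r * v for r, v in zip(row, vec))


def mgu_step(x, h_prev, w_f, u_f, b_f, w_h, u_h, b_h):
    """One step of a baseline MGU (assumed formulation, not the paper's variant).

    x      : input vector
    h_prev : previous hidden state
    w_*    : hidden-by-input weight matrices (lists of rows)
    u_*    : hidden-by-hidden weight matrices (lists of rows)
    b_*    : bias vectors
    """
    n = len(h_prev)
    # Forget gate: f_t = sigmoid(W_f x_t + U_f h_{t-1} + b_f)
    f = [sigmoid(dot(w_f[i], x) + dot(u_f[i], h_prev) + b_f[i]) for i in range(n)]
    # Gated history: f_t * h_{t-1} (element-wise)
    gated = [f[i] * h_prev[i] for i in range(n)]
    # Candidate state: tanh(W_h x_t + U_h (f_t * h_{t-1}) + b_h)
    cand = [math.tanh(dot(w_h[i], x) + dot(u_h[i], gated) + b_h[i]) for i in range(n)]
    # New state: h_t = (1 - f_t) * h_{t-1} + f_t * candidate
    return [(1.0 - f[i]) * h_prev[i] + f[i] * cand[i] for i in range(n)]
```

The previous state `h_prev` appears in both the gate and the candidate computation, which is presumably the kind of dependency a faster variant would aim to reduce.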

  • Analyzing and Absorbing Cross-Layer Header Overhead of Video Data from End-to-End Viewpoint

    Chu-Chuan LEE  Pao-Chi CHANG  

     
    PAPER-Multimedia Systems for Communications

      Vol:
    E88-B No:11
      Page(s):
    4360-4367

    For IP-based video applications over wireless networks, the multi-layer header overhead may significantly affect the estimation of the target video encoding bit rate and the effective throughput of the wireless network. Based on the existing header structure of video packets, this study addresses the header overhead problem from the end-to-end viewpoint. This paper first proposes a simple yet robust closed-form expression that can accurately and promptly determine the optimal video payload length at the video sender based on the current wireless channel condition. The contribution can effectively improve the WLAN throughput and enhance the error resilience of scalable video data simultaneously. This study further explores the impact of multi-layer header overhead on video coding and proposes a Dynamic Header Overhead Accommodation (DHOA) scheme, which is executed in the video compression layer, to dynamically adjust the available video encoding bits so that the header overhead is accommodated in advance. The contributions of this paper are robust for various IP implementations such as IPSec (IP Security) over different 802.11 standards. Analytical and simulation results verify the accuracy and effectiveness of the proposed closed-form expression and header accommodation method. Using DHOA, the bandwidth mismatch between the actual bandwidth demand of packetized video data and the available network bandwidth is no more than 1.1% regardless of the packet sizes used in this paper.
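
The trade-off behind an optimal payload length can be illustrated with a simple textbook-style model; this is an assumption for illustration and not the closed-form expression proposed in the paper. The sketch assumes independent bit errors with a fixed bit error rate, a fixed total header size per packet, and that a packet is useful only if every bit arrives intact. Longer payloads amortize the header better but are more likely to be corrupted. The names `efficiency` and `best_payload`, and the default search limit of 2304 bytes, are hypothetical choices.

```python
def efficiency(payload_bytes, header_bytes, ber):
    """Expected fraction of transmitted bits that are useful payload.

    Assumed model: bit errors are independent with probability `ber`,
    and a packet with any bit error is discarded.
    """
    total = payload_bytes + header_bytes
    p_packet_ok = (1.0 - ber) ** (8 * total)
    return payload_bytes / total * p_packet_ok


def best_payload(header_bytes, ber, max_payload=2304):
    """Exhaustively search for the payload length with the highest efficiency."""
    return max(
        range(1, max_payload + 1),
        key=lambda n: efficiency(n, header_bytes, ber),
    )


if __name__ == "__main__":
    # Hypothetical example: 60 bytes of stacked headers, two channel conditions.
    for ber in (1e-5, 1e-4):
        n = best_payload(60, ber)
        print(f"BER={ber:g}: best payload {n} bytes, "
              f"efficiency {efficiency(n, 60, ber):.3f}")
```

Under this model the best payload shrinks as the channel gets noisier, which is the behaviour a sender-side adaptation scheme would exploit.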

  • Real-Time Complexity Control for H.264 Video Encoding by Coding Gain Maximization

    Ming-Chen CHIEN  Pao-Chi CHANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E94-B No:7
      Page(s):
    2181-2184

    This research proposes a Coding-Gain-Based (CGB) complexity control method for real-time H.264 video encoding in complexity-constrained systems such as wireless handsets. By allocating more complexity to the encoding tools which have higher coding efficiency, the CGB method is able to maximize the overall coding efficiency of the encoder.

  • Adaptive Rate Control Mechanism in H.264/AVC for Scene Changes

    Jiunn-Tsair FANG  Zong-Yi CHEN  Chen-Cheng CHAN  Pao-Chi CHANG  

     
    PAPER-Image

      Vol:
    E97-A No:12
      Page(s):
    2625-2632

    Rate control, which is required to regulate the bitrate of video coding, is critical to time-sensitive video applications used over networks. However, the H.264/AVC standard does not respond to scene changes, and this causes the transmission quality to deteriorate when a scene change occurs. In this work, a scene change is detected by comparing the ratio of the sum of absolute differences (SAD) between two consecutive frames. When a scene change is detected, the proposed method, which is modified from the reference software of H.264/AVC, re-assigns a quantization parameter (QP) value to regulate the bitrate. Because inter-prediction works poorly for the scene-changed frame, the proposed method estimates its frame complexity based on the content, and further creates another Q-R model to assign the QP. The adaptive rate control mechanism presented in this study can quickly respond to the heavy bitrate increment caused by a change of scene. Simulation results show that the proposed method improves the average peak signal-to-noise ratio (PSNR) by approximately 1.1 dB, with a smaller buffer size compared with the performance of the reference software JM version 17.2.
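
The SAD-based detection step described above can be sketched as follows. The abstract does not give the exact ratio or threshold, so this sketch makes an assumption: it flags a frame when its SAD against the previous frame exceeds a multiple of the preceding inter-frame SAD. Frames are represented as flat lists of luma samples, and the names `sad`, `detect_scene_changes`, and `ratio_threshold` are hypothetical.

```python
def sad(frame_a, frame_b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))


def detect_scene_changes(frames, ratio_threshold=3.0):
    """Return indices of frames that appear to start a new scene.

    Assumed rule: frame i is flagged when SAD(i-1, i) is more than
    `ratio_threshold` times SAD(i-2, i-1). The first inter-frame SAD
    has nothing to be compared against, so frame 1 is never flagged.
    """
    changes = []
    prev_sad = None
    for i in range(1, len(frames)):
        cur_sad = sad(frames[i - 1], frames[i])
        # Clamp the reference SAD to 1 so static content does not
        # produce a zero denominator in the implied ratio.
        if prev_sad is not None and cur_sad > ratio_threshold * max(prev_sad, 1):
            changes.append(i)
        prev_sad = cur_sad
    return changes
```

In a rate controller, a flagged index would be the point at which a new QP is assigned instead of the one predicted from the previous frames.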

  • Adaptive Video Quality Control Based on Connection Status over ATM Networks

    Pao-Chi CHANG  Jong-Tzy WANG  Yu-Cheng LIN  

     
    PAPER-Communication Networks and Services

      Vol:
    E82-B No:9
      Page(s):
    1388-1396

    MPEG video coding is the most widely used video coding standard and usually generates variable bitrate (VBR) data streams. Although ATM can deliver VBR traffic, bursty traffic may still be dropped due to network congestion. The cell loss can be minimized by using an enforced rate control method. However, the quality of the reproduced video may be sacrificed because the available peak rate is insufficient. In this work, we propose an end-to-end quality adaptation mechanism for MPEG traffic over ATM. The adaptive quality control (AQC) scheme allocates a certain number of coding bits to each video frame based on the network condition and the type of the next frame. More bits may be allocated if the network condition, represented by the connection-level, is good or the next frame is a B-frame, which usually consumes fewer bits. A high connection-level allows a relatively large number of tagged cells, whose delivery is not guaranteed, for video frames with high peak rates. The connection-level adjustment unit at the encoder end adjusts the connection-level based on the network condition message from the quality monitoring unit at the decoder. The simulation results show that the AQC system can effectively utilize the channel bandwidth as well as maintain satisfactory video quality in various network conditions.

  • Selective Block-Wise Reordering Technique for Very Low Bit-Rate Wavelet Video Coding

    Ta-Te LU  Pao-Chi CHANG  

     
    PAPER-Image

      Vol:
    E87-A No:4
      Page(s):
    920-928

    In this paper, we present a novel energy compaction method, called the selective block-wise reordering, which is used with SPIHT (SBR-SPIHT) coding for low rate video coding to enhance the coding efficiency for motion-compensated residuals. In the proposed coding system, the motion estimation and motion compensation schemes of H.263 are used to reduce the temporal redundancy. The residuals are then wavelet transformed. The block-mapping reorganization utilizes the wavelet zerotree relationship that jointly presents the wavelet coefficients from the lowest subband to high frequency subbands at the same spatial location, and allocates each wavelet tree with all descendants to form a wavelet block. The selective multi-layer block-wise reordering technique is then applied to those wavelet blocks that have energy higher than a threshold to enhance the energy compaction by rearranging the significant pixels in a block to the upper left corner based on the magnitude of energy. An improved SPIHT coding is then applied to each wavelet block, whether reordered or not. The high energy compaction resulting from the block reordering can reduce the number of redundant bits in the sorting pass and improve the quantization efficiency in the refinement pass of SPIHT coding. Simulation results demonstrate that SBR-SPIHT outperforms H.263 by 1.28-0.69 dB on average for various video sequences at very low bit-rates, ranging from 48 to 10 kbps.

  • Adaptive Voice Smoothing with Optimal E-Model Method for VoIP Services

    Shyh-Fang HUANG  Pao-Chi CHANG  Eric Hsiao-kuang WU  

     
    PAPER-Multimedia Systems for Communications

      Vol:
    E89-B No:6
      Page(s):
    1862-1868

    VoIP, an emerging technology, offers high-quality real-time voice services over IP-based broadband networks; however, voice quality is easily degraded by IP network impairments such as delay, jitter, and packet loss, which has motivated new techniques to address these problems. Among those, a playout buffer at the receiving end can compensate for jitter effects by trading off delay against loss. Adaptive smoothing algorithms dynamically adjust the smoothing size by introducing a variable delay based on network parameters so as to avoid quality degradation. This paper introduces an efficient and feasible perceived-quality method for buffer optimization to achieve the best voice quality. This work formulates an online loss model which incorporates buffer sizes and applies the ITU-T E-model approach to optimize the delay-loss problem. Distinct from other optimal smoothers, the proposed optimal smoother can be applied to most codecs and has the lowest complexity. Since the adaptive smoothing scheme introduces variable playback delays, buffer re-synchronization between the capture and the playback becomes essential. This work also presents a buffer re-synchronization algorithm based on silence skipping to prevent an unacceptable increase in the buffer preloading delay and even buffer overflow. Simulation experiments validate that the proposed adaptive smoother achieves significant improvement in voice quality.

  • Error Robust H.263 Video Coding with Video Segment Regulation and Precise Error Tracking

    Tien-Hsu LEE  Pao-Chi CHANG  

     
    PAPER-Multimedia Systems

      Vol:
    E84-B No:2
      Page(s):
    317-324

    This paper presents an error-resilient H.263 video compression scheme for noisy channels. The start codes in the H.263 bit stream syntax, which inherently provide the resynchronization functionality for error handling, may cause significant error damage if they are incorrectly decoded. Therefore, we develop a video segment regulation algorithm at the decoder to efficiently identify and correct erroneous start codes and block addresses. In addition, the precise error tracking technique is used to further reduce the error propagation effects. After performing the video segment regulation, the decoder can report the exact addresses of detected corrupt blocks back to the encoder via a feedback channel. With these negative acknowledgments, the encoder can precisely calculate and trace the propagated errors by examining the backward motion dependency for each pixel in the current encoding frame. With this precise tracking strategy, the error propagation effects can be terminated completely by INTRA refreshing the affected blocks. Simulation results show that the proposed scheme yields significant video quality improvements over motion-compensated concealment, with PSNR gains of 4.1 to 6.2 dB at a bit rate of around 35 kbps in error-prone DECT environments. In particular, this scheme complies with the H.263 standard and has the advantages of low memory requirements and low computational complexity, which make it suitable for practical real-time implementation.