Ruxue GUO Pengxu JIANG Ruiyu LIANG Yue XIE Cairong ZOU
For a long time, the compensation effect of hearing aids has mainly been evaluated subjectively, and there have been few studies on objective evaluation. Furthermore, existing objective evaluation methods generally require a clean speech signal as a reference, which restricts their practicality in real-world environments. Therefore, this paper presents a non-intrusive speech quality evaluation method for hearing aids that combines the audiogram and weighted frequency information. The proposed model mainly includes an audiogram information extraction network, a frequency information extraction network, and a quality score mapping network. The audiogram is the input of the audiogram information extraction network, which helps the system capture information related to hearing loss. In addition, the low-frequency bands of speech contain loudness information, while the medium- and high-frequency components contribute to semantic comprehension. The information of the two frequency bands is input to the frequency information extraction network to obtain time-frequency information. Once the high-level features of the different frequency bands and the audiogram have been obtained, they are fused into two groups of tensors that distinguish the information of the different frequency bands, and these tensors are used as the input of the attention layer to calculate the corresponding weight distribution. Finally, a dense layer is employed to predict the speech quality score. The experimental results show that it is reasonable to combine the audiogram with the weighted information from the two frequency bands, and that this combination can effectively evaluate the speech quality of hearing aids.
Kenji KITA Hiroshi GOTOH Hiroyasu ISHIKAWA Hideyuki SHINONAGA
Power line communications (PLC) is a communication technology that uses a power line as a transmission medium. Previous studies have shown that connecting an AC adapter, such as a mobile phone charger, to the power line affects signal quality. Therefore, in this paper, the authors analyze the influence of chargers on inter-computer communications, using packet capture to evaluate communications quality. The analysis results indicate that a short interval in which no packets are detected occurs once in every half period of the power-line supply; this interval is named the communication forbidden time. To visualize the communication forbidden time and to evaluate the quality of inter-computer communications using PLC, the authors propose an instantaneous power-line frequency synchronized superimposed chart and its plotting algorithm. Further, for accurate analysis, the position of the communication forbidden time on the chart can be changed by altering the plotting position of the initial burst signal. The difference in the chart that occurs when the plotting start position changes is also discussed. The authors show analysis examples using the chart for test bed data that assumes an ideal environment, and show the effectiveness of the chart for analyzing PLC inter-computer communications.
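The idea of superimposing packets on the mains half period can be illustrated with a minimal sketch. It is not the authors' plotting algorithm: it assumes a fixed nominal mains frequency rather than the instantaneous frequency used in the paper, and the names `fold_timestamps`, `empty_bins`, `mains_hz`, and `bins` are hypothetical.

```python
from collections import Counter


def fold_timestamps(timestamps, mains_hz=60.0, bins=20):
    """Fold packet timestamps (in seconds) onto one half period of the mains cycle.

    Assumption: the mains frequency is constant at `mains_hz`. The chart in the
    abstract is synchronized to the instantaneous frequency instead.
    Returns a list with the number of packets that fall into each phase bin.
    """
    half_period = 1.0 / (2.0 * mains_hz)
    counts = Counter()
    for t in timestamps:
        phase = (t % half_period) / half_period  # 0.0 <= phase < 1.0
        counts[min(int(phase * bins), bins - 1)] += 1
    return [counts.get(b, 0) for b in range(bins)]


def empty_bins(histogram):
    """Indices of phase bins with no packets: candidates for a forbidden time."""
    return [i for i, count in enumerate(histogram) if count == 0]
```

Phase bins that stay empty across many superimposed half periods would correspond to the communication forbidden time; shifting the origin of the fold changes where that gap appears on the chart.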
Song LIANG Leida LI Bo HU Jianying ZHANG
This letter presents an objective quality index for benchmarking image inpainting algorithms. Under the guidance of the masks of damaged areas, the boundary region and the inpainting region are first located. Then, statistical features are extracted from the boundary and inpainting regions respectively. For the boundary region, we use a Weibull distribution to fit the gradient magnitude histograms of the exterior and interior regions around the boundary, and the Kullback-Leibler Divergence (KLD) is calculated to measure the boundary distortions caused by imperfect inpainting. Meanwhile, the quality of the inpainting region is measured by comparing the naturalness factors between the inpainted image and the reference image. Experimental results demonstrate that the proposed metric outperforms the relevant state-of-the-art quality metrics.
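The boundary measure compares two fitted Weibull distributions with the KLD. As a sketch of that single step, the divergence between two Weibull densities has a closed form. The code below assumes the shape and scale parameters have already been fitted to the exterior and interior gradient magnitude histograms (the fitting itself is omitted); the function name and argument names are hypothetical and not taken from the letter.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def weibull_kld(k1, lam1, k2, lam2):
    """KL divergence D(W1 || W2) between two Weibull distributions.

    k1, k2 are shape parameters and lam1, lam2 are scale parameters,
    all strictly positive. The result is in nats.
    """
    return (
        math.log(k1 / lam1 ** k1)
        - math.log(k2 / lam2 ** k2)
        + (k1 - k2) * (math.log(lam1) - EULER_GAMMA / k1)
        + (lam1 / lam2) ** k2 * math.gamma(k2 / k1 + 1.0)
        - 1.0
    )
```

The divergence is zero when both fits coincide and grows as the gradient statistics on the two sides of the boundary drift apart. Because it is asymmetric, a symmetrized sum of both directions is one possible design choice.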
Nii L. SOWAH Qingbo WU Fanman MENG Liangzhi TANG Yinan LIU Linfeng XU
In this paper, we improve upon the accuracy of existing tracklet generation methods by repairing tracklets based on their quality evaluation and detection propagation. Starting from object detections, we generate tracklets using three existing methods. Then we perform co-tracklet quality evaluation to score each tracklet and select the good tracklets based on their scores. A detection propagation method is designed to transfer the detections in the good tracklets to the bad ones so as to repair the bad tracklets. The tracklet quality evaluation in our method is implemented using intra-tracklet detection consistency and inter-tracklet detection completeness. Two propagation methods, global propagation and local propagation, are defined to achieve more accurate tracklet propagation. We demonstrate the effectiveness of the proposed method on the MOT 15 dataset.
Yu ZHOU Leida LI Ke GU Zhaolin LU Beijing CHEN Lu TANG
Depth-image-based-rendering (DIBR) is a popular technique for view synthesis. The rendering process mainly introduces artifacts around edges, which leads to degraded quality. This letter proposes a DIBR-synthesized image quality metric by measuring the Statistics of both Edge Intensity and Orientation (SEIO). The Canny operator is first used to detect edges. Then the gradient maps are calculated, based on which the intensity and orientation of the edge pixels are computed for both the reference and synthesized images. The distance between the two intensity histograms and that between the two orientation histograms are computed. Finally, the two distances are pooled to obtain the overall quality score. Experimental results demonstrate the advantages of the presented method.
Takeshi YAMADA Yuki KASUYA Yuki SHINOHARA Nobuhiko KITAWAKI
This paper describes non-reference objective quality evaluation for noise-reduced speech. First, a subjective test is conducted in accordance with ITU-T Rec. P.835 to obtain the speech quality, the noise quality, and the overall quality of noise-reduced speech. Based on the results, we then propose an overall quality estimation model. The unique point of the proposed model is that the estimation of the overall quality is done only using the previously estimated speech quality and noise quality, in contrast to conventional models, which utilize the acoustical features extracted. Finally, we propose a non-reference objective quality evaluation method using the proposed model. The results of an experiment with different noise reduction algorithms and noise types confirmed that the proposed method gives more accurate estimates of the overall quality compared with the method described in ITU-T Rec. P.563.
Noritsugu EGI Takanori HAYASHI Akira TAKAHASHI
We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was Advanced Audio Coding-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.
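As an illustration of how loss parameters could be derived from packet headers, the sketch below computes loss statistics from the sequence numbers of received packets. The abstract does not define its parameters precisely, so this assumes that "packet-loss frequency" counts loss events (runs of consecutive lost packets) and that the average burst length is the number of lost packets per loss event. Sequence-number wraparound is ignored, and the function name is hypothetical.

```python
def loss_statistics(received_seq):
    """Return (loss_events, lost_packets, average_burst_length).

    `received_seq` holds the sequence numbers of the packets that arrived.
    Assumptions: numbers increase by one per sent packet and do not wrap,
    and losses before the first or after the last received packet are invisible.
    """
    seq = sorted(set(received_seq))  # tolerate reordering and duplicates
    loss_events = 0
    lost_packets = 0
    for prev, cur in zip(seq, seq[1:]):
        gap = cur - prev - 1  # packets missing between two received ones
        if gap > 0:
            loss_events += 1
            lost_packets += gap
    average_burst = lost_packets / loss_events if loss_events else 0.0
    return loss_events, lost_packets, average_burst
```

For example, receiving packets 1, 2, 3, 6, 7, 9, 10 gives two loss events covering three lost packets, so the average burst length is 1.5.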
Kritsada SRIPHAEW Thanaruk THEERAMUNKONG
Assessment of discovered patterns is an important issue in the field of knowledge discovery. This paper presents an evaluation method that utilizes citation (reference) information to assess the quality of discovered document relations. With the concept of transitivity as direct/indirect citations, a series of evaluation criteria is introduced to define the validity of discovered relations. Two kinds of validity, called soft validity and hard validity, are proposed to express the quality of the discovered relations. For the purpose of impartial comparison, the expected validity is statistically estimated based on the generative probability of each relation pattern. The proposed evaluation is investigated using more than 10,000 documents obtained from a research publication database. With frequent itemset mining as a process to discover document relations, the proposed method was shown to be a powerful way to evaluate the relations in four aspects: soft/hard scoring, direct/indirect citation, relative quality over the expected value, and comparison to human judgment.
A new fast and reliable image objective quality evaluation technique is presented in this paper. The proposed method takes image structure into account and uses a low complexity homogeneity measure to evaluate the intensity uniformity of a local region based on high-pass operators. We experimented with monochrome images under different types of distortions. Experimental results indicate that the proposed method provides better consistency with the perceived image quality. It is suitable for real applications to control the processed image quality.
Kaoru NAKAZONO Yuji NAGASHIMA Akira ICHIKAWA
We report an encoding technique designed specifically for sign language video sequences, intended for sign telecommunication at low bitrates, such as that using mobile videophones. The technique is composed of three methods: gradient coding, precedence macroblock coding, and not-coded coding. These methods are based on the idea of distributing a certain number of bits to each macroblock according to the evaluated importance of each part of the picture. They were implemented on a computer, and encoded data of a short clip of sign language dialogue was evaluated by deaf subjects. As a result, the efficiency of the technique was confirmed.
Kazuhiro KONDO Kiyoshi NAKAGAWA
We proposed and evaluated a speech packet loss concealment method which predicts lost segments from speech included in packets either before, or both before and after, the lost packet. The lost segments are predicted recursively by using linear prediction both in the forward direction from the packet preceding the loss, and in the backward direction from the packet succeeding the lost segment. Predicted samples in each direction are smoothed by averaging using linear weights to obtain the final interpolated signal. The adjacent segments are also smoothed extensively to significantly reduce the speech quality discontinuity between the interpolated signal and the received speech signal. Subjective quality comparisons between the proposed method and the packet loss concealment algorithm described in the ITU standard G.711 Appendix I showed similar scores up to about 10% packet loss. However, the proposed method showed higher scores above this loss rate, with the Mean Opinion Score rating exceeding 2.4, even at an extremely high packet loss rate of 30%. Packet loss concealment of speech degraded with G.729 coding, and of speech mixed with babble noise, showed similar trends, with the proposed method showing higher quality at high loss rates. We plan to further improve the performance by using an adaptive LPC prediction order depending on the estimated pitch, and adaptive LPC bandwidth expansion depending on the number of consecutive repetitive predictions, among many other improvements. We also plan to investigate complexity reduction using gradient LPC coefficient updates, and processing delay reduction using adaptive forward/bidirectional prediction modes depending on the measured packet loss ratio.
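The bidirectional prediction and linear-weight averaging can be sketched as follows. This is a simplified illustration, not the authors' implementation: the predictor coefficients are assumed to be given (LPC analysis is omitted), the same coefficients are reused for the backward direction, and the extra smoothing of adjacent segments is left out. The names `extrapolate` and `conceal` are hypothetical.

```python
def extrapolate(history, coeffs, n):
    """Recursively predict `n` samples that follow `history`.

    coeffs[j] weights the sample j + 1 steps back, so each new sample is
    sum(coeffs[j] * x[t - 1 - j]). `history` needs at least len(coeffs) samples.
    """
    buf = list(history)
    for _ in range(n):
        buf.append(sum(c * buf[-1 - j] for j, c in enumerate(coeffs)))
    return buf[len(history):]


def conceal(before, after, coeffs, n):
    """Interpolate `n` lost samples between the packets `before` and `after`."""
    forward = extrapolate(before, coeffs, n)
    # Backward prediction: run the predictor over the time-reversed following
    # packet, then restore the time order of the predicted samples.
    backward = extrapolate(after[::-1], coeffs, n)[::-1]
    out = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # backward weight grows toward the next packet
        out.append((1.0 - w) * forward[i] + w * backward[i])
    return out
```

With linear weights, the interpolated signal follows the forward prediction near the preceding packet and the backward prediction near the succeeding one, which limits discontinuities at both edges of the gap.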
Hiroki MORIMURA Satoshi SHIGEMATSU Toshishige SHIMAMURA Koji FUJII Chikara YAMAGUCHI Hiroki SUTO Yukio OKAZAKI Katsuyuki MACHIDA Hakaru KYURAGI
This paper describes an adaptive fingerprint-sensing scheme for a user authentication system with a fingerprint sensor LSI to obtain high-quality fingerprint images suitable for identification. The scheme is based on novel evaluation indexes of fingerprint-image quality and adjustable analog-to-digital (A/D) conversion. The scheme dynamically adjusts the A/D conversion range of the fingerprint sensor LSI while evaluating the image quality during real-time fingerprint-sensing operation. The evaluation indexes pertain to the contrast and the ridgelines of a fingerprint image. The A/D conversion range is adjusted by changing the quantization resolution and offset. We developed a fingerprint sensor LSI and a user authentication system to evaluate the adaptive fingerprint-sensing scheme. The scheme obtained fingerprint images suitable for identification, and the system achieved accurate identification, with a false rejection rate (FRR) of 0.36% at a false acceptance rate (FAR) of 0.075%. This confirms that the scheme is very effective in achieving accurate identification.