1-2hit |
Freddy PERRAUD Christian VIARD-GAUDIN Emmanuel MORIN Pierre-Michel LALLICAN
This paper incorporates statistical language models into an on-line handwriting recognition system for devices with limited memory and computational resources. The objective is to minimize the error recognition rate by taking into account the sentence context to disambiguate poorly written texts. Probabilistic word n-grams have been first investigated, then to fight the curse of dimensionality problem induced by such an approach and to decrease significantly the size of the language model an extension to class-based n-grams has been achieved. In the latter case, the classes result either from a syntactic criterion or a contextual criteria. Finally, a composite model is proposed; it combines both previous kinds of classes and exhibits superior performances compared with the word n-grams model. We report on many experiments involving different European languages (English, French, and Italian), they are related either to language model evaluation based on the classical perplexity measurement on test text corpora but also on the evolution of the word error rate on test handwritten databases. These experiments show that the proposed approach significantly improves on state-of-the-art n-gram models, and that its integration into an on-line handwriting recognition system demonstrates a substantial performance improvement.
Patrick LE CALLET Christian VIARD-GAUDIN Stephane PECHARD Emilie CAILLAULT
This paper describes an objective measurement method designed to assess the perceived quality for digital videos. The proposed approach can be used either in the context of a reduced reference quality assessment or in the more challenging situation where no reference is available. In that way, it can be deployed in a QoS monitoring strategy in order to control the end-user perceived quality. The originality of the approach relies on the very limited computation resources which are involved, such a system could be integrated quite easily in a real time application. It uses a convolutional neural network (CNN) that allows a continuous time scoring of the video. Experiments conducted on different MPEG-2 videos, with bit rates ranging from 2 to 6 Mbits/s, show the effectiveness of the proposed approach. More specifically, a linear correlation criterion, between objective and subjective scoring, ranging from 0.90 up to 0.95 has been obtained on a set of typical TV videos in the case of a reduced reference assessment. Without any reference to the original video, the correlation criteria remains quite satisfying since it still lies between 0.85 and 0.90, which is quite high with respect to the difficulty of the task, and equivalent and more in some cases than the traditional PSNR, which is a full reference measurement.