Feng-Cheng CHANG Hsueh-Ming HANG
Content-based image search has long been considered a difficult task. Making correct conjectures about the user's intention (perception) based on the query images is a critical step in content-based search. The first key concept in this paper is how we find the user-preferred low-level image characteristics from the multiple positive samples provided by the user. The second key concept is how we generate a set of consistent "pseudo images" when the user does not provide a sufficient number of samples; the notion of image feature stability is introduced for this purpose. The third key concept is how we use negative images as a pruning criterion. In realizing the preceding concepts, an image search scheme is developed using weighted low-level image features. Finally, quantitative simulation results are used to show the effectiveness of these concepts.
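As an illustration of how feature weighting from positive samples and negative-sample pruning might fit together, here is a minimal Python sketch (not the paper's actual scheme; the inverse-variance stability measure and the `alpha` penalty are illustrative assumptions):

```python
import numpy as np

def feature_weights(positives):
    """Weight each low-level feature by its stability (inverse variance)
    across the user's positive sample images (rows of a feature matrix)."""
    var = positives.var(axis=0) + 1e-8          # avoid division by zero
    w = 1.0 / var                               # stable features get high weight
    return w / w.sum()

def rank(database, positives, negatives, alpha=1.0):
    """Score database images by weighted distance to the positive centroid,
    pruning (penalizing) images that lie close to any negative sample."""
    w = feature_weights(positives)
    centroid = positives.mean(axis=0)
    d_pos = np.sqrt(((database - centroid) ** 2 * w).sum(axis=1))
    d_neg = np.sqrt(((database[:, None, :] - negatives[None, :, :]) ** 2
                     * w).sum(axis=2)).min(axis=1)
    score = d_pos - alpha * d_neg               # small = close to positives, far from negatives
    return np.argsort(score)

# toy usage: 100 database images, 3 positive and 2 negative samples, 8-D features
rng = np.random.default_rng(0)
db, pos, neg = rng.random((100, 8)), rng.random((3, 8)), rng.random((2, 8))
print(rank(db, pos, neg)[:5])
```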
In this paper, we propose an entropy-based associative memory model and apply it to memory retrieval with an orthogonal learning model, comparing it with the conventional model based on minimizing a quadratic Lyapunov function. In the present approach, the updating dynamics are constructed on the basis of an entropy minimization strategy, which asymptotically reduces to the above-mentioned autocorrelation dynamics as a special case. Numerical results show that the proposed approach realizes twice the memory capacity of autocorrelation-based dynamics such as the associatron.
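For reference, the autocorrelation dynamics that the entropy-based model reduces to as a special case can be sketched as a standard associatron-style recall loop (the entropy-minimization update itself is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 100, 10                                  # neurons, stored patterns
patterns = rng.choice([-1, 1], size=(P, N))

# autocorrelation (Hebbian outer-product) learning with zero diagonal
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

def recall(x, steps=20):
    """Synchronous autocorrelation dynamics: x <- sign(W x)."""
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1
    return x

# flip 10% of one stored pattern and try to retrieve it
probe = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
probe[flip] *= -1
print(np.mean(recall(probe) == patterns[0]))    # fraction recovered
```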
Baoliu YE Minyi GUO Jingyang ZHOU Daoxu CHEN
A fundamental problem in a pure Peer-to-Peer (P2P) file sharing system is how to protect the anonymity of peer nodes while providing efficient data access services. Most existing work focuses on providing initiator anonymity but neglects the anonymity of the responder. In this paper, we propose a multicast-based protocol, called Mapper, for efficient file sharing with mutual anonymity. By seamlessly combining multi-proxy and IP multicast techniques, the proposed protocol guarantees mutual anonymity during the entire session of file retrieval. Furthermore, Mapper replicates requested files inside the multicast group, so file distribution can be adjusted adaptively and the cost of multicast can be further reduced. Results of both simulations and theoretical analyses demonstrate that Mapper possesses the merits of scalability, reliability, and high adaptability.
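Mapper's protocol details are beyond the scope of an abstract, but the IP multicast primitive it builds on can be sketched with standard sockets (the group address, port, and message format below are placeholders, not Mapper's):

```python
import socket
import struct

GROUP, PORT = "224.0.18.7", 5007   # hypothetical multicast group and port

# group member: join the multicast group and wait for requests
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# requester (normally another host or proxy): one send reaches the whole group
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
tx.sendto(b"GET file.dat", (GROUP, PORT))

data, addr = rx.recvfrom(1024)     # every group member sees the request
print(data, addr)
```

Because a multicast request is delivered to all members at once, no single responder is identified to the requester, which is the property Mapper exploits for responder anonymity.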
Phonetic string search of written text is an important topic in information retrieval. While most previous methods convert a string into intermediate codes using phonetic transformation rules, this paper proposes a novel algorithm that segments two phonetic strings into syllables and finds the optimal pairing of the corresponding syllables to calculate their similarity score. Experiments show that this method is effective and flexible: it can be easily adapted to different datasets and achieves the best performance on average.
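A minimal sketch of the pairing step: given two already-segmented syllable sequences, a dynamic program finds the optimal monotone pairing (the character-overlap similarity below is a toy stand-in for the paper's scoring):

```python
from functools import lru_cache

def syllable_sim(a, b):
    """Toy syllable similarity: proportion of shared characters (hypothetical;
    the paper's scoring of paired syllables is more elaborate)."""
    if not a or not b:
        return 0.0
    return 2 * len(set(a) & set(b)) / (len(set(a)) + len(set(b)))

def best_pairing(syls_a, syls_b):
    """Find the optimal monotone pairing of two syllable sequences by
    dynamic programming (edit-distance-style alignment)."""
    @lru_cache(maxsize=None)
    def f(i, j):
        if i == len(syls_a) or j == len(syls_b):
            return 0.0
        return max(f(i + 1, j),                      # skip a syllable in A
                   f(i, j + 1),                      # skip a syllable in B
                   syllable_sim(syls_a[i], syls_b[j]) + f(i + 1, j + 1))
    return f(0, 0) / max(len(syls_a), len(syls_b))   # normalize to [0, 1]

# toy usage with pre-segmented phonetic syllables
print(best_pairing(("fi", "lip"), ("phi", "llip")))
```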
We propose a system that gathers hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements to our previous system in terms of the number of gathered images and their precision: (1) In an initial image gathering, we extract words that appear with high frequency from all HTML files in which the output images are embedded, and, using them as keywords, we carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower their precision becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
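Improvement (1) can be illustrated with a toy keyword extractor over the HTML files from the first gathering stage (the crude tag stripping, stopword list, and frequency cutoff are simplifying assumptions):

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "was"}

def second_stage_keywords(html_pages, top_k=5):
    """Collect high-frequency words from the HTML files that embedded the
    initially gathered images; these become keywords for a second gathering."""
    counts = Counter()
    for html in html_pages:
        text = re.sub(r"<[^>]+>", " ", html).lower()     # strip tags crudely
        words = re.findall(r"[a-z]{3,}", text)
        counts.update(w for w in words if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(top_k)]

pages = ["<html><body>Wild lion photo, lion safari</body></html>",
         "<html><body>Lion cubs in the savanna</body></html>"]
print(second_stage_keywords(pages))   # e.g. ['lion', 'wild', ...]
```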
Koichi KISE Shota FUKUSHIMA Keinosuke MATSUMOTO
Question answering (QA) is the task of retrieving an answer in response to a question by analyzing documents. Although most efforts in developing QA systems are devoted to electronic text, we consider it also necessary to develop systems for document images. In this paper, we propose a method of document image retrieval for such QA systems. Since the task is not to retrieve all relevant documents but to find the answer somewhere in the documents, retrieval should be precision-oriented. The main contribution of this paper is a method for improving the precision of document image retrieval by taking into account the co-occurrence of successive terms in a question. The indexing scheme is based on two-dimensional distributions of terms, and the weight of co-occurrence is measured by calculating the density distributions of terms. The proposed method was tested using 1,253 pages of documents about Major League Baseball and 20 questions, and was found to be superior to the baseline method previously proposed by the authors.
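A simplified sketch of the co-occurrence weighting idea: build a 2-D density map around each occurrence of a term on the page and score two successive question terms by the overlap of their densities (the Gaussian kernel and min-overlap measure are assumptions, not the paper's exact formulation):

```python
import numpy as np

def term_density(positions, shape=(64, 64), sigma=4.0):
    """2-D density map for one term from its occurrence coordinates on a page."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    d = np.zeros(shape)
    for (y, x) in positions:
        d += np.exp(-((ys - y) ** 2 + (xs - x) ** 2) / (2 * sigma ** 2))
    return d

def cooccurrence_score(pos_a, pos_b):
    """Weight the co-occurrence of two successive question terms by how much
    their spatial density distributions overlap on the page image."""
    da, db = term_density(pos_a), term_density(pos_b)
    return float(np.minimum(da, db).sum())

# 'world' and 'series' printed close together score higher than far apart
close = cooccurrence_score([(10, 10)], [(10, 14)])
far = cooccurrence_score([(10, 10)], [(50, 50)])
print(close > far)   # True
```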
Hongge LI Yoshihiro HAYAKAWA Koji NAKAJIMA
Self-connections can enlarge the memory capacity of an associative memory based on a neural network. However, they shrink the basin size of the embedded memory states. The basin-size problem is related to undesirable stable states that are spurious. If we can destabilize these spurious states, we expect to improve the basin size. The inverse function delayed (ID) model, which includes the Bonhoeffer-van der Pol (BVP) model, has negative resistance in its dynamics. The negative resistance of the ID model can destabilize the equilibrium states in certain regions of the conventional neural network. Therefore, an associative memory based on the ID model, with self-connections to enlarge the memory capacity, has the potential to improve the basin size of the network. In this paper, we examine the fundamental characteristics of an associative memory based on the ID model by numerical simulation and show its improved performance compared with the conventional neural network.
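The baseline phenomenon can be reproduced with a conventional autocorrelation network that keeps its self-connections; the sketch below estimates basin size by counting how many flipped bits the recall dynamics tolerate (the ID-model dynamics themselves are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 64, 8
patterns = rng.choice([-1, 1], size=(P, N))
W = (patterns.T @ patterns) / N          # keep the self-connections (diagonal)

def settles_to(x, target, steps=20):
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1
    return np.array_equal(x, target)

def basin_size(target, trials=50):
    """Estimate basin size: the largest number of flipped bits from which
    the dynamics still recover the stored pattern in most trials."""
    for flips in range(1, N // 2):
        ok = 0
        for _ in range(trials):
            probe = target.copy()
            idx = rng.choice(N, size=flips, replace=False)
            probe[idx] *= -1
            ok += settles_to(probe, target)
        if ok / trials < 0.9:
            return flips - 1
    return N // 2

print(basin_size(patterns[0]), "tolerable flipped bits")
```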
Hsi-Cheng CHANG Chiun-Chieh HSU
Data clustering is a technique for grouping similar data items together for convenient understanding. Conventional data clustering methods, including agglomerative hierarchical clustering and partitional clustering algorithms, frequently perform unsatisfactorily on large text collections, since their computational complexity grows rapidly with the number of data items. Poor clustering results degrade intelligent applications such as event tracking and information extraction. This paper presents an unsupervised document clustering method that identifies topic keyword clusters of a text corpus. The proposed method adopts a multi-stage process. First, an aggressive data cleaning approach is employed to reduce noise in the free text and to identify the topic keywords in the documents. All extracted keywords are then grouped into topic keyword clusters using the k-nearest-neighbor approach and a keyword clustering technique. Finally, all documents in the corpus are clustered based on the topic keyword clusters. The proposed method is assessed against conventional data clustering methods on a web news corpus. The experimental results show that the proposed method is an efficient and effective clustering approach.
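The keyword-grouping stage might be sketched as follows: each keyword links to its k nearest neighbors by cosine similarity of co-occurrence vectors, and connected components of sufficiently similar links become topic keyword clusters (a simplified reading of the paper's k-nearest-neighbor step):

```python
import numpy as np

def knn_keyword_clusters(vectors, k=3, threshold=0.5):
    """Group keywords whose co-occurrence vectors are close cosine neighbors:
    link each keyword to its k nearest neighbors, then take connected
    components of the sufficiently similar links as clusters."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = v @ v.T
    np.fill_diagonal(sim, -1.0)
    n = len(vectors)
    parent = list(range(n))
    def find(i):                                   # union-find with compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in np.argsort(sim[i])[-k:]:          # k nearest neighbors of i
            if sim[i, j] >= threshold:
                parent[find(i)] = find(int(j))     # merge into one cluster
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# toy keyword x document co-occurrence counts
X = np.array([[5, 0, 1], [4, 1, 0], [0, 6, 5], [1, 5, 6]], dtype=float)
print(knn_keyword_clusters(X, k=1))   # e.g. [[0, 1], [2, 3]]
```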
Machine transliteration is an automatic method for generating characters or words in one alphabetic system from the corresponding characters in another alphabetic system. It can play an important role in natural language applications such as information retrieval and machine translation, especially in handling proper nouns and technical terms. Previous work has focused on either grapheme-based or phoneme-based methods. However, transliteration is both an orthographic and a phonetic conversion process; therefore, both grapheme and phoneme information should be considered in machine transliteration. In this paper, we propose a grapheme- and phoneme-based transliteration model and compare it with previous grapheme-based and phoneme-based models using several machine learning techniques. Our method shows a performance improvement of about 13-78%.
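The core idea, letting a learner see spelling and pronunciation at once, can be sketched as a combined feature set (the bigram graphemes and phoneme symbols below are illustrative; the paper's models and learners are more elaborate):

```python
def features(word, phonemes):
    """Combine grapheme n-grams with phoneme symbols into one feature set,
    so a classifier sees both orthographic and phonetic evidence (a toy
    stand-in for the paper's grapheme-and-phoneme-based model)."""
    graphemes = {f"g:{word[i:i+2]}" for i in range(len(word) - 1)}
    phones = {f"p:{p}" for p in phonemes}
    return graphemes | phones

# 'data' with a hypothetical phoneme sequence
print(sorted(features("data", ["D", "EY", "T", "AH"])))
# ['g:at', 'g:da', 'g:ta', 'p:AH', 'p:D', 'p:EY', 'p:T']
```

Such a combined feature set can then be fed to any of the machine learning techniques the paper compares, in place of grapheme-only or phoneme-only features.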
Hamid LAGA Hiroki TAKAHASHI Masayuki NAKAJIMA
In this paper, we present a novel framework for analyzing and segmenting point-sampled 3D objects. Our algorithm computes a decomposition of a given point-set surface into meaningful components, which are delimited by line features and deep concavities. Central to our method is the extension of scale-space theory to three-dimensional space, which allows feature analysis and classification at different scales. A new surface classifier is then computed and used in an anisotropic diffusion process via partial differential equations (PDEs). The algorithm avoids misclassifications due to fuzzy and incomplete line features. It operates directly on points, requiring no vertex connectivity information. We demonstrate and discuss its performance on a collection of point-sampled 3D objects, including CAD and natural models. Applications include 3D shape matching and retrieval, surface reconstruction, and feature-preserving simplification.
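The multi-scale classification idea can be sketched on raw points: at each scale, the surface variation of the local covariance flags feature points (the scales, threshold, and variation measure are illustrative assumptions, not the paper's classifier or its diffusion step):

```python
import numpy as np

def classify_points(points, scales=(0.1, 0.2), thresh=0.1):
    """Multi-scale surface classifier for a point set: at each scale, the
    smallest eigenvalue ratio of the local covariance (surface variation)
    flags feature points (edges/concavities) versus smooth ones."""
    labels = np.zeros(len(points), dtype=bool)
    for r in scales:
        for i, p in enumerate(points):
            nbrs = points[np.linalg.norm(points - p, axis=1) < r]
            if len(nbrs) < 4:
                continue                          # too sparse at this scale
            ev = np.sort(np.linalg.eigvalsh(np.cov(nbrs.T)))
            variation = ev[0] / ev.sum()          # ~0 on flat regions
            labels[i] |= variation > thresh       # feature at some scale
    return labels

# a flat sheet with a sharp crease down the middle (z = |x|)
xs = np.linspace(-1, 1, 40)
pts = np.array([(x, y, abs(x)) for x in xs for y in xs])
print(classify_points(pts).sum(), "feature points near the crease")
```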
Semantic image segmentation and appropriate region content description are crucial issues for region-based image retrieval (RBIR). In this paper, a novel region-based image retrieval method is proposed that performs fast coarse image segmentation and fine region feature extraction using the decomposition property of the image wavelet transform. First, coarse image segmentation is conducted efficiently in the low-low (LL) frequency subband of the wavelet transform. Second, the feature vector of each segmented region is extracted hierarchically from all of the wavelet frequency subbands, finely capturing the distinctive features (e.g., semantic texture) within each region. Experimental results show the efficiency and effectiveness of the proposed method for region-based image retrieval.
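A minimal sketch of the two stages using PyWavelets: segment coarsely in the LL subband, then collect per-region statistics from every subband (the threshold segmentation and mean-magnitude features are toy stand-ins for the paper's method):

```python
import numpy as np
import pywt  # PyWavelets

def region_features(img, ll_threshold=None):
    """Coarse segmentation in the LL subband plus per-region features from
    all subbands of a single-level 2-D wavelet decomposition."""
    LL, (LH, HL, HH) = pywt.dwt2(img, "haar")
    t = LL.mean() if ll_threshold is None else ll_threshold
    mask = LL > t                                  # crude two-region segmentation
    feats = {}
    for region, m in (("bright", mask), ("dark", ~mask)):
        feats[region] = [float(np.abs(band[m]).mean())
                         for band in (LL, LH, HL, HH)]
    return mask, feats

img = np.zeros((64, 64)); img[16:48, 16:48] = 1.0   # toy image: bright square
mask, feats = region_features(img)
print(feats["bright"])   # one feature per subband for the bright region
```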
Harksoo KIM Choong-Nyoung SEON Jungyun SEO
Most commercial websites provide customers with menu-driven navigation and keyword search. However, these inconvenient interfaces increase the number of mouse clicks and decrease customers' interest in surfing the websites. To resolve this problem, we propose an information retrieval assistant using a natural language interface in online sales domains. The information retrieval assistant has a client-server structure consisting of a system connector and an NLP (natural language processing) server. The NLP server performs a linguistic analysis of users' queries with the help of coordinated NLP agents based on shallow NLP techniques. After receiving the results of the linguistic analysis from the NLP server, the system connector interacts with external information provision systems, such as conventional information retrieval systems and relational database management systems, according to the analysis results. Owing to the client-server structure, we can easily add other information provision systems to the information retrieval assistant with only trivial modifications of the NLP server. In addition, the information retrieval assistant guarantees fast responses because it uses shallow NLP techniques. In a preliminary experiment, we found that, compared with the menu-driven system, the information retrieval assistant could reduce bothersome tasks such as menu selection and mouse clicking because it provides a convenient natural language interface.
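The division of labor between the NLP server and the system connector can be sketched as follows (the slot patterns and the mocked SQL dispatch are hypothetical, meant only to show shallow analysis feeding an external system):

```python
import re

PRODUCTS = {"notebook", "camera"}       # hypothetical sales-domain vocabulary

def nlp_server(query):
    """Shallow linguistic analysis: fill product and price slots with simple
    patterns instead of full parsing."""
    q = query.lower()
    analysis = {"product": None, "max_price": None}
    for name in PRODUCTS:
        if name in q:
            analysis["product"] = name
    m = re.search(r"under \$?(\d+)", q)
    if m:
        analysis["max_price"] = int(m.group(1))
    return analysis

def system_connector(query):
    """Turn the analysis into a request to an external information provision
    system (mocked here as a relational database query)."""
    a = nlp_server(query)
    conds = []
    if a["product"]:
        conds.append(f"category = '{a['product']}'")
    if a["max_price"] is not None:
        conds.append(f"price <= {a['max_price']}")
    return "SELECT * FROM items" + (" WHERE " + " AND ".join(conds) if conds else "")

print(system_connector("show me notebooks under $1200"))
# SELECT * FROM items WHERE category = 'notebook' AND price <= 1200
```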
Masahiko MATSUSHITA Hiromitsu NISHIZAKI Takehito UTSURO Seiichi NAKAGAWA
This paper presents speech-driven Web retrieval models that accept spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving the speech recognition accuracy of spoken queries and thereby improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluated techniques for combining the outputs of multiple LVCSR models in recognizing spoken queries. As model combination techniques, we compared an SVM learning technique with conventional voting schemes such as ROVER. In addition, to investigate how the vocabulary size of the language model affects retrieval performance, we prepared two language models: one with a 20,000-word vocabulary and the other with a 60,000-word vocabulary. We then evaluated the differences in the recognition rates of the spoken queries and in the retrieval performance. We showed that combining multiple LVCSR models improves both speech recognition and retrieval accuracy in speech-driven text retrieval. Comparing the retrieval accuracies obtained with the 20,000- and 60,000-word language models in the LVCSR system, we found that the larger the vocabulary, the better the retrieval accuracy.
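A much-simplified ROVER-style vote over recognizer outputs looks like this (real ROVER aligns hypotheses with a dynamic-programming word transition network rather than by position):

```python
from collections import Counter

def rover_vote(hypotheses):
    """Simplified ROVER-style combination: align hypotheses by word position
    and take a majority vote at each position."""
    length = max(len(h) for h in hypotheses)
    result = []
    for i in range(length):
        votes = Counter(h[i] for h in hypotheses if i < len(h))
        result.append(votes.most_common(1)[0][0])
    return result

# three LVCSR outputs for the same spoken query
hyps = [["web", "retrieval", "task"],
        ["web", "retrieval", "tusk"],
        ["wed", "retrieval", "task"]]
print(rover_vote(hyps))   # ['web', 'retrieval', 'task']
```

The SVM-based combination the paper evaluates would instead learn, from features of the competing hypotheses, which recognizer to trust at each point.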
The growth of the Internet has resulted in an increasing need for personalized information systems. This paper describes an autonomous agent, the Web Robot Agent or WebBot, which integrates with the web and acts as a personal recommendation system that cooperates with the user to identify interesting pages. The Apriori algorithm extracts the characteristics of web pages in the form of semantically related association words and mines a bag of association words. Using hybrid components from collaborative filtering and content-based filtering, this hybrid recommendation system can overcome the shortcomings associated with traditional recommendation systems. In this paper, we present an improved recommendation system that uses user-preference mining through hybrid two-way filtering. The proposed method was tested on a database, and its effectiveness relative to existing methods was demonstrated in online experiments.
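The association-word mining step can be sketched with the first two Apriori passes over pages treated as word transactions (a minimum support of 2 and pair-level itemsets are illustrative choices):

```python
from itertools import combinations
from collections import Counter

def apriori_pairs(transactions, min_support=2):
    """First two Apriori passes over word 'transactions' (pages as word sets):
    frequent single words, then frequent word pairs built only from them."""
    item_counts = Counter(w for t in transactions for w in set(t))
    frequent = {w for w, c in item_counts.items() if c >= min_support}
    pair_counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t) & frequent), 2):
            pair_counts[pair] += 1
    return {p: c for p, c in pair_counts.items() if c >= min_support}

pages = [{"java", "code", "coffee"},
         {"java", "code", "compiler"},
         {"java", "coffee", "bean"}]
print(apriori_pairs(pages))   # {('code', 'java'): 2, ('coffee', 'java'): 2}
```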
Beom-Joon CHO Bong-Kee SIN Jin H. KIM
Traditional HMM methods, although highly successful in 1-D time-series analysis, have not yet been extended successfully to 2-D image analysis in a way that fully exploits the hierarchical design and extension of HMM networks for complex structured signals. In place of traditional off-line training with the Baum-Welch algorithm, we propose a new method for the real-time creation of word or composite-character HMMs for 2-D word/character patterns. Unlike Latin words, in which letters run left to right, the composition of word/character components need not be linear, as in Korean Hangul and Chinese characters. The key idea lies in character composition at the image level and image-to-model conversion followed by redundancy reduction. Although the resulting model is not optimal, the proposed method has great advantages in memory usage and training effort. In a series of character/word spotting experiments on document images, the system recorded hit ratios of 80% and 67% in Hangul character and word spotting, respectively, without language models.
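For the linear special case, composing a word model from letter models can be sketched by chaining component HMMs (the paper's image-level composition also handles the non-linear layouts of Hangul and Chinese characters, which this sketch does not):

```python
import numpy as np

def concat_hmms(hmm_list):
    """Build a composite word HMM from component letter HMMs by chaining
    left-to-right: each component's exit probability feeds the next one's
    first state (a linear special case of model composition)."""
    sizes = [a.shape[0] for a, _ in hmm_list]
    n = sum(sizes)
    A = np.zeros((n, n))
    B = np.vstack([b for _, b in hmm_list])     # emission rows, shared alphabet
    offset = 0
    for k, (a, _) in enumerate(hmm_list):
        m = a.shape[0]
        A[offset:offset + m, offset:offset + m] = a
        exit_p = 1.0 - a[m - 1].sum()           # mass left for leaving the model
        if k + 1 < len(hmm_list):
            A[offset + m - 1, offset + m] = exit_p
        offset += m
    return A, B

# two toy 2-state letter HMMs over a 3-symbol alphabet
a1 = np.array([[0.6, 0.4], [0.0, 0.7]])        # last-row sum < 1 allows exiting
b1 = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])
A, B = concat_hmms([(a1, b1), (a1, b1)])
print(A.shape, B.shape)    # (4, 4) (4, 3)
```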
Yasuhito ASANO Hiroshi IMAI Masashi TOYODA Masaru KITSUREGAWA
In this paper, we present Neighbor Community Finder (NCF, for short), a tool for finding Web communities related to given URLs. While existing link-based methods of finding communities, such as HITS, trawling, and Companion, use algorithms running on a Web graph whose vertices are pages and whose edges are links on the Web, NCF uses an algorithm running on an inter-site graph whose vertices are sites and whose edges are global-links (links between sites). Since the phrase "Web site" is used ambiguously in daily life and has no unique definition, NCF uses directory-based sites, proposed by the authors, as a model of Web sites. NCF receives URLs in which a user is interested and constructs an inter-site graph containing neighbor sites of the given URLs, using a method that identifies directory-based sites from URL and link data obtained from the actual Web on demand. Through computational experiments, we show that NCF achieves higher quality than Google's "Similar Pages" service in finding pages related to given URLs corresponding to various topics selected from the directories of Yahoo! Japan.
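A crude approximation of the pipeline, mapping page URLs to directory-based sites and collapsing page links into global-links, might look like this (the fixed directory depth is a placeholder for the authors' actual identification method):

```python
from urllib.parse import urlparse

def site_of(url, depth=1):
    """Approximate a directory-based site by host plus the first path
    directories (a stand-in for the authors' identification method)."""
    u = urlparse(url)
    dirs = [d for d in u.path.split("/") if d][:depth]
    return u.netloc + "/" + "/".join(dirs)

def inter_site_graph(page_links):
    """Collapse page-level links into global-links between distinct sites."""
    edges = set()
    for src, dst in page_links:
        s, d = site_of(src), site_of(dst)
        if s != d:
            edges.add((s, d))
    return edges

links = [("http://example.ac.jp/~lab1/index.html",
          "http://example.ac.jp/~lab1/pub.html"),
         ("http://example.ac.jp/~lab1/index.html",
          "http://other.org/news/today.html")]
print(inter_site_graph(links))  # intra-site link dropped, one global-link kept
```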
We implement an information retrieval (IR) system with query expansion on a low-cost, high-performance PC cluster environment. We study how query performance is affected by query expansion and by two declustering methods, using two standard Korean test collections. According to the experiments, the greedy method shows about a 20% enhancement overall compared with the lexical method.
Cheon Won CHOI Woo Cheol SHIN Jin Kyung PARK Jun HA Ho-Kyoung LEE
In provisioning packet data service on wireless cellular networks, schemes that alter the connection status between mobile and base stations have appeared, intended to utilize resources efficiently during idle periods. In such a scheme, connection components are sequentially released as an idle period persists, while the transmitting station switches to a transmission-activity mode when it is loaded with packets. However, the actual resumption of transmission activity is postponed by the connection retrieval time needed to restore the released connection components. In general, an idle period affects the following connection retrieval time, which in turn affects the forthcoming idle period. This chain reaction significantly influences overall packet delay performance. In this paper, as a way of improving packet delay performance, we propose two schemes, identified as the conservative extension and load threshold schemes. In the conservative extension scheme, we intentionally extend connection retrieval times so that each connection retrieval time is guaranteed not to fall below a certain value. In the load threshold scheme, the retrieval of released connection components is postponed until packets have accumulated at the transmitting station up to a prescribed threshold. In both schemes, an increase in the value or threshold incurs an additional stand-by before transmission activity resumes. In turn, such intentional stand-by may help regulate the lengths of the idle period and the connection retrieval time, and subsequently improve packet delay performance. To inspect the impact of the conservative extension and load threshold schemes on packet delay performance, we first investigate the properties of idle periods. Second, for Poisson packet arrivals, we present an analytical method to calculate exactly the moments of the packet delay time (at steady state) in each scheme. From numerical examples, we confirm the existence of a non-trivial optimal value and threshold minimizing the average packet delay or packet delay variation, and conclude that the conservative extension and load threshold schemes can enhance packet delay performance in various environments.
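The trade-off introduced by the load threshold scheme can be explored with a toy single-server simulation under Poisson arrivals (the parameter values and the fixed service time are illustrative, not the paper's analytical setting):

```python
import random

def avg_delay(lam=0.8, service=1.0, retrieval=2.0, threshold=1, n=20000):
    """Toy simulation of the load threshold scheme: once the queue empties,
    the connection is released; service resumes only after 'threshold'
    packets accumulate and the connection retrieval time is paid."""
    random.seed(0)
    t, arrivals = 0.0, []
    for _ in range(n):
        t += random.expovariate(lam)               # Poisson packet arrivals
        arrivals.append(t)
    free, active, total, i = 0.0, False, 0.0, 0
    while i < n:
        if active and arrivals[i] <= free:
            free += service                        # packet waiting: serve it
        else:                                      # queue empty: release connection
            j = min(i + threshold - 1, n - 1)      # wait until threshold is hit,
            free = arrivals[j] + retrieval + service   # then pay retrieval time
            active = True
        total += free - arrivals[i]                # departure minus arrival
        i += 1
    return total / n

for th in (1, 2, 4, 8):                            # probe for a good threshold
    print(th, round(avg_delay(threshold=th), 2))
```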
Xiang-Yan ZENG Yen-Wei CHEN Zensho NAKAO Jian CHENG Hanqing LU
Color histograms are effective for representing color visual features. However, the high dimensionality of the feature vectors results in high computational cost. Several transformations, including singular value decomposition (SVD) and principal component analysis (PCA), have been proposed to reduce the dimensionality. In PCA, dimensionality reduction is achieved by projecting the data onto a subspace that contains most of the variance. As is commonly observed, the PCA basis function with the lowest frequency accounts for the highest variance; therefore, the PCA subspace may not be the optimal one for representing the intrinsic features of the data. In this paper, we apply independent component analysis (ICA) to extract features from color histograms. PCA is applied to reduce the dimensionality, and ICA is then performed on the low-dimensional PCA subspace. The experimental results show that the proposed method (1) significantly reduces the feature dimensions compared with the original color histograms and (2) outperforms other dimension reduction techniques, namely those based on SVD of the quadratic matrix and on PCA, in terms of retrieval accuracy.
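The two-stage feature extraction maps directly onto scikit-learn (the random histogram data and the component counts below are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(3)
histograms = rng.random((500, 256))        # toy stand-in for color histograms

# Stage 1: PCA reduces dimensionality to a variance-preserving subspace
pca = PCA(n_components=16)
low_dim = pca.fit_transform(histograms)

# Stage 2: ICA on the PCA subspace extracts statistically independent features
ica = FastICA(n_components=16, max_iter=500, random_state=0)
features = ica.fit_transform(low_dim)

print(features.shape)                      # (500, 16)
```

Retrieval then compares images by distance between these 16-D ICA feature vectors instead of the original 256-bin histograms.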
Haruo YOKOTA Takashi KOBAYASHI Taichi MURAKI Satoshi NAOI
A combination of slides used in a presentation and a video recording of the circumstances of the presentation are quite useful for many applications, such as e-learning. However, to create new content from these with current authoring tools requires considerable effort for the author and the products have reduced flexibility. In this paper, we propose the preparation of a unifying function without creating new content manually. We also propose a new approach to search unified presentation manuscripts for slides matched with given keywords by considering the features peculiar to the presentation slides. We propose impression indicators to express how well a slide matches the given keywords. We also propose a system for retrieving a sequence of desired presentation slides from archives of the combined slides and video. We named the system Unified Presentation Slide Retrieval by Impression Search Engine or UPRISE. We describe the system configuration of UPRISE and the experimentation undertaken to evaluate the effect of the proposed indicators and to compare the results with those of the traditional tf.idf retrieval method.