Dae-Kyu SHIN Hyun-Sool KIM Tae-Yun CHUNG Sang-Hui PARK
This paper proposes a new method of retrieving images from large image databases. The method is based on VQ (Vector Quantization) of local texture features at interest points automatically detected in an image. The texture features are extracted by Gabor wavelet filter bank, and rearranged for rotation. These features are classified by VQ and then construct a pattern histogram. Retrievals are performed by just comparing pattern histograms between images.
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes documents into groups based only on similarity measures. In this paper, we attempt to isolate more semantically coherent clusters by employing the domain-specific knowledge provided by a document analyst. By using external human knowledge to guide the clustering mechanism with some flexibility when creating the clusters, clustering efficiency can be considerably enhanced. Experimental results show that the use of only a little external knowledge can considerably enhance the quality of clustering results that satisfy users' constraint.
Hee-Won CHANG Eung-Kwan KANG Jong-Soo CHOI
In this paper, we propose a new content-based image retrieval method by using color and color spatial information of an image. To index images, we use the average coordinates of color distribution to obtain the spatial information of each segmented fuzzy region that is less sensitive to image rotation and object translations in an image. Furthermore, we also propose the alternative to solve the ripple phenomenon, which is occurred in the conventional fuzzy region segmentation algorithm and distorts the information of other regions.
In content-based image retrieval (CBIR), the content of an image can be expressed in terms of different features such as color, texture, shape, or text annotations. Retrieval methods based on these features can be varied depending on how the feature values are combined. Many of the existing approaches assume linear relationships between different features, and also require users to assign weights to features for themselves. Other nonlinear approaches have mostly concentrated on indexing technique. While the linearly combining approach establishes the basis of CBIR, the usefulness of such systems is limited due to the lack of the capability to represent high-level concepts using low-level features and human perception subjectivity. In this paper, we introduce a Neural Network-based Image Retrieval (NNIR) system, a human-computer interaction approach to CBIR using the Radial Basis Function (RBF) network. The proposed approach allows the user to select an initial query image and incrementally search target images via relevance feedback. The experimental results show that the proposed approach has the superior retrieval performance over the existing linearly combining approach, the rank-based method, and the BackPropagation-based method.
Our aim is to develop an intuitive and effective sound retrieval method for non-expert users. Such a retrieval method should be developed to accommodate a human's perceptual features. We therefore first conducted an experiment to clarify how people represent sound. A participant listens to one sound stimulus and then conveys the sound to a partner. The results indicated that people used mostly verbal description categorized in three groups: the sound itself, the sound's situation, and the sound's impression. Based on these results, we propose three types of keywords: onomatopoeia, sound source, and adjective, which are typical keywords of the above three groups of sound description, for sound retrieval. This retrieval method was implemented for a sound database. Our method can increase the varieties of sounds able to be retrieved and allow users to intuitively search sounds because users can retrieve sounds by using keywords that are most natural to them.
Kunihiko SADAKANE Hiroshi IMAI
When we search from a huge amount of documents, we often specify several keywords and use conjunctive queries to narrow the result of the search. Though the searched documents contain all keywords, positions of the keywords are usually not considered. As a result, the search result contains some meaningless documents. It is therefore effective to rank documents according to proximity of keywords in the documents. This ranking is regarded as a kind of text data mining. In this paper, we propose two algorithms for finding documents in which all given keywords appear in neighboring places. One is based on plane-sweep algorithm and the other is based on divide-and-conquer approach. Both algorithms run in O(n log n) time where n is the number of occurrences of given keywords. We run the algorithms on a large collection of html files and verify its effectiveness.
Liang CHEN Naoyuki TOKUDA Akira NAGAI
To improve the unstable performance of the traditional keyword-based search engine due to ambiguities of a natural language such as synonymy and /or polysemy, we have developed a new advanced DLSI (differential latent semantic index) space based probabilistic information retrieval system. The new method exploits a most likelihood posteriori function providing a measure of reliability in retrieving a document in the database having a closest match with another document of a query. Our simple experiment gives a supporting evidence for the validity of the theory, which is capable of capturing the intricate variability in word usage contributing to a more robust context contingent search engine.
Private information retrieval for k 1 databases (denoted by (k,l)-PIR for short) is a protocol that (1) a user sends an l tuple query to each of k noncommunicating replicated databases; (2) each database responds the user with an answer corresponding to the l tuple query; (3) the user privately retrieve any single bit out of the n bits of data stored in k databases. In this model, "privacy" implies that the user retrieves the bit he is interested in but releases to each database nothing about which bit he wishes to get. In general, the efficiency of (k,l)-PIR is measured by the total amount of bits exchanged between the user and the k databases, but few about its lower bounds are known except for restricted cases. In this paper, we classify (k,l)-PIR into a linear type, a multilinear type, and an affine type with respect to the relationship between queries to each database (made by the user) and answers to the user (made by each database), and show that (1) the lower bound for the communication complexity of any multilinear type (k,l)-PIR is Ω(n1/(l+1)) (Theorem 3.1); (2) the lower bound for the communication complexity of any linear type (k,l)-PIR is Ω(n) (Corollary 3.2); (3) the lower bound for the communication complexity of any affine type (k,l)-PIR is Ω(n1/(l+1)) (Theorem 4.2).
Zaher AGHBARI Kunihiko KANEKO Akifumi MAKINOUCHI
Recently, two approaches investigated indexing and retrieving videos. One approach utilized the visual features of individual objects, and the other approach exploited the spatio-temporal relationships between multiple objects. In this paper, we integrate both approaches into a new video model, called the Visual-Spatio-Temporal (VST) model to represent videos. The visual features are modeled in a topological approach and integrated with the spatio-temporal relationships. As a result, we defined rich sets of VST relationships which support and simplify the formulation of more semantical queries. An intuitive query interface which allows users to describe VST features of video objects by sketch and feature specification is presented. The conducted experiments prove the effectiveness of modeling and querying videos by the visual features of individual objects and the VST relationships between multiple objects.
Teruhito KANAZAWA Atsuhiro TAKASU Jun ADACHI
Semantic ambiguity is a serious problem in information retrieval. Query expansion has been proposed as one method of solving this problem. However, queries tend not to have much information for fitting query vectors to the latent semantics, which are difficult to express in a few query terms given by users. We propose a document vector modification method that modifies document vectors based on the relevance of documents. This method is expected to show better retrieval effectiveness than conventional methods. In this paper, we evaluate our method through retrieval experiments in which the relevance of documents extracted from scientific papers is assessed, and a comparison with tfidf is described.
Jiying ZHAO Rina HAYASAKA Ryoji MURANOI Masahito ITO Yutaka MATSUSHITA
In this paper, we define content-identifier (ContentID) to represent the characteristics of shot. The ContentID carries both positional and temporal color information. Based on the concept of ContentID, we propose a video retrieval method. The method is robust to compression, format conversion, frame dropping and noise such as watermark and so on. Furthermore, based on our retrieval method, we implemented a copyright protection system for digital video using spread-spectrum based watermarking technique.
Kazuhiro OTSUKA Tsutomu HORIKOSHI Haruhiko KOJIMA Satoshi SUZUKI
A novel method is proposed to retrieve image sequences with the goal of forecasting complex and time-varying natural patterns. To that end, we introduce a framework called Memory-Based Forecasting; it provides forecast information based on the temporal development of past retrieved sequences. This paper targets the radar echo patterns in weather radar images, and aims to realize an image retrieval method that supports weather forecasters in predicting local precipitation. To characterize the radar echo patterns, an appearance-based representation of the echo pattern, and its velocity field are employed. Temporal texture features are introduced to represent local pattern features including non-rigid complex motion. Furthermore, the temporal development of a sequence is represented as paths in eigenspaces of the image features, and a normalized distance between two sequences in the eigenspace is proposed as a dissimilarity measure that is used in retrieving similar sequences. Several experiments confirm the good performance of the proposed retrieval scheme, and indicate the predictability of the image sequence.
Efficient content-based retrieval of complex images is a challenging task since the detected object may appear in various scale, rotation and orientation with a wide variety of background colors and forms. In this paper, we propose a novel representation of objects with multiple colors, the spatial neighborhood-adjacency graph(SNAG), which can serve as a basis for detecting object by color contents from the candidate image. The SNAG consists of a set of main-vertices and two sets of edges. Each main-vertex represents a single color region of multi-colored object, and edges are divided into two classes: Neighborhood edges representing neighborhood relationship between two main-vertices with similar color, and adjacency edges representing adjacency relationship between a main-vertex and another vertex with different color. By investigating whether SNAG of object image is an isomorphic subgraph of SNAG of a candidate image, we can determine whether the similar object exists in the candidate image. In addition, we have also applied the proposed approach to a range of different object detection problems involving complex background, and effectiveness has been proved.
Fumihiro KUMENO Akihiko OHSUGA Shinichi HONIDEN
This paper describes the architecture to implement an application in network environments, which adapts to unexpected change in the development phase. In this architecture, an application is expressed as an agent which consists of two layers: base level and meta level. The base level program is an application program and the meta level program is the program that controls the execution of the base level and changes the base level program. Virtual places are also provided in the network. They are used for the release of programs and information which agents retrieve to change their own base level program. An application (or an agent), when a change is required, moves from places to places for the retrieval of programs to adapt to the change. A program search strategy is introduced to adapt to changes by using distributed thesauri of released programs, which realizes an agent's program retrieval method in network environments.
A content-based image retrieval scheme based on scale-space theory is proposed. Instead of considering all scales for image retrieval, the proposed algorithm utilizes a modified histogram intersection method to compute the relative scale between a query image and a candidate image. The proposed method has been applied to various images and the performance improvement has been verified.
To provide users with database-like query interfaces on HTML data, several systems have been developed to extract structures from HTML pages. Among them, tree-like structures and path expressions are the most popular modeling and navigating tools, respectively. Although path expressions are straightforward in representing top-down search patterns, they provide very limited help in representing bottom-up and in-breadth search patterns. In this paper, a lattice model is proposed to store Web data. The model provides an integrated mechanism to store text, linking information, HTML hierarchy, and sequence order of HTML data. By incorporating lattice operators with comprehension syntax, we show that the query language can represent top-down, bottom-up, and in-breadth searching patterns with uniform operators. It will be also shown that lattice comprehensions can represent all operators of path expressions, except Kleen closure.
Informally, private information retrieval for k 1 databases (k-PIR) is an interactive scheme that enables a user to make access to (separated) k replicated copies of a database and privately retrieve any single bit out of the n bits of data stored in the database. In this model, "privacy" implies that the user retrieves the bit he is interested in but releases to each database nothing about which bit he really tries to get. Chor et. al. proposed 2-PIR with communication complexity 12 n1/32 that is based on the covering codes. Then Ambainis recursively extended the scheme by Chor et. al. and showed that for each k 2, there exists k-PIR with communication complexity at most ckn1/(2k-1) some constant ck > 0. In this paper, we relax the condition for the covering codes and present time-efficient 2-PIR with communication complexity 12 n1/3. In addition, we generally formulate the recursive scheme by Ambainis and show that for each k 4, there exists k-PIR with communication complexity at most ck' n1/(2k-1) for some constant ck' << ck.
Masami SHISHIBORI Makoto OKADA Tooru SUMITOMO Jun-ichi AOE
In many applications, information retrieval is a very important research field. In several key strategies, the binary trie is famous as a fast access method able to retrieve keys in order. Especially, a Patricia trie gives the shallowest trie by eliminating all nodes which have only one arc, and it requires the smallest storage among the other trie structures. If trie structures are implemented, however, the greater the number of the registered keys, the larger storage is required. In order to solve this problem, Jonge et al. proposed a method to change the normal binary trie into a compact bit stream. This paper proposes the improved trie representation for the Patricia trie, as well as the methods for searching and inserting the key on it. The theoretical and experimental results, using 50,000 Japanese nouns and 50,000 English words, show that this method generates 25-39 percent shorter bit streams than the traditional method. This method, thus, enables us to provide more compact storage and faster access than the traditional method.
Takashi YUKAWA Kaname KASAHARA Kazumitsu MATSUZAWA
This paper proposes high-speed similitude retrieval schemes for a viewpoint-based similarity discrimination system (VB-SDS) and presents analytical and experimental performance evaluations. The VB-SDS, which contains a huge set of semantic definitions of commonly used words and computes semantic similarity between any two words under a certain viewpoint, promises to be a very important module in analogical and case-based reasoning systems that provide solutions under uncertainty. By computing and comparing similarities for all words contained in the system, the most similar word for a given word can be retrieved under a given viewpoint. However, the time this consumes makes the VB-SDS unsuitable for inference systems. The proposed schemes reduce search space based on the upper bound of a similarity calculation function to increase retrieval speed. An analytical evaluation shows the schemes can achieve a thousand-fold speedup and confirmed through experimental results for a VB-SDS containing about 40,000 words.
Teruyuki HASEGAWA Toru HASEGAWA Toshihiko KATO Kenji SUZUKI
Most of current real time video retrieval systems use video transfer protocols such that servers simply transmit video packets in the same rate as clients play them. If any packets are corrupted during transmission, they will be lost and cannot be recovered by retransmission. In video retrieval systems, however, teh video data are stored in servers and clients can prefetch them prior to playing. So, it might be possible for the video retrieval systems to make corrupted video packets retransmitted before the play-out dead line. But the application of existing reliable protocols causes problems such that, if a packet does not arrive before the dead line due to retransmission, the packets following it will not be delivered to the upper layer even if they have already arrived. In this paper, we discuss how to apply reliable protocols to real time video retrieval systems and propose an new real time video transfer protocol over ATM network, which provides the video data prefetch, the flow control for video buffer, the selective retransmission with skipping function for video packets late for the play-out dead line, and the resynchronization function for video buffer. We have implemented an experimental system using our protocol and evaluated the performance. The results of performance evaluation shows that the proposed protocol decreases the number of unplayed video data largely when transmission errors are inserted in an ATM network.