The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] database(210hit)

41-60hit(210hit)

  • Probabilistic Frequent Itemset Mining on a GPU Cluster Open Access

    Yusuke KOZAWA  Toshiyuki AMAGASA  Hiroyuki KITAGAWA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    779-789

    Probabilistic frequent itemset mining, which discovers frequent itemsets from uncertain data, has attracted much attention due to inherent uncertainty in the real world. Many algorithms have been proposed to tackle this problem, but their performance is not satisfactory because handling uncertainty incurs high processing cost. To accelerate such computation, we utilize GPUs (Graphics Processing Units). Our previous work accelerated an existing algorithm with a single GPU. In this paper, we extend the work to employ multiple GPUs. Proposed methods minimize the amount of data that need to be communicated among GPUs, and achieve load balancing as well. Based on the methods, we also present algorithms on a GPU cluster. Experiments show that the single-node methods realize near-linear speedups, and the methods on a GPU cluster of eight nodes achieve up to a 7.1 times speedup.

  • Probabilistic Range Querying over Gaussian Objects Open Access

    Tingting DONG  Chuan XIAO  Yoshiharu ISHIKAWA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    694-704

    Probabilistic range query is an important type of query in the area of uncertain data management. A probabilistic range query returns all the data objects within a specific range from the query object with a probability no less than a given threshold. In this paper, we assume that each uncertain object stored in the database is associated with a multi-dimensional Gaussian distribution, which describes the probability distribution that the object appears in the multi-dimensional space. A query object is either a certain object or an uncertain object modeled by a Gaussian distribution. We propose several filtering techniques and an R-tree-based index to efficiently support probabilistic range queries over Gaussian objects. Extensive experiments on real data demonstrate the efficiency of our proposed approach.

  • Performance Improvement of Database Compression for OLTP Workloads

    Ki-Hoon LEE  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E97-D No:4
      Page(s):
    976-980

    As data volumes explode, data storage costs become a large fraction of total IT costs. We can reduce the costs substantially by using compression. However, it is generally known that database compression is not suitable for write-intensive workloads. In this paper, we provide a comprehensive solution to improve the performance of compressed databases for write-intensive OLTP workloads. We find that storing data too densely in compressed pages incurs many future page splits, which require exclusive locks. In order to avoid lock contention, we reduce page splits by sacrificing a couple of percent of space savings. We reserve enough space in each compressed page for future updates of records and prevent page merges that are prone to incur page splits in the near future. The experimental results using TPC-C benchmark and MySQL/InnoDB show that our method gives 1.5 times higher throughput with 33% space savings compared with the uncompressed counterpart and 1.8 times higher throughput with only 1% more space compared with the state-of-the-art compression method developed by Facebook.

  • White Space Communication Systems: An Overview of Regulation, Standardization and Trial Open Access

    Hiroshi HARADA  

     
    INVITED PAPER

      Vol:
    E97-B No:2
      Page(s):
    261-274

    This paper summarizes the current status of regulations, standardization efforts and trials around the world regarding white space (WS) communications, especially television band WS (TVWS). After defining WS communication systems configurations and function and the categories of white space database, the TVWS regulations in United States, United Kingdom, and Japan are summarized. Then regarding status of standardization for TVWS devices, IEEE 802 and IEEE 1900 standards are summarized. Finally ongoing pilot projects and trials of WS communications in the world are summarized, and trends and future direction of research on WS communication systems are summarized.

  • Accelerating Range Query Processing on R-Tree Using Graphics Processing Units

    Boseon YU  Hyunduk KIM  Wonik CHOI  Dongseop KWON  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:12
      Page(s):
    2776-2785

    Recently, various research efforts have been conducted to develop strategies for accelerating multi-dimensional query processing using the graphics processing units (GPUs). However, well-known multi-dimensional access methods such as the R-tree, B-tree, and their variants are hardly applicable to GPUs in practice, mainly due to the characteristics of a hierarchical index structure. More specifically, the hierarchical structure not only causes frequent transfers of small volumes of data but also provides limited opportunity to exploit the advanced data parallelism of GPUs. To address these problems, we propose an approach that uses GPUs as a buffer. The main idea is that object entries in recently visited leaf nodes are buffered in the global memory of GPUs and processed by massive parallel threads of the GPUs. Through extensive performance studies, we observed that the proposed approach achieved query performance up to five times higher than that of the original R-tree.

  • Transformation of a Relational Database to RDF/RDFS with ER2iDM

    Mi-Young CHOI  Chang-Joo MOON  Doo-Kwon BAIK  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:7
      Page(s):
    1478-1488

    The Semantic Web uses RDF/RDFS, which can enable a machine to understand web data without human interference. But most web data is not available in RDF/RDFS documents because most web data is still stored in databases. It is much more favorable to use stored data in a database to build the Semantic Web. This paper proposes an enhanced relational RDF/RDFS interoperable data model (ER2iDM) and a transformation procedure from relational data model (RDM) to RDF/RDFS based on ER2iDM. The ER2iDM is a data model that plays the role of an inter-mediator between RDM and RDF/RDFS during a transformation procedure. The data and schema information in the database are migrated to the ER2iDM according to the proposed translation procedures without incurring loss of meaning of the entities, relationships, and data. The RDF/RDFS generation tool makes a RDF/RDFS XML document automatically from the ER2iDM. The proposed ER2iDM and transformation procedure provides detailed guidelines for transformation from RDM to RDF/RDFS unlike existing studies; therefore, we can more efficiently build up the Semantic Web using database stored data.

  • Deciding Schema k-Secrecy for XML Databases

    Chittaphone PHONHARATH  Kenji HASHIMOTO  Hiroyuki SEKI  

     
    PAPER-Static Analysis

      Vol:
    E96-D No:6
      Page(s):
    1268-1277

    We study a static analysis problem on k-secrecy, which is a metric for the security against inference attacks on XML databases. Intuitively, k-secrecy means that the number of candidates of sensitive data of a given database instance or the result of unauthorized query cannot be narrowed down to k-1 by using available information such as authorized queries and their results. In this paper, we investigate the decidability of the schema k-secrecy problem defined as follows: for a given XML database schema, an authorized query and an unauthorized query, decide whether every database instance conforming to the given schema is k-secret. We first show that the schema k-secrecy problem is undecidable for any finite k>1 even when queries are represented by a simple subclass of linear deterministic top-down tree transducers (LDTT). We next show that the schema ∞-secrecy problem is decidable for queries represented by LDTT. We give an algorithm for deciding the schema ∞-secrecy problem and analyze its time complexity. We show the schema ∞-secrecy problem is EXPTIME-complete for LDTT. Moreover, we show similar results LDTT with regular look-ahead.

  • Flash-Aware Page Management Policy of a Navigation-Specialized Mobile DBMS for an Incremental Map Update

    KyoungWook MIN  JeongDan CHOI  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:5
      Page(s):
    1211-1214

    The performance of a mobile database management system (DBMS) in which most queries are made up of random data accesses if the NAND flash memory is used as storage media of the DBMS is degraded. The reason for this is that the performance of NAND flash memory is good for writing sequentially but poor when writing randomly. Thus, a new storage structure and querying policies are needed in mobile DBMS when flash memory is used as the storage media. In this letter, we propose a new policy of database page management to enhance the frequent random update performance, and then evaluate the performance experimentally.

  • ATTI: Workload-Aware Query Adaptive OcTree Based Trajectory Index

    Xiangxu MENG  Xiaodong WANG  Xinye LIN  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:3
      Page(s):
    643-654

    The GPS trajectory databases serve as bases for many intelligent applications that need to extract some trajectories for future processing or mining. When doing such tasks, spatio-temporal range queries based methods, which find all sub-trajectories within the given spatial extent and time interval, are commonly used. However, the history trajectory indexes of such methods suffer from two problems. First, temporal and spatial factors are not considered simutaneously, resulting in low performance when processing spatio-temporal queries. Second, the efficiency of indexes is sensitive to query size. The query performance changes dramatically as the query size changed. This paper proposes workload-aware Adaptive OcTree based Trajectory clustering Index (ATTI) aiming at optimizing trajectory storage and index performance. The contributions are three-folds. First, the distribution and time delay of the trajectory storage are introduced into the cost model of spatio-temporal range query; Second, the distribution of spatial division is dynamically adjusted based on GPS update workload; Third, the query workload adaptive mechanism is proposed based on virtual OcTree forest. A wide range of experiments are carried out over Microsoft GeoLife project dataset, and the results show that query delay of ATTI could be about 50% shorter than that of the nested index.

  • SASUM: A Sharing-Based Approach to Fast Approximate Subgraph Matching for Large Graphs

    Song-Hyon KIM  Inchul SONG  Kyong-Ha LEE  Yoon-Joon LEE  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E96-D No:3
      Page(s):
    624-633

    Subgraph matching is a fundamental operation for querying graph-structured data. Due to potential errors and noises in real-world graph data, exact subgraph matching is sometimes inappropriate in practice. In this paper we consider an approximate subgraph matching model that allows missing edges. Based on this model, approximate subgraph matching finds all occurrences of a given query graph in a database graph, allowing missing edges. A straightforward approach is to first generate query subgraphs of a given query graph by deleting edges and then perform exact subgraph matching for each query subgraph. In this paper we propose a sharing-based approach to approximate subgraph matching, called SASUM. Our method is based on the fact that query subgraphs are highly overlapped. Due to this overlapping nature of query subgraphs, the matches of a query subgraph can be computed from the matches of a smaller query subgraph, which results in reducing the number of query subgraphs that require expensive exact subgraph matching. Our method uses a lattice framework to identify sharing opportunities between query subgraphs. To further reduce the number of graphs that need exact subgraph matching, SASUM generates small base graphs that are shared by query subgraphs and chooses the minimum number of base graphs whose matches are used to derive the matching results of all query subgraphs. A comprehensive set of experiments shows that our approach outperforms the state-of-the-art approach by orders of magnitude in terms of query execution time.

  • Interoperable Spatial Information Model and Design Environment Based on ucR Technology

    Yukihiko SHIGESADA  Shinsuke KOBAYASHI  Noboru KOSHIZUKA  Ken SAKAMURA  

     
    PAPER-Information Network

      Vol:
    E96-D No:1
      Page(s):
    51-63

    Context awareness is one of the ultimate goals of ubiquitous computing, and spatial information plays an important role in building context awareness. In this paper, we propose a new interoperable spatial information model, which is based on ucode relation (ucR) and Place Identifier (PI), for realizing ubiquitous spatial infrastructure. In addition, we propose a design environment for spatial information database using our model. Our model is based on ucode and its relation. ucode is 128 bits number and the number itself has no meaning. Hence, it is difficult to manage the relation between ucodes without using a tool. Our design environment provides to describe connection between each ucode visually and is able to manipulate data using the target space map interactively. To evaluate the proposed model and environment, we designed three spaces using our tool. In addition, we developed a web application using our spatial model. From evaluation, we have been showed that our model is effective and our design environment is useful to develop our spatial information model.

  • Batch Logical Protocols for Efficient Multi-Party Computation

    Naoto KIRIBUCHI  Ryo KATO  Tsukasa ENDO  Takashi NISHIDE  Hiroshi YOSHIURA  

     
    PAPER-Cryptography and Information Security

      Vol:
    E95-A No:10
      Page(s):
    1718-1728

    It is becoming more and more important to make use of personal or classified information while keeping it confidential. A promising tool for meeting this challenge is secure multi-party computation (MPC). It enables multiple parties, each given a snippet of a secret s, to compute a function f(s) by communicating with each other without revealing s. However, one of the biggest problems with MPC is that it requires a vast amount of communication. Much research has gone into making each protocol (equality testing, interval testing, etc.) more efficient. In this work, we make a set of multiple protocols more efficient by transforming them into their equivalent batch processing form and propose two protocols: “Batch Logical OR” and “Batch Logical AND.” Using proposed protocols recursively, we also propose “Batch Logical OR-AND” and “Batch Logical AND-OR,” and show arbitrary formula consisting of Boolean protocols, OR gates, and AND gates can be batched. Existing logical OR and logical AND protocols consisting of t equality testing invocations have a communication complexity of O(t), where is the bit length of the secrets. Our batched versions of these protocols reduce it to O( + t). For t interval testing invocations, they reduce both communication and round complexity. Thus they can make the queries on a secret shared database more efficient. For example, the use of the proposed protocols reduces the communication complexity for a query consisting of equality testing and interval testing by approximately 70% compared to the use of the corresponding existing protocols. The concept of the proposed protocols is versatile and can be applied to logical formulae consisting of protocols other than equality testing and interval testing, thereby making them more efficient as well.

  • Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database

    Doo Hwa HONG  June Sig SUNG  Kyung Hwan OH  Nam Soo KIM  

     
    LETTER-Speech and Hearing

      Vol:
    E95-D No:9
      Page(s):
    2351-2354

    Decision tree-based clustering and parameter estimation are essential steps in the training part of an HMM-based speech synthesis system. These two steps are usually performed based on the maximum likelihood (ML) criterion. However, one of the drawbacks of the ML criterion is that it is sensitive to outliers which usually result in quality degradation of the synthesized speech. In this letter, we propose an approach to detect and remove outliers for HMM-based speech synthesis. Experimental results show that the proposed approach can improve the synthetic speech, particularly when the available training speech database is insufficient.

  • Securing Provenance of Distributed Processes in an Untrusted Environment

    Amril SYALIM  Takashi NISHIDE  Kouichi SAKURAI  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E95-D No:7
      Page(s):
    1894-1907

    Recently, there is much concern about the provenance of distributed processes, that is about the documentation of the origin and the processes to produce an object in a distributed system. The provenance has many applications in the forms of medical records, documentation of processes in the computer systems, recording the origin of data in the cloud, and also documentation of human-executed processes. The provenance of distributed processes can be modeled by a directed acyclic graph (DAG) where each node represents an entity, and an edge represents the origin and causal relationship between entities. Without sufficient security mechanisms, the provenance graph suffers from integrity and confidentiality problems, for example changes or deletions of the correct nodes, additions of fake nodes and edges, and unauthorized accesses to the sensitive nodes and edges. In this paper, we propose an integrity mechanism for provenance graph using the digital signature involving three parties: the process executors who are responsible in the nodes' creation, a provenance owner that records the nodes to the provenance store, and a trusted party that we call the Trusted Counter Server (TCS) that records the number of nodes stored by the provenance owner. We show that the mechanism can detect the integrity problem in the provenance graph, namely unauthorized and malicious “authorized” updates even if all the parties, except the TCS, collude to update the provenance. In this scheme, the TCS only needs a very minimal storage (linear with the number of the provenance owners). To protect the confidentiality and for an efficient access control administration, we propose a method to encrypt the provenance graph that allows access by paths and compartments in the provenance graph. We argue that encryption is important as a mechanism to protect the provenance data stored in an untrusted environment. We analyze the security of the integrity mechanism, and perform experiments to measure the performance of both mechanisms.

  • Decidability of the Security against Inference Attacks Using a Functional Dependency on XML Databases

    Kenji HASHIMOTO  Hiroto KAWAI  Yasunori ISHIHARA  Toru FUJIWARA  

     
    PAPER-Database Security

      Vol:
    E95-D No:5
      Page(s):
    1365-1374

    This paper discusses verification of the security against inference attacks on XML databases in the presence of a functional dependency. So far, we have provided the verification method for k-secrecy, which is a metric for the security against inference attacks on databases. Intuitively, k-secrecy means that the number of candidates of sensitive data (i.e., the result of unauthorized query) of a given database instance cannot be narrowed down to k-1 by using available information such as authorized queries and their results. In this paper, we consider a functional dependency on database instances as one of the available information. Functional dependencies help attackers to reduce the number of the candidates for the sensitive information. The verification method we have provided cannot be naively extended to the k-secrecy problem with a functional dependency. The method requires that the candidate set can be captured by a tree automaton, but the candidate set when a functional dependency is considered cannot be always captured by any tree automaton. We show that the ∞-secrecy problem in the presence of a functional dependency is decidable when a given unauthorized query is represented by a deterministic topdown tree transducer, without explicitly computing the candidate set.

  • AQBE – QBE Style Queries for Archetyped Data

    Shelly SACHDEVA  Daigo YAGINUMA  Wanming CHU  Subhash BHALLA  

     
    PAPER-Biological Engineering

      Vol:
    E95-D No:3
      Page(s):
    861-871

    Large-scale adoption of electronic healthcare applications requires semantic interoperability. The new proposals propose an advanced (multi-level) DBMS architecture for repository services for health records of patients. These also require query interfaces at multiple levels and at the level of semi-skilled users. In this regard, a high-level user interface for querying the new form of standardized Electronic Health Records system has been examined in this study. It proposes a step-by-step graphical query interface to allow semi-skilled users to write queries. Its aim is to decrease user effort and communication ambiguities, and increase user friendliness.

  • A Distributed Sensing and Caching Database for Cognitive Radio Systems

    Ha-Nguyen TRAN  Chen SUN  Yohannes D. ALEMSEGED  Hiroshi HARADA  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Vol:
    E95-B No:1
      Page(s):
    217-225

    This paper presents the efficiency of a sensing database and caching (SDB) for cognitive radio systems. The proposed SDB stores regulatory information from regulatory databases, and contains sensing information by distributed sensing schemes. Preliminary information processing for instance indexing, sorting, or applying some models or algorithms, etc. can be performed for the stored data. Available information and the results of the information processing are provided to cognitive radios in order to determine available spectrum and to facilitate dynamic spectrum access at lower sensing cost but higher sensing quality. The SDB is implemented in local networks, therefore information exchange between SDB and the cognitive radios can be realized at low latency and the amount of signaling traffic to global network can be reduced. This paper analyzes the effect of SDB and the performance evaluation was done in a certain condition. As a result, by deploying SDB a system can achieve up to 20% of reduction of sensing activities and maximum 1.3 times higher sensing quality.

  • Modeling Uncertainty in Moving Objects Databases

    Shayma ALKOBAISI  Wan D. BAE  Sada NARAYANAPPA  

     
    PAPER-Data Engineering, Web Information Systems

      Vol:
    E94-D No:12
      Page(s):
    2440-2459

    The increase in the advanced location based services such as traffic coordination and management necessitates the need for advanced models tracking the positions of Moving Objects (MOs) like vehicles. Due to computer processing limitations, it is impossible for MOs to continuously update their locations. This results in the uncertainty nature of a MO's location between any two reported positions. Efficiently managing and quantifying the uncertainty regions of MOs are needed in order to support different types of queries and to improve query response time. This challenging problem of modeling uncertainty regions associated with MO was recently addressed by researchers and resulted in models that ranged from linear which require few properties of MOs as input to the models, to non-linear that are able to more accurately represent uncertainty regions by considering higher degree input. This paper summarizes and discusses approaches in modeling uncertainty regions associated with MOs. It further illustrates the need for appropriate approximations especially in the case of non-linear models as the uncertainty regions become rather irregularly shaped and difficult to manage. Finally, we demonstrate through several experimental sets the advantage of non-linear models over linear models when the uncertainty regions of MOs are approximated by two different approximations; the Minimum Bounding Box (MBB) and the Tilted Minimum Bounding Box (TMBB).

  • ROCKET: A Robust Parallel Algorithm for Clustering Large-Scale Transaction Databases

    Woong-Kee LOH  Yang-Sae MOON  Heejune AHN  

     
    LETTER-Artificial Intelligence, Data Mining

      Vol:
    E94-D No:10
      Page(s):
    2048-2051

    We propose a robust and efficient algorithm called ROCKET for clustering large-scale transaction databases. ROCKET is a divisive hierarchical algorithm that makes the most of recent hardware architecture. ROCKET handles the cases with the small and the large number of similar transaction pairs separately and efficiently. Through experiments, we show that ROCKET achieves high-quality clustering with a dramatic performance improvement.

  • Phonetically Balanced Text Corpus Design Using a Similarity Measure for a Stereo Super-Wideband Speech Database

    Yoo Rhee OH  Yong Guk KIM  Mina KIM  Hong Kook KIM  Mi Suk LEE  Hyun Joo BAE  

     
    PAPER-Speech and Hearing

      Vol:
    E94-D No:7
      Page(s):
    1459-1466

    In this paper, we propose a text corpus design method for a Korean stereo super-wideband speech database. Since a small-sized text corpus for speech coding is generally required for speech coding, the corpus should be designed to comply with the pronunciation behavior of natural conversation in order to ensure efficient speech quality tests. To this end, the proposed design method utilizes a similarity measure between the phoneme distribution occurring from natural conversation and that from the designed text corpus. In order to achieve this goal, we first collect and refine text data from textbooks and websites. Next, a corpus is designed from the refined text data based on the similarity measure to compare phoneme distributions. We then construct a Korean stereo super-wideband speech (K-SW) database using the designed text corpus, where the recording environment is set to meet the conditions defined by ITU-T. Finally, the subjective quality of the K-SW database is evaluated using an ITU-T super-wideband codec in order to demonstrate that the K-SW database is useful for developing and evaluating super-wideband codecs.

41-60hit(210hit)