
Keyword Search Result

[Keyword] learning theory (17 hits)

1-17 of 17 hits
  • An Efficient Learning Algorithm for Regular Pattern Languages Using One Positive Example and a Linear Number of Membership Queries

    Satoshi MATSUMOTO  Tomoyuki UCHIDA  Takayoshi SHOUDAI  Yusuke SUZUKI  Tetsuhiro MIYAHARA  

     
    PAPER

      Publicized:
    2019/12/23
      Vol:
    E103-D No:3
      Page(s):
    526-539

    A regular pattern is a string consisting of constant symbols and distinct variable symbols. The language of a regular pattern is the set of all constant strings obtained by replacing all variable symbols in the regular pattern with non-empty strings. The present paper deals with the learning problem for languages of regular patterns within Angluin's query learning model, an established mathematical model of learning via queries in computational learning theory. In this model, the class of languages of regular patterns was known to be identifiable from one positive example using a polynomial number of membership queries. In the present paper, we show that the class of languages of regular patterns is identifiable from one positive example using a number of membership queries that is linear in the length of the positive example.
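
    To make the objects concrete, here is a minimal sketch (mine, not the paper's algorithm; the function name and the greedy strategy are assumptions) of a membership test for the language of a regular pattern, where every variable occurs exactly once and must be replaced by a non-empty string. Greedy leftmost placement of the constant segments is sound here precisely because the variables are distinct.

      def matches(pattern, text):
          # `pattern` is a list of ("const", s) runs and ("var", name) items;
          # split it into the constant segments that surround the variables.
          segs, cur, nvars = [], "", 0
          for kind, val in pattern:
              if kind == "const":
                  cur += val
              else:
                  segs.append(cur)
                  cur = ""
                  nvars += 1
          segs.append(cur)
          if nvars == 0:                      # no variables: exact match
              return text == segs[0]
          if not text.startswith(segs[0]):
              return False
          pos = len(segs[0])
          for seg in segs[1:-1]:              # each variable needs >= 1 character,
              i = text.find(seg, pos + 1)     # so search from pos + 1 onward
              if i < 0:
                  return False
              pos = i + len(seg)
          last = segs[-1]
          return len(text) >= pos + 1 + len(last) and text.endswith(last)

    For example, matches([("const", "a"), ("var", "x"), ("const", "b")], "acb") returns True, while "ab" is rejected because x cannot be replaced by the empty string.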

  • Polynomial Time Learnability of Graph Pattern Languages Defined by Cographs

    Takayoshi SHOUDAI  Yuta YOSHIMURA  Yusuke SUZUKI  Tomoyuki UCHIDA  Tetsuhiro MIYAHARA  

     
    PAPER

      Publicized:
    2017/12/19
      Vol:
    E101-D No:3
      Page(s):
    582-592

    A cograph (complement reducible graph) is a graph that can be generated from single-vertex graphs by disjoint union and complement operations. Cographs arise in many areas of computer science and have been studied extensively. With the goal of developing an effective data mining method for graph-structured data, in this paper we introduce a graph pattern expression, called a cograph pattern, which is a special type of cograph having structured variables. Firstly, we show that the problem of deciding whether or not a given cograph pattern g matches a given cograph G is NP-complete. In light of this result, we consider the polynomial time learnability of cograph pattern languages defined by cograph patterns whose variables are labeled with mutually distinct labels, called linear cograph patterns. Secondly, we present a polynomial time matching algorithm for linear cograph patterns. Next, we give a polynomial time algorithm for obtaining a minimally generalized linear cograph pattern which explains given positive data. Finally, we show that the class of linear cograph pattern languages is polynomial time inductively inferable from positive data.
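
    For readers unfamiliar with the class, the generative definition translates directly into code. The sketch below (illustrative only; the class name and adjacency-set representation are mine) builds cographs bottom-up with the two operations from the abstract.

      import itertools

      class Cograph:
          # A cograph built from single vertices by disjoint union and complement;
          # vertices are fresh integers, `adj` maps each vertex to its neighbour set.
          _ids = itertools.count()

          def __init__(self):
              v = next(Cograph._ids)
              self.adj = {v: set()}           # the single-vertex graph

          @staticmethod
          def union(g, h):
              # disjoint union: no edges between g and h
              r = Cograph.__new__(Cograph)
              r.adj = {**g.adj, **h.adj}
              return r

          @staticmethod
          def complement(g):
              # complement within g's own vertex set
              r = Cograph.__new__(Cograph)
              vs = set(g.adj)
              r.adj = {v: (vs - {v}) - nbrs for v, nbrs in g.adj.items()}
              return r

      a, b = Cograph(), Cograph()
      k2 = Cograph.complement(Cograph.union(a, b))   # an edge: complement of 2K1
      print(k2.adj)

    Since the complement of a disjoint union acts as a join, these two operations suffice; no sequence of them produces an induced path on four vertices, the forbidden subgraph characterizing cographs.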

  • Polynomial Time Inductive Inference of Languages of Ordered Term Tree Patterns with Height-Constrained Variables from Positive Data

    Takayoshi SHOUDAI  Kazuhide AIKOH  Yusuke SUZUKI  Satoshi MATSUMOTO  Tetsuhiro MIYAHARA  Tomoyuki UCHIDA  

     
    PAPER-Algorithms and Data Structures

      Vol:
    E100-A No:3
      Page(s):
    785-802

    An efficient means of learning tree-structural features from tree-structured data would enable us to construct effective mining methods for tree-structured data. Here, a pattern representing rich tree-structural features common to tree-structured data and a polynomial time algorithm for learning important tree patterns are necessary for mining knowledge from tree-structured data. As such a tree pattern, we introduce a term tree pattern t such that any edge label of t belongs to a finite alphabet Λ, any internal vertex of t has ordered children, and t has a new kind of structured variable, called a height-constrained variable. A height-constrained variable has a pair of integers (i, j) as constraints, and it can be replaced with a tree whose trunk length is at least i and whose height is at most j. This replacement is called a height-constrained replacement. A sequence of consecutive height-constrained variables is called a variable-chain. In this paper, we present polynomial time algorithms for solving the membership problem and the minimal language (MINL) problem for term tree patterns having no variable-chain. The membership problem for term tree patterns is to decide whether or not a given tree can be obtained from a given term tree pattern by applying height-constrained replacements to all height-constrained variables in the term tree pattern. The MINL problem for term tree patterns is to find a term tree pattern t such that the language generated by t is minimal among the languages generated by term tree patterns that contain all the given tree-structured data. Finally, we show that the class, i.e., the set of all term tree patterns having no variable-chain, is polynomial time inductively inferable from positive data if |Λ| ≥ 2.
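
    The (i, j) constraint is easy to state operationally. The following sketch (mine; it assumes the trunk is the root-to-attachment-leaf path, supplied here explicitly as a list of child indices, which is a simplification of the paper's definition) checks whether a candidate replacement tree satisfies a height-constrained variable.

      def height(tree):
          # height in edges of a tree given as (label, children)
          _, children = tree
          return 0 if not children else 1 + max(height(c) for c in children)

      def satisfies(tree, trunk_path, i, j):
          # trunk length >= i and height <= j, per the (i, j) constraint
          return len(trunk_path) >= i and height(tree) <= j

      # A root with one child that has one leaf child: trunk length 2, height 2.
      t = ("r", [("u", [("v", [])])])
      print(satisfies(t, trunk_path=[0, 0], i=2, j=3))   # -> True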

  • Statistical Learning Theory of Quasi-Regular Cases

    Koshi YAMADA  Sumio WATANABE  

     
    PAPER-General Fundamentals and Boundaries

      Vol:
    E95-A No:12
      Page(s):
    2479-2487

    Many learning machines, such as normal mixtures and layered neural networks, are not regular but singular statistical models, because the map from a parameter to a probability distribution is not one-to-one. The conventional statistical asymptotic theory cannot be applied to such learning machines because the likelihood function cannot be approximated by any normal distribution. Recently, a new statistical theory has been established based on algebraic geometry, and it was clarified that the generalization and training errors are determined by two birational invariants: the real log canonical threshold and the singular fluctuation. However, their concrete values are left unknown. In the present paper, we propose a new concept in statistical learning theory, the quasi-regular case. A quasi-regular case is not a regular case but a singular one; however, it has the same property as a regular case. In fact, we prove that, in a quasi-regular case, the two birational invariants are equal to each other, with the result that the symmetry between the generalization and training errors holds. Moreover, the concrete values of the two birational invariants are explicitly obtained, hence the quasi-regular case is useful for studying statistical learning theory.
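
    For context, the standard asymptotics of singular learning theory (my paraphrase of Watanabe's results for sample size n; the normalization is an assumption, not taken from this abstract) tie the expected Bayes generalization error B_g and training error B_t to the two invariants λ (real log canonical threshold) and ν (singular fluctuation):

      \mathbb{E}[B_g] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right),
      \qquad
      \mathbb{E}[B_t] = \frac{\lambda - 2\nu}{n} + o\!\left(\frac{1}{n}\right)

    Setting λ = ν, as in the quasi-regular case, gives E[B_t] ≈ -λ/n = -E[B_g], the symmetry mentioned above; regular models satisfy it with λ = ν = d/2 for d parameters.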

  • Learning of Elementary Formal Systems with Two Clauses Using Queries

    Hirotaka KATO  Satoshi MATSUMOTO  Tetsuhiro MIYAHARA  

     
    PAPER

      Vol:
    E92-D No:2
      Page(s):
    172-180

    An elementary formal system, EFS for short, is a kind of logic program over strings, and is regarded as a set of rules to generate a language. For an EFS Γ, the language L(Γ) denotes the set of all strings generated by Γ. We consider a new form of EFS, called a restricted two-clause EFS, and denote by rEFS the set of all restricted two-clause EFSs. We then study the learnability of rEFS in the exact learning model. The class rEFS contains the class of regular patterns, which is extensively studied in learning theory. Let Γ* be a target EFS in rEFS to be learned. In the exact learning model, an oracle for superset queries answers "yes" for an input EFS Γ in rEFS if L(Γ) is a superset of L(Γ*), and otherwise outputs a string in L(Γ*)-L(Γ). An oracle for membership queries answers "yes" for an input string w if w is included in L(Γ*), and answers "no" otherwise. We show that any EFS in rEFS is exactly identifiable in polynomial time using membership and superset queries. Moreover, for other types of queries, we show that there exists no polynomial time learning algorithm for rEFS using those queries. This result indicates the hardness of learning the class rEFS in the exact learning model in general.

  • A Polynomial Time Algorithm for Finding a Minimally Generalized Linear Interval Graph Pattern

    Hitoshi YAMASAKI  Takayoshi SHOUDAI  

     
    PAPER

      Vol:
    E92-D No:2
      Page(s):
    120-129

    A graph is an interval graph if and only if each vertex in the graph can be associated with an interval on the real line such that any two vertices are adjacent in the graph exactly when the corresponding intervals have a nonempty intersection. A number of interesting applications for interval graphs have been found in the literature. In order to find structural features common to structural data which can be represented by intervals, this paper proposes new interval graph structured patterns, called linear interval graph patterns, and a polynomial time algorithm for finding a minimally generalized linear interval graph pattern explaining a given finite set of interval graphs.
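
    A minimal sketch (mine) of the defining correspondence: build the interval graph of a set of closed intervals by connecting exactly the pairs whose intervals intersect.

      from itertools import combinations

      def interval_graph(intervals):
          # intervals: dict mapping vertex -> (lo, hi); adjacency iff intersection
          adj = {v: set() for v in intervals}
          for u, v in combinations(intervals, 2):
              (a, b), (c, d) = intervals[u], intervals[v]
              if a <= d and c <= b:           # the closed intervals overlap
                  adj[u].add(v)
                  adj[v].add(u)
          return adj

      # a-b and b-c overlap, a-c do not: the path a - b - c
      print(interval_graph({"a": (0, 2), "b": (1, 4), "c": (3, 5)}))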

  • Polynomial Time Inductive Inference of TTSP Graph Languages from Positive Data

    Ryoji TAKAMI  Yusuke SUZUKI  Tomoyuki UCHIDA  Takayoshi SHOUDAI  

     
    PAPER

      Vol:
    E92-D No:2
      Page(s):
    181-190

    Two-Terminal Series Parallel (TTSP, for short) graphs are used as data models in applications for electric networks and scheduling problems. We propose a TTSP term graph which is a TTSP graph having structured variables, that is, a graph pattern over a TTSP graph. Let TGTTSP be the set of all TTSP term graphs whose variable labels are mutually distinct. For a TTSP term graph g in TGTTSP, the TTSP graph language of g, denoted by L(g), is the set of all TTSP graphs obtained from g by substituting arbitrary TTSP graphs for all variables in g. Firstly, when a TTSP graph G and a TTSP term graph g are given as inputs, we present a polynomial time matching algorithm which decides whether or not L(g) contains G. The minimal language problem for the class LTTSP={L(g) | g ∈ TGTTSP} is, given a set S of TTSP graphs, to find a TTSP term graph g in TGTTSP such that L(g) is minimal among all TTSP graph languages which contain all TTSP graphs in S. Secondly, we give a polynomial time algorithm for solving the minimal language problem for LTTSP. Finally, we show that LTTSP is polynomial time inductively inferable from positive data.
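
    A minimal sketch (mine; the triple representation is an assumption) of the two composition rules that generate TTSP graphs, each graph kept as (edge list, source, sink):

      import itertools

      _fresh = itertools.count()

      def edge():
          # a single two-terminal edge
          u, v = next(_fresh), next(_fresh)
          return ([(u, v)], u, v)

      def series(g, h):
          # series composition: identify h's source with g's sink
          ge, gs, gt = g
          he, hs, ht = h
          ren = lambda x: gt if x == hs else x
          return (ge + [(ren(u), ren(v)) for u, v in he], gs, ht)

      def parallel(g, h):
          # parallel composition: identify the two sources and the two sinks
          ge, gs, gt = g
          he, hs, ht = h
          ren = lambda x: gs if x == hs else (gt if x == ht else x)
          return (ge + [(ren(u), ren(v)) for u, v in he], gs, gt)

      # two edges in series, in parallel with a third edge
      print(parallel(series(edge(), edge()), edge()))

    A TTSP term graph, in this picture, would additionally allow variable edges for which arbitrary TTSP graphs built this way can be substituted.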

  • Learning of Finite Unions of Tree Patterns with Internal Structured Variables from Queries

    Satoshi MATSUMOTO  Takayoshi SHOUDAI  Tomoyuki UCHIDA  Tetsuhiro MIYAHARA  Yusuke SUZUKI  

     
    PAPER-Algorithmic Learning Theory

      Vol:
    E91-D No:2
      Page(s):
    222-230

    A linear term tree is defined as an edge-labeled rooted tree pattern with ordered children and internal structured variables whose labels are mutually distinct. A variable can be replaced with arbitrary edge-labeled rooted ordered trees. We consider the polynomial time learnability of finite unions of linear term trees in the exact learning model formalized by Angluin. The language L(t) of a linear term tree t is the set of all trees obtained from t by substituting arbitrary edge-labeled rooted ordered trees for all variables in t. Moreover, for a finite set S of linear term trees, we define L(S)=∪t∈S L(t). A target of learning, denoted by T*, is a finite set of linear term trees, where the number of edge labels is infinite. In this paper, for any set T* of m linear term trees (m ≥ 0), we present a query learning algorithm which exactly identifies T* in polynomial time using at most 2mn² Restricted Subset queries and at most m+1 Equivalence queries, where n is the maximum size of counterexamples. Finally, we note that finite sets of linear term trees are not learnable in polynomial time using Restricted Equivalence, Membership and Subset queries.

  • Boundedness of Input Space and Effective Dimension of Feature Space in Kernel Methods

    Kazushi IKEDA  

     
    LETTER-Biocybernetics, Neurocomputing

      Vol:
    E87-D No:1
      Page(s):
    258-260

    Kernel methods such as support vector machines map input vectors into a high-dimensional feature space and linearly separate them there. The dimensionality of the feature space depends on the kernel function and is sometimes infinite; the Gauss kernel is such an example. We discuss the effective dimension of the feature space with the Gauss kernel and show that the kernel can be approximated by a sum of polynomial kernels and that its dimensionality is determined by the boundedness of the input space, by considering the Taylor expansion of the kernel Gram matrix.
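
    A numerical sketch of the idea (mine, not the letter's derivation): factor the Gauss kernel into two radial parts and a cross term exp(x·y/σ²), then Taylor-truncate the cross term, which is exactly a weighted sum of polynomial kernels (x·y)^k/k!. On a bounded input space the truncation error is small at a modest degree.

      import math
      import numpy as np

      def gauss_kernel(x, y, sigma=1.0):
          return float(np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2)))

      def truncated_gauss_kernel(x, y, sigma=1.0, degree=8):
          # exp(-|x-y|^2/2s^2) = exp(-|x|^2/2s^2) exp(-|y|^2/2s^2) exp(x.y/s^2);
          # replace the last factor by its Taylor polynomial, i.e. a weighted
          # sum of polynomial kernels (x.y)^k / k!
          s2 = sigma ** 2
          radial = math.exp(-(x @ x + y @ y) / (2 * s2))
          cross = sum((x @ y / s2) ** k / math.factorial(k)
                      for k in range(degree + 1))
          return radial * cross

      rng = np.random.default_rng(0)
      x, y = rng.uniform(-1, 1, 3), rng.uniform(-1, 1, 3)
      print(gauss_kernel(x, y), truncated_gauss_kernel(x, y))   # nearly equal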

  • Criteria for Inductive Inference with Mind Changes and Anomalies of Recursive Real-Valued Functions

    Eiju HIROWATARI  Kouichi HIRATA  Tetsuhiro MIYAHARA  Setsuo ARIKAWA  

     
    PAPER-Computational Learning Theory

      Vol:
    E86-D No:2
      Page(s):
    219-227

    This paper investigates the interaction of mind changes and anomalies in inductive inference of recursive real-valued functions. We show that the criteria for inductive inference of recursive real-valued functions obtained by bounding the number of mind changes and anomalies preserve the same hierarchy as that of recursive functions, if the length of each anomaly as an interval is bounded. However, we also show that, without bounding it, the hierarchy of some criteria collapses. More precisely, while the class of recursive real-valued functions inferable in the limit allowing no more than one anomaly is properly contained in the class allowing just two anomalies, the latter class coincides with the class allowing an arbitrary bounded number of anomalies.

  • On the Sample Complexity of Consistent Learning with One-Sided Error

    Eiji TAKIMOTO  Akira MARUOKA  

     
    PAPER-Computational Learning Theory

      Vol:
    E78-D No:5
      Page(s):
    518-525

    Although consistent learning is sufficient for PAC-learning, it is not known what strategy makes learning more efficient, especially with respect to the sample complexity, i.e., the number of examples required. As a first step towards this problem, classes that have consistent learning algorithms with one-sided error are considered. A combinatorial quantity called the maximal particle set is introduced, and an upper bound on the sample complexity of consistent learning with one-sided error is obtained in terms of maximal particle sets. For the class of n-dimensional axis-parallel rectangles, one of the classes that are consistently learnable with one-sided error, the cardinality of the maximal particle set is estimated and an O(d/ε + (1/ε)log(1/δ)) upper bound on the sample complexity of the learning algorithm for the class is obtained. This bound improves the bounds due to Blumer et al. and meets the lower bound within a constant factor.
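
    For comparison, the bounds involved (as commonly stated for VC dimension d, accuracy ε and confidence δ; constants omitted, and the reading of the garbled bound above as d/ε + (1/ε)log(1/δ) is my reconstruction) are:

      m_{\mathrm{Blumer\ et\ al.}}
          = O\!\left(\frac{d}{\varepsilon}\log\frac{1}{\varepsilon}
                     + \frac{1}{\varepsilon}\log\frac{1}{\delta}\right),
      \qquad
      m_{\mathrm{this\ paper}}
          = O\!\left(\frac{d}{\varepsilon}
                     + \frac{1}{\varepsilon}\log\frac{1}{\delta}\right)

    The second expression matches the general Ω(d/ε + (1/ε)log(1/δ)) lower bound up to a constant factor, which is the sense in which the bound is tight.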

  • Learning Logic Programs Using Definite Equality Theories as Background Knowledge

    Akihiro YAMAMOTO  

     
    PAPER-Computational Learning Theory

      Vol:
    E78-D No:5
      Page(s):
    539-544

    In this paper we investigate the learnability of relations in Inductive Logic Programming, using equality theories as background knowledge. We assume that a hypothesis and an observation are, respectively, a definite program and a set of ground literals. The targets of our learning algorithm are relations. By using equality theories as background knowledge we introduce a tree structure into definite programs. This structure enables us to narrow the search space of hypotheses. We give pairs of a hypothesis language and a knowledge language in order to discuss the learnability of relations from the viewpoint of inductive inference and PAC learning.

  • A Polynomial Time Learning Algorithm for Recognizable Series

    Hiroyuki OHNISHI  Hiroyuki SEKI  Tadao KASAMI  

     
    PAPER-Automata, Languages and Theory of Computing

      Vol:
    E77-D No:10
      Page(s):
    1077-1085

    A recognizable series is a model of a sequential machine. A recognizable series S is represented by a triple (λ, µ, γ), called a linear representation of S, where λ is a row vector of dimension n specifying the initial state, γ is a column vector of dimension n specifying the output at a state, and µ is a morphism from input words to n×n matrices specifying the state transition. The output for an input word w is defined as λ(µw)γ, called the coefficient of w in S, and written as (S, w). We present an algorithm which constructs a reduced linear representation of an unknown recognizable series S, with coefficients in a commutative field, using coefficient queries and equivalence queries. The answer to a coefficient query with a word w is the coefficient (S, w) of w in S. When one asks an equivalence query with a linear representation (λ, µ, γ), if (λ, µ, γ) is a linear representation of S, "yes" is returned; otherwise, a word c such that λ(µc)γ ≠ (S, c), together with the coefficient (S, c), is returned. Such a word c is called a counterexample for the query. For each execution step of the algorithm, the execution time consumed from the initial step to the current step is O(mN⁴M), where N is the dimension of a reduced linear representation of S, M is the maximum time consumed by a single fundamental operation (addition, subtraction, multiplication or division), and m is the maximum length of the counterexamples returned as answers to equivalence queries until that step.
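
    A small sketch (mine; the class name and example series are illustrative) of a linear representation: the coefficient of a word is the initial row vector times the product of the transition matrices times the output column vector.

      import numpy as np

      class LinearRepresentation:
          # (lambda, mu, gamma): row vector, symbol -> matrix morphism, column vector
          def __init__(self, lam, mu, gamma):
              self.lam, self.mu, self.gamma = np.asarray(lam), mu, np.asarray(gamma)

          def coefficient(self, word):
              # (S, w) = lambda * mu(w_1) * ... * mu(w_k) * gamma
              v = self.lam
              for a in word:
                  v = v @ self.mu[a]
              return v @ self.gamma

      # Example: a series whose coefficient counts occurrences of 'a' in a word.
      mu = {"a": np.array([[1, 1], [0, 1]]), "b": np.eye(2, dtype=int)}
      S = LinearRepresentation([1, 0], mu, [0, 1])
      print(S.coefficient("abab"))   # -> 2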

  • A Note on Inadequacy of the Model for Learning from Queries

    Ryuichi NAKANISHI  Hiroyuki SEKI  Tadao KASAMI  

     
    PAPER-Automata, Languages and Theory of Computing

      Vol:
    E77-D No:8
      Page(s):
    861-868

    "Learning correctly from queries" is a formal learning model proposed by Angluin. In this model, for a class Γ of language representations, a learner asks queries to a teacher of an unknown language Lq which can be represented by some Gq ∈ Γ, and eventually outputs a language representation G ∈ Γ which represents Lq and halts. An algorithm (learner) A is said to learn a class of languages represented by Γ in the weak definition if the time complexity of A is some polynomial of n and m, where n is the minimum size of the language representations in Γ which represent Lq, and m is the maximum length of the counterexamples returned in an execution. On the other hand, A is said to learn the class of languages represented by Γ in the strong definition if at any point τ of the execution, the time consumed up to τ is some polynomial of n and m, where n is the same as above, and m is the maximum length of the counterexamples returned up to τ. In this paper, the adequacy of the model is examined, and it is shown that both in the weak and strong definitions, there exist learners which extract a long counterexample and identify Lq by using equivalence queries exhaustively. For example, there exists a learner which learns the class CFL of context-free languages represented by the class CFG of context-free grammars in the weak definition using only equivalence queries. Next, two restrictions concerning learnability criteria are introduced. The proper termination condition is that when a teacher replies with "yes" to an equivalence query, the learner must halt immediately. The other condition, called the LBC-condition, is that in the weak/strong definition, the time complexity must be some polynomial of n and log m. In this paper, it is shown that under these conditions, there still exist learners which execute exhaustive search. For instance, there exists a learner which learns CFL represented by CFG in the weak definition using membership queries and equivalence queries under the proper termination condition, and there also exists a learner that learns CFL represented by CFG in the strong definition using subset queries and superset queries under the LBC-condition. These results suggest that the weak definition is not an adequate learning model even if the proper termination condition is assumed. Also, the model becomes inadequate in the strong definition if some combination of queries, such as subset queries and superset queries, is used instead of equivalence queries. Many classes of languages become learnable by our "extracting long counterexamples" technique. However, it is still open whether or not CFL represented by CFG is learnable in the strong definition from membership queries and equivalence queries, although the answer is known to be negative if at least one of (1) computing quadratic residues modulo a composite, (2) inverting RSA encryption, or (3) factoring Blum integers is intractable.

  • A Regularization Method for Neural Network Learning that Minimizes Estimation Error

    Miki YAMADA  

     
    PAPER-Regularization

      Vol:
    E77-D No:4
      Page(s):
    418-424

    A new regularization cost function for generalization in real-valued function learning is proposed. This cost function is derived from the maximum likelihood method using a modified sample distribution, and consists of a sum of squared errors and a stabilizer which is a function of integrated squared derivatives. Each of the regularization parameters that gives the minimum estimation error can be obtained uniquely and non-empirically. The parameters are not constants and change in value during learning. Numerical simulation shows that this cost function predicts the true error accurately and is effective in neural network learning.
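
    As a generic illustration of this shape of cost function (a Tikhonov-style sketch under my own discretization, not the paper's exact stabilizer or parameter schedule), the penalty on integrated squared derivatives can be approximated by finite differences on a grid:

      import numpy as np

      def regularized_cost(x_grid, f_grid, y_obs, f_at_obs, lam=0.1):
          # sum of squared errors on the observations ...
          sq_err = np.sum((np.asarray(y_obs) - np.asarray(f_at_obs)) ** 2)
          # ... plus lam times the integral of (f')^2, both approximated
          # on the grid by finite differences
          h = np.diff(x_grid)
          deriv = np.diff(f_grid) / h
          stabilizer = np.sum(deriv ** 2 * h)
          return sq_err + lam * stabilizer

      x = np.linspace(0.0, 1.0, 101)
      f = np.sin(2 * np.pi * x)             # a candidate function on the grid
      print(regularized_cost(x, f, y_obs=[0.0], f_at_obs=[f[50]]))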

  • Polynomially Sparse Variations and Reducibility among Prediction Problems

    Naoki ABE  Osamu WATANABE  

     
    PAPER

      Vol:
    E75-D No:4
      Page(s):
    449-458

    We investigate the relationship between two different notions of reducibility among prediction (learning) problems within the distribution-free learning model of Valiant (the PAC learning model). The notions of reducibility we consider are the analogues for prediction problems of many-one reducibility and of Turing reducibility. The former is the notion of prediction preserving reducibility developed by Pitt and Warmuth, and its generalization. Concerning these two notions of reducibility, we show that there exists a pair of prediction problems A and B, whose membership problems are polynomial time solvable, such that A is reducible to B with respect to the Turing reducibility, but not with respect to the prediction preserving reducibility. We show this result by making use of the notion of a class of polynomially sparse variants of a concept representation class. We first show that any class A of polynomially sparse variants of another class B is reducible to B with respect to the Turing reducibility. We then prove the existence of a prediction problem R and a class R′ of polynomially sparse variants of R such that R′ does not reduce to R with respect to the prediction preserving reducibility.

  • Algorithmic Learning Theory with Elementary Formal Systems

    Setsuo ARIKAWA  Satoru MIYANO  Ayumi SHINOHARA  Takeshi SHINOHARA  Akihiro YAMAMOTO  

     
    INVITED PAPER

      Vol:
    E75-D No:4
      Page(s):
    405-414

    The elementary formal system (EFS, for short) is a kind of logic program which directly manipulates character strings. This paper briefly outlines the authors' studies on algorithmic learning theory developed in the framework of EFS's. We define two important classes of EFS's and a new hierarchy of various language classes. Then we discuss EFS's as logic programs. We show that EFS's form a good framework for inductive inference of languages by presenting a model inference system for EFS's in Shapiro's sense. Using this framework we also show that inductive inference from positive data and PAC-learning are both much more powerful than previously believed. We illustrate an application of our theoretical results to molecular biology.