Information about Algorithmic Learning Theory

Algorithmic learning theory (or inductive inference) is a framework for machine learning.

The framework was introduced in E. Mark Gold's seminal paper "Language identification in the limit". The objective of language identification is for a machine running one program to be capable of developing another program by which any given sentence can be tested to determine whether it is "grammatical" or "ungrammatical". The language being learned need not be English or any other natural language - in fact the definition of "grammatical" can be absolutely anything known to the tester.

In the framework of algorithmic learning theory, the tester gives the learner an example sentence at each step, and the learner responds with a hypothesis, which is a suggested program to determine grammatical correctness. It is required of the tester that every possible sentence (grammatical or not) appears in the list eventually, but no particular order is required. It is required of the learner that at each step the hypothesis must be correct for all the sentences so far.

A particular learner is said to be able to "learn a language in the limit" if there is a certain number of steps beyond which its hypothesis no longer changes. At this point it has indeed learned the language, because every possible sentence appears somewhere in the sequence of inputs (past or future), and the hypothesis is correct for all inputs (past or future), so the hypothesis is correct for every sentence. The learner is not required to be able to tell when it has reached a correct hypothesis, all that is required is that it be true.

Gold showed that any language which is defined by a Turing machine program can be learned in the limit by another Turing-complete machine using enumeration. This is done by the learner testing all possible Turing machine programs in turn until one is found which is correct so far - this forms the hypothesis for the current step. Eventually, the correct program will be reached, after which the hypothesis will never change again (but note that the learner does not know that it won't need to change).

Gold also showed that if the learner is given only positive examples (that is, only grammatical sentences appear in the input, not ungrammatical sentences), then the language can only be guaranteed to be learned in the limit if there are only a finite number of possible sentences in the language (this is possible if, for example, sentences are known to be of limited length).

Language identification in the limit is a very theoretical model. It does not allow for limits of runtime or computer memory which can occur in practice, and the enumeration method may fail if there are errors in the input. However the framework is very powerful, because if these strict conditions are maintained, it allows the learning of any program known to be computable. This is because a Turing machine program can be written to mimic any program in any conventional programming language. See Church-Turing thesis.

Other frameworks of learning consider a much more restricted class of function than Turing machines, but complete the learning more quickly (in polynomial time). An example of such a framework is Probably approximately correct learning.

External links

machine learning is concerned with the design and development of algorithms and techniques that allow computers to "learn". At a general level, there are two types of learning: inductive, and deductive.
..... Click the link for more information.
Language identification in the limit is a formal model for inductive inference. It was introduced by E. Mark Gold in his paper with the same title [1] . In this model, a learner is provided with presentation of some language. The learning is seen as an infinite process.
..... Click the link for more information.


Language identification is the process of determining which natural language given content is in. Traditionally, identification of written language - as practiced, for instance, in library science - has relied on
..... Click the link for more information.
English}}} 
Writing system: Latin (English variant) 
Official status
Official language of: 53 countries
Regulated by: no official regulation
Language codes
ISO 639-1: en
ISO 639-2: eng
ISO 639-3: eng  
..... Click the link for more information.
In the philosophy of language, a natural language (or ordinary language) is a language that is spoken, written, or signed (visually or tactilely) by humans for general-purpose communication, as distinguished from formal languages (such as computer-programming
..... Click the link for more information.
A hypothesis (from Greek ὑπόθεσις) consists either of a suggested explanation for a phenomenon or of a reasoned proposal suggesting a possible correlation between multiple phenomena.
..... Click the link for more information.
A computer program is one or more instructions that are intended for execution by a computer. Specifically, it is a symbol or combination of symbols forming an algorithm that may or may not terminate, and that algorithm is written in a programming language.
..... Click the link for more information.
Turing machines are extremely basic abstract symbol-manipulating devices which, despite their simplicity, can be adapted to simulate the logic of any computer that could possibly be constructed. They were described in 1936 by Alan Turing.
..... Click the link for more information.
enumeration of a set is either a procedure for listing all members of the set in some definite sequence, or a count of objects of a specified kind. The two kinds of enumeration often, but not always, overlap.
..... Click the link for more information.
In mathematics, a set is called finite if there is a bijection between the set and some set of the form where n is a natural number. (The value n = 0 is allowed; that is, the empty set is finite.) An infinite set is a set which is not finite.
..... Click the link for more information.
In computer science, runtime or run time describes the operation of a computer program, the duration of its execution, from beginning to termination (compare compile time).
..... Click the link for more information.
Computer data storage, computer memory, and often casually storage or memory refer to computer components, devices and recording media that retain digital data used for computing for some interval of time.
..... Click the link for more information.
A programming language is an artificial language that can be used to control the behavior of a machine, particularly a computer. Programming languages, like natural languagess, are defined by syntactic and semantic rules which describe their structure and meaning respectively.
..... Click the link for more information.
polynomial time refers to the computation time of a problem where the run time, m(n), is no greater than a polynomial function of the problem size, n.
..... Click the link for more information.
In computational learning theory Probably approximately correct learning (PAC learning) is a framework of machine learning proposed by Leslie Valiant in his paper A theory of the learnable that allows an accurate mathematical analysis of learning.
..... Click the link for more information.


This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus


page counter