Wednesday 15 February, 2006

Noam Chomsky

Noam Chomsky was recently voted the world’s greatest living intellectual, ahead of great scientists like Stephen Hawking. He is primarily a linguist, but he has done yeoman service to the information revolution: his pathbreaking study of the grammars of natural languages led to the development of grammars for computer languages. He classified grammars into four types, and this classification is known as the Chomsky hierarchy.
The following passages are reproduced from Wikipedia. Remember the Turing test? It is a test devised by A. Turing to determine whether the answers to a set of questions come from a machine with artificial intelligence or from a human being. It is a litmus test for machines with AI.
- 3vkrm
Avram Noam Chomsky (born December 7, 1928) is the Institute Professor Emeritus of linguistics at the Massachusetts Institute of Technology. Chomsky is credited with the creation of the theory of generative grammar, often considered the most significant contribution to the field of theoretical linguistics of the 20th century. He also helped spark the cognitive revolution in psychology through his review of B. F. Skinner's Verbal Behavior, which challenged the behaviorist approach to the study of mind and language dominant in the 1950s. His naturalistic approach to the study of language has also affected the philosophy of language and mind (see Harman, Fodor). He is also credited with the establishment of the so-called Chomsky hierarchy, a classification of formal languages in terms of their generative power.
Along with his linguistics work, Chomsky is also widely known for his political activism, and for his criticism of the foreign policy of the United States and other governments. Chomsky describes himself as a libertarian socialist, a sympathizer of anarcho-syndicalism, and is often considered to be a key intellectual figure within the left wing of American politics.
According to the Arts and Humanities Citation Index, between 1980 and 1992 Chomsky was cited as a source more often than any living scholar, and the eighth most cited source overall.
Chomsky hierarchy
The Chomsky hierarchy is a containment hierarchy of classes of formal grammars that generate formal languages. This hierarchy of grammars, which are also called phrase structure grammars, was described by Noam Chomsky in 1956.
Formal grammars
A formal grammar consists of a finite set of terminal symbols (the letters of the words in the formal language), a finite set of nonterminal symbols, a finite set of production rules with a left- and a right-hand side consisting of a word of these symbols, and a start symbol. A rule may be applied to a word by replacing the left-hand side by the right-hand side. A derivation is a sequence of rule applications. Such a grammar defines the formal language of all words consisting solely of terminal symbols that can be reached by a derivation from the start symbol.
Nonterminals are usually represented by uppercase letters, terminals by lowercase letters, and the start symbol by S. For example, the grammar with terminals {a,b}, nonterminals {S,A,B}, production rules
S → ABS
S → ε (where ε is the empty string)
BA → AB
BS → b
Bb → bb
Ab → ab
Aa → aa
and start symbol S, defines the language of all words of the form a^n b^n (i.e. n copies of a followed by n copies of b). The following is a simpler grammar that defines a similar language: terminals {p,q}, nonterminals {S}, start symbol S, production rules
S → pSq
S → ε
See formal grammar for a more elaborate explanation.
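As a rough illustration of how derivations work, the following Python sketch treats a sentential form as a string and applies the production rules by breadth-first search. The names `successors` and `generate`, the length bound, and the convention that uppercase characters are nonterminals are assumptions made for this example only.

```python
from collections import deque

# The two example grammars from the text, as (left-hand side, right-hand side)
# pairs. The empty string "" stands for ε.
RULES_AB = [("S", "ABS"), ("S", ""), ("BA", "AB"), ("BS", "b"),
            ("Bb", "bb"), ("Ab", "ab"), ("Aa", "aa")]
RULES_PQ = [("S", "pSq"), ("S", "")]

def successors(form, rules):
    """Yield every string reachable from `form` by one rule application."""
    for lhs, rhs in rules:
        pos = form.find(lhs)
        while pos != -1:
            yield form[:pos] + rhs + form[pos + len(lhs):]
            pos = form.find(lhs, pos + 1)

def generate(rules, start="S", max_len=6):
    """Collect all terminal words of length <= max_len derivable from `start`.

    Assumption: for these two grammars a sentential form is never more than
    one symbol longer than the word it leads to, so forms longer than
    max_len + 1 are discarded. Other grammars may need a different bound.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if all(ch.islower() for ch in form):
            # Only terminals left: this is a word of the language.
            if len(form) <= max_len:
                words.add(form)
            continue
        for nxt in successors(form, rules):
            if len(nxt) <= max_len + 1 and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(words, key=len)

print(generate(RULES_AB))  # ['', 'ab', 'aabb', 'aaabbb']
print(generate(RULES_PQ))  # ['', 'pq', 'ppqq', 'pppqqq']
```

Both grammars produce words of the same shape, which is what "a similar language" means here: equal numbers of two letters, with all copies of the first letter ahead of the second.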
The hierarchy
The Chomsky hierarchy consists of the following levels:
· Type-0 grammars (unrestricted grammars) include all formal grammars. They generate exactly all languages that can be recognized by a Turing machine. These languages are also known as the recursively enumerable languages. Note that this is different from the recursive languages which can be decided by an always halting Turing machine.
· Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages. These grammars have rules of the form αAβ → αγβ with A a nonterminal and α, β and γ strings of terminals and nonterminals. The strings α and β may be empty, but γ must be nonempty. The rule S → ε is allowed if S does not appear on the right side of any rule. The languages described by these grammars are exactly all languages that can be recognized by a non-deterministic Turing machine whose tape is bounded by a constant times the length of the input.
· Type-2 grammars (context-free grammars) generate the context-free languages. These are defined by rules of the form A → γ with A a nonterminal and γ a string of terminals and nonterminals. These languages are exactly all languages that can be recognized by a non-deterministic pushdown automaton. Context-free languages are the theoretical basis for the syntax of most programming languages.
· Type-3 grammars (regular grammars) generate the regular languages. Such a grammar restricts its rules to a single nonterminal on the left-hand side and a right-hand side consisting of a single terminal, possibly followed (or preceded, but not both in the same grammar) by a single nonterminal. The rule S → ε is also allowed here if S does not appear on the right side of any rule. These languages are exactly all languages that can be decided by a finite state automaton. Additionally, this family of formal languages can be described by regular expressions. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.
Note that the set of grammars corresponding to recursive languages is not a member of this hierarchy.
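The rule shapes listed above can be checked mechanically. The sketch below is a minimal classifier, assuming single-character symbols with uppercase letters as nonterminals and "" standing for ε; the function names and the last two sample grammars are hypothetical, chosen only for illustration. It reports the most restrictive type whose rule form a grammar satisfies, which says something about the grammar as written, not about the language it generates.

```python
def epsilon_allowed(rules, start):
    """S -> ε is permitted only if S never occurs on a right-hand side."""
    return all(start not in rhs for _, rhs in rules)

def is_regular(rules, start="S"):
    right_linear = left_linear = True
    for lhs, rhs in rules:
        if len(lhs) != 1 or not lhs.isupper():
            return False
        if rhs == "":
            if lhs != start or not epsilon_allowed(rules, start):
                return False
        elif len(rhs) == 1 and rhs.islower():
            pass                              # A -> a
        elif len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper():
            left_linear = False               # A -> aB
        elif len(rhs) == 2 and rhs[0].isupper() and rhs[1].islower():
            right_linear = False              # A -> Ba
        else:
            return False
    # A -> aB and A -> Ba must not be mixed in the same grammar.
    return right_linear or left_linear

def is_context_free(rules):
    return all(len(lhs) == 1 and lhs.isupper() for lhs, _ in rules)

def fits_context_form(lhs, rhs):
    """True if the rule can be read as αAβ -> αγβ with γ nonempty."""
    for i, sym in enumerate(lhs):
        if not sym.isupper():
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        if (len(rhs) > len(alpha) + len(beta)
                and rhs.startswith(alpha) and rhs.endswith(beta)):
            return True
    return False

def is_context_sensitive(rules, start="S"):
    for lhs, rhs in rules:
        if rhs == "" and lhs == start and epsilon_allowed(rules, start):
            continue
        if not fits_context_form(lhs, rhs):
            return False
    return True

def classify(rules, start="S"):
    """Return 3, 2, 1 or 0: the most restrictive type the rules satisfy."""
    if is_regular(rules, start):
        return 3
    if is_context_free(rules):
        return 2
    if is_context_sensitive(rules, start):
        return 1
    return 0

ab_grammar = [("S", "ABS"), ("S", ""), ("BA", "AB"), ("BS", "b"),
              ("Bb", "bb"), ("Ab", "ab"), ("Aa", "aa")]
pq_grammar = [("S", "pSq"), ("S", "")]
regular_example = [("S", "aS"), ("S", "b")]              # hypothetical
contextual_example = [("S", "AB"), ("AB", "AC"),
                      ("A", "a"), ("C", "c")]            # hypothetical

print(classify(ab_grammar))          # 0
print(classify(pq_grammar))          # 2
print(classify(regular_example))     # 3
print(classify(contextual_example))  # 1
```

The first example grammar comes out as Type-0 because rules such as BA → AB and BS → b fit none of the restricted forms, even though the language it generates is the same kind of language that the Type-2 grammar S → pSq, S → ε generates. The type of a language is the most restrictive type of any grammar that generates it.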
Every regular language is context-free, every context-free language is context-sensitive, every context-sensitive language is recursive, and every recursive language is recursively enumerable. These are all proper inclusions, meaning that there exist recursively enumerable languages that are not recursive, recursive languages that are not context-sensitive, context-sensitive languages that are not context-free, and context-free languages that are not regular.
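To make the two lowest levels concrete, here is a small Python sketch of the corresponding recognizers. The choice of p*q* as the regular language, the state names, and the function names are assumptions for this example. The first function is a finite state automaton with no memory beyond its current state; the second adds a stack, which is what lets it match each q against an earlier p. This particular pushdown recognizer happens to be deterministic, although the general result for context-free languages requires non-deterministic pushdown automata.

```python
def accepts_finite(word):
    """Finite state automaton for the regular language p*q*."""
    transitions = {
        ("P", "p"): "P",   # still reading p's
        ("P", "q"): "Q",   # first q seen
        ("Q", "q"): "Q",   # more q's; a p is no longer allowed
    }
    state = "P"
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return False   # no transition defined: reject
    return True            # both states are accepting

def accepts_pushdown(word):
    """Pushdown-style recognizer for p^n q^n with n >= 0."""
    stack = []
    seen_q = False
    for ch in word:
        if ch == "p":
            if seen_q:
                return False       # a p after a q is out of order
            stack.append("p")      # remember one unmatched p
        elif ch == "q":
            seen_q = True
            if not stack:
                return False       # more q's than p's
            stack.pop()            # match this q against a p
        else:
            return False
    return not stack               # every p must have been matched

print(accepts_finite("ppq"), accepts_pushdown("ppq"))    # True False
print(accepts_finite("ppqq"), accepts_pushdown("ppqq"))  # True True
```

The finite automaton accepts "ppq" because it cannot count; keeping the numbers of p's and q's equal for arbitrary n needs unbounded memory, which the stack provides.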
The following table summarizes each of Chomsky's four types of grammars, the class of languages it generates, the type of automaton that recognizes it, and the form its rules must have.


Grammar | Languages | Automaton | Production rules
------- | --------- | --------- | ----------------
Type-0 | Recursively enumerable | Turing machine | No restrictions
Type-1 | Context-sensitive | Linear-bounded non-deterministic Turing machine | αAβ → αγβ
Type-2 | Context-free | Non-deterministic pushdown automaton | A → γ
Type-3 | Regular | Finite state automaton | A → a, and either A → aB or A → Ba


Automata theory: formal languages and formal grammars

Chomsky hierarchy | Grammars | Languages | Minimal automaton
----------------- | -------- | --------- | -----------------
Type-0 | (unrestricted) | Recursively enumerable | Turing machine
 | (unrestricted) | Recursive | Decider
Type-1 | Context-sensitive | Context-sensitive | Linear-bounded
Type-2 | Context-free | Context-free | Pushdown
Type-3 | Regular | Regular | Finite
Each category of languages or grammars is a proper superset of the category directly beneath it.
