Regular Language

https://en.wikipedia.org/wiki/Regular_language

A regular language is a formal language that can be expressed using a regular expression (in the strict sense of the regex notion as used in TCS and as opposed to many regex engines provided by modern PL which are augmented with features which cannot be expressed by a classic regex). Alternatively, a regular language can be defined as a language recognized by a finite automaton.

The equivalence of regular expressions (regex) and finite automata is known as Kleene's theorem.

In the Chomsky hierarchy, regular languages are defined to be the languages that are generated by Type-3 grammars i.e. regular grammars.

Formal definition

The collection of regular languages over an alphabet Σ is defined recursively:

  • The empty language Ø is a regular language

  • The empty string language {ε} = {""} is a regular language

  • For each a ∈ Σ, the singleton language {a} is a regular language

  • If A and B are regular languages, then so are:

    • A ∪ B (union)

    • AB or A • B (concatenation)

    • A* (iteration, Kleene star)

  • No other languages over Σ are regular

Lexical structure of a PL is a set of token classes The basic regular expressions are the empty string, ε, and a single character/symbol of the language, e.g. 'c' (character 'c').

Last updated