Regular Language
Last updated
Was this helpful?
Last updated
Was this helpful?
Let X be a set. The set of sequences over X, called Xโ, is defined as follows:
ฯต is a sequence, called the empty sequence
if xs is a sequence and x โ X, then x:xs is also a sequence
nothing else is a sequence
Letters from the beginning of the alphabet represent single symbols (a,b,c), and letters from the end of the alphabet represent sequences of symbols (x,y,z).
๐ฒ ๐ ๐ ๐บโ ๐ ๐ ๐ ๐
sequence
symbols ๐ช๐ซ๐ฌ๐ญ๐ฎ๐๐๐
single symbols, ๐ช ๐ซ ๐ฌ ๐ญ ๐ฎ
sequence of symbols, ๐ ๐ ๐
๐ the empty sequence
๐บ alphabet (finite set)
๐บโ infinite set of all possible sequences over an alphabet ๐บ
๐ฒ language, ๐ฒ โ ๐บโ for an alphabet ๐บ
๐ฒสณ complement of ๐ฒ language, ๐ฒสณ = {๐|๐ โ ๐บโ, ๐ โ ๐ฒ}
๐ sentence, ๐ โ ๐ฒ
predicate/property ๐(๐ฒ) = โ๐.๐ โ ๐ฒ -> ๐(๐) such that ๐ฒ โ ๐
alphabet T
is a finite set of symbols
Tโ
is infinite set of all possible sequences over an alphabet T
language L
is a subset of Tโ
, L โ Tโ
, for the alphabet T
sentence s
is an element of a language, s โ L
Eexamples:
alphabet, T
binary = {0,1}
digits = {0..9}
English = {a..z} = l
languages L over the alphabet T
T*
โ
{ฯต}
T
S
= {course, practice, exam} is a language over the English alphabet l
sentence
"exam" is a sentence in the language S
over the English alphabet l
The question is now how to specify a language. Since a language is a set, we immediately recognize 3 options:
enumerate all the elements of the set explicitly
characterise the elements of the set by means of a predicate
define which elements belong to the set by means of induction
Examples of the second approach:
the even natural numbers {n | n โ {0..9}โ, n mod 2 = 0}
PAL
, the palindromes over the alphabet ฮฃ={a,b,c}: {s|s โ ฮฃโ, s=sสณ}
where sสณ denotes the reverse of sequence s.
One of the fundamental differences between the predicative and the inductive approach to defining a language is that the inductive approach is constructive, i.e. it provides us with a way to enumerate all elements of a language.
If we define a language by means of a predicate, we only have a means to determine whether or not an element belongs to a language. A famous example of a language which is easily defined in a predicative way, but for which the membership test is very hard, is the set of prime numbers.
Quite often we want to prove that a language L
, which is defined by means of an inductive definition, has some property P
. If P
is of the form
P(L) = โx.x โ L -> P(x)
, then we want to prove that L โ P
.
Since languages are sets the usual set operators may be used to construct new languages from existing ones. The complement of a language L
over an alphabet T
is defined by Lสณ = {x|x โ Tโ, x โ L}
.
Let L and M be languages over the same alphabet T, then
Lแถ = Tโ โ L complement of L
Lสถ = { sสถ | s โ L} reverse of L
LM = { st | s โ L, t โ M} concatenation of L and M
Lโฐ = {ฯต} 0แตสฐ power of L
Lโฟ = L ร L ร ... ร L nแตสฐ power of L
Lโ = Lโฐ โ Lยน โ Lยฒ โ ... star โ closure of L
Lแฉ = Lยน โ Lยฒ โ ... positive closure of L