Regular Language

https://en.wikipedia.org/wiki/Regular_grammar

Sequence

Let X be a set. The set of sequences over X, called Xโˆ—, is defined as follows:

  • ฯต is a sequence, called the empty sequence

  • if xs is a sequence and x โˆˆ X, then x:xs is also a sequence

  • nothing else is a sequence

Letters from the beginning of the alphabet represent single symbols (a,b,c), and letters from the end of the alphabet represent sequences of symbols (x,y,z).

CFG

  • ๐šฒ ๐›Œ ๐œ€ ๐šบโˆ— ๐›™ ๐›— ๐›” ๐“

  • sequence

  • symbols ๐“ช๐“ซ๐“ฌ๐“ญ๐“ฎ๐“๐”‚๐“

    • single symbols, ๐“ช ๐“ซ ๐“ฌ ๐“ญ ๐“ฎ

    • sequence of symbols, ๐“ ๐”‚ ๐“

  • ๐œ€ the empty sequence

  • ๐šบ alphabet (finite set)

  • ๐šบโˆ— infinite set of all possible sequences over an alphabet ๐šบ

  • ๐šฒ language, ๐šฒ โІ ๐šบโˆ— for an alphabet ๐šบ

  • ๐šฒสณ complement of ๐šฒ language, ๐šฒสณ = {๐“|๐“ โˆˆ ๐šบโˆ—, ๐“ โˆ‰ ๐šฒ}

  • ๐›” sentence, ๐›” โˆˆ ๐šฒ

  • predicate/property ๐›™(๐šฒ) = โˆ€๐“.๐“ โˆˆ ๐šฒ -> ๐›™(๐“) such that ๐šฒ โІ ๐›™

  • alphabet T is a finite set of symbols

  • Tโˆ— is infinite set of all possible sequences over an alphabet T

  • language L is a subset of Tโˆ—, L โІ Tโˆ—, for the alphabet T

  • sentence s is an element of a language, s โˆˆ L

Eexamples:

  • alphabet, T

    • binary = {0,1}

    • digits = {0..9}

    • English = {a..z} = l

  • languages L over the alphabet T

    • T*

    • โˆ…

    • {ฯต}

    • T

    • S = {course, practice, exam} is a language over the English alphabet l

  • sentence

    • "exam" is a sentence in the language S over the English alphabet l

The question is now how to specify a language. Since a language is a set, we immediately recognize 3 options:

  • enumerate all the elements of the set explicitly

  • characterise the elements of the set by means of a predicate

  • define which elements belong to the set by means of induction

Examples of the second approach:

  • the even natural numbers {n | n โˆˆ {0..9}โˆ—, n mod 2 = 0}

  • PAL, the palindromes over the alphabet ฮฃ={a,b,c}: {s|s โˆˆ ฮฃโˆ—, s=sสณ}

    where sสณ denotes the reverse of sequence s.

One of the fundamental differences between the predicative and the inductive approach to defining a language is that the inductive approach is constructive, i.e. it provides us with a way to enumerate all elements of a language.

If we define a language by means of a predicate, we only have a means to determine whether or not an element belongs to a language. A famous example of a language which is easily defined in a predicative way, but for which the membership test is very hard, is the set of prime numbers.

Quite often we want to prove that a language L, which is defined by means of an inductive definition, has some property P. If P is of the form P(L) = โˆ€x.x โˆˆ L -> P(x), then we want to prove that L โІ P.

Since languages are sets the usual set operators may be used to construct new languages from existing ones. The complement of a language L over an alphabet T is defined by Lสณ = {x|x โˆˆ Tโˆ—, x โˆ‰ L}.

Language operations

Let L and M be languages over the same alphabet T, then

  • Lแถœ = Tโˆ— โˆ’ L complement of L

  • Lสถ = { sสถ | s โˆˆ L} reverse of L

  • LM = { st | s โˆˆ L, t โˆˆ M} concatenation of L and M

  • Lโฐ = {ฯต} 0แต—สฐ power of L

  • Lโฟ = L ร— L ร— ... ร— L nแต—สฐ power of L

  • Lโˆ— = Lโฐ โ‹ƒ Lยน โ‹ƒ Lยฒ โ‹ƒ ... star โˆ’ closure of L

  • Lแฉ = Lยน โ‹ƒ Lยฒ โ‹ƒ ... positive closure of L

Last updated

Was this helpful?