Warning: Undefined array key "amp-addthis" in /home/tgagmvup/onlinestudy.guru/wp-content/plugins/addthis/backend/AddThisSharingButtonsFeature.php on line 101
lang="en-US"> Regular Expression - onlinestudy.guru
Site icon onlinestudy.guru

Regular Expression

The lexical analyzer needs to scan and identify only a finite set of valid string/token/lexeme that belong to the language in hand. It searches for the pattern defined by the language rules.

Regular expressions have the capability to express finite languages by defining a pattern for finite strings of symbols. The grammar defined by regular expressions is known as regular grammar. The language defined by regular grammar is known as regular language.

Regular expression is an important notation for specifying patterns. Each pattern matches a set of strings, so regular expressions serve as names for a set of strings. Programming language tokens can be described by regular languages. The specification of regular expressions is an example of a recursive definition. Regular languages are easy to understand and have efficient implementation.

There are a number of algebraic laws that are obeyed by regular expressions, which can be used to manipulate regular expressions into equivalent forms.

Operations

The various operations on languages are:

Notations

If r and s are regular expressions denoting the languages L(r) and L(s), then

Precedence and Associativity

Representing valid tokens of a language in regular expression

If x is a regular expression, then:

[a-z] is all lower-case alphabets of English language. [A-Z] is all upper-case alphabets of English language. [0-9] is all natural digits used in mathematics.

Representing occurrence of symbols using regular expressions

letter = [a – z] or [A – Z]

digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]

sign = [ + | – ]

Representing language tokens using regular expressions

Decimal = (sign)?(digit)+

Identifier = (letter)(letter | digit)*

The only problem left with the lexical analyzer is how to verify the validity of a regular expression used in specifying the patterns of keywords of a language. A well-accepted solution is to use finite automata for verification.

 

NEXT – From a Regular expression to an NFA

Exit mobile version