Writing a compiler is not a simple project, and anything that makes the task simpler is worth exploring. At a very early stage in the history of compiler development it was recognised that some aspects of compiler design could be automated. Consequently, a great deal of effort has been directed towards the development of software tools to aid the production of a compiler. The two best-known software tools for compiler construction are Lex (a lexical analyzer generator) and Yacc (a parser generator), both of which are available under the UNIX operating system. Their continuing popularity is due partly to their widespread availability, but also to the fact that they are powerful, easy to use, and applicable to a wide range of problems. This section describes these two software tools.
Yacc generates a parser; Lex generates a lexical analyzer. They are typically used together: the Lex-generated scanner turns the character input into tokens, and the Yacc-generated parser consumes the token stream that Lex provides.
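As a sketch of how the two tools cooperate, consider a hypothetical language of integers joined by `+`. The specifications below are illustrative only; the token name NUMBER and the grammar are assumptions, not a fixed standard. The Lex specification maps character patterns to tokens:

```
%{
#include "y.tab.h"   /* token codes produced by Yacc */
%}
%%
[0-9]+    { yylval = atoi(yytext); return NUMBER; }
"+"       { return '+'; }
[ \t\n]   ;           /* skip whitespace */
%%
```

and the matching Yacc specification describes how those tokens may be combined:

```
%token NUMBER
%%
expr : expr '+' NUMBER
     | NUMBER
     ;
%%
```

Yacc compiles the grammar into a parsing function `yyparse()`, which repeatedly calls the Lex-generated `yylex()` to obtain the next token.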
Now, a regular expression can only describe regular languages. One of the constraints of a regular language is its lack of "memory": you cannot make the rules for accepting the rest of the string depend on what has come before.
Input Stream (characters) -> Lex (tokens) -> Yacc (Abstract Syntax Tree) -> Your Application
This is most clearly seen in the case of parentheses. A regular language cannot match nested parentheses to the correct depth, nor any other such recursive structure. The grammars of (most) programming languages do contain such structures and, because of that, they cannot be parsed with a lexer or a regular expression alone. That is where Yacc comes in.
One can reverse the question as well: if Yacc can do more, why not use it for the lexical analysis too? It so happens that a regular expression can be matched very efficiently, which is not true of general grammars to the same degree. Still, Yacc can manage with basic lexical analysis if the lexical rules of the language are simple enough.