Compiler Design (May-2017) Unit-IV

Question: -8(a) Explain the main issues of code generation  in detail.

Answer: – Issues in the design of code generator are:

  1. Input to the Code Generator
  • An input to the code generator consists of the intermediate representation of the source program.
  • Moreover, There are several types of the intermediate language, such as postfix notation, quadruples, and syntax trees or DAGs.
  • After The detection of semantic error should be done before submitting the input to the code submitting generator.
  • The code generation phase require complete error-free intermediate code as an input.requires
  1. Target program
  • The output of the code generator is the target program. The output may take on a variety of forms; absolute machine language, relocatable machine language, or assembly language, language.
  • Producing an absolute machine language program as output has the advantage that it can be placed in a location in memory and immediately executed.
  • Producing a relocatable machine language program as output is that the subroutine can be compiled separately. A set of relocatable object modules can link together and loaded for execution by a linking loader.
  • Similarly, Producing an assembly language program as output makes the process of code generation somewhat easier.We can generate symbolic instructions and use the macro what facilities of the assembler to help generate core.
  1. Memory management
  • Mapping names in the source program to addresses of data objects in run-time memory are done cooperatively by the front end and the code generator.
  • So that We assume that a name in a three-address statement refers to a symbol table entry for three-address the name.
  • From the symbol table information, a relative address can determine for the name bol
    in a data area.

4.Instruction selection

  • If we do not care about the efficiency of the target program, instruction selection is straightforward. It requires special handling. So that, For example, the sequence of statement example,
    a := b + c
    d := a + e
    would be translated into
    MOV b, R0
    ADD c, R0
    MOV R0, a
    MOV a, R0
    ADD e, R0
    MOV R0, d
  • So, Here the fourth statement is redundant, so we can eliminate that statement.

5.Register allocation Issues: Code Generation

  • If the instruction contains register operands then such a use becomes shorter and faster than that of using in memory.
  • Moreover, The use of registers often subdivided into two subproblems:
  • During register allocation, we select the set of variables that will reside in registers at a point in the program.
  • During a subsequent register assignment phase, we pick the specific register that a variable will reside in.
  • Finding an optimal assignment of registers to variables is difficult, even with single register value.
  • Mathematically the problem is NPNP-complete.

6.Choice of evaluation Issues: Code Generation

  • The order in which computations performed can affect the efficiency of the target code. Some computation orders require fewer registers to hold intermediate results than others. Picking the best order is another difficult, NP-complete problem.NP-complete

7.Approaches to code generation Issues: Code Generation

  • Similarly, The most important criterion for a code generator is that it produces correct code.
  • Correctness takes on special significance because of the number of special cases that code generator must face.
  • Given the premium on correctness, designing a code generator so it can easily implement, tested, and maintained is an important design goal.

Question: -8(b) Define Peephole optimization. List the characteristics of peephole optimization.

Answer: – Definition: Peephole optimization is a simple and effective technique for locally improving target code. This technique is applied to improve the performance of the target program by examining the short sequence of target instructions (called the peephole) and replace these instructions replacing by shorter or faster sequence whenever possible. Peephole is a small, moving window on the target program.

Characteristics of Peephole Optimization

So The peephole optimization can be applied to the target code using the following characteristic.

1. Redundant instruction elimination

  • Especially the redundant loads and stores can be eliminated in this type of transformations.
    MOV R0,x
    MOV x,R0
  • We can eliminate the second instruction since x is in already R0. But if MOV x, R0 is a label statement then we can not remove it.

2. Unreachable code

  • Especially the redundant loads and stores can be eliminated in this type of transformations.
  • An unlabeled instruction immediately following an unconditional jump may be removed.
  • This operation can be repeated to eliminate the sequence of instructions.
    #define debug 0
    If(debug) {
    Print debugging information
    In the intermediate representation the if statement may be translated as if-
    If debug=1 goto L1
    goto L2
    L1: print debugging information
  • Additionally, One obvious peephole optimization is to eliminate jumps over jumps. Thus no matter
    what the value of debugging, can replaced by:
    If debug goto L2debug≠1
    Print debugging information
  • Now, since debug set to 0 at the beginning of the program, constant propagation program,
    should replace by
    If 0≠1 goto L2≠1
    Print debugging information
  • As the argument of the first statement of evaluates to a constant true, it can replace by goto L2.
  • Then all the statement that prints debugging aids are manifestly unreachable and can manifestly eliminate one at a time.

3. The flow of control optimization

  • The unnecessary jumps can eliminate in either the intermediate code or the target code by the following types of peephole optimizations.
  • We can replace the jump sequence.
    Goto L1
    L1: goto L2
    By the sequence
    Goto L2
    L1: goto L2
  • If there are no jumps to L1 then it may be possible to eliminate the statement L1: goto L2 provided it preceded by an unconditional jump. Similarly, the sequence
    If a<b goto L1
    L1: goto L2
    Can replaced by
    If a<b goto L2
    L1: goto L2

4. Algebraic simplification

  • So Peephole optimization is an effective technique for algebraic simplification.
  • The statements such as x = x + 0 or x := x* 1 can eliminated by peephole optimization.

5. Reduction in strength

  • Certain machine instructions are cheaper than the other.
  • In order to improve the performance of the intermediate code, we can replace these instructions by equivalent cheaper instruction.
  • So For example, x2 is cheaper than  x * x. Similarly, addition and subtraction are cheaper than multiplication and division. So we can add an effectively equivalent addition and subtraction for multiplication and division.

6. Machine idioms

  • So The target instructions have equivalent machine instructions for performing some have operations.
  • Hence we can replace these target instructions by equivalent machine instructions in order to improve the efficiency.
  • Example: Some machines have auto-increment or auto-decrement addressing modes.decrement These modes can use in a code for a statement like i=i+1.

Question: -9(a) Explain DAG representation of basic blocks in detail.

Answer: –DAG Representation

  • DAG stands for Directed Acyclic Graph
  • Syntax tree and DAG both, are graphical representations. Syntax tree does not find the common sub expressions whereas DAG can
  • Another usage of DAG is the application of optimization technique in the basic block
  • To apply optimization technique on basic block, a DAG is a constructed three address code which is the output of an intermediate code generation

Characteristics of DAG are:

  • All internal nodes store operator values
  • External or leaf nodes are identifiers or variable names or constants

Algorithm for Construction of DAG

There are three possible cases to construct DAG on three address code:

Case 1: x = y op z

Case 2: x = op y

Case 3: x = y

DAG can be constructed as follows:


If y is undefined then create a node with label y. Similarly create a node with label z.


For case 1, create a node with label op whose left child is node y, and node z will be the right child. Also, check for any common sub expressions. For case 2, determine if a node is labelled op. such node will have a child node y. for case 3 node n will be node y.


Delete x from list of identifiers for node x. append x to the list of attached identifiers for node n found in step 2.


Consider the following three address code statements.

a = b * c

d = b

e = d * c

b = e

f = b + c

g = f + d

Step 1

Consider the first statement, i.e., a = b * c. Create a leaf node with label b and c as left and right child respectively and parent of it will be *. Append resultant variable a to the node *.

Step 2

For second statement, i.e., d = b, node b is already created. So, append d to this node.


Step 3

For third statement e = d * c, the nodes for d, c and * are already create. Node e  is not created, so append node e to node *.


Step 4

For fourth statement b = e, append b to node e.


Step 5

For fifth statement f = b + c, create a node for operator + whose left child b and right child c and append f to newly created node +


Step 6

For last statement g = f + d, create a node for operator + whose left child d and right child f and append g to newly created node +.


DAG Applications:

  • Determines the common sub expressions
  • Determines the names used inside the block, and the names that are computed outside the block
  • Determines which statements of the block could have their computed value outside the block
  • Code may be represented by a DAG describing the inputs and outputs of each of the arithmetic operations performed within the code; this representation allows the compiler to perform common subexpression elimination efficiently
  • Several programming languages describe systems of values that are related to each other by a directed acyclic graph. When one value changes, its successors are recalculate; each value is evaluated as a function of this predecessors in the DAG

Question: – 9(b) Explain various code optimization techniques. Discuss the strategies for loop optimization and dead code elimination.

Answer:-Code Optimization Techniques

  • Optimization is a program transformation technique, which tries to improve the code by making it consume less resources (i.e. CPU, Memory) and deliver high speed.
  • In optimization, high-level general programming constructs are replaced by very efficient low-level programming codes.
  • A code optimizing process must follow the three rules given below:
    1. The output code must not, in any way, change the meaning of the program.
    2. Optimization should increase the speed of the program and if possible, the program should demand less number of resources.
    3. Optimization should itself be fast and should not delay the overall compiling process.
  • Efforts for an optimized code can be made at various levels of compiling the process.
  1. At the beginning, users can change/rearrange the code or use better algorithms to write the code.
  2. After generating intermediate code, the compiler can modify the intermediate code by address calculations and improving loops.
  3. While producing the target machine code, the compiler can make use of memory hierarchy and CPU registers.
  • Optimization can be categorized broadly into two types:

Machine Independent and Machine Dependent.

Machine Independent Optimization

  • In this optimization, the compiler takes in the intermediate code and transforms a part of the code that does not involve any CPU registers and/or absolute memory locations.
  • For example:
  • do
  • {
  • item = 10;
  •  value = value + item;
  • } while(value<100);
  • This code involves repeated assignment of the identifier item, which if we put this way:
  • Item = 10;
  • do
  • {
  • value = value + item;
  • } while(value<100);

should not only save the CPU cycles, but can be used on any processor.

Machine Dependent Optimization

  • Machine-dependent optimization is done after the target code has been generated and when the code is transformed according to the target machine architecture.
  • It involves CPU registers and may have absolute memory references rather than relative references.
  • Machine-dependent optimizers put efforts to take maximum advantage of memory hierarchy.

Machine independence includes two types

  1. Function Preserving
  2. Loop optimization
  • Function preserving
  • Common Sub Expression Elimination

The expression that produces the same results should be removed out from the code


T1 = 4+i T1 = 4+i
T2 = T2 +T1 T2 = T2 +T1
T3 = 4 * i T4 = T2 + T1
T4 = T2 + T3
  • Constant folding

If expression generates a constant value then instead of performing its calculation again and again we calculate it once and assign it.


T1 = 512 T1 = 2.5
  • Copy Propagation

In this propagation a F value is been send to G and G value is been send to H We can eliminate G variable directly assigning the value of F to H.

T1 = X T2 = T3 + T2
T2 = T3 + T2 T3 = T1
T3 = X
  • Dead Code Elimination

Dead code is one or more than one code statements, which are:

    • Either never executed or unreachable,
    • Or if executed, their output is never used.

Thus, dead code plays no role in any program operation and therefore it can simply be eliminated.

Partially dead code Elimination

  • There are some code statements whose computed values are used only under certain circumstances, i.e., sometimes the values are used and sometimes they are not.
  • Such codes are known as partially dead-code.

  • The above control flow graph depicts a chunk of program where variable ‘a’ is used to assign the output of expression ‘x * y’.
  • Let us assume that the value assigned to ‘a’ is never used inside the loop.
  • Immediately after the control leaves the loop, ‘a’ is assigned the value of variable ‘z’, which would be used later in the program.
  • We conclude here that the assignment code of ‘a’ is never used anywhere, therefore it is eligible to be eliminated.
  • Likewise, the picture below depicts that the conditional statement is always false, implying that the code, written in true case, will never be executed, hence it can be removed.

  1. Loop Optimization

We are going to perform optimization on loops.

  • Code Motion

It specifies on a condition if we perform some operations to be carried out and then compare for a condition.

Instead of that perform the calculation outside the loop and assign a value in the calculation.

While(i < = limit-2)

T1 = limit – 2
While (i< =t1)
  • Strength Reduction

It specifies the operators such as multiplication and division can be replaced by a addition and subtraction respectively.

The multiplication operator can be easily replaced by left shift operator a<<1 Division can be replaced by a a>>1 operator.

T1 = a * 2 a<<1
T1 = a / 2 a >> 1
  • Frequency Reduction

In this case if a expression inside a loop is not dynamically affected by a loop we calculate it outside the loop and use the value inside the loop.

  • Loop Distribution

It specifies the values in a particular loop to be assigned to a array keeps of varing i.e the array location in which a loop need to be work again and again. We can use two different loops which allows loop distribution

Peephole Optimization

  • This optimization technique works locally on the source code to transform it into an optimized code. By locally, we mean a small portion of the code block at hand.
  • These methods can be applied on intermediate codes as well as on target codes.
  • A bunch of statements is analyzed and are checked for the following possible optimization:
  1. Redundant instruction elimination
  • At source code level, the following can be done by the user:

At compilation level, the compiler searches for instructions redundant in nature. Multiple loading and storing of instructions may carry the same meaning even if some of them are removed. For example:

  • MOV x, R0
  • MOV R0, R1

We can delete the first instruction and re-write the sentence as:

MOV x, R1

  1. Unreachable code
  • Unreachable code is a part of the program code that is never accessed because of programming constructs.
  • Programmers may have accidently written a piece of code that can never be reached.


void add_ten(int x)


return x + 10;

printf(“value of x is %d”, x);


  • In this code segment, the printf statement will never be executed as the program control returns back before it can execute, hence printf can be removed.
  • In this code segment, the printf statement will never be executed as the program control returns back before it can execute, hence printf can be removed.
  1. Flow of control optimization
  • There are instances in a code where the program control jumps back and forth without performing any significant task.
  • These jumps can be removed. Consider the following chunk of code:
  • MOV R1, R2
  • GOTO L1
  • L1:   GOTO L2
  • L2: INC R1
  • In this code, label L1 can be removed as it passes the control to L2. So instead of jumping to L1 and then to L2, the control can directly reach L2, as shown below:
  • MOV R1, R2
  • GOTO L2
  •             L2:   INC R1
  1. Algebraic expression simplification
  • There are occasions where algebraic expressions can be made simple. For example, the expression a = a + 0 can be replaced by a itself and the expression a = a + 1 can simply be replaced by INC a.
  1. Strength reduction
  • There are operations that consume more time and space. Their ‘strength’ can be reduced by replacing them with other operations that consume less time and space, but produce the same result.
  • For example, x * 2 can be replaced by x << 1, which involves only one left shift. Though the output of a * a and a2 is same, a2 is much more efficient to implement.
  1. Accessing machine instructions
  • The target machine can deploy more sophisticated instructions, which can have the capability to perform specific operations much efficiently.
  • If the target code can accommodate those instructions directly, that will not only improve the quality of code, but also yield more efficient results.

This article is contributed by Tarun Jangra. Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

Author: Tarun Jangra

Leave a Reply

Your email address will not be published. Required fields are marked *