# COMPILER DESIGN (MAY-2017) UNIT-III

#### Question:-  6(a) Explain quadruples and triples with example. Write three address  code of expression: a+a*(b+c)+(b-c)*d

Each instruction in quadruples presentation is divided into four fields: operator, arg1, arg2, and result. The above example is represented below in quadruples format:

• Using quadruple representation, the three-address statement x = y op z is represented by placing op in the operator field, y in the operand1 field, z in the operand2 field, and x in the result field.
• The statement x = op y , where op is a unary operator, is represented by placing op in the operator field, y in the operand1 field, and x in the result field; the operand2 field is not used.
• A statement like param t 1 is represented by placing param in the operator field and t 1 in the operand1 field; neither operand2 nor the result field are used.
• Unconditional and conditional jump statements are represented by placing the target labels in the result field.
• For example, a quadruple representation of the three-address code for the statement x = (a + b) * – c/d is shown in Table 6.1. The numbers in parentheses represent the pointers to the triple structure.

### Triples

Each instruction in triples presentation has three fields : op, arg1, and arg2.The results of respective sub-expressions are denoted by the position of expression. Triples represent similarity with DAG and syntax tree. They are equivalent to DAG while representing expressions.

Triples face the problem of code immovability while optimization, as the results are positional and changing the order or position of an expression may cause problems.

Triple Representation

• The contents of the operand1, operand2, and result fields are therefore normally the pointers to the symbol records for the names represented by these fields.
• Hence, it becomes necessary to enter temporary names into the symbol table as they are created.
• This can be avoided by using the position of the statement to refer to a temporary value.
• If this is done, then a record structure with three fields is enough to represent the three-address statements:
• The first holds the operator value, and the next two holding values for the operand1 and operand2, respectively. Such a representation is called a “triple representation”.
• The contents of the operand1 and operand2 fields are either pointers to the symbol table records, or they are pointers to records (for temporary names) within the triple representation itself.
• For example, a triple representation of the three-address code for the statement x = (a + b)* ˆ’ c/d.

Indirect Triple Representation

• Another representation uses an additional array to list the pointers to the triples in the desired order.
• This is called an indirect triple representation. For example, a triple representation of the three-address code for the statement x = ( a + b )* ˆ’ c/d

This representation is an enhancement over triples representation. It uses pointers instead of position to store results. This enables the optimizers to freely re-position the sub-expression to produce an optimized code.

Comparison

• By using quadruples, we can move a statement that computes A without requiring any changes in the statements using A, because the result field is explicit.
• However, in a triple representation, if we want to move a statement that defines a temporary value, then we must change all of the pointers in the operand1 and operand2 fields of the records in which this temporary value is used.
• Thus, quadruple representation is easier to work with when using an optimizing compiler, which entails a lot of code movement.
• Indirect triple representation presents no such problems, because a separate list of pointers to the triple structure is maintained.
• When statements are moved, this list is reordered, and no change in the triple structure is necessary; hence, the utility of indirect triples is almost the same as that of quadruples.

#### Question: -6(b) Explain Various run time storage allocation strategies.

Runtime environment manages runtime memory requirements for the following entities:

• Code: It is known as the text part of a program that does not change at runtime. Its memory requirements are known at the compile time.
• Procedures: Their text part is static but they are called in a random manner. That is why, stack storage is used to manage procedure calls and activations.
• Variables: Variables are known at the runtime only, unless they are global or constant. Heap memory allocation scheme is used for managing allocation and de-allocation of memory for variables in runtime.

STORAGE ALLOCATION STRATEGIES

The different storage allocation strategies are :

1. Static allocation – lays out storage for all data objects at compile time
2. Stack allocation – manages the run-time storage as a stack.
3. Heap allocation – allocates and deallocates storage as needed at run time from a data area known as heap.
Static Allocation

In this allocation scheme, the compilation data is bound to a fixed location in the memory and it does not change when the program executes. As the memory requirement and storage locations are known in advance, runtime support package for memory allocation and de-allocation is not required.

Stack Allocation

Procedure calls and their activations are managed by means of stack memory allocation. It works in last-in-first-out (LIFO) method and this allocation strategy is very useful for recursive procedure calls.

Heap Allocation

Variables local to a procedure are allocated and de-allocated only at runtime. Heap allocation is used to dynamically allocate memory to the variables and claim it back when the variables are no more required.

Except statically allocated memory area, both stack and heap memory can grow and shrink dynamically and unexpectedly. Therefore, they cannot be provided with a fixed amount of memory in the system.

# Symbol Table in Compiler

Symbol Table is an important data structure created and maintained by the compiler in order to keep track of semantics of variable i.e. it stores information about scope and binding information about names, information about instances of various entities such as variable and function names, classes, objects, etc.

• It is built in lexical and syntax analysis phases.
• The information is collected by the analysis phases of compiler and is used by synthesis phases of compiler to generate code.
• It is used by compiler to achieve compile time efficiency.
• It is used by various phases of compiler as follows :-
1. Lexical Analysis:Creates new table entries in the table, example like entries about token.
2. Syntax Analysis:Adds information regarding attribute type, scope, dimension, line of reference, use, etc in the table.
3. Semantic Analysis:Uses available information in the table to check for semantics i.e. to verify that expressions and assignments are semantically correct(type checking) and update it accordingly.
4. Intermediate Code generation:Refers symbol table for knowing how much and what type of run-time is allocated and table helps in adding temporary variable information.
5. Code Optimization:Uses information present in symbol table for machine dependent optimization.
6. Target Code generation:Generates code by using address information of identifier present in the table.

Symbol Table entries – Each entry in symbol table is associated with attributes that support compiler in different phases.
Items stored in Symbol table:

• Variable names and constants
• Procedure and function names
• Literal constants and strings
• Compiler generated temporaries
• Labels in surce languages

Information used by compiler from Symbol table:

• Data type and name
• Declaring procedures
• Offset in storage
• If structure or record then, pointer to structure table.
• For parameters, whether parameter passing by value or by reference
• Number and type of arguments passed to function

Operations of Symbol table – The basic operations defined on a symbol table include:

Implementation of Symbol table –
Following are commonly used data structure for implementing symbol table :-

1. List –
• In this method, an array is used to store names and associated information.
• A pointer“available” is maintained at end of all stored records and new names are added in the order as they arrive
• To search for a name we start from beginning of list till available pointer and if not found we get an error “use of undeclared name”
• While inserting a new name we must ensure that it is not already present otherwise error occurs i.e. “Multiple defined name”
• Insertion is fast O(1), but lookup is slow for large tables – O(n) on average
• Advantage is that it takes minimum amount of space.
• Searching of names is done in order pointed by link of link field.
• A pointer “First”is maintained to point to first record of symbol table.
• Insertion is fast O(1), but lookup is slow for large tables – O(n) on average
3. Hash Table–
• In hashing scheme two tables are maintained – a hash table and symbol table and is the most commonly used method to implement symbol tables..
• A hash table is an array with index range: 0 to tablesize – 1.These entries are pointer pointing to names of symbol table.
• To search for a name we use hash function that will result in any integer between 0 to tablesize – 1.
• Insertion and lookup can be made very fast – O(1).
• Advantage is wuick search is possible and disadvantage is that hashing is complicated to implement.
4. Binary Search Tree–
• Another approach to oimplement symbol table is to use binary search tree i.e. we add two link fields i.e. left and right child.
• All names are created as child of root node that always follow the property of binary search tree.
• Insertion and lookup are O(log2n) on average.

#### Question: – 7(b) Give the general  structures of activation record. Explain the purpose or each component.

Answer: – The general  structures of activation record

•  A program as a source code is merely a collection of text (code, statements etc.) and to make it alive, it requires actions to be performed on the target machine.
• A program needs memory resources to execute instructions.
• A program contains names for procedures, identifiers etc., that require mapping with the actual memory location at runtime.
• By runtime, we mean a program in execution.
• Runtime environment is a state of the target machine, which may include software libraries, environment variables, etc., to provide services to the processes running in the system.
• Runtime support system is a package, mostly generated with the executable program itself and facilitates the process communication between the process and the runtime environment.
• It takes care of memory allocation and de-allocation while the program is being executed.

Activation Records

• Recall that an activation is an execution of a subprogram. In most languages, local variables are allocated when this execution begins (activation time).
• The storage (for formals, local variables, function results etc.) needed for an activation is organized as an activation record (or frame).
• In a language with recursion, each simultaneous activation of a recursive subprogram can have different parameters, different values for local variables, return a different result.
• Each activation needs its own activation record
• The number of activation records needed isn’t known at compile time
• Allocation of activation records is dynamic, and local variables are stack dynamic (unless declared static).
• A program is a sequence of instructions combined into a number of procedures.
• Instructions in a procedure are executed sequentially.
• A procedure has a start and an end delimiter and everything inside it is called the body of the procedure.
• The procedure identifier and the sequence of finite instructions inside it make up the body of the procedure.
• The execution of a procedure is called its activation.
• An activation record contains all the necessary information required to call a procedure.
• An activation record may contain the following units (depending upon the source language used).
 Return value Parameters Local Data Temporary Data Links(optional) Status

Activate Record

• It is used to store the current record and the record is been stored in the stack.
• It contains return value .After the execution the value is been return.
• It can be called as return value. Parameter
• It specifies the number of parameters used in functions.

Local Data

• The data that is been used inside the function is called as local address Temporary Data
• It is used to store the data in temporary variables.

• It specifies the additional links that are required by the program.

Status

• It specifies the status of program that is the flag used.
 Temporaries Stores temporary and intermediate values of an expression. Local Data Stores local data of the called procedure. Machine Status Stores machine status such as Registers, Program Counter etc., before the procedure is called. Control Link Stores the address of activation record of the caller procedure. Access Link Stores the information of data which is outside the local scope. Actual Parameters Stores actual parameters, i.e., parameters which are used to send input to the called procedure. Return Value Stores return values.
• Whenever a procedure is executed, its activation record is stored on the stack, also known as control stack.
• When a procedure calls another procedure, the execution of the caller is suspended until the called procedure finishes execution.
• At this time, the activation record of the called procedure is stored on the stack.
• We assume that the program control flows in a sequential manner and when a procedure is called, its control is transferred to the called procedure.
• When a called procedure is executed, it returns the control back to the caller. This type of control flow makes it easier to represent a series of activations in the form of a tree, known as the activation tree.

To understand this concept, we take a piece of code as an example:

. . .

printf (“Press any key to continue…”);

. . .

int show_data (char *user)

{

return 0;

}

. . .

• Below is the activation tree of the code given.

• Now we understand that procedures are executed in depth-first manner, thus stack allocation is the best suitable form of storage for procedure activations.