[Back to FINDREPL SWAG index]  [Back to Main SWAG index]  [Original]

   SYMBOL TABLE

   All Compilers and interpreters must maintain a data structure
   called the SYMBOL TABLE. This is where all the inFormation about
   the Programs symbols are kept. Maintaining a well-organized
   symbol table is a skill all Compiler Writers must master.

   As a Compiler parses a source Program, it relies on the symbol
   table to provide inFormation about each identifier (such as
   Variables and Constants) - it must be able to access and update
   inFormation about each identifier and do so quickly - otherwise
   the process is slowed or produces incorrect results.

   No matter what inFormation is kept, or how the table is organized
   certain operations are fundamental to a symbol tables operation.

   You ENTER inFormation about about an identifier into the table by
   *creating* and entry.

   You SEARCH the table to look up an identifier's entry and make
   available the inFormation in that entry.

   You UPDATE the entry to modify stored inFormation.

   There can be only one entry per identifier in the symbol table,
   so you must first search the table beFore making a new entry.

   TABLE ORGANIZATION

   There are many different ways to handle symbol tables: Arrays,
   linked lists, hash tables...but since the most common operations
   perFormed on a symbol table are searching it For existing entries
   it makes perfect sense to implement it as a BINARY TREE.

   Each NODE in the TREE contains and entry, and points to two other
   nodes. The *values* of the nodes on the subtree to the left are
   always LESS than the parent node, While the subtree to the right
   is always MORE than the parent. This makes searching sorted
   binary trees very efficient.

   Inserting new nodes is as easy as searching the tree: if the
   value you want to insert is LESS than the current node, search
   the node to the left. If it is MORE, search the tree to the right.
   Keep doing this recursively Until an empty node is found, then
   insert the value into that node.

   NITTY-GRITTY

   Now that we've covered some background on the table, here's a
   recap on the symbol table Type defs. For those that missed them
   in the first message, or didn't save them:

Type
   sptr = ^String; { useful For minimum-size allocation }

   DEFN_KEY = (UNDEFINED,
               Const_DEFN, Type_DEFN, Var_DEFN, FIELD_DEFN,
               VALPARM_DEFN, VarPARM_DEFN,
               PROG_DEFN, PROC_DEFN, FUNC_DEFN
              );

   ROUTINE_KEY = (rkDECLARED, rkForWARD,
                  rkREAD, rkREADLN, rkWrite, rkWriteLN,
                  rkABS, rkARCTAN, rkCHR, rkCOS, rkEOF, rkEOLN,
                  rkEXP, rkLN, rkODD, rkORD, rkPRED, rkROUND,
                  rkSIN, rkSQR, rkSQRT, rkSUCC, rkTRUNC
                 );

   RTN_BLOCK = Record               {info about routine declarations}
      key              :ROUTINE_KEY;
      parm_count,
      total_parm_size,
      total_local_size :Word;
      parms, locals,
      local_symtab     :SYMTAB_PTR; {symbol tables of routine}
      code_segment     :sptr;       {interpreter}
   end;

   DTA_BLOCK = Record
      offset     :Word;
      Record_idp :SYMTAB_PTR;
   end;

   INFO_REC = Record
      Case Byte of
        0:(Constant :VALUE);     { literal value }
        1:(routine  :RTN_BLOCK); { identifier is routine }
        2:(data     :DTA_BLOCK); { identifier is data }
   end;

   DEFN_REC = Record
      key  :DEFN_KEY; { what is identifier? }
      info :INFO_REC; { stuff about identifier }
   end;

   SYMTAB_PTR  = ^SYMTAB_NODE;
   SYMTAB_NODE = Record          {actual tree node}
      left, right   :SYMTAB_PTR; {Pointers to left and right subtrees}
      next          :SYMTAB_PTR; {For chaining a node}
      name          :sptr;       {identifier name String}
      level,                     {nesting level}
      co_index      :Integer;    {code Label index}
      defn          :DEFN_REC;   {definition info}
   end; { Record }

   EXCERCISE #1

   Implement a symbol table SEARCH routine, and a symbol table ENTER
   routine. Both routines must accept a Pointer to the root of the
   tree, and the name of the identifier you are working With, and
   must return a Pointer to the node that was found in the search
   routine, or enters in the enter routine. If no node was found, or
   entered, the routines must return NIL.

   The resulting symbol table should be a sorted tree.



³   Implement a symbol table SEARCH routine, and a symbol table ENTER
³   routine. Both routines must accept a Pointer to the root of the
³   tree, and the name of the identifier you are working with, and
³   must return a Pointer to the node that was found in the search
³   routine, or enters in the enter routine. If no node was found, or
³   entered, the routines must return NIL.
³   The resulting symbol table should be a sorted tree.



Function Enter(root: SymTab_Ptr; PidStr: spstr): SymTab_Ptr;
{ - inserts a new indetifier String PidStr in the symol table. }
{ - nil is returned if duplicate identifier is found.          }
Var
  Ptemp: SymTab_Ptr;
begin
  if (root <> nil) then    { not a terminal node }
    if (PidStr = root^.name) then
      begin
        Enter := nil;
        Exit
      end
    else    { recursive insertion calls to either left or right sub-tree }
      if (PidStr > root^.name) then
        Enter(root^.right, PidStr)
      else
        Enter(root^.left, PidStr)
  else { a terminal node }
    begin
      new(Ptemp);     { create a new tree leaf node }
      Ptemp^.name := PidStr;
      Ptemp^.left := nil;
      Ptemp^.right := nil
    end
end; { Enter }


Function Search(root: SymTab_Ptr; PidStr: spstr): SymTab_Ptr;
{ - search For a certain identifier String PidStr in the symbol table. }
{ - returns nil if search faild.                                       }
begin
  While (root <> nil) and (PidStr <> root^.name) do
    if (PidStr > root^.name) then     { search the right sub-tree }
      root := root^.right
    else
      if (PidStr < root^.name) then
        root := root^.left;           { search the left sub-tree  }
   Search := root                     { return the node           }
end;

{===========================================================================}

Comment:
     What made you choose BINARY trees over AVL trees?  With binary trees,
     the structure may become degenerate (unbalanced) and, the routines for
     searching and insertion becomes inefficient.

>Comment:
>     What made you choose BINARY trees over AVL trees?  With binary trees,
>     the structure may become degenerate (unbalanced) and, the routines for
>     searching and insertion becomes inefficient.

   Glad you could join us!

   I chose a binary tree because it's simple and easy to Write, also
   a degenerate tree isn't much of a concern, simply because it's
   intended to hold only identifiers and Constants, not every
   statement. :)

   As long as it sorts the data as it inserts, it will work. This
   isn't, after all, a graduate "course". The intention is to teach
   people how compilers work and show interested parties how to
   understand and Write their own, if they're interested. This is
   YOUR compiler you're writing, if you want to implement an AVL
   tree, go ahead!

>Function Search(root: SymTab_Ptr; PidStr: spstr): SymTab_Ptr;

   This works. It's efficient and does the job.

>Function Enter(root: SymTab_Ptr; PidStr: spstr): SymTab_Ptr;

>    else    { recursive insertion calls to either left or right sub-tree }
>      if (PidStr > root^.name) then
>        Enter(root^.right, PidStr)
>      else
>        Enter(root^.left, PidStr)

   Note: recursive calls shouldn't be necessary in this Function.
   You can search the table the same way you did With Search, and
   you don't run the risk of running out of stack space. Procedure
   calls can also be exensive, slowing down the Program too much
   especially if a lot of symbols are searched.

>  else { a terminal node }
>    begin
>      new(Ptemp);     { create a new tree leaf node }
>      Ptemp^.name := PidStr;
>      Ptemp^.left := nil;
>      Ptemp^.right := nil
>    end
>end; { Enter }

   Please note that there is a lot of data that will be have to
   added to this section over time, as an identifier could be
   ANYTHING from a ConstANT to a Program identifier.

   That isn't too important right now, as we're just getting started
   on the symbol table but suggest you add the following lines, for
   use later:

   Ptemp^.info     := NIL;
   Ptemp^.defn.key := UNDEFINED;
   Ptemp^.level    := 0;     {recursion level}
   Ptemp^.Label_index := 0;  {Label # to be used in code output}

[Back to FINDREPL SWAG index]  [Back to Main SWAG index]  [Original]