CPSC 2310 - DAY 22
OCTOBER 26, 2016
=============================================================================

ONE PASS STRUCTURE
------------------

Clemson University -- CPSC 231

One-pass structure
    the definition must occur before any uses (that is, uses can have
      only backward references to definitions)
    examples: macro definition before macro calls
              variable declarations before uses

Two-pass structure
    uses can have forward or backward references to definitions
    first pass collects the definitions into a table (e.g., symbol table)
    second pass accomplishes the translation by consulting the
      definition table whenever a use is encountered
    only the table is required to remain in memory; the source file can
      be read from disk twice
    example: traditional assembler

One-pass-with-fixup structure
    also known as back-patching
    both the translated code and the definition table need to be kept
      in memory

    algorithm for assembler

      label(x)      // x defined at location 10
      ...
      blt0(y)       // y used at location 15
      ...
      beq0(y)       // y used at location 20
      ...
      label(y)      // y defined at location 25

    In one-pass-with-fixup, the symbol table needs a third field (in
    addition to the symbol name and address fields). The third field
    indicates whether the symbol is defined or undefined.

    When you encounter the use of a symbol, perform a lookup in the
    symbol table:

    a) if the symbol is defined, use the address value from the entry

    b) if the symbol is not yet present, add an entry for this symbol
       and mark it as undefined; use the address field in the entry as a
       pointer to a linked list of use locations for this undefined
       symbol and add the address of the current word to the first node,
       e.g.,

         symbol   type   address
       +------+-------+-------+
       |         ...          |
       +------+-------+-------+
       |  x   |  def  |  10   |
       +------+-------+-------+
       |         ...          |           use-location node
       +------+-------+-------+          +----+------+
       |  y   | undef |   -------------->| 15 | NULL |
       +------+-------+-------+          +----+------+
       |         ...          |
       +------+-------+-------+

    c) if the symbol is present but undefined, add another node to the
       linked list with the address of the current word in that node,
       e.g.,

       |         ...          |
       +------+-------+-------+    +----+-------+    +----+------+
       |  y   | undef |   -------->| 20 |   -------->| 15 | NULL |
       +------+-------+-------+    +----+-------+    +----+------+
       |         ...          |

    Note that you can extend the use-location nodes to contain an
    indicator for what type of fixup is needed, e.g., full-word value,
    high 22 bits, low 10 bits, etc.

    When you encounter the definition of a symbol, perform a lookup in
    the symbol table:

    a) if the symbol is already defined, you have encountered a
       "multiple definition" error

    b) if the symbol is not yet present, add an entry for this symbol
       and mark it as defined; use the current value of the location
       counter as the address

    c) if the symbol is present but undefined, "fix up" or "back patch"
       all the undefined uses by traversing the linked list and storing
       the current value of the location counter into the memory words
       identified in the nodes of the list; then free the list and mark
       the symbol as defined using the current value of the location
       counter as the address, e.g.,

         symbol   type   address
       +------+-------+-------+
       |         ...          |
       +------+-------+-------+
       |  x   |  def  |  10   |
       +------+-------+-------+
       |         ...          |
       +------+-------+-------+
       |  y   |  def  |  25   |
       +------+-------+-------+
       |         ...          |
       +------+-------+-------+

    When you reach the end of the assembly language program, scan the
    symbol table for any remaining undefined entries. These are either
    errors or, absent a requirement to explicitly mark the use of
    external names, symbols that must be passed to the linker to be
    resolved.

    To visualize this, consider the actions of the assembler as it
    moves through a possible source program:

      start translation
      ...
      ... use x ...      // forward reference - insert "x" into the
                         //   symbol table as an undefined symbol and
                         //   start a linked list of use locations
      ...
      ... use x ...
                         // forward reference - add this use location
                         //   to the linked list
      ...
      ... define x ...   // traverse the linked list and fix up all
                         //   forward references with the value of the
                         //   definition, then mark "x" as a defined
                         //   symbol and store the defining value
      ...
      ... use x ...      // backward reference - resolve by table lookup
      ...
      ... define x ...   // multiply-defined symbol error!
      ...
      end translation    // check for undefined symbols

Numeric local labels in a two-pass structure
    (a relaxation of the prohibition on multiply-defined symbols)
    (adapted from the GNU description)

    Local labels help programmers use names temporarily. They are
    numeric symbols that can be redefined at various points in a
    program, but each valid use of a local label has a unique
    definition to which it refers.

    To define a local label for our chapter 1 assemblers, write a label
    with a number N

      label(N)

    To refer to the most recent previous definition of that label,
    write Nb, using the same number as when you defined the label. To
    refer to the next definition of the local label, write Nf. The "b"
    stands for "backward", and the "f" stands for "forward". Prior to
    the first definition of label N, the reference Nb is undefined;
    likewise, after the last definition of label N, the reference Nf is
    undefined.

    Local labels are useful in the bodies of macro definitions so that
    you do not need macro expansion counters to generate unique labels.

    Consider the following example code with three definitions of
    label "1":

      assembly code       location    generated code and data
      -------------       --------    -----------------------
      label(start)
          ba(1f)             0          70 2
      label(1)
          ba(1f)             2          70 4
      label(1)
          ba(x)              4          70 8
      label(1)
          ba(start)          6          70 0
      label(x)
          halt               8          0
      end(1b)                9          6

    Note that in the first branch the target address "1f" is resolved
    to 2, but in the second branch the target address "1f" is resolved
    to 4.

    For the above assembly code, the first pass of a two-pass assembler
    builds a symbol table with a local/global flag and a linked list of
    definitions for each local entry:

      +-------+--------+---+   +-----+--+   +---+--+   +---+--+   +---+--+   +-----+--+
      |   1   | local  | *--->|undef| *--->| 2 | *--->| 4 | *--->| 6 | *--->|undef| *--.
      +-------+--------+---+   +-----+--+   +---+--+   +---+--+   +---+--+   +-----+--+  |
      | start | global | 0 |                                                             =
      +-------+--------+---+
      |   x   | global | 8 |
      +-------+--------+---+

    The three definitions of the local label "1" in the example code
    above are recorded in the linked list in the symbol table with the
    locations 2, 4, and 6, respectively, along with header and trailer
    nodes that specify that the location is undefined.

    During the second pass, the assembler removes the head node from a
    local label's list whenever a redefinition of that local label is
    encountered. Thus, because there are three definitions of "1" in
    the above code, by the end of the second pass only two nodes (the
    location-6 node and the undefined-trailer node) remain in the
    linked list for "1".
    With the removal of the head node when encountering a
    (re-)definition, the rules for resolving a local label "1b" or "1f"
    during the second pass are:

      "1b" - use the value from the first node in the linked list for "1"
      "1f" - use the value from the second node in the linked list for "1"

    To visualize this, consider how the linked list would be modified
    as the assembler moves through the example source code on the
    second pass:

                              // at beginning of second pass, "1" has
                              //   list (undef,2,4,6,undef)
      0: label(start)
             ba(1f)    70 2   // "1f" resolves to 2
      2: label(1)             // redefine "1", so "1" now
                              //   has list (2,4,6,undef)
             ba(1f)    70 4   // "1f" resolves to 4
      4: label(1)             // redefine "1", so "1" now
                              //   has list (4,6,undef)
             ba(x)     70 8
      6: label(1)             // redefine "1", so "1" now
                              //   has list (6,undef)
             ba(start) 70 0
      8: label(x)
             halt      0
      9: end(1b)       6      // "1b" resolves to 6

    *** An alternative to using a two-pass structure is one pass with
        fixup ***

Language translation approaches - compilation vs. interpretation

* compilation diagram

                    .---------.                  .------------------.
  step 1: compile   | program | --> compiler --> | compiled program |
                    `---------'                  `------------------'
                                                          .
                                                          .
                    .-------.                             v          .--------.
  step 2: run       | input | -------------> compiled program -----> | output |
                    `-------'                                        `--------'

  * compilation is translation from one language to another, where the
    translated form is typically easier to execute; a pure compiler
    produces language that will be directly executed by hardware

  * compilation allows one translation and then multiple executions of
    the executable file (sometimes called a binary file, or load
    module); thus a fairly large amount of time can be spent by the
    compiler doing analysis and optimization once, in order to produce
    an executable that runs quickly each time it is run

  * a compiled program typically runs fast but is harder to debug

  * compiler example: gcc

* interpretation diagram

                .---------.
  single step   | program | ---.
                `---------'    v                 .--------.
                              interpreter -----> | output |
                .-------.      ^                 `--------'
                | input | -----'
                `-------'

  * interpretation skips the intermediate step of producing a form of
    the program in another language and combines translation and
    execution

  * interpretation starts from the source code each time you want to
    run the program; it performs the same analysis as a compiler but on
    a source-line-by-source-line basis; a pure interpreter keeps no
    results from this analysis even when encountering the same source
    line repeatedly within the body of a loop (this means an
    interpreted program will run faster if you make all the variable
    and function names only one or two characters in length and remove
    all the comments -- but I don't recommend doing this!)

  * an interpreted program typically runs slowly but is easier to debug
    because of better run-time error diagnostics

  * interpreted languages easily support dynamic typing and dynamic
    scoping of variables

  * interpreter examples: shells, m4 or python on the command line;
    also, formatted I/O (e.g., printf) relies on interpretation

* hybrid approach diagram

             .---------.                  .-----------.
  step 1:    | program | --> compiler --> | byte code |
             `---------'                  `-----------'
                               . . . . . . . . .
                                               v
             .-----------.
  step 2:    | byte code | ---.
             `-----------'    v            .--------.
                             JVM --------> | output |
             .-------.        ^            `--------'
             | input | -------'
             `-------'

  * Java compiler and JVM interpreter - a hybrid translation model
    - "javac" produces byte code, which is easy to interpret
    - "java" interprets byte code

  * provides for portability of byte code files across numerous systems

  * Perl also has a hybrid translation model

  * other hybrid translation models include just-in-time (JIT)
    compilers, which compile functions/procedures at run time, on the
    first call

  * terminology - source code that needs to be compiled is typically
    called a "program" while source code that is interpreted may be
    called a "script" (but may also be called a "program")

major translators in compilation model
  1. language preprocessor - textual substitution and conditional
     compilation (direct execution of special statements)
  2. compiler - lexical analysis, parsing, code generation, optimization
  3. macro processor - textual substitution and conditional assembly
  4. assembler - translate symbols into addresses and machine code
  5. linker - external symbol resolution plus relocation, produces
     executable
  6. loader - relocation according to load address, produces memory
     image

  (note that many compilers generate object code directly - without
  calling a separate assembler)

compile steps
-------------

  language
  preprocessor       compiler       assembler       linker
     (cpp)            (ccom)          (as)           (ld)

  .------.          .--------.      .--------.      .------.      .-------------.
  |source|--------->|expanded|----->|assembly|----->|object|----->| executable  |
  | code |          | source |      |language|      | code |      |(load module)|
  | (.c) |          |  code  |      |  (.s)  |      | (.o) |      |   (a.out)   |
  `------'          `--------'      | (.asm) |      |(.obj)|      |   (.exe)    |
                                    `--------'      `------'      `-------------'
   macro expansion     compile  ^     assembly        link  ^
   and conditional     time     |     time            time  |
   compilation                  |                           |
                                |                      .-------.
              macro processor   |                      |library|
                   (m4)         |                      |routine|
                                |                      `-------'
   .---------------.            |                          |
   |assembly source|------------'           static linking-'
   |w/ macros (.m) |
   `---------------'
    macro expansion and
    conditional assembly

load and run steps
------------------

  command interpreter
        (shell)              loader        fetch/decode/execute on CPU

  .-------------.        ..................................
  | executable  |------->. memory   (... machine lang. ...).
  |(load module)|        . image    (...insts. and data...).
  |   (a.out)   |        ..................................
  |   (.exe)    |          ^            ^             ^
  `-------------'          .            |             |
   search for              .            | load-time   | run-time linking
   file name               .            | linking     | (most systems)
                           .            | (early      |
                           .            | Windows)    |
     .---------------.     .            |        .................
     | library files |-----'------------'        . shared object .
     |(Microsoft DLL)|                           .     (.so)     .
     `---------------'                           .................
                                                  dynamic linking
translators

- language preprocessor, e.g., for C
  * special syntax for preprocessor statements, e.g., #include
  * macro facility, #define - trivially used for constant substitution
  * conditional compilation, #ifdef - used for versioning

        #ifdef VERBOSE
            printf( "value of a is %d\n", a );
        #endif

    where "#define VERBOSE" is included in the program source or where
    you compile with "gcc -DVERBOSE"

- compiler
  * lexical analysis - extracting lexical items ("tokens") from the
    input
  * syntactic analysis - parsing statements according to the grammar
    rules of the language, generates a parse tree
  * semantic analysis - determining the meaning of operations according
    to the datatypes of the variables in the parse tree, may involve
    adding conversion operators to the parse tree
  * intermediate code generation
  * machine-independent optimizations, e.g., loop transformations
  * machine-specific code generation and register allocation
  * machine-dependent optimizations, e.g., branch delay slot scheduling

  consider the statement

      a = b + 2*c;

  in the following code

      float a,b;
      extern float c;
      ...
      a = b + 2*c;
      ...
  lexical analysis extracts eight tokens and assigns symbolic
  identifiers to entries in the symbol table

      `a'  `='  `b'  `+'  `2'  `*'  `c'  `;'

      symtab[0]  `='  symtab[1]  `+'  `2'  `*'  symtab[2]  `;'

  syntactic analysis builds a parse tree

                =
              /   \
      symtab[0]    +
                 /   \
         symtab[1]    *
                    /   \
                  `2'    symtab[2]

  semantic analysis determines meaning

                      =:float
                     /       \
      symtab[0]:float         +:float
                             /       \
             symtab[1]:float          *:float
                                     /       \
                     convert_to_float         symtab[2]:float
                            |
                           `2'

  intermediate code generation yields something like

      convert_to_float( 2 , temp_float_0 )
      multiply_float( temp_float_0 , symtab[2] , temp_float_1 )
      add_float( symtab[1] , temp_float_1 , temp_float_2 )
      store_float( temp_float_2 , symtab[0] )

  machine-independent optimization goes ahead and either does the
  conversion at compile time or strength reduces the multiply by 2 to
  an add

      add_float( symtab[2] , symtab[2] , temp_float_1 )
      add_float( symtab[1] , temp_float_1 , temp_float_2 )
      store_float( temp_float_2 , symtab[0] )

  from this, registers would be assigned and ARM code would be
  generated (including storage allocation and addressing for variables)

- macro processor
  * simple abstraction through textual substitution ("open"
    subroutines)
  * provides either keyword or positional parameter substitution
  * extends the instruction set by synthesizing instructions using
    macro definitions (similar to what is already done by the SPARC
    assembler, cf. "set")
  * cost occurs at assembly time of expanding the macro definition, not
    at run time of procedure call, register save/restore, and procedure
    return
  * conditional assembly is the same idea as the #ifdef facility of the
    C preprocessor

  comparison with run-time functions

                  macro                     function
                  -----                     --------
    invocation    in-line substitution      run-time call and return
    parameters    untyped                   typed
                  evaluated at each         evaluated once at time
                    appearance                of call
    trade-offs    fast but one copy of      more overhead per call but
                    code at each call site    only one copy of code

- assembler
  * translates a program written in assembly language to binary machine
    code, resolving local symbolic addresses; typically this is a
    1-to-1 translation
  * forward references generally require 2-pass assemblers
      pass 1: find symbolic labels and assign them addresses
                run location counter (virtual instruction pointer)
                determine instruction size
              record addresses in symbol table
      pass 2: use symbol table information to construct instructions
                symbolic -> binary
  * alternative to the 2-pass approach is 1-pass with fixup (i.e.,
    backpatching)
  * other assembler facilities include data layout directives
    (pseudo-ops)

- linker
  * separate assembly or compilation means the assembler does not know
    all the addresses, thus the assembler produces only
    partially-resolved object files
  * linker combines separate object files into a single executable
    - layout pieces of code & data (storage allocation based on sizes)
    - resolve external references
    - perform relocation of absolute addresses
  * two pass:
      1: assign code and data to memory addresses and build symbol
         table from public symbols
      2: use table to resolve external addresses and produce load
         module
  * object module file format (this is early UNIX; ELF is more complex)
    - header (includes sizes of text, data, and bss sections)
    - text section (read only)
    - data section (read/write)
    - relocation/external symbol entries for text section
    - relocation/external symbol entries for data section
    - symbol table
    - string table
      (symbol table entries index into string table)

- command interpreter (shell)
  - a program that reads command lines from the keyboard (or from a
    script file) and either directly executes the command or searches
    for an executable file having that command name and then loads and
    branches to that loaded program

- loader
  * bring a program into memory in preparation for execution
      * read file header to find size of pieces
      * allocate memory area(s)
      * read instructions and data from file into memory
      * relocation - adjusting absolute addresses relative to load
        point
      * jump to startup code

binding times

  The assembler, linker, and loader are all programs taking input files
  and producing output. Decisions and translations made by these
  programs are said to be done at "assembly time", at "link time", and
  at "load time", respectively. Actual execution (i.e., instruction
  interpretation by the hardware such as performing adds, branches,
  etc.) takes place at "run time". During execution, you can also talk
  of things happening at specific times, such as register saving at
  procedure call time.

  Dynamic linking is an example of a late decision, or "late binding".
  It is the linking of separate procedures at either load time or run
  time, and it typically requires that the normal (static) linker
  include a simple table that names the needed routines (for load-time
  linking) or include simple "stub" routines that find and link to the
  shared library routines on their first calls (for run-time linking).

  Another form of delayed binding is "just-in-time" (JIT) compilation.
  This is used in several Java compilers, where methods are not
  compiled until the first call.

  Many storage allocation decisions are made at each step. For example,
  offsets are assigned to labels at assembly time, under the assumption
  that any absolute addresses will be updated by the linker and loader
  later.
  (When we later study virtual memory, we will see that it is also an
  example of late binding - specifically one where physical memory
  allocation decisions that might be made by a traditional loader are
  instead deferred to run time and made by the operating system.)

other programming tools / components of a program development environment

  editors               (e.g., vi, emacs)
  beautifiers           (e.g., indent)
  project control       (e.g., make)
  version control       (e.g., sccs)
  GUI toolkit           (e.g., widget library)
  test coverage         (e.g., gcov)
  debuggers             (e.g., gdb, dbx, ddd)
  debugging tools       (e.g., Purify)
                          reading or writing beyond the bounds of an array
                          reading or writing freed memory
                          freeing memory multiple times
                          reading uninitialized memory
                          reading or writing through null pointers
                          overflowing the stack by recursive function calls
                          reading or writing memory addresses on which a
                            watch-point has been set
  portability advisors  (e.g., lint)
  style checkers        (e.g., CodeCheck)
                          exceeding a given input line length
                          exceeding a given nesting depth of if-else stmts.
                          not aligning open and close curly braces (Horstmann)
  performance profilers (e.g., gprof)