CMSC 415, MSCS 521  Computer Architecture

 

Answers to selected problems from Homework Assignment # 1.

 

  1. Using the MIPS program (with bugs intact) below, determine the instruction format for each instruction and the decimal values of each instruction field.

 

                  addi      $v0, $zero, 0                #initialize count

      loop:     lw         $v1, 0($a0)                  #read next word from source

                  sw        $v1, 0($a1)                  #write to destination

                  addi      $a0, $a0, 4                   #advance ponter to next source

                  addi      $a1, $a1, 4                   #advance pointer to next destination

                  beq      $v1, $zero, loop           #loop if word copied != zero

  1. The above program tries to copy words from the address in register $a0 to the address in register $a1, counting the number of words copied in register $v0.  The program stops copying when it finds a word equal to 0.  You do not have to preserve the contents of registers $v1, $a0, and $a1.  The terminating word should be copied but not counted.  There are multiple bugs in this MIPS program; fix them and write a bug-free version.  Like many of the exercises in this chapter, the easiest way to write MIPS programs is to use the simulator described in Appendix A (on the CD).

 

 

                  addi      $v0, $zero, 0                #initialize count

      loop:     lw         $v1, 0($a0)                  #read next word from source

                  sw        $v1, 0($a1)                  #write to destination

                  beq      $v1, $zero, finish          #stop if word copied == zero

                  addi      $v0, $v0, 1                   #increment count

                  addi      $a0, $a0, 4                   #advance ponter to next source

                  addi      $a1, $a1, 4                   #advance pointer to next destination

                  j           loop                             #continue -- read and copy next word

      finish:    ----

                 

       

  1. Add comments to the following MIPS code and describe in one sentence what it computes.  Assume that $a0 is used for the input and initially contains n, a positive integer.  Assume that $v0 is used for the output.

 

            begin:   addi      $t0, $zero, 0                 #move 0 into t0

                        addi      $t1, $zero, 1                 #move 1 into t1

            loop:     slt         $t2, $a0, $t1                #if ($a0 < $t1) set t2 to 1

                        bne       $t2, $zero, finish           #exit if ($t2 == 1)

                        add      $t0, $t0, $t1                 #increase t0 by amount in t1

                        addi      $t1, $t1, 2                    #increase $t1 by 2

                        j           loop                             #continue

            finish:    add      $v0, $t0, $zero             #move  $t0 into v0

 

            computes (n/2)2 if n is even ((n+1)/2)2 if n is odd

 

  1. Add comments to the following MIPS code and describe in one sentence what it computes.  Assume that $a0 and $a1 are used for the input and initially contain the integers a and b respectively.  Assume that $v0 is used for the output.

 

                        add      $t0, $zero, $zero         #move 0 into t0

            loop:    beq      $a1, $zero, finish        #exit when $a1 is 0

                        add      $t0, $t0, $a0                #add a to contents of $t0

                        sub      $a1, $a1, 1                  #decrement contents of a1

                        j           loop                             #back to comparison of $a1 with 0

            finish:  addi     $t0, $t0, 100                #add 100 to $t0

                        add      $v0, $t0, $zero                        #move contents of t0 to v0

 

            computes b*a + 100

 

  1. The following code fragment processes an array and produces two important values in registers $v0 and $v1.  Assume that the array consists of 5000 words indexed 0 through 4999, and its base address is stored in $a0 and its size (5000) in $a1.  Describe in one sentence what this code does.  Specifically, what will be returned in $v0 and $v1?

 

                        add      $a1, $a1, $a1

                        add      $a1, $a1, $a1

                        add      $v0, $zero, $zero

                        add      $v0, $zero, $zero

            outer:    add      $t4, $a0, $t0

                        lw         $t4, 0($t4)

                        add      $t5, $zero, $zero

                        add      $t1, $zero, $zero

            inner:    add      $t3, $a0, $t1

                        lw         $t3, 0($t3)

                        bne       $t3, $t4, skip

                        addi      $t5, $t5, 1

            skip:     addi      $t1, $t1, 4

                        bne       $t1, $a1, inner

                        slt         $t2, $t5, $v0

                        bne       $t2, $zero, next

                        add      $v0, $t5, $zero

                        add      $v1, $t4, $zero

            next:     addi      $t0, $t0, 4

                        bne       $t0, $a1, outer

 

            finds the mode (most frequently occurring item) of an array

 

  1. Assume that the code from the previous exercise is run on a machine with a 500-Mhz clock that requires the following number of cycles for each instruction:

 

     

Instruction

Cycles

add, addi, slt

1

lw, bne

2

 

      In the worst case, how many seconds will it take to execute the code?

 

      Statements in the inner loop (between labels inner and next) are executed (worst case) 25 x 106 times. 

      These are the only statements we need to count.  There are 4 lw or bne statements -- a total of

      200 x 106 cycles, and 6 statements executed for a total of 125 x 106 cycles.

 

      time = 325 x106 cycles / 500 x 106 cycles/sec = 0.65 sec

 

  1. For the C code that is shown below, write an equivalent assembly language program for each of the four architectural styles described below.  Assume all of the variables are initially in memory.  For each code sequence, calculate the instruction bytes fetched and the memory data bytes transferred (read or written).  Which architecture is most efficient as measured by code size?  Which architecture is most efficient as measured by total memory bandwidth required (code + data)?  If the answers are not the same, why are they different?

 

a = b + c;

b = a + c;

d = a – b;

 

  1. Sometimes architectures are characterized according to the typical number of memory addresses per instruction (for arithmetic/logic instructions).  Commonly used terms are 0, 1, 2, and 3 addresses per instruction.  Associate the names below with each cagtegory.

 

      0 address à Stack

      1 address à Accumulator

      2 address à either Memory-memory or Load-store

      3 address à either memory-memory or Load-store (MIPS is an example of a 3 address Load-store)

                           for memory-memory to be 3 address would require a long instruction word

 

Comparing Instruction Sets of different styles

The architectural styles are the following:

·        Accumulator

·        Memory-memory:  All three operands of each instruction are in memory

·        Stack:  All operations occur on top of the stack.  Only push and pop access memory.  All other instructions remove their operands from the stack and replace them with the result.  The implementation uses a stack for the top two entries; accesses that use other stack positions are memory references.

·        Load-store:  All operations occur in registers, and register-to-register instructions have three operands per instruction.  There are 16 general-purpose registers, and register specifiers are 4 bits long.

 

For a given code sequence, we can calculate the instruction bytes fetched and memory data bytes transferred using the following assumptions about all four instruction sets:

·        The opcode is always 1 byte (8 bits).

·        All memory addresses are 2 bytes

·        All data operands are 4 bytes

·        All instructions are an integral number of bytes in length.

·        There are no optimizations to reduce memory traffic.

 

A register load will require four instruction bytes – one for the opcode, one for the register destination, and two for a memory address – to be fetched from memory along with four data bytes to be transferred.  A memory-memory add instruction will require seven instruction bytes – one for the opcode and six for the three memory references – to be fetched from memory and 12 data bytes being transferred ( 8 from memory to the processor and 4 from the processor back to memory).

The following table displays a summary of this information for each of the architectural styles.

 

Style

Instructions for

a = b + c

Code bytes

Data bytes

Accumulator

3

3 + 3 + 3

4 + 4 + 4

Memory-memory

1

7

12

Stack

4

3 + 3 + 1 + 3

4 + 4 + 0 + 4

Load-store

4

4 + 4 + 3 + 4

4 + 4 + 0 + 4