CMSC 435 Algorithm Analysis and Design

 

For each of the data formats: random, reverse ordered, highly degenerate, and nearly sorted, run sorttest for all combinations of sorting algorithms and data sizes and complete each of the following tables.  When you have completed the tables, analyze your data and determine the asymptotic behavior of each of the sorting algorithms for each of the data types.

 

Random data              Tabulate number of element comparisons and swaps for each run

 

                                                                        Data Size

Sorting Algorithm

16

64

256

1024

4096

Selection Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Insertion Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Bubble Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Merge Sort

       compares

 

 

 

 

 

       copies

 

 

 

 

 

 

Quicksort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Quicksort Altern.

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

 


Reverse Ordered Data:  Tabulate number of element comparisons and swaps for each run

 

                                                                        Data Size

Sorting Algorithm

16

64

256

1024

4096

Selection Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Insertion Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Bubble Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Merge Sort

       compares

 

 

 

 

 

       copies

 

 

 

 

 

 

Quicksort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Quicksort Altern.

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Almost Sorted Data: Tabulate number of element comparisons and swaps for each run

 

                                                                        Data Size

Sorting Algorithm

16

64

256

1024

4096

Selection Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Insertion Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Bubble Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Merge Sort

       compares

 

 

 

 

 

       copies

 

 

 

 

 

 

Quicksort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Quicksort Altern.

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Highly Repetetive Data: Tabulate number of element comparisons and swaps for each run

 

                                                                        Data Size

Sorting Algorithm

16

64

256

1024

4096

Selection Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Insertion Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Bubble Sort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Merge Sort

       compares

 

 

 

 

 

       copies

 

 

 

 

 

 

Quicksort

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

Quicksort Altern.

       compares

 

 

 

 

 

       swaps

 

 

 

 

 

 

 

Part 2:   Interpretation of the Data

 

  1. What do you observe about selection sort? ______________________________________

 

  1. Explain briefly why this is the case____________________________________________

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

  1. For what distribution(s) does selection sort produce the fewest number of swaps?

 

                _____________________________________________

 

  1. In what way is the first quicksort algorithm more efficient than the alternate quicksort algorithm?

 

                _______________________________________________________________________

 

 

  1. What is the difference in the two algorithms that produces this result?

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

  1. For what distribution does quicksort (either version) perform badly? _____________________

 

  1. What is your explanation for the previous observation?

 

                ________________________________________________________________________

 

                ________________________________________________________________________

 

                ________________________________________________________________________

 

  1. What algorithm would you choose for highly repetetive data?___________________________ Explain your choice

 

                ________________________________________________________________________

 

                ________________________________________________________________________

 

  1. If you were sorting only small data sets with 64 or fewer elements, which sorting algorithm would you choose and why?

 

                choice:   ___________________Explanation:  ___________________________________

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

  1. If you had to choose one algorithm to sort a suite of data sets of various sample sizes, ranging from less than 16 to more than 10,000, and consisting of about 50% with randomly distributed keys, 25% highly degenerate, 20% almost sorted, and 5% reverse ordered, which one would you choose?  Briefly explain why.

 

                choice:   ___________________Explanation:  ___________________________________

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

                _______________________________________________________________________

 

 

 

 

 

 

 

 

 

 

                Consider the following code fragment             Consider a RISC (MIPS) implementation

 

                int j = 0;                                                                 Assign the registers as follows:

                while (j < A.length) {                                                           $s0 ß j,  $s1 ß i,   $s2ß A.length

                    i = j + 1;                                                                              $t0 ß address(A[0]),

                    while (i < A.length) {                                                       $t1 ß temp,   $t2 ß 4*j  and addr(A[j])

                        if(A[i] < A[j])                                                                $t3 ß 4 * i  and addr(A[i]),  $t4ß scratch

                            //swap                                                                        $t6 ß A[j],  $t5 ß A[i]

                            int temp = A[i];                                         Assume  $t0  and $s2  are already loaded, $0 ß 0

                            A[i] = A[j];

                            A[j] = temp;                                                               add   $s0, $0, $0     #set $s0 ß 0

                        }                                                                      outer:      slt     $t4, $s0, Ss2 # $t4ß1 if j<A.length

                        i++;                                                                 beq   $t4, $0, exit    #quit if $t4 != 1

                    }                                                                                          sll     $t2, $s0, 2       #$t2ß 4 * j

                    j++;                                                                                     add    $t2, $t0, $t2    #$t2ß address(A[j])

                }                                                                                              lw     $t6, ($t2)0       #$t6 ß A[j]

                                                                                                inner:      addi  $s1, $s0, 1      #i = j + 1

                                                                                                                slt     $t4, $s1, $s2  #$t4ß1 if i<A.length

                                                                                                                beq   $t4, $0, exit2  #if $t4ß1, quit inner          

                                                                                                                #begin the compare of A[i] < A[j] here

                                                                                                                sll     $t3, $s1, 2       #$t3ß 4 * i

                                                                                                                add    $t3, $t1, $t3    #$t3ß address(A[i])

                                                                                                                lw     $t5, ($t3)0       #$t5 ß A[i]

                                                                                                                slt     $t4, $t5, $t6    #$t4ß1 if A[i]<A[j]

                                                                                                                bne   $t4, $0, inc         #else no swap

                                                                                                                #end compare begin swap

                                                                                                                add    $t1, $t5, $0      #$t1 ß A[i]

                                                                                                                add    $t5, $t6, $0      #$t5 ß A[j]

                                                                                                                add    $t6, $t1, $0      #$t6 ß A[i]

                                                                                                                sw     $t5, ($t3)0        #store A[i]

                                                                                                                sw     $t6, ($t2)0        #store A[j]

                                                                                                inc:          addi   $s1,  $s1, 1      #i = i + 1

                                                                                                                j         inner                 #continue inner

                                                                                                exit2:       addi   $s0, $s0, 1       #j = j + 1

                                                                                                                j         outer                 #continue outer

                                                                                                exit:         end of fragment                                                                                   

 

                How many assembler instructions are needed to perform a swap? _________________

 

                How many assembler instructions are needed to perform a comparison of two elements?

 

                _____________

 

                How many assembler instructions are needed to perform a comparison of the two variables

 

                i and A.length? ______________

 

Part 3.  Determine the best fitting curve for random data

 

In the tables below, let ni = the sample size for i = 1..5  {n1 = 16, n2 = 64, …, n5 = 4096} and let ci equal the number of element compares for each of the 5 sample sizes.  Complete the tables provided below using the data you collected in the random data table and use the data to determine whether a curve          y = cn2 or y = cnlg(n) reasonably fits the data, where c is some constant and y = f(n) is the number of compares as a function of n.  If Dci/Dni is approximately 1 for each of the table entries (especially the last three column entries), then you may conclude that the data reasonably fits a curve y = cn2 and you may leave the last two rows blank.  If this is not the case, fill in the last two rows of the table and determine if Dci/Dmi is approximately 1 for each of the four columns, and , if so, conclude that y = cn lg(n) fits the data reasonably well.  When you have finished you may compare your results to the results displayed by the RegressionApplet where the number of compares for each sample size is an average taken over five independent trials.

 

Selection sort

n1 = 16

n2 = 64

n3 = 256

n4 = 1024

n5 = 4096

compares

ci

 

 

 

 

 

 

Dci = ci/ci-1

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

Dni = ni2/ni-12

 

xxxxxxx

xxxxxxx

xxxxxxx

 

16

 

16

 

16

 

16

 

Dci/Dni

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

ni lg(ni)

 

 

64

 

384

 

2048

 

10290

 

49152

 

Dmi = nilg(ni)

         ni-1lg(ni-1)

xxxxxxx

xxxxxxx

xxxxxxx

 

6

 

5.33

 

5.02

 

4.76

 

Dci/Dmi

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

                Best fitting curve: ________________________________________

 

 

Insertion sort

n1 = 16

n2 = 64

n3 = 256

n4 = 1024

n5 = 4096

compares

ci

 

 

 

 

 

 

Dci = ci/ci-1

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

Dni = ni2/ni-12

 

xxxxxxx

xxxxxxx

xxxxxxx

 

16

 

16

 

16

 

16

 

Dci/Dni

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

ni lg(ni)

 

 

64

 

384

 

2048

 

10290

 

49152

 

Dmi = nilg(ni)

         ni-1lg(ni-1)

xxxxxxx

xxxxxxx

xxxxxxx

 

6

 

5.33

 

5.02

 

4.76

 

Dci/Dmi

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

                Best fitting curve: ________________________________________

 

 

Bubble sort

n1 = 16

n2 = 64

n3 = 256

n4 = 1024

n5 = 4096

compares

ci

 

 

 

 

 

 

Dci = ci/ci-1

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

Dni = ni2/ni-12

 

xxxxxxx

xxxxxxx

xxxxxxx

 

16

 

16

 

16

 

16

 

Dci/Dni

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

ni lg(ni)

 

 

64

 

384

 

2048

 

10290

 

49152

 

Dmi = nilg(ni)

         ni-1lg(ni-1)

xxxxxxx

xxxxxxx

xxxxxxx

 

6

 

5.33

 

5.02

 

4.76

 

Dci/Dmi

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

                Best fitting curve: ________________________________________

 

 

Merge sort

n1 = 16

n2 = 64

n3 = 256

n4 = 1024

n5 = 4096

compares

ci

 

 

 

 

 

 

Dci = ci/ci-1

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

Dni = ni2/ni-12

 

xxxxxxx

xxxxxxx

xxxxxxx

 

16

 

16

 

16

 

16

 

Dci/Dni

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

ni lg(ni)

 

 

64

 

384

 

2048

 

10290

 

49152

 

Dmi = nilg(ni)

         ni-1lg(ni-1)

xxxxxxx

xxxxxxx

xxxxxxx

 

6

 

5.33

 

5.02

 

4.76

 

Dci/Dmi

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

 

                Best fitting curve: ________________________________________

 

 

 

Quicksort

n1 = 16

n2 = 64

n3 = 256

n4 = 1024

n5 = 4096

compares

ci

 

 

 

 

 

 

Dci = ci/ci-1

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

Dni = ni2/ni-12

 

xxxxxxx

xxxxxxx

xxxxxxx

 

16

 

16

 

16

 

16

 

Dci/Dni

 

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

ni lg(ni)

 

 

64

 

384

 

2048

 

10290

 

49152

 

Dmi = nilg(ni)

         ni-1lg(ni-1)

xxxxxxx

xxxxxxx

xxxxxxx

 

6

 

5.33

 

5.02

 

4.76

 

Dci/Dmi

xxxxxxx

xxxxxxx

xxxxxxx

 

 

 

 

 

               

                Best fitting curve: ________________________________________