CMSC 435  Algorithm Analysis & Design

 

Quicksort

 

1.      Design Strategy

 

When designing an iterative algorithm such as a sorting algorithm, it is a good practice to state an overall strategy and

then draw a picture that depicts the consequences of that strategy at some intermediate stage in the execution of the

algorithm.  The picture should indicate the portion of the job that has already been done, the boundary between the

completed portion of the job and the part yet to be done with the position of the index or label for the boundary clearly

established, and a list of all of the clauses that state the properties of the completed portion of the job.  Next, state the

starting conditions and the stopping conditions, then use this information to write an algorithm that maintains the the listed

properties of the completed portion of the job invariant before and after every iteration of the job.

 

2.      Example 1 -- Selection Sort  (Order a List in ascending order of some key)

 

Strategy -- Find the smallest key in the unordered portion of the list and place it at the front of the unordered portion, then move the boundary between the ordered and unordered portion of the list to the right to add this element to the sorted portion of the list.

 

Picture --

Invariants --

      1)  The elements in the front segment are sorted in ascending order.

                  For all j, k :  low £ h £ k < i : A[h] £ A[k]

      2)  All elements in the front segment are less than or equal to any element in the back (unsorted) segment.

                  For all h, k : low £ h < i, and i £ k < high : A[h] £ A[k]

      3)  All elements in the middle segment are greater than or equal to A[i]

 

Starting Conditions --

      Front segment is empty and the back segment is the whole list.   (i = low)

Stopping Conditions --

      Front segment (sorted segment) is the whole list  (i = high)

      Continue the iterations while i < high -1  {since remaining item in the back segment is ³ any item in the front seg.}

 

Inner Loop Starting and Stopping Conditions --

      start -- middle segment consists of A[i] only,  j =  i + 1

      stop -- middle segment is the whole back part of the list,  j = high

 

Algorithm --

     

      void SelectionSort(Comparable  A[], int low, int high) {

           for (int i = low; i < high - 1; i++)  {

                int j = i +1;

                while (j < high)  {

                     if (A[j] < A[i])

                          swap(A[j], A[i])

                     j++;

                }

           }

      }

 

3.      Quicksort

 

Strategy -- If the size of the list is greater than 1, partition the list into two segments, with an item called a pivot separating the two parts, such that every element in the front segment is less than or equal to the pivot value, and every element in the back segment is greater than the pivot.  Quicksort the two segments.

 

Partitioning the list is an iterative process within this recursive algorithm.  We will first write the recursive algorithm, then consider the Partition function separately.

 

Recursive code:

 

      void Quicksort (Comparable A[ ], int low, int high) {

           if ((high - low) > 1)  {

                  Partition( A, low, high, pivot);

                  Quicksort(A, low, pivot);

                  Quicksort(A, pivot + 1, high);

           }

      }

 

4.      Partitioning the list

 

Strategy -- Randomly select a pivot and place it at the front of the list.  The front segment contains only items that are less than or equal to the pivot, the back segment, only items greater than the pivot, and the middle segment is the unfinished or untested portion of the list.  Move the boundary between the front segment and the middle segment until a value greater than the pivot is encountered and stop.  Move the boundary between the back segment and the middle segment until a value less than or equal to the pivot is encountered and stop.  Swap these two values and continue moving the boundaries until the stopping condition is reached.

 

Picture --

Invariants --

      1)  All elements in the front segment are less than or equal to the pivot.

                  for all k : low £ k < i :  A[k] £ A[low]

      2)  All elements in the back segment are greater than the pivot

                  for all k : j < k < high : A[k] > A[low]

 

Starting Conditions --

      Front segment contains only the pivot                (i = low + 1)

      Back segment empty                                         (j = high - 1)

 

Stopping Conditions --

      The middle segment is empty (whole list partitioned)  (j < i)

      Continue while i £ j

      Note!  index j will coincide with the last item in the front segment when the two boundaries meet.

 

Code --

      void Partition (Comparable A[], int low, int high, int & pivot)  {

            int p = random(high-low) + low;     //select a random index in the range low £ p < high

            swap (A[low], A[p]);                      //put value selected at fron of the list

            int i = low +1, j = high -1;

            while (i < = j)  {

                  while (A[i] <= A[low])

                       i++;

                  while (A[j] > A[low])

                       j--;

                  if (i < j)

                       swap(A[i], A[j]);

           }

           swap (A[j], A[low]);                        //put pivot at the back of the front segment

           pivot = j;                                         //return index of the pivot

      }

 

5.      Analysis of Quicksort

 

Let T(n) = the average number of compares in sorting a list of size n.

 

Then if the pivot is located at index i, the two resulting segments will be of length i and n - i -1 respectively, and

the number of compares to form the two segments is n -1 (every element is compared with the pivot).  Thus

 

      T(n) = T(i) + T(n - i - 1) + n - 1            where T(0) = T(1) = 0        //no compares for list of size 0 or 1

 

Since every value from i = 0 to i = n - 1 is equally likely to be the position of the pivot, the average number of

compares is the average over all positions of the pivot.

                           n-1

      T(n)  = (1/n)( å(T(i) + T(n-i-1)) ) + n-1

                           i=0

                     n-1

      T(n)  = (2/n)( å T(i)  ) + n-1                 since second summation is the same as first

                           i=0

 

Multiply both sides by n

 

                        n-1

      nT(n)  =  2 å T(i)  + n(n-1)

                       i=0

 

Writing this recurrence for T(n-1) we have

 

                                    n-2

      (n - 1)T(n -1)  =  2 å T(i)  + (n - 1)(n-2)

                                    i=0

 

Subtracting the second equation from the first, we obtain:

 

      n T(n) - (n-1) T(n-1) = 2 T(n-1) + 2(n-1)         since only the last term of the first sum remains after subtraction

 

Add  (n-1)T(n-1) to both sides of the equation and combine terms

 

      n T(n) = (n + 1) T(n-1) + 2(n - 1)

 

Divide by n and obtain

 

      T(n) = ((n + 1)/n) T(n-1) + 2((n-1)/n)

 

Now make the substitution   T(n) = (n + 1) yn          then  T (n-1) = n yn-1 and our equation becomes

 

      yn =  yn-1 + 2 (n-1)/n(n+1)

 

We can unravel the recursion by using

 

      yn=1 =  yn-2 + 2 (n-2)/n(n-1)   and get    yn =  yn-2 + 2 (n-1)/n(n+1) + 2(n-2)/n(n-1)

 

Continuing this process we obtain

                 n

      yn = 2 å (i - 1) / i (i+1)

                i=1

Use partial fractions to reduce this sum

 

      (i - 1)/ i (i + 1) = c1/(i + 1)  + c2/ i

 

Then putting the two terms on the right over a common denominator i (i + 1) and equating the numerators

 

      i - 1 = c1 i + c2 (i + 1)               c1 = 2, c2 = -1

 

and

 

                 n

      yn = 2 å (2/(i+1) - 1/i)

                i=1

 

write this as

                 n                                      n

      yn = 2 å (1/(i+1) - 1/i)  + å 1/(i + 1)

                i=1                                   i=1

 

The first summation reduces to (1/(n+1) ) - 1 since all but the last iteration of the first term and the first iteration of the second subratct out.  The second summation can be written

 

       n+1            n

       å 1/i  = å 1/i   + 1/(n+1) -1

       i=2            i=1

 

where the last term of the sum is removed and the bottom limit is changed to i = 1 by adding and subtracting 1 on the rhs.

 

Making these changes to the equation for yn we get

                                                                              n

      yn = 2 ( 1 - (n+1))/(n + 1) + 2/ (n + 1) - 2 + 2 å 1/i

                                                                             i=1

 

and

                                            n

      yn = 4(1/(n+1) - 1) + 2 å 1/i

                                          i=1

 

taking the limit as n ® ¥, the first term tends to a constant (-4) and the second term is Q( lg n)

 

Finally returning to the substitution  T(n) = nyn  we get T(n) = Q(nlg(n))

 

This is intuitively realized if we recognize that partitioning locates one point, the pivot, in its correct poistion in a sorted list

every time a (sub) list is segmented, and it will take Q(lg(n)) iterations with a total of Q(n)  compares for each iteration.