|
Quicksort is a sorting algorithm that,
on average, needs O(n log(n)) comparisons to sort n items, but in the worst case requires
O(n2) comparisons. Its inner loop is inherently very fast on nearly all computers, which makes it
significantly faster than other O(n log n) algorithms that can sort in place or nearly so in the average
case.
Simple Quicksort's space requirement, similarly, is O(log n) on average and O(n) in the
worst case, both for the stack space it uses. It is commonly falsely thought to be an in-place algorithm, meaning that it has only constant or O(1) storage overhead.
Quicksort was developed by C. A. R. Hoare.
Performance and algorithm details
Because of its good average performance and simple implementation, Quicksort is one of the most popular sorting algorithms in
use. It is an unstable sort in that it doesn't preserve any ordering that is
already between elements of the same value. Quicksort's worst-case performance is Θ(n2); much worse than some other sorting
algorithms such as heapsort or merge
sort. However, if pivots are chosen randomly, most bad choices of pivots are unlikely; the worst-case has only probability
1/n! of occurring.
The Quicksort algorithm uses a recursive divide and conquer
strategy to sort a list. The steps are:
- Pick a pivot element from the list.
- Reorder the list so that all elements less than the pivot precede all elements greater than the pivot. This means that the
pivot is in its final place; the algorithm puts at least one element in its final place on each pass over the list. This step is
commonly referred to as "partitioning".
- Recursively sort the sub-list of elements less than the pivot and the sub-list of elements greater than the pivot. If one of
the sub-lists is empty or contains one element, it can be ignored.
In pseudocode, the complete algorithm in its simplest form is:
partition(a, left, right, pivotIndex)
pivotValue := a[pivotIndex]
swap(a[pivotIndex], a[right]) // Move pivot to end
storeIndex := left
for i = left to right-1
if a[i] <= pivotValue
swap(a[storeIndex], a[i])
storeIndex := storeIndex + 1
swap(a[right], a[storeIndex]) // Move pivot to its final place
return storeIndex
quicksort(a, left, right)
if right > left
select a pivot value a[pivotIndex]
pivotNewIndex := partition(a, left, right, pivotIndex)
quicksort(a, left, pivotNewIndex-1)
quicksort(a, pivotNewIndex+1, right)
The inner loop which performs the partition is often very amenable to optimization as all comparisons are being done with a
single pivot element. This is one reason why Quicksort is often the fastest sorting algorithm, at least on average over all
inputs.
The most crucial problem of Quicksort is the choice of pivot element. A naive implementation of Quicksort, like the ones
below, will be terribly inefficient for certain inputs. For example, if the pivot always turns out to be the smallest element in
the list, then Quicksort degenerates to Selection sort with
Θ(n2) running time. A secondary problem is the recursion depth. This becomes linear, and the stack
requires Θ(n) extra space.
Choosing a better pivot
The worst-case behavior of quicksort is not merely a theoretical problem. When quicksort is used in web services, for example,
it is possible for an attacker to deliberately exploit the worst case performance and choose data which will cause a slow running
time or maximize the chance of running out of stack space. See competitive analysis for more discussion of this issue.
Sorted or partially sorted data is quite common in practice and the naive implementation which selects the first element as
the pivot does poorly with such data. To avoid this problem the middle element can be used. This works well in practice but
attacks can cause worst-case performance.
A better optimization can be to select the median element of the first, middle and last elements as the pivot. Adding two
randomly selected elements resists chosen data attacks, more so if a cryptographically sound random number generator is used to
reduce the chance of an attacker predicting the "random" elements. The use of the fixed elements reduces the chance of bad luck
causing a poor pivot selection for partially sorted data when not under attack. These steps increase overhead, so it may be worth
skipping them once the partitions grow small and the penalty for poor pivot selection drops.
Finding the true median value and using it as the pivot can be done if the number of elements is large enough to make it
necessary but this is seldom done in practice.
Other optimizations
Another optimization is to switch to a different sorting algorithm once the list becomes small, perhaps ten or less elements.
Selection sort might be inefficient for large data sets, but it is often faster than Quicksort on small lists.
One widely used implementation of quicksort, that in the 1997 Microsoft C library, used a cutoff of 8 elements before
switching to insertion sort, asserting that testing had shown that to
be a good choice. It used the middle element for the partition value, asserting that testing had shown that the median of three
algorithm did not, in general, increase performance.
In datasets which contain a lot of equal elements, quicksort can degenerate to worst case time complexity in sorting the
'bottom tier' of partitions. A good variation in such cases is to test seperately for equal elements and store these in a 'fat
pivot' in the center of the partition. An implementation of this variation implemented in C is shown below.
Sedgewick (1978) suggested an enhancement to the use of simple sorts for small numbers of elements, which reduced the number
of instructions required by postponing the simple sorts until the quicksort had finished, then running an insertion sort over the
whole array.
LaMarca and Ladner (1997) consider "The Influence of Caches on the Performance of Sorting", a very significant issue in
microprocessor systems with multi-level caches and high cache miss times. They conclude that while the Sedgewick optimization
decreases the number of instructions, it also decreases locality of cache references and worsens performance compared to doing
the simple sort when the need for it is first encountered. However, the effect was not dramatic and they suggested that it was
starting to become more significant with more than 4 million 64 bit float elements. This work is cited by Musser, following.
Their work showed greater improvements for other sorting types.
Because recursion requires additional memory, Quicksort has been implemented
in a non-recursive, iterative form. This has the advantage of predictable memory
use regardless of input, and the disadvantage of considerably greater code complexity. Those considering iterative
implementations of Quicksort would do well to also consider Introsort or especially Heapsort.
A simple alternative for reducing Quicksort's memory consumption uses true recursion only on the smaller of the two sublists
and tail recursion on the larger. This limits the additional storage of Quicksort to O(log n). The procedure
quicksort in the preceding pseudocode would be rewritten as
quicksort(a, left, right)
while right > left
select a pivot value a[pivotIndex]
pivotNewIndex := partition(a, left, right, pivotIndex)
if pivotNewIndex-1 - left < right - (pivotNewIndex+1)
quicksort(a, left, pivotNewIndex-1)
left := pivotNewIndex+1
else
quicksort(a, pivotNewIndex+1, right)
right := pivotNewIndex-1
Other competitive sorting algorithms
A variation on quicksort which is becoming widely used is introspective sort, often called introsort
(Musser 1997). This starts with quicksort and
switches to heapsort when the recursion depth exceeds a preset value. This
overcomes the overhead of increasingly complex pivot selection techniques while ensuring O(n log n) worst-case
performance. Musser reported that on a median-of-3 killer sequence of 100,000 elements running time was 1/200th that of
median-of-3 quicksort. Musser also considered the effect of Sedgewick delayed small sorting on caches, reporting that it could
double the number of cache misses but that its performance with double-ended queues was significantly better and it should be
retained for template libraries, in part because the gain in other cases from doing the sorts immediately was not great.
The June 2000 SGI C++ Standard Template Library stl_algo.c implementation of unstable sort uses the Musser introsort
approach with the recursion depth to switch to heapsort passed as a parameter, median-of-3 pivot selection and the Sedgewick
final insertion sort pass. The element threshold for switching to the simple insertion sort was 16.
The C++ STL implementations generally significantly (several times as fast) outperform the C implementation because they are
implemented to allow inlining, while the generic C equivalent must use function calls for comparisons. This advantage could be
compensated for by using custom versions of the sort function, at the cost of losing the advantage of a totally generic library
function.
The most direct competitor of Quicksort is heapsort. Heapsort is typically
somewhat slower than Quicksort, but the worst-case running time is always O(n log n). Quicksort is usually
faster, though there remains the chance of worst case performance except in the introsort variant. If it's known in advance that
heapsort is going to be necessary, using it directly will be faster than waiting for introsort to switch to it. Heapsort also has
the important advantage of using only constant additional space (heapsort is in-place), whereas even the best variant of
Quicksort uses O(log n) space.
Quicksort is a space-optimized version of the binary tree sort.
Instead of inserting items sequentially into an explicit tree, Quicksort organizes them concurrently into a tree that is implied
by the recursive calls. The algorithms make exactly the same comparisons, but in a different order.
Relationship to selection
A simple selection algorithm, which chooses the kth smallest of a list of elements, works nearly the same as
quicksort, except instead of recursing on both sublists, it only recurses on the sublist which contains the desired element. This
small change changes the average complexity to linear or O(n) time. A variation on this algorithm brings the worst-case time down
to O(n) (see selection algorithm for more information).
Conversely, once we know a worst-case O(n) selection algorithm is available, we can use it to find the ideal pivot
(the median) at every step of Quicksort, producing a variant with worst-case O(n log n) running time. This
variant is considerably slower on average, however.
Implementations
A simple implementation of Quicksort on an array of integers in C:
/* Sorts the elements of a between low and high inclusive */
void quicksort(int a[], int low, int high)
{
/* Use the first element as the pivot */
int pivot = a[low];
/* Set l and r and move the array elements such
that for all i:low<i<l, a[i]<=pivot,
and for all i:r<=i<high, a[i]>pivot. */
int l = low + 1;
int r = high;
while(l < r) {
if (a[l] <= pivot) l++;
else {
r--;
swap(a[l], a[r]);
}
}
/* Put pivot element into place */
l--;
swap(a[low], a[l]);
/* Recursively sort the partitions, if their size is > 1
(if a partition's size is 1, it is already sorted) */
if (l-low > 1)
quicksort(a, low, l);
if (high-r > 1)
quicksort(a, r, high);
}
An implementation of Quicksort similar to the above, but using a 'fat pivot' in C:
void fquicksort(int a[], int low, int high) {
int pivot = a[low];
int i = low + 1, j = high, k = high;
int t;
while (i<j) {
if (a[i] < pivot) i++;
else if (a[i] > pivot) {
j--; k--;
t = a[i];
a[i] = a[j];
a[j] = a[k];
a[k] = t; }
else {
j--;
swap(a[i], a[j]);
}
}
i--;
swap(a[low], a[i]);
if (i - low > 1)
fquicksort(a, low, i);
if (high - k > 1)
fquicksort(a, k, high);
}
Here's an implementation in Python, which
uses more efficient partitioning for optimization:
def partition(array, start, end, compare):
while start < end:
# at the top of this loop, our partition element is at 'start'
while start < end:
if compare(array[start], array[end]):
(array[start], array[end]) = (array[end], array[start])
break
end -= 1
# at the top of this loop, our partition element is at 'end'
while start < end:
if compare(array[start], array[end]):
(array[start], array[end]) = (array[end], array[start])
break
start += 1
return start
def quicksort(array, compare=lambda x, y: x > y, start=None, end=None):
"""The fastest exchange sort for most purposes."""
if start is None: start = 0
if end is None: end = len(array)
if start < end:
i = partition(array, start, end-1, compare)
quicksort(array, compare, start, i)
quicksort(array, compare, i+1, end)
Here's a very short version written in the functional
programming language Haskell:
quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (pivot:rest) = (quicksort [y| y<-rest, y<pivot]) ++
[pivot] ++
(quicksort [y| y<-rest,y>=pivot])
Note the use of list comprehensions, and also that this uses
the head of the list as a pivot element, so it does not give optimum results if the list is already nearly sorted (improvements
can be made by preselecting a more suitable pivot before the main quicksort function). This version is very easy to understand,
as it simply recursively sorts the lists formed by filtering all the items less than the head, and all the items greater than the
head, and inserts the singleton list containing the head item in between. The code is almost self explanatory. (Unfortunately,
for existing Haskell implementations, it is inefficient as copying and concatenation of many small list makes this quicksort
perform O(n2) on average. Future Haskell implementations may perform optimizations "behind the scenes", but currently
these are not required by the language.)
Here's an implementation in OCaml very similar to the Haskell implementation above.
# let rec quicksort lst =
match lst with
[] -> []
| head::tail ->
let left, right = List.partition (function x -> x < head) tail in
(quicksort left) @ head::(quicksort right);;
val quicksort : 'a list -> 'a list = <fun>
Here's an implementation in Common Lisp.
(defun partition (fun lst)
(list (remove-if-not fun lst) (remove-if fun lst)))
(defun quicksort (lst)
(if (null lst) nil
(let ((part (partition (lambda (x) (< x (car lst))) (cdr lst))))
(append (quicksort (car part)) (cons (car lst) (quicksort (cadr part)))))))
Here's an implementation in Java.
import java.util.Comparator;
import java.util.Random;
public class Quicksort {
public static final Random RND = new Random();
private void swap(Object[] array, int i, int j) {
Object tmp = array[i];
array[i] = array[j];
array[j] = tmp;
}
private int partition(Object[] array, int left, int right, Comparator c) {
int index = left + RND.nextInt(right - left + 1);
Object pivot = array[index];
swap(array, index, right);
for (int i = index = left; i < right; ++ i) {
if (c.compare(array[i], pivot) <= 0) {
swap(array, index ++, i);
}
}
swap(array, index, right);
return (index);
}
private void qsort(Object[] array, int left, int right, Comparator c) {
if (right > left) {
int index = partition(array, left, right, c);
qsort(array, left, index - 1, c);
qsort(array, index + 1, right, c);
}
}
public void sort(Object[] array, Comparator c) {
qsort(array, 0, array.length - 1, c);
}
}
It uses randomly selected pivot. A user supplied comparator is employed to determine partial order of array elements.
External links
References
- Hoare, C. A. R. "Partition: Algorithm 63," "Quicksort: Algorithm 64," and "Find: Algorithm 65." Comm. ACM 4, 321-322,
1961
- R. Sedgewick. Implementing quicksort programs, Communications of the ACM, 21(10):847857, 1978.
- David Musser. Introspective Sorting and Selection Algorithms, Software Practice and Experience vol 27, number 8, pages
983-993, 1997
- A. LaMarca and R. E. Ladner. "The Influence of Caches on the Performance of Sorting." Proceedings of the Eighth Annual
ACM-SIAM Symposium on Discrete Algorithms, 1997. pp. 370-379.
|