Watson

Humans seem to have a basic desire to structure and organize the world around them. One of the most basic ways of organizing a collection of objects is to arrange them according to some common characteristic, such as size, weight, cost, or age.

A group of objects is sorted when the objects are arranged according to some rule that involves the use of “orderable keys.” While the term “orderable” will not be rigorously defined, intuitively it means keys on which the concepts of “less than” “greater than”, and “equal to” make sense, or on which “precedes” and “follows” have meaning. Orderable keys include numbers, such as: weight, height, age, income, and social security number. Character strings, such as words, names, and addresses, are also orderable. Numeric keys are generally arranged from smallest to largest, or, occasionally, largest to smallest. Character based keys are generally arranged in dictionary, or lexicographic, order.

Sorting is a very important problem. If information is not organized in some fashion, retrieval can be quite time consuming. To find a particular item in an unsorted list, we are forced to use sequential search, rather than the more efficient binary search. While the sequential search approach may be acceptable for small numbers of items, it becomes unwieldy when dealing with large numbers of objects. For this reason, computer scientists have focused a great deal of attention on the sorting problem and have devised a number of sorting algorithms. Over the years, these algorithms have been closely studied and the efficiency of each carefully analyzed.

This section presents three common approaches to the sorting problem: selection sort, insertion sort, and merge sort. As was the case with sequential and binary searches, each of the three sorts will be compared to determine their efficiency. In order to develop estimates of the runtimes of the three sort algorithms, we will use a “computer” capable of executing one million comparisons per second. While this value is somewhat more realistic than the “antique computer” (capable of executing 1,000 comparisons per second) used in the previous section, today’s computers are still many times more powerful.

Modern computers are capable of performing billions of basic low-level instructions per second. The number of higher-level operations they can perform each second depends on many factors other than CPU speed. In the case of sorting, the number of comparisons per second depends on factors such as whether the list being sorted is in main memory or on disk, and whether the computer is running other programs besides the sort procedure.

The take home message is don’t place too much emphasis on the numbers we will generate for “predicted runtimes” – since these are highly dependent on the assumptions made concerning computing hardware. Instead, what is important is the relative performance of the algorithms, which is not affected by underlying hardware assumptions.