CS 2223 Nov 05 2015

Lecture Path: 06

Expected reading: pp. 243-257

Two by two;
hands of blue
River

1 Sorting Principles

We are starting a new weekly module. In this module we will cover sorting as a domain application. Sorting is a model problem for computer science and has been studied extensively. To study the performance of sorting algorithms, we will analyze the cost and frequency of operations, as you saw on p. 181. Along the way, we will learn about the heap data structure and demonstrate its use in a priority queue.

As a general comment, you will likely only implement a data type or data structure "from scratch" within an academic context, such as this course. My focus is to make sure that you can understand code examples and pseudocode.

1.1 Important concepts from readings

Sorting processes arrays of items where each item contains a key.
Mathematically determine how many comparisons an algorithm performs.
Mathematically determine how many exchanges (or swaps) an algorithm performs.
Items must be totally ordered: the ordering is reflexive, antisymmetric, and transitive.
The Comparable interface and its compareTo method.
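To make the last two concepts concrete, here is a minimal sketch of an item type whose key defines a total order through compareTo. The class name Score and its fields are hypothetical, chosen only for illustration; they do not come from the readings.

```java
import java.util.Arrays;

// Hypothetical item type (an assumption for this sketch): the key is `points`.
class Score implements Comparable<Score> {
    final String name;
    final int points;

    Score(String name, int points) {
        this.name = name;
        this.points = points;
    }

    // Negative if this key is smaller, zero if the keys are equal,
    // positive if this key is larger. Integer.compare yields an ordering
    // that is reflexive, antisymmetric, and transitive.
    @Override
    public int compareTo(Score other) {
        return Integer.compare(this.points, other.points);
    }

    public static void main(String[] args) {
        Score[] scores = { new Score("a", 70), new Score("b", 55), new Score("c", 90) };
        Arrays.sort(scores);   // the library sort relies only on compareTo
        for (Score s : scores) {
            System.out.println(s.name + " " + s.points);
        }
    }
}
```

Any sort written against Comparable can order Score objects without knowing anything else about them.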

1.2 Opening Questions

Did anyone solve the daily exercise for splitting a Bag in half? I realize that I had framed the problem in such a way that it was technically impossible. I only realized that the next day. Sorry! In any event, you can check out the code sample, and here is the basic idea.

Start with two references to the front of the linked list. Advance one of them one step at a time, and advance the other one two steps at a time. Eventually the "two-step" reference will be null, in which case the "one-step" reference is at (more-or-less) the halfway point. The code covers the special cases.
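The two-reference idea can be sketched as follows. This is not the posted code sample; the class name HalfwayDemo, the int-valued Node, and the method name halfway are assumptions made so the sketch stands alone.

```java
// A minimal sketch of the "one-step / two-step" traversal.
class HalfwayDemo {
    // Stand-in for the Bag's internal node (an assumption for this sketch).
    static class Node {
        int item;
        Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    // Returns the node at (roughly) the halfway point, or null for an empty list.
    static Node halfway(Node first) {
        Node slow = first;   // advances one step at a time
        Node fast = first;   // advances two steps at a time
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        return slow;
    }

    public static void main(String[] args) {
        // Build the list 1 -> 2 -> 3 -> 4 -> 5.
        Node first = null;
        for (int i = 5; i >= 1; i--) {
            first = new Node(i, first);
        }
        System.out.println(halfway(first).item);   // prints 3
    }
}
```

The loop condition checks both fast and fast.next, which covers the empty list and lists of odd and even length. For an even number of nodes this version stops at the first node of the second half.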

1.3 Thoughts on HW1

Open Discussion.

1.4 Thoughts on Tilde Approximation

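This discussion relates back to page 181 of the book. The purpose of this notation is to simplify a cost formula by discarding the lower-order terms, which become negligible as the problem size grows.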

Figure 1: Anatomy of execution (p.181)

This analysis derives the following cost model:

Figure 2: Cost Model (p.181)

1.5 Sorting Groundrules

The best way to compare sorting algorithms is to develop a benchmark approach to properly compare "apples to apples." Sedgewick does this by defining two fundamental operations:

boolean less (Comparable v, Comparable w) – This determines whether v is strictly smaller than w. It may seem odd, but you can do sorting without actually ever checking whether two objects are equal.
void exch (Comparable[] a, int i, int j) – Exchanges the elements within the array at positions i and j.
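A minimal sketch of how these two operations might be written, assuming the items implement Comparable. The enclosing class name SortOps and the isSorted helper are additions for illustration.

```java
// Sketch of the two fundamental operations; SortOps is a hypothetical holder class.
@SuppressWarnings({"rawtypes", "unchecked"})
class SortOps {
    // True only when v is strictly smaller than w; equality is never tested directly.
    static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    // Exchanges the elements at positions i and j.
    static void exch(Comparable[] a, int i, int j) {
        Comparable t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Helper (an assumption for this sketch): validates a result using less alone.
    static boolean isSorted(Comparable[] a) {
        for (int i = 1; i < a.length; i++) {
            if (less(a[i], a[i - 1])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Integer[] a = { 3, 1, 2 };
        exch(a, 0, 1);                     // a is now {1, 3, 2}
        System.out.println(less(a[0], a[1]));
        System.out.println(isSorted(a));
    }
}
```

Routing every comparison and every swap through these two methods is what makes it possible to count operations and compare algorithms on equal terms.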

In this regard, we are approaching sorting as computer scientists: we can set up experiments, identify valid sorting algorithms, and compare their behavior with one another.

Another feature of the Sedgewick approach is to demonstrate the sorting algorithm on small data sets. This allows you to understand the mechanics fully so you can then empirically validate its performance against the predicted speed.

In most cases it doesn’t matter what type of object is being sorted. However, it is worth noting that more complicated objects will be slower to sort because each less operation between two items takes longer to compute.

The goal of these next few lectures – and indeed the entire chapter on Sorting – is to demonstrate a range of mathematical and analytic skills for predicting the performance of different algorithmic approaches to a common problem.

1.6 Selection Sort

This approach is a very "human-centered" approach to sorting, one which you might already employ on a daily basis.

Since the goal is to sort the entire array, selection sort starts by locating the smallest element in the array and swapping it with a[0]. As you already know, this will take n-1 less comparisons. Once this step is completed, you now have a problem that has decreased in size by 1: you want to sort the "right side" of the array, which is one element smaller.

If you continue this logic, in n-2 less comparisons, you can find the element that is to be switched with position a[1].
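Carrying this reasoning through all n positions gives the total number of less comparisons. The following is a worked sketch, where C(n) is a name introduced here for that total:

```latex
\begin{aligned}
C(n) &= (n-1) + (n-2) + \cdots + 1 + 0 \\
     &= \sum_{i=0}^{n-1} (n-1-i) \\
     &= \frac{n(n-1)}{2} \;\sim\; \frac{n^2}{2}
\end{aligned}
```

For example, with n = 4 the passes use 3 + 2 + 1 + 0 = 6 comparisons, which matches 4(3)/2 = 6.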

public static void sort(Comparable[] a) {
    int N = a.length;
    // Truth_____________________________
    for (int i = 0; i < N; i++) {
        int min = i;
        // Truth___________________________
        for (int j = i+1; j < N; j++) {
            if (less(a[j], a[min])) { min = j; }
        }
        // Truth___________________________
        exch(a, i, min);
        // Truth___________________________
    }
    // Truth_____________________________
}

Describe and explain expected time performance

1.7 Insertion Sort

Insertion sort is nominally better because it chooses a different strategy for sorting. It first observes that a[0], when considered by itself, is already sorted. What about a[1]? If it is already in position (i.e., it is not less than a[0]), then nothing needs to be done; otherwise the two are swapped.

If you continue this logic, you might be able to see that the goal of this method is to assume that a[0] .. a[i] are already sorted, and then it tries to see where a[i+1] should be inserted into place. It performs this task from the right down to the left, that is, in decreasing order, because it may be that a[i+1] is already in position by being greater than a[i]. When this is not the case, the algorithm exchanges neighbors while trying to locate the proper place to insert a[i+1]. Once done, the array a[0] .. a[i+1] is sorted.

This algorithm also reduces the size of the problem to solve by one, but it has some attributes that are worth highlighting:

It is the only sorting algorithm we will present which benefits when items are already in sorted order.
Insertion Sort typically performs more exchanges than Selection Sort, which performs only one exchange per pass.

Describe and explain expected time performance

public static void sort(Comparable[] a) {
    int N = a.length;
    // Truth_____________________________
    for (int i = 0; i < N; i++) {
        for (int j = i; j > 0 && less(a[j], a[j-1]); j--) {
            exch(a, j, j-1);
        }
        // Truth_____________________________
    }
    // Truth_____________________________
}
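One way to check predictions like these empirically is to count calls to less and exch. The sketch below instruments both sorts. The class name CountingSorts, the counters, and the count helper are assumptions added for this experiment and are not part of the book's code.

```java
import java.util.Arrays;

// Instrumented copies of the two sorts; the counters are an addition for this sketch.
@SuppressWarnings({"rawtypes", "unchecked"})
class CountingSorts {
    static long compares;
    static long exchanges;

    static boolean less(Comparable v, Comparable w) {
        compares++;
        return v.compareTo(w) < 0;
    }

    static void exch(Comparable[] a, int i, int j) {
        exchanges++;
        Comparable t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    static void selectionSort(Comparable[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int min = i;
            for (int j = i + 1; j < N; j++) {
                if (less(a[j], a[min])) min = j;
            }
            exch(a, i, min);
        }
    }

    static void insertionSort(Comparable[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                exch(a, j, j - 1);
            }
        }
    }

    // Sorts a copy of input and returns {compares, exchanges}.
    static long[] count(boolean useInsertion, Integer[] input) {
        Integer[] a = input.clone();
        compares = 0;
        exchanges = 0;
        if (useInsertion) insertionSort(a); else selectionSort(a);
        return new long[] { compares, exchanges };
    }

    public static void main(String[] args) {
        Integer[] sorted   = { 1, 2, 3, 4, 5, 6, 7, 8 };
        Integer[] reversed = { 8, 7, 6, 5, 4, 3, 2, 1 };
        System.out.println("selection, sorted:   " + Arrays.toString(count(false, sorted)));
        System.out.println("selection, reversed: " + Arrays.toString(count(false, reversed)));
        System.out.println("insertion, sorted:   " + Arrays.toString(count(true, sorted)));
        System.out.println("insertion, reversed: " + Arrays.toString(count(true, reversed)));
    }
}
```

With N = 8, selection sort reports [28, 8] on both inputs, while insertion sort reports [7, 0] on the sorted input and [28, 28] on the reversed input. Selection sort's work does not depend on the input order; insertion sort's does.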

1.8 Sample Exam Question

This one relates to linked lists as seen with the Bag data type. How would you implement a contains method that determines whether an element is to be found in the Bag?

/** @param <Item> the type of elements in the Bag */
public class Bag<Item> {
    Node first;   // first node in the list (may be null)

    class Node {
        Item item;
        Node next;
    }

    /** Determine whether item is contained in the bag. */
    public boolean contains (Item item) {
        // fill in here...
    }
}

Also evaluate the performance of the method in terms of N, where N is the number of elements in the Bag.