Intermediate Sorting Algorithms

Don't be scared!

Objectives

Understand the limitations of the sorting algorithms we've learned so far
Implement merge sort
Implement quick sort
Implement radix sort

WHY LEARN THIS?

The sorting algorithms we've learned so far don't scale well
Try running bubble sort on an array of 100,000 elements: it will take quite some time!
We need to be able to sort large arrays more quickly

FASTER SORTS

There is a family of sorting algorithms that can improve time complexity from O(n^2) to O(n log n)
There's a tradeoff between efficiency and simplicity
The more efficient algorithms are much less simple, and generally take longer to understand
Let's dive in!

Merge Sort

It's a combination of two things - merging and sorting!
Exploits the fact that arrays of 0 or 1 element are always sorted
Works by decomposing an array into smaller arrays of 0 or 1 elements, then building up a newly sorted array

How does it work?

Let's visualize this!

[ 8, 3, 5, 4, 7, 6, 1, 2 ]

[ 8, 3, 5, 4 ] [ 7, 6, 1, 2 ]

[ 8, 3 ] [ 5, 4 ] [ 7, 6 ] [ 1, 2 ]

[ 8 ] [ 3 ] [ 5 ] [ 4 ] [ 7 ] [ 6 ] [ 1 ] [ 2 ]

[ 3, 8 ] [ 4, 5 ] [ 6, 7 ] [ 1, 2 ]

[ 3, 4, 5, 8 ] [ 1, 2, 6, 7 ]

[ 1, 2, 3, 4, 5, 6, 7, 8 ]

Merging Arrays

In order to implement merge sort, it's useful to first implement a function responsible for merging two sorted arrays
Given two arrays which are sorted, this helper function should create a new array which is also sorted, and consists of all of the elements in the two input arrays
This function should run in O(n + m) time and O(n + m) space and should not modify the parameters passed to it.

Merging Arrays Pseudocode

Create an empty array, then take a look at the smallest values in each input array
While there are still values we haven't looked at...

If the value in the first array is smaller than the value in the second array, push the value in the first array into our results and move on to the next value in the first array
If the value in the first array is larger than the value in the second array, push the value in the second array into our results and move on to the next value in the second array
If the two values are equal, either one can be pushed first
Once we exhaust one array, push in all remaining values from the other array

YOUR TURN
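Once you have tried it yourself, the merging pseudocode can be sketched roughly as follows. This is one possible version, not the only correct one. The index names `i` and `j` are illustrative, and taking from the first array on ties is an assumption the pseudocode leaves open.

```javascript
// Merge two already-sorted arrays into a new sorted array.
// Neither input is modified. Runs in O(n + m) time and space.
function merge(arr1, arr2) {
  const results = [];
  let i = 0; // next unread index in arr1
  let j = 0; // next unread index in arr2

  // Compare the smallest unread value from each array.
  while (i < arr1.length && j < arr2.length) {
    if (arr1[i] <= arr2[j]) { // assumption: ties take from arr1
      results.push(arr1[i]);
      i++;
    } else {
      results.push(arr2[j]);
      j++;
    }
  }

  // One array is exhausted: copy whatever is left in the other.
  while (i < arr1.length) {
    results.push(arr1[i]);
    i++;
  }
  while (j < arr2.length) {
    results.push(arr2[j]);
    j++;
  }
  return results;
}

console.log(merge([3, 4, 5, 8], [1, 2, 6, 7])); // [1, 2, 3, 4, 5, 6, 7, 8]
```

Each element is pushed exactly once, which is where the O(n + m) time and space come from.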

mergeSort Pseudocode

Break up the array into halves until you have arrays that are empty or have one element
Once you have smaller sorted arrays, merge those arrays with other sorted arrays until you are back at the full length of the array
Once the array has been merged back together, return the merged (and sorted!) array

mergeSort([10,24,76,73])
    mergeSort([10,24])
        mergeSort([10]) returns [10]
        mergeSort([24]) returns [24]
        merge gives [10,24]
    mergeSort([76,73])
        mergeSort([76]) returns [76]
        mergeSort([73]) returns [73]
        merge gives [73,76]
    merge gives [10,24,73,76]

YOUR TURN
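After attempting it, you can compare against this minimal sketch of the mergeSort pseudocode. It assumes a `merge` helper like the one described earlier (repeated here in compact form so the snippet runs on its own) and uses `slice` to split the array, which is one choice among several.

```javascript
// Compact merge helper: combines two sorted arrays into a new sorted array.
function merge(left, right) {
  const results = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    if (left[i] <= right[j]) {
      results.push(left[i]);
      i++;
    } else {
      results.push(right[j]);
      j++;
    }
  }
  // Append the leftovers from whichever side still has values.
  return results.concat(left.slice(i), right.slice(j));
}

function mergeSort(arr) {
  // Base case: arrays of 0 or 1 element are already sorted.
  if (arr.length <= 1) return arr;

  // Split in half, sort each half recursively, then merge the halves.
  const mid = Math.floor(arr.length / 2);
  const left = mergeSort(arr.slice(0, mid));
  const right = mergeSort(arr.slice(mid));
  return merge(left, right);
}

console.log(mergeSort([10, 24, 76, 73])); // [10, 24, 73, 76]
```

The recursive calls follow the same shape as the call tree for `mergeSort([10,24,76,73])` shown above the exercise.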

Big O of mergeSort

Time Complexity (Best)	Time Complexity (Average)	Time Complexity (Worst)	Space Complexity
O(n log n)	O(n log n)	O(n log n)	O(n)

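Big O of mergeSort

Why???

[ 8 ] [ 3 ] [ 5 ] [ 4 ] [ 7 ] [ 6 ] [ 1 ] [ 2 ]

[ 3, 8 ] [ 4, 5 ] [ 6, 7 ] [ 1, 2 ]

[ 3, 4, 5, 8 ] [ 1, 2, 6, 7 ]

[ 1, 2, 3, 4, 5, 6, 7, 8 ]

O(log n) decompositions: 8 elements take 3 rounds of halving (and 3 rounds of merging), since 2^3 = 8

O(n) comparisons per decomposition: each round of merging handles every element at most once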
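To make the two factors concrete, here is a small experiment that counts comparisons while sorting the 8-element example. The names `countingMerge`, `countingMergeSort`, and `comparisons` are made up for this illustration. The bound it checks is the rough n * log2(n) figure, not an exact count.

```javascript
let comparisons = 0; // total element comparisons made by every merge

function countingMerge(left, right) {
  const results = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    comparisons++; // one comparison per loop iteration
    if (left[i] <= right[j]) {
      results.push(left[i]);
      i++;
    } else {
      results.push(right[j]);
      j++;
    }
  }
  return results.concat(left.slice(i), right.slice(j));
}

function countingMergeSort(arr) {
  if (arr.length <= 1) return arr;
  const mid = Math.floor(arr.length / 2);
  return countingMerge(
    countingMergeSort(arr.slice(0, mid)),
    countingMergeSort(arr.slice(mid))
  );
}

const input = [8, 3, 5, 4, 7, 6, 1, 2];
const sorted = countingMergeSort(input);
const n = input.length;
const bound = n * Math.log2(n); // 8 elements * 3 levels = 24

console.log(sorted);
console.log("comparisons:", comparisons, "bound:", bound);
```

Each of the 3 merge levels performs fewer than 8 comparisons, so the total stays under 8 * 3 = 24.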

Quick Sort

Like merge sort, exploits the fact that arrays of 0 or 1 element are always sorted
Works by selecting one element (called the "pivot") and finding the index where the pivot should end up in the sorted array
Once the pivot is positioned appropriately, quick sort can be applied on either side of the pivot

How does it work?

Let's visualize this!

[ 5, 2, 1, 8, 4, 7, 6, 3 ]

Pivot 5: [ 3, 2, 1, 4 ] 5 [ 7, 6, 8 ]

Pivot 3: [ 1, 2 ] 3 [ 4 ]

Pivot Helper

In order to implement quick sort, it's useful to first implement a function responsible for arranging elements in an array on either side of a pivot
Given an array, this helper function should designate an element as the pivot
It should then rearrange elements in the array so that all values less than the pivot are moved to the left of the pivot, and all values greater than the pivot are moved to the right of the pivot
The order of elements on either side of the pivot doesn't matter!
The helper should do this in place, that is, it should not create a new array
When complete, the helper should return the index of the pivot

Picking a pivot

The runtime of quick sort depends in part on how one selects the pivot
Ideally, the pivot should be chosen so that it's roughly the median value in the data set you're sorting
For simplicity, we'll always choose the pivot to be the first element (we'll talk about consequences of this later)

Pivot Helper Example

let arr = [ 5, 2, 1, 8, 4, 7, 6, 3 ]

pivot(arr); // 4;

arr;
// any one of these is an acceptable mutation:
// [2, 1, 4, 3, 5, 8, 7, 6]
// [1, 4, 3, 2, 5, 7, 6, 8]
// [3, 2, 1, 4, 5, 7, 6, 8]
// [4, 1, 2, 3, 5, 6, 8, 7]
// there are other acceptable mutations too!

All that matters is for 5 to be at index 4, for smaller values to be to the left, and for larger values to be to the right

Pivot Pseudocode

It will help to accept three arguments: an array, a start index, and an end index (these can default to 0 and the array length minus 1, respectively)
Grab the pivot from the start of the array
Store the current pivot index in a variable (this will keep track of where the pivot should end up)
Loop through the array from the start until the end

If the pivot is greater than the current element, increment the pivot index variable and then swap the current element with the element at the pivot index

Swap the starting element (i.e. the pivot) with the element at the pivot index
Return the pivot index

YOUR TURN
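Once you have had a go, here is one way the pivot pseudocode might look in JavaScript. The `swap` helper and the variable names `pivotValue` and `swapIdx` are illustrative choices, not something the pseudocode prescribes.

```javascript
// Swap two elements of an array in place.
function swap(arr, i, j) {
  [arr[i], arr[j]] = [arr[j], arr[i]];
}

// Partition arr[start..end] around its first element.
// Returns the index where the pivot ends up.
function pivot(arr, start = 0, end = arr.length - 1) {
  const pivotValue = arr[start];
  let swapIdx = start; // tracks where the pivot should finally sit

  for (let i = start + 1; i <= end; i++) {
    if (pivotValue > arr[i]) {
      // Found a smaller value: grow the "less than pivot" region by one.
      swapIdx++;
      swap(arr, swapIdx, i);
    }
  }

  // Move the pivot between the smaller and larger values.
  swap(arr, start, swapIdx);
  return swapIdx;
}

const arr = [5, 2, 1, 8, 4, 7, 6, 3];
console.log(pivot(arr)); // 4
console.log(arr);        // [3, 2, 1, 4, 5, 7, 6, 8]
```

The resulting arrangement is one of the acceptable mutations listed in the pivot helper example; a different but equally valid helper could produce another.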

Quicksort Pseudocode

Call the pivot helper on the array
When the helper returns the updated pivot index, recursively call quick sort on the subarray to the left of that index and on the subarray to the right of that index (each call runs the pivot helper on its own subarray)
Your base case occurs when you consider a subarray with fewer than 2 elements

YOUR TURN
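For comparison with your own attempt, the quicksort pseudocode could be sketched like this. It assumes a `pivot` helper that takes the first element as the pivot (included so the snippet is self-contained). The parameter names `left` and `right` are assumptions for illustration.

```javascript
// Partition arr[start..end] around its first element; return the pivot's final index.
function pivot(arr, start = 0, end = arr.length - 1) {
  const pivotValue = arr[start];
  let swapIdx = start;
  for (let i = start + 1; i <= end; i++) {
    if (pivotValue > arr[i]) {
      swapIdx++;
      [arr[swapIdx], arr[i]] = [arr[i], arr[swapIdx]];
    }
  }
  [arr[start], arr[swapIdx]] = [arr[swapIdx], arr[start]];
  return swapIdx;
}

// Sorts arr in place between indices left and right (inclusive).
function quickSort(arr, left = 0, right = arr.length - 1) {
  // Base case: a subarray with fewer than 2 elements is already sorted.
  if (left < right) {
    const pivotIndex = pivot(arr, left, right);
    quickSort(arr, left, pivotIndex - 1);  // everything left of the pivot
    quickSort(arr, pivotIndex + 1, right); // everything right of the pivot
  }
  return arr;
}

console.log(quickSort([5, 2, 1, 8, 4, 7, 6, 3])); // [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the partitioning happens in place, no new arrays are created; only the index range passed to each recursive call changes.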

Big O of Quicksort

Time Complexity (Best)	Time Complexity (Average)	Time Complexity (Worst)	Space Complexity
O(n log n)	O(n log n)	O(n^2)	O(log n)

Big O of Quicksort

Why???

Best Case

[8, 5, 6, 1, 3, 7, 2, 4, 12, 13, 14, 11, 9, 15, 10]

[4, 5, 6, 1, 3, 7, 2]

[12, 13, 14, 11, 9, 15, 10]

[2, 1, 3]

[6, 7, 5]

[10, 11, 9]

[14, 15, 13]

O(n) comparisons per decomposition

O(log n) decompositions

Big O of Quicksort

Why???

Worst Case

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

O(n) comparisons per decomposition

O(n) decompositions

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[7, 8, 9, 10, 11, 12, 13, 14, 15]

[8, 9, 10, 11, 12, 13, 14, 15]

[9, 10, 11, 12, 13, 14, 15]

[10, 11, 12, 13, 14, 15]

[11, 12, 13, 14, 15]

[12, 13, 14, 15]

[13, 14, 15]

[14, 15]
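The gap between the best and worst case can be checked by counting comparisons on the two 15-element arrays above. This is a rough experiment, assuming the first element is always the pivot. `countComparisons` is a hypothetical helper written for this illustration.

```javascript
// Runs a first-element-pivot quick sort on a copy of the input
// and returns how many element comparisons were made.
function countComparisons(input) {
  const arr = input.slice(); // leave the caller's array untouched
  let comparisons = 0;

  function pivot(start, end) {
    const pivotValue = arr[start];
    let swapIdx = start;
    for (let i = start + 1; i <= end; i++) {
      comparisons++;
      if (pivotValue > arr[i]) {
        swapIdx++;
        [arr[swapIdx], arr[i]] = [arr[i], arr[swapIdx]];
      }
    }
    [arr[start], arr[swapIdx]] = [arr[swapIdx], arr[start]];
    return swapIdx;
  }

  function sort(left, right) {
    if (left < right) {
      const pivotIndex = pivot(left, right);
      sort(left, pivotIndex - 1);
      sort(pivotIndex + 1, right);
    }
  }

  sort(0, arr.length - 1);
  return comparisons;
}

const sortedInput = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
const balancedInput = [8, 5, 6, 1, 3, 7, 2, 4, 12, 13, 14, 11, 9, 15, 10];

console.log(countComparisons(sortedInput));   // 105
console.log(countComparisons(balancedInput)); // 34
```

On the already-sorted input every pivot is the smallest value, so each call only shrinks the problem by one element: 14 + 13 + ... + 1 = 105 comparisons, which grows like n^2. On the balanced input each pivot splits its subarray in half, giving 14 + 12 + 8 = 34.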

COMPARISON SORTS

Average Time Complexity

Bubble Sort - O(n^2)
Insertion Sort - O(n^2)
Selection Sort - O(n^2)
Quick Sort - O(n log (n))
Merge Sort - O(n log (n))

CAN WE DO BETTER?

YES, BUT NOT BY MAKING COMPARISONS

RADIX SORT

Radix sort is a special sorting algorithm that works on lists of numbers

It exploits the fact that information about the size of a number is encoded in the number of digits.

More digits means a bigger number!

It never makes comparisons between elements!

How does it work?

Let's visualize this!

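Original list:
[1556, 4, 3556, 593, 408, 4386, 902, 7, 8157, 86, 9637, 29]

After grouping by the ones digit:
[902, 593, 4, 1556, 3556, 4386, 86, 7, 8157, 9637, 408, 29]

After grouping by the tens digit:
[902, 4, 7, 408, 29, 9637, 1556, 3556, 8157, 4386, 86, 593]

After grouping by the hundreds digit:
[4, 7, 29, 86, 8157, 4386, 408, 1556, 3556, 593, 9637, 902]

After grouping by the thousands digit:
[4, 7, 29, 86, 408, 593, 902, 1556, 3556, 4386, 8157, 9637]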
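The passes above can be reproduced with a short sketch. It assumes non-negative integers in base 10. The helper names `getDigit` and `mostDigits` are illustrative and may differ from the helpers introduced next. Each pass drops every number into one of ten buckets by a single digit, then reads the buckets back in order, without ever comparing two elements.

```javascript
// Digit of num at the given place (0 = ones, 1 = tens, ...).
function getDigit(num, place) {
  return Math.floor(Math.abs(num) / Math.pow(10, place)) % 10;
}

// Number of digits in the longest number of the list.
function mostDigits(nums) {
  let max = 0;
  for (const num of nums) {
    max = Math.max(max, String(Math.abs(num)).length);
  }
  return max;
}

// Assumes non-negative integers; returns a new sorted array.
function radixSort(nums) {
  let result = nums.slice();
  const passes = mostDigits(result);

  for (let place = 0; place < passes; place++) {
    // Ten buckets, one per possible digit 0-9.
    const buckets = Array.from({ length: 10 }, () => []);
    for (const num of result) {
      buckets[getDigit(num, place)].push(num);
    }
    // Read the buckets back in order, keeping each bucket's internal order.
    result = [].concat(...buckets);
  }
  return result;
}

console.log(radixSort([1556, 4, 3556, 593, 408, 4386, 902, 7, 8157, 86, 9637, 29]));
// [4, 7, 29, 86, 408, 593, 902, 1556, 3556, 4386, 8157, 9637]
```

The longest number here has four digits, so the loop runs four passes, matching the four regrouped lists in the visualization.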

RADIX SORT HELPERS