This article is part 9 of 11 in the series Python Data Structures Tutorial

Prerequisites

To learn about Merge Sort, you must know:

  1. Python 3
  2. Python data structures - Lists
  3. Recursion

What is Merge Sort?

We are in the fourth tutorial of the sorting series. The previous tutorials cover Bubble Sort, Insertion Sort, and Selection Sort. If you haven’t read these, please do as we will be building off of those concepts. Like all sorting algorithms, we consider a list to be sorted only if it is in ascending order. Descending order is considered the worst unsorted case.

Merge sort is very different than the other sorting techniques we have seen so far. Merge Sort can be used to sort an unsorted list or to merge two sorted lists.

Sort an unsorted list

The idea is to split the unsorted list into smaller groups until there is only one element in a group. Then, group two elements in the sorted order and gradually build the size of the group. Every time the merging happens, the elements in the groups must be compared one by one and combined into a single list in the sorted order. This process continues till all the elements are merged and sorted. Note that when the regrouping happens the sorted order must always be maintained.

Let us see this using an example, alist = [5,9,1,2,7,0]

Splitting

Step 1: [5,9,1] [2,7,0]

Step 2: [5] [9,1] [2] [7,0]

Step 3: [5] [9] [1] [2] [7] [0]

Note: In Step 2, we are dealing with an odd number of elements in the group so we are arbitrarily splitting them. So, [5,9] [1] [2,7] [0] is also correct.

Merging

Step 4: [5,9] [1,2] [0,7]

Step 5: [1,2,5,9][0,7]

Step 6: [0,1,2,5,7,9]

See this animation for better understanding.

Merge two sorted lists

You might think that alternatively taking elements from the two sorted lists and putting it together will produce one single sorted list. This is a very wrong idea. Let us see why.

a =  [1,3,4,9]  b = [2,5,7,8] 

On merging the two lists as discussed above this is what we get: [1,2,3,5,4,7,9,8]

Clearly, the merged list is not sorted. So we need to compare the elements before making one big sorted list. Let us see this with an example.

a =  [1,3,4,9]  b = [2,5,7,8] 

Step 1: 1<2 new list → [1] a =  [3,4,9]  b = [2,5,7,8] 

Step 2: 3>2  new list → [1,2] a =  [3,4,9]  b = [5,7,8] 

Step 3: 3<5 new list → [1,2,3] a =  [4,9]  b = [5,7,8] 

Step 4: 4<5 new list → [1,2,3,4] a = [9] b = [5,7,8]

Step 5: 9>5 new list → [1,2,3,4,5] a = [9] b = [7,8]

Step 6:9>7 new list → [1,2,3,4,5,7] a = [9] b = [8]

Step 7: 9>8 new list → [1,2,3,4,5,7,8] a = [9] b = []

Step 8: new list → [1,2,3,4,5,7,8,9]

How to implement Merge Sort?

Now that you have a fair understanding of what Merge Sort is, let us take a look at the algorithm of how to sort a list and its code.

Algorithm

  1. Split the unsorted list into groups recursively until there is one element per group
  2. Compare each of the elements and then group them 
  3. Repeat step 2 until the whole list is merged and sorted in the process

Code

def mergeSort(alist):

   print("Splitting ",alist)

   if len(alist)>1:
       mid = len(alist)//2
       lefthalf = alist[:mid]
       righthalf = alist[mid:]

       #recursion
       mergeSort(lefthalf)
       mergeSort(righthalf)

       i=0
       j=0
       k=0

       while i < len(lefthalf) and j < len(righthalf):
           if lefthalf[i] < righthalf[j]:
               alist[k]=lefthalf[i]
               i=i+1
           else:
               alist[k]=righthalf[j]
               j=j+1
           k=k+1

       while i < len(lefthalf):
           alist[k]=lefthalf[i]
           i=i+1
           k=k+1

       while j < len(righthalf):
           alist[k]=righthalf[j]
           j=j+1
           k=k+1

   print("Merging ",alist)

alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist)
print(alist)

Code courtesy of Interactive Python.

Conclusion

Merge Sort works both with a large and small number of elements making it more efficient than Bubble, Insertion and Selection Sort. This comes at a price since Merge Sort uses additional space to produce a sorted list. The worst case runtime complexity of Merge Sort is o(nlog(n)) and the space complexity is n. Try to merge two sorted lists and comment below. That’s it for this tutorial. Happy Pythoning!

To Practice: Try this interactive course on the basics of Lists, Functions, Packages and NumPy in Python.