
Summary of ten classical sorting algorithms

1. General overview

| Sorting algorithm | Average time | Best case | Worst case | Space complexity | In-place? | Stable? |
| --- | --- | --- | --- | --- | --- | --- |
| Bubble sort | O(n²) | O(n) | O(n²) | O(1) | In-place | Stable |
| Selection sort | O(n²) | O(n²) | O(n²) | O(1) | In-place | Unstable |
| Insertion sort | O(n²) | O(n) | O(n²) | O(1) | In-place | Stable |
| Shell sort | O(n log n) | O(n log² n) | O(n log² n) | O(1) | In-place | Unstable |
| Merge sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Out-of-place | Stable |
| Quick sort | O(n log n) | O(n log n) | O(n²) | O(log n) | In-place | Unstable |
| Heap sort | O(n log n) | O(n log n) | O(n log n) | O(1) | In-place | Unstable |
| Counting sort | O(n+k) | O(n+k) | O(n+k) | O(k) | Out-of-place | Stable |
| Bucket sort | O(n+k) | O(n+k) | O(n²) | O(n+k) | Out-of-place | Stable |
| Radix sort | O(n×k) | O(n×k) | O(n×k) | O(n+k) | Out-of-place | Stable |

2. The algorithms in detail

1. Bubble sort

Repeatedly traverse the sequence to be sorted, comparing adjacent elements and swapping them if they are in the wrong order.

Bubble sort is the simplest sorting method. In each pass it "bubbles" the largest remaining number to the end of the sequence. Because comparisons happen only between adjacent elements, a large number that starts near the front must be swapped many times, which is where the cost comes from.

Its advantages are low space complexity and stability.

# bubbleSort.py

from genRand import generateRandom

def bubbleSort(arr):
    for i in range(1, len(arr)):
        swapped = False
        # After i passes, the last i elements are already in place
        for j in range(len(arr) - i):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = True
        # No swaps means the array is already sorted: best case O(n)
        if not swapped:
            break
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    bubbleSort(arr)
    print(arr)

2. Selection sort

Selection sort takes O(n²) time no matter what the input looks like, so it is best suited to small inputs.

Its main idea is that each pass selects the smallest element of the unsorted part and moves it to the front of that part.

In each outer loop, keep a minimum index that records the position of the smallest element seen so far; after the inner loop ends, swap the element at position i with the element at the minimum index. This keeps the number of swaps down to at most one per pass.

The disadvantage is that this long-distance swap can move an element past equal elements, disturbing their relative order, so selection sort is not stable.

from genRand import generateRandom

def selectionSort(arr):
    for i in range(len(arr)-1):
        # Record the index of the smallest number
        minIndex = i
        for j in range(i+1, len(arr)):
            if arr[minIndex] > arr[j]:
                minIndex = j
        # When i is not the minimum number, i is exchanged with the minimum number
        if i != minIndex:
            arr[i], arr[minIndex] = arr[minIndex], arr[i]
        # print(i, arr)
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    arr = selectionSort(arr)
    print(arr)

3. Insertion sort

Insertion sort is very efficient on small or nearly sorted data.

It works the way most people sort a hand of playing cards: for each unsorted element, scan the sorted part from back to front, find the right position, and insert it there.

Each outer loop fixes the position of one element: the inner loop walks backwards through the sorted prefix, shifting every element larger than the current one right by one slot, and drops the current element into the gap it leaves. When the outer loop finishes, the array is sorted.

Like bubble sort, insertion sort has low space complexity and is stable.

from genRand import generateRandom

def insertSort(arr):
    # The first element is trivially sorted, so start from index 1
    for i in range(1, len(arr)):
        preIndex = i - 1
        current = arr[i]
        # Shift larger elements of the sorted prefix one slot right
        while preIndex >= 0 and arr[preIndex] > current:
            arr[preIndex+1] = arr[preIndex]
            preIndex -= 1
        arr[preIndex+1] = current
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    arr = insertSort(arr)
    print(arr)

4. Shell sort

Shell sort is a more efficient, improved version of insertion sort. It is simple to implement and performs well on medium-sized data.

Shell sort is based on two observations about insertion sort:

  1. Insertion sort is efficient on data that is almost sorted, where it approaches linear time
  2. Insertion sort is generally inefficient otherwise, because it can only move data one position at a time

The basic idea: split the whole sequence into several gapped subsequences and insertion-sort each of them; once the sequence is "basically ordered", finish with a plain insertion sort over all the records.

from genRand import generateRandom

def shellSort(arr):
    # Start with a gap of len/2 and halve it each round
    k = 2
    gap = len(arr) // k
    while gap >= 1:
        # Gapped insertion sort: each pass sorts the subsequences
        # arr[i], arr[i-gap], arr[i-2*gap], ...
        for i in range(gap, len(arr)):
            j = i
            while j >= gap and arr[j-gap] > arr[j]:
                arr[j], arr[j-gap] = arr[j-gap], arr[j]
                j -= gap
        gap = gap // k
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    print(shellSort(arr))

5. Merge sort

Merge sort is an efficient sorting algorithm built on the merge operation. It is a typical divide-and-conquer algorithm: split the sequence in half, sort each half recursively, then merge the two sorted halves.

from genRand import generateRandom

def mergeSort(arr):
    if len(arr)<2:
        return arr
    middle = len(arr) // 2
    left, right = arr[:middle], arr[middle:]
    return merge(mergeSort(left), mergeSort(right))

def merge(left, right):
    # Merge two sorted lists with index pointers; avoids list.pop(0),
    # which is O(n) per call and would make the merge quadratic
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    print(mergeSort(arr))

6. Quick sort

On average, quicksort needs O(n log n) comparisons to sort n items. In practice it is usually noticeably faster than other O(n log n) algorithms, because its inner loop can be implemented efficiently on most architectures.

Quick sort is a classic application of divide and conquer to sorting. In essence, it can be seen as a recursive, divide-and-conquer refinement of bubble sort.

from genRand import generateRandom

def quickSort(arr, left=None, right=None):
    # Default to sorting the entire array
    left = 0 if left is None else left
    right = len(arr)-1 if right is None else right
    if left<right:
        partitionIndex = partition(arr, left, right)
        quickSort(arr, left, partitionIndex-1)
        quickSort(arr, partitionIndex+1, right)
    return arr

def partition(arr, left, right):
    pivot = left
    index = pivot + 1
    i = index
    while i <= right:
        if arr[i] < arr[pivot]:
            arr[i], arr[index] = arr[index], arr[i]
            index += 1
        i += 1
    arr[pivot], arr[index - 1] = arr[index - 1], arr[pivot]
    return index-1

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    print(quickSort(arr))

7. Heap sort

A heap is a complete-binary-tree-like structure that satisfies the heap property: the key of each node is always greater than or equal to (in a min-heap, less than or equal to) the keys of its children. Heap sort can be seen as a selection sort that uses the heap structure to find the extreme element efficiently.

from genRand import generateRandom

def buildMaxHeap(arr):
    # Sift down every non-leaf node, from the last one up to the root
    for i in range(len(arr)//2, -1, -1):
        heapify(arr, i)

def heapify(arr, i):
    # Restore the max-heap property for the subtree rooted at i,
    # looking only at the first arrLen elements (the live heap)
    left = 2*i + 1
    right = 2*i + 2
    largest = i
    if left < arrLen and arr[left] > arr[largest]:
        largest = left
    if right < arrLen and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, largest)

def heapSort(arr):
    # arrLen tracks the live heap size; heapify reads it as a global
    global arrLen
    arrLen = len(arr)
    buildMaxHeap(arr)
    for i in range(len(arr)-1, 0, -1):
        # Move the current maximum to the end, shrink the heap, re-heapify
        arr[i], arr[0] = arr[0], arr[i]
        arrLen -= 1
        heapify(arr, 0)
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    print(heapSort(arr))

8. Counting sort

The core of counting sort is to use each input value as a key, i.e. an index into an auxiliary count array, and tally how many times each value occurs; the sorted output is then read off by scanning the counts in order. It only works for values that map to integers in a known range.

from genRand import generateRandom

def countingSort(arr, maxValue):
    # One counter per possible value in 0..maxValue
    bucketLen = maxValue + 1
    bucket = [0]*bucketLen
    sortedIndex = 0
    # Tally occurrences of each value
    for i in range(len(arr)):
        bucket[arr[i]] += 1
    # Write the values back in ascending order
    for j in range(bucketLen):
        while bucket[j] > 0:
            arr[sortedIndex] = j
            sortedIndex += 1
            bucket[j] -= 1
    return arr

if __name__ == "__main__":
    arr = generateRandom(15)
    print(arr)
    print(countingSort(arr, 20))

9. Bucket sort

Bucket sort is an upgraded version of counting sort. It relies on a mapping function from values to buckets, and efficiency hinges on choosing that function well. To make bucket sort fast, two things matter:

  1. When extra space allows, use more buckets
  2. The mapping function should spread the n input values evenly across the k buckets

It is similar to counting sort, except that each bucket holds a range of values, which is then sorted within the bucket.

At the same time, the comparison sort chosen to order the elements inside each bucket has a direct impact on performance.

The code will not be repeated here.
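Although the article omits the code, a minimal sketch might look like the following. The `bucketSize` parameter and the `(v - lo) // bucketSize` mapping are illustrative choices, not from the original, and Python's built-in `sorted` stands in for the per-bucket comparison sort:

```python
def bucketSort(arr, bucketSize=5):
    # Assumes integer input; bucketSize controls the value range per bucket
    if not arr:
        return arr
    lo, hi = min(arr), max(arr)
    # Map each value v to bucket (v - lo) // bucketSize
    count = (hi - lo) // bucketSize + 1
    buckets = [[] for _ in range(count)]
    for v in arr:
        buckets[(v - lo) // bucketSize].append(v)
    result = []
    for b in buckets:
        # Sort within each bucket, then concatenate in bucket order
        result.extend(sorted(b))
    return result
```

With a well-spread input, each bucket stays small and the per-bucket sorts are cheap; a skewed input that lands everything in one bucket degrades to the inner sort's complexity.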

10. Radix sort

Radix sort is a non-comparative integer sorting algorithm. Its principle is to split each integer into digits by position and sort on one digit position at a time. Since strings (such as names or dates) and floating-point numbers in specific formats can also be expressed as digit sequences, radix sort is not limited to integers.

There are two variants of radix sort: LSD (least significant digit first) and MSD (most significant digit first).

These three sorting algorithms all use the concept of a bucket, but they use buckets in clearly different ways:

  • Radix sort: allocates buckets by each digit of the key;
  • Counting sort: each bucket stores a single key value;
  • Bucket sort: each bucket stores a range of values;

The code will not be repeated here.
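As with bucket sort, the article omits the code; a minimal LSD sketch, assuming non-negative decimal integers (the per-digit bucketing scheme below is an illustrative choice), could be:

```python
def radixSort(arr):
    # LSD radix sort: process one decimal digit per pass, least
    # significant first; assumes non-negative integers
    if not arr:
        return arr
    digit = 1
    while max(arr) // digit > 0:
        # Ten buckets, one per decimal digit value 0..9
        buckets = [[] for _ in range(10)]
        for v in arr:
            buckets[(v // digit) % 10].append(v)
        # Stable collection preserves the order from earlier passes
        arr = [v for b in buckets for v in b]
        digit *= 10
    return arr
```

The stability of each pass is what makes this work: after the final pass on the most significant digit, ties on that digit are already ordered by the lower digits.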

Appendix

Finally, the genRand.py helper used throughout:

# genRand.py
import random

def generateRandom(num):
    arr = []
    for i in range(0, num):
        arr.append(random.randint(0, 20))
    return arr

Tags: Python Algorithm

Posted by day_tripperz on Sun, 17 Apr 2022 18:45:31 +0930