Common data structures

Algorithm diagram

Binary search

1. For ordered sets, you can use

2. Search from the middle

def binary_search(list,item):
    low = 0
    high = len(list) - 1
    while low <= high:
        mid = (low+high)//2
        guess = list[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid-1
        else:
            low = mid+1
    return None

print(binary_search([1,3,4,5,6,7,8,9],4))

3. Only suitable for ordered sets

Large O representation

1. The large o notation refers to a speed that is not in seconds. The big O notation allows you to compare operands. It indicates the growth rate of algorithm running time.

2. Common large O indicates time:

2.1 O(log n), also known as log time, this algorithm includes binary search.
2.2 O(n), also known as linear time, such an algorithm includes simple search.
2.3 O(n * log n), such an algorithm includes the quick sort introduced in Chapter 4 - a faster sort algorithm.
2.4 O(n**2), such an algorithm includes the selective sorting introduced in Chapter 2 - a slower sorting algorithm.
2.5 O(n!), Such an algorithm includes the solution to the traveling salesman problem that will be introduced next - a very slow algorithm

The difference between array and linked list

Array: in memory, elements need to be connected. Once the added elements exceed the obtained memory, they need to be expanded and then reallocated. If a lot of space is prepared in advance, it will become a waste of space. Therefore, the array is not suitable for adding element O(n). Array also has great advantages. When looking for elements, you can find them directly, and the complexity is O(1)

Linked list: it will not be a waste of space. As long as there is a location, it can be stored. One element needs to mark the location O(1) of the next element. However, when searching, you can only view it step by step from the beginning. The time is O(n)

Select sort

1. Large o representation: O(n**2)

1. If an unordered array is given, sort the array. First, cycle the array n times to find the smallest one, then put it into a new array, and then continue to cycle n times to find the second smallest one, and continue to cycle until the order is taken, so the time complexity is O(n**2)

def findSmallest(arr):
    smallest = arr[0]
    smallest_index = 0
    for i in range(1,len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_index = i
    return smallest_index

def selectionSort(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr

print(selectionSort([5,3,6,2,10]))

recursion

Stack

The difference between queue and queue is that queue is FIFO and stack is FIFO. It's like washing dishes. The first dish to be washed is placed at the bottom. When you use it, you take it from the top, that is, the last plate you put in.

Function is called call stack.

Recursive call stack

For example, to find the factorial of a number, the simplest recursive method is:

def fact(x):
    if x == 1:
        return x
    else:
        return x * fact(x-1)
    

Quick sort (quick sort)

Fast sorting is mainly to divide an array into two groups greater than this number and less than this number according to a random number, and then use recursion to sort

Quick steps:

(1) Select the reference value.
(2) Divide the array into two subarrays: elements less than the reference value and elements greater than the reference value.
(3) Quickly sort the two subarrays.

For example, to sort a queue:

def quicksort(array):
    #The baseline condition must be set, otherwise there will be endless calls
 if len(array) < 2:
     return array
 else:
     pivot = array[0]
     less = [i for i in array[1:] if i <= pivot]
     greater = [i for i in array[1:] if i>= pivot]
     return quicksort(less) + [pivot] + quicksort(greater)
 

Hash table

1. It is called dictionary in python and hash in ruby, for example: {'name': 'Lisi'}

Hash function

1. Hash function "map input to number".

2. Hash function is not irregular. On the contrary, hash function must meet certain requirements before it can be called a qualified hash function. Simple requirements:

 it must be consistent. For example, suppose you get 4 when you enter apple, then you get 4 every time you enter apple
Must be 4. If not, the hash table will be useless.
 it should map different inputs to different numbers. For example, if a hash function returns 1 regardless of the input, it is not a good hash function. Ideally, map different inputs to different numbers.

Breadth first search

Brief introduction:

If you're looking for a fruit seller, you're going to look among your friends.

[the external chain image transfer fails, and the source station may have an anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-qqmri8ga-1613276487839) (C: \ users \ JKX \ appdata \ roaming \ typora \ typora user images \ 1611748550842. PNG)]

First, find your first level of friends. If you don't find your friends again. The first layer of friends becomes a one-time relationship, and then it accumulates, such as two-layer relationship, three-layer relationship and so on.

The relationship weight of one layer is the highest, which is searched first, and then searched layer by layer.

This is called breadth first search.

Simply implement this case:

#Create a friend relationship
friend = {}
friend["you"] = ["alice", "bob", "claire"]
friend["bob"] = ["anuj", "peggy"]
friend["alice"] =  ["peggy"]
friend["claire"] = ["thom", "jonny"]
friend["anuj"] = []
friend["peggy"] = []
friend["thom"] = []
friend["jonny"] = []
#Create a queue
from collections import deque
def search_shuiguo():
    search_queue = deque()
    search_queue += friend["you"]
    while search_queue:
        #Throw the value to the left of the queue
        person = search_queue.popleft()
        if person_is_shuiguo(person):
            print(person + "It's a fruit seller")
            return True
        else:
            search_queue += friend[person]
    return False

def person_is_shuiguo(name):
    return name[-1] == "m"

Queue is first in first out, which is very suitable for breadth first search.

Dixtra algorithm

Brief introduction:

Dixtra algorithm is simply breadth first search plus weight.

For example:

[the external chain image transfer fails, and the source station may have an anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-ojcxbknf-1613276487841) (C: \ users \ JKX \ appdata \ roaming \ typora \ typora user images \ 1611750380705. PNG)]

As shown in the figure, you have to calculate the distance from the twin peaks to the Golden Gate Bridge. Breadth first search can be used.

First calculate the first step away from yourself, two steps away from yourself and three steps away. The nearest distance can be calculated. However, this does not take into account that the length of each distance is different, the tools are different, and the time will be different. At this time, dixtra algorithm can be used.

Example: [the external chain image transfer fails, and the source station may have an anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-rklwyvsn-1613276487842) (C: \ users \ JKX \ appdata \ roaming \ typora \ typora user images \ 1611750617931. PNG)]

The dixtra algorithm consists of four steps:

(1) Find the "Cheapest" node, that is, the node that can arrive in the shortest time.
(2) The cost of updating the neighbors of the node will be described later.
(3) Repeat this process until this is done for each node in the graph.
(4) Calculate the final path.

For example, find the shortest formation from seven points to the end point.

[the external chain image transfer fails, and the source station may have anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-pkgkxp17-1613276487844) (C: \ users \ JKX \ appdata \ roaming \ typora \ typora user images \ 1611843621124. PNG)]

Create three hash tables

[the external chain image transfer fails, and the source station may have an anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-k7vx8uit-1613276487845) (C: \ users \ JKX \ appdata \ roaming \ typora \ typora user images \ 1611843679714. PNG)]

#Create a distance hash from a neighbor
graph = {}
graph["start"] = {}
graph["start"]["a"] = 6
graph["start"]["b"] = 2
graph["a"] = {}
graph["a"]["fin"] = 1
graph["b"] = {}
graph["b"]["a"] = 3
graph["b"]["fin"] = 5
graph["fin"] = {}

#Create an expense table from the starting point to each location
#Represents infinity
infinity = float("inf")
costs = {}
costs["a"] = 6
costs["b"] = 2
#Because I'm not my neighbor, I don't know how far it will be, so I set infinity
costs["fin"] = infinity

#Stores the hash of the parent node
parents = {}
parents["a"] = "start"
parents["b"] = "start"
parents["fin"] = None

#Create an array to record the processed nodes
processed = []
#Find the node with the least overhead
def find_lowest_cost_node(costs):
    lowest_cost = float("inf")
    lowest_cost_node = None
    for node in costs:
        cost = costs[node]
        if cost < lowest_cost and node not in processed:
            lowest_cost = cost
            lowest_cost_node = node
    return lowest_cost_node
#Find the node with the least overhead among the unprocessed nodes
node = find_lowest_cost_node(costs)
while node is not None:
    cost = costs[node]
    neighbors = graph[node]
    for n in neighbors.keys():
        new_cost = cost + neighbors[n]
        if costs[n] > new_cost:
            parents[n] = node

    processed.append(node)
    node = find_lowest_cost_node(costs)

greedy algorithm

Greedy algorithm is a very simple problem-solving strategy. It may not be the optimal solution. But it is also a solution. It is a convenient solution, whether it is a fast solution or not

Suppose you run a radio program that should be heard by listeners in 50 states across the United States. To do this, you need to decide which stations to broadcast on. You have to pay for broadcasting on every radio station, so you try to broadcast on as few radio stations as possible.

Greedy algorithm solution:

(1) Select a radio station that covers the most uncovered states. Even if this radio station covers some already covered
It doesn't matter.
(2) Repeat the first step until all States are covered.

def find_states():
    # State to be covered
    states_needed = set(["mt", "wa", "or", "id", "nv", "ut", "ca", "az"])
    # List of broadcasting stations
    stations = {}
    stations["kone"] = set(["id", "nv", "ut"])
    stations["ktwo"] = set(["wa", "id", "mt"])
    stations["kthree"] = set(["or", "nv", "ca"])
    stations["kfour"] = set(["nv", "ut"])
    stations["kfive"] = set(["ca", "az"])

    final_stations = set()
    while states_needed:
        best_station = None
        states_covered = set()
        for station,states in stations.items():
            covered = states & states_needed
            if len(covered) > len(states_covered):
                best_station = station
                states_covered = covered
        states_needed -= states_covered
        final_stations.add(best_station)
    return final_stations


print(find_states())

Tags: Python data structure Algorithm quick sort

Posted by dgiberson on Tue, 19 Apr 2022 07:25:30 +0930