1. Introduction to Algorithms
1.1 Binary Search

For a list of n elements, binary search requires at most log₂ n steps, while simple search requires at most n steps.

Binary search only works when the list is sorted.

Code implementation:
```python
def binary_search(list, item):
    low = 0
    high = len(list) - 1
    while low <= high:
        mid = (low + high) // 2  # round down to the middle index
        guess = list[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid - 1
        else:
            low = mid + 1
    return None

my_list = [1, 3, 5, 7, 9]
print(binary_search(my_list, 3))  # 1
```

Runtime: O(log n) (logarithmic time)
1.2 Big O Notation

Big O notation is a special notation that indicates how fast an algorithm is.

Big O notation lets you compare the number of operations, which indicates how quickly the algorithm's running time grows.

Big O notation indicates the worst-case running time.

Common Big O runtimes:

- O(log n): logarithmic time; example: binary search.
- O(n): linear time; example: simple search.
- O(n log n): example: quicksort, a fast sorting algorithm.
- O(n²): example: selection sort, a slower sorting algorithm.
- O(n!): example: the solution to the traveling salesman problem, a very slow algorithm.

| Big O runtime | Example algorithm |
| --- | --- |
| O(1) (constant time) | hash table lookup |
| O(log n) (logarithmic time) | binary search |
| O(n) (linear time) | simple search |
| O(n log n) | quicksort |
| O(n²) | selection sort |
| O(n!) | traveling salesman problem |


Takeaways:
- The speed of an algorithm is measured not in seconds but in how the number of operations grows.
- When we talk about an algorithm's speed, we mean how quickly its running time grows as the input grows.
- The running time of an algorithm is expressed in Big O notation.
- O(log n) is faster than O(n), and the more elements there are to search, the greater the former's advantage.
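The last point can be sketched numerically with worst-case step counts for the two searches; the helper names below are illustrative, not from the text:

```python
import math

def simple_steps(n):
    return n  # simple search: may have to check every element

def binary_steps(n):
    return math.ceil(math.log2(n))  # binary search: halves the range each step

# For 1,000,000 elements: 1,000,000 steps vs. only 20.
```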
2. Selection Sort
2.1 Arrays and Linked Lists

Array: all elements are stored contiguously (side by side) in memory.

Linked list: each element stores the address of the next element, stringing together a series of arbitrary memory addresses.

Array and linked list runtimes:

| Operation | Array | Linked list |
| --- | --- | --- |
| Read | O(1) | O(n) |
| Insert | O(n) | O(1) |
| Delete | O(n) | O(1) |
Index: the position of an element is called its index.

Insert: with a linked list, inserting an element is simple: just change the address the preceding element points to. With an array, you must shift back all the elements that follow. Linked lists are the better choice when you need to insert elements into the middle.

Delete: linked lists are also the better choice when deleting elements.
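The O(1)-insert, O(n)-read trade-off can be sketched with a minimal singly linked list; the `Node` class and helper names here are illustrative, not from the text:

```python
class Node:
    """One element of a linked list: a value plus a reference to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(node, value):
    # O(1): only the preceding node's pointer changes.
    node.next = Node(value, node.next)

def read(head, index):
    # O(n): reading requires walking from the head.
    node = head
    for _ in range(index):
        node = node.next
    return node.value

head = Node(1, Node(3, Node(5)))  # the list 1 -> 3 -> 5
insert_after(head, 2)             # now 1 -> 2 -> 3 -> 5
```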
2.2 Selection Sort

Walk through the list to find the smallest element and append it to a new list. Repeat with the remaining elements and you end up with a sorted list.

Runtime: O(n²)

Code implementation:
```python
# Find the index of the smallest element
def findSmallest(arr):
    smallest = arr[0]
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_index = i
    return smallest_index

# Repeatedly move the smallest element into a new array
def selectionSort(arr):
    newArr = []
    for i in range(len(arr)):
        smallest = findSmallest(arr)
        newArr.append(arr.pop(smallest))
    return newArr

print(selectionSort([5, 3, 6, 2, 10]))  # [2, 3, 5, 6, 10]
```
3. Recursion
- Recursion: a function calling itself.
- Recursive case: the condition under which the function calls itself.
- Base case: the condition under which the function no longer calls itself, thereby avoiding an infinite loop.
- Stack: a last-in, first-out data structure.
- Recursive calls use the call stack.
- Stack operations: push and pop.
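A minimal sketch of the base case and recursive case, using factorial as the classic example:

```python
def factorial(n):
    if n <= 1:
        return 1                 # base case: stop calling ourselves
    return n * factorial(n - 1)  # recursive case: call with a smaller input

print(factorial(5))  # 120
```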
4. Quicksort
4.1 Divide and Conquer (D&C)

A well-known recursive technique for solving problems.

Steps:
- Figure out the base case, which should be as simple as possible.
- Figure out how to reduce the problem until it reaches the base case.
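The two steps above can be sketched by summing a list recursively: the base case is the empty list, and each call shrinks the list toward it (a textbook-style illustration, not production code):

```python
def dc_sum(arr):
    if not arr:
        return 0                     # base case: the empty list sums to 0
    return arr[0] + dc_sum(arr[1:])  # shrink the problem by one element
```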
4.2 Quicksort

Steps:
- Pick an element from the array, called the pivot; the first element of the array is commonly used.
- Partition: find the elements smaller than the pivot and the elements larger than the pivot. This gives:
  - a subarray of all the numbers less than the pivot
  - the pivot
  - a subarray of all the numbers greater than the pivot
- Recursively quicksort the two subarrays.

Code implementation:
```python
def quicksort(array):
    if len(array) < 2:
        return array  # base case: arrays with 0 or 1 element are already sorted
    else:
        pivot = array[0]
        less = [i for i in array[1:] if i <= pivot]
        greater = [i for i in array[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([10, 5, 2, 3]))  # [2, 3, 5, 10]
```
4.3 Big O Notation Revisited
- Quicksort's performance depends heavily on the pivot you choose: O(n log n) on average, but O(n²) in the worst case.
5. Hash Tables
5.1 Hash Functions

A hash function maps an input to a number.

Requirements:
- Consistency: the same input always maps to the same number.
- Ideally, different inputs map to different numbers.

A hash function always maps the same input to the same index.

A good hash function maps different inputs to different indices.

A hash function knows how big the array is and returns only valid indices.
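A toy sketch of these requirements using Python's built-in `hash()`; the function name is illustrative:

```python
def hash_index(key, size):
    # Reduce the hash to a valid index for an array with `size` slots.
    return hash(key) % size

# Same input -> same index (within one interpreter run), and the
# result is always in range 0..size-1.
```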
5.2 Hash Tables

Hash table: also known as a hash map, map, dictionary, or associative array.

Python's built-in hash table is the dictionary (dict).

Applications of hash tables:
- Modeling mapping relationships
- Preventing duplicates
- Caching
  - Benefits: users see web pages faster, and the server does less work.
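A sketch of the caching idea with a plain dict; `get_page` and `make_page` are hypothetical names standing in for real server logic:

```python
cache = {}

def make_page(url):
    # Stand-in for expensive work (rendering, database queries, ...)
    return "<html>" + url + "</html>"

def get_page(url):
    if url in cache:
        return cache[url]  # cache hit: serve the stored page instantly
    page = make_page(url)  # cache miss: do the work once
    cache[url] = page      # remember the result for next time
    return page
```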

Collision: two keys are assigned the same slot.
- Lessons:
  - The hash function matters: ideally it maps keys evenly across all the slots of the hash table.
  - If a slot's linked list grows too long, the hash table slows down dramatically.

Performance: hash tables are very fast at lookups, insertions, and deletions.
| Operation | Hash table (average) | Hash table (worst case) | Array | Linked list |
| --- | --- | --- | --- | --- |
| Read | O(1) | O(n) | O(1) | O(n) |
| Insert | O(1) | O(n) | O(n) | O(1) |
| Delete | O(1) | O(n) | O(n) | O(1) |
5.3 Load Factor
- Load factor = number of elements / number of slots in the hash table.
- Load factor = 1: in the ideal case, each element gets its own slot.
- Load factor > 1: there are more elements than slots in the array.
- Once the load factor starts to grow, you need to add slots to the hash table; this is called resizing.
- The lower the load factor, the less likely collisions are, and the better the hash table performs.
- Rule of thumb: once the load factor exceeds 0.7, resize the hash table.
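The arithmetic above as a tiny sketch (the helper names are illustrative):

```python
def load_factor(num_elements, num_slots):
    return num_elements / num_slots

def needs_resize(num_elements, num_slots):
    # Rule of thumb from above: resize once the load factor exceeds 0.7.
    return load_factor(num_elements, num_slots) > 0.7
```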
6. Breadth-first Search
6.1 Shortest Path Problem
- Steps:
  - Model the problem as a graph.
  - Solve the problem using breadth-first search.
6.2 Breadth-first Search

A search algorithm for graphs.

Applications:
- First type of question: is there a path from node A to node B?
- Second type of question: which path from node A to node B is the shortest (the path with the fewest segments, not necessarily the fastest)?

Finding the shortest path:
- During breadth-first search, the search radiates outward from the start: first-degree connections are checked before second-degree connections.
- This works only if nodes are searched in the order they were added.

Queue:

Operations: enqueue and dequeue.

Order: FIFO (first in, first out).

| Data structure | Order |
| --- | --- |
| Stack | LIFO |
| Queue | FIFO |
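FIFO ordering can be sketched with Python's `collections.deque`, which the search implementation below also uses:

```python
from collections import deque

queue = deque()
queue.append("first")        # enqueue at the back
queue.append("second")
first_out = queue.popleft()  # dequeue from the front: FIFO order
```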


Implementing the graph:
- Directed graph: relationships are one-way (shown with arrows).
- Undirected graph: no arrows; directly connected nodes are related to each other.

How the algorithm works:
- Create a queue holding the people to check.
- Pop a person off the queue.
- Check whether this person is the target.
- If not, add all of this person's neighbors to the queue and repeat.

Algorithmic implementation:
```python
from collections import deque

graph = {}
graph["you"] = ["alice", "bob", "claire"]
graph["bob"] = ["anuj", "peggy"]
graph["alice"] = ["peggy"]
graph["claire"] = ["thom", "jonny"]
graph["anuj"] = []
graph["peggy"] = []
graph["thom"] = []
graph["jonny"] = []

def person_is_seller(name):
    # Toy test: a "mango seller" is anyone whose name ends in 'm'
    return name[-1] == 'm'

def search(name):
    search_queue = deque()
    search_queue += graph[name]
    searched = []  # people already checked, to avoid infinite loops
    while search_queue:
        person = search_queue.popleft()
        if person not in searched:
            if person_is_seller(person):
                print(person + " is a mango seller!")
                return True
            else:
                search_queue += graph[person]
                searched.append(person)
    return False

search("you")
```

Runtime: O(V + E), where V is the number of vertices and E is the number of edges.
7. Dijkstra's Algorithm
7.1 Using Dijkstra's Algorithm
- Goal: find the fastest (lowest-cost) path.
- Steps:
  - Find the "cheapest" node, i.e. the one reachable at the lowest cost.
  - Update the costs of this node's neighbors.
  - Repeat until this has been done for every node in the graph.
  - Compute the final path.
7.2 Terminology
- Weight: Dijkstra's algorithm works with graphs in which each edge has an associated number, called its weight.
- Weighted graph: a graph with weights. Use Dijkstra's algorithm to compute the shortest path in a weighted graph.
- Unweighted graph: a graph without weights. Use breadth-first search to compute the shortest path in an unweighted graph.
7.3 Implementation
```python
# The graph: each node maps to its neighbors and the edge weights
graph = {}
graph["start"] = {}
graph["start"]["a"] = 6
graph["start"]["b"] = 2
graph["a"] = {}
graph["a"]["fin"] = 1
graph["b"] = {}
graph["b"]["a"] = 3
graph["b"]["fin"] = 5
graph["fin"] = {}  # the finish node has no neighbors

# The costs table: cheapest known cost to reach each node from the start
infinity = float("inf")
costs = {}
costs["a"] = 6
costs["b"] = 2
costs["fin"] = infinity

# The parents table: used to reconstruct the final path
parents = {}
parents["a"] = "start"
parents["b"] = "start"
parents["fin"] = None

processed = []  # nodes that have already been processed

def find_lowest_cost_node(costs):
    lowest_cost = float("inf")
    lowest_cost_node = None
    for node in costs:
        cost = costs[node]
        if cost < lowest_cost and node not in processed:
            lowest_cost = cost
            lowest_cost_node = node
    return lowest_cost_node

node = find_lowest_cost_node(costs)
while node is not None:
    cost = costs[node]
    neighbors = graph[node]
    for n in neighbors.keys():
        new_cost = cost + neighbors[n]
        if costs[n] > new_cost:  # found a cheaper path to this neighbor
            costs[n] = new_cost
            parents[n] = node
    processed.append(node)
    node = find_lowest_cost_node(costs)

print(costs["fin"])  # 6
```
8. Greedy Algorithms
8.1 Classroom Scheduling Problem
- Idea:
  - Pick the class that ends earliest; it is the first class to hold in this classroom.
  - Next, among the classes that start after the first class ends, again pick the one that ends earliest; it becomes the second class. Repeat.
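The greedy choice above can be sketched in code; the class list here is made-up illustrative data:

```python
classes = {
    "art":   (9.0, 10.0),   # (start hour, end hour)
    "eng":   (9.5, 10.5),
    "math":  (10.0, 11.0),
    "cs":    (10.5, 11.5),
    "music": (11.0, 12.0),
}

def schedule(classes):
    chosen = []
    last_end = 0.0
    # Consider classes in order of end time; greedily take each one that
    # starts no earlier than the previously chosen class ends.
    for name, (start, end) in sorted(classes.items(), key=lambda kv: kv[1][1]):
        if start >= last_end:
            chosen.append(name)
            last_end = end
    return chosen

print(schedule(classes))  # ['art', 'math', 'music']
```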
8.2 Knapsack Problem
- Idea:
  - Steal the most expensive item that fits in the knapsack.
  - Then steal the next most expensive item that still fits, and so on.
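A sketch with made-up items, showing both the greedy choice and its limitation:

```python
items = {                 # name: (weight in lb, value in $)
    "stereo": (30, 3000),
    "laptop": (20, 2000),
    "guitar": (15, 1500),
}

def greedy_pack(items, capacity):
    taken = []
    remaining = capacity
    # Take the most valuable item that still fits, then repeat.
    for name, (weight, value) in sorted(items.items(), key=lambda kv: -kv[1][1]):
        if weight <= remaining:
            taken.append(name)
            remaining -= weight
    return taken

print(greedy_pack(items, 35))  # ['stereo']
```

With a 35 lb knapsack the greedy choice takes only the $3000 stereo, although the laptop plus guitar would be worth $3500: greedy algorithms are simple and fast but not always optimal.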
8.3 Set Covering Problem

Approximation algorithm: when computing the exact solution would take too long, an approximation algorithm can be used.
- Judged by: how fast it is, and how close its answer is to the optimal solution.

Runtime: O(n²)

Code implementation:
```python
# States we want to cover
states_needed = set(["mt", "wa", "or", "id", "nv", "ut", "ca", "az"])

# Each station covers a set of states
stations = {}
stations["kone"] = set(["id", "nv", "ut"])
stations["ktwo"] = set(["wa", "id", "mt"])
stations["kthree"] = set(["or", "nv", "ca"])
stations["kfour"] = set(["nv", "ut"])
stations["kfive"] = set(["ca", "az"])

final_stations = set()
while states_needed:
    best_station = None
    states_covered = set()
    # Greedily pick the station covering the most still-uncovered states
    for station, states_for_station in stations.items():
        covered = states_needed & states_for_station
        if len(covered) > len(states_covered):
            best_station = station
            states_covered = covered
    final_stations.add(best_station)
    states_needed -= states_covered

print(final_stations)
```

Runtime comparison:

| Exact algorithm | Greedy algorithm |
| --- | --- |
| O(n!) | O(n²) |
8.4 Traveling Salesman Problem
- Problem: find the shortest path that passes through a specified set of points. This is an NP-complete problem.
9. K-Nearest Neighbors Algorithm
9.1 Building a Recommendation System

Feature extraction: distance (Pythagorean) formula for five features:
$\sqrt{(a_1-a_2)^2+(b_1-b_2)^2+(c_1-c_2)^2+(d_1-d_2)^2+(e_1-e_2)^2}$
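The formula above as code, with two made-up 5-dimensional feature vectors:

```python
import math

def distance(a, b):
    # Euclidean distance between two equal-length feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance((3, 4, 4, 1, 4), (4, 3, 5, 1, 5)))  # 2.0
```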

Classification: sorting into groups.

Regression: predicting a value.
9.2 Machine Learning

OCR (Optical Character Recognition):
- Training: go through many images of digits and extract the features of those digits.
- When you encounter a new image, extract its features and find its nearest neighbors.

Naive Bayes classifier

Forecasting the stock market