# In-depth understanding of skip table and its application in Redis

## foreword

The jump table can achieve the same time complexity O(logN) as the red-black tree, and the implementation is simple. The underlying data structure of the ordered collection object in Redis uses the jump table. Its author William Pu commented: Jumping linked list is a data structure that may replace balanced tree in many applications. This article will learn about the implementation of the jump table and its application in Redis. **

### 1. The basic concept of skip table

The skip list, namely the skip list (Skip List), is based on a parallel linked list data structure, and the operation efficiency can reach O(logN), which is friendly to concurrency. The schematic diagram of the skip list is shown below. The characteristics of the jump table can be summarized as follows.

• The jump list is a multi-level (level) linked list structure;

• Each layer in the jump list is an ordered linked list, and is arranged in ascending order (default) of elements;

• Which layer an element in the jump list will appear in is randomly determined, but as long as the element appears in the kth layer, then the linked list below the k layer will also appear in this element;

• The underlying linked list of the skip list contains all elements;

• The head node and tail node of the jump table do not store elements, and the number of layers of the head node and tail node is the maximum number of layers of the jump table;

• The nodes in the jump list contain two pointers, one pointer points to the next node of the linked list of the same layer, and the other pointer points to the node of the same element of the lower linked list.

Take the jump table in the figure above as an example. If you want to find element 71, the search process is shown in the figure below. Start searching from the head node of the top-level linked list. When finding the node of element 71, a total of 4 nodes have been traversed. It is necessary to traverse 7 nodes, so the jump table trades space for time, which shortens the time it takes to operate the jump table. Algorithms for skip lists have the same asymptotic expected time bounds as balanced trees, and are simpler, faster, and use less space. This data structure was invented by William Pugh (transliterated as William Pugh), and first appeared in his paper "Skip Lists: A Probabilistic Alternative to Balanced Trees" published in 1990. I found an author's paper on jumping tables on Google. If you are interested, it is strongly recommended to download and read:

The jump table uses a non-strict balance mechanism in the dynamic search process to make insertion and deletion more convenient and faster. This non-strict balance is based on probability, rather than the strict balance of the balanced tree. When it comes to non-strict balance, the first thing that comes to mind is the red-black tree RbTree. It also uses non-strict balance to avoid adjusting the tree structure like AVL. I won’t talk about red-black trees here. It seems that jumping tables is a similar way. But it's based on probability.

### 2. Jump table nodes

It is known that the nodes in the jump list need to have a pointer to the next node of the linked list of the current layer, and a pointer to the same element node of the lower linked list, so the nodes in the jump list are defined as follows.

```public class SkiplistNode {

public int data;
public SkiplistNode next;
public SkiplistNode down;
public int level;

public SkiplistNode(int data, int level) {
this.data = data;
this.level = level;
}```

The above is the simplest way to define the nodes in the jump list. The stored element data is an integer, and the size of the element data is directly compared when comparing nodes.

### 3. Jump table initialization

When the jump list is initialized, the head and tail nodes of each layer of the linked list are created and stored in a collection. The number of layers of the head and tail nodes is randomly specified, and the number of layers of the head and tail nodes represents the number of layers of the current jump list . After initialization, the jump table structure is as follows. The relevant code for jump table initialization is as follows.

```public LinkedList<SkiplistNode> headNodes;

public int curLevel;

public Random random;

public Skiplist() {
random = new Random();

//tailNodes is used to store the tail nodes of each layer

//When initializing the skip table, the number of layers of the skip table is randomly specified
curLevel = getRandomLevel();
//After specifying the initial random number of layers of the jump table, it is necessary to create the head node and tail node of each layer and build a good relationship
SkiplistNode head = new SkiplistNode(Integer.MIN_VALUE, 0);
SkiplistNode tail = new SkiplistNode(Integer.MAX_VALUE, 0);
for (int i = 0; i <= curLevel; i++) {

SkiplistNode tailNew = new SkiplistNode(Integer.MAX_VALUE, tail.level + 1);
tailNew.down = tail;

tail = tailNew;
}
}```

### 4. How to add jump table

When each element is added to the jump list, it is first necessary to randomly specify the number of layers of this element in the jump list. It is necessary to expand the number of layers of the jump table, and expanding the number of layers of the jump table is to expand the number of layers of the head and tail nodes. The following is an addition process that needs to expand the number of jump table layers.

In the initial state, the number of layers of the jump table is 2, as shown in the figure below. Now we want to add element 120 to the jump table, and the number of randomly assigned layers is 3, which is greater than the current layer number 2 of the jump table. At this time, we need to expand the number of layers of the jump table first, as shown in the figure below. Now we want to add element 120 to the jump table, and the number of randomly assigned layers is 3, which is greater than the current layer number 2 of the jump table. At this time, we need to expand the number of layers of the jump table first, as shown in the figure below. When inserting element 120 into the skip table, it starts from the top level and goes down one level, as shown in the figure below. The code of the add method of the jump table is as follows.

```public void add(int num) {
//Get the number of layers of the value added this time
int level = getRandomLevel();
//If the number of layers of the value added this time is greater than the number of layers of the current skip list
//You need to expand the number of jump table layers before adding the current value
if (level > curLevel) {
expanLevel(level - curLevel);
}

//curNode indicates the node corresponding to the num value in the current layer
SkiplistNode curNode = new SkiplistNode(num, level);
//preNode indicates the previous node of curNode in the current layer
SkiplistNode preNode = headNodes.get(curLevel - level);
for (int i = 0; i <= level; i++) {
//Traverse backwards from the head node of the current layer until a preNode is found
//such that preNode.data < num <= preNode.next.data
while (preNode.next.data < num) {
preNode = preNode.next;
}

//Insert curNode between preNode and preNode.next
curNode.next = preNode.next;
preNode.next = curNode;

//If the current layer is not 0, continue to add nodes to the next layer
if (curNode.level > 0) {
SkiplistNode downNode = new SkiplistNode(num, curNode.level - 1);
//curNode points to the node of the next layer
curNode.down = downNode;
//curNode moves down one layer
curNode = downNode;
}
//preNode moves down one layer
preNode = preNode.down;
}
}

private void expanLevel(int expanCount) {
SkiplistNode tail = tailNodes.getFirst();
for (int i = 0; i < expanCount; i++) {
SkiplistNode tailNew = new SkiplistNode(Integer.MAX_VALUE, tail.level + 1);
tailNew.down = tail;

tail = tailNew;

}
}```

### 5. Search method of jump table

When searching for an element in the jump list, you need to start from the top level and search down layer by layer. Follow the rules below when searching.

• When the target value is greater than the value of the next node of the current node, continue to search backward on the linked list of this layer;

• When the target value is greater than the value of the current node and less than the value of the next node of the current node, move down one level and search backward from the position of the same node in the lower linked list;

• The target value is equal to the current node value, and the search ends.

• The figure below is a schematic diagram of a search process. • The code for the search of the jump list is as follows.

```public boolean search(int target) {
//Start searching from the top layer, curNode represents the currently traversed node
while (curNode != null) {
if (curNode.next.data == target) {
//The node corresponding to the target value is found, and returns true at this time
return true;
} else if (curNode.next.data > target) {
//The next node value of curNode is greater than target
//Indicates that the target node is between curNode and curNode.next
//At this time, it is necessary to search for the lower layer
curNode = curNode.down;
} else {
//The next node value of curNode is less than target
//Indicates that the target node is behind the next node of curNode
//At this time, continue to search backwards in this layer
curNode = curNode.next;
}
}
return false;
}```

### 6. How to delete the jump table

When it is necessary to delete an element in the jump table, it is necessary to delete the nodes of this element in all layers. The specific deletion rules are as follows.

• First, search for the node to be deleted according to the search method of the jump table. If it can be found, the node to be deleted is located at the highest level of the node layer;

• From the highest level of the node to be deleted, delete the nodes to be deleted in each layer. The way of deletion is to make the node before the node to be deleted directly point to the node after the node to be deleted.

• The figure below is a schematic diagram of the deletion process. • The code for deleting the jump table is as follows.

```public boolean erase(int num) {
//The traversal process of deleting a node is the same as the traversal process of finding a node
//However, if the target node is found when deleting the node, the node deletion operation needs to be performed
while (curNode != null) {
if (curNode.next.data == num) {
//preDeleteNode indicates the previous node of the node to be deleted
SkiplistNode preDeleteNode = curNode;
while (true) {
//To delete the node to be deleted in the current layer is to make the previous node of the node to be deleted point to the next node of the node to be deleted
preDeleteNode.next = curNode.next.next;
//After the current layer is deleted, you need to continue to delete the nodes to be deleted in the next layer
//Here let preDeleteNode move down one layer
//After moving down one layer, preDeleteNode is not necessarily the previous node of the node to be deleted
preDeleteNode = preDeleteNode.down;

//If preDeleteNode is null, it means that the underlying node to be deleted has been deleted
//At this point, the deletion process ends and returns true
if (preDeleteNode == null) {
return true;
}

//After preDeleteNode moves down one layer, it needs to continue to traverse backwards from the current position
//Until a preDeleteNode is found, making the value of preDeleteNode.next equal to the target value
//At this time, preDeleteNode becomes the previous node of the node to be deleted
while (preDeleteNode.next.data != num) {
preDeleteNode = preDeleteNode.next;
}
}
} else if (curNode.next.data > num) {
curNode = curNode.down;
} else {
curNode = curNode.next;
}
}
return false;
}```

### 7. The complete code of the jump table

The complete code of the jump table is as follows.

```public class Skiplist {

public int curLevel;

public Random random;

public Skiplist() {
random = new Random();

//tailNodes is used to store the tail nodes of each layer

//When initializing the skip table, the number of layers of the skip table is randomly specified
curLevel = getRandomLevel();
//After specifying the initial random number of layers of the jump table, it is necessary to create the head node and tail node of each layer and build a good relationship
SkiplistNode head = new SkiplistNode(Integer.MIN_VALUE, 0);
SkiplistNode tail = new SkiplistNode(Integer.MAX_VALUE, 0);
for (int i = 0; i <= curLevel; i++) {

SkiplistNode tailNew = new SkiplistNode(Integer.MAX_VALUE, tail.level + 1);
tailNew.down = tail;

tail = tailNew;
}
}

public boolean search(int target) {
//Start searching from the top layer, curNode represents the currently traversed node
while (curNode != null) {
if (curNode.next.data == target) {
//The node corresponding to the target value is found, and returns true at this time
return true;
} else if (curNode.next.data > target) {
//The next node value of curNode is greater than target
//Indicates that the target node is between curNode and curNode.next
//At this time, it is necessary to search for the lower layer
curNode = curNode.down;
} else {
//The next node value of curNode is less than target
//Indicates that the target node is behind the next node of curNode
//At this time, continue to search backwards in this layer
curNode = curNode.next;
}
}
return false;
}

//Get the number of layers of the value added this time
int level = getRandomLevel();
//If the number of layers of the value added this time is greater than the number of layers of the current skip list
//You need to expand the number of jump table layers before adding the current value
if (level > curLevel) {
expanLevel(level - curLevel);
}

//curNode indicates the node corresponding to the num value in the current layer
SkiplistNode curNode = new SkiplistNode(num, level);
//preNode indicates the previous node of curNode in the current layer
SkiplistNode preNode = headNodes.get(curLevel - level);
for (int i = 0; i <= level; i++) {
//Traverse backwards from the head node of the current layer until a preNode is found
//such that preNode.data < num <= preNode.next.data
while (preNode.next.data < num) {
preNode = preNode.next;
}

//Insert curNode between preNode and preNode.next
curNode.next = preNode.next;
preNode.next = curNode;

//If the current layer is not 0, continue to add nodes to the next layer
if (curNode.level > 0) {
SkiplistNode downNode = new SkiplistNode(num, curNode.level - 1);
//curNode points to the node of the next layer
curNode.down = downNode;
//curNode moves down one level
curNode = downNode;
}
//preNode moves down one layer
preNode = preNode.down;
}
}

public boolean erase(int num) {
//The traversal process of deleting a node is the same as the traversal process of finding a node
//However, if the target node is found when deleting the node, the node deletion operation needs to be performed
while (curNode != null) {
if (curNode.next.data == num) {
//preDeleteNode indicates the previous node of the node to be deleted
SkiplistNode preDeleteNode = curNode;
while (true) {
//To delete the node to be deleted in the current layer is to make the previous node of the node to be deleted point to the next node of the node to be deleted
preDeleteNode.next = curNode.next.next;
//After the current layer is deleted, you need to continue to delete the nodes to be deleted in the next layer
//Here let preDeleteNode move down one layer
//After moving down one level, preDeleteNode is not necessarily the previous node of the node to be deleted
preDeleteNode = preDeleteNode.down;

//If preDeleteNode is null, it means that the underlying node to be deleted has been deleted
//At this point, the deletion process ends and returns true
if (preDeleteNode == null) {
return true;
}

//After preDeleteNode moves down one layer, it needs to continue to traverse backwards from the current position
//Until a preDeleteNode is found, making the value of preDeleteNode.next equal to the target value
//At this time, preDeleteNode becomes the previous node of the node to be deleted
while (preDeleteNode.next.data != num) {
preDeleteNode = preDeleteNode.next;
}
}
} else if (curNode.next.data > num) {
curNode = curNode.down;
} else {
curNode = curNode.next;
}
}
return false;
}

private void expanLevel(int expanCount) {
SkiplistNode tail = tailNodes.getFirst();
for (int i = 0; i < expanCount; i++) {
SkiplistNode tailNew = new SkiplistNode(Integer.MAX_VALUE, tail.level + 1);
tailNew.down = tail;

tail = tailNew;

}
}

private int getRandomLevel() {
int level = 0;
while (random.nextInt(2) > 1) {
level++;
}
return level;
}

}```

### Eight. Application of jump table in Redis

The ZSet structure contains both a dictionary and a jump table, and the jump table saves all set elements from small to large by score. The dictionary holds the mapping from member to score. These two structures share the member and score of the same element through pointers, without wasting additional memory.

```typedef struct zset {
dict *dict;
zskiplist *zsl;
} zset;```

Dictionary and skip list layout in ZSet: #### 1. The implementation details of the jump table in ZSet

The realization principle of the number of random layers:

The skip list is a probabilistic data structure, and the number of insertion levels of elements is randomly specified. William Pugh described its calculation process in the paper as follows: specify the maximum layer number MaxLevel of the node, specify the probability p, and the default layer number lvl is 1;

Generate a random number r of 0~1, if r<p, and lvl<MaxLevel, then lvl ++;

Repeat step 2 until the generated r >p, at this time lvl is the number of layers to be inserted.

Pseudocode for generating random layers in the paper: The implementation of the jump table in Redis basically follows this idea, but there are minor differences. Take a look at the random source code src/z_set.c of Redis about the number of layers of the skip table:

```/* Returns a random level for the new skiplist node we are going to create.
* The return value of this function is between 1 and ZSKIPLIST_MAXLEVEL
* (both inclusive), with a powerlaw-alike distribution where higher
* levels are less likely to be returned. */
int zslRandomLevel(void) {
int level = 1;
while ((random()&0xFFFF) < (ZSKIPLIST_P * 0xFFFF))
level += 1;
return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
}```

Two of the macros are defined in redis.h:

```#define ZSKIPLIST_MAXLEVEL 32 /* Should be enough for 2^32 elements */
#define ZSKIPLIST_P 0.25      /* Skiplist P = 1/4 */```

You can see in while:

`(random()&0xFFFF) < (ZSKIPLIST_P*0xFFFF)`

When I saw this formula at first glance, I was a little surprised because it involved bit operations. I need to study why Antirez uses bit operations to write this way?

The initial guess is that random() returns a floating-point number [0-1], so I found a tool for converting floating-point numbers to binary online, and entered 0.5 to see the result: It can be seen that the result of 0.5 32bit conversion hexadecimal is 0x3f000000, if it is ANDed with 0xFFFF, the result is still 0, which is not as expected.

In actual application, the realization of the number of random layers is not uniform. What is important is the generation of random numbers. In LevelDB, the generation code for the number of jump table layers is as follows:

```template <typename Key, typename Value>
int SkipList<Key, Value>::randomLevel() {

static const unsigned int kBranching = 4;
int height = 1;
while (height < kMaxLevel && ((::Next(rnd_) % kBranching) == 0)) {
height++;
}
assert(height > 0);
assert(height <= kMaxLevel);
return height;
}

uint32_t Next( uint32_t& seed) {
seed = seed & 0x7fffffffu;

if (seed == 0 || seed == 2147483647L) {
seed = 1;
}
static const uint32_t M = 2147483647L;
static const uint64_t A = 16807;
uint64_t product = seed * A;
seed = static_cast<uint32_t>((product >> 31) + (product & M));
if (seed > M) {
seed -= M;
}
return seed;
}```

It can be seen that leveldb uses random numbers and kBranching to take the modulus. If the value is 0, an additional layer is added. Although floating-point numbers are not used, the probability balance is also achieved.

#### 2. The average number of layers of jump table nodes

We can easily see that the higher the number of node layers, the lower the probability of occurrence. In any case, the number of layers always satisfies the power law. The larger the number, the smaller the probability of occurrence.

If the frequency of something occurs in a power relationship with a certain property of it, then this frequency can be said to obey a power law.

The performance of the power law is that the frequency of a few events accounts for the majority of the overall frequency, while most of the remaining events only account for a small part of the overall frequency.

When the power law is applied to the random layers of the jump table, most of the node layers are yellow parts, and only a few are green parts, and the probability is very low.

The quantitative analysis is as follows:

• The number of node layers is at least 1, and the number of node layers greater than 1 satisfies a probability distribution.

• The probability that the number of node layers is exactly equal to 1 is p^0(1-p)

• The probability that the number of node layers is exactly equal to 2 is p^1(1-p)

• The probability that the number of node layers is exactly equal to 3 is p^2(1-p)

• The probability that the number of node layers is exactly equal to 4 is p^3(1-p)

The probability that the number of layers of recursive nodes is exactly equal to K is p^(k-1)(1-p)

Therefore, if we require the average number of layers of nodes, then it is transformed into the expectation problem of probability distribution: In the table, P is the probability, and V is the corresponding value, which gives the possibility of all values ​​and probabilities, so the expectation of this probability distribution can be calculated. The formula in the square brackets is actually the geometric sequence learned in the first grade. Commonly used techniques are misplaced, subtracted and summed. From this, we can see that the expected value of the number of node layers is inversely proportional to 1-p. For Redis, the expected number of node layers is 1.33 when p=0.25.

## Summarize

The time complexity of the jump table is the same as that of the AVL tree and the red-black tree, which can reach O(logN). However, the AVL tree must maintain a height balance, and the red-black tree must maintain an approximate balance of height, which will lead to a delay when inserting or deleting nodes. Some time overhead, so compared to the AVL tree and red-black tree, the jump table saves the time overhead of maintaining a high balance, but correspondingly pays more space to store nodes of multiple layers, so the jump table A table is a data structure that trades space for time. Taking the underlying data structure zset in Redis as a typical application, we can further see the practical application of the jump list.

Posted by LiamProductions on Thu, 23 Feb 2023 14:18:56 +1030