Concurrent container - CopyOnWriteArrayList

Before CopyOnWriteArrayList appeared, we had ArrayList and LinkedList as arrays and linked lists of lists, and thread safe vectors and collections Synchronizedlist() can be used.

First, let's look at the size of the thread safe Vector and the code of the get method:

public synchronized int size() {
    return elementCount;
}
public synchronized E get(int index) {
    if (index >= elementCount)
        throw new ArrayIndexOutOfBoundsException(index);
    return elementData(index);
}

It can be seen that Vector is very similar to Hashtable. Internally, synchronized is used to ensure thread safety, and the granularity of locks is relatively large. They are method level locks. When the amount of concurrency is high, it is easy to compete, and the concurrency efficiency is relatively low. Moreover, the previous lists are not allowed to be edited during the iteration. If you add or delete elements during the iteration, you will throw a ConcurrentModificationException. This feature also brings trouble to users in many cases.

So from jdk1 Since 5, Java has provided a concurrency container using CopyOnWrite mechanism. CopyOnWriteArrayList is the main concurrency List. The concurrent set of CopyOnWrite also includes CopyOnWriteArraySet, whose bottom layer is implemented by CopyOnWriteArrayList. Therefore, taking CopyOnWriteArrayList as a breakthrough, let's take a look at the characteristics of CopyOnWrite container.

1: Applicable scenario

  • The read operation can be as fast as possible, but it doesn't matter if the write operation is slower

For example, some system level information often only needs to be loaded or modified a few times, but it will be frequently accessed by all modules in the system. For this scenario, what we most want to see is that the read operation can be as fast as possible, and it doesn't matter if the write operation is slower.

  • Read more and write less

In the blacklist scenario, if we have a search website, users enter keywords to search for content in the search box of the website, but some keywords are not allowed to be searched. These keywords that cannot be searched will be placed in a blacklist. The blacklist does not need to be updated in real time. It may be updated once every night. This scenario of reading more and writing less is also suitable for using the CopyOnWrite collection.

2: Reading and writing rules

Upgrade the read-write lock (read sharing and other mutually exclusive) rules. In order to maximize the reading performance, the CopyOnWriteArrayList does not need to be locked at all. What's more, writing will not block the reading operation, that is, you can read while writing. Only writing and writing need to be synchronized, that is, multiple writes are not allowed to occur at the same time, However, reading is allowed to occur at the same time when writing occurs. In this way, the performance of read operations will be greatly improved.

That is, the reading and writing rules of CopyOnWriteArrayList are: write and write are mutually exclusive, and others are shared

3: Characteristics

  • Meaning of CopyOnWrite

CopyOnWrite means that when a container needs to be modified, the current container is not modified directly. Instead, Copy the current container, Copy a new container, and then modify the new container. After the modification, point the reference of the original container to the new container. This completes the whole modification process.

Therefore, the CopyOnWrite container can be read concurrently without locking, because the current container will not add any elements or modify them. At the same time, all modification operations (add, set, etc.) of CopyOnWriteArrayList are realized by creating a new copy of the underlying array, so the CopyOnWrite container is also an embodiment of the idea of separation of reading and writing. Different containers are used for reading and writing.

  • Set content is allowed to be modified during iteration
/**
 * @discription Demonstrates how the contents of a collection can be modified during a CopyOnWriteArrayList iteration
 */
public class CopyOnWriteArrayListDemo {

    public static void main(String[] args) {

        CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>(new Integer[]{1,2,3});
        System.out.println(list);

        Iterator<Integer> iterator1 = list.iterator();
        list.add(4);
        System.out.println(list);

        Iterator<Integer> iterator2 = list.iterator();
        System.out.println("====Verify Iterator 1 content====");
        iterator1.forEachRemaining(System.out::println);
        System.out.println("====Verify Iterator 2 content====");
        iterator2.forEachRemaining(System.out::println);
    }
}
// The result is:
[1, 2, 3]
[1, 2, 3, 4]
====Verify Iterator 1 content====
1 2 3
====Verify Iterator 2 content====
1 2 3 4

The first iterator prints out [1, 2, 3], while the second prints out [1, 2, 3, 4]. Although their printing timing occurs after the fourth element is added, their creation timing is different. Since there are only three elements in the List when iterator 1 is created, no matter what changes are made to the List, it is insensitive to it.

This result shows that once the CopyOnWriteArrayList iterator is established, if you add elements to the previous CopyOnWriteArrayList object, the changes of elements will not be displayed in the iterator and errors will not be reported, which is very different from ArrayList.

4: Shortcomings

These disadvantages are not only for CopyOnWriteArrayList, but also for other CopyOnWrite containers:

  • Memory usage problem

Because of CopyOnWrite's copy on write mechanism, when writing, the memory of two objects will be stationed in the memory at the same time, which will occupy additional memory space.

  • When there are many or complex elements, the cost of replication is very high

The replication process will not only occupy double memory, but also consume CPU and other resources, which will reduce the overall performance.

  • Data consistency problem

Since the copy of CopyOnWrite container is modified first, this modification is not visible to other threads in real time, and can only be reflected after modification. If you want the written data to be seen by other threads immediately, the CopyOnWrite container is not applicable.

5: Source code analysis

5.1 data structure

public class CopyOnWriteArrayList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable {

    /** Reentrant lock object */
    final transient ReentrantLock lock = new ReentrantLock();
	
    /** CopyOnWriteArrayList The bottom layer is implemented by the array, which is decorated with volatile to ensure the visibility of the array */
    private transient volatile Object[] array;

    /**Get array*/
    final Object[] getArray() {
        return array;
    }

    /**Set array**/
    final void setArray(Object[] a) {
        array = a;
    }

    /**Initializing CopyOnWriteArrayList is equivalent to initializing an array**/
    public CopyOnWriteArrayList() {
        setArray(new Object[0]);
    }
    //.........
}

In this class, there will be a ReentrantLock lock to ensure the thread safety of modification operation. The Object [] array named array below is modified by volatile to ensure the visibility of the array. This is the array that stores elements

5.2 add method

public boolean add(E e) {
 
    // Lock
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
 
        // Get the length and elements of the original array
        Object[] elements = getArray();
        int len = elements.length;
 
        // Copy out a new array
        Object[] newElements = Arrays.copyOf(elements, len + 1);
 
        // When adding, adds a new element to the new array
        newElements[len] = e;
 
        // Replace the point of volatile Object[] array with a new array
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

When adding, first lock and copy a new array. The addition operation is completed on the new array, then point the array to the new array, and finally unlock.

The above steps realize the idea of CopyOnWrite: the write operation is carried out on the copy of the original container, and the list will not be locked when reading data. Moreover, we can see that if a new read thread comes in during the copy operation of the container, the old data is still read, because the reference of the object has not been changed at that time.

You can see the code of the read operation, that is, the three methods related to get, which are the two overloads of the get method and the getArray method. The code is as follows:

public E get(int index) {
    return get(getArray(), index);
}
final Object[] getArray() {
    return array;
}
private E get(Object[] a, int index) {
    return (E) a[index];
}

It can be seen that the get related operations are not locked, which ensures the high speed of the read operation.

5.3 iterator COWIterator class

This iterator has two important properties, Object[] snapshot and int cursor. The snapshot represents the snapshot of the array, that is, the array at the time when the iterator is created, and the cursor is the cursor of the iterator. The construction method of iterator is as follows:

private COWIterator(Object[] elements, int initialCursor) {
    cursor = initialCursor;
    snapshot = elements;
}

It can be seen that when the iterator is built, it will assign the current elements to the snapshot, and all operations of the subsequent iterator are based on the snapshot array, such as:

public E next() {
    if (! hasNext())
        throw new NoSuchElementException();
    return (E) snapshot[cursor++];
}

As you can see in the next method, the returned content is a snapshot object. Therefore, even if the original array is modified later, the snapshot will not be perceived or affected. There is no need to lock the iterative operation, and no exception will be thrown. The result returned by the iterator is consistent with the content when the iterator is created.

Posted by pornost4r on Sun, 17 Apr 2022 20:00:15 +0930