Java object layout: the object header

Java is object-oriented, so the JVM has to store a large number of objects. To support extra functionality, each object carries some additional tag fields alongside its data. When studying concurrent programming with synchronized, its implementation is often hard to follow. Biased locks, lightweight locks, and heavyweight locks all rely on the object header, so understanding the Java object header is a prerequisite for understanding synchronized in depth. The examples below use a 64-bit JDK.

1. Overall structure of object layout

2. Get an object layout instance

1. First, add the JOL artifact, which is used to view object layouts, to the Maven project

		<dependency>
			<groupId>org.openjdk.jol</groupId>
			<artifactId>jol-core</artifactId>
			<version>0.9</version>
		</dependency>

2. Call ClassLayout.parseInstance(obj).toPrintable()

import org.openjdk.jol.info.ClassLayout;

public class Main {
    public static void main(String[] args) {
        L l = new L();  // create an object
        // print the memory layout of l
        System.out.println(ClassLayout.parseInstance(l).toPrintable());
    }
}

// The class whose layout is inspected
class L {
    private boolean myboolean = true;
}

Output after running:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4           (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4           (object header)                           f0 e4 2c 11 (11110000 11100100 00101100 00010001) (288154864)
     12     4           (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
     16     1   boolean L.myboolean                               true
     17     7           (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 7 bytes external = 7 bytes total

The object header occupies 16 bytes * 8 bits = 128 bits. If you run this yourself, you will probably see 96 bits instead, because the output above was produced with pointer compression turned off. JDK 8 enables pointer compression by default; you can turn it off with a VM parameter. For more about compressed pointers, see the official Java documentation.

Turn off pointer compression		-XX:-UseCompressedOops 

With pointer compression turned on, the memory layout of the object is:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4           (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4           (object header)                           43 c1 00 f8 (01000011 11000001 00000000 11111000) (-134168253)
     12     1   boolean L.myboolean                               true
     13     3           (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total

The columns have the following meaning:

  • OFFSET: offset from the start of the object, in bytes;
  • SIZE: size of the entry, in bytes;
  • TYPE DESCRIPTION: type and description of the entry, where (object header) marks bytes that belong to the object header;
  • VALUE: the value currently stored at that location.

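Turning on pointer compression reduces the memory used by objects: each reference and the klass pointer shrink from 64 bits to 32 bits. The often-quoted saving of about 50% is a theoretical figure that applies to the pointers themselves; the saving for the heap as a whole depends on how many references the objects hold. JDK 8 and later enable pointer compression by default (for heaps smaller than about 32 GB), so no configuration is needed.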
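
To check which setting a running JVM actually uses, one option is to read the flag through the HotSpot diagnostic MXBean. The following is a minimal sketch that assumes a HotSpot-based JVM; the class name CompressedOopsCheck and the helper vmOption are made up for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class CompressedOopsCheck {

    /**
     * Returns the current value of a HotSpot -XX flag as a string,
     * or an empty Optional if this JVM does not expose such a flag.
     * Assumption: the JVM is HotSpot-based, so the MXBean is available.
     */
    static Optional<String> vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            // thrown when the flag (or the MXBean) does not exist on this JVM
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = "
                + vmOption("UseCompressedOops").orElse("unknown"));
        System.out.println("UseCompressedClassPointers = "
                + vmOption("UseCompressedClassPointers").orElse("unknown"));
    }
}
```

Running it once with default settings and once with -XX:-UseCompressedOops should show the flag changing, which matches the 96-bit and 128-bit headers printed by JOL.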

Without pointer compression, the object header of an ordinary object is:

|--------------------------------------------------------------|
|                     Object Header (128 bits)                 |
|------------------------------------|-------------------------|
|        Mark Word (64 bits)         | Klass pointer (64 bits) |
|------------------------------------|-------------------------|

With pointer compression, the object header of an ordinary object is:

|--------------------------------------------------------------|
|                     Object Header (96 bits)                  |
|------------------------------------|-------------------------|
|        Mark Word (64 bits)         | Klass pointer (32 bits) |
|------------------------------------|-------------------------|

With pointer compression, the object header of an array object is:

|---------------------------------------------------------------------------------|
|                                 Object Header (128 bits)                        |
|--------------------------------|-----------------------|------------------------|
|        Mark Word(64bits)       | Klass pointer(32bits) |  array length(32bits)  |
|--------------------------------|-----------------------|------------------------|
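
For example, this is the layout of an int array of length 5 with pointer compression turned on. The fourth header line holds the array length:

 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE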
      0     4        (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           6d 01 00 f8 (01101101 00000001 00000000 11111000) (-134217363)
     12     4        (object header)                           05 00 00 00 (00000101 00000000 00000000 00000000) (5)
     16    20    int [I.<elements>                             N/A
     36     4        (loss due to the next object alignment)
Instance size: 40 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

3. Composition of object header

Let's first look at how a Java object is stored. In the HotSpot virtual machine, the layout of an object in memory is divided into three areas: the object header (Header), the instance data (Instance Data), and the alignment padding (Padding). The output printed above can be annotated as follows:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)    // Mark Word           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4           (object header)    // Mark Word           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4           (object header)    // Klass Pointer       43 c1 00 f8 (01000011 11000001 00000000 11111000) (-134168253)
     12     1   boolean L.myboolean        // Instance Data       true
     13     3           (loss due to the next object alignment)   // Padding

1. Mark Word

This part mainly stores runtime data of the object itself, such as the hash code and the GC generational age. The Mark Word is one machine word long: 32 bits on a 32-bit JVM and 64 bits on a 64-bit JVM. To pack more information into a single word, the JVM uses the lowest two bits of the word as marker bits, and the meaning of the remaining bits depends on their value.
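
As a rough illustration of how the hash code and GC age are packed into the word, the sketch below extracts both fields from a 64-bit Mark Word in the unlocked state. The bit positions follow the JDK 8 HotSpot layout (unused:25, hash:31, unused:1, age:4, biased_lock:1, lock:2); the class name MarkWordFields and the sample value are made up for this example.

```java
final class MarkWordFields {
    // Unlocked 64-bit Mark Word in JDK 8 HotSpot, from high bits to low bits:
    // | unused:25 | hash:31 | unused:1 | age:4 | biased_lock:1 | lock:2 |
    static final int AGE_SHIFT = 3;
    static final long AGE_MASK = 0xFL;          // 4 bits
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = 0x7FFFFFFFL;  // 31 bits

    /** GC generational age stored in bits 3..6. */
    static int age(long markWord) {
        return (int) ((markWord >>> AGE_SHIFT) & AGE_MASK);
    }

    /** Identity hash stored in bits 8..38; 0 until the hash has been computed. */
    static int identityHash(long markWord) {
        return (int) ((markWord >>> HASH_SHIFT) & HASH_MASK);
    }

    public static void main(String[] args) {
        // Mark Word of a freshly created object, as in the JOL output above
        long fresh = 0x0000000000000001L;
        System.out.printf("fresh:  age=%d hash=0x%x%n", age(fresh), identityHash(fresh));

        // Hypothetical value: hash 0x12345678, age 3, unlocked (lock bits 01)
        long sample = (0x12345678L << HASH_SHIFT) | (3L << AGE_SHIFT) | 0b001L;
        System.out.printf("sample: age=%d hash=0x%x%n", age(sample), identityHash(sample));
    }
}
```

The hash field of a new object is 0 because the identity hash is only written into the header the first time it is requested.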

The meaning of each part is as follows. Lock: the 2-bit lock status flag. Because as much information as possible has to be represented in as few bits as possible, the meaning of the entire Mark Word changes with the value of this flag. Together with the biased-lock bit just above it, the lowest three bits determine the type of lock:

enum {  locked_value             = 0, // 0 00 lightweight lock
        unlocked_value           = 1, // 0 01 no lock
        monitor_value            = 2, // 0 10 heavyweight lock
        marked_value             = 3, // 0 11 GC mark
        biased_lock_pattern      = 5  // 1 01 biased lock
};
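
The same classification can be written as a few lines of Java. This is a minimal sketch that only looks at the lowest three bits of a Mark Word; the class LockStateDecoder and its enum are made up for this illustration and are not part of the JDK or JOL.

```java
final class LockStateDecoder {

    enum LockState { LIGHTWEIGHT, UNLOCKED, HEAVYWEIGHT, GC_MARKED, BIASED }

    /** Classifies a Mark Word by its biased-lock bit (bit 2) and lock bits (bits 0..1). */
    static LockState decode(long markWord) {
        int lockBits = (int) (markWord & 0b11);
        boolean biasedBit = (markWord & 0b100) != 0;
        switch (lockBits) {
            case 0b00: return LockState.LIGHTWEIGHT;   // locked_value
            case 0b10: return LockState.HEAVYWEIGHT;   // monitor_value
            case 0b11: return LockState.GC_MARKED;     // marked_value
            default:                                   // lock bits are 01
                return biasedBit ? LockState.BIASED    // biased_lock_pattern
                                 : LockState.UNLOCKED; // unlocked_value
        }
    }

    public static void main(String[] args) {
        // the five values from the enum above
        for (long value : new long[] {0, 1, 2, 3, 5}) {
            System.out.println(value + " -> " + decode(value));
        }
    }
}
```

Only when the lock bits are 01 does the third bit matter; for the other three patterns it is part of the pointer or payload stored in the word.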

Analyze lock status through memory information

Next, write a demo that locks with synchronized and prints the object's memory layout while the lock is held, so that the lock state can be read from the object header.

code:

import org.openjdk.jol.info.ClassLayout;

public class Main {
    private static final String SEPARATOR = "===========================================";

    public static void main(String[] args) {
        L l = new L();
        Runnable task = () -> {
            while (!Thread.interrupted()) {
                // three threads compete for the monitor of the same object
                synchronized (l) {
                    System.out.println(SEPARATOR);
                    System.out.println(ClassLayout.parseInstance(l).toPrintable());
                    System.out.println(SEPARATOR);
                }
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    // restore the flag so the loop condition sees the interrupt
                    Thread.currentThread().interrupt();
                }
            }
        };
        for (int i = 0; i < 3; i++) {
            new Thread(task).start();
        }
    }
}

class L {
    private boolean myboolean = true;
}

Output:

===========================================
 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)                           5a 97 02 c1 (01011010 10010111 00000010 11000001) (-1056794790)
      4     4           (object header)                           d7 7f 00 00 (11010111 01111111 00000000 00000000) (32727)
      8     4           (object header)                           43 c1 00 f8 (01000011 11000001 00000000 11111000) (-134168253)
     12     1   boolean L.myboolean                               true
     13     3           (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total

===========================================

The Mark Word is 0x00007fd7c102975a, which in binary is 0b00000000 00000000 01111111 11010111 11000001 00000010 10010111 01011010. Its lowest byte comes from the first object header line, where the value 5a corresponds to 01011010. The third-lowest bit is 0, which means the lock is not biased, and the lowest two bits are 10, which means it is a heavyweight lock.

enum {  locked_value             = 0, // 0 00 lightweight lock
        unlocked_value           = 1, // 0 01 no lock
        monitor_value            = 2, // 0 10 heavyweight lock
        marked_value             = 3, // 0 11 GC mark
        biased_lock_pattern      = 5  // 1 01 biased lock
};

Example 2:

import org.openjdk.jol.info.ClassLayout;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        L l = new L();
        // only one thread ever locks l, so there is no contention
        synchronized (l) {
            Thread.sleep(1000);
            System.out.println(ClassLayout.parseInstance(l).toPrintable());
            Thread.sleep(1000);
        }     // lightweight lock
    }
}

class L {
    private boolean myboolean = true;
}

Output:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)                           f0 18 58 00 (11110000 00011000 01011000 00000000) (5773552)
      4     4           (object header)                           00 70 00 00 (00000000 01110000 00000000 00000000) (28672)
      8     4           (object header)                           43 c1 00 f8 (01000011 11000001 00000000 11111000) (-134168253)
     12     1   boolean L.myboolean                               true
     13     3           (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total

The corresponding Mark Word is 0x00007000005818f0, which in binary is 0b00000000 00000000 01110000 00000000 00000000 01011000 00011000 11110000. The third-lowest bit is 0, which means the lock is not biased, and the lowest two bits are 00, which means it is a lightweight lock.

enum {  locked_value             = 0, // 0 00 lightweight lock
        unlocked_value           = 1, // 0 01 no lock
        monitor_value            = 2, // 0 10 heavyweight lock
        marked_value             = 3, // 0 11 GC mark
        biased_lock_pattern      = 5  // 1 01 biased lock
};

You may wonder how Mark Word = 0x00007000005818f0 is calculated. JOL prints the header bytes in memory order, and on a little-endian machine the lowest byte comes first, so the Mark Word is obtained by reading the first 8 bytes (64 bits) in reverse order:

 OFFSET  SIZE      TYPE DESCRIPTION                               VALUE
      0     4           (object header)                           f0 18 58 00 (11110000 00011000 01011000 00000000) (5773552)
      4     4           (object header)                           00 70 00 00 (00000000 01110000 00000000 00000000) (28672)
      8     4           (object header)                           43 c1 00 f8 (01000011 11000001 00000000 11111000) (-134168253)
     12     1   boolean L.myboolean                               true
     13     3           (loss due to the next object alignment)

Reading the first 8 bytes in reverse order gives 00000000 00000000 01110000 00000000 00000000 01011000 00011000 11110000, which is 00007000005818f0 in hexadecimal.
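
The byte reversal can also be left to the standard library. The sketch below rebuilds the Mark Word from the first eight header bytes exactly as JOL prints them, assuming a little-endian machine; the class MarkWordBytes and its helper methods are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MarkWordBytes {

    /** Parses a string such as "f0 18 58 00" into bytes. */
    static byte[] parseHex(String spacedHex) {
        String[] parts = spacedHex.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return bytes;
    }

    /** Interprets 8 bytes in memory order as a little-endian 64-bit word. */
    static long markWord(byte[] firstEightBytes) {
        return ByteBuffer.wrap(firstEightBytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
    }

    public static void main(String[] args) {
        // offsets 0..3 followed by offsets 4..7 from the JOL output
        long lightweight = markWord(parseHex("f0 18 58 00 00 70 00 00"));
        System.out.printf("0x%016x%n", lightweight); // prints 0x00007000005818f0

        long heavyweight = markWord(parseHex("5a 97 02 c1 d7 7f 00 00"));
        System.out.printf("0x%016x%n", heavyweight); // prints 0x00007fd7c102975a
    }
}
```

The lowest two bits of the results are 00 and 10, which match the lightweight and heavyweight lock states read from the two outputs.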

2. Klass Pointer

This is the pointer from the object to its class metadata; the virtual machine uses it to determine which class the object is an instance of. Not every virtual machine implementation has to keep a type pointer in the object data: when objects are accessed through a handle pool, the type information can be found through the handle instead.

As a brief aside, consider how objects are accessed. Objects are created to be used, so at runtime a Java program operates on heap objects through the reference data held in the local variable table of the virtual machine stack. The JVM specification only defines a reference as something that refers to an object; it does not say how the reference locates the object. Virtual machines are therefore free to implement this differently, and there are two main approaches: handle pool and direct pointer.

2.1 Access through a handle

A region of the heap is set aside as the handle pool. The reference stores the address of a handle, and the handle holds the address of the object's instance data (the field values) and the address of its type data (class and method information). Instance data generally lives in the heap, and type data generally lives in the method area.

Advantages: the reference stores a stable handle address. When the object is moved (which is very common during garbage collection), only the instance data pointer in the handle changes; the reference itself does not. Disadvantages: every access costs one extra pointer indirection.

2.2 Access through a direct pointer

With direct pointer access, the reference stores the heap address of the object itself, and the object must then hold the address of its type data.

Advantages: it saves one pointer indirection on every access. Disadvantages: when the object is moved (for example, during compaction after GC), the reference itself has to be updated.
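
The trade-off between the two approaches can be made concrete with a toy model. This is only an illustration, not how any real JVM is implemented: the heap and the handle pool are simulated with maps, addresses are plain integers, and the class ReferenceModels with all of its methods is made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class ReferenceModels {
    // Toy heap: address -> instance data
    private final Map<Integer, String> heap = new HashMap<>();
    // Toy handle pool: handle id -> current address of the instance data
    private final Map<Integer, Integer> handlePool = new HashMap<>();
    private int nextHandle = 1;

    /** Stores data at the given address and returns a handle for it. */
    int allocate(int address, String data) {
        heap.put(address, data);
        int handle = nextHandle++;
        handlePool.put(handle, address);
        return handle;
    }

    /** Handle access: two lookups, first the handle, then the object. */
    String readViaHandle(int handle) {
        return heap.get(handlePool.get(handle));
    }

    /** Direct access: one lookup, the reference is the address itself. */
    String readDirect(int address) {
        return heap.get(address);
    }

    /** Simulates the GC moving an object; only the handle entry is updated. */
    void move(int handle, int newAddress) {
        int oldAddress = handlePool.get(handle);
        heap.put(newAddress, heap.remove(oldAddress));
        handlePool.put(handle, newAddress);
    }

    public static void main(String[] args) {
        ReferenceModels model = new ReferenceModels();
        int handle = model.allocate(100, "L{myboolean=true}");
        int directRef = 100;

        model.move(handle, 200);
        System.out.println(model.readViaHandle(handle)); // handle is still valid
        System.out.println(model.readDirect(directRef)); // stale address, prints null
        directRef = 200;                                 // the reference itself must be fixed up
        System.out.println(model.readDirect(directRef));
    }
}
```

The handle survives the move unchanged, while the direct reference has to be rewritten, at the price of one extra lookup on every handle access.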

Summary:

With handle pool access, the type pointer does not have to be stored in the object header, but most current virtual machine implementations, including HotSpot, use direct pointer access. In addition, if the object is a Java array, the object header also stores the array length: the JVM can determine the size of an ordinary object from its class metadata, but it cannot do so for an array.

3. Align padding bytes

Because the JVM requires the size of every Java object to be a multiple of 8 bytes, a few padding bytes are added where necessary to round the object size up to a multiple of 8 bytes. Padding needs no further explanation.
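
The padding in the JOL outputs above can be reproduced with a one-line rounding function. This is a minimal sketch assuming the default 8-byte object alignment; the class ObjectAlignment and the method alignUp are made up for this example.

```java
final class ObjectAlignment {
    // Default object alignment in bytes (assumed; configurable in HotSpot)
    static final int ALIGNMENT_BYTES = 8;

    /** Rounds a size in bytes up to the next multiple of ALIGNMENT_BYTES. */
    static long alignUp(long sizeInBytes) {
        return (sizeInBytes + ALIGNMENT_BYTES - 1) & ~(long) (ALIGNMENT_BYTES - 1);
    }

    public static void main(String[] args) {
        // 12-byte header + 1-byte boolean, pointer compression on
        System.out.println(alignUp(12 + 1));      // prints 16 (3 bytes of padding)
        // 16-byte header + 1-byte boolean, pointer compression off
        System.out.println(alignUp(16 + 1));      // prints 24 (7 bytes of padding)
        // 16-byte array header + 5 ints of 4 bytes each
        System.out.println(alignUp(16 + 5 * 4));  // prints 40 (4 bytes of padding)
    }
}
```

The three results match the "Instance size" lines of the three layouts printed earlier.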

4. JVM lock upgrade process

1. When the object is not used as a lock, it is an ordinary object. The Mark Word holds the object's hash code (once it has been computed), the lock flag bits are 01, and the biased-lock bit is 0.

2. When the object is used as a synchronization lock and thread A acquires it, the lock flag bits remain 01 but the biased-lock bit changes to 1. The id of the thread that acquired the lock is recorded in the high bits of the Mark Word (23 bits on a 32-bit JVM, 54 bits on a 64-bit JVM), and the object enters the biased lock state.

3. When thread A tries to acquire the lock again, the JVM sees that the lock flag bits are 01 and the biased-lock bit is 1, that is, the biased state, and that the thread id in the Mark Word is thread A's own. Thread A already holds the biased lock and can run the synchronized code directly.

4. When thread B tries to acquire the lock, the JVM sees that the lock is in the biased state but the thread id in the Mark Word is not B's, so thread B first tries to take the lock with a CAS operation. This attempt can succeed, because thread A does not clear the bias when it leaves the synchronized code, so the Mark Word may still name thread A even though A is no longer using the lock. If the attempt succeeds, the thread id in the Mark Word is changed to thread B's id, which means thread B now holds the biased lock and can run the synchronized code. If it fails, continue with step 5.

5. A failed acquisition in the biased state indicates that there is some contention on the lock, so the biased lock is upgraded to a lightweight lock. The JVM sets up a lock record in the stack frame of the current thread, copies the object's Mark Word into it, and tries to store a pointer to the lock record in the object's Mark Word with a CAS operation. If the CAS succeeds, the thread has acquired the lock: the lock flag bits in the Mark Word become 00 and the synchronized code runs. If it fails, the acquisition has failed and contention is heavier; continue with step 6.

6. When the lightweight lock cannot be acquired, the JVM spins. Spinning is not a lock state; it means repeatedly retrying to acquire the lock. Starting from JDK 1.7, spinning is enabled by default and the number of spins is decided by the JVM. If the lock is acquired, run the synchronized code; if not, continue with step 7.

7. If the lock still cannot be acquired after spinning, the lock is upgraded to a heavyweight lock and the lock flag bits change to 10. In this state, every thread that fails to acquire the lock is blocked.
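
The seven steps can be summarized as a small state machine. The sketch below is a simplified toy model of the description above, not HotSpot's actual implementation: whether a CAS or a spin succeeds is passed in as a hypothetical input, releasing the lock is not modeled, and the class LockUpgradeSketch is made up for this example.

```java
final class LockUpgradeSketch {

    enum State { UNLOCKED, BIASED, LIGHTWEIGHT, HEAVYWEIGHT }

    private State state = State.UNLOCKED;
    private String biasedOwner; // thread the lock is biased toward, if any

    /**
     * Models one acquisition attempt and returns the resulting lock state.
     *
     * @param thread       name of the acquiring thread
     * @param casSucceeds  hypothetical outcome of the thread's CAS attempt
     * @param spinSucceeds hypothetical outcome of spinning after a failed CAS
     */
    State acquire(String thread, boolean casSucceeds, boolean spinSucceeds) {
        switch (state) {
            case UNLOCKED:                       // steps 1-2: bias toward the first thread
                biasedOwner = thread;
                state = State.BIASED;
                return state;
            case BIASED:
                if (thread.equals(biasedOwner)) {
                    return state;                // step 3: the owner re-enters
                }
                if (casSucceeds) {
                    biasedOwner = thread;        // step 4: bias moves to the new thread
                    return state;
                }
                biasedOwner = null;              // step 5: upgrade to a lightweight lock
                state = State.LIGHTWEIGHT;
                return state;
            case LIGHTWEIGHT:
                if (casSucceeds || spinSucceeds) {
                    return state;                // steps 5-6: acquired by CAS or by spinning
                }
                state = State.HEAVYWEIGHT;       // step 7: inflate, losers are blocked
                return state;
            default:
                return state;                    // heavyweight: stays heavyweight in this model
        }
    }

    public static void main(String[] args) {
        LockUpgradeSketch lock = new LockUpgradeSketch();
        System.out.println(lock.acquire("A", true, true));   // BIASED
        System.out.println(lock.acquire("A", false, false)); // BIASED
        System.out.println(lock.acquire("B", false, true));  // LIGHTWEIGHT
        System.out.println(lock.acquire("C", false, false)); // HEAVYWEIGHT
    }
}
```

In this model the state only ever moves toward HEAVYWEIGHT, which mirrors the one-way upgrade order of the steps.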

Summary: this chapter introduced the object layout, which consists of the object header, the instance data, and the alignment padding, and showed what information the header holds and how to read it.


Posted by Dville on Thu, 14 Apr 2022 05:34:41 +0930