Go pointers and unsafe.Pointer

Compared with C, Go restricts pointer types and pointer operations for the sake of safety, so Go programmers get the convenience of pointers while avoiding most of their dangers. In addition to ordinary pointers, Go provides a general-purpose pointer, unsafe.Pointer, in the unsafe package. With it and the other functions in that package, code can bypass Go's type system and operate on memory directly, for example to convert between pointer types or to read and write unexported struct fields. Because these features are powerful and a careless read or write to the wrong address has serious consequences, Go deliberately places them in a package named unsafe. Used properly, though, the package is less dangerous than its name suggests and very convenient, and it appears fairly often in low-level source code.

Basic knowledge

A pointer holds the memory address of a value. The type *T is a pointer to a value of type T, and its zero value is nil.

The & operator generates a pointer to its operand.

i := 42
p := &i

The * operator reads or writes the value the pointer points to, which is also called "dereferencing".

fmt.Println(*p) // read the value of i through the pointer p
*p = 21         // set the value of i through the pointer p
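
Put together as a complete program, the two operators can be tried out as follows. This is only a minimal sketch; the helper name setThrough is made up for illustration and is not part of any library.

```go
package main

import "fmt"

// setThrough is a hypothetical helper: it stores v in the variable p points to.
func setThrough(p *int, v int) {
	*p = v
}

func main() {
	i := 42
	p := &i         // p holds the address of i
	fmt.Println(*p) // read i through p

	setThrough(p, 21) // write i through p
	fmt.Println(i)

	var q *int // the zero value of a pointer type is nil
	fmt.Println(q == nil)
}
```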

Why do we need pointer types? Consider this example from the Go 101 website:

package main

import "fmt"

func double(x int) { 
   x += x
}

func main() { 
   var a = 3 
   double(a)
   fmt.Println(a) // 3
}

The intent is to double a inside the double function, but the function in this example cannot do that, because function arguments in Go are passed by value. The x inside double is only a copy of the argument a, so changes to x are not reflected in a.

This problem can be solved by replacing the parameter with a pointer.

package main
import "fmt"

func double(x *int) { 
   *x += *x
    x = nil
}

func main() {
    var a = 3
    double(&a)
    fmt.Println(a) // 6
    p := &a
    double(p)
    fmt.Println(a, p == nil) // 12 false
}

At first glance, you may wonder about the following line of code:

x = nil

Remember that all function parameters in Go are passed by value, so x inside the function is only a copy of &a. Setting x to nil changes that copy and nothing else.

*x += *x

This statement doubles the value that x points to, which is the variable a. Operations on x itself (the pointer) do not affect the caller's pointer, which is why p == nil still prints false after double(p) returns.

Restrictions on pointers

Compared with the flexibility of pointers in C, pointers in Go have many restrictions, but those restrictions let us enjoy the convenience of pointers while avoiding their dangers. Here are some of the restrictions Go places on pointer operations.

Limitation 1: pointers cannot take part in arithmetic

Let's take a simple example:

package main
import "fmt"
func main() {
	a := 5
	p := &a
	fmt.Println(p)
	p = &a + 3
}

The code above does not compile; the compiler reports:

invalid operation: &a + 3 (mismatched types *int and int)

In other words, Go does not allow arithmetic on pointers.

Limitation 2: pointers of different types cannot be converted to each other

The following program also fails to compile:

package main
func main() {	
	var a int = 100
	var f *float64
	f = &a
}

The compiler reports:

cannot use &a (type *int) as type *float64 in assignment

Limitation 3: pointers of different types cannot be compared with or assigned to each other

This restriction follows from restriction 2: because pointers of different types cannot be converted to each other, they cannot be compared with == or !=, and pointer variables of different types cannot be assigned to each other. For example, the following also fails to compile.

package main
func main() {	
	var a int = 100
	var f *float64
	f = &a
}

Go pointers are type safe, but that safety comes with many limitations, so Go also provides a general-purpose pointer that can be converted between types: unsafe.Pointer, from the unsafe package. In some cases it makes code more efficient and, of course, more dangerous.
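
As a small taste of what that conversion looks like, here is a sketch of the classic bit-reinterpretation pattern. The function name float64bits is chosen for this illustration; it relies on float64 and uint64 both being 8 bytes wide, and the standard library's math.Float64bits does the same job safely.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// float64bits reinterprets the bits of f as a uint64 by converting
// *float64 -> unsafe.Pointer -> *uint64. Both types are 8 bytes wide.
func float64bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func main() {
	f := 1.0
	fmt.Printf("%#x\n", float64bits(f))

	// The safe standard-library function should give the same bits.
	fmt.Println(float64bits(f) == math.Float64bits(f))
}
```

A direct conversion such as (*uint64)(&f) is rejected by the compiler; going through unsafe.Pointer is what makes the reinterpretation possible.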

unsafe package

The unsafe package is implemented by the compiler. It lets code bypass Go's type system and operate on memory directly, for example to manipulate the unexported fields of a struct. In short, it gives us the ability to read and write memory directly.

At the time this article was written, the unsafe package had only two types and three functions, but it is very powerful. (Go 1.17 and later add a few more functions, such as unsafe.Add and unsafe.Slice.)

type ArbitraryType int
type Pointer *ArbitraryType 
func Sizeof(x ArbitraryType)  uintptr
func Offsetof(x ArbitraryType) uintptr
func Alignof(x ArbitraryType) uintptr

ArbitraryType is declared as int only for documentation purposes. It is not really part of the unsafe package; it stands for the type of an arbitrary Go expression. Pointer is declared as a pointer to ArbitraryType, and in Go any pointer type can be converted to unsafe.Pointer.

The parameters of the three functions are all of type ArbitraryType, that is, they accept an expression of any type.

  • Sizeof accepts a value (expression) of any type and returns the number of bytes it occupies. Unlike sizeof in C, it cannot be given a type name; the argument must be a value, such as a variable. The size does not include memory referenced by the value: for a slice, Sizeof returns the size of the slice header, not of its elements.
  • Offsetof returns the number of bytes from the beginning of a struct to the position of one of its fields. The argument must be a field selector such as x.f. (The address of a struct is also the address of its first field, so the first field has offset 0.)
  • Alignof returns the alignment, in bytes, required for a variable of the argument's type. It accepts an expression of any type. As a special case, if s is a struct variable and f is one of its fields, Alignof(s.f) returns the alignment required for a field of that type within a struct.

Note that all three functions return a uintptr, so their results can be used directly in address arithmetic. All three are normally evaluated at compile time.
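
To make the three functions concrete, the following sketch prints the layout of a small struct. The record type and the layoutOK helper are hypothetical names introduced only for this example; the exact numbers depend on the platform, so the helper checks only relationships that should hold everywhere.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct used only to show padding and alignment.
type record struct {
	flag  bool
	count int64
	id    int16
}

// layoutOK checks layout relationships that do not depend on the platform:
// the first field sits at offset 0, count is aligned, fields keep their
// declared order, and the struct is large enough to contain its last field.
func layoutOK(r record) bool {
	return unsafe.Offsetof(r.flag) == 0 &&
		unsafe.Offsetof(r.count)%unsafe.Alignof(r.count) == 0 &&
		unsafe.Offsetof(r.id) >= unsafe.Offsetof(r.count)+unsafe.Sizeof(r.count) &&
		unsafe.Sizeof(r) >= unsafe.Offsetof(r.id)+unsafe.Sizeof(r.id)
}

func main() {
	var r record
	fmt.Println("Sizeof(r):        ", unsafe.Sizeof(r))
	fmt.Println("Alignof(r):       ", unsafe.Alignof(r))
	fmt.Println("Offsetof(r.flag): ", unsafe.Offsetof(r.flag))
	fmt.Println("Offsetof(r.count):", unsafe.Offsetof(r.count))
	fmt.Println("Offsetof(r.id):   ", unsafe.Offsetof(r.id))

	// For a slice, Sizeof reports the slice header, not the elements.
	s := make([]int, 1000)
	fmt.Println("Sizeof(s):        ", unsafe.Sizeof(s))

	fmt.Println("layout invariants hold:", layoutOK(r))
}
```

On amd64 this should report a size of 24 bytes for record, with count at offset 8 and id at offset 16, because of the padding after flag and at the end of the struct.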

unsafe.Pointer

unsafe.Pointer is often called the universal pointer. The official documentation lists four special operations for this type:

  1. A pointer value of any type can be converted to a Pointer
  2. A Pointer can be converted to a pointer value of any type
  3. A uintptr can be converted to a Pointer
  4. A Pointer can be converted to a uintptr

unsafe.Pointer is a specially defined pointer type. It acts as a bridge for conversions between pointer types and can hold the address of a variable of any type.

What does "can hold the address of a variable of any type" mean? It means that the value converted to unsafe.Pointer must itself be a pointer; otherwise the compiler reports an error.

a := 1
b := unsafe.Pointer(a)  // compile error: a is an int, not a pointer
c := unsafe.Pointer(&a) // correct

Like ordinary pointers, unsafe.Pointer can also be compared, and can be compared with nil to determine whether it is null.
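
A short sketch of such comparisons follows. The pair type and the two helper functions are hypothetical, introduced only for this example. It also shows how converting to unsafe.Pointer makes it possible to compare pointers whose types differ, which ordinary pointers do not allow.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair is a hypothetical struct with two fields of the same type.
type pair struct {
	first  int32
	second int32
}

// startsAtFirstField compares a *pair with a *int32 by converting both
// to unsafe.Pointer; the two typed pointers cannot be compared directly.
func startsAtFirstField(v *pair) bool {
	return unsafe.Pointer(v) == unsafe.Pointer(&v.first)
}

// fieldsShareAddress reports whether the two fields have the same address.
func fieldsShareAddress(v *pair) bool {
	return unsafe.Pointer(&v.first) == unsafe.Pointer(&v.second)
}

func main() {
	var none unsafe.Pointer // the zero value is nil
	fmt.Println(none == nil)

	var v pair
	fmt.Println(startsAtFirstField(&v))
	fmt.Println(fieldsShareAddress(&v))
}
```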

unsafe.Pointer does not support arithmetic directly, but it can be converted to uintptr, the arithmetic can be done on the uintptr, and the result can be converted back to unsafe.Pointer.
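
The round trip through uintptr can be sketched as follows, reading array elements by address instead of by index. The elemAt helper is a hypothetical name, and it assumes the caller passes a valid index, because nothing in it checks bounds.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemAt returns arr[i] using address arithmetic instead of indexing.
// The caller must ensure 0 <= i < len(arr); nothing here checks bounds.
func elemAt(arr *[4]int32, i int) int32 {
	// Pointer -> uintptr, add the byte offset, uintptr -> Pointer,
	// all in one expression so that no uintptr is stored in a variable.
	p := unsafe.Pointer(uintptr(unsafe.Pointer(arr)) + uintptr(i)*unsafe.Sizeof(arr[0]))
	return *(*int32)(p)
}

func main() {
	arr := [4]int32{10, 20, 30, 40}
	for i := range arr {
		fmt.Println(elemAt(&arr, i))
	}
}
```

Keeping the whole conversion in a single expression is deliberate: the address never lives in a plain integer variable where the runtime could lose track of it.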

uintptr

uintptr is a built-in Go type, an unsigned integer wide enough to store a pointer. On 64-bit platforms it is 64 bits wide, like uint64.

// uintptr is an integer type that is large enough to hold the bit pattern of any pointer.
type uintptr uintptr

An unsafe.Pointer can be converted to uintptr and saved in a variable of type uintptr (note: this variable only holds the same numeric value as the pointer; it is not a pointer), which can then be used for whatever address arithmetic is needed. Although the conversion is reversible, converting an arbitrary uintptr to unsafe.Pointer may break the type system, because not every number is a valid memory address.

Another thing to note is that uintptr has no pointer semantics: a uintptr does not keep the object at that address alive, so the garbage collector may reclaim the object even though a uintptr still holds its address. unsafe.Pointer has pointer semantics, so the object it points to is protected from garbage collection.

With the concepts covered, let's look at how to use unsafe.Pointer for pointer conversion, and how to combine it with uintptr to read and write the unexported fields of a struct.

Application examples

Pointer type conversion using unsafe.Pointer

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	v1 := uint(12)
	v2 := int(13)

	fmt.Println(reflect.TypeOf(v1)) // uint
	fmt.Println(reflect.TypeOf(v2)) // int

	fmt.Println(reflect.TypeOf(&v1)) // *uint
	fmt.Println(reflect.TypeOf(&v2)) // *int

	p := &v1
	p = (*uint)(unsafe.Pointer(&v2)) // convert *int to *uint through unsafe.Pointer

	fmt.Println(reflect.TypeOf(p)) // *uint
	fmt.Println(*p)                // 13
}

Using unsafe.Pointer to read and write private members of a structure

With the Offsetof function we can get the offset of a struct field and from it the address of the field. By reading and writing memory at that address, we can change the field's value.

Here is a memory allocation related fact: the structure will be allocated a continuous block of memory, and the address of the structure also represents the address of the first member.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	var x struct {
		a int
		b int
		c []int
	}

	// The argument of unsafe.Offsetof must be a field selector such as x.b.
	// It returns the offset of field b from the start of x, including any padding.

	// Equivalent to pb := &x.b
	pb := (*int)(unsafe.Pointer(uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)))
	*pb = 42
	fmt.Println(x.b) // "42"
}

Although the expression above is cumbersome, that is not a bad thing here, because these features should be used with care. Do not be tempted to introduce a temporary variable of type uintptr to break it up, because that can make the code incorrect.

Changing it to the following is risky:

tmp := uintptr(unsafe.Pointer(&x)) + unsafe.Offsetof(x.b)
pb := (*int)(unsafe.Pointer(tmp))
*pb = 42

As a program runs, the runtime may grow or shrink a goroutine's stack. When it does, it copies the data from the old stack to the new stack area and updates every pointer that pointed into the old one. An unsafe.Pointer is a pointer, so when the data it points to moves to a new stack, the pointer is updated as well. A temporary variable of type uintptr, however, is just an ordinary number, so its value is not changed. The incorrect code above introduces a non-pointer temporary, tmp, which hides from the runtime the fact that it refers to variable x. By the time the second statement executes, x may have been moved, and tmp is no longer the current address of x.b. The third statement then writes to a stale address, which can corrupt memory or crash the whole program.
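
On Go 1.17 or later, unsafe.Add offers a way to do the same offset arithmetic without ever producing a uintptr. The sketch below assumes such a toolchain; the point type and the setB helper are hypothetical stand-ins for the anonymous struct x above.

```go
package main

import (
	"fmt"
	"unsafe"
)

// point is a hypothetical named version of the struct x used above.
type point struct {
	a int
	b int
	c []int
}

// setB writes v into p.b through the field's offset.
// unsafe.Add (Go 1.17+) keeps the value an unsafe.Pointer throughout,
// so there is no uintptr temporary for the runtime to lose track of.
func setB(p *point, v int) {
	pb := (*int)(unsafe.Add(unsafe.Pointer(p), unsafe.Offsetof(point{}.b)))
	*pb = v
}

func main() {
	var x point
	setB(&x, 42)
	fmt.Println(x.b)
}
```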

Zero-copy conversion between string and []byte

This is a classic example: converting between a string and a byte slice without copying the data.

At run time, string and []byte have the layouts described by reflect.StringHeader and reflect.SliceHeader:

type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
type StringHeader struct {
	Data uintptr
	Len  int
}

To achieve a zero-copy conversion, the string and the slice only need to share the same underlying byte array.

The code is simple enough not to need a detailed explanation: it builds a reflect.StringHeader or reflect.SliceHeader that points at the existing data and reinterprets it as a string or a []byte.

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	s := "Hello World"
	b := string2bytes(s)
	fmt.Println(b)
	s = bytes2string(b)
	fmt.Println(s)

}

func string2bytes(s string) []byte {
	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&s))

	bh := reflect.SliceHeader{
		Data: stringHeader.Data,
		Len: stringHeader.Len,
		Cap: stringHeader.Len,
	}

	return *(*[]byte)(unsafe.Pointer(&bh))
}

func bytes2string(b []byte) string {
	sliceHeader := (*reflect.SliceHeader)(unsafe.Pointer(&b))

	sh := reflect.StringHeader{
		Data: sliceHeader.Data,
		Len:  sliceHeader.Len,
	}

	return *(*string)(unsafe.Pointer(&sh))
}
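
Since Go 1.20, reflect.StringHeader and reflect.SliceHeader are deprecated, and the unsafe package provides String, StringData, Slice and SliceData for this purpose. The following sketch assumes Go 1.20 or later; the function names stringToBytes and bytesToString are chosen for this illustration. In both directions the two values share memory, so the bytes obtained from a string must never be modified.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes returns a []byte that shares memory with s.
// The result must never be modified: string data may be read-only.
func stringToBytes(s string) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// bytesToString returns a string that shares memory with b.
// b must not be modified for as long as the string is in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := stringToBytes("Hello World")
	fmt.Println(b)
	fmt.Println(bytesToString(b))
}
```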

Summary

The unsafe package is widely used in the Go source code itself. With it we can bypass the restrictions on Go pointers and operate on memory directly. Using it is risky, but in some scenarios it can make code more efficient.

Link to the original text: https://studygolang.com/articles/32744


Posted by maskme on Sat, 22 May 2021 02:40:57 +0930