Preface
Golang provides the for range syntax sugar to make it easy to traverse arrays and slices, as well as maps and channels. Rust provides the for in syntax sugar to make it easy to traverse arrays, vectors, slices, and maps. This article uses a variety of examples to explain the mechanism behind this syntax sugar in the two languages, and the traps you may run into if it is used carelessly.
Comparison of traversal methods
Golang
Let's first look at the three traversal methods in Golang:
arr := []int{2, 4, 6}

// Just the index
for i := range arr {
    fmt.Println(i)
}

// Index and value
for i, v := range arr {
    fmt.Println(i, v)
}

// Just the value
for _, v := range arr {
    fmt.Println(v)
}
Output:
0
1
2
0 2
1 4
2 6
2
4
6
Rust
First of all, note that there are four ways to write a traversal of arr in Rust. In essence, each one converts arr into the corresponding iterator.
let arr = vec![1, 2, 3];
for a in arr {
    println!("{}", a);
}

let arr = vec![1, 2, 3];
for a in arr.into_iter() {
    println!("{}", a);
}

let arr = vec![1, 2, 3];
for a in arr.iter() {
    println!("{}", a);
}

let mut arr = vec![1, 2, 3];
for a in arr.iter_mut() {
    println!("{}", a);
}
Here, for a in arr is equivalent to for a in arr.into_iter(). This form moves the elements out of arr, so arr can no longer be used after the loop.
for a in arr.iter() yields an immutable borrow of each item in arr, and for a in arr.iter_mut() yields a mutable borrow of each item. With either form, arr can still be used after the loop.
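The ownership difference between the three iterator forms can be sketched as follows. The helper names (sum_by_ref, double_in_place, sum_by_value) are hypothetical and exist only for illustration; only iter, iter_mut and the by-value for loop come from the text above.

```rust
/// Borrows each element immutably; the caller keeps ownership of the data.
fn sum_by_ref(arr: &[i32]) -> i32 {
    let mut total = 0;
    for a in arr.iter() {
        total += *a;
    }
    total
}

/// Borrows each element mutably and doubles it in place.
fn double_in_place(arr: &mut [i32]) {
    for a in arr.iter_mut() {
        *a *= 2;
    }
}

/// Takes ownership of the vector; the loop consumes it.
fn sum_by_value(arr: Vec<i32>) -> i32 {
    let mut total = 0;
    for a in arr {
        total += a;
    }
    total
}

fn main() {
    let mut arr = vec![1, 2, 3];
    println!("{}", sum_by_ref(&arr)); // arr is still usable afterwards
    double_in_place(&mut arr);
    println!("{:?}", arr);
    println!("{}", sum_by_value(arr)); // arr is moved into the function here
    // println!("{:?}", arr); // would not compile: arr was moved
}
```

Uncommenting the last line should produce a "borrow of moved value" error, which is the compile-time form of "arr can no longer be used".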
If an index is needed, use the iterator's enumerate method. It wraps the current iterator in a new iterator whose items are tuples of (index, element), as follows:
let arr = vec![1, 2, 3];
for (i, v) in arr.into_iter().enumerate() {
    println!("{} {}", i, v);
}

let arr = vec![1, 2, 3];
for (i, v) in arr.iter().enumerate() {
    println!("{} {}", i, v);
}

let mut arr = vec![1, 2, 3];
for (i, v) in arr.iter_mut().enumerate() {
    println!("{} {}", i, v);
}
Rust also has a common traversal form that yields only indexes, by iterating over a range:
for i in 1..4 {
    println!("{}", i);
}
Output:
1
2
3
As you can see, beg..end is a half-open interval: it includes beg and excludes end.
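As a small sketch of the interval rule, the half-open form beg..end can be compared with the inclusive form beg..=end. The helper names half_open and closed are hypothetical.

```rust
/// Collects the half-open range beg..end (end is excluded).
fn half_open(beg: i32, end: i32) -> Vec<i32> {
    (beg..end).collect()
}

/// Collects the inclusive range beg..=end (end is included).
fn closed(beg: i32, end: i32) -> Vec<i32> {
    (beg..=end).collect()
}

fn main() {
    println!("{:?}", half_open(1, 4)); // [1, 2, 3]
    println!("{:?}", closed(1, 4)); // [1, 2, 3, 4]

    // Because the end is excluded, 0..arr.len() covers exactly the valid indexes.
    let arr = vec![10, 20, 30];
    for i in 0..arr.len() {
        println!("{} {}", i, arr[i]);
    }
}
```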
Concurrent task distribution
In real projects, it is often necessary to distribute a set of task data to different goroutines or threads for concurrent execution.
Golang
Let's first look at a common mistake in Golang:
func slice_trap_wrong() {
    in := []int{1, 2, 3}
    for _, b := range in {
        go func() {
            fmt.Println("job", b)
        }()
    }
}

func main() {
    slice_trap_wrong()
    select {}
}
Result of running it:
job 3
job 3
job 3
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [select (no cases)]:
main.main()
    /home/repl/904b2209-3e69-479f-a530-1954e1cf59cd/main.go:25 +0x25
exit status 2

** Process exited - Return Code: 1 **
To make sure that every goroutine in the program gets a chance to run, the main goroutine must not exit early. Here we use select {} to block the main function forever, which is why the runtime eventually reports fatal error: all goroutines are asleep - deadlock!
Ignoring that final error, we can see that all three goroutines print 3. The reason is that b in for _, b := range in is a single variable shared by every iteration, so the closures in the three newly created goroutines all capture the same b. Under Go's GPM scheduling model, a goroutine does not necessarily run as soon as it is created. By the time for _, b := range in has finished, the value of b is 3, and when the three goroutines actually start running they all read that same b and get its last value, 3. (This is the behavior of Go versions before 1.22. In modules that declare Go 1.22 or later, each iteration gets its own loop variable.)
Once the cause is known, there are two ways to fix the problem:
1. Pass the task data in as an argument. Arguments in Golang are passed by value, so the b received by the goroutine is a new variable holding a copy
func slice_trap_right() {
    in := []int{1, 2, 3}
    for _, b := range in {
        go func(b int) {
            fmt.Println("job", b)
        }(b)
    }
}
2. Manually copy the loop variable inside the loop body, so that each goroutine's closure captures its own copy
func slice_trap_right() {
    in := []int{1, 2, 3}
    for _, b := range in {
        b := b
        go func() {
            fmt.Println("job", b)
        }()
    }
}
Rust
Next, let's look at the incorrect version in Rust:
let arr = vec![1, 2, 3];
let mut t_arr = vec![];
for a in arr {
    t_arr.push(thread::spawn(|| println!("{}", a)));
}
// Wait for all threads to finish executing
for t in t_arr {
    t.join().unwrap();
}
The compiler rejects this outright, because the spawned thread may outlive a, the loop variable of for a in arr, which the closure only borrows:
error[E0373]: closure may outlive the current function, but it borrows `a`, which is owned by the current function
  --> src/main.rs:16:34
   |
16 |         t_arr.push(thread::spawn(|| println!("{}", a)));
   |                                  ^^                - `a` is borrowed here
   |                                  |
   |                                  may outlive borrowed value `a`
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:16:20
   |
16 |         t_arr.push(thread::spawn(|| println!("{}", a)));
   |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `a` (and any other referenced variables), use the `move` keyword
   |
16 |         t_arr.push(thread::spawn(move || println!("{}", a)));
   |                                  ++++

For more information about this error, try `rustc --explain E0373`.
error: could not compile `channel` due to previous error
The Rust error message already suggests the fix: use the move keyword so that the closure takes ownership of a:
let arr = vec![1, 2, 3];
let mut t_arr = vec![];
for a in arr {
    // Address of the loop variable a
    println!("{:p}", &a);
    // Address of the a that was moved into the closure
    t_arr.push(thread::spawn(move || println!("{} {:p}", a, &a)));
}
for t in t_arr {
    t.join().unwrap();
}
Output:
0x7ffe3a4398d4
0x7ffe3a4398d4
1 0x7f68b4b87b8c
0x7ffe3a4398d4
2 0x7f68b4986b8c
3 0x7f68b4785b8c
You can see that the a running inside each thread is not the same variable as the a in for a in arr (their addresses differ), and that the a in each thread is a different variable as well.
Next, let's also print the original addresses of the elements in arr:
let arr = vec![1, 2, 3];
let mut t_arr = vec![];
// Addresses of the elements inside arr
for a in arr.iter() {
    println!("{:p}", a);
}
for a in arr {
    println!("{:p}", &a);
    t_arr.push(thread::spawn(move || println!("{} {:p}", a, &a)));
}
for t in t_arr {
    t.join().unwrap();
}
Output:
0x5633d313ead0
0x5633d313ead4
0x5633d313ead8
0x7ffdbb7d2cc4
0x7ffdbb7d2cc4
1 0x7fee6d482b8c
0x7ffdbb7d2cc4
2 0x7fee6d281b8c
3 0x7fee6d080b8c
This shows that each element of arr is first moved into a, and then copied once more when a is moved into the closure.
Let's try again with a struct that does not derive the Copy trait.
pub struct People {
    pub age: i32,
}
Write the test code as follows:
let arr = vec![People { age: 1 }, People { age: 2 }, People { age: 3 }];
let mut t_arr = vec![];
for a in arr.iter() {
    println!("{:p}", a);
}
for a in arr {
    println!("{:p}", &a);
    t_arr.push(thread::spawn(move || println!("{} {:p}", a.age, &a)));
}
for t in t_arr {
    t.join().unwrap();
}
Output:
0x555c64ec9ad0
0x555c64ec9ad4
0x555c64ec9ad8
0x7ffdc5766474
0x7ffdc5766474
1 0x7f0b6f4b1b8c
0x7ffdc5766474
2 0x7f0b6f2b0b8c
3 0x7f0b6f0afb8c
You can see that the People variable in arr has also been moved twice.
If the data structures in a project are large and repeated moves are unacceptable, you can optimize with Arc as follows.
Arc optimization
Arc is a thread-safe reference-counted pointer that can be safely shared between threads.
let arr = vec![
    Arc::new(People { age: 1 }),
    Arc::new(People { age: 2 }),
    Arc::new(People { age: 3 }),
];
let mut t_arr = vec![];
for a in arr.iter() {
    // Cloning an Arc only increments the strong reference count
    let a = Arc::clone(a);
    t_arr.push(thread::spawn(move || {
        // Arc::strong_count returns the current strong reference count
        println!("in thread {} count:{}", a.age, Arc::strong_count(&a))
    }));
}
for t in t_arr {
    t.join().unwrap();
}
// When a thread finishes, its a is dropped, which only decrements the strong reference count
for a in arr {
    println!("final: {} count:{}", a.age, Arc::strong_count(&a));
}
Output:
in thread 1 count:2
in thread 3 count:2
in thread 2 count:2
final: 1 count:1
final: 2 count:1
final: 3 count:1
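As an aside that the original article does not cover: when the threads are known to finish before arr goes out of scope, the standard library's scoped threads (std::thread::scope, available since Rust 1.63) let each thread borrow its element, so neither a move nor an Arc is needed. The sketch below assumes that situation; sum_ages_scoped is a hypothetical helper name, and People mirrors the struct defined earlier.

```rust
use std::thread;

// Same shape as the People struct used above (no Copy or Clone derived).
pub struct People {
    pub age: i32,
}

/// Hypothetical helper: runs one scoped thread per element and sums the ages.
/// Each thread only borrows its element, so nothing is moved out of `people`.
fn sum_ages_scoped(people: &[People]) -> i32 {
    thread::scope(|s| {
        // `move` here moves only the reference p into the thread, not the People value.
        let handles: Vec<_> = people
            .iter()
            .map(|p| s.spawn(move || p.age))
            .collect();
        // All scoped threads are joined before thread::scope returns.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<i32>()
    })
}

fn main() {
    let arr = vec![People { age: 1 }, People { age: 2 }, People { age: 3 }];
    println!("{}", sum_ages_scoped(&arr));
    // arr is still owned here, because the threads only borrowed from it.
    println!("{}", arr.len());
}
```

Arc remains the better fit when the threads may outlive the function that created the data, which scoped threads do not allow.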
Cyclic perpetual motion machine
If we modify the elements of the array while traversing the array, can we get a loop that will never stop?
Let's look at the golang version first
Golang
This example comes from the article "Implementation of Go language for and range" (see the references):
func main() {
    arr := []int{1, 2, 3}
    for _, v := range arr {
        arr = append(arr, v)
    }
    fmt.Println(arr)
}

$ go run main.go
[1 2 3 1 2 3]
This output shows that the loop visits only the three elements of the original slice. Elements appended during traversal do not increase the number of iterations, so the loop terminates. Why doesn't it run forever? Before generating machine code, the Golang compiler rewrites for range: it first obtains the length of arr and saves it, and then runs a classic three-clause for loop. For details, see draveness's explanation in "Implementation of Go language for and range"; we won't go into them here.
Rust
Now let's try to build the same perpetual motion loop in Rust:
let mut arr = vec![1, 2, 3];
for a in arr.iter() {
    println!("{}", a);
    arr.push(4);
}
Rust reports an error. Under Rust's ownership rules, a value can have only one mutable borrow at a time, and no immutable borrows may exist while it does.
error[E0502]: cannot borrow `arr` as mutable because it is also borrowed as immutable
  --> src/main.rs:41:9
   |
39 |     for a in arr.iter() {
   |              ----------
   |              |
   |              immutable borrow occurs here
   |              immutable borrow later used here
40 |         println!("{}", a);
41 |         arr.push(4);
   |         ^^^^^^^^^^^ mutable borrow occurs here
If a real project does need to generate new data from the data already in a Vector, for example appending the child nodes of the current parent node to the traversal array during a DFS or BFS traversal, you can loop by index instead:
let mut arr = vec![1, 2, 3];
let mut i = 0;
let len = arr.len();
while i < len {
    arr.push(arr[i]);
    i += 1;
}
Note that the code above also computes the length of arr before the loop starts. Written as follows instead, it becomes an infinite loop:
let mut arr = vec![1, 2, 3];
let mut i = 0;
while i < arr.len() {
    arr.push(arr[i]);
    i += 1;
}
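For the BFS case mentioned above, re-reading the length on every iteration is actually what you want, because the traversal should also visit the nodes it has just appended. The following is a minimal sketch under the assumption that the graph is a tree stored as an adjacency list (children[n] holds the child nodes of n), so the loop terminates once every node has been visited. The function name bfs_order and the data layout are hypothetical.

```rust
/// Breadth-first traversal of a tree stored as an adjacency list.
/// Assumes the input has no cycles; with a cycle this loop would never end.
fn bfs_order(children: &[Vec<usize>], start: usize) -> Vec<usize> {
    let mut order = vec![start];
    let mut i = 0;
    // order.len() is re-evaluated on purpose, so newly pushed nodes are visited too.
    while i < order.len() {
        let node = order[i]; // copy the index out, so no borrow of `order` is held
        for &child in &children[node] {
            order.push(child);
        }
        i += 1;
    }
    order
}

fn main() {
    // Node 0 has children 1 and 2; node 1 has child 3.
    let children = vec![vec![1, 2], vec![3], vec![], vec![]];
    println!("{:?}", bfs_order(&children, 0)); // [0, 1, 2, 3]
}
```

Indexing instead of holding an iterator is what keeps the borrow checker satisfied here: no borrow of order is alive at the moment push is called.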
Magic pointer
Golang
The Golang example also comes from "Implementation of Go language for and range":
func main() {
    arr := []int{1, 2, 3}
    newArr := []*int{}
    for _, v := range arr {
        newArr = append(newArr, &v)
    }
    for _, v := range newArr {
        fmt.Println(*v)
    }
}

$ go run main.go
3 3 3
Many people expect 1, 2 and 3 to be printed, because they assume that the v in for range is created fresh inside each iteration. That is, they picture the loop as:
for i := 0; i < len(arr); i++ {
    v := arr[i]
    newArr = append(newArr, &v)
}
In fact, in Go versions before 1.22, it behaves like this:
v := 0
for i := 0; i < len(arr); i++ {
    v = arr[i]
    newArr = append(newArr, &v)
}
For details, read the original article and the comments below it. You can also print the addresses seen while traversing arr and newArr to confirm this.
Rust
We use for a in arr.iter() to obtain an immutable borrow of each element of arr directly, so new_arr in the program prints the expected values:
let arr: Vec<i32> = vec![1, 2, 3];
let mut new_arr: Vec<&i32> = vec![];
for a in arr.iter() {
    new_arr.push(a);
}
for a in new_arr {
    println!("{}", *a);
}
Output:
1
2
3
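The push loop above can also be written with collect, which gathers the borrows produced by arr.iter() in one step. This is only a sketch of an equivalent form; refs_of is a hypothetical helper name.

```rust
/// Collects a reference to every element of `arr`.
fn refs_of(arr: &[i32]) -> Vec<&i32> {
    arr.iter().collect()
}

fn main() {
    let arr: Vec<i32> = vec![1, 2, 3];
    let new_arr = refs_of(&arr);
    for a in &new_arr {
        println!("{}", a);
    }
    // Each reference points at a distinct element stored inside arr,
    // not at a reused loop variable.
    println!("{}", std::ptr::eq(new_arr[0], &arr[0]));
}
```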
Note that the following version fails to compile:
let arr: Vec<i32> = vec![1, 2, 3];
let mut new_arr: Vec<&i32> = vec![];
for a in arr {
    new_arr.push(&a);
}
for a in new_arr {
    println!("{}", *a);
}
The reason is that a is dropped at the end of each iteration of for a in arr, so its lifetime is too short to cover for a in new_arr:
error[E0597]: `a` does not live long enough
  --> src/main.rs:60:22
   |
60 |         new_arr.push(&a);
   |                      ^^ borrowed value does not live long enough
61 |     }
   |     - `a` dropped here while still borrowed
62 |     for a in new_arr {
   |              ------- borrow later used here
In addition, we can make another test example to see whether a in for a in arr creates a new variable or reuses a variable in each iteration:
let arr: Vec<i32> = vec![1, 2, 3];
for a in arr.iter() {
    println!("{:p}", a);
}
for a in arr {
    println!("{:p}", &a);
}
First we obtain an immutable borrow of each element in arr through arr.iter() and print it, which gives the address of each element. Then we print the address of a in each iteration of for a in arr. The output is:
0x563932220b10
0x563932220b14
0x563932220b18
0x7fff5e40c2e4
0x7fff5e40c2e4
0x7fff5e40c2e4
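The addresses printed in for a in arr are all the same, which indicates that the same stack slot is reused for a in every iteration, similar to the loop variable of for range in Golang. The difference is that Rust's borrow checker refuses to let a reference to a outlive the iteration, as the compile error above shows.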
Traversal map
Here is just a brief comparison of how the two languages traverse a map. A more detailed comparison of maps will be covered in a separate article.
Golang
scores := make(map[string]int)
scores["Yellow"] = 50
scores["Blue"] = 10
for k, v := range scores {
    fmt.Println(k, v)
}
Rust
use std::collections::HashMap;

let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
for (key, value) in &scores {
    println!("{}: {}", key, value);
}
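The borrowing rules shown earlier for vectors apply to maps as well: for (key, value) in &scores borrows the entries immutably, while iter_mut yields mutable borrows of the values. Iteration order is arbitrary for Rust's HashMap (and for Golang's map too), so the sketch below sorts the entries when a stable order is needed. The helper names add_bonus and sorted_entries are hypothetical.

```rust
use std::collections::HashMap;

/// Adds `bonus` to every score by iterating over mutable borrows of the values.
fn add_bonus(scores: &mut HashMap<String, i32>, bonus: i32) {
    for (_team, score) in scores.iter_mut() {
        *score += bonus;
    }
}

/// Returns a copy of the entries sorted by key, because HashMap iteration
/// order is not something to rely on.
fn sorted_entries(scores: &HashMap<String, i32>) -> Vec<(String, i32)> {
    let mut entries: Vec<(String, i32)> =
        scores.iter().map(|(k, v)| (k.clone(), *v)).collect();
    entries.sort();
    entries
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Yellow"), 50);

    add_bonus(&mut scores, 5);
    for (key, value) in sorted_entries(&scores) {
        println!("{}: {}", key, value);
    }
}
```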
References
Implementation of Go language for and range https://draveness.me/golang/d...