[Linux operating system] - process control -- process waiting

Contents

A preliminary understanding of fork function

Return value of fork function

fork general usage

Reason for fork call failure

Process termination

Process exit scenario

How the process exits

Use of process exit

Process waiting

Process waiting necessity

Method of process waiting

wait method

waitpid method (emphasis)

status understanding

waitpid obtains the correct expression of exit code and signal

Understanding of waitpid

[understanding of options]:

A preliminary understanding of fork function

fork is one of the most important functions in Linux. It creates a new process from an existing one: the new process is the child process, and the original process is the parent process.

#include <unistd.h>  (see man fork)
pid_t fork(void);
Return value: 0 is returned in the child process, the child's pid is returned in the parent process, and -1 is returned on error

When a process calls fork and control passes to the fork code in the kernel, the kernel does the following:

  • Allocates new memory blocks and kernel data structures for the child process
  • Copies part of the parent's data structure contents to the child process
  • Adds the child process to the system process list
  • Returns from fork and lets the scheduler start scheduling

After a process calls fork, there are two processes with the same binary code, and both continue from the same point: the return from fork. From there each process goes its own way. Consider the following program:

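#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main( void )
{
    pid_t pid;
    printf("Before: pid is %d\n", getpid());
    if ( (pid=fork()) == -1 ) { //fork failed
        perror("fork()");
        exit(1);
    }
    //Both the parent and the child run from here on
    printf("After:pid is %d, fork return %d\n", getpid(), pid);
    sleep(1);
    return 0;
}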

Operation results:
[root@localhost linux]# ./a.out
Before: pid is 43676
After:pid is 43676, fork return 43677
After:pid is 43677, fork return 0

The output has three lines: one Before line and two After lines. Process 43676 prints the Before message and then an After message; the other After message is printed by process 43677. Note that process 43677 does not print Before. Why? The child only comes into existence inside fork, so it starts executing at the return from fork and never runs the code before it.

Therefore, before fork, the parent process executes independently. After fork, the parent and child execution flows execute separately. Note that after fork, it is entirely up to the scheduler to decide who will execute first.

Return value of fork function

  • In the child process, fork returns 0
  • In the parent process, fork returns the pid of the child process

Copy on write

Normally the parent and child share the same code. As long as neither of them writes, the data is shared as well. When either of them tries to write, the affected data is copied so that each process gets its own private copy. This is copy on write.

fork general usage

  • A parent process wants to duplicate itself so that parent and child execute different code segments at the same time. For example, a parent process waits for client requests and creates a child process to handle each request.
  • A process wants to execute a different program. For example, the child calls an exec function after it returns from fork.

Reason for fork call failure

  • There are too many processes in the system
  • The number of processes owned by the real user exceeds the limit

Process termination

Process exit scenario

  • The code runs to completion and the result is correct
  • The code runs to completion but the result is incorrect
  • The code terminates abnormally

To try this out, we write a small program and a makefile to compile it:

[wjy@VM-24-9-centos 16]$ cat makefile
myproc:myproc.c
	gcc -o $@ $^
.PHONY:clean
clean:
	rm -r myproc
[wjy@VM-24-9-centos 16]$ cat myproc.c
#include <stdio.h>

int main()
{
  printf("hello world\n");
  return 100;
}
[wjy@VM-24-9-centos 16]$ make
gcc -o myproc myproc.c
[wjy@VM-24-9-centos 16]$ ./myproc
hello world
[wjy@VM-24-9-centos 16]$ echo $?
100
[wjy@VM-24-9-centos 16]$ echo $?
0
[wjy@VM-24-9-centos 16]$ echo $?
0

echo $? displays the exit code of the most recently executed command. echo is itself a command (a bash builtin), so the second and third echo $? report the exit code of the previous echo, which is 0 every time. 0 is the value returned when a run succeeds.

ls accepts options such as -a, -b and -c, but not every letter is a valid option. When given an invalid option, ls terminates immediately, and echo $? shows that its exit code is 2. The exit code is the value returned by the program's main function. Here 2 is the value returned when the code ran to completion but the result was incorrect.

[wjy@VM-24-9-centos 16]$ ls -a -b -c -d -e
ls: invalid option -- 'e'
Try 'ls --help' for more information.
[wjy@VM-24-9-centos 16]$ echo $?
2

The two cases above are both results of code that ran to completion:

  • 0 means the run succeeded (success)
  • A non-zero value means the run completed but the result was incorrect. Each non-zero value can stand for a different reason, so the number tells you why the program failed.

strerror: converts an error number into an error message.

Each non-zero number represents one error reason. The messages here come from the C library. If you prefer not to use the library's table, you can implement your own: define an array of pointers such as char* arr[140], where each pointer stores one error message and the index is the error code.

[wjy@VM-24-9-centos 16]$ cat myproc.c
#include <stdio.h>
#include <string.h> //header for strerror
int main()
{
  for(int i=0;i<140;i++)
  {
    printf("%d:%s\n",i,strerror(i));
  }
  return 0;
}
//Error code information
[wjy@VM-24-9-centos 16]$ ./myproc
0:Success
1:Operation not permitted
2:No such file or directory
3:No such process
//... (middle of the output omitted)

//No messages are defined after 133, so there are 134 error messages, numbered from 0
132:Operation not possible due to RF-kill
133:Memory page has hardware error
134:Unknown error 134
135:Unknown error 135
136:Unknown error 136
137:Unknown error 137
138:Unknown error 138
139:Unknown error 139

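So when ls is asked to list a file that does not exist, it reports an error, and echo $? shows exit code 2. Looking this up in the table above, 2 corresponds to No such file or directory. Note that this match is partly a coincidence: strerror describes errno values, whereas an exit code follows whatever convention the program chooses. GNU ls documents exit status 2 as "serious trouble", which is also why the invalid option earlier produced 2.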

[wjy@VM-24-9-centos 16]$ ls -a myfile.txt
ls: cannot access myfile.txt: No such file or directory
[wjy@VM-24-9-centos 16]$ echo $?
2

The third case is abnormal termination of the code.

The following code draws a warning from the compiler when make is run, and the run ends with an error message. A program that terminates abnormally partway through like this is said to have crashed.

Checking the exit code with echo $? gives 136. No matter where in the program the crash happens, the result is 136 (you can try it yourself), and the message table above has no entry for 136 (Unknown error 136). This is because the process never returned an exit code of its own: it was killed by a signal. For a process killed by signal n, bash reports 128 + n in $?, and 136 = 128 + 8, where 8 is SIGFPE, the signal behind the Floating point exception message. Once a program crashes, its exit code carries no meaning; what matters is the signal.

[wjy@VM-24-9-centos 16]$ cat myproc.c
#include <stdio.h>
#include <string.h>
int main()
{
  for(int i=0;i<140;i++)
  {
    printf("%d:%s\n",i,strerror(i));
  }
  int a=10;
  a/=0;
  printf("%d\n",a);
  return 0;
}

//make reports a warning during compilation
[wjy@VM-24-9-centos 16]$ vim myproc.c
[wjy@VM-24-9-centos 16]$ make
gcc -o myproc myproc.c -std=c99
myproc.c: In function 'main':
myproc.c:10:4: warning: division by zero [-Wdiv-by-zero]
   a/=0;
[wjy@VM-24-9-centos 16]$ ./myproc
0:Success
1:Operation not permitted
2:No such file or directory
3:No such process
//...
132:Operation not possible due to RF-kill
133:Memory page has hardware error
134:Unknown error 134
135:Unknown error 135
136:Unknown error 136
137:Unknown error 137
138:Unknown error 138
139:Unknown error 139
Floating point exception //The run ends with this error message
[wjy@VM-24-9-centos 16]$ echo $?
136

How the process exits

Process exit mode

  1. Return from main
  2. Call exit
  3. Call _exit

What we printed above is the exit code returned by main. Can a function other than main provide the exit code?

When a non-main function such as func is called, its return value simply goes back to the caller. This is called a function return.

Only a return from main is a process exit.

[wjy@VM-24-9-centos 16]$ cat myproc.c
#include <stdio.h>
#include <string.h>

int func()
{
  printf("func test\n");
  return 1;
}
int main()
{
  func();
  for(int i=0;i<5;i++)
  {
    printf("%d:%s\n",i,strerror(i));
  }
  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ make
gcc -o myproc myproc.c -std=c99
[wjy@VM-24-9-centos 16]$ ./myproc
func test
0:Success
1:Operation not permitted
2:No such file or directory
3:No such process
4:Interrupted system call

The other way is process termination: exiting the process with exit(exit code).

In the following program, exit is called right after func returns. The code after exit is never executed, and the process exit code is 12. The same happens if exit is called from inside func or any other function: no matter where exit is called, the process exits.

[wjy@VM-24-9-centos 16]$ cat myproc.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h> 
int func()
{
  printf("func test\n");
  return 1;
}
int main()
{
  func();
  exit(12);//Set the process exit code to 12
  for(int i=0;i<5;i++)
  {
    printf("%d:%s\n",i,strerror(i));
  }
  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
func test
[wjy@VM-24-9-centos 16]$ echo $?
12

exit does more than set the exit code. Look at the following code.

hello world is printed without a newline, and the program then sleeps for four seconds. The text first goes into a buffer, and it only appears on the display after the four seconds, when the buffer is flushed. Exiting the process with return from main causes the contents of the buffer to be flushed.

Likewise, exit(EXIT_SUCCESS) behaves the same as return 0: the process exit code is 0 (EXIT_SUCCESS is 0), and after the 4-second sleep the buffered content is flushed to the display. Both exit and a return from main flush the buffer before the process ends.

#include <stdio.h>
#include <unistd.h>
int main()
{
  printf("hello world");
  sleep(4);
  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
hello world[wjy@VM-24-9-centos 16]$ 

_exit, on the other hand, sets the exit code but does not flush the buffer, so the expected text is never displayed.

 

Summary:

  1. A return from main is a process exit; a return from any other function is just a function return.
  2. Calling exit anywhere terminates the process. Its argument is the exit code.
  3. _exit terminates the process immediately. The usual cleanup work of the process, such as flushing buffers, is not carried out. (The buffer here is the user-level buffer.)

To summarize the difference:

exit() first performs the user-level cleanup (such as flushing the buffers) and then terminates the process, whereas _exit goes straight to termination. Both end in the kernel's process termination routine. If the buffer lived inside the operating system kernel, both exit and _exit would flush it, since both end up in the kernel. But _exit does not flush it, which shows that the buffer is not in the operating system: it is a user-level buffer maintained by the C library.

Use of process exit

What is done at the OS (operating system) level when the process exits?

At the system level there is one fewer process: its PCB, mm_struct, page table and the various mappings are released, along with the memory occupied by its code and data.

Process waiting

Process waiting necessity

What is process waiting?

When a process calls fork, it creates a child process to help it complete some task. For the parent to learn how the child's work went and when it finished, the parent waits for the child to exit through wait/waitpid. This is called process waiting.

Why do you want the process to wait?

  1. By obtaining the child's exit information, the parent can learn the result of the child's execution. We need to know how the task the parent handed to the child turned out: whether the result is right or wrong, and whether the child exited normally at all.
  2. Waiting guarantees the ordering: the child exits first and the parent exits later. If the parent does not outlive the child, the child becomes an orphan process. It is adopted by the system, and the parent never sees its exit code, so it cannot act on it.
  3. When a process exits it first enters the zombie state, and a zombie that is never reaped leaks memory. The parent must release the resources held by the child through wait. Once a process is a zombie it is invulnerable: even kill -9, which "kills without blinking an eye", can do nothing, because nobody can kill a process that is already dead.
  4. In short, by waiting, the parent reclaims the child's resources and obtains the child's exit information.

Method of process waiting

Suppose that after fork the parent process exits immediately, while the child keeps running and exits 5 seconds later. The child then becomes an orphan process.

Therefore the parent needs to wait. Even if the parent has nothing else to do, it must wait until the child has exited and handed back its exit code before it exits itself.

wait method

pid_t wait(int *status);

  • Return value:

On success, the pid of the child that was waited for is returned; on failure, -1 is returned.

  • Parameters:

An output parameter that receives the exit status of the child process. If you don't care about it, pass NULL.

With waiting added, the code behaves as follows. At first the parent sleeps for 10 seconds. During that time the child runs, and the parent does not track its state because it is asleep. After 5 seconds the child exits and enters the Z (zombie) state. After 10 seconds the parent wakes up and reaps the child, and the zombie disappears. The parent then sleeps for another 10 seconds before the process ends. This shows that waiting reclaims child processes.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
  pid_t id=fork();
  if(id==0)
  {
    //child
    int cnt=5;
    while(cnt)
    {
      printf("child[%d] is running:cnt is:%d\n",getpid(),cnt);//Get your own pid
      cnt--;
      sleep(1);
    }
    exit(0);
  }
  sleep(10);
  printf("father wait begin!\n");
  //parent
  pid_t ret=wait(NULL);
  if(ret>0)//Wait for success and return the child process pid
  {
    printf("father wait:%d\n",ret);
  }
  else{//If the wait fails, - 1 is returned
    printf("father wait failed!\n");
  }
  sleep(10);

  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
child[9747] is running:cnt is:5
child[9747] is running:cnt is:4
child[9747] is running:cnt is:3
child[9747] is running:cnt is:2
child[9747] is running:cnt is:1
father wait begin!
father wait:9747

waitpid method (emphasis)

pid_t waitpid(pid_t pid, int *status, int options);

If pid is -1, the parent waits for any child process, which is equivalent to wait; if it is the pid of a child, the parent waits for that specific child.

status: an output parameter.
Return value:
On normal return, waitpid returns the process ID of the child that was collected;
If the option WNOHANG is set and waitpid finds that no child has exited yet, it returns 0;
If the call fails, it returns -1 and sets errno to indicate the error;

pid:
        pid=-1: wait for any child process. Equivalent to wait.
        pid>0: wait for the child process whose process ID equals pid.
status:
        WIFEXITED(status): true if the child terminated normally. (checks whether the process exited normally)
        WEXITSTATUS(status): if WIFEXITED is non-zero, extracts the child's exit code. (reads the exit code of the process)
options:
        WNOHANG: if the child specified by pid has not exited, waitpid() returns 0 immediately instead of waiting. If the child has exited, it returns the ID of that child process.

So how do we write this with waitpid? Only the wait call changes slightly, and the result is the same as above.

int main()
{
//...
  //pid_t ret=waitpid(id,NULL,0);//Wait for one specific child process
  pid_t ret=waitpid(-1,NULL,0);//Wait for any child process, equivalent to wait
  if(ret>0)//Wait for success and return the child process pid
  {
    printf("father wait:%d,success\n",ret);
  }
  else{//If the wait fails, - 1 is returned
    printf("father wait failed!\n");
  }
  sleep(10);
}

//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
child[11290] is running:cnt is:5
child[11290] is running:cnt is:4
child[11290] is running:cnt is:3
child[11290] is running:cnt is:2
child[11290] is running:cnt is:1
father wait begin!
father wait:11290,success

status understanding

status is an output parameter: a pointer. As long as a pointer is passed in, the function can change what it points to (by dereferencing it). status stores the exit status of the child that was waited for.

When the child exits with code 0, the status we get is 0.

If we change the child's exit code, what status does the parent get? It turns into an unfamiliar number. Why?

Because status is how the parent learns the result of the child. What the parent finds in status must be closely tied to how the child exited. And a process exit is always one of the three cases described earlier:

  • The code ran to completion and the result is correct: exit code 0
  • The code ran to completion and the result is incorrect: a non-zero exit code
  • The code terminated abnormally.

status is 32 bits wide: only the lower 16 bits are used, and the upper 16 bits are unused for now.

A process that runs to completion returns an exit code, while a process that terminates abnormally was killed by a signal. status therefore carries two kinds of information. The first is the terminating signal, stored in the low 7 bits (bits 0-6). A non-zero value here means the child terminated abnormally. If it is 0, the child ran to completion, and the second kind of information applies: the exit code, stored in bits 8-15.

The remaining bit, bit 7, is the core dump flag, which records whether a core dump was produced.

Let's verify the exit code and signal values. The exit code sits in bits 8-15, so shift status right by 8 bits and AND it with 0...0 1111 1111 (0xFF in hexadecimal) to get the exit code.

The terminating signal sits in bits 0-6, so AND status directly with 0111 1111 (0x7F in hexadecimal) to get the signal number.

In the following program the child's exit code is set to 11. After the parent's wait returns, the exit signal is 0, which shows that the child ended normally, and the exit code shown is the one passed to exit in the child. This confirms the layout described above.

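#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()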
{
  pid_t id=fork();
  if(id==0)
  {
    //child
    int cnt=3;
    while(cnt)
    {
      printf("child[%d] is running:cnt is:%d\n",getpid(),cnt);
      cnt--;
      sleep(1);
    }
    exit(11);
  }
  //parent
  printf("father wait begin!\n");
  int status=0;
  pid_t ret=waitpid(id,&status,0);
  if(ret>0)//Wait for success and return the child process pid
  {
    printf("father wait:%d,success,status exit code:%d,status exit signal:%d\n",ret,(status>>8)&0xFF,status&0x7F);//Exit code and terminating signal of the child process
  }
  else{
    printf("father wait failed!\n");
  }

  return 0;
}

//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
father wait begin!
child[30125] is running:cnt is:3
child[30125] is running:cnt is:2
child[30125] is running:cnt is:1
father wait:30125,success,status exit code:11,status exit signal:0

A process that exits abnormally was terminated by one of a number of signals; the full list can be viewed with the kill -l command.

So who is the parent of all the processes started from the command line? Checking their ppid shows that it is bash: bash is the parent of every process started from the command line. bash obtains the exit result of each child through wait, which is why echo $? can show the exit code of the child.

waitpid obtains the correct expression of exit code and signal

Above we extracted the child's exit code and signal with bit operations, which is not the standard way. The standard way is to use the macros provided for decoding status, which need no explicit bit operations.

 status:
WIFEXITED(status): true if the child terminated normally. (checks whether the process exited normally)
WEXITSTATUS(status): if WIFEXITED is non-zero, extracts the child's exit code. (reads the exit code of the process)

Before reading the exit code, check that the child actually exited normally.

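#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()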
{

  pid_t id=fork();
  if(id==0)
  {
    //child
    int cnt=5;
    while(cnt)
    {
      printf("child[%d] is running:cnt is:%d\n",getpid(),cnt);//Get your own pid
      cnt--;
      sleep(1);
    }
    exit(1);
  } 
  //parent
  int status=0;
  pid_t ret=waitpid(id,&status,0);
  if(ret>0&&WIFEXITED(status))//The wait succeeded and the child was not killed by a signal
  {
    //Normal end
    printf("exit code:%d\n",WEXITSTATUS(status));
  }
  else{
    printf("error,get a signal!\n");
  }
  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
child[16413] is running:cnt is:5
child[16413] is running:cnt is:4
child[16413] is running:cnt is:3
child[16413] is running:cnt is:2
child[16413] is running:cnt is:1
exit code:1

Understanding of waitpid

waitpid is a system call interface invoked by the user. Through the status output parameter, the user passes an address to waitpid. Inside the operating system there is a parent process and a child process, and the parent calls waitpid to obtain the exit information of the child. The child has its own virtual memory, which is mapped to physical memory through the page table. After the child ends it enters the zombie state: its PCB keeps the exit data, so the child's PCB holds two things, exit_code (the exit code) and signal (the terminating signal).

When we define an int status and pass its address to waitpid as int *status_p, the values in the child's PCB are written into the space the pointer refers to. The value is assembled in two parts, the reverse of the right shift and AND used above to read it: *status_p |= (exit_code<<8); *status_p |= signal; The result is returned to the user through waitpid. (OR is used because status starts out as 0 and has to hold both the signal and the exit code.)

[understanding of options]:

options controls how the parent behaves while waiting. Above we passed 0, the default, which means blocking wait. If it is set to WNOHANG, the wait is non-blocking.

What are blocking and non-blocking?

For example, A is downstairs and wants to invite B to dinner, but B says: I have something to finish, wait for half an hour. A agrees. If A calls B every 10 minutes to check on B's progress and does other things in between, that is non-blocking waiting. If A does nothing else for the 30 minutes and only reacts when B comes downstairs, that is blocking waiting.

Blocking:

Suppose the operating system has two processes, and the parent waits for the child with waitpid(). The parent enters the blocked state and is put on a wait queue. When the operating system detects that the child has finished, the parent goes from the waiting state (S) back to the running state (R).

  • The essence of blocking: the PCB of the process is put on a wait queue and the state of the process is set to S.
  • The essence of returning: the PCB of the process is moved from the wait queue to the run queue, where it is scheduled onto the CPU.

Let's run the non-blocking code and look at the result.

#include <stdio.h>
#include <string.h>
#include <stdlib.h> 
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{

  pid_t id=fork();
  if(id==0)
  {
    //child
    int cnt=10;
    while(cnt)
    {
      printf("child[%d] is running:cnt is:%d\n",getpid(),cnt);//Get your own pid
      cnt--;
      sleep(1);
    }
    exit(1);
  } 
  //parent
  
  int status=0;
  while(1)//Polling detection
  {
    pid_t ret=waitpid(id,&status,WNOHANG);
    if(ret==0)
    {
      //The waitpid call succeeded, but the child has not exited yet. The parent must check again later
      printf("Do father things!\n");
    }
    else if(ret>0)
    {
      //The child has exited, waitpid succeeded, and the result is available
      printf("father wait:%d,success,status exit code:%d,status exit signal:%d\n",ret,(status>>8)&0xFF,status&0x7F);
      break;
    }
    else//ret<0 
    {
      perror("waitpid");
      break;
    }
    sleep(1);//The parent checks once per second
  }
  return 0;
}
//Operation results
[wjy@VM-24-9-centos 16]$ ./myproc
Do father things!
child[8089] is running:cnt is:10
Do father things!
child[8089] is running:cnt is:9
Do father things!
...
Do father things!
child[8089] is running:cnt is:2
Do father things!
child[8089] is running:cnt is:1
Do father things!
father wait:8089,success,status exit code:1,status exit signal:0

Tags: Linux

Posted by noeffred on Sun, 17 Apr 2022 19:38:05 +0930