14. Sockets in Python Network Programming
1, What is a socket
A socket is a software abstraction layer that sits between the application layer and the TCP/IP protocol suite. It is a set of interfaces. In design-pattern terms, a socket is a facade: it hides the complexity of the TCP/IP protocol suite behind the socket interface. The user sees only a small set of simple calls and lets the socket arrange the data so that it conforms to the chosen protocol.
As a result, we do not need a deep understanding of TCP or UDP to get started. The socket layer wraps them for us; we only need to follow the rules of the socket API, and the programs we write will naturally conform to the TCP/UDP standards.
Sockets originated in BSD Unix, the version of Unix developed at the University of California, Berkeley from the late 1970s. For that reason they are sometimes called "Berkeley sockets" or "BSD sockets". Sockets were first designed for communication between multiple applications on the same host, which is known as interprocess communication, or IPC. There are two kinds of sockets (two address families): file-based and network-based.
File-based socket family: AF_UNIX
Network-based socket family: AF_INET
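To make the IPC idea concrete, here is a minimal sketch using `socket.socketpair()`, which returns two sockets that are already connected to each other inside one process. It assumes a standard CPython build; the variable names are illustrative only. On Unix-like systems the pair normally belongs to the AF_UNIX family, while on Windows Python falls back to AF_INET over loopback.

```python
import socket

# socketpair() hands back two sockets that are already connected to each other.
# No bind(), listen(), or connect() is needed for this kind of local IPC.
left, right = socket.socketpair()
try:
    left.sendall(b"ping")        # write on one end
    message = right.recv(1024)   # read on the other end
    family = left.family         # which address family the platform chose
finally:
    left.close()
    right.close()

print(family, message)
```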
2, Socket workflow
1. Socket workflow based on the TCP protocol
The server first creates a socket, binds it to a port, starts listening on that port, and then calls accept, which blocks until a client connects. A client creates its own socket and connects to the server; if that succeeds, the connection between client and server is established. The client sends a request, the server receives and processes it and sends back the response, the client reads the response, and finally the connection is closed. That completes one interaction.
Server
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4 socket using TCP, a stream protocol
server.bind(('127.0.0.1', 9939))  # Bind IP + port
server.listen(5)  # Start listening; allow up to 5 pending connections in the backlog
connect, client_addr = server.accept()  # Block until a client connects; returns the connection and the client address
print(client_addr)
data = connect.recv(1024)  # Receive at most 1024 bytes
connect.send(data.upper())  # Send the data back in upper case
connect.close()  # Close the connection to this client
server.close()  # Close the listening socket
Client
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4 socket using TCP, a stream protocol
client.connect(('127.0.0.1', 9939))  # Connect to the server
client.send('hello socket'.encode('utf-8'))  # Send the request as bytes
data = client.recv(1024)  # Receive at most 1024 bytes of the reply
print(data.decode('utf-8'))
client.close()  # Close the connection
Run the server first, then the client. The results:
# Client results
HELLO SOCKET

# Server results
('127.0.0.1', 50140)
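If opening two terminals is inconvenient, the same workflow can be tried in a single file. The sketch below is one possible arrangement, not the only one: it runs the server side in a background thread and binds to port 0 so that the operating system picks a free port. The helper names `serve_once` and `tcp_round_trip` are made up for this illustration.

```python
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one client, send its data back in upper case, then hang up."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(message: bytes) -> bytes:
    """Run one request/response exchange between a server thread and a client."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
        server.listen(5)
        port = server.getsockname()[1]   # find out which port was chosen

        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(("127.0.0.1", port))
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply


print(tcp_round_trip(b"hello socket"))  # b'HELLO SOCKET'
```

The `with` blocks close each socket automatically, so no explicit close() calls are needed.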
2. Socket workflow based on the UDP protocol
UDP differs from TCP in that:
- No connection is required
- UDP is a message-oriented (datagram) protocol, while TCP is a connection-oriented stream protocol
Let's simulate socket communication over UDP:
Server
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # IPv4 socket using UDP
server.bind(('127.0.0.1', 9939))  # Bind IP + port
data, client_addr = server.recvfrom(1024)  # Receive one datagram of at most 1024 bytes, plus the sender's address
print(data.decode('utf-8'))
server.sendto(data.upper(), client_addr)  # Send the data back in upper case
server.close()  # Close the socket (UDP has no connection to close)
Client
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # IPv4 socket using UDP
client.sendto('hello socket'.encode('utf-8'), ('127.0.0.1', 9939))  # Send "hello socket" to the server; no connect() needed
data, server_addr = client.recvfrom(1024)  # Receive the reply and the address it came from
print(data.decode('utf-8'))
client.close()  # Close the socket
Results
# Client
HELLO SOCKET

# Server
hello socket
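The "message-oriented" property can be observed directly. In the sketch below (a loopback-only experiment, with port 0 chosen so the OS assigns a free port), the client sends two datagrams back to back and the server reads them with two separate `recvfrom()` calls. Each call returns exactly one datagram; the two are never merged. A timeout is set as a precaution, because UDP itself does not guarantee delivery.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
        socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.settimeout(2)            # do not hang forever if a datagram is lost
    address = server.getsockname()

    # Two sends with no connect() and no accept() anywhere.
    client.sendto(b"hello", address)
    client.sendto(b"socket", address)

    # Each recvfrom() returns one whole datagram, never a merged blob.
    received = [server.recvfrom(1024)[0] for _ in range(2)]

print(received)
```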
3, Remote command execution over TCP
Server
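#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import subprocess
from socket import *

server = socket(AF_INET, SOCK_STREAM)
server.bind(('127.0.0.1', 9939))
server.listen(5)
while True:  # Outer loop: serve one client after another
    connect, client_addr = server.accept()
    while True:  # Inner loop: handle commands from the current client
        try:
            cmd = connect.recv(1024)
            if not cmd:  # Empty bytes mean the client closed the connection
                break
            # shell=True runs whatever the client sends; only use this on a trusted local machine
            obj = subprocess.Popen(cmd.decode('utf-8'), shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            out, err = obj.communicate()  # Read both pipes together; reading them one by one can deadlock
            connect.send(out)
            connect.send(err)
        except Exception:  # For example, the client disconnected abruptly
            break
    connect.close()  # Done with this client; go back to accept()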
Client
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
from socket import *

client = socket(AF_INET, SOCK_STREAM)
client.connect(('127.0.0.1', 9939))
while True:
    cmd = input('Please enter the command:').strip()
    if not cmd:  # Ignore empty input
        continue
    client.send(cmd.encode('utf-8'))
    cmd_res = client.recv(1024)  # Receive at most 1024 bytes of the result
    # Windows consoles use gbk by default here, so decode with gbk; on Linux decode with utf-8
    print(cmd_res.decode('gbk'))
Results
Please enter the command: dir
 Volume in drive D is Data
 Volume Serial Number is 72DE-32A1

 Directory of D:CodePyCodechaney02.python Network programming, based on TCP protocol remote execution commands

2021/02/12  17:28    <DIR>          .
2021/02/12  17:28    <DIR>          ..
2021/02/12  16:56               333 Client.py
2021/02/12  17:28               581 Server.py
               2 File(s)            914 bytes
               2 Dir(s)  127,736,856,576 bytes free
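One limitation of the client above is worth seeing in isolation: `recv(1024)` returns at most 1024 bytes, so a longer command result is only partly read and the remainder stays in the receive buffer until the next `recv()`. The sketch below imitates this with `socket.socketpair()` and a made-up 3000-byte "result"; it is a local experiment, not part of the original programs.

```python
import socket

sender, receiver = socket.socketpair()
try:
    output = b"x" * 3000                 # stand-in for a long command result
    sender.sendall(output)

    first = receiver.recv(1024)          # what a single recv(1024) gets
    rest = b""
    while len(first) + len(rest) < len(output):
        rest += receiver.recv(1024)      # leftovers that a later recv() would return
finally:
    sender.close()
    receiver.close()

print(len(first), len(rest))
```

In the remote command client, those leftover bytes would be printed as if they belonged to the next command.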
4, The TCP sticky packet problem
What is a sticky packet?
TCP is a stream protocol: it delivers bytes in order, but it keeps no record of where one send ended and the next began. To use the network more efficiently, the sender may also apply the Nagle algorithm, which merges several small writes made in quick succession into one larger block before sending it. The receiver therefore cannot tell where one message ends and the next begins, and several messages often arrive glued together. This is the "sticky packet" phenomenon. Because stream-oriented communication does not preserve message boundaries, the application has to provide its own unpacking mechanism.
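The effect is easy to reproduce. The sketch below uses `socket.socketpair()` as a stand-in for a TCP connection (on Unix-like systems it is a local stream socket rather than real TCP, but it shows the same lack of boundaries). Two separate sends are made, and the receiver simply reads until ten bytes have arrived.

```python
import socket

sender, receiver = socket.socketpair()
try:
    sender.sendall(b"hello")             # first "message"
    sender.sendall(b"world")             # second "message"

    chunks = []
    while sum(len(chunk) for chunk in chunks) < 10:
        chunks.append(receiver.recv(1024))
finally:
    sender.close()
    receiver.close()

stream = b"".join(chunks)
print(chunks)    # often a single chunk holding both messages; the split is not guaranteed
print(stream)    # b'helloworld'
```

Nothing in `stream` marks where "hello" stops and "world" starts, and how the bytes are divided among `recv()` calls depends on timing and buffering rather than on the sender's calls.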
How do we solve the sticky packet problem?
The root of the problem is that the receiver does not know where the boundaries in the data are, so it cannot split the byte stream back into valid messages. The solution is to tell the receiver how to find those boundaries. We can do this by adding a header to each message. The header describes the data that follows, for example its size, so the receiver can use the header to cut the received bytes into complete messages.
In effect, this means defining a custom protocol. Two points need attention. First, the part of the header that the receiver reads first must have a fixed length (for example, produced with struct.pack()), so that the receiver always knows how many bytes to read. Second, the descriptive header information should be serialized (with json, pickle, etc.) so that the receiver can deserialize it.
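A minimal sketch of such a protocol follows. The wire layout is an assumption chosen for illustration: a 4-byte length produced by `struct.pack()`, then a JSON header containing a `size` field, then the body. The helper names `recv_exact`, `send_msg`, and `recv_msg` are hypothetical and do not come from any library.

```python
import json
import socket
import struct

HEADER_LEN_FORMAT = "!I"   # 4-byte unsigned int in network byte order
HEADER_LEN_SIZE = struct.calcsize(HEADER_LEN_FORMAT)


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Keep calling recv() until exactly `size` bytes have arrived."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(sock: socket.socket, body: bytes) -> None:
    """Send: fixed-length header size, then JSON header, then body."""
    header = json.dumps({"size": len(body)}).encode("utf-8")
    sock.sendall(struct.pack(HEADER_LEN_FORMAT, len(header)) + header + body)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive exactly one message framed by send_msg()."""
    (header_len,) = struct.unpack(HEADER_LEN_FORMAT, recv_exact(sock, HEADER_LEN_SIZE))
    header = json.loads(recv_exact(sock, header_len).decode("utf-8"))
    return recv_exact(sock, header["size"])


left, right = socket.socketpair()
try:
    send_msg(left, b"hello")
    send_msg(left, b"x" * 3000)   # larger than a single recv(1024)
    first = recv_msg(right)
    second = recv_msg(right)
finally:
    left.close()
    right.close()

print(first, len(second))  # b'hello' 3000
```

Because the receiver always knows how many bytes to expect next, it no longer matters how the stream is split or merged in transit. The JSON header can carry extra fields later (a file name or checksum, for instance) without changing the fixed 4-byte prefix.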