Telethon code analysis and TL implementation (1)
Foreword
I tried the official TDLib package before. It can be made to work, but it is still not as easy to use as Telethon, and because Telethon is written in Python it is very easy to modify.
Open source code address: https://github.com/LonamiWebs/Telethon/
Usage:

    from telethon import TelegramClient, events, sync
    import socks

    # These example values won't work. You must get your own api_id and
    # api_hash from https://my.telegram.org, under API Development.
    #api_id = 12345
    #api_hash = '0123456789abcdef0123456789abcdef'
    api_id = 94575  # Parameters copied from tdlib
    api_hash = "a3406de8d171bb422bb6ddf3bbd800e2"
    proxy1 = ("socks5", '127.0.0.1', 1081)
    client = TelegramClient('session_name', api_id, api_hash, proxy=proxy1)
    client.start()

If it runs until it prompts for a mobile phone number, the connection works.
Detailed documentation on TL language: https://core.telegram.org/mtproto/TL
Like protobuf, TL defines both the data format and the RPC calling convention,
but its grammar and ideas are very different.
- Constructor: defines a data type and how it is serialized and deserialized;
- Method: describes an RPC call: its parameters, how they are serialized, and its return type, which corresponds to a constructor;
Constructors and methods are defined in a similar way. Hashing the definition with CRC32 yields a 32-bit integer that uniquely identifies the method or constructor.
Therefore, the first thing a client in any language has to do is read the TL rules, write a TL compiler, and generate the TL part of the code (RPC plus data encoding and decoding) from the currently published TL schema. In Telethon the schema files are:
telethon_generator\data\api.tl
telethon_generator\data\mtproto.tl
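To make the CRC32 step concrete, here is a minimal sketch of how an identifier could be derived from a definition line. The normalization rules (dropping the explicit `#id`, braces and angle brackets) are my assumption about what gets hashed, and `normalize_tl_line` and `constructor_id` are hypothetical helper names, not part of Telethon.

```python
import re
import zlib


def normalize_tl_line(line: str) -> str:
    """Reduce a TL definition to the text that is hashed (assumed rules)."""
    line = line.strip().rstrip(";")
    # Drop an explicit "#xxxxxxxx" identifier if the line already carries one.
    line = re.sub(r"#[0-9a-fA-F]+(?=\s)", "", line, count=1)
    line = line.replace("{", "").replace("}", "")    # {t:Type} -> t:Type
    line = line.replace("<", " ").replace(">", "")   # Vector<long> -> Vector long
    return re.sub(r"\s+", " ", line)


def constructor_id(line: str) -> int:
    """CRC32 of the normalized definition as an unsigned 32-bit integer."""
    return zlib.crc32(normalize_tl_line(line).encode("ascii")) & 0xFFFFFFFF


for definition in ("boolFalse = Bool;", "req_pq#60469778 nonce:int128 = ResPQ;"):
    print(normalize_tl_line(definition), hex(constructor_id(definition)))
```

If the normalization assumption holds, the printed values should agree with the identifiers quoted in this article; any mismatch means the hashed text differs from what this sketch produces.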
1. TL object and deserialization analysis
Under the protocol described by TL, every message is a TLObject and every request is a TLRequest. First, let's look at how the related data classes are encapsulated and deserialized.
The GitHub repository includes a code generator that produces the TL-related files from the schema. Writing them by hand would be exhausting, and the official schema keeps being updated.
Like protobuf, TL covers object serialization, deserialization and the RPC process. First, we need a common file that defines the most basic rules:
See code: telethon\extensions\binaryreader.py
This file is written by hand, not generated, and contains the most basic serialization and deserialization primitives.
The tgread_object method is where all the work starts, so read it first:
1.1 Methods of reading objects from binary

    def tgread_object(self):
        """Reads a Telegram object."""
        # 4-byte little-endian int as constructor number
        constructor_id = self.read_int(signed=False)
        # The telethon\tl\alltlobjects.py file defines all object constructors
        # and methods, encapsulated as a dictionary.
        # Search if this constructor currently exists
        clazz = tlobjects.get(constructor_id, None)
        if clazz is None:
            # If you can't find the corresponding number, try parsing some of
            # the most basic types.
            # http://crc32.bchrt.com/ Calculation tool
            value = constructor_id
            if value == 0x997275b5:  # boolTrue
                return True
            elif value == 0xbc799737:  # boolFalse, crc32('boolFalse = Bool')
                return False
            # crc32("vector t:Type # [ t ] = Vector t") = 0x1cb5c415
            elif value == 0x1cb5c415:  # Vector
                return [self.tgread_object() for _ in range(self.read_int())]

            clazz = core_objects.get(constructor_id, None)
            if clazz is None:
                # If you can't find it, go back 4 bytes!
                self.seek(-4)  # Go back
                pos = self.tell_position()
                error = TypeNotFoundError(constructor_id, self.read())
                self.set_position(pos)
                raise error

        # Each type implements a classmethod that constructs itself from binary
        return clazz.from_reader(self)

Remark: 0x1cb5c415 is the vector type and needs special handling.
The code above first reads a 4-byte integer from the byte stream, looks up the class in the dictionary (generated by the code generator), and uses that class to read the data that follows.
Several built-in types are handled separately because:
- a Bool value has no body, only the constructor identifier;
- a vector requires reading the element count and then looping to read N elements into a list, which is returned.
Finally, if the identifier cannot be matched, it is an error. The most common cause is that the local TL schema is not compatible with the one used by the other party.
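The dispatch logic described above can be tried in isolation. The following is a minimal, self-contained sketch rather than Telethon's code: `Pair`, its identifier 0x404 and `REGISTRY` are hypothetical stand-ins for a generated class and the generated dictionary.

```python
import io
import struct

BOOL_TRUE, BOOL_FALSE, VECTOR = 0x997275B5, 0xBC799737, 0x1CB5C415


class Pair:
    """Hypothetical boxed type holding two bare ints (ID 0x404 is made up)."""
    CONSTRUCTOR_ID = 0x00000404

    def __init__(self, a, b):
        self.a, self.b = a, b

    @classmethod
    def from_reader(cls, stream):
        return cls(*struct.unpack("<ii", stream.read(8)))


REGISTRY = {Pair.CONSTRUCTOR_ID: Pair}  # stands in for the generated dictionary


def read_object(stream):
    """Read one boxed object: a 4-byte little-endian ID, then its body."""
    (cid,) = struct.unpack("<I", stream.read(4))
    if cid == BOOL_TRUE:
        return True
    if cid == BOOL_FALSE:
        return False
    if cid == VECTOR:
        (count,) = struct.unpack("<i", stream.read(4))
        return [read_object(stream) for _ in range(count)]
    cls = REGISTRY.get(cid)
    if cls is None:
        stream.seek(-4, io.SEEK_CUR)  # rewind so the caller can inspect the bytes
        raise ValueError(f"unknown constructor 0x{cid:08x}")
    return cls.from_reader(stream)


# A vector with two boxed elements: boolTrue and Pair(3, 4).
data = (struct.pack("<Ii", VECTOR, 2)
        + struct.pack("<I", BOOL_TRUE)
        + struct.pack("<Iii", Pair.CONSTRUCTOR_ID, 3, 4))
result = read_object(io.BytesIO(data))
```

Here `result` is a two-element list. An unknown identifier raises an error after rewinding the stream, which mirrors the "go back 4 bytes" step in the real method.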
1.2 Example of deserialization
This example is one RPC call from the key negotiation process. The class code shown here is generated automatically, so it also shows what the generator has to produce:
The client calls the method:
req_pq#60469778 nonce:int128 = ResPQ;
The server uses the constructor to construct the return data:
resPQ#05162463 nonce:int128 server_nonce:int128 pq:string server_public_key_fingerprints:Vector<long> = ResPQ;
First look at the method. Searching for 0x60469778 finds:
0x60469778: functions.ReqPqRequest,
The function is defined as:

    class ReqPqRequest(TLRequest):
        CONSTRUCTOR_ID = 0x60469778
        SUBCLASS_OF_ID = 0x786986b8

        # Note: a Python int has arbitrary precision, so nonce is a plain int
        # here even though it holds 128 bits rather than 4 bytes
        def __init__(self, nonce: int):
            """
            :returns ResPQ: Instance of ResPQ.
            """
            self.nonce = nonce

        # Turn itself into a dictionary
        def to_dict(self):
            return {
                '_': 'ReqPqRequest',
                'nonce': self.nonce
            }

        # Convert itself into a binary byte stream: first the 4-byte function
        # ID, then the parameter list, 20 bytes in total
        def _bytes(self):
            return b''.join((
                b'x\x97F`',  # 4 bytes 'x', 0x97, 'F', '`', i.e. 0x78, 0x97, 0x46, 0x60
                self.nonce.to_bytes(16, 'little', signed=True),  # 128-bit little-endian integer
            ))

        # Deserialization is very simple: read 16 bytes as a little-endian integer
        @classmethod
        def from_reader(cls, reader):
            _nonce = reader.read_large_int(bits=128)
            return cls(nonce=_nonce)

The function to read 128 bits here is in binaryreader.py:

    def read_large_int(self, bits, signed=True):
        """Reads a n-bits long integer value."""
        return int.from_bytes(
            self.read(bits // 8), byteorder='little', signed=signed)

Next, let's see how the data returned by this function is constructed:
telethon\tl\types\__init__.py

    class ResPQ(TLObject):
        CONSTRUCTOR_ID = 0x5162463
        SUBCLASS_OF_ID = 0x786986b8

        def __init__(self, nonce: int, server_nonce: int, pq: bytes,
                     server_public_key_fingerprints: List[int]):
            """
            Constructor for ResPQ: Instance of ResPQ.
            """
            self.nonce = nonce
            self.server_nonce = server_nonce
            self.pq = pq
            self.server_public_key_fingerprints = server_public_key_fingerprints

        def to_dict(self):
            return {
                '_': 'ResPQ',
                'nonce': self.nonce,
                'server_nonce': self.server_nonce,
                'pq': self.pq,
                'server_public_key_fingerprints':
                    [] if self.server_public_key_fingerprints is None
                    else self.server_public_key_fingerprints[:]
            }

        def _bytes(self):
            return b''.join((
                b'c$\x16\x05',  # The little-endian representation of the type 0x5162463
                self.nonce.to_bytes(16, 'little', signed=True),  # 16 bytes
                self.server_nonce.to_bytes(16, 'little', signed=True),  # 16 bytes
                self.serialize_bytes(self.pq),  # The serialize string method of the parent class
                b'\x15\xc4\xb5\x1c',  # 0x1cb5c415 is the vector type
                # followed by the number of elements as a 4-byte little-endian int
                struct.pack('<i', len(self.server_public_key_fingerprints)),
                # Serialize the 8-byte long elements one by one
                b''.join(struct.pack('<q', x) for x in self.server_public_key_fingerprints),
            ))

        @classmethod
        def from_reader(cls, reader):
            _nonce = reader.read_large_int(bits=128)
            _server_nonce = reader.read_large_int(bits=128)
            _pq = reader.tgread_bytes()
            reader.read_int()  # skip the vector identifier
            _server_public_key_fingerprints = []
            for _ in range(reader.read_int()):
                _x = reader.read_long()
                _server_public_key_fingerprints.append(_x)
            return cls(nonce=_nonce, server_nonce=_server_nonce, pq=_pq,
                       server_public_key_fingerprints=_server_public_key_fingerprints)

Note: For how to use struct to read and write binary data, see: https://blog.csdn.net/qq_30638831/article/details/80421019?spm=1001.2014.3001.5506
Remark: Every class derives from TLObject, which encapsulates the most basic helpers, such as how to serialize strings. (telethon\tl\tlobject.py is also a hand-written base class.)
Summary: each class has:
- a class method (from_reader) that acts as a factory function;
- a constructor that takes the fields as input parameters;
- serialization (_bytes);
- conversion to a dictionary (to_dict).
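The four pieces listed above fit in a few lines when written by hand. This is an illustrative stand-in for the generated request class, not the generated code itself; `ReqPqSketch` and `from_bytes` are hypothetical names.

```python
import os


class ReqPqSketch:
    """Hand-written stand-in for the generated ReqPqRequest (illustrative)."""
    CONSTRUCTOR_ID = 0x60469778

    def __init__(self, nonce: int):          # constructor takes the fields
        self.nonce = nonce

    def to_dict(self):                       # conversion to a dictionary
        return {"_": "ReqPqSketch", "nonce": self.nonce}

    def __bytes__(self):                     # serialization: ID, then fields
        return (self.CONSTRUCTOR_ID.to_bytes(4, "little")
                + self.nonce.to_bytes(16, "little", signed=True))

    @classmethod
    def from_bytes(cls, data: bytes):        # factory: rebuild from binary
        if int.from_bytes(data[:4], "little") != cls.CONSTRUCTOR_ID:
            raise ValueError("unexpected constructor ID")
        return cls(int.from_bytes(data[4:20], "little", signed=True))


nonce = int.from_bytes(os.urandom(16), "big", signed=True)
wire = bytes(ReqPqSketch(nonce))
```

The serialized form is 20 bytes long, 4 for the identifier and 16 for the nonce, and `ReqPqSketch.from_bytes(wire).nonce` gives back the original value.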
1.3 Reading and writing of basic types
1.3.1 Serialization of Basic Types and Composite Types
1.3.1.1 Packaging
The official document https://core.telegram.org/mtproto/serialize divides data into two categories, bare types and boxed types:
- A boxed type name starts with an uppercase letter. It is serialized as the type identifier followed by the data.
- A bare type name starts with a lowercase letter. It is serialized without a type identifier.
- %X denotes the bare type that corresponds to the boxed type X.
For large arrays, boxing every element would add an identifier to each one, wasting storage space and bandwidth, so it is more reasonable to use the corresponding bare type.
For example:
int_couple int int = IntCouple
int_couple is equivalent to %int_couple and %IntCouple.
Take the pair of integers 3, 4. In boxed form, if the identifier of int_couple were 404, the serialized value would be:
404 3 4
404 is not a real identifier; the official documentation uses it only as an example. Real identifiers are calculated with CRC32.
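A small sketch makes the size difference between the two forms visible. It reuses the placeholder identifier 404 from the example above, which is not a real CRC32 value; the function names are hypothetical.

```python
import struct

INT_COUPLE_ID = 404  # placeholder from the text, not a real CRC32 value


def int_couple_bare(a: int, b: int) -> bytes:
    """Bare form: just the two 4-byte little-endian ints."""
    return struct.pack("<ii", a, b)


def int_couple_boxed(a: int, b: int) -> bytes:
    """Boxed form: the 4-byte identifier, then the bare body."""
    return struct.pack("<I", INT_COUPLE_ID) + int_couple_bare(a, b)


pairs = [(3, 4), (5, 6), (7, 8)]
boxed = b"".join(int_couple_boxed(a, b) for a, b in pairs)
bare = b"".join(int_couple_bare(a, b) for a, b in pairs)
print(len(boxed), len(bare))
```

Each boxed element costs 4 extra bytes, so for three pairs the boxed form takes 36 bytes and the bare form 24.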
1.3.1.2 Basic types
Each basic type has both a bare form and a boxed form:
(int, long, double, string) corresponds to (Int, Long, Double, String)
int ? = Int; long ? = Long; double ? = Double; string ? = String;
- int: little-endian, 4 bytes;
- long: little-endian, 8 bytes;
- double: little-endian, 8 bytes;
- string: see section 1.3.3; it is serialized the same way as bytes.
When any of these 4 types is used in its boxed form, an identifier must be added.
The identifier is calculated using CRC32:
CRC32("int ? = Int")
1.3.1.3 Composite types
The official documentation recommends naming the fields when defining types such as User and Group.
Without field names, the meaning of each field cannot be recognized:
user int string string = User; group int string string = Group;
Therefore, the following form is recommended:
user id:int first_name:string last_name:string = User; group id:int title:string description:string = Group;
Extending user with new fields requires defining a new constructor, but the resulting type name (User) does not change. Serialization and deserialization tell the variants apart by their identifiers:
userv2 id:int unread_messages:int first_name:string last_name:string in_groups:vector int = User;
1.3.2 Serialization of vector types
Vector can be regarded either as a built-in type or as a composite type Vector:
vector {t:Type} # [ t ] = Vector t;
It resembles a template container, but the constructor always uses the same identifier:
const 0x1cb5c415 = crc32("vector t:Type # [ t ] = Vector t")
The order of serialization is:
- 4 bytes: 0x1cb5c415, the vector identifier, which is the same whatever the element type;
- 4 bytes: the number of elements, as a little-endian int;
- N elements, each serialized according to its type (bare elements carry no identifier).
When deserializing, the element type is known from the definition of the enclosing type, so it does not need to be stored.
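As a concrete case, a vector of bare `long` values (the layout used for the fingerprint list in ResPQ) can be written and read with `struct` alone. This is a minimal sketch; the two function names are hypothetical.

```python
import struct

VECTOR_ID = 0x1CB5C415


def serialize_long_vector(values) -> bytes:
    """Vector identifier, element count, then each element as a bare long."""
    header = struct.pack("<Ii", VECTOR_ID, len(values))
    return header + b"".join(struct.pack("<q", v) for v in values)


def deserialize_long_vector(data: bytes):
    """Inverse of serialize_long_vector; the element type is known in advance."""
    cid, count = struct.unpack_from("<Ii", data, 0)
    if cid != VECTOR_ID:
        raise ValueError(f"not a vector: 0x{cid:08x}")
    return list(struct.unpack_from(f"<{count}q", data, 8))


wire = serialize_long_vector([1, -2, 2**62])
values = deserialize_long_vector(wire)
```

The element type never appears on the wire: the reader has to know that each element is an 8-byte `long`, exactly as the generated from_reader of ResPQ does.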
Related to this are IntHash and StrHash, which represent hash types as arrays of key-value pairs:

    coupleInt {t:Type} int t = CoupleInt t;
    intHash {t:Type} (vector %(CoupleInt t)) = IntHash t;
    coupleStr {t:Type} string t = CoupleStr t;
    strHash {t:Type} (vector %(CoupleStr t)) = StrHash t;

In C++ terms this is roughly:

    template <typename T> using CoupleInt = std::pair<int, T>;
    template <typename T> using IntHash = std::vector<CoupleInt<T>>;

The percent sign % indicates that each element is stored in the array without a constructor identifier.
1.3.3 Serialization of string (bytes)
https://core.telegram.org/mtproto/serialize
- Length less than 254: 1 byte holds the length, followed by the N data bytes; the total is then padded to a multiple of 4 bytes.
- Length greater than or equal to 254: the first byte is 254, followed by the length as a 3-byte little-endian int, followed by the N data bytes; the total is then padded to a multiple of 4 bytes.
Padding length:
- Length less than 254: (4 - (len(data) + 1) % 4) % 4
- Length greater than or equal to 254: (4 - len(data) % 4) % 4
The outer % 4 makes the padding 0 rather than 4 when the length is already aligned.
The code is as follows:

    @staticmethod
    def serialize_bytes(data):
        """Write bytes by using Telegram guidelines"""
        if not isinstance(data, bytes):
            if isinstance(data, str):
                data = data.encode('utf-8')
            else:
                raise TypeError(
                    'bytes or str expected, not {}'.format(type(data)))

        r = []
        if len(data) < 254:
            padding = (len(data) + 1) % 4
            if padding != 0:
                padding = 4 - padding

            r.append(bytes([len(data)]))
            r.append(data)
        else:
            padding = len(data) % 4
            if padding != 0:
                padding = 4 - padding

            r.append(bytes([
                254,
                len(data) % 256,
                (len(data) >> 8) % 256,
                (len(data) >> 16) % 256
            ]))
            r.append(data)

        r.append(bytes(padding))
        return b''.join(r)

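The reading side is the mirror image of the function above. The sketch below is a standalone round trip written for illustration; `tl_padding`, `pack_bytes` and `unpack_bytes` are hypothetical names, and the reader returns the next offset so that several fields can be parsed in sequence.

```python
def tl_padding(length: int) -> int:
    """Zero bytes needed so that header + data ends on a 4-byte boundary."""
    header = 1 if length < 254 else 4
    return -(header + length) % 4


def pack_bytes(data: bytes) -> bytes:
    """Short form: 1 length byte. Long form: 0xFE plus a 3-byte length."""
    if len(data) < 254:
        header = bytes([len(data)])
    else:
        header = b"\xfe" + len(data).to_bytes(3, "little")
    return header + data + bytes(tl_padding(len(data)))


def unpack_bytes(buf: bytes, offset: int = 0):
    """Return (payload, next_offset), skipping the alignment padding."""
    first = buf[offset]
    if first < 254:
        length, start = first, offset + 1
    else:
        length = int.from_bytes(buf[offset + 1:offset + 4], "little")
        start = offset + 4
    end = start + length
    return buf[start:end], end + tl_padding(length)


payload, next_offset = unpack_bytes(pack_bytes(b"hello"))
```

For `b"hello"` the header and data take 6 bytes, so 2 padding bytes follow and `next_offset` is 8.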
1.4 Built-in types
The official documentation lists the built-in basic types: https://core.telegram.org/mtproto/TL-tl

    /////
    //
    // Common Types (source file common.tl, only necessary definitions included)
    //
    /////

    // Built-in types
    int ? = Int;
    long ? = Long;
    double ? = Double;
    string ? = String;

    // Boolean emulation
    boolFalse = Bool;
    boolTrue = Bool;

    // Vector
    vector {t:Type} # [t] = Vector t;
    tuple {t:Type} {n:#} [t] = Tuple t n;
    vectorTotal {t:Type} total_count:int vector:%(Vector t) = VectorTotal t;

    Empty False;
    true = True;

"Built-in" here means that the logic for these types has to be implemented by hand; the code produced by the generator then builds everything else by calling these basic functions.
The generated alltlobjects.py contains about 1500 identifiers, each mapped to one class.
tlobject.py implements the most basic functions, but it defines only two abstract classes, TLObject and TLRequest.
The factory method that reads the data has to be implemented by each generated class:

    @classmethod
    def from_reader(cls, reader):

1.5 Summary
At this point, we have a clear picture of the basic calling logic:
- The business layer receives the data stream;
- A BinaryReader(data) is constructed from the data stream;
- reader.tgread_object() is the entry function for deserialization;
- It finds the matching class by the identifier it reads and calls that class's factory function to build the object (other BinaryReader methods are used along the way).
2. Tracking the implementation of the login verification process algorithm
TelegramClient is the class the library exposes directly to users. It inherits from many parent classes:
- TelegramBaseClient
- AuthMethods
- AccountMethods
- DownloadMethods
- DialogMethods
- ChatMethods
- BotMethods
- MessageMethods
- UploadMethods
- ButtonMethods
- UpdateMethods
- MessageParseMethods
- UserMethods
At this stage, TelegramBaseClient and AuthMethods are the classes that actually establish the connection with the server and perform the key exchange.
The relevant classes are described in the following table; see also the official protocol documentation: https://core.telegram.org/mtproto/description
Class | Description | File |
---|---|---|
AuthKey | Encapsulates the basic key calculation and management | telethon\crypto\authkey.py |
MTProtoState | Implements data encryption and decryption, including the calculation of msg_id and seq_no | telethon\network\mtprotostate.py |
do_authentication function | Implements the state machine for the authentication process | telethon\network\authenticator.py |
MTProtoSender | Manages the underlying connection; implements the core key exchange flow and the state machine that interacts with the server; runs the send loop and the receive loop; dispatches events for received messages | telethon\network\mtprotosender.py |
MTProtoPlainSender | Used to send plaintext messages before the keys are exchanged | telethon\network\mtprotoplainsender.py |
PacketCodec | Defines the interface for encoding and decoding; an abstract class that implements nothing | telethon\network\connection\connection.py |
Connection | Base class that wraps asyncio.open_connection; a subclass only needs to override the class attribute packet_codec. It provides the basic TCP connection and the send and receive loops; the upper layer only needs to call connect, send and recv | telethon\network\connection\connection.py |
FullPacketCodec | Encodes and decodes according to https://core.telegram.org/mtproto#tcp-transport. Sending: 4-byte total length, 4-byte send counter, data, 4-byte checksum, where total length = 12 + data length. Decoding uses the same format and verifies the checksum | telethon\network\connection\tcpfull.py |
ConnectionTcpFull | Inherits from Connection and only sets FullPacketCodec as the codec | telethon\network\connection\tcpfull.py |
HttpPacketCodec | Sends the data over HTTP and reads the data part out of the HTTP response | telethon\network\connection\http.py |
ConnectionHttp | Inherits from Connection and uses HttpPacketCodec as the codec | telethon\network\connection\http.py |
Reference: "python abstract class abc module" https://zhuanlan.zhihu.com/p/508700685
"Python asyncio asynchronous programming" https://www.jianshu.com/p/7fd361cde22c
https://www.jianshu.com/p/eed5da9965f2
MTProtoSender is the core engine; _connect(self) is the entry function once work begins. The process is as follows:
1) Call self._try_connect to establish the underlying TCP connection (possibly wrapped in some other protocol);
2) If the connection succeeds, exchange keys with self._try_gen_auth_key;
3) If connecting or exchanging keys still fails after self._retries attempts, an error is raised, usually a connection error;
4) Once the logical connection is established, start two loops, self._send_loop() and self._recv_loop(), which run as asyncio tasks;
5) The connection is now fully established.
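The five steps above can be sketched as a small asyncio program. This shows the control flow only and is not Telethon's implementation: every name in `SenderSketch` is hypothetical, the connection attempt is simulated, and the loops just sleep.

```python
import asyncio


class SenderSketch:
    """Illustrative control flow only; all names here are hypothetical."""

    def __init__(self, retries=3, fail_first=1):
        self.retries = retries
        self._failures_left = fail_first  # simulate transient connect errors
        self.auth_key = None
        self.tasks = []

    async def _try_connect(self):
        if self._failures_left > 0:
            self._failures_left -= 1
            return False
        return True

    async def _try_gen_auth_key(self):
        self.auth_key = b"\x00" * 256  # placeholder for the negotiated key
        return True

    async def _loop(self, name):
        await asyncio.sleep(3600)  # stands in for the real send/receive loop

    async def connect(self):
        connected = False
        for _ in range(self.retries):
            if not connected:
                connected = await self._try_connect()
                if not connected:
                    continue
            if self.auth_key is None and not await self._try_gen_auth_key():
                continue
            break
        else:  # the loop ended without reaching "break"
            raise ConnectionError(f"failed after {self.retries} attempts")
        self.tasks = [asyncio.create_task(self._loop(n)) for n in ("send", "recv")]

    async def disconnect(self):
        for task in self.tasks:
            task.cancel()
        await asyncio.gather(*self.tasks, return_exceptions=True)


async def demo(retries, fail_first):
    sender = SenderSketch(retries, fail_first)
    await sender.connect()
    started = len(sender.tasks)
    await sender.disconnect()
    return started


print(asyncio.run(demo(retries=3, fail_first=1)))
```

With three retries and one simulated failure the second attempt succeeds and two tasks are started; with a single retry the same failure surfaces as a ConnectionError.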
2.1 TCP connection _try_connect
At the beginning of the experiment we call:

    client = TelegramClient('session_name', api_id, api_hash, proxy=proxy1)
    client.start()

The call stack looks like this:
- AuthMethods.start()
- AuthMethods._start()
- TelegramBaseClient.connect(): the constructor defaults to ConnectionTcpFull as the underlying connection class; connect() constructs one and calls self._sender.connect()
- MTProtoSender.connect(): this in turn calls _connect(), whose logic was analyzed in the previous part.
Note: The telethon\network\connection directory defines the low-level TCP-related classes.
As described in the table above, the library supports 2 connection methods. The TCP mode, which is also the default, is discussed here.
**The packet format is:** https://core.telegram.org/mtproto/mtproto-transports#full
4B (length) + 4B (sequence number) + N bytes (data) + 4B (CRC32)

    +----+----+----...----+----+
    |len.|seq.|  payload  |crc.|
    +----+----+----...----+----+

Specific code reference: telethon\network\connection\tcpfull.py
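The full packet format is simple enough to sketch with `struct` and `zlib`. This is a standalone illustration of the layout described above, not a copy of tcpfull.py; the function names are hypothetical.

```python
import struct
import zlib


def encode_full_packet(payload: bytes, seq: int) -> bytes:
    """length (4B) + sequence number (4B) + payload + CRC32 (4B)."""
    length = len(payload) + 12  # the length field counts the whole packet
    head = struct.pack("<ii", length, seq) + payload
    return head + struct.pack("<I", zlib.crc32(head))


def decode_full_packet(packet: bytes):
    """Validate the length and checksum, then return (seq, payload)."""
    length, seq = struct.unpack_from("<ii", packet, 0)
    if length != len(packet):
        raise ValueError("length field does not match packet size")
    (crc,) = struct.unpack("<I", packet[-4:])
    if crc != zlib.crc32(packet[:-4]):
        raise ValueError("CRC32 mismatch")
    return seq, packet[8:-4]


packet = encode_full_packet(b"hello", seq=0)
seq, payload = decode_full_packet(packet)
```

The checksum covers the length, the sequence number and the payload, so a change to any of them is detected on decode.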
The Connection class encapsulates the TCP connection itself.
At this point the TCP connection is established and the packing and unpacking of TCP data packets is in place, so the upper-layer logic can start exchanging messages.
Note: There are 4 packet encapsulation formats; due to space limitations, they are not discussed further here.
2.2 Key exchange _try_gen_auth_key
After the TCP connection from the previous section is complete, the self._try_gen_auth_key function performs the key exchange:
- First create an MTProtoPlainSender for sending plaintext, passing it the connection established earlier;
- Call authenticator.do_authentication to run the state machine; the key exchange is completed inside this function, and on success it returns an authorization key and a time offset.
A previous post discussed the key exchange process itself; here we compare it with the implementation.
Step 1: Send a 16-byte random number and receive the server response, which contains (pq, server_nonce, public key fingerprints).
The pair (nonce, server_nonce) is used later as a temporary session ID.

    # Step 1 sending: PQ Request, endianness doesn't matter since it's random
    nonce = int.from_bytes(os.urandom(16), 'big', signed=True)
    # ReqPqMultiRequest builds the request data; sending it is the remote
    # RPC call, and the reply is data built by the resPQ constructor.
    # The design here is really neat
    res_pq = await sender.send(ReqPqMultiRequest(nonce))
    assert isinstance(res_pq, ResPQ), 'Step 1 answer was %s' % res_pq

    if res_pq.nonce != nonce:
        raise SecurityError('Step 1 invalid nonce from server')

    # Parse the bytes as a big-endian large integer p*q
    pq = get_int(res_pq.pq)

Note that ReqPqMultiRequest is called here, not the ReqPqRequest shown earlier. This is due to a protocol version change: the example in the official documentation still uses req_pq, but the current protocol version is 2.0, which uses req_pq_multi.
Step 2: Start the DH key exchange: generate a new random number and send it encrypted.

    # Factorize the product of two large primes to get p and q
    p, q = Factorization.factorize(pq)
    p, q = rsa.get_byte_array(p), rsa.get_byte_array(q)
    # To transmit encrypted information later, create a new random number new_nonce
    new_nonce = int.from_bytes(os.urandom(32), 'little', signed=True)

    # Construct the new data to send
    pq_inner_data = bytes(PQInnerData(
        pq=rsa.get_byte_array(pq), p=p, q=q,
        nonce=res_pq.nonce,
        server_nonce=res_pq.server_nonce,
        new_nonce=new_nonce
    ))

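The factorization step is the only computationally heavy part of the handshake on the client side. As an illustration, here is a minimal Pollard rho sketch; it is a swapped-in textbook method under my own naming, and I am not claiming it is what Factorization.factorize does internally. It assumes the input is composite.

```python
from math import gcd


def pollard_rho(n: int) -> int:
    """Return a non-trivial factor of a composite n (loops forever on primes)."""
    if n % 2 == 0:
        return 2
    c = 1
    while True:
        x = y = 2
        d = 1
        while d == 1:
            x = (x * x + c) % n          # tortoise: one step
            y = (y * y + c) % n          # hare: two steps
            y = (y * y + c) % n
            d = gcd(abs(x - y), n)
        if d != n:
            return d
        c += 1                           # cycle gave nothing, try another polynomial


def factorize(pq: int):
    """Split pq into (p, q) with p <= q."""
    p = pollard_rho(pq)
    q = pq // p
    return (p, q) if p <= q else (q, p)


pq = 1229739323 * 1402015859  # a 61-bit product, similar in size to a real pq
p, q = factorize(pq)
```

For a product of two primes of roughly 30 bits each, this finishes in well under a second in pure Python, which is why the server can afford to ask every client to do it.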
Encrypt pq_inner_data:
The rsa.py file defines the public keys currently used by the server; the right one is found by matching the fingerprints returned by the server.

    # sha_digest + data + random_bytes
    cipher_text, target_fingerprint = None, None
    # From the public key fingerprints returned by the server, take the
    # first one we know and encrypt pq_inner_data with it
    for fingerprint in res_pq.server_public_key_fingerprints:
        cipher_text = rsa.encrypt(fingerprint, pq_inner_data)
        if cipher_text is not None:
            target_fingerprint = fingerprint
            break

    # This section is for compatibility with the old server keys and can be ignored
    if cipher_text is None:
        # Second attempt, but now we're allowed to use old keys
        for fingerprint in res_pq.server_public_key_fingerprints:
            cipher_text = rsa.encrypt(fingerprint, pq_inner_data, use_old=True)
            if cipher_text is not None:
                target_fingerprint = fingerprint
                break

    if cipher_text is None:
        raise SecurityError(
            'Step 2 could not find a valid key for fingerprints: {}'
            .format(', '.join(
                [str(f) for f in res_pq.server_public_key_fingerprints])
            )
        )

    # The first 2 fields of the request are the random numbers exchanged before
    server_dh_params = await sender.send(ReqDHParamsRequest(
        nonce=res_pq.nonce,
        server_nonce=res_pq.server_nonce,
        p=p, q=q,
        public_key_fingerprint=target_fingerprint,
        encrypted_data=cipher_text
    ))

Remarks: The rsa.encrypt() function performs the RSA_PAD process. This algorithm is more complicated and will be discussed separately later.
Check that the server response is valid: the nonces it contains must match the earlier ones, and on failure the hash of the new nonce must match the one we sent, which guards against man-in-the-middle attacks:

    assert isinstance(
        server_dh_params, (ServerDHParamsOk, ServerDHParamsFail)),\
        'Step 2.1 answer was %s' % server_dh_params

    if server_dh_params.nonce != res_pq.nonce:
        raise SecurityError('Step 2 invalid nonce from server')

    if server_dh_params.server_nonce != res_pq.server_nonce:
        raise SecurityError('Step 2 invalid server nonce from server')

    if isinstance(server_dh_params, ServerDHParamsFail):
        nnh = int.from_bytes(
            sha1(new_nonce.to_bytes(32, 'little', signed=True)).digest()[4:20],
            'little', signed=True
        )
        if server_dh_params.new_nonce_hash != nnh:
            raise SecurityError('Step 2 invalid DH fail nonce from server')

    assert isinstance(server_dh_params, ServerDHParamsOk),\
        'Step 2.2 answer was %s' % server_dh_params

Step 3: Compute our own key, check it against the server for consistency, and complete the exchange; the data we send is encrypted with AES-256 in IGE mode.
At this point the server response has arrived, but it is encrypted by the server, so the ciphertext has to be decrypted first.

    struct Server_DH_inner_data
    {
        int128 nonce;
        int128 server_nonce;
        int    g;
        string dh_prime;
        string g_a;          // pow(g, a) mod dh_prime; a itself must be kept secret
        int    server_time;
    }
    // https://blog.csdn.net/robinfoxnan/article/details/127322483


    # Step 3 sending: Complete DH Exchange
    # First calculate the encryption key and initial vector
    key, iv = helpers.generate_key_data_from_nonce(
        res_pq.server_nonce, new_nonce
    )
    if len(server_dh_params.encrypted_answer) % 16 != 0:
        # See PR#453
        raise SecurityError('Step 3 AES block size mismatch')

    # Unwrap the answer
    plain_text_answer = AES.decrypt_ige(
        server_dh_params.encrypted_answer, key, iv
    )

    # The first 20 bytes are the checksum, followed by the structure of the server response
    with BinaryReader(plain_text_answer) as reader:
        reader.read(20)  # hash sum
        server_dh_inner = reader.tgread_object()
        assert isinstance(server_dh_inner, ServerDHInnerData),\
            'Step 3 answer was %s' % server_dh_inner

    if server_dh_inner.nonce != res_pq.nonce:
        raise SecurityError('Step 3 Invalid nonce in encrypted answer')

    if server_dh_inner.server_nonce != res_pq.server_nonce:
        raise SecurityError('Step 3 Invalid server nonce in encrypted answer')

    # Here are the core parameters of the key exchange
    dh_prime = get_int(server_dh_inner.dh_prime, signed=False)
    g = server_dh_inner.g
    g_a = get_int(server_dh_inner.g_a, signed=False)
    time_offset = server_dh_inner.server_time - int(time.time())

    b = get_int(os.urandom(256), signed=False)
    g_b = pow(g, b, dh_prime)
    gab = pow(g_a, b, dh_prime)

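The temporary AES key and IV used to decrypt the answer are derived only from the two nonces. The sketch below follows the formula in the official auth_key documentation as I understand it; `tmp_aes_key_iv` is a hypothetical name, and it takes the nonces as raw bytes rather than as the integers used in the code above.

```python
import os
from hashlib import sha1


def tmp_aes_key_iv(server_nonce: bytes, new_nonce: bytes):
    """Derive the 32-byte temporary AES key and 32-byte IGE IV from the nonces."""
    h1 = sha1(new_nonce + server_nonce).digest()
    h2 = sha1(server_nonce + new_nonce).digest()
    h3 = sha1(new_nonce + new_nonce).digest()
    key = h1 + h2[:12]                    # 20 + 12 = 32 bytes
    iv = h2[12:20] + h3 + new_nonce[:4]   # 8 + 20 + 4 = 32 bytes
    return key, iv


server_nonce = os.urandom(16)  # 128-bit nonce chosen by the server
new_nonce = os.urandom(32)     # 256-bit nonce chosen by the client
key, iv = tmp_aes_key_iv(server_nonce, new_nonce)
```

Since new_nonce only ever travels inside the RSA-encrypted payload, an observer who sees server_nonce in plaintext still cannot derive this key.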
At this time, the key is actually equal to gab:
auth_key = (g_a)^b mod dh_prime;
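Why both sides end up with the same value can be checked with a toy Diffie-Hellman run. The prime and generator below are illustrative choices of mine (the Mersenne prime 2^127 - 1 and g = 3), not the 2048-bit parameters the server actually sends.

```python
import secrets

PRIME = 2**127 - 1  # toy modulus; the real dh_prime is a 2048-bit safe prime
G = 3


def dh_exchange(prime: int = PRIME, g: int = G):
    """Return the key as computed by the server and as computed by the client."""
    a = secrets.randbelow(prime - 3) + 2   # server secret in [2, prime - 2]
    b = secrets.randbelow(prime - 3) + 2   # client secret in [2, prime - 2]
    g_a = pow(g, a, prime)                 # sent by the server
    g_b = pow(g, b, prime)                 # sent by the client
    server_key = pow(g_b, a, prime)        # (g^b)^a mod prime
    client_key = pow(g_a, b, prime)        # (g^a)^b mod prime
    return server_key, client_key


server_key, client_key = dh_exchange()
```

Only g_a and g_b cross the wire; the secrets a and b never leave their owners, yet both computations give g^(ab) mod prime.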
Before sending our parameters, the key parameters must be range-checked:

    # IMPORTANT: Apart from the conditions on the Diffie-Hellman prime
    # dh_prime and generator g, both sides are to check that g, g_a and
    # g_b are greater than 1 and less than dh_prime - 1. We recommend
    # checking that g_a and g_b are between 2^{2048-64} and
    # dh_prime - 2^{2048-64} as well.
    # (https://core.telegram.org/mtproto/auth_key#dh-key-exchange-complete)
    if not (1 < g < (dh_prime - 1)):
        raise SecurityError('g is not within (1, dh_prime - 1)')

    if not (1 < g_a < (dh_prime - 1)):
        raise SecurityError('g_a is not within (1, dh_prime - 1)')

    if not (1 < g_b < (dh_prime - 1)):
        raise SecurityError('g_b is not within (1, dh_prime - 1)')

    safety_range = 2 ** (2048 - 64)
    if not (safety_range <= g_a <= (dh_prime - safety_range)):
        raise SecurityError('g_a is not within (2^{2048-64}, dh_prime - 2^{2048-64})')

    if not (safety_range <= g_b <= (dh_prime - safety_range)):
        raise SecurityError('g_b is not within (2^{2048-64}, dh_prime - 2^{2048-64})')

The client's reply is encrypted with the same temporary AES key and IV:

    # Prepare client DH Inner Data
    client_dh_inner = bytes(ClientDHInnerData(
        nonce=res_pq.nonce,
        server_nonce=res_pq.server_nonce,
        retry_id=0,  # TODO Actual retry ID
        g_b=rsa.get_byte_array(g_b)
    ))

    client_dh_inner_hashed = sha1(client_dh_inner).digest() + client_dh_inner

    # Encryption
    client_dh_encrypted = AES.encrypt_ige(client_dh_inner_hashed, key, iv)

    # Prepare Set client DH params
    dh_gen = await sender.send(SetClientDHParamsRequest(
        nonce=res_pq.nonce,
        server_nonce=res_pq.server_nonce,
        encrypted_data=client_dh_encrypted,
    ))

If the server's response is positive, the two parties have agreed on a key through the negotiation.
The format of the response is as follows:

    struct dh_gen_ok
    {
        int128 nonce;            // identifies the session
        int128 server_nonce;     // identifies the session
        int128 new_nonce_hash1;  // check value
    }

Checking the result:

    # The answer is one of 3 possibilities
    nonce_types = (DhGenOk, DhGenRetry, DhGenFail)
    assert isinstance(dh_gen, nonce_types), 'Step 3.1 answer was %s' % dh_gen
    name = dh_gen.__class__.__name__
    if dh_gen.nonce != res_pq.nonce:
        raise SecurityError('Step 3 invalid {} nonce from server'.format(name))

    if dh_gen.server_nonce != res_pq.server_nonce:
        raise SecurityError(
            'Step 3 invalid {} server nonce from server'.format(name))

    auth_key = AuthKey(rsa.get_byte_array(gab))
    nonce_number = 1 + nonce_types.index(type(dh_gen))
    new_nonce_hash = auth_key.calc_new_nonce_hash(new_nonce, nonce_number)

    dh_hash = getattr(dh_gen, 'new_nonce_hash{}'.format(nonce_number))
    if dh_hash != new_nonce_hash:
        raise SecurityError('Step 3 invalid new nonce hash')

    if not isinstance(dh_gen, DhGenOk):
        raise AssertionError('Step 3.2 answer was %s' % dh_gen)

    return auth_key, time_offset

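The final check compares a hash that only a holder of both new_nonce and the negotiated key can compute. The sketch below is my reading of the official description (new_nonce, then one byte with the value 1, 2 or 3, then the first 8 bytes of SHA1(auth_key), with the low 128 bits of the SHA1 result kept); it is a standalone function with a hypothetical signature, not the AuthKey method itself.

```python
import os
from hashlib import sha1


def calc_new_nonce_hash(auth_key: bytes, new_nonce: int, number: int) -> int:
    """Check value for dh_gen_ok (1), dh_gen_retry (2) or dh_gen_fail (3)."""
    aux_hash = sha1(auth_key).digest()[:8]            # auth_key_aux_hash
    data = (new_nonce.to_bytes(32, "little", signed=True)
            + bytes([number])
            + aux_hash)
    # Keep 16 of the 20 digest bytes and read them as a signed 128-bit integer.
    return int.from_bytes(sha1(data).digest()[4:20], "little", signed=True)


auth_key = os.urandom(256)
new_nonce = int.from_bytes(os.urandom(32), "little", signed=True)
hashes = [calc_new_nonce_hash(auth_key, new_nonce, n) for n in (1, 2, 3)]
```

The three answer types yield three different values, which is why the code above selects the attribute to compare (new_nonce_hash1, 2 or 3) from the type of the answer.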
To be continued...