DHT
DHT stands for Distributed Hash Table and is essentially a distributed key-value database, where each member of the network can store something, for example, information about themselves.
The implementation of DHT in TON is inherently similar to the implementation of Kademlia, which is used in IPFS. Any network member can run a DHT node, generate keys and store data. To do this, he needs to generate a random ID and inform other nodes about himself.
To determine which node to store the data on, an algorithm is used to determine the "distance" between the node and the key. The algorithm is simple: we take the ID of the node and the ID of the key, we perform the XOR operation. The smaller the value, the closer the node. The task is to store the key on the nodes as close as possible to the key, so that other network participants can, using the same algorithm, find a node that can give data on this key.
Finding a value by key
Let's look at an example with a search for a key, connect to any DHT node and establish a connection via ADNL UDP.
For example, we want to find the address and public key for connecting to the node that hosts foundation.ton site.
Let's say we have already obtained the ADNL address of this site by executing the Get method of the DNS contract.
The ADNL address in hex representation is 516618cf6cbe9004f6883e742c9a2e3ca53ed02e3e36f4cef62a98ee1e449174
.
Now our goal is to find the ip, port and public key of the node that has this address.
To do this, we need to get the ID of the DHT key, first we will fill the DHT key schema:
dht.key id:int256 name:bytes idx:int = dht.Key
name
is the type of key, for ADNL addresses the word address
is used, and, for example, to search for shardchain nodes - nodes
. But the key type can be any array of bytes, depending on the value you are looking for.
Filling in this schema, we get:
8fde67f6 -- TL ID dht.key
516618cf6cbe9004f6883e742c9a2e3ca53ed02e3e36f4cef62a98ee1e449174 -- our searched ADNL address
07 61646472657373 -- key type, the word "address" as an TL array of bytes
00000000 -- index 0 because there is only 1 key
Next - get the key ID, sha256 hash from the bytes serialized above. It will be b30af0538916421b46df4ce580bf3a29316831e0c3323a7f156df0236c5b2f75
Now we can start searching. To do this, we need to execute a query that has schema:
dht.findValue key:int256 k:int = dht.ValueResult
key
is the id of our DHT key, and k
is the "width" of the search, the smaller it is, the more accurate, but fewer potential nodes to query. The maximum k for nodes in a TON is 10, usually 6 is used.
Let's populate this structure, serialize and send the request using the adnl.message.query
schema. You can read more about this in another article.
In response, we can get:
dht.valueNotFound
- if the value is not found.dht.valueFound
- if the value is found on this node.
dht.valueNotFound
If we get dht.valueNotFound
, the response will contain a list of nodes that are known to the node we requested and are as close as possible to the key we requested from the list of nodes known to it. In this case, we need to connect and add the received nodes to the list known to us.
After that, from the list of all nodes known to us, select the closest, accessible and not yet requested, and make the same request to it. And so on until we try all the nodes in the range we have chosen or until we stop receiving new nodes.
Let's analyze the response fields in more detail, the schemas used:
adnl.address.udp ip:int port:int = adnl.Address;
adnl.addressList addrs:(vector adnl.Address) version:int reinit_date:int priority:int expire_at:int = adnl.AddressList;
dht.node id:PublicKey addr_list:adnl.addressList version:int signature:bytes = dht.Node;
dht.nodes nodes:(vector dht.node) = dht.Nodes;
dht.valueNotFound nodes:dht.nodes = dht.ValueResult;
dht.nodes -> nodes
- list of DHT nodes (array).
Each node has an id
which is its public key, usually pub.ed25519, used as a server key to connect to the node via ADNL. Also, each node has a list of addresses addr_list:adnl.addressList
, version and signature.
We need to check the signature of each node, for this we read the value of signature
and set the field to zero (we make it an empty bytes array). After - we serialize the TL structure dht.node
with the empty signature and check the signature
field that was before emptying.
We check on the received serialized bytes, using the public key from id
field. [Implementation example]
From the list addrs:(vector adnl.Address)
we take the address and try to establish an ADNL UDP connection, as the server key we use id
, which is the public key.
To find out the "distance" to this node - we need to take key id from the key from the id
field and check the distance by the XOR operation from the node's key id and the desired key.
If the distance is small enough, we can make the same request to this node. And so on, until we find a value or there are no more new nodes.