It all starts with a little tree sapling. The sapling slowly grows a trunk, which then develops branches that eventually sprout leaves, and before you know it the whole thing turns into a big, lush green tree.
Now in the digital world also exists a tree, which goes by the name of Merkle Tree. This tree is primarily composed of a root node, a group of middle nodes, and leaf nodes. The term root node denotes the singular final node of a Merkle tree. A tree could very well have a large number of leaf nodes, but leaf nodes cannot have further child nodes. What purpose does a Merkle tree serve? Let us find out.
A Merkle tree is a non-linear, binary, hash tree-like data structure. Each leaf node of the tree stores the hash value of a data element, while a middle node stores the hash of the hashes of it’s two corresponding child nodes. The main advantage of using a Merkle tree is that several important pieces of information can be verified regarding a particular data element or the data set as a whole without the need to have access to the full data set. For example, it is possible to verify whether a particular data element is a part of a given data set, or to prove that a data element is actually a part of a larger data set without having to store and parse the full data set. It is due to these practical applications that Merkle trees are commonly used in industries such as Blockchain that are fundamentally based on P2P networks that frequently involve scenarios where data is fetched from a source whose credibility is not guaranteed, and thus the data is fetched and verified simultaneously. Introducing Merkle trees to the equation can help prevent issues such as synchronizing a full data set only to realize it is not verifiable, thus saving a lot of time and bandwidth.
In the case of blockchain platforms, generally speaking, users only need to synchronize data and transaction information that is related to their own accounts. If the user were to synchronize all of the data, efficiency would certainly take a hit. Thus, blockchains implement something known as the Simple Pay Verification (SPV) technique. Using this verification technique, users can build and verify Merkle proofs that only require a small portion of the data to be synchronized to carry out the necessary verification process. This directly results in lower storage and network bandwidth requirements for the end-user.
With respect to the Ontology framework, Merkle proofs have various application scenarios. One of the significant applications is the generation of a block Merkle tree with the transactions of the block acting as the leaf nodes. These records serve as verifiable evidence that a transaction has taken place on the chain. This article primarily aims at describing how Ontology carries out certain optimizations when implementing Merkle trees.
Most blockchains implement Merkle trees for individual blocks with the transaction hashes acting as the leaf nodes. However, in the case of the Ontology blockchain, with growing block height the block Merkle tree is dynamically updated with the new block data, leading to a mechanism that is slightly more complicated than the traditional scheme. This gives rise to a natural issue: How do you store this Merkle tree? Let us take a look at three different answers to that question.
This solution involves caching the Merkle tree. But there are two shortcomings here. First, because caching would mean that the tree is stored in volatile memory, when the machine is turned off or restarted, the entire chain would need to be traversed in order to generate the complete Merkle tree, a relatively time-consuming process. Also, as the block height increases the tree would be updated. Thus, the memory requirement would also grow linearly, seriously affecting scalability. So we can safely say that caching is not an optimum long term solution.
This solution involves storing the Merkle tree in a K-V database (such as LevelDB). Due to the simplicity of a K-V pair relationship, it would be necessary to specially encode the key and value to represent a tree-like structure. Also, the retrieval operation for a particular node of the tree would require multiple read operations, thereby decreasing the overall efficiency of the process.
Since the nodes of a Merkle tree are all hash values of a fixed length, if we were to map every hash value to an integer to form a 1–1 relationship, it would be possible to compress the entire tree into a one-dimensional integer array. The corresponding integer value could first be calculated and the node data can be stored at the corresponding indices. When accessing a particular node this integer value can be calculated and then used to directly access a node element’s data bypassing using it as the index. Storing this array in a file could solve the problem of the linear growth of the Merkle tree.
There are many different ways to map the tree nodes to an integer, with the most direct and intuitive one being layer by layer serialization based on the depth of the tree. But there’s a problem with this method of serialization. Every time the size of the tree changes, the serial number of a particular node will also change. Thus, the entire data would have to be parsed, serialized, and stored all over again. This would greatly affect the efficiency of the system. Hence, the stability of the file storage solution relies greatly on finding an efficient serialization method.
Apart from node serialization, using the file storage method presents other issues as well. As new blocks are constantly inserted and the block height continuously increases, the Merkle tree nodes are frequently updated, and thus the file also needs to be updated and constantly overwritten. This is another factor that would cause the efficiency to go down significantly.
To further add to the complexity of this process, it requires a data consistency processing mechanism in place. Considering a hypothetical case that very well may occur, let’s say that the node elements are being updated and the process is about half complete, and suddenly a new block is generated. This would result in an inconsistency in the Merkle tree data file.
If you closely observe the Merkle tree node insertion process, there are two types of nodes present in the Merkle tree: temporary nodes, whose node hash is susceptible to change as new nodes are inserted, and constant nodes that do not change with the insertion of new nodes. It is easy to demonstrate that the pre-condition required for a node to become a constant node is for it, along with its child nodes, to form a complete binary tree. Also, it is clear that the number of temporary nodes is very limited (just log(n)). The number of temporary nodes can be determined from the constant nodes, and after persisting in the memory, it will be changed as soon as new nodes are inserted.
Therefore, in Ontology’s solution to the problem, only constant nodes are added to the file. Another interesting coincidence is that the sequence in which constant nodes are formed is a stable serialization sequence. Considering the above-stated facts, there is only one action performed on the file, and that is to append. And this solves the data inconsistency problem that would arise due to overwriting the file as well.
Due to the special property of constant nodes, it becomes evident that the child nodes of a constant node will make no contribution to the Merkle tree’s update process. Thus, those who are not going to deploy nodes that provide verification services and are only interested in calculating the latest Merkle root’s hash value can simply store the root nodes of log(n) number of complete Merkle trees. This is enough to represent the entire Merkle tree status, thereby reducing the number of nodes that need to be stored to log(n) which is a lot more convenient to store using one key in the LevelDB. Updating the Merkle tree would then require only one read and write operation. The data structure definition is as follows:
It is clear from the compact Merkle tree’s definition that in order to obtain the root hash of the Merkle tree the hashes array needs to be parsed from right to left successively.
The algorithm is as follows:
Herein, the hash_empty function returns empty hashes and the hash_children function returns the hash value of the parent node that the two child nodes’ hashes correspond to.
When new leaf nodes are inserted, dynamic updation is carried out based on the current status of the Merkle tree. The algorithm used to add new leaf nodes is as follows:
The changes that take place in the hash values and compact representation data in the storage file from the process of Merkle tree growth are illustrated here.
The first figure is the Merkle tree’s single node status:
When a new node b is inserted into the Merkle tree, it’s size increases by 1. The new node b can be paired with node a to compute the hash value c.
When a new node d is inserted, since an already complete binary tree exists, the new node d is added individually for the time being. The tree size increases by 1.
At this point, every time a new node is inserted into the tree, we can determine the positioning and hash values based on the algorithm discussed above.
Merkle tree has a wide range of applications across different scenarios. In the context of Ontology, one of the applications of Merkle tree is to record every new block in the form of leaf nodes and provide existential evidence for the transactions that take place on-chain and are a part of these blocks.
For use cases where the purpose is not providing a verification service, Merkle trees can significantly improve the performance and storage capacity of consensus nodes. Ontology records and stores only the key nodes when implementing block Merkle trees. As a result of this methodology, we can update the Merkle tree using just one read and write operation on LevelDB. The time complexity is reduced to O(log(n)).
Moreover, when providing a verification service, the solution provided by Ontology can greatly simplify the frequent storage file overwriting issue and help maintain data consistency by reducing the operation to just appending the file.
So, do you see and appreciate the magic of Merkle trees now?
해지하기 버튼 자체가 없어요
Community | [중요] 이오스 렉스 참여방법 (EOS 입금부터 수익금 확인방법까지)
몇번이고 확인 했습니다. [해지하기]라는 버튼 자체가 없어요. 확인부탁드립니다.
Community | [중요] 이오스 렉스 참여방법 (EOS 입금부터 수익금 확인방법까지)
아이더스페이 가입자로서 이런 기사들이 밖으로 많이 보여지면 좋겠다고 생각합니다. QTS를 수익모델로 만들어진 투자상품이 아이더스페이인데 관련 정보는 https://aidus.co.kr/ 에서 얻어가세요.
News monitoring | '아이더스' QTS 최고 권위 ‘배틀 오브 더 퀀츠’ 포럼 참가
아이더스페이 가입자입니다. https://aidus.co.kr/ 에서 유익한 정보 얻어가세요~
News monitoring | '아이더스' 아이더스페이 사전 가입 이벤트 1차 2차 조기 매진
수익으로는 이오스로 추가 보상됩니다. 다른 토큰이 추가로 지급되지 않습니다.
Community | [중요] 이오스 렉스 참여방법 (EOS 입금부터 수익금 확인방법까지)
Write a post
Are you sure you want to delete this post?
Are you sure you want to delete this comment?
Purchase has been completed.
닉네임을 설정 후 작성해주세요.