# Encoding

## Amino

Tendermint uses the proto3 derivative [Amino](https://github.com/tendermint/go-amino) for all data structures.
Think of Amino as an object-oriented proto3 with native JSON support.
The goal of the Amino encoding protocol is to bring parity between application
logic objects and persistence objects.

Please see the [Amino
specification](https://github.com/tendermint/go-amino#amino-encoding-for-go) for
more details.

Notably, every object that satisfies an interface (e.g. a particular kind of p2p message,
or a particular kind of pubkey) is registered with a global name, the hash of
which is included in the object's encoding as the so-called "prefix bytes".

We define the `func AminoEncode(obj interface{}) []byte` function to take an
arbitrary object and return the Amino encoded bytes.
## Byte Arrays

The encoding of a byte array is simply the raw bytes prefixed with the length of
the array as a `UVarint` (what proto calls a `Varint`).

For details on varints, see the [protobuf
spec](https://developers.google.com/protocol-buffers/docs/encoding#varints).

For example, the byte-array `[0xA, 0xB]` would be encoded as `0x020A0B`,
while a byte-array containing 300 entries beginning with `[0xA, 0xB, ...]` would
be encoded as `0xAC020A0B...` where `0xAC02` is the UVarint encoding of 300.
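The length-prefix rule can be sketched with Go's standard `encoding/binary` package. This is only an illustration, not Amino's own code: the function name `encodeByteArray` is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeByteArray prefixes data with its length encoded as a UVarint.
// The name is illustrative; it is not part of Tendermint or Amino.
func encodeByteArray(data []byte) []byte {
	prefix := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(prefix, uint64(len(data)))
	return append(prefix[:n], data...)
}

func main() {
	fmt.Printf("%X\n", encodeByteArray([]byte{0x0A, 0x0B})) // 020A0B

	long := make([]byte, 300)
	long[0], long[1] = 0x0A, 0x0B
	fmt.Printf("%X\n", encodeByteArray(long)[:4]) // AC020A0B
}
```

The second call shows the two-byte length prefix `AC02` that appears once the array is longer than 127 bytes.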
## Hashing

Tendermint uses `SHA256` as its hash function.
Objects are always Amino encoded before being hashed.
So `SHA256(obj)` is short for `SHA256(AminoEncode(obj))`.
## Public Key Cryptography

Tendermint uses Amino to distinguish between different types of private keys,
public keys, and signatures. Additionally, for each public key, Tendermint
defines an Address function that can be used as a more compact identifier in
place of the public key. Here we list the concrete types, their names,
and prefix bytes for public keys and signatures, as well as the address schemes
for each PubKey. Note that for brevity we don't
include details of the private keys beyond their type and name, as they can be
derived the same way as the others using Amino.

All registered objects are encoded by Amino using a 4-byte PrefixBytes that
uniquely identifies the object and includes information about its underlying
type. For details on how PrefixBytes are computed, see the [Amino
spec](https://github.com/tendermint/go-amino#computing-the-prefix-and-disambiguation-bytes).
In what follows, we provide the type names and prefix bytes directly.
Notice that when encoding byte-arrays, the length of the byte-array is appended
to the PrefixBytes. Thus the encoding of a byte array becomes `<PrefixBytes> <Length> <ByteArray>`.
In other words, to encode any type listed below you do not need to be
familiar with Amino encoding.
You can simply use the table below and concatenate `Prefix || Length (of raw bytes) || raw bytes`
(where `||` stands for byte concatenation).

| Type                    | Name                               | Prefix     | Length   | Notes |
| ----------------------- | ---------------------------------- | ---------- | -------- | ----- |
| PubKeyEd25519           | tendermint/PubKeyEd25519           | 0x1624DE64 | 0x20     |       |
| PubKeySr25519           | tendermint/PubKeySr25519           | 0x0DFB1005 | 0x20     |       |
| PubKeySecp256k1         | tendermint/PubKeySecp256k1         | 0xEB5AE987 | 0x21     |       |
| PrivKeyEd25519          | tendermint/PrivKeyEd25519          | 0xA3288910 | 0x40     |       |
| PrivKeySr25519          | tendermint/PrivKeySr25519          | 0x2F82D78B | 0x20     |       |
| PrivKeySecp256k1        | tendermint/PrivKeySecp256k1        | 0xE1B0F79B | 0x20     |       |
| PubKeyMultisigThreshold | tendermint/PubKeyMultisigThreshold | 0x22C1F7E2 | variable |       |
### Example

For example, the 33-byte (or 0x21-byte in hex) Secp256k1 pubkey
`020BD40F225A57ED383B440CF073BC5539D0341F5767D2BF2D78406D00475A2EE9`
would be encoded as
`EB5AE98721020BD40F225A57ED383B440CF073BC5539D0341F5767D2BF2D78406D00475A2EE9`
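The concatenation in this example can be reproduced with a few lines of Go. This is a minimal sketch that takes the prefix bytes from the table above. The helper names `encodeWithPrefix` and `encodePubKeyHex` are hypothetical, and the sketch assumes the raw key is shorter than 128 bytes so that the UVarint length fits in one byte.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// Prefix bytes for tendermint/PubKeySecp256k1, taken from the table above.
var secp256k1PubKeyPrefix = []byte{0xEB, 0x5A, 0xE9, 0x87}

// encodeWithPrefix returns prefix || length || raw.
// Assumption of this sketch: len(raw) < 128, so the length is a single byte.
func encodeWithPrefix(prefix, raw []byte) []byte {
	out := make([]byte, 0, len(prefix)+1+len(raw))
	out = append(out, prefix...)
	out = append(out, byte(len(raw)))
	return append(out, raw...)
}

// encodePubKeyHex is a hypothetical convenience wrapper working on hex strings.
func encodePubKeyHex(rawHex string) (string, error) {
	raw, err := hex.DecodeString(rawHex)
	if err != nil {
		return "", err
	}
	encoded := encodeWithPrefix(secp256k1PubKeyPrefix, raw)
	return strings.ToUpper(hex.EncodeToString(encoded)), nil
}

func main() {
	out, err := encodePubKeyHex("020BD40F225A57ED383B440CF073BC5539D0341F5767D2BF2D78406D00475A2EE9")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```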
### Key Types

Each type specifies its own pubkey, address, and signature format.

#### Ed25519

TODO: pubkey

The address is the first 20 bytes of the SHA256 hash of the raw 32-byte public key:

```
address = SHA256(pubkey)[:20]
```

The signature is the raw 64-byte ED25519 signature.
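The address rule `SHA256(pubkey)[:20]` translates directly into Go. The sketch below uses the standard `crypto/sha256` package. The function name `addressFromPubKey` is hypothetical, and the all-zero key is a placeholder for illustration only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// addressFromPubKey returns the first 20 bytes of SHA256(pubkey).
// The name is illustrative; it is not Tendermint's API.
func addressFromPubKey(pubkey []byte) []byte {
	sum := sha256.Sum256(pubkey)
	return sum[:20]
}

func main() {
	pubkey := make([]byte, 32) // placeholder key, not a real public key
	addr := addressFromPubKey(pubkey)
	fmt.Printf("%d-byte address: %X\n", len(addr), addr)
}
```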
#### Sr25519

TODO: pubkey

The address is the first 20 bytes of the SHA256 hash of the raw 32-byte public key:

```
address = SHA256(pubkey)[:20]
```

The signature is the raw 64-byte Sr25519 signature.
#### Secp256k1

TODO: pubkey

The address is the RIPEMD160 hash of the SHA256 hash of the OpenSSL compressed public key:

```
address = RIPEMD160(SHA256(pubkey))
```

This is the same as Bitcoin.

The signature is the 64-byte concatenation of ECDSA `r` and `s` (i.e. `r || s`),
where `s` is lexicographically less than its inverse, to prevent malleability.
This is like Ethereum, but without the extra byte for pubkey recovery, since
Tendermint assumes the pubkey is always provided anyway.

#### Multisig

TODO
## Other Common Types

### BitArray

The BitArray is used in some consensus messages to represent votes received from
validators, or parts received in a block. It is represented
with a struct containing the number of bits (`Bits`) and the bit-array itself
encoded in base64 (`Elems`).

```go
type BitArray struct {
	Bits  int
	Elems []uint64
}
```

This type is easily encoded directly by Amino.

Note that BitArray receives a special JSON encoding in the form of `x` and `_`
representing `1` and `0`. For example, the BitArray `10110` would be JSON encoded as
`"x_xx_"`.
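The `x`/`_` JSON form can be illustrated with a small helper. This sketch works on a plain `[]bool` rather than the `BitArray` struct, because how bits are packed into `Elems` is not described here. The name `bitsToJSONString` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// bitsToJSONString renders set bits as 'x' and unset bits as '_',
// wrapped in double quotes so the result is a JSON string literal.
func bitsToJSONString(bits []bool) string {
	var sb strings.Builder
	sb.WriteByte('"')
	for _, set := range bits {
		if set {
			sb.WriteByte('x')
		} else {
			sb.WriteByte('_')
		}
	}
	sb.WriteByte('"')
	return sb.String()
}

func main() {
	// The bit pattern 10110 from the text.
	fmt.Println(bitsToJSONString([]bool{true, false, true, true, false})) // "x_xx_"
}
```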
### Part

Part is used to break up blocks into pieces that can be gossiped in parallel
and securely verified using a Merkle tree of the parts.

Part contains the index of the part (`Index`), the actual
underlying data of the part (`Bytes`), and a Merkle proof that the part is contained in
the set (`Proof`).

```go
type Part struct {
	Index int
	Bytes []byte
	Proof SimpleProof
}
```

See details of SimpleProof, below.
### MakeParts

Encode an object using Amino and slice it into parts.
Tendermint uses a part size of 65536 bytes, and allows a maximum of 1601 parts
(see `types.MaxBlockPartsCount`). This corresponds to the hard-coded block size
limit of 100MB.

```go
func MakeParts(block Block) []Part
```
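The slicing step of `MakeParts` can be sketched as follows. This is only the chunking of already encoded bytes. It does not perform the Amino encoding or build the Merkle proofs that a real `Part` carries, and the name `splitIntoParts` is hypothetical.

```go
package main

import "fmt"

// partSize is the part size stated above.
const partSize = 65536

// splitIntoParts slices data into consecutive chunks of at most size bytes.
// The chunks share memory with data; nothing is copied.
func splitIntoParts(data []byte, size int) [][]byte {
	var parts [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		parts = append(parts, data[start:end])
	}
	return parts
}

func main() {
	encoded := make([]byte, 150000) // stand-in for an Amino-encoded block
	parts := splitIntoParts(encoded, partSize)
	fmt.Println(len(parts), len(parts[len(parts)-1])) // 3 18928
}
```

Only the last part may be shorter than the part size. In the example, 150000 bytes give two full parts and one part of 18928 bytes.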
## Merkle Trees

For an overview of Merkle trees, see
[wikipedia](https://en.wikipedia.org/wiki/Merkle_tree).

We use the RFC 6962 specification of a merkle tree, with sha256 as the hash function.
Merkle trees are used throughout Tendermint to compute a cryptographic digest of a data structure.
The differences between RFC 6962 and the simplest form of a merkle tree are that:

1. Leaf nodes and inner nodes have different hashes.
   This is for "second pre-image resistance", to prevent the proof to an inner node being valid as the proof of a leaf.
   The leaf nodes are `SHA256(0x00 || leaf_data)`, and inner nodes are `SHA256(0x01 || left_hash || right_hash)`.

2. When the number of items isn't a power of two, the left half of the tree is as big as it could be
   (the largest power of two less than the number of items). This allows new leaves to be added with less
   recomputation. For example:
```
   Simple Tree with 6 items           Simple Tree with 7 items

              *                                  *
             / \                                / \
           /     \                            /     \
         /         \                        /         \
       /             \                    /             \
      *               *                  *               *
     / \             / \                / \             / \
    /   \           /   \              /   \           /   \
   /     \         /     \            /     \         /     \
  *       *       h4     h5          *       *       *       h6
 / \     / \                        / \     / \     / \
h0  h1  h2  h3                     h0  h1  h2  h3  h4  h5
```
### MerkleRoot

The function `MerkleRoot` is a simple recursive function defined as follows:

```go
// tmhash.Sum computes SHA256 (Tendermint's crypto/tmhash package).

// SHA256(0x00 || leaf)
func leafHash(leaf []byte) []byte {
	return tmhash.Sum(append([]byte{0x00}, leaf...))
}

// SHA256(0x01 || left || right)
func innerHash(left []byte, right []byte) []byte {
	// Build a fresh slice so that neither input is modified.
	data := append([]byte{0x01}, left...)
	return tmhash.Sum(append(data, right...))
}

// largest power of 2 less than k
func getSplitPoint(k int) int { ... }

func MerkleRoot(items [][]byte) []byte {
	switch len(items) {
	case 0:
		return nil
	case 1:
		return leafHash(items[0])
	default:
		k := getSplitPoint(len(items))
		left := MerkleRoot(items[:k])
		right := MerkleRoot(items[k:])
		return innerHash(left, right)
	}
}
```
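For experimentation, the definition above can be turned into a standalone program. The sketch below makes two assumptions that are not part of the specification: it uses the standard `crypto/sha256` package in place of `tmhash.Sum`, and it fills in `getSplitPoint` with one possible loop that matches the comment "largest power of 2 less than k".

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// leafHash returns SHA256(0x00 || leaf).
func leafHash(leaf []byte) []byte {
	sum := sha256.Sum256(append([]byte{0x00}, leaf...))
	return sum[:]
}

// innerHash returns SHA256(0x01 || left || right).
func innerHash(left, right []byte) []byte {
	data := append([]byte{0x01}, left...)
	sum := sha256.Sum256(append(data, right...))
	return sum[:]
}

// getSplitPoint returns the largest power of 2 strictly less than k.
// Assumption of this sketch: callers only pass k >= 2.
func getSplitPoint(k int) int {
	split := 1
	for split*2 < k {
		split *= 2
	}
	return split
}

// MerkleRoot follows the recursive definition given above.
func MerkleRoot(items [][]byte) []byte {
	switch len(items) {
	case 0:
		return nil
	case 1:
		return leafHash(items[0])
	default:
		k := getSplitPoint(len(items))
		return innerHash(MerkleRoot(items[:k]), MerkleRoot(items[k:]))
	}
}

func main() {
	items := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	// With three items the split point is 2, so the root should equal
	// innerHash(innerHash(leaf a, leaf b), leaf c).
	want := innerHash(innerHash(leafHash(items[0]), leafHash(items[1])), leafHash(items[2]))
	fmt.Println(bytes.Equal(MerkleRoot(items), want))                 // true
	fmt.Println(getSplitPoint(6), getSplitPoint(7), getSplitPoint(8)) // 4 4 4
}
```

The split points printed for 6 and 7 items match the two trees drawn earlier, where the left subtree always holds four leaves.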
Note: `MerkleRoot` operates on items which are arbitrary byte arrays, not
necessarily hashes. For items which need to be hashed first, we introduce the
`Hashes` function:

```go
func Hashes(items [][]byte) (hashes [][]byte) {
	for _, item := range items {
		hashes = append(hashes, tmhash.Sum(item)) // SHA256 of each item
	}
	return hashes
}
```

Note: we will abuse notation and invoke `MerkleRoot` with arguments of type `struct` or type `[]struct`.
For `struct` arguments, we compute a `[][]byte` containing the amino encoding of each
field in the struct, in the same order the fields appear in the struct.
For `[]struct` arguments, we compute a `[][]byte` by amino encoding the individual `struct` elements.
### Simple Merkle Proof

Proof that a leaf is in a Merkle tree is composed as follows:

```golang
type SimpleProof struct {
	Total    int
	Index    int
	LeafHash []byte
	Aunts    [][]byte
}
```

Which is verified as follows:
```golang
// Verify reports whether leaf is included under rootHash according to proof.
// It uses bytes.Equal from the standard library to compare hashes.
func (proof SimpleProof) Verify(rootHash []byte, leaf []byte) bool {
	if !bytes.Equal(proof.LeafHash, leafHash(leaf)) {
		return false
	}
	computedHash := computeHashFromAunts(proof.Index, proof.Total, proof.LeafHash, proof.Aunts)
	return computedHash != nil && bytes.Equal(computedHash, rootHash)
}

// computeHashFromAunts recomputes the root hash from a leaf hash and its aunts.
// It returns nil where the original pseudocode asserted.
func computeHashFromAunts(index, total int, leafHash []byte, innerHashes [][]byte) []byte {
	if index < 0 || index >= total || total <= 0 {
		return nil
	}
	if total == 1 {
		if len(innerHashes) != 0 {
			return nil
		}
		return leafHash
	}
	if len(innerHashes) == 0 {
		return nil
	}
	last := len(innerHashes) - 1
	numLeft := getSplitPoint(total) // largest power of 2 less than total
	if index < numLeft {
		leftHash := computeHashFromAunts(index, numLeft, leafHash, innerHashes[:last])
		if leftHash == nil {
			return nil
		}
		return innerHash(leftHash, innerHashes[last])
	}
	rightHash := computeHashFromAunts(index-numLeft, total-numLeft, leafHash, innerHashes[:last])
	if rightHash == nil {
		return nil
	}
	return innerHash(innerHashes[last], rightHash)
}
```
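To see how the aunts are ordered, it helps to run the verification on a tiny tree. The following sketch is a standalone variant of the functions above. It assumes `crypto/sha256` in place of `tmhash.Sum` and a loop-based `getSplitPoint`, neither of which is prescribed by this document. For a three-leaf tree the aunts are listed bottom-up, so the last aunt is the sibling directly below the root.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

func leafHash(leaf []byte) []byte {
	sum := sha256.Sum256(append([]byte{0x00}, leaf...))
	return sum[:]
}

func innerHash(left, right []byte) []byte {
	data := append([]byte{0x01}, left...)
	sum := sha256.Sum256(append(data, right...))
	return sum[:]
}

// getSplitPoint returns the largest power of 2 strictly less than k (k >= 2).
func getSplitPoint(k int) int {
	split := 1
	for split*2 < k {
		split *= 2
	}
	return split
}

// computeHashFromAunts rebuilds the root from a leaf hash and its aunts.
// It returns nil when the arguments are inconsistent.
func computeHashFromAunts(index, total int, leaf []byte, aunts [][]byte) []byte {
	if index < 0 || index >= total {
		return nil
	}
	if total == 1 {
		if len(aunts) != 0 {
			return nil
		}
		return leaf
	}
	if len(aunts) == 0 {
		return nil
	}
	last := len(aunts) - 1
	numLeft := getSplitPoint(total)
	if index < numLeft {
		left := computeHashFromAunts(index, numLeft, leaf, aunts[:last])
		if left == nil {
			return nil
		}
		return innerHash(left, aunts[last])
	}
	right := computeHashFromAunts(index-numLeft, total-numLeft, leaf, aunts[:last])
	if right == nil {
		return nil
	}
	return innerHash(aunts[last], right)
}

func main() {
	l0, l1, l2 := leafHash([]byte("a")), leafHash([]byte("b")), leafHash([]byte("c"))
	root := innerHash(innerHash(l0, l1), l2)

	// Item 0: aunts are its sibling l1, then l2 one level up.
	fmt.Println(bytes.Equal(computeHashFromAunts(0, 3, l0, [][]byte{l1, l2}), root)) // true
	// Item 2: the only aunt is the hash of the left subtree.
	fmt.Println(bytes.Equal(computeHashFromAunts(2, 3, l2, [][]byte{innerHash(l0, l1)}), root)) // true
	// Claiming the wrong index changes the hashing order and fails.
	fmt.Println(bytes.Equal(computeHashFromAunts(1, 3, l0, [][]byte{l1, l2}), root)) // false
}
```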
The number of aunts is limited to 100 (`MaxAunts`) to protect the node against DoS attacks.
This limits the tree size to 2^100 leaves, which should be sufficient for any
conceivable purpose.

### IAVL+ Tree

Because Tendermint only uses a Simple Merkle Tree, application developers are expected to use their own Merkle tree in their applications. For example, the IAVL+ Tree, an immutable self-balancing binary tree for persisting application state, is used by the [Cosmos SDK](https://github.com/cosmos/cosmos-sdk/blob/ae77f0080a724b159233bd9b289b2e91c0de21b5/docs/interfaces/lite/specification.md).
## JSON

### Amino

Amino also supports JSON encoding - registered types are simply encoded as:

```
{
  "type": "<amino type name>",
  "value": <JSON>
}
```

For instance, an ED25519 PubKey would look like:

```
{
  "type": "tendermint/PubKeyEd25519",
  "value": "uZ4h63OFWuQ36ZZ4Bd6NF+/w9fWUwrOncrQsackrsTk="
}
```

Where the `"value"` is the base64 encoding of the raw pubkey bytes, and the
`"type"` is the amino name for Ed25519 pubkeys.
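The envelope for a byte-valued type such as this pubkey can be read with Go's standard `encoding/json`, which decodes base64 strings into `[]byte` fields. This sketch is not Amino's JSON codec. The `typedValue` struct and `decodeTyped` function are hypothetical, and they only cover types whose `"value"` is a base64 string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typedValue mirrors the {"type", "value"} envelope for byte-valued types.
// It is a hypothetical helper type, not something defined by Amino.
type typedValue struct {
	Type  string `json:"type"`
	Value []byte `json:"value"` // encoding/json maps []byte to base64 text
}

// decodeTyped parses one envelope from its JSON text.
func decodeTyped(doc string) (typedValue, error) {
	var tv typedValue
	err := json.Unmarshal([]byte(doc), &tv)
	return tv, err
}

func main() {
	doc := `{"type": "tendermint/PubKeyEd25519", "value": "uZ4h63OFWuQ36ZZ4Bd6NF+/w9fWUwrOncrQsackrsTk="}`
	tv, err := decodeTyped(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(tv.Type, len(tv.Value)) // tendermint/PubKeyEd25519 32
}
```

The decoded value is 32 bytes long, matching the Ed25519 pubkey length of `0x20` given in the table of prefixes.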
### Signed Messages

Signed messages (e.g. votes, proposals) in the consensus are encoded using Amino.

When signing, the elements of a message are re-ordered so the fixed-length fields
are first, making it easy to quickly check the type, height, and round.
The `ChainID` is also appended to the end.
We call this encoding the SignBytes. For instance, SignBytes for a vote is the Amino encoding of the following struct:

```go
type CanonicalVote struct {
	Type      byte
	Height    int64 `binary:"fixed64"`
	Round     int64 `binary:"fixed64"`
	BlockID   CanonicalBlockID
	Timestamp time.Time
	ChainID   string
}
```

The field ordering and the fixed-size encoding of the first three fields are optimized to ease parsing of SignBytes
in HSMs. They create fixed offsets for the relevant fields that need to be read in this context.
For more details, see the [signing spec](../consensus/signing.md).
Also, see the motivating discussion in
[#1622](https://github.com/tendermint/tendermint/issues/1622).