diff --git a/docs/specification/new-spec/README.md b/docs/specification/new-spec/README.md new file mode 100644 index 000000000..a5061e62e --- /dev/null +++ b/docs/specification/new-spec/README.md @@ -0,0 +1,57 @@ +# Tendermint Specification + +This is a markdown specification of the Tendermint blockchain. + +It defines the base data structures used in the blockchain and how they are validated. + +It contains the following components: + +- [Encoding and Digests](encoding.md) +- [Blockchain](blockchain.md) +- [State](state.md) + +## Overview + +Tendermint provides Byzantine Fault Tolerant State Machine Replication using +hash-linked batches of transactions. Such transaction batches are called "blocks". +Hence Tendermint defines a "blockchain". + +Each block in Tendermint has a unique index - its Height. +A block at `Height == H` can only be committed *after* the +block at `Height == H-1`. +Each block is committed by a known set of weighted Validators. +Membership and weighting within this set may change over time. +Tendermint guarantees the safety and liveness of the blockchain +so long as less than 1/3 of the total weight of the Validator set +is malicious. + +A commit in Tendermint is a set of signed messages from more than 2/3 of +the total weight of the current Validator set. Validators take turns proposing +blocks and voting on them. Once enough votes are received, the block is considered +committed. These votes are included in the *next* block as proof that the previous block +was committed - they cannot be included in the current block, as that block has already been +created. + +Once a block is committed, it can be executed against an application. +The application returns results for each of the transactions in the block. +The application can also return changes to be made to the validator set, +as well as a cryptographic digest of its latest state. 
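
The "more than 2/3 of the total weight" rule described above can be sketched in Go. This is a minimal illustration, not code from Tendermint: the function name `hasTwoThirdsMajority` is hypothetical, and the sketch assumes voting power values small enough that multiplying by 3 does not overflow an `int64`.

```go
package main

import "fmt"

// hasTwoThirdsMajority reports whether tallied voting power is strictly
// greater than 2/3 of the total voting power.
// Hypothetical helper for illustration only. Cross-multiplying avoids the
// truncation that integer division by 3 would introduce.
func hasTwoThirdsMajority(tallied, total int64) bool {
	return tallied*3 > total*2
}

func main() {
	fmt.Println(hasTwoThirdsMajority(7, 10)) // 21 > 20
	fmt.Println(hasTwoThirdsMajority(6, 9))  // 18 > 18 does not hold
}
```

Exactly 2/3 of the weight is not sufficient under this reading, which is why the comparison is strict.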
+ +Tendermint is designed to enable efficient verification and authentication +of the latest state of the blockchain. To achieve this, it embeds +cryptographic commitments to certain information in the block "header". +This information includes the contents of the block (e.g. the transactions), +the validator set committing the block, as well as the various results returned by the application. +Note, however, that block execution only occurs *after* a block is committed. +Thus, application results can only be included in the *next* block. + +Also note that information like the transaction results and the validator set is never +directly included in the block - only its cryptographic digests (Merkle roots) are. +Hence, verification of a block requires a separate data structure to store this information. +We call this the `State`. Block verification also requires access to the previous block. + +## TODO + +- Light Client +- P2P +- Reactor protocols (consensus, mempool, blockchain, pex) diff --git a/docs/specification/new-spec/blockchain.md b/docs/specification/new-spec/blockchain.md new file mode 100644 index 000000000..ce2529f80 --- /dev/null +++ b/docs/specification/new-spec/blockchain.md @@ -0,0 +1,391 @@ +# Tendermint Blockchain + +Here we describe the data structures in the Tendermint blockchain and the rules for validating them. + +# Data Structures + +The Tendermint blockchain consists of a short list of basic data types: +`Block`, `Header`, `Vote`, `BlockID`, `Signature`, and `Evidence`. + +## Block + +A block consists of a header, a list of transactions, a list of votes (the commit), +and a list of evidence of malfeasance (i.e. signing conflicting votes). 
+ +``` +type Block struct { + Header Header + Txs [][]byte + LastCommit []Vote + Evidence []Evidence +} +``` + +## Header + +A block header contains metadata about the block and about the consensus, as well as commitments to +the data in the current block, the previous block, and the results returned by the application: + +``` +type Header struct { + // block metadata + Version string // Version string + ChainID string // ID of the chain + Height int64 // current block height + Time int64 // UNIX time, in milliseconds + + // current block + NumTxs int64 // Number of txs in this block + TxHash []byte // SimpleMerkle of the block.Txs + LastCommitHash []byte // SimpleMerkle of the block.LastCommit + + // previous block + TotalTxs int64 // prevBlock.TotalTxs + block.NumTxs + LastBlockID BlockID // BlockID of prevBlock + + // application + ResultsHash []byte // SimpleMerkle of []abci.Result from prevBlock + AppHash []byte // Arbitrary state digest + ValidatorsHash []byte // SimpleMerkle of the ValidatorSet + ConsensusParamsHash []byte // SimpleMerkle of the ConsensusParams + + // consensus + Proposer []byte // Address of the block proposer + EvidenceHash []byte // SimpleMerkle of []Evidence +} +``` + +Further details on each of these fields are given below. + +## BlockID + +The `BlockID` contains two distinct Merkle roots of the block. +The first, used as the block's main hash, is the Merkle root +of all the fields in the header. The second, used for secure gossiping of +the block during consensus, is the Merkle root of the complete serialized block +cut into parts. The `BlockID` includes these two hashes, as well as the number of +parts. + +``` +type BlockID struct { + Hash []byte + Parts PartsHeader +} + +type PartsHeader struct { + Hash []byte + Total int32 +} +``` + +## Vote + +A vote is a signed message from a validator for a particular block. +The vote includes information about the validator signing it. 
+ +``` +type Vote struct { + Timestamp int64 + Address []byte + Index int + Height int64 + Round int + Type int8 + BlockID BlockID + Signature Signature +} +``` + + +There are two types of votes: +a prevote has `vote.Type == 1` and +a precommit has `vote.Type == 2`. + +## Signature + +Tendermint allows for multiple signature schemes to be used by prepending a single type-byte +to the signature bytes. Different signatures may also come with fixed or variable lengths. +Currently, Tendermint supports Ed25519 and Secp256k1. + +### Ed25519 + +An Ed25519 signature has `Type == 0x1`. It looks like: + +``` +// Implements Signature +type Ed25519Signature struct { + Type int8 = 0x1 + Signature [64]byte +} +``` + +where `Signature` is the 64 byte signature. + +### Secp256k1 + +A `Secp256k1` signature has `Type == 0x2`. It looks like: + +``` +// Implements Signature +type Secp256k1Signature struct { + Type int8 = 0x2 + Signature []byte +} +``` + +where `Signature` is the DER encoded signature, i.e.: + +``` +0x30 <length of whole message> <0x02> <length of R> <R> 0x2 <length of S> <S>. +``` + +## Evidence + +TODO + +# Validation + +Here we describe the validation rules for every element in a block. +Blocks which do not satisfy these rules are considered invalid. + +We abuse notation by using something that looks like Go, supplemented with English. +A statement such as `x == y` is an assertion - if it fails, the item is invalid. + +We refer to certain globally available objects: +`block` is the block under consideration, +`prevBlock` is the `block` at the previous height, +and `state` keeps track of the validator set, the consensus parameters +and other results from the application. +Elements of an object are accessed as expected, +e.g. `block.Header`. See [here](state.md) for the definition of `state`. + +## Header + +A Header is valid if its corresponding fields are valid. + +### Version + +Arbitrary string. + +### ChainID + +Arbitrary constant string. 
+ +### Height + +``` +block.Header.Height > 0 +block.Header.Height == prevBlock.Header.Height + 1 +``` + +The height is an incrementing integer. The first block has `block.Header.Height == 1`. + +### Time + +The median of the timestamps of the valid votes in `block.LastCommit`. +Corresponds to the number of nanoseconds, with millisecond resolution, since January 1, 1970. + +Note the timestamp in a vote must be greater by at least one millisecond than that of the +block being voted on. + +### NumTxs + +``` +block.Header.NumTxs == len(block.Txs) +``` + +Number of transactions included in the block. + +### TxHash + +``` +block.Header.TxHash == SimpleMerkleRoot(block.Txs) +``` + +Simple Merkle root of the transactions in the block. + +### LastCommitHash + +``` +block.Header.LastCommitHash == SimpleMerkleRoot(block.LastCommit) +``` + +Simple Merkle root of the votes included in the block. +These are the votes that committed the previous block. + +The first block has `block.Header.LastCommitHash == []byte{}`. + +### TotalTxs + +``` +block.Header.TotalTxs == prevBlock.Header.TotalTxs + block.Header.NumTxs +``` + +The cumulative sum of all transactions included in this blockchain. + +The first block has `block.Header.TotalTxs == block.Header.NumTxs`. + +### LastBlockID + +``` +prevBlockParts := MakeParts(prevBlock, state.LastConsensusParams.BlockGossip.BlockPartSize) +block.Header.LastBlockID == BlockID { + Hash: SimpleMerkleRoot(prevBlock.Header), + Parts: PartsHeader{ + Hash: SimpleMerkleRoot(prevBlockParts), + Total: len(prevBlockParts), + }, +} +``` + +Previous block's BlockID. Note it depends on the ConsensusParams, +which are held in the `state` and may be updated by the application. + +The first block has `block.Header.LastBlockID == BlockID{}`. + +### ResultsHash + +``` +block.Header.ResultsHash == SimpleMerkleRoot(state.LastResults) +``` + +Simple Merkle root of the results of the transactions in the previous block. 
+ +The first block has `block.Header.ResultsHash == []byte{}`. + +### AppHash + +``` +block.Header.AppHash == state.AppHash +``` + +Arbitrary byte array returned by the application after executing and committing the previous block. + +The first block has `block.Header.AppHash == []byte{}`. + +### ValidatorsHash + +``` +block.Header.ValidatorsHash == SimpleMerkleRoot(state.Validators) +``` + +Simple Merkle root of the current validator set that is committing the block. +This can be used to validate the `LastCommit` included in the next block. +May be updated by the application. + +### ConsensusParamsHash + +``` +block.Header.ConsensusParamsHash == SimpleMerkleRoot(state.ConsensusParams) +``` + +Simple Merkle root of the consensus parameters. +May be updated by the application. + +### Proposer + +``` +block.Header.Proposer in state.Validators +``` + +Original proposer of the block. Must be a current validator. + +NOTE: this field can only be further verified by real-time participants in the consensus. +This is because the same block can be proposed in multiple rounds for the same height +and we do not track the initial round in which the block was proposed. + +### EvidenceHash + +``` +block.Header.EvidenceHash == SimpleMerkleRoot(block.Evidence) +``` + +Simple Merkle root of the evidence of Byzantine behaviour included in this block. + +## Txs + +Arbitrary length array of arbitrary length byte-arrays. 
+ +## LastCommit + +The first height is an exception - it requires the LastCommit to be empty: + +``` +if block.Header.Height == 1 { + len(block.LastCommit) == 0 +} +``` + +Otherwise, we require: + +``` +len(block.LastCommit) == len(state.LastValidators) +talliedVotingPower := 0 +for i, vote := range block.LastCommit{ + if vote == nil{ + continue + } + vote.Type == 2 + vote.Height == block.LastCommit.Height() + vote.Round == block.LastCommit.Round() + vote.BlockID == block.Header.LastBlockID + + val := state.LastValidators[i] + vote.Verify(block.Header.ChainID, val.PubKey) == true + + talliedVotingPower += val.VotingPower +} + +talliedVotingPower > (2/3) * TotalVotingPower(state.LastValidators) +``` + +Includes one (possibly nil) vote for every validator in `state.LastValidators`. +Non-nil votes must be Precommits. +All votes must be for the same height and round. +All votes must be for the previous block. +All votes must have a valid signature from the corresponding validator. +The sum total of the voting power of the validators that voted +must be greater than 2/3 of the total voting power of the complete validator set. + +### Vote + +A vote is a signed message broadcast in the consensus for a particular block at a particular height and round. +When stored in the blockchain or propagated over the network, votes are encoded in TMBIN. +For signing, votes are encoded in JSON, and the ChainID is included, in the form of the `CanonicalSignBytes`. + +We define a method `Verify` that returns `true` if the signature verifies against the pubkey for the CanonicalSignBytes +using the given ChainID: + +``` +func (v Vote) Verify(chainID string, pubKey PubKey) bool { + return pubKey.Verify(v.Signature, CanonicalSignBytes(chainID, v)) +} +``` + +where `pubKey.Verify` performs the appropriate digital signature verification of the `pubKey` +against the given signature and message bytes. 
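
The LastCommit rules and the `Verify` method above can be sketched as runnable Go using the standard library's Ed25519 implementation. This is only an illustration under stated assumptions: the `validator` and `vote` types are simplified stand-ins for the spec's types, `SignBytes` is a placeholder for the output of `CanonicalSignBytes`, and `tallyCommit` and `demoCommit` are hypothetical helpers. Height, round, and BlockID checks are omitted for brevity.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// validator and vote are simplified stand-ins for the spec's types
// (assumptions for illustration only).
type validator struct {
	PubKey      ed25519.PublicKey
	VotingPower int64
}

type vote struct {
	Type      int8   // 2 == precommit
	SignBytes []byte // placeholder for CanonicalSignBytes(chainID, vote)
	Signature []byte
}

// tallyCommit sums the voting power of validators whose precommit carries a
// valid signature. Nil votes are skipped. ok is false if the lengths differ
// or any non-nil vote is not a correctly signed precommit.
func tallyCommit(commit []*vote, vals []validator) (tallied int64, ok bool) {
	if len(commit) != len(vals) {
		return 0, false
	}
	for i, v := range commit {
		if v == nil {
			continue
		}
		if v.Type != 2 || !ed25519.Verify(vals[i].PubKey, v.SignBytes, v.Signature) {
			return 0, false
		}
		tallied += vals[i].VotingPower
	}
	return tallied, true
}

// demoCommit builds one validator per entry in powers, each with a fresh key,
// and a signed precommit for every validator not marked absent.
func demoCommit(powers []int64, absent map[int]bool) ([]*vote, []validator) {
	msg := []byte("placeholder sign bytes")
	commit := make([]*vote, len(powers))
	vals := make([]validator, len(powers))
	for i, p := range powers {
		pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
		if err != nil {
			panic(err)
		}
		vals[i] = validator{PubKey: pub, VotingPower: p}
		if !absent[i] {
			commit[i] = &vote{Type: 2, SignBytes: msg, Signature: ed25519.Sign(priv, msg)}
		}
	}
	return commit, vals
}

func main() {
	commit, vals := demoCommit([]int64{10, 10, 10}, map[int]bool{2: true})
	tallied, ok := tallyCommit(commit, vals)
	fmt.Println(tallied, ok) // prints "20 true"
}
```

A caller would then compare the tallied power against the total voting power of `state.LastValidators`. Note that in real Go the expression `(2/3) * total` evaluates to zero under integer arithmetic, so an implementation would need to express the threshold differently, for example by cross-multiplying.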
+ +## Evidence + + +``` + + +``` + +Every piece of evidence contains two conflicting votes from a single validator that +was active at the height indicated in the votes. +The votes must not be too old. + + +# Execution + +Once a block is validated, it can be executed against the state. + +The state follows the recursive equation: + +``` +app = NewABCIApp +state(1) = InitialState +state(h+1) <- Execute(state(h), app, block(h)) +``` diff --git a/docs/specification/new-spec/encoding.md b/docs/specification/new-spec/encoding.md new file mode 100644 index 000000000..a7482e6cb --- /dev/null +++ b/docs/specification/new-spec/encoding.md @@ -0,0 +1,178 @@ +# Tendermint Encoding + +## Binary Serialization (TMBIN) + +Tendermint aims to encode data structures in a manner similar to how the corresponding Go structs are laid out in memory. +Variable length items are length-prefixed. +While the encoding was inspired by Go, it is easily implemented in other languages as well given its intuitive design. + +### Fixed Length Integers + +Fixed length integers are encoded in Big-Endian using the specified number of bytes. +So `uint8` and `int8` use one byte, `uint16` and `int16` use two bytes, +`uint32` and `int32` use 4 bytes, and `uint64` and `int64` use 8 bytes. + +Negative integers are encoded via twos-complement. + +Examples: + +``` +encode(uint8(6)) == [0x06] +encode(uint32(6)) == [0x00, 0x00, 0x00, 0x06] + +encode(int8(-6)) == [0xFA] +encode(int32(-6)) == [0xFF, 0xFF, 0xFF, 0xFA] +``` + +### Variable Length Integers + +Variable length integers are encoded as length-prefixed Big-Endian integers. +The length-prefix consists of a single byte and corresponds to the length of the encoded integer. + +Negative integers are encoded as their magnitude, with the leading bits of the length-prefix set to `1`. +In the examples below the high four bits are set, so a one-byte negative integer has the prefix `0xF1`. + +Zero is encoded as `0x00`. It is not length-prefixed. 
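
The variable length integer rules above can be sketched in Go as follows. This is an illustrative sketch, not the go-wire implementation: the function name `encodeVarint` is hypothetical, and the choice to mark negative values by setting the high four bits of the length-prefix (`0xF0`) is inferred from the `0xF1` and `0xF3` prefixes in this section's examples.

```go
package main

import "fmt"

// encodeVarint is a hypothetical helper that follows the rules in this
// section: zero is the single byte 0x00; otherwise a one-byte length-prefix
// is followed by the big-endian magnitude. For negative values the prefix
// gets its high nibble set to 0xF (an assumption inferred from the examples).
func encodeVarint(i int64) []byte {
	if i == 0 {
		return []byte{0x00}
	}
	negative := i < 0
	mag := uint64(i)
	if negative {
		mag = -mag // two's-complement negation yields the magnitude
	}
	var body []byte
	for m := mag; m > 0; m >>= 8 {
		// Prepend each byte so the most significant byte ends up first.
		body = append([]byte{byte(m)}, body...)
	}
	prefix := byte(len(body))
	if negative {
		prefix |= 0xF0
	}
	return append([]byte{prefix}, body...)
}

func main() {
	for _, v := range []int64{0, 6, 70000, -6, -70000} {
		fmt.Printf("%d -> % X\n", v, encodeVarint(v))
	}
}
```

The sketch reproduces the byte sequences listed in this section's examples for 0, 6, 70000, -6 and -70000.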
+ + +Examples: + +``` +encode(uint(6)) == [0x01, 0x06] +encode(uint(70000)) == [0x03, 0x01, 0x11, 0x70] + +encode(int(-6)) == [0xF1, 0x06] +encode(int(-70000)) == [0xF3, 0x01, 0x11, 0x70] + +encode(int(0)) == [0x00] +``` + +### Strings + +An encoded string is a length prefix followed by the underlying bytes of the string. +The length-prefix is itself encoded as an `int`. + +The empty string is encoded as `0x00`. It is not length-prefixed. + +Examples: + +``` +encode("") == [0x00] +encode("a") == [0x01, 0x01, 0x61] +encode("hello") == [0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F] +encode("¥") == [0x01, 0x02, 0xC2, 0xA5] +``` + +### Arrays (fixed length) + +An encoded fixed-length array is the concatenation of the encoding of its elements. +There is no length-prefix. + +Examples: + +``` +encode([4]int8{1, 2, 3, 4}) == [0x01, 0x02, 0x03, 0x04] +encode([4]int16{1, 2, 3, 4}) == [0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] +encode([4]int{1, 2, 3, 4}) == [0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04] +encode([2]string{"abc", "efg"}) == [0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] +``` + +### Slices (variable length) + +An encoded variable-length array is a length prefix followed by the concatenation of the encoding of its elements. +The length-prefix is itself encoded as an `int`. + +An empty slice is encoded as `0x00`. It is not length-prefixed. + +Examples: + +``` +encode([]int8{}) == [0x00] +encode([]int8{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x02, 0x03, 0x04] +encode([]int16{1, 2, 3, 4}) == [0x01, 0x04, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] +encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04] +encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] +``` + +### Time + +Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970, +rounded to the nearest millisecond. + +Times before then are invalid. 
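
The time encoding described above can be sketched in Go. This is a minimal illustration with hypothetical helper names (`encodeUnixNano`, `encodeTime`). It assumes the millisecond resolution is obtained by truncating the nanosecond count, which mirrors the arithmetic in the `wire.go` helper added in this change. The text says "rounded", so the exact rounding mode is an assumption here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"time"
)

const nsPerMs int64 = 1000000

// encodeUnixNano encodes a nanosecond count since January 1, 1970 as a
// big-endian int64 with millisecond resolution. Negative counts (times
// before 1970) are rejected. Truncation to the millisecond is an assumption.
func encodeUnixNano(ns int64) ([]byte, error) {
	if ns < 0 {
		return nil, errors.New("times before January 1, 1970 are invalid")
	}
	ns = ns / nsPerMs * nsPerMs // drop sub-millisecond precision
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(ns))
	return buf, nil
}

// encodeTime is a convenience wrapper for time.Time values.
func encodeTime(t time.Time) ([]byte, error) {
	return encodeUnixNano(t.UnixNano())
}

func main() {
	b, err := encodeTime(time.Unix(1, 0))
	if err != nil {
		panic(err)
	}
	fmt.Printf("% X\n", b) // prints "00 00 00 00 3B 9A CA 00"
}
```

One second after the epoch is 1,000,000,000 ns, which is `0x3B9ACA00`, matching the second example in this section.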
+ +Examples: + +``` +encode(time.Time("Jan 1 00:00:00 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00] +encode(time.Time("Jan 1 00:00:01 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x3B, 0x9A, 0xCA, 0x00] // 1,000,000,000 ns +encode(time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")) == [0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] +``` + +### Structs + +An encoded struct is the concatenation of the encoding of its elements. +There is no length-prefix. + +Examples: + +``` +type MyStruct struct{ + A int + B string + C time.Time +} +encode(MyStruct{4, "hello", time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")}) == + [0x01, 0x04, 0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] +``` + + +## Merkle Trees + +Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure. + +RIPEMD160 is always used as the hashing function. + +The function `SimpleMerkleRoot` is a simple recursive function defined as follows: + +``` +func SimpleMerkleRoot(hashes [][]byte) []byte{ + switch len(hashes) { + case 0: + return nil + case 1: + return hashes[0] + default: + left := SimpleMerkleRoot(hashes[:(len(hashes)+1)/2]) + right := SimpleMerkleRoot(hashes[(len(hashes)+1)/2:]) + return RIPEMD160(append(left, right...)) + } +} +``` + +Note we abuse notation and call `SimpleMerkleRoot` with arguments of type `struct` or type `[]struct`. +For `struct` arguments, we compute a `[][]byte` by sorting elements of the `struct` according to field name and then hashing them. +For `[]struct` arguments, we compute a `[][]byte` by hashing the individual `struct` elements. + +## JSON (TMJSON) + +Signed messages (e.g. votes, proposals) in the consensus are encoded in TMJSON, rather than TMBIN. +TMJSON is JSON where `[]byte` values are encoded as uppercase hex, rather than base64. 
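
The uppercase-hex convention can be sketched in Go with a custom JSON marshaller. This is an illustration only: the `HexBytes` type and the `marshalHex` helper are hypothetical names, and the sketch assumes the hex digits are emitted as a quoted JSON string.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// HexBytes is a hypothetical byte-slice type whose JSON form is an
// uppercase hex string instead of encoding/json's default base64.
type HexBytes []byte

// MarshalJSON emits the bytes as a quoted uppercase hex string.
func (b HexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(strings.ToUpper(hex.EncodeToString(b)))
}

// UnmarshalJSON accepts a quoted hex string in either case.
func (b *HexBytes) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	raw, err := hex.DecodeString(s)
	if err != nil {
		return err
	}
	*b = raw
	return nil
}

// marshalHex returns the JSON text for a byte slice, for demonstration.
func marshalHex(b []byte) string {
	out, err := json.Marshal(HexBytes(b))
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(marshalHex([]byte{0xDE, 0xAD, 0xBE, 0xEF})) // prints "DEADBEEF" (with quotes)
}
```

Using a dedicated type keeps the hex convention local to the fields that need it, while other fields continue to use the default JSON encoding.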
+ +When signing, the elements of a message are sorted by key and the sorted message is embedded in an outer JSON that includes a `chain_id` field. +We call this encoding the CanonicalSignBytes. For instance, CanonicalSignBytes for a vote would look like: + +``` +{"chain_id":"my-chain-id","vote":{"block_id":{"hash":"DEADBEEF","parts":{"hash":"BEEFDEAD","total":3}},"height":3,"round":2,"timestamp":1234567890,"type":2}} +``` + +Note how the fields within each level are sorted. + +## Other + +### MakeParts + +TMBIN encode an object and slice it into parts. + +``` +MakeParts(object, partSize) +``` diff --git a/docs/specification/new-spec/spec-notes.md b/docs/specification/new-spec/spec-notes.md new file mode 100644 index 000000000..fbffac741 --- /dev/null +++ b/docs/specification/new-spec/spec-notes.md @@ -0,0 +1,3 @@ +- Remove BlockID from Commit +- Actually validate the ValidatorsHash +- Move blockHeight=1 exception for LastCommit to ValidateBasic diff --git a/docs/specification/new-spec/state.md b/docs/specification/new-spec/state.md new file mode 100644 index 000000000..1d7900273 --- /dev/null +++ b/docs/specification/new-spec/state.md @@ -0,0 +1,104 @@ +# Tendermint State + +## State + +The state contains information whose cryptographic digest is included in block headers, +and thus is necessary for validating new blocks. +For instance, the Merkle root of the results from executing the previous block, or the Merkle root of the current validators. +While neither the results of transactions nor the validators are ever included in the blockchain itself, +the Merkle roots are, and hence we need a separate data structure to track them. 
+ +``` +type State struct { + LastResults []Result + AppHash []byte + + Validators []Validator + LastValidators []Validator + + ConsensusParams ConsensusParams +} +``` + +### Result + +``` +type Result struct { + Code uint32 + Data []byte + Tags []KVPair +} + +type KVPair struct { + Key []byte + Value []byte +} +``` + +`Result` is the result of executing a transaction against the application. +It returns a result code, an arbitrary byte array (i.e. a return value), +and a list of key-value pairs ordered by key. The key-value pairs, or tags, +can be used to index transactions according to their "effects", which are +represented in the tags. + +### Validator + +A validator is an active participant in the consensus with a public key and a voting power. +Validators also contain an address which is derived from the PubKey: + +``` +type Validator struct { + Address []byte + PubKey PubKey + VotingPower int64 +} +``` + +The `state.Validators` and `state.LastValidators` must always be sorted by validator address, +so that there is a canonical order for computing the SimpleMerkleRoot. + +We also define a `TotalVotingPower` function, to return the total voting power: + +``` +func TotalVotingPower(vals []Validator) int64{ + sum := int64(0) + for _, v := range vals{ + sum += v.VotingPower + } + return sum +} +``` + +### PubKey + +TODO: + +### ConsensusParams + +TODO: + +## Execution + +We define an `Execute` function that takes a state and a block, +executes the block against the application, and returns an updated state. 
+ +``` +func Execute(s State, app ABCIApp, block Block) State { + abciResponses := app.ApplyBlock(block) + + return State{ + LastResults: abciResponses.DeliverTxResults, + AppHash: abciResponses.AppHash, + Validators: UpdateValidators(s.Validators, abciResponses.ValidatorChanges), + LastValidators: s.Validators, + ConsensusParams: UpdateConsensusParams(s.ConsensusParams, abciResponses.ConsensusParamChanges), + } +} + +type ABCIResponses struct { + DeliverTxResults []Result + ValidatorChanges []Validator + ConsensusParamChanges ConsensusParams + AppHash []byte +} +``` diff --git a/docs/specification/new-spec/wire.go b/docs/specification/new-spec/wire.go new file mode 100644 index 000000000..af76f3669 --- /dev/null +++ b/docs/specification/new-spec/wire.go @@ -0,0 +1,80 @@ +package main + +import ( + "fmt" + "time" + + wire "github.com/tendermint/go-wire" +) + +func main() { + + encode(uint8(6)) + encode(uint32(6)) + encode(int8(-6)) + encode(int32(-6)) + Break() + encode(uint(6)) + encode(uint(70000)) + encode(int(0)) + encode(int(-6)) + encode(int(-70000)) + Break() + encode("") + encode("a") + encode("hello") + encode("¥") + Break() + encode([4]int8{1, 2, 3, 4}) + encode([4]int16{1, 2, 3, 4}) + encode([4]int{1, 2, 3, 4}) + encode([2]string{"abc", "efg"}) + Break() + encode([]int8{}) + encode([]int8{1, 2, 3, 4}) + encode([]int16{1, 2, 3, 4}) + encode([]int{1, 2, 3, 4}) + encode([]string{"abc", "efg"}) + Break() + + timeFmt := "Mon Jan 2 15:04:05 -0700 MST 2006" + t1, _ := time.Parse(timeFmt, timeFmt) + n := (t1.UnixNano() / 1000000.) 
* 1000000 + encode(n) + encode(t1) + + t2, _ := time.Parse(timeFmt, "Thu Jan 1 00:00:00 -0000 UTC 1970") + encode(t2) + + t2, _ = time.Parse(timeFmt, "Thu Jan 1 00:00:01 -0000 UTC 1970") + fmt.Println("N", t2.UnixNano()) + encode(t2) + Break() + encode(struct { + A int + B string + C time.Time + }{ + 4, + "hello", + t1, + }) +} + +func encode(i interface{}) { + Println(wire.BinaryBytes(i)) + +} + +func Println(b []byte) { + s := "[" + for _, x := range b { + s += fmt.Sprintf("0x%.2X, ", x) + } + s = s[:len(s)-2] + "]" + fmt.Println(s) +} + +func Break() { + fmt.Println("------") +}