
Merge pull request #1009 from tendermint/spec

Spec
pull/1024/head
Ethan Buchman 7 years ago
committed by GitHub
parent
commit
1a0db878bf
No known key found for this signature in database GPG Key ID: 4AEE18F83AFDEB23
6 changed files with 813 additions and 0 deletions
  1. +57
    -0
      docs/specification/new-spec/README.md
  2. +391
    -0
      docs/specification/new-spec/blockchain.md
  3. +178
    -0
      docs/specification/new-spec/encoding.md
  4. +3
    -0
      docs/specification/new-spec/spec-notes.md
  5. +104
    -0
      docs/specification/new-spec/state.md
  6. +80
    -0
      docs/specification/new-spec/wire.go

+ 57
- 0
docs/specification/new-spec/README.md

@ -0,0 +1,57 @@
# Tendermint Specification
This is a markdown specification of the Tendermint blockchain.
It defines the base data structures used in the blockchain and how they are validated.
It contains the following components:
- [Encoding and Digests](encoding.md)
- [Blockchain](blockchain.md)
- [State](state.md)
## Overview
Tendermint provides Byzantine Fault Tolerant State Machine Replication using
hash-linked batches of transactions. Such transaction batches are called "blocks".
Hence Tendermint defines a "blockchain".
Each block in Tendermint has a unique index - its Height.
A block at `Height == H` can only be committed *after* the
block at `Height == H-1`.
Each block is committed by a known set of weighted Validators.
Membership and weighting within this set may change over time.
Tendermint guarantees the safety and liveness of the blockchain
so long as less than 1/3 of the total weight of the Validator set
is malicious.
A commit in Tendermint is a set of signed messages from more than 2/3 of
the total weight of the current Validator set. Validators take turns proposing
blocks and voting on them. Once enough votes are received, the block is considered
committed. These votes are included in the *next* block as proof that the previous block
was committed - they cannot be included in the current block, as that block has already been
created.
Once a block is committed, it can be executed against an application.
The application returns results for each of the transactions in the block.
The application can also return changes to be made to the validator set,
as well as a cryptographic digest of its latest state.
Tendermint is designed to enable efficient verification and authentication
of the latest state of the blockchain. To achieve this, it embeds
cryptographic commitments to certain information in the block "header".
This information includes the contents of the block (eg. the transactions),
the validator set committing the block, as well as the various results returned by the application.
Note, however, that block execution only occurs *after* a block is committed.
Thus, application results can only be included in the *next* block.
Also note that information like the transaction results and the validator set are never
directly included in the block - only their cryptographic digests (Merkle roots) are.
Hence, verification of a block requires a separate data structure to store this information.
We call this the `State`. Block verification also requires access to the previous block.
## TODO
- Light Client
- P2P
- Reactor protocols (consensus, mempool, blockchain, pex)

+ 391
- 0
docs/specification/new-spec/blockchain.md

@ -0,0 +1,391 @@
# Tendermint Blockchain
Here we describe the data structures in the Tendermint blockchain and the rules for validating them.
# Data Structures
The Tendermint blockchain consists of a short list of basic data types:
`Block`, `Header`, `Vote`, `BlockID`, `Signature`, and `Evidence`.
## Block
A block consists of a header, a list of transactions, a list of votes (the commit),
and a list of evidence of malfeasance (ie. signing conflicting votes).
```
type Block struct {
Header Header
Txs [][]byte
LastCommit []Vote
Evidence []Evidence
}
```
## Header
A block header contains metadata about the block and about the consensus, as well as commitments to
the data in the current block, the previous block, and the results returned by the application:
```
type Header struct {
// block metadata
Version string // Version string
ChainID string // ID of the chain
Height int64 // current block height
Time int64 // UNIX time in nanoseconds, with millisecond resolution
// current block
NumTxs int64 // Number of txs in this block
TxHash []byte // SimpleMerkle of the block.Txs
LastCommitHash []byte // SimpleMerkle of the block.LastCommit
// previous block
TotalTxs int64 // prevBlock.TotalTxs + block.NumTxs
LastBlockID BlockID // BlockID of prevBlock
// application
ResultsHash []byte // SimpleMerkle of []abci.Result from prevBlock
AppHash []byte // Arbitrary state digest
ValidatorsHash []byte // SimpleMerkle of the ValidatorSet
ConsensusParamsHash []byte // SimpleMerkle of the ConsensusParams
// consensus
Proposer []byte // Address of the block proposer
EvidenceHash []byte // SimpleMerkle of []Evidence
}
```
Further details on each of these fields are given below.
## BlockID
The `BlockID` contains two distinct Merkle roots of the block.
The first, used as the block's main hash, is the Merkle root
of all the fields in the header. The second, used for secure gossiping of
the block during consensus, is the Merkle root of the complete serialized block
cut into parts. The `BlockID` includes these two hashes, as well as the number of
parts.
```
type BlockID struct {
Hash []byte
Parts PartsHeader
}
type PartsHeader struct {
Hash []byte
Total int32
}
```
## Vote
A vote is a signed message from a validator for a particular block.
The vote includes information about the validator signing it.
```
type Vote struct {
Timestamp int64
Address []byte
Index int
Height int64
Round int
Type int8
BlockID BlockID
Signature Signature
}
```
There are two types of votes:
a prevote has `vote.Type == 1` and
a precommit has `vote.Type == 2`.
## Signature
Tendermint allows for multiple signature schemes to be used by prepending a single type-byte
to the signature bytes. Different signatures may also come with fixed or variable lengths.
Currently, Tendermint supports Ed25519 and Secp256k1.
### ED25519
An ED25519 signature has `Type == 0x1`. It looks like:
```
// Implements Signature
type Ed25519Signature struct {
Type int8 = 0x1
Signature [64]byte
}
```
where `Signature` is the 64 byte signature.
### Secp256k1
A `Secp256k1` signature has `Type == 0x2`. It looks like:
```
// Implements Signature
type Secp256k1Signature struct {
Type int8 = 0x2
Signature []byte
}
```
where `Signature` is the DER encoded signature, ie:
```
0x30 <length of whole message> 0x02 <length of R> <R> 0x02 <length of S> <S>
```
## Evidence
TODO
# Validation
Here we describe the validation rules for every element in a block.
Blocks which do not satisfy these rules are considered invalid.
We abuse notation by using something that looks like Go, supplemented with English.
A statement such as `x == y` is an assertion - if it fails, the item is invalid.
We refer to certain globally available objects:
`block` is the block under consideration,
`prevBlock` is the `block` at the previous height,
and `state` keeps track of the validator set, the consensus parameters
and other results from the application.
Elements of an object are accessed as expected,
ie. `block.Header`. See [here](state.md) for the definition of `state`.
## Header
A Header is valid if its corresponding fields are valid.
### Version
Arbitrary string.
### ChainID
Arbitrary constant string.
### Height
```
block.Header.Height > 0
block.Header.Height == prevBlock.Header.Height + 1
```
The height is an incrementing integer. The first block has `block.Header.Height == 1`.
### Time
The median of the timestamps of the valid votes in the block.LastCommit.
Corresponds to the number of nanoseconds, with millisecond resolution, since January 1, 1970.
Note the timestamp in a vote must be greater by at least one millisecond than that of the
block being voted on.
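As a rough illustration of the median rule, the sketch below sorts the vote timestamps and picks the middle one. The function name `MedianTime` is hypothetical, and the choice of the upper middle value for an even number of votes is an assumption, since the text above does not say how ties are resolved.

```go
package main

import (
	"fmt"
	"sort"
)

// MedianTime returns the median of the given vote timestamps
// (nanoseconds since January 1, 1970). The boolean is false when
// there are no timestamps to take a median of.
// Assumption: for an even count the upper middle value is used.
func MedianTime(timestamps []int64) (int64, bool) {
	if len(timestamps) == 0 {
		return 0, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]int64(nil), timestamps...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	return sorted[len(sorted)/2], true
}

func main() {
	median, ok := MedianTime([]int64{3000000, 1000000, 2000000})
	fmt.Println(median, ok) // 2000000 true
}
```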
### NumTxs
```
block.Header.NumTxs == len(block.Txs)
```
Number of transactions included in the block.
### TxHash
```
block.Header.TxHash == SimpleMerkleRoot(block.Txs)
```
Simple Merkle root of the transactions in the block.
### LastCommitHash
```
block.Header.LastCommitHash == SimpleMerkleRoot(block.LastCommit)
```
Simple Merkle root of the votes included in the block.
These are the votes that committed the previous block.
The first block has `block.Header.LastCommitHash == []byte{}`
### TotalTxs
```
block.Header.TotalTxs == prevBlock.Header.TotalTxs + block.Header.NumTxs
```
The cumulative sum of all transactions included in this blockchain.
The first block has `block.Header.TotalTxs == block.Header.NumTxs`.
### LastBlockID
```
prevBlockParts := MakeParts(prevBlock, state.LastConsensusParams.BlockGossip.BlockPartSize)
block.Header.LastBlockID == BlockID {
Hash: SimpleMerkleRoot(prevBlock.Header),
Parts: PartsHeader{
Hash: SimpleMerkleRoot(prevBlockParts),
Total: len(prevBlockParts),
},
}
```
Previous block's BlockID. Note it depends on the ConsensusParams,
which are held in the `state` and may be updated by the application.
The first block has `block.Header.LastBlockID == BlockID{}`.
### ResultsHash
```
block.Header.ResultsHash == SimpleMerkleRoot(state.LastResults)
```
Simple Merkle root of the results of the transactions in the previous block.
The first block has `block.Header.ResultsHash == []byte{}`.
### AppHash
```
block.Header.AppHash == state.AppHash
```
Arbitrary byte array returned by the application after executing and committing the previous block.
The first block has `block.Header.AppHash == []byte{}`.
### ValidatorsHash
```
block.Header.ValidatorsHash == SimpleMerkleRoot(state.Validators)
```
Simple Merkle root of the current validator set that is committing the block.
This can be used to validate the `LastCommit` included in the next block.
May be updated by the application.
### ConsensusParamsHash
```
block.Header.ConsensusParamsHash == SimpleMerkleRoot(state.ConsensusParams)
```
Simple Merkle root of the consensus parameters.
May be updated by the application.
### Proposer
```
block.Header.Proposer in state.Validators
```
Original proposer of the block. Must be a current validator.
NOTE: this field can only be further verified by real-time participants in the consensus.
This is because the same block can be proposed in multiple rounds for the same height
and we do not track the initial round the block was proposed.
### EvidenceHash
```
block.Header.EvidenceHash == SimpleMerkleRoot(block.Evidence)
```
Simple Merkle root of the evidence of Byzantine behaviour included in this block.
## Txs
Arbitrary length array of arbitrary length byte-arrays.
## LastCommit
The first height is an exception - it requires the LastCommit to be empty:
```
if block.Header.Height == 1 {
len(block.LastCommit) == 0
}
```
Otherwise, we require:
```
len(block.LastCommit) == len(state.LastValidators)
talliedVotingPower := 0
for i, vote := range block.LastCommit{
if vote == nil{
continue
}
vote.Type == 2
vote.Height == block.LastCommit.Height()
vote.Round == block.LastCommit.Round()
vote.BlockID == block.Header.LastBlockID
val := state.LastValidators[i]
vote.Verify(block.Header.ChainID, val.PubKey) == true
talliedVotingPower += val.VotingPower
}
talliedVotingPower > (2/3) * TotalVotingPower(state.LastValidators)
```
Includes one (possibly nil) vote for every current validator.
Non-nil votes must be Precommits.
All votes must be for the same height and round.
All votes must be for the previous block.
All votes must have a valid signature from the corresponding validator.
The sum total of the voting power of the validators that voted
must be greater than 2/3 of the total voting power of the complete validator set.
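Note that the `(2/3)` in the notation above is meant mathematically; in Go integer arithmetic `2/3` evaluates to `0`. A minimal sketch of the threshold check, using a hypothetical helper `HasTwoThirdsMajority` and assuming voting powers small enough that tripling them does not overflow an `int64`:

```go
package main

import "fmt"

// HasTwoThirdsMajority reports whether tallied is strictly greater than
// 2/3 of total. Cross-multiplying keeps the comparison in integers.
// Assumption: 3*tallied and 2*total fit in an int64.
func HasTwoThirdsMajority(tallied, total int64) bool {
	return tallied*3 > total*2
}

func main() {
	fmt.Println(HasTwoThirdsMajority(66, 100)) // false
	fmt.Println(HasTwoThirdsMajority(67, 100)) // true
	fmt.Println(HasTwoThirdsMajority(2, 3))    // false: exactly 2/3 is not enough
}
```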
### Vote
A vote is a signed message broadcast in the consensus for a particular block at a particular height and round.
When stored in the blockchain or propagated over the network, votes are encoded in TMBIN.
For signing, votes are encoded in JSON, and the ChainID is included, in the form of the `CanonicalSignBytes`.
We define a method `Verify` that returns `true` if the signature verifies against the pubkey for the CanonicalSignBytes
using the given ChainID:
```
func (v Vote) Verify(chainID string, pubKey PubKey) bool {
return pubKey.Verify(v.Signature, CanonicalSignBytes(chainID, v))
}
```
where `pubKey.Verify` performs the appropriate digital signature verification of the `pubKey`
against the given signature and message bytes.
## Evidence
```
```
Every piece of evidence contains two conflicting votes from a single validator that
was active at the height indicated in the votes.
The votes must not be too old.
# Execution
Once a block is validated, it can be executed against the state.
The state follows the recursive equation:
```
app = NewABCIApp
state(1) = InitialState
state(h+1) <- Execute(state(h), app, block(h))
```

+ 178
- 0
docs/specification/new-spec/encoding.md

@ -0,0 +1,178 @@
# Tendermint Encoding
## Binary Serialization (TMBIN)
Tendermint aims to encode data structures in a manner similar to how the corresponding Go structs are laid out in memory.
Variable length items are length-prefixed.
While the encoding was inspired by Go, it is easily implemented in other languages as well given its intuitive design.
### Fixed Length Integers
Fixed length integers are encoded in Big-Endian using the specified number of bytes.
So `uint8` and `int8` use one byte, `uint16` and `int16` use two bytes,
`uint32` and `int32` use four bytes, and `uint64` and `int64` use eight bytes.
Negative integers are encoded via twos-complement.
Examples:
```
encode(uint8(6)) == [0x06]
encode(uint32(6)) == [0x00, 0x00, 0x00, 0x06]
encode(int8(-6)) == [0xFA]
encode(int32(-6)) == [0xFF, 0xFF, 0xFF, 0xFA]
```
### Variable Length Integers
Variable length integers are encoded as length-prefixed Big-Endian integers.
The length-prefix consists of a single byte and corresponds to the length of the encoded integer.
Negative integers are encoded by setting the four leading bits of the length-prefix to `1`, so a one-byte negative integer has the prefix `0xF1`.
Zero is encoded as `0x00`. It is not length-prefixed.
Examples:
```
encode(uint(6)) == [0x01, 0x06]
encode(uint(70000)) == [0x03, 0x01, 0x11, 0x70]
encode(int(-6)) == [0xF1, 0x06]
encode(int(-70000)) == [0xF3, 0x01, 0x11, 0x70]
encode(int(0)) == [0x00]
```
### Strings
An encoded string is a length prefix followed by the underlying bytes of the string.
The length-prefix is itself encoded as an `int`.
The empty string is encoded as `0x00`. It is not length-prefixed.
Examples:
```
encode("") == [0x00]
encode("a") == [0x01, 0x01, 0x61]
encode("hello") == [0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F]
encode("¥") == [0x01, 0x02, 0xC2, 0xA5]
```
### Arrays (fixed length)
An encoded fixed-length array is the concatenation of the encoding of its elements.
There is no length-prefix.
Examples:
```
encode([4]int8{1, 2, 3, 4}) == [0x01, 0x02, 0x03, 0x04]
encode([4]int16{1, 2, 3, 4}) == [0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04]
encode([4]int{1, 2, 3, 4}) == [0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04]
encode([2]string{"abc", "efg"}) == [0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67]
```
### Slices (variable length)
An encoded variable-length array is a length prefix followed by the concatenation of the encoding of its elements.
The length-prefix is itself encoded as an `int`.
An empty slice is encoded as `0x00`. It is not length-prefixed.
Examples:
```
encode([]int8{}) == [0x00]
encode([]int8{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x02, 0x03, 0x04]
encode([]int16{1, 2, 3, 4}) == [0x01, 0x04, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04]
encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04]
encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67]
```
### Time
Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970,
rounded to the nearest millisecond.
Times before then are invalid.
Examples:
```
encode(time.Time("Jan 1 00:00:00 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
encode(time.Time("Jan 1 00:00:01 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x3B, 0x9A, 0xCA, 0x00] // 1,000,000,000 ns
encode(time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")) == [0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00]
```
### Structs
An encoded struct is the concatenation of the encoding of its elements.
There is no length-prefix.
Examples:
```
type MyStruct struct{
A int
B string
C time.Time
}
encode(MyStruct{4, "hello", time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")}) ==
[0x01, 0x04, 0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00]
```
## Merkle Trees
Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure.
RIPEMD160 is always used as the hashing function.
The function `SimpleMerkleRoot` is a simple recursive function defined as follows:
```
func SimpleMerkleRoot(hashes [][]byte) []byte{
switch len(hashes) {
case 0:
return nil
case 1:
return hashes[0]
default:
left := SimpleMerkleRoot(hashes[:(len(hashes)+1)/2])
right := SimpleMerkleRoot(hashes[(len(hashes)+1)/2:])
return RIPEMD160(append(left, right...))
}
}
```
Note we abuse notation and call `SimpleMerkleRoot` with arguments of type `struct` or type `[]struct`.
For `struct` arguments, we compute a `[][]byte` by sorting elements of the `struct` according to field name and then hashing them.
For `[]struct` arguments, we compute a `[][]byte` by hashing the individual `struct` elements.
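For experimentation, a runnable variant of the recursion is sketched below. RIPEMD160 is not part of the Go standard library, so `hashFn` substitutes SHA-256 purely to keep the example free of dependencies; the resulting roots therefore do NOT match Tendermint's. The tree shape (left subtree takes the larger half) is the part being illustrated.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashFn is a stand-in for RIPEMD160. SHA-256 is used here ONLY so the
// sketch runs with the standard library; it is not what the spec mandates.
func hashFn(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// SimpleMerkleRoot follows the recursive definition above.
func SimpleMerkleRoot(hashes [][]byte) []byte {
	switch len(hashes) {
	case 0:
		return nil
	case 1:
		return hashes[0]
	default:
		mid := (len(hashes) + 1) / 2
		left := SimpleMerkleRoot(hashes[:mid])
		right := SimpleMerkleRoot(hashes[mid:])
		// Join into a fresh buffer so the input slices are never overwritten.
		joined := append(append([]byte{}, left...), right...)
		return hashFn(joined)
	}
}

func main() {
	leaves := [][]byte{hashFn([]byte("a")), hashFn([]byte("b")), hashFn([]byte("c"))}
	// With three leaves the root is H(H(a || b) || c).
	fmt.Printf("%X\n", SimpleMerkleRoot(leaves))
}
```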
## JSON (TMJSON)
Signed messages (eg. votes, proposals) in the consensus are encoded in TMJSON, rather than TMBIN.
TMJSON is JSON where `[]byte` are encoded as uppercase hex, rather than base64.
When signing, the elements of a message are sorted by key and the sorted message is embedded in an outer JSON that includes a `chain_id` field.
We call this encoding the CanonicalSignBytes. For instance, CanonicalSignBytes for a vote would look like:
```
{"chain_id":"my-chain-id","vote":{"block_id":{"hash":"DEADBEEF","parts":{"hash":"BEEFDEAD","total":3}},"height":3,"round":2,"timestamp":1234567890,"type":2}}
```
Note how the fields within each level are sorted.
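One way to obtain this layout with Go's `encoding/json` is sketched below. The type names are hypothetical and not taken from the Tendermint code base. Two assumptions are made: hex values are emitted as quoted JSON strings, and key order is fixed by declaring struct fields in sorted order (which `encoding/json` preserves).

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// HexBytes marshals as an uppercase hex string rather than base64.
type HexBytes []byte

func (b HexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(strings.ToUpper(hex.EncodeToString(b)))
}

// Fields are declared in key-sorted order at every level.
type canonicalParts struct {
	Hash  HexBytes `json:"hash"`
	Total int      `json:"total"`
}

type canonicalBlockID struct {
	Hash  HexBytes       `json:"hash"`
	Parts canonicalParts `json:"parts"`
}

type canonicalVote struct {
	BlockID   canonicalBlockID `json:"block_id"`
	Height    int64            `json:"height"`
	Round     int              `json:"round"`
	Timestamp int64            `json:"timestamp"`
	Type      int8             `json:"type"`
}

type canonicalSignDoc struct {
	ChainID string        `json:"chain_id"`
	Vote    canonicalVote `json:"vote"`
}

// CanonicalSignBytes wraps the vote in an outer object carrying the chain ID.
func CanonicalSignBytes(chainID string, v canonicalVote) ([]byte, error) {
	return json.Marshal(canonicalSignDoc{ChainID: chainID, Vote: v})
}

func main() {
	vote := canonicalVote{
		BlockID: canonicalBlockID{
			Hash:  HexBytes{0xDE, 0xAD, 0xBE, 0xEF},
			Parts: canonicalParts{Hash: HexBytes{0xBE, 0xEF, 0xDE, 0xAD}, Total: 3},
		},
		Height: 3, Round: 2, Timestamp: 1234567890, Type: 2,
	}
	bz, err := CanonicalSignBytes("my-chain-id", vote)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(bz))
}
```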
## Other
### MakeParts
Encode an object in TMBIN and slice it into parts.
```
MakeParts(object, partSize)
```
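A minimal sketch of the slicing step, assuming the object has already been TMBIN encoded and that every part except possibly the last holds exactly `partSize` bytes. This signature (bytes in, parts out) is an illustration and differs from the notation above, which takes the object itself.

```go
package main

import "fmt"

// MakeParts cuts already-encoded bytes into consecutive chunks of at most
// partSize bytes. The parts share memory with the input slice.
// Assumption: only the last part may be shorter than partSize.
func MakeParts(encoded []byte, partSize int) [][]byte {
	if partSize <= 0 {
		return nil
	}
	var parts [][]byte
	for start := 0; start < len(encoded); start += partSize {
		end := start + partSize
		if end > len(encoded) {
			end = len(encoded)
		}
		parts = append(parts, encoded[start:end])
	}
	return parts
}

func main() {
	parts := MakeParts(make([]byte, 10), 4)
	for i, p := range parts {
		fmt.Println(i, len(p)) // part sizes: 4, 4, 2
	}
}
```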

+ 3
- 0
docs/specification/new-spec/spec-notes.md

@ -0,0 +1,3 @@
- Remove BlockID from Commit
- Actually validate the ValidatorsHash
- Move blockHeight=1 exception for LastCommit to ValidateBasic

+ 104
- 0
docs/specification/new-spec/state.md

@ -0,0 +1,104 @@
# Tendermint State
## State
The state contains information whose cryptographic digest is included in block headers,
and thus is necessary for validating new blocks.
For instance, the Merkle root of the results from executing the previous block, or the Merkle root of the current validators.
While neither the results of transactions nor the validators are ever included in the blockchain itself,
the Merkle roots are, and hence we need a separate data structure to track them.
```
type State struct {
LastResults []Result
AppHash []byte
Validators []Validator
LastValidators []Validator
ConsensusParams ConsensusParams
}
```
### Result
```
type Result struct {
Code uint32
Data []byte
Tags []KVPair
}
type KVPair struct {
Key []byte
Value []byte
}
```
`Result` is the result of executing a transaction against the application.
It returns a result code, an arbitrary byte array (ie. a return value),
and a list of key-value pairs ordered by key. The key-value pairs, or tags,
can be used to index transactions according to their "effects", which are
represented in the tags.
### Validator
A validator is an active participant in the consensus with a public key and a voting power.
Validators also contain an address which is derived from the PubKey:
```
type Validator struct {
Address []byte
PubKey PubKey
VotingPower int64
}
```
The `state.Validators` and `state.LastValidators` must always be sorted by validator address,
so that there is a canonical order for computing the SimpleMerkleRoot.
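A sketch of establishing that order, assuming "sorted by address" means ascending lexicographic byte order (the text does not spell this out). The `PubKey` field is reduced to raw bytes here because its definition is still a TODO in this spec.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Validator mirrors the struct above, with PubKey simplified to []byte
// as a placeholder for illustration.
type Validator struct {
	Address     []byte
	PubKey      []byte
	VotingPower int64
}

// SortByAddress sorts vals in place by ascending address.
// Assumption: addresses are compared lexicographically, byte by byte.
func SortByAddress(vals []Validator) {
	sort.Slice(vals, func(i, j int) bool {
		return bytes.Compare(vals[i].Address, vals[j].Address) < 0
	})
}

func main() {
	vals := []Validator{
		{Address: []byte{0x03}, VotingPower: 10},
		{Address: []byte{0x01}, VotingPower: 20},
		{Address: []byte{0x02}, VotingPower: 30},
	}
	SortByAddress(vals)
	for _, v := range vals {
		fmt.Printf("%X %d\n", v.Address, v.VotingPower)
	}
}
```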
We also define a `TotalVotingPower` function, to return the total voting power:
```
func TotalVotingPower(vals []Validator) int64 {
	var sum int64
	for _, v := range vals {
		sum += v.VotingPower
	}
	return sum
}
```
### PubKey
TODO:
### ConsensusParams
TODO:
## Execution
We define an `Execute` function that takes a state and a block,
executes the block against the application, and returns an updated state.
```
func Execute(s State, app ABCIApp, block Block) State {
	abciResponses := app.ApplyBlock(block)
	return State{
		LastResults:     abciResponses.DeliverTxResults,
		AppHash:         abciResponses.AppHash,
		Validators:      UpdateValidators(s.Validators, abciResponses.ValidatorChanges),
		LastValidators:  s.Validators,
		ConsensusParams: UpdateConsensusParams(s.ConsensusParams, abciResponses.ConsensusParamChanges),
	}
}
type ABCIResponses struct {
DeliverTxResults []Result
ValidatorChanges []Validator
ConsensusParamChanges ConsensusParams
AppHash []byte
}
```

+ 80
- 0
docs/specification/new-spec/wire.go

@ -0,0 +1,80 @@
package main
import (
"fmt"
"time"
wire "github.com/tendermint/go-wire"
)
func main() {
encode(uint8(6))
encode(uint32(6))
encode(int8(-6))
encode(int32(-6))
Break()
encode(uint(6))
encode(uint(70000))
encode(int(0))
encode(int(-6))
encode(int(-70000))
Break()
encode("")
encode("a")
encode("hello")
encode("¥")
Break()
encode([4]int8{1, 2, 3, 4})
encode([4]int16{1, 2, 3, 4})
encode([4]int{1, 2, 3, 4})
encode([2]string{"abc", "efg"})
Break()
encode([]int8{})
encode([]int8{1, 2, 3, 4})
encode([]int16{1, 2, 3, 4})
encode([]int{1, 2, 3, 4})
encode([]string{"abc", "efg"})
Break()
timeFmt := "Mon Jan 2 15:04:05 -0700 MST 2006"
t1, _ := time.Parse(timeFmt, timeFmt)
n := (t1.UnixNano() / 1000000.) * 1000000
encode(n)
encode(t1)
t2, _ := time.Parse(timeFmt, "Thu Jan 1 00:00:00 -0000 UTC 1970")
encode(t2)
t2, _ = time.Parse(timeFmt, "Thu Jan 1 00:00:01 -0000 UTC 1970")
fmt.Println("N", t2.UnixNano())
encode(t2)
Break()
encode(struct {
A int
B string
C time.Time
}{
4,
"hello",
t1,
})
}
func encode(i interface{}) {
Println(wire.BinaryBytes(i))
}
func Println(b []byte) {
s := "["
for _, x := range b {
s += fmt.Sprintf("0x%.2X, ", x)
}
s = s[:len(s)-2] + "]"
fmt.Println(s)
}
func Break() {
fmt.Println("------")
}
