docs/spec: some organizational cleanup (pull/1759/head)
@ -1,4 +1,4 @@ | |||||
# CODEOWNERS: https://help.github.com/articles/about-codeowners/
# Everything goes through Bucky and Anton. For now. | |||||
* @ebuchman @melekes | |||||
# Everything goes through Bucky, Anton, Alex. For now. | |||||
* @ebuchman @melekes @xla |
@ -1,246 +0,0 @@ | |||||
# Tendermint Encoding (Pre-Amino) | |||||
## PubKeys and Addresses | |||||
PubKeys are prefixed with a type-byte, followed by the raw bytes of the public | |||||
key. | |||||
Two keys are supported with the following type bytes: | |||||
``` | |||||
TypeByteEd25519 = 0x1 | |||||
TypeByteSecp256k1 = 0x2 | |||||
``` | |||||
``` | |||||
// TypeByte: 0x1 | |||||
type PubKeyEd25519 [32]byte | |||||
func (pub PubKeyEd25519) Encode() []byte { | |||||
return 0x1 | pub | |||||
} | |||||
func (pub PubKeyEd25519) Address() []byte { | |||||
// NOTE: the length (0x0120) is also included | |||||
return RIPEMD160(0x1 | 0x0120 | pub) | |||||
} | |||||
// TypeByte: 0x2 | |||||
// NOTE: OpenSSL compressed pubkey (x-coordinate prefixed with 0x2 or 0x3)
type PubKeySecp256k1 [33]byte | |||||
func (pub PubKeySecp256k1) Encode() []byte { | |||||
return 0x2 | pub | |||||
} | |||||
func (pub PubKeySecp256k1) Address() []byte { | |||||
return RIPEMD160(SHA256(pub)) | |||||
} | |||||
``` | |||||
See https://github.com/tendermint/go-crypto/blob/v0.5.0/pub_key.go for more. | |||||
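The type-byte prefixing described above can be sketched in runnable Go as follows. This is only an illustration of the `Encode` step: the names `encodePubKey` and `encodedHex` are hypothetical, the key used in `main` is an all-zero placeholder, and address derivation is left out because RIPEMD160 is not part of the Go standard library.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Type bytes as listed in the spec text above.
const (
	typeByteEd25519   byte = 0x1
	typeByteSecp256k1 byte = 0x2
)

// encodePubKey prepends the type byte to the raw public key bytes.
// Hypothetical helper name, used for illustration only.
func encodePubKey(typeByte byte, raw []byte) []byte {
	out := make([]byte, 0, 1+len(raw))
	out = append(out, typeByte)
	return append(out, raw...)
}

// encodedHex returns the encoding as lowercase hex for printing and comparison.
func encodedHex(typeByte byte, raw []byte) string {
	return hex.EncodeToString(encodePubKey(typeByte, raw))
}

func main() {
	raw := make([]byte, 32) // placeholder all-zero bytes, not a real key
	fmt.Println(encodedHex(typeByteEd25519, raw))
}
```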
## Binary Serialization (go-wire) | |||||
Tendermint aims to encode data structures in a manner similar to how the corresponding Go structs | |||||
are laid out in memory. | |||||
Variable length items are length-prefixed. | |||||
While the encoding was inspired by Go, it is easily implemented in other languages as well, given its intuitive design. | |||||
XXX: This is changing to use real varints and 4-byte-prefixes. | |||||
See https://github.com/tendermint/go-wire/tree/sdk2. | |||||
### Fixed Length Integers | |||||
Fixed length integers are encoded in Big-Endian using the specified number of bytes. | |||||
So `uint8` and `int8` use one byte, `uint16` and `int16` use two bytes,
`uint32` and `int32` use four bytes, and `uint64` and `int64` use eight bytes.
Negative integers are encoded via two's complement.
Examples: | |||||
```go | |||||
encode(uint8(6)) == [0x06] | |||||
encode(uint32(6)) == [0x00, 0x00, 0x00, 0x06] | |||||
encode(int8(-6)) == [0xFA] | |||||
encode(int32(-6)) == [0xFF, 0xFF, 0xFF, 0xFA] | |||||
``` | |||||
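As a rough sketch, the fixed-length examples above can be reproduced with the standard `encoding/binary` package. The function names here are hypothetical and are not the go-wire API.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// encodeUint32 writes v as four big-endian bytes.
func encodeUint32(v uint32) []byte {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, v)
	return buf
}

// encodeInt32 reuses the unsigned encoder; converting int32 to uint32
// keeps the two's complement bit pattern.
func encodeInt32(v int32) []byte { return encodeUint32(uint32(v)) }

// encodeInt8 is a single byte holding the two's complement bit pattern.
func encodeInt8(v int8) []byte { return []byte{byte(v)} }

// hexOf renders bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	fmt.Println(hexOf(encodeUint32(6)))
	fmt.Println(hexOf(encodeInt8(-6)))
	fmt.Println(hexOf(encodeInt32(-6)))
}
```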
### Variable Length Integers | |||||
Variable length integers are encoded as length-prefixed Big-Endian integers. | |||||
The length-prefix consists of a single byte and corresponds to the length of the encoded integer. | |||||
Negative integers are encoded by setting the high four bits of the length-prefix, so a length of `0x01` becomes `0xF1`.
Zero is encoded as `0x00`. It is not length-prefixed. | |||||
Examples: | |||||
```go | |||||
encode(uint(6)) == [0x01, 0x06] | |||||
encode(uint(70000)) == [0x03, 0x01, 0x11, 0x70] | |||||
encode(int(-6)) == [0xF1, 0x06] | |||||
encode(int(-70000)) == [0xF3, 0x01, 0x11, 0x70] | |||||
encode(int(0)) == [0x00] | |||||
``` | |||||
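A minimal sketch of this variable-length scheme, written to match the examples above rather than copied from go-wire. It assumes the negative marker is the `0xF0` mask visible in the `0xF1` and `0xF3` examples; `encodeVarint` is a hypothetical name.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// encodeVarint returns a one-byte length prefix followed by the minimal
// big-endian bytes of the magnitude. Zero is the single byte 0x00.
func encodeVarint(i int64) []byte {
	if i == 0 {
		return []byte{0x00}
	}
	negative := i < 0
	u := uint64(i)
	if negative {
		u = uint64(-i)
	}
	// Collect the magnitude bytes, most significant byte first.
	var body []byte
	for ; u > 0; u >>= 8 {
		body = append([]byte{byte(u)}, body...)
	}
	prefix := byte(len(body))
	if negative {
		prefix |= 0xF0 // assumption taken from the 0xF1 / 0xF3 examples
	}
	return append([]byte{prefix}, body...)
}

// hexOf renders bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	for _, v := range []int64{6, 70000, -6, -70000, 0} {
		fmt.Println(v, hexOf(encodeVarint(v)))
	}
}
```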
### Strings | |||||
An encoded string is length-prefixed followed by the underlying bytes of the string. | |||||
The length-prefix is itself encoded as an `int`. | |||||
The empty string is encoded as `0x00`. It is not length-prefixed. | |||||
Examples: | |||||
```go | |||||
encode("") == [0x00] | |||||
encode("a") == [0x01, 0x01, 0x61] | |||||
encode("hello") == [0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F] | |||||
encode("¥") == [0x01, 0x02, 0xC2, 0xA5] | |||||
``` | |||||
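The string examples can be reproduced with a sketch like the one below. `encodeLength` is a hypothetical helper that applies the variable-length integer rule to a non-negative length; neither function name comes from go-wire.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// encodeLength encodes a non-negative length as a one-byte size prefix
// followed by minimal big-endian bytes. Zero is the single byte 0x00.
func encodeLength(n int) []byte {
	if n == 0 {
		return []byte{0x00}
	}
	var body []byte
	for u := uint64(n); u > 0; u >>= 8 {
		body = append([]byte{byte(u)}, body...)
	}
	return append([]byte{byte(len(body))}, body...)
}

// encodeString is the encoded byte length followed by the raw UTF-8 bytes.
func encodeString(s string) []byte {
	return append(encodeLength(len(s)), s...)
}

// hexOf renders bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	for _, s := range []string{"", "a", "hello", "\u00a5"} {
		fmt.Printf("%q -> %s\n", s, hexOf(encodeString(s)))
	}
}
```

Note that the length counts bytes, not characters, which is why the two-byte UTF-8 encoding of `¥` gets a length of 2.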
### Arrays (fixed length) | |||||
An encoded fixed-length array is the concatenation of the encodings of its elements.
There is no length-prefix. | |||||
Examples: | |||||
```go | |||||
encode([4]int8{1, 2, 3, 4}) == [0x01, 0x02, 0x03, 0x04] | |||||
encode([4]int16{1, 2, 3, 4}) == [0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] | |||||
encode([4]int{1, 2, 3, 4}) == [0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04] | |||||
encode([2]string{"abc", "efg"}) == [0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] | |||||
``` | |||||
### Slices (variable length) | |||||
An encoded variable-length array is length-prefixed followed by the concatenation of the encoding of | |||||
its elements. | |||||
The length-prefix is itself encoded as an `int`. | |||||
An empty slice is encoded as `0x00`. It is not length-prefixed. | |||||
Examples: | |||||
```go | |||||
encode([]int8{}) == [0x00] | |||||
encode([]int8{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x02, 0x03, 0x04] | |||||
encode([]int16{1, 2, 3, 4}) == [0x01, 0x04, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] | |||||
encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04]
encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] | |||||
``` | |||||
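To make the difference between arrays and slices concrete, here is a small sketch for `int16` elements. The only difference between the two encoders is the length prefix. All names are hypothetical, and `encodeLength` stands in for the `int` encoding of the element count.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// encodeLength encodes a non-negative count as a one-byte size prefix
// followed by minimal big-endian bytes. Zero is the single byte 0x00.
func encodeLength(n int) []byte {
	if n == 0 {
		return []byte{0x00}
	}
	var body []byte
	for u := uint64(n); u > 0; u >>= 8 {
		body = append([]byte{byte(u)}, body...)
	}
	return append([]byte{byte(len(body))}, body...)
}

// appendInt16 appends x as two big-endian bytes.
func appendInt16(out []byte, x int16) []byte {
	u := uint16(x)
	return append(out, byte(u>>8), byte(u))
}

// encodeInt16Array encodes a fixed-length array: elements only, no prefix.
func encodeInt16Array(xs [4]int16) []byte {
	var out []byte
	for _, x := range xs {
		out = appendInt16(out, x)
	}
	return out
}

// encodeInt16Slice encodes a slice: element count first, then the elements.
func encodeInt16Slice(xs []int16) []byte {
	out := encodeLength(len(xs))
	for _, x := range xs {
		out = appendInt16(out, x)
	}
	return out
}

// hexOf renders bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	fmt.Println(hexOf(encodeInt16Array([4]int16{1, 2, 3, 4})))
	fmt.Println(hexOf(encodeInt16Slice([]int16{1, 2, 3, 4})))
}
```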
### BitArray | |||||
BitArray is encoded as an `int` holding the number of bits, followed by a slice of `uint64`
holding the values of the bits.
```go | |||||
type BitArray struct { | |||||
Bits int | |||||
Elems []uint64 | |||||
} | |||||
``` | |||||
### Time | |||||
Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970, | |||||
rounded to the nearest millisecond. | |||||
Times before then are invalid. | |||||
Examples: | |||||
```go | |||||
encode(time.Time("Jan 1 00:00:00 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00] | |||||
encode(time.Time("Jan 1 00:00:01 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x3B, 0x9A, 0xCA, 0x00] // 1,000,000,000 ns | |||||
encode(time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")) == [0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] | |||||
``` | |||||
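A sketch of the time encoding, assuming the standard library's `Time.Round` is an acceptable way to round to the nearest millisecond. The names `encodeTime` and `timeHex` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// encodeTime rounds t to the nearest millisecond and writes the nanoseconds
// since the Unix epoch as an 8-byte big-endian integer.
// Times before the epoch are rejected, as the spec declares them invalid.
func encodeTime(t time.Time) ([]byte, error) {
	ns := t.Round(time.Millisecond).UnixNano()
	if ns < 0 {
		return nil, errors.New("times before 1970 are invalid")
	}
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(ns))
	return buf, nil
}

// timeHex encodes the time at sec seconds and nsec nanoseconds after the
// epoch and returns lowercase hex, or "invalid" if encoding fails.
func timeHex(sec, nsec int64) string {
	b, err := encodeTime(time.Unix(sec, nsec))
	if err != nil {
		return "invalid"
	}
	return hex.EncodeToString(b)
}

func main() {
	fmt.Println(timeHex(0, 0))
	fmt.Println(timeHex(1, 0))
}
```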
### Structs | |||||
An encoded struct is the concatenation of the encoding of its elements. | |||||
There is no length-prefix. | |||||
Examples: | |||||
```go | |||||
type MyStruct struct{ | |||||
A int | |||||
B string | |||||
C time.Time | |||||
} | |||||
encode(MyStruct{4, "hello", time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")}) == | |||||
[0x01, 0x04, 0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] | |||||
``` | |||||
## Merkle Trees | |||||
Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure. | |||||
RIPEMD160 is always used as the hashing function. | |||||
The function `SimpleMerkleRoot` is a simple recursive function defined as follows: | |||||
```go | |||||
func SimpleMerkleRoot(hashes [][]byte) []byte {
	switch len(hashes) {
	case 0:
		return nil
	case 1:
		return hashes[0]
	default:
		// Split so that the left subtree gets the extra element when the count is odd.
		left := SimpleMerkleRoot(hashes[:(len(hashes)+1)/2])
		right := SimpleMerkleRoot(hashes[(len(hashes)+1)/2:])
		return RIPEMD160(append(left, right...))
	}
}
``` | |||||
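For readers who want to run the recursion, here is a self-contained sketch. It swaps in SHA-256 from the standard library purely so the example has no external dependencies; the spec itself uses RIPEMD160, so the digests produced here will not match Tendermint's. `hashPair` is a hypothetical helper.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashPair hashes the concatenation of left and right.
// NOTE: SHA-256 is a stand-in for RIPEMD160, which the spec requires.
func hashPair(left, right []byte) []byte {
	buf := make([]byte, 0, len(left)+len(right))
	buf = append(buf, left...)
	buf = append(buf, right...)
	sum := sha256.Sum256(buf)
	return sum[:]
}

// simpleMerkleRoot follows the recursion above: the left subtree takes
// the extra element when the number of hashes is odd.
func simpleMerkleRoot(hashes [][]byte) []byte {
	switch len(hashes) {
	case 0:
		return nil
	case 1:
		return hashes[0]
	default:
		mid := (len(hashes) + 1) / 2
		return hashPair(simpleMerkleRoot(hashes[:mid]), simpleMerkleRoot(hashes[mid:]))
	}
}

// hexOf renders bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	leaves := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	fmt.Println(hexOf(simpleMerkleRoot(leaves)))
}
```

With three leaves the tree is `H(H(a|b) | c)`: the split point `(3+1)/2 = 2` puts two leaves on the left and one on the right.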
Note: we abuse notation and call `SimpleMerkleRoot` with arguments of type `struct` or type `[]struct`.
For `struct` arguments, we compute a `[][]byte` by sorting elements of the `struct` according to | |||||
field name and then hashing them. | |||||
For `[]struct` arguments, we compute a `[][]byte` by hashing the individual `struct` elements. | |||||
## JSON (TMJSON) | |||||
Signed messages (e.g. votes, proposals) in the consensus are encoded in TMJSON, rather than TMBIN.
TMJSON is JSON where `[]byte` values are encoded as uppercase hex, rather than base64.
When signing, the elements of a message are sorted by key and the sorted message is embedded in an | |||||
outer JSON that includes a `chain_id` field. | |||||
We call this encoding the CanonicalSignBytes. For instance, CanonicalSignBytes for a vote would look | |||||
like: | |||||
```json | |||||
{"chain_id":"my-chain-id","vote":{"block_id":{"hash":"DEADBEEF","parts":{"hash":"BEEFDEAD","total":3}},"height":3,"round":2,"timestamp":1234567890,"type":2}}
``` | |||||
Note how the fields within each level are sorted. | |||||
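One way to get key-sorted output in Go is to build the message from maps, since `encoding/json` marshals map keys in sorted order. The sketch below is only an illustration of that idea, not the actual Tendermint signing code. The function names are hypothetical, and it assumes hex values are emitted as quoted JSON strings.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// upperHex renders bytes as uppercase hex, as TMJSON does for []byte.
func upperHex(b []byte) string {
	return strings.ToUpper(hex.EncodeToString(b))
}

// exampleVote builds the vote from the example above as nested maps.
func exampleVote() map[string]interface{} {
	return map[string]interface{}{
		"type":      2,
		"round":     2,
		"height":    3,
		"timestamp": 1234567890,
		"block_id": map[string]interface{}{
			"hash": upperHex([]byte{0xDE, 0xAD, 0xBE, 0xEF}),
			"parts": map[string]interface{}{
				"total": 3,
				"hash":  upperHex([]byte{0xBE, 0xEF, 0xDE, 0xAD}),
			},
		},
	}
}

// canonicalSignBytes wraps the vote in an outer object carrying chain_id.
// json.Marshal sorts map keys, so every level comes out sorted.
func canonicalSignBytes(chainID string, vote map[string]interface{}) (string, error) {
	outer := map[string]interface{}{"chain_id": chainID, "vote": vote}
	b, err := json.Marshal(outer)
	return string(b), err
}

func main() {
	s, err := canonicalSignBytes("my-chain-id", exampleVote())
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```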
## Other | |||||
### MakeParts | |||||
Encode an object using TMBIN and slice it into parts. | |||||
```go | |||||
MakeParts(object, partSize) | |||||
``` | |||||
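The slicing half of `MakeParts` might look like the sketch below. It assumes the object has already been encoded to bytes and leaves out the per-part proofs; `splitIntoParts` is a hypothetical name.

```go
package main

import "fmt"

// splitIntoParts cuts data into consecutive chunks of at most partSize bytes.
// The final chunk may be shorter. A non-positive partSize yields no parts.
func splitIntoParts(data []byte, partSize int) [][]byte {
	if partSize <= 0 {
		return nil
	}
	var parts [][]byte
	for start := 0; start < len(data); start += partSize {
		end := start + partSize
		if end > len(data) {
			end = len(data)
		}
		parts = append(parts, data[start:end])
	}
	return parts
}

func main() {
	parts := splitIntoParts(make([]byte, 10), 4)
	for i, p := range parts {
		fmt.Println(i, len(p))
	}
}
```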
### Part | |||||
```go | |||||
type Part struct {
	Index int
	Bytes []byte
	Proof []byte
}
``` |
@ -1,192 +1 @@ | |||||
# Application Blockchain Interface (ABCI) | |||||
ABCI is the interface between Tendermint (a state-machine replication engine) | |||||
and an application (the actual state machine). | |||||
The ABCI message types are defined in a [protobuf | |||||
file](https://github.com/tendermint/abci/blob/master/types/types.proto). | |||||
For full details on the ABCI message types and protocol, see the [ABCI | |||||
specification](https://github.com/tendermint/abci/blob/master/specification.rst).
Be sure to read the specification if you're trying to build an ABCI app! | |||||
For additional details on server implementation, see the [ABCI | |||||
readme](https://github.com/tendermint/abci#implementation). | |||||
Here we provide some more details around the use of ABCI by Tendermint and | |||||
clarify common "gotchas". | |||||
## ABCI connections | |||||
Tendermint opens 3 ABCI connections to the app: one for Consensus, one for | |||||
Mempool, one for Queries. | |||||
## Async vs Sync | |||||
The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages. | |||||
This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward | |||||
transactions to the app before it's finished processing previous ones. | |||||
Thus, DeliverTx and CheckTx messages are sent asynchronously, while all other
messages are sent synchronously. | |||||
## CheckTx and Commit | |||||
It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState, | |||||
QueryState. The QueryState contains the latest committed state for a block. | |||||
The CheckTxState and DeliverTxState may be updated concurrently with one another. | |||||
Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen | |||||
to CheckTxState. When Commit completes, it unlocks the mempool. | |||||
Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState | |||||
(ie. the new state from executing all the txs in the block). | |||||
Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app | |||||
tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock. | |||||
## EndBlock Validator Updates | |||||
Updates to the Tendermint validator set can be made by returning `Validator` | |||||
objects in the `ResponseEndBlock`:
``` | |||||
message Validator { | |||||
bytes address = 1; | |||||
PubKey pub_key = 2; | |||||
int64 power = 3; | |||||
} | |||||
message PubKey { | |||||
string type = 1; | |||||
bytes data = 2; | |||||
} | |||||
``` | |||||
The `pub_key` currently supports two types: | |||||
- `type = "ed25519"` and `data = <raw 32-byte public key>`
- `type = "secp256k1"` and `data = <33-byte OpenSSL compressed public key>`
If the address is provided, it must match the address of the pubkey, as
specified [here](/docs/spec/blockchain/encoding.md#Addresses).
(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public | |||||
key](/docs/spec/blockchain/encoding.md#public-key-cryptography). | |||||
For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey | |||||
`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be | |||||
Amino encoded as | |||||
`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`) | |||||
(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a | |||||
single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`) | |||||
The `power` is the new voting power for the validator, with the | |||||
following rules: | |||||
- power must be non-negative | |||||
- if power is 0, the validator must already exist, and will be removed from the | |||||
validator set | |||||
- if power is non-0: | |||||
- if the validator does not already exist, it will be added to the validator | |||||
set with the given power | |||||
- if the validator does already exist, its power will be adjusted to the given power | |||||
## InitChain Validator Updates | |||||
ResponseInitChain has the option to return a list of validators. | |||||
If the list is not empty, Tendermint will adopt it for the validator set. | |||||
This way the application can determine the initial validator set for the | |||||
blockchain. | |||||
Note that if addresses are included in the returned validators, they must match
the address of the public key. | |||||
ResponseInitChain also includes ConsensusParams, but these are presently | |||||
ignored. | |||||
## Query | |||||
Query is a generic message type with lots of flexibility to enable diverse sets | |||||
of queries from applications. Tendermint has no requirements from the Query | |||||
message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish to.
That said, Tendermint makes a number of queries to support some optional | |||||
features. These are: | |||||
### Peer Filtering | |||||
When Tendermint connects to a peer, it sends two queries to the ABCI application | |||||
using the following paths, with no additional data: | |||||
- `/p2p/filter/addr/<IP:PORT>`, where `<IP:PORT>` denotes the IP address and
the port of the connection
- `/p2p/filter/id/<ID>`, where `<ID>` is the peer node ID (ie. the
pubkey.Address() for the peer's PubKey)
If either of these queries returns a non-zero ABCI code, Tendermint will refuse
to connect to the peer. | |||||
## Info and the Handshake/Replay | |||||
On startup, Tendermint calls Info on the Query connection to get the latest | |||||
committed state of the app. The app MUST return information consistent with the | |||||
last block it successfully completed Commit for.
If the app successfully committed block H but not H+1, then `last_block_height =
H` and `last_block_app_hash = <hash returned by Commit for block H>`. If the app | |||||
failed during the Commit of block H, then `last_block_height = H-1` and | |||||
`last_block_app_hash = <hash returned by Commit for block H-1, which is the hash | |||||
in the header of block H>`. | |||||
We now distinguish three heights, and describe how Tendermint syncs itself with | |||||
the app. | |||||
``` | |||||
storeBlockHeight = height of the last block Tendermint saw a commit for | |||||
stateBlockHeight = height of the last block for which Tendermint completed all | |||||
block processing and saved all ABCI results to disk | |||||
appBlockHeight = height of the last block for which ABCI app successfully
completed Commit
``` | |||||
Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight` | |||||
Note also we never call Commit on an ABCI app twice for the same height. | |||||
The procedure is as follows. | |||||
First, some simple start conditions:
If `appBlockHeight == 0`, then call InitChain. | |||||
If `storeBlockHeight == 0`, we're done. | |||||
Now, some sanity checks: | |||||
If `storeBlockHeight < appBlockHeight`, error | |||||
If `storeBlockHeight < stateBlockHeight`, panic | |||||
If `storeBlockHeight > stateBlockHeight+1`, panic | |||||
Now, the meat: | |||||
If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`, | |||||
replay all blocks in full from `appBlockHeight` to `storeBlockHeight`. | |||||
This happens if we completed processing the block, but the app forgot its height. | |||||
If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done | |||||
This happens if we crashed at an opportune spot. | |||||
If `storeBlockHeight == stateBlockHeight+1` | |||||
This happens if we started processing the block but didn't finish. | |||||
If `appBlockHeight < stateBlockHeight` | |||||
replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`, | |||||
and replay the block at `storeBlockHeight` using the WAL. | |||||
This happens if the app forgot the last block it committed. | |||||
If `appBlockHeight == stateBlockHeight`, | |||||
replay the last block (storeBlockHeight) in full. | |||||
This happens if we crashed before the app finished Commit | |||||
If `appBlockHeight == storeBlockHeight`,
update the state using the saved ABCI responses but don't run the block against the real app.
This happens if we crashed after the app finished Commit but before Tendermint saved the state. | |||||
[Moved](/docs/spec/software/abci.md) |
@ -0,0 +1,9 @@ | |||||
We are working to finalize an updated Tendermint specification with formal | |||||
proofs of safety and liveness. | |||||
In the meantime, see the [description in the | |||||
docs](http://tendermint.readthedocs.io/en/master/specification/byzantine-consensus-algorithm.html). | |||||
There are also relevant but somewhat outdated descriptions in Jae Kwon's [original | |||||
whitepaper](https://tendermint.com/static/docs/tendermint.pdf) and Ethan Buchman's [master's | |||||
thesis](https://atrium.lib.uoguelph.ca/xmlui/handle/10214/9769). |
@ -1,33 +1 @@ | |||||
# WAL | |||||
The consensus module writes every message to the WAL (write-ahead log).
It also issues an fsync syscall through
[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this | |||||
node (to prevent double signing). | |||||
Under the hood, it uses | |||||
[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group), | |||||
which rotates files when those get too big (> 10MB). | |||||
The total maximum size is 1GB. We only need the latest block and the block before it, | |||||
but if the former is dragging on across many rounds, we want all those rounds. | |||||
## Replay | |||||
The consensus module will replay all the messages of the last height written to the WAL
before a crash (if such occurs).
The private validator may try to sign messages during replay because it runs
somewhat autonomously and does not know about the replay process.
For example, if we got all the way to precommit in the WAL and then crashed,
after we replay the proposal message, the private validator will try to sign a | |||||
prevote. But it will fail. That's ok because we’ll see the prevote later in the | |||||
WAL. Then it will go to precommit, and that time it will work because the | |||||
private validator contains the `LastSignBytes` and then we’ll replay the | |||||
precommit from the WAL. | |||||
Make sure to read about [WAL | |||||
corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption) | |||||
and recovery strategies. | |||||
[Moved](/docs/spec/software/wal.md) |
@ -0,0 +1,192 @@ | |||||
# Application Blockchain Interface (ABCI) | |||||
ABCI is the interface between Tendermint (a state-machine replication engine) | |||||
and an application (the actual state machine). | |||||
The ABCI message types are defined in a [protobuf | |||||
file](https://github.com/tendermint/abci/blob/master/types/types.proto). | |||||
For full details on the ABCI message types and protocol, see the [ABCI | |||||
specification](https://github.com/tendermint/abci/blob/master/specification.rst).
Be sure to read the specification if you're trying to build an ABCI app! | |||||
For additional details on server implementation, see the [ABCI | |||||
readme](https://github.com/tendermint/abci#implementation). | |||||
Here we provide some more details around the use of ABCI by Tendermint and | |||||
clarify common "gotchas". | |||||
## ABCI connections | |||||
Tendermint opens 3 ABCI connections to the app: one for Consensus, one for | |||||
Mempool, one for Queries. | |||||
## Async vs Sync | |||||
The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages. | |||||
This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward | |||||
transactions to the app before it's finished processing previous ones. | |||||
Thus, DeliverTx and CheckTx messages are sent asynchronously, while all other
messages are sent synchronously. | |||||
## CheckTx and Commit | |||||
It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState, | |||||
QueryState. The QueryState contains the latest committed state for a block. | |||||
The CheckTxState and DeliverTxState may be updated concurrently with one another. | |||||
Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen | |||||
to CheckTxState. When Commit completes, it unlocks the mempool. | |||||
Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState | |||||
(ie. the new state from executing all the txs in the block). | |||||
Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app | |||||
tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock. | |||||
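The three-state pattern can be sketched with plain maps standing in for application state. Everything here is hypothetical (the `app` type, its fields, and the method names are not ABCI interfaces); it only illustrates resetting the query and CheckTx states to the DeliverTx state at Commit.

```go
package main

import "fmt"

// state is a toy key-value store standing in for real application state.
type state map[string]string

// clone returns an independent copy so later writes do not leak across states.
func (s state) clone() state {
	out := make(state, len(s))
	for k, v := range s {
		out[k] = v
	}
	return out
}

// app holds the three states described above.
type app struct {
	checkTx   state
	deliverTx state
	query     state
}

func newApp() *app {
	return &app{checkTx: state{}, deliverTx: state{}, query: state{}}
}

// deliverTxSet applies a write to the DeliverTx state only.
func (a *app) deliverTxSet(key, value string) { a.deliverTx[key] = value }

// commit resets the query and CheckTx states to the latest DeliverTx state.
// This relies on the mempool being locked for the duration of Commit.
func (a *app) commit() {
	a.query = a.deliverTx.clone()
	a.checkTx = a.deliverTx.clone()
}

func main() {
	a := newApp()
	a.deliverTxSet("k", "v")
	fmt.Println(len(a.query)) // nothing visible to queries before commit
	a.commit()
	fmt.Println(a.query["k"])
}
```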
## EndBlock Validator Updates | |||||
Updates to the Tendermint validator set can be made by returning `Validator` | |||||
objects in the `ResponseEndBlock`:
``` | |||||
message Validator { | |||||
bytes address = 1; | |||||
PubKey pub_key = 2; | |||||
int64 power = 3; | |||||
} | |||||
message PubKey { | |||||
string type = 1; | |||||
bytes data = 2; | |||||
} | |||||
``` | |||||
The `pub_key` currently supports two types: | |||||
- `type = "ed25519"` and `data = <raw 32-byte public key>`
- `type = "secp256k1"` and `data = <33-byte OpenSSL compressed public key>`
If the address is provided, it must match the address of the pubkey, as
specified [here](/docs/spec/blockchain/encoding.md#Addresses).
(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public | |||||
key](/docs/spec/blockchain/encoding.md#public-key-cryptography). | |||||
For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey | |||||
`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be | |||||
Amino encoded as | |||||
`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`) | |||||
(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a | |||||
single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`) | |||||
The `power` is the new voting power for the validator, with the | |||||
following rules: | |||||
- power must be non-negative | |||||
- if power is 0, the validator must already exist, and will be removed from the | |||||
validator set | |||||
- if power is non-0: | |||||
- if the validator does not already exist, it will be added to the validator | |||||
set with the given power | |||||
- if the validator does already exist, its power will be adjusted to the given power | |||||
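The power rules above can be written as a small function over a validator set. This is a sketch under the assumption that validators are keyed by an opaque string identifier; `applyValidatorUpdate` is a hypothetical name and not Tendermint's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// applyValidatorUpdate applies one update to a set mapping a validator's
// key (an opaque string here) to its voting power.
func applyValidatorUpdate(set map[string]int64, key string, power int64) error {
	_, exists := set[key]
	switch {
	case power < 0:
		return errors.New("power must be non-negative")
	case power == 0:
		// Removal is only valid for a validator that already exists.
		if !exists {
			return errors.New("cannot remove a validator that does not exist")
		}
		delete(set, key)
	default:
		// Adds a new validator or adjusts the power of an existing one.
		set[key] = power
	}
	return nil
}

func main() {
	set := map[string]int64{"val1": 10}
	fmt.Println(applyValidatorUpdate(set, "val2", 5), set)
	fmt.Println(applyValidatorUpdate(set, "val1", 0), set)
}
```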
## InitChain Validator Updates | |||||
ResponseInitChain has the option to return a list of validators. | |||||
If the list is not empty, Tendermint will adopt it for the validator set. | |||||
This way the application can determine the initial validator set for the | |||||
blockchain. | |||||
Note that if addresses are included in the returned validators, they must match
the address of the public key. | |||||
ResponseInitChain also includes ConsensusParams, but these are presently | |||||
ignored. | |||||
## Query | |||||
Query is a generic message type with lots of flexibility to enable diverse sets | |||||
of queries from applications. Tendermint has no requirements from the Query | |||||
message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish to.
That said, Tendermint makes a number of queries to support some optional | |||||
features. These are: | |||||
### Peer Filtering | |||||
When Tendermint connects to a peer, it sends two queries to the ABCI application | |||||
using the following paths, with no additional data: | |||||
- `/p2p/filter/addr/<IP:PORT>`, where `<IP:PORT>` denotes the IP address and
the port of the connection
- `/p2p/filter/id/<ID>`, where `<ID>` is the peer node ID (ie. the
pubkey.Address() for the peer's PubKey)
If either of these queries returns a non-zero ABCI code, Tendermint will refuse
to connect to the peer. | |||||
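On the application side, a peer filter could be a simple match on the query path. The sketch below is an assumption-heavy illustration: `filterPeer`, the ban lists, and the code values are all hypothetical, and a real app would do this inside its Query handler using the ABCI request and response types.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	codeOK       uint32 = 0 // zero means the peer is accepted
	codeRejected uint32 = 1 // any non-zero code makes Tendermint refuse the peer
)

// filterPeer inspects a query path and rejects peers found in the ban lists.
// A leading slash is tolerated but not required.
func filterPeer(path string, bannedAddrs, bannedIDs map[string]bool) uint32 {
	p := strings.TrimPrefix(path, "/")
	const addrPrefix = "p2p/filter/addr/"
	const idPrefix = "p2p/filter/id/"
	switch {
	case strings.HasPrefix(p, addrPrefix):
		if bannedAddrs[strings.TrimPrefix(p, addrPrefix)] {
			return codeRejected
		}
	case strings.HasPrefix(p, idPrefix):
		if bannedIDs[strings.TrimPrefix(p, idPrefix)] {
			return codeRejected
		}
	}
	return codeOK
}

func main() {
	addrs := map[string]bool{"10.0.0.1:46656": true}
	ids := map[string]bool{"deadbeef": true}
	fmt.Println(filterPeer("/p2p/filter/addr/10.0.0.1:46656", addrs, ids))
	fmt.Println(filterPeer("/p2p/filter/id/cafe", addrs, ids))
}
```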
## Info and the Handshake/Replay | |||||
On startup, Tendermint calls Info on the Query connection to get the latest | |||||
committed state of the app. The app MUST return information consistent with the | |||||
last block it successfully completed Commit for.
If the app successfully committed block H but not H+1, then `last_block_height =
H` and `last_block_app_hash = <hash returned by Commit for block H>`. If the app | |||||
failed during the Commit of block H, then `last_block_height = H-1` and | |||||
`last_block_app_hash = <hash returned by Commit for block H-1, which is the hash | |||||
in the header of block H>`. | |||||
We now distinguish three heights, and describe how Tendermint syncs itself with | |||||
the app. | |||||
``` | |||||
storeBlockHeight = height of the last block Tendermint saw a commit for | |||||
stateBlockHeight = height of the last block for which Tendermint completed all | |||||
block processing and saved all ABCI results to disk | |||||
appBlockHeight = height of the last block for which ABCI app successfully
completed Commit
``` | |||||
Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight` | |||||
Note also we never call Commit on an ABCI app twice for the same height. | |||||
The procedure is as follows. | |||||
First, some simple start conditions:
If `appBlockHeight == 0`, then call InitChain. | |||||
If `storeBlockHeight == 0`, we're done. | |||||
Now, some sanity checks: | |||||
If `storeBlockHeight < appBlockHeight`, error | |||||
If `storeBlockHeight < stateBlockHeight`, panic | |||||
If `storeBlockHeight > stateBlockHeight+1`, panic | |||||
Now, the meat: | |||||
If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`, | |||||
replay all blocks in full from `appBlockHeight` to `storeBlockHeight`. | |||||
This happens if we completed processing the block, but the app forgot its height. | |||||
If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done | |||||
This happens if we crashed at an opportune spot. | |||||
If `storeBlockHeight == stateBlockHeight+1` | |||||
This happens if we started processing the block but didn't finish. | |||||
If `appBlockHeight < stateBlockHeight` | |||||
replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`, | |||||
and replay the block at `storeBlockHeight` using the WAL. | |||||
This happens if the app forgot the last block it committed. | |||||
If `appBlockHeight == stateBlockHeight`, | |||||
replay the last block (storeBlockHeight) in full. | |||||
This happens if we crashed before the app finished Commit | |||||
If `appBlockHeight == storeBlockHeight`,
update the state using the saved ABCI responses but don't run the block against the real app.
This happens if we crashed after the app finished Commit but before Tendermint saved the state. |
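The decision procedure above can be condensed into one function that maps the three heights to an action. This is a sketch, not the Tendermint handshake code: the action strings and `replayAction` are hypothetical, the InitChain call for `appBlockHeight == 0` is left out, and the cases where the text says to panic are returned as errors instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical labels for the outcomes described in the procedure.
const (
	actionDone              = "done"
	actionReplayAll         = "replay all blocks in full up to storeBlockHeight"
	actionReplayAllThenWAL  = "replay in full up to storeBlockHeight-1, then last block via WAL"
	actionReplayLast        = "replay last block in full"
	actionUseSavedResponses = "update state from saved ABCI responses"
)

// replayAction picks the sync step for the given heights.
func replayAction(store, state, app int64) (string, error) {
	if store == 0 {
		return actionDone, nil
	}
	// Sanity checks: after these, store is state or state+1, and app <= store.
	if store < app {
		return "", errors.New("app is ahead of the block store")
	}
	if store < state || store > state+1 {
		return "", errors.New("store and state heights are inconsistent")
	}
	if store == state {
		if app < store {
			return actionReplayAll, nil
		}
		return actionDone, nil
	}
	// store == state+1: block processing started but did not finish.
	switch {
	case app < state:
		return actionReplayAllThenWAL, nil
	case app == state:
		return actionReplayLast, nil
	default: // app == store
		return actionUseSavedResponses, nil
	}
}

func main() {
	fmt.Println(replayAction(6, 5, 5))
}
```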
@ -0,0 +1,33 @@ | |||||
# WAL | |||||
The consensus module writes every message to the WAL (write-ahead log).
It also issues an fsync syscall through
[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this | |||||
node (to prevent double signing). | |||||
Under the hood, it uses | |||||
[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group), | |||||
which rotates files when those get too big (> 10MB). | |||||
The total maximum size is 1GB. We only need the latest block and the block before it, | |||||
but if the former is dragging on across many rounds, we want all those rounds. | |||||
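The write-then-fsync behaviour can be illustrated with the standard `os` package. This sketch does not use `autofile.Group` and does not reproduce the real WAL framing: newline-delimited records, `writeWAL`, and `demo` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// writeWAL appends one message to the log. Messages signed by this node
// are followed by File.Sync so they reach disk before we continue.
func writeWAL(f *os.File, msg []byte, signedByThisNode bool) error {
	// Newline framing is an assumption made for this sketch only.
	line := make([]byte, 0, len(msg)+1)
	line = append(line, msg...)
	line = append(line, '\n')
	if _, err := f.Write(line); err != nil {
		return err
	}
	if signedByThisNode {
		return f.Sync()
	}
	return nil
}

// demo writes two messages to a temporary file and returns its final size.
func demo() (int64, error) {
	f, err := os.CreateTemp("", "wal-sketch-*")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := writeWAL(f, []byte("proposal"), false); err != nil {
		return 0, err
	}
	if err := writeWAL(f, []byte("vote"), true); err != nil {
		return 0, err
	}
	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := demo()
	fmt.Println(size, err)
}
```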
## Replay | |||||
The consensus module will replay all the messages of the last height written to the WAL
before a crash (if such occurs).
The private validator may try to sign messages during replay because it runs
somewhat autonomously and does not know about the replay process.
For example, if we got all the way to precommit in the WAL and then crashed,
after we replay the proposal message, the private validator will try to sign a | |||||
prevote. But it will fail. That's ok because we’ll see the prevote later in the | |||||
WAL. Then it will go to precommit, and that time it will work because the | |||||
private validator contains the `LastSignBytes` and then we’ll replay the | |||||
precommit from the WAL. | |||||
Make sure to read about [WAL | |||||
corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption) | |||||
and recovery strategies. |