docs/spec: some organizational cleanup
pull/1759/head
@ -1,4 +1,4 @@ | |||
# CODEOWNERS: https://help.github.com/articles/about-codeowners/ | |||
# Everything goes through Bucky and Anton. For now. | |||
* @ebuchman @melekes | |||
# Everything goes through Bucky, Anton, Alex. For now. | |||
* @ebuchman @melekes @xla |
@ -1,246 +0,0 @@ | |||
# Tendermint Encoding (Pre-Amino) | |||
## PubKeys and Addresses | |||
PubKeys are prefixed with a type-byte, followed by the raw bytes of the public | |||
key. | |||
Two keys are supported with the following type bytes: | |||
``` | |||
TypeByteEd25519 = 0x1 | |||
TypeByteSecp256k1 = 0x2 | |||
``` | |||
``` | |||
// TypeByte: 0x1 | |||
type PubKeyEd25519 [32]byte | |||
func (pub PubKeyEd25519) Encode() []byte { | |||
return 0x1 | pub | |||
} | |||
func (pub PubKeyEd25519) Address() []byte { | |||
// NOTE: the length (0x0120) is also included | |||
return RIPEMD160(0x1 | 0x0120 | pub) | |||
} | |||
// TypeByte: 0x2 | |||
// NOTE: OpenSSL compressed pubkey (x-coordinate prefixed with 0x2 or 0x3)
type PubKeySecp256k1 [33]byte | |||
func (pub PubKeySecp256k1) Encode() []byte { | |||
return 0x2 | pub | |||
} | |||
func (pub PubKeySecp256k1) Address() []byte { | |||
return RIPEMD160(SHA256(pub)) | |||
} | |||
``` | |||
See https://github.com/tendermint/go-crypto/blob/v0.5.0/pub_key.go for more. | |||
## Binary Serialization (go-wire) | |||
Tendermint aims to encode data structures in a manner similar to how the corresponding Go structs | |||
are laid out in memory. | |||
Variable length items are length-prefixed. | |||
While the encoding was inspired by Go, it is easily implemented in other languages as well, given its intuitive design. | |||
XXX: This is changing to use real varints and 4-byte-prefixes. | |||
See https://github.com/tendermint/go-wire/tree/sdk2. | |||
### Fixed Length Integers | |||
Fixed length integers are encoded in Big-Endian using the specified number of bytes. | |||
So `uint8` and `int8` use one byte, `uint16` and `int16` use two bytes, | |||
`uint32` and `int32` use four bytes, and `uint64` and `int64` use eight bytes.
Negative integers are encoded via twos-complement. | |||
Examples: | |||
```go | |||
encode(uint8(6)) == [0x06] | |||
encode(uint32(6)) == [0x00, 0x00, 0x00, 0x06] | |||
encode(int8(-6)) == [0xFA] | |||
encode(int32(-6)) == [0xFF, 0xFF, 0xFF, 0xFA] | |||
``` | |||
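The fixed-length rule can be reproduced with Go's standard `encoding/binary` package. This is a minimal sketch rather than the go-wire implementation; the helper names `encodeUint32` and `encodeInt32` are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeUint32 returns the 4-byte big-endian encoding of v.
// (Hypothetical helper, not part of go-wire.)
func encodeUint32(v uint32) []byte {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, v)
	return buf
}

// encodeInt32 reinterprets v as its two's-complement bit pattern
// and encodes that pattern big-endian.
func encodeInt32(v int32) []byte {
	return encodeUint32(uint32(v))
}

func main() {
	fmt.Printf("% X\n", encodeUint32(6))
	fmt.Printf("% X\n", encodeInt32(-6))
}
```

The conversion `uint32(v)` keeps the bit pattern of a negative `int32`, which is what gives the two's-complement bytes.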
### Variable Length Integers | |||
Variable length integers are encoded as length-prefixed Big-Endian integers. | |||
The length-prefix consists of a single byte and corresponds to the length of the encoded integer. | |||
Negative integers are encoded by setting the four high bits of the length-prefix, so a one-byte negative integer has the prefix `0xF1`.
Zero is encoded as `0x00`. It is not length-prefixed. | |||
Examples: | |||
```go | |||
encode(uint(6)) == [0x01, 0x06] | |||
encode(uint(70000)) == [0x03, 0x01, 0x11, 0x70] | |||
encode(int(-6)) == [0xF1, 0x06] | |||
encode(int(-70000)) == [0xF3, 0x01, 0x11, 0x70] | |||
encode(int(0)) == [0x00] | |||
``` | |||
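A sketch of this variable-length scheme in Go follows. It is written from the description and examples here, not copied from go-wire. One assumption is made explicit: negative values are marked by setting the high nibble of the prefix (`0xF0`), which is what the `0xF1` and `0xF3` prefixes in the examples suggest. The name `encodeVarint` is hypothetical.

```go
package main

import "fmt"

// encodeVarint encodes i as a one-byte length prefix followed by the
// big-endian bytes of its magnitude. Zero is the single byte 0x00.
// Assumption: negative values set the high nibble (0xF0) of the prefix.
func encodeVarint(i int64) []byte {
	if i == 0 {
		return []byte{0x00}
	}
	negative := i < 0
	u := uint64(i)
	if negative {
		u = uint64(-i) // magnitude of the negative value
	}
	// Collect the magnitude's bytes, most significant first, no leading zeros.
	var body []byte
	for ; u > 0; u >>= 8 {
		body = append([]byte{byte(u)}, body...)
	}
	prefix := byte(len(body))
	if negative {
		prefix |= 0xF0
	}
	return append([]byte{prefix}, body...)
}

func main() {
	for _, v := range []int64{6, 70000, -6, -70000, 0} {
		fmt.Printf("%d -> % X\n", v, encodeVarint(v))
	}
}
```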
### Strings | |||
An encoded string is length-prefixed followed by the underlying bytes of the string. | |||
The length-prefix is itself encoded as an `int`. | |||
The empty string is encoded as `0x00`. It is not length-prefixed. | |||
Examples: | |||
```go | |||
encode("") == [0x00] | |||
encode("a") == [0x01, 0x01, 0x61] | |||
encode("hello") == [0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F] | |||
encode("¥") == [0x01, 0x02, 0xC2, 0xA5] | |||
``` | |||
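String encoding can be sketched the same way: encode the byte length as a variable-length integer, then append the raw bytes. The helpers `encodeLength` and `encodeString` below are illustrative names, not go-wire functions.

```go
package main

import "fmt"

// encodeLength encodes a non-negative length as a one-byte size prefix
// followed by the big-endian bytes of the length. Zero is the byte 0x00.
func encodeLength(n int) []byte {
	if n == 0 {
		return []byte{0x00}
	}
	var body []byte
	for u := uint64(n); u > 0; u >>= 8 {
		body = append([]byte{byte(u)}, body...)
	}
	return append([]byte{byte(len(body))}, body...)
}

// encodeString writes the encoded byte length followed by the string's bytes.
// The length counts bytes, not characters, so multi-byte UTF-8 runes count
// once per byte.
func encodeString(s string) []byte {
	return append(encodeLength(len(s)), s...)
}

func main() {
	for _, s := range []string{"", "a", "hello", "¥"} {
		fmt.Printf("%q -> % X\n", s, encodeString(s))
	}
}
```

The same length-then-elements pattern applies to slices, with the element count in place of the byte length.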
### Arrays (fixed length) | |||
An encoded fixed-length array is the concatenation of the encodings of its elements.
There is no length-prefix. | |||
Examples: | |||
```go | |||
encode([4]int8{1, 2, 3, 4}) == [0x01, 0x02, 0x03, 0x04] | |||
encode([4]int16{1, 2, 3, 4}) == [0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] | |||
encode([4]int{1, 2, 3, 4}) == [0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04] | |||
encode([2]string{"abc", "efg"}) == [0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] | |||
``` | |||
### Slices (variable length) | |||
An encoded variable-length array is length-prefixed followed by the concatenation of the encoding of | |||
its elements. | |||
The length-prefix is itself encoded as an `int`. | |||
An empty slice is encoded as `0x00`. It is not length-prefixed. | |||
Examples: | |||
```go | |||
encode([]int8{}) == [0x00] | |||
encode([]int8{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x02, 0x03, 0x04] | |||
encode([]int16{1, 2, 3, 4}) == [0x01, 0x04, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04] | |||
encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04]
encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67] | |||
``` | |||
### BitArray | |||
A BitArray is encoded as an `int` holding the number of bits, followed by an array of `uint64`
holding the value of each array element.
```go | |||
type BitArray struct { | |||
Bits int | |||
Elems []uint64 | |||
} | |||
``` | |||
### Time | |||
Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970, | |||
rounded to the nearest millisecond. | |||
Times before then are invalid. | |||
Examples: | |||
```go | |||
encode(time.Time("Jan 1 00:00:00 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00] | |||
encode(time.Time("Jan 1 00:00:01 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x3B, 0x9A, 0xCA, 0x00] // 1,000,000,000 ns | |||
encode(time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")) == [0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] | |||
``` | |||
### Structs | |||
An encoded struct is the concatenation of the encoding of its elements. | |||
There is no length-prefix. | |||
Examples: | |||
```go | |||
type MyStruct struct{ | |||
A int | |||
B string | |||
C time.Time | |||
} | |||
encode(MyStruct{4, "hello", time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")}) == | |||
[0x01, 0x04, 0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00] | |||
``` | |||
## Merkle Trees | |||
Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure. | |||
RIPEMD160 is always used as the hashing function. | |||
The function `SimpleMerkleRoot` is a simple recursive function defined as follows: | |||
```go | |||
func SimpleMerkleRoot(hashes [][]byte) []byte {
	switch len(hashes) {
	case 0:
		return nil
	case 1:
		return hashes[0]
	default:
		// The left subtree takes the extra element when the count is odd.
		left := SimpleMerkleRoot(hashes[:(len(hashes)+1)/2])
		right := SimpleMerkleRoot(hashes[(len(hashes)+1)/2:])
		return RIPEMD160(append(left, right...))
	}
}
``` | |||
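For experimenting with the recursion, here is a runnable variant. RIPEMD160 is not in the Go standard library, so this sketch takes the hash as a parameter and uses SHA-256 purely as a stand-in. The resulting digests therefore differ from Tendermint's; only the tree shape is the same. The names `hashFunc`, `sha256Sum` and `simpleMerkleRoot` are illustrative.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashFunc abstracts the hash so RIPEMD160 can be plugged in.
type hashFunc func([]byte) []byte

// sha256Sum is a stand-in hash (assumption: used only because it is in the stdlib).
func sha256Sum(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// simpleMerkleRoot mirrors the recursive definition above.
func simpleMerkleRoot(h hashFunc, hashes [][]byte) []byte {
	switch len(hashes) {
	case 0:
		return nil
	case 1:
		return hashes[0]
	default:
		mid := (len(hashes) + 1) / 2
		left := simpleMerkleRoot(h, hashes[:mid])
		right := simpleMerkleRoot(h, hashes[mid:])
		// Copy before concatenating so the caller's slices are never modified.
		joined := append(append([]byte{}, left...), right...)
		return h(joined)
	}
}

func main() {
	leaves := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	fmt.Printf("%X\n", simpleMerkleRoot(sha256Sum, leaves))
}
```

With three leaves the split is two on the left and one on the right, so the root is `h(h(a || b) || c)`.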
Note: we abuse notation and call `SimpleMerkleRoot` with arguments of type `struct` or type `[]struct`.
For `struct` arguments, we compute a `[][]byte` by sorting elements of the `struct` according to | |||
field name and then hashing them. | |||
For `[]struct` arguments, we compute a `[][]byte` by hashing the individual `struct` elements. | |||
## JSON (TMJSON) | |||
Signed messages (eg. votes, proposals) in the consensus are encoded in TMJSON, rather than TMBIN. | |||
TMJSON is JSON where `[]byte` are encoded as uppercase hex, rather than base64. | |||
When signing, the elements of a message are sorted by key and the sorted message is embedded in an | |||
outer JSON that includes a `chain_id` field. | |||
We call this encoding the CanonicalSignBytes. For instance, CanonicalSignBytes for a vote would look | |||
like: | |||
```json | |||
{"chain_id":"my-chain-id","vote":{"block_id":{"hash":"DEADBEEF","parts":{"hash":"BEEFDEAD","total":3}},"height":3,"round":2,"timestamp":1234567890,"type":2}}
``` | |||
Note how the fields within each level are sorted. | |||
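The sorted-key property is easy to get in Go because `encoding/json` emits map keys in sorted order. The sketch below builds the outer object from maps and uses a small `hexBytes` type to emit uppercase hex. It assumes hex values are written as quoted JSON strings; `hexBytes` and `canonicalSignBytes` are illustrative names, not Tendermint's types.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// hexBytes marshals a byte slice as an uppercase hex string instead of base64.
type hexBytes []byte

func (b hexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(strings.ToUpper(hex.EncodeToString(b)))
}

// canonicalSignBytes wraps the message in an outer object with chain_id.
// json.Marshal sorts map keys at every level, which gives the canonical order.
func canonicalSignBytes(chainID string, vote map[string]interface{}) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"chain_id": chainID,
		"vote":     vote,
	})
}

func main() {
	vote := map[string]interface{}{
		"type":      2,
		"round":     2,
		"height":    3,
		"timestamp": 1234567890,
		"block_id": map[string]interface{}{
			"parts": map[string]interface{}{"total": 3, "hash": hexBytes{0xBE, 0xEF, 0xDE, 0xAD}},
			"hash":  hexBytes{0xDE, 0xAD, 0xBE, 0xEF},
		},
	}
	out, err := canonicalSignBytes("my-chain-id", vote)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Using structs instead of maps would emit fields in declaration order, so a struct-based version has to declare its fields alphabetically by JSON key.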
## Other | |||
### MakeParts | |||
Encode an object using TMBIN and slice it into parts. | |||
```go | |||
MakeParts(object, partSize) | |||
``` | |||
### Part | |||
```go | |||
type Part struct {
	Index int
	Bytes []byte
	Proof []byte
}
``` |
@ -1,192 +1 @@ | |||
# Application Blockchain Interface (ABCI) | |||
ABCI is the interface between Tendermint (a state-machine replication engine) | |||
and an application (the actual state machine). | |||
The ABCI message types are defined in a [protobuf | |||
file](https://github.com/tendermint/abci/blob/master/types/types.proto). | |||
For full details on the ABCI message types and protocol, see the [ABCI | |||
specification](https://github.com/tendermint/abci/blob/master/specification.rst).
Be sure to read the specification if you're trying to build an ABCI app! | |||
For additional details on server implementation, see the [ABCI | |||
readme](https://github.com/tendermint/abci#implementation). | |||
Here we provide some more details around the use of ABCI by Tendermint and | |||
clarify common "gotchas". | |||
## ABCI connections | |||
Tendermint opens 3 ABCI connections to the app: one for Consensus, one for | |||
Mempool, one for Queries. | |||
## Async vs Sync | |||
The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages. | |||
This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward | |||
transactions to the app before it's finished processing previous ones. | |||
Thus, DeliverTx and CheckTx messages are sent asynchronously, while all other
messages are sent synchronously. | |||
## CheckTx and Commit | |||
It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState, | |||
QueryState. The QueryState contains the latest committed state for a block. | |||
The CheckTxState and DeliverTxState may be updated concurrently with one another. | |||
Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen | |||
to CheckTxState. When Commit completes, it unlocks the mempool. | |||
Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState | |||
(ie. the new state from executing all the txs in the block). | |||
Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app | |||
tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock. | |||
## EndBlock Validator Updates | |||
Updates to the Tendermint validator set can be made by returning `Validator` | |||
objects in the `ResponseEndBlock`:
``` | |||
message Validator { | |||
bytes address = 1; | |||
PubKey pub_key = 2; | |||
int64 power = 3; | |||
} | |||
message PubKey { | |||
string type = 1; | |||
bytes data = 2; | |||
} | |||
``` | |||
The `pub_key` currently supports two types: | |||
- `type = "ed25519"` and `data = <raw 32-byte public key>`
- `type = "secp256k1"` and `data = <33-byte OpenSSL compressed public key>`
If the address is provided, it must match the address of the pubkey, as
specified [here](/docs/spec/blockchain/encoding.md#Addresses).
(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public | |||
key](/docs/spec/blockchain/encoding.md#public-key-cryptography). | |||
For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey | |||
`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be | |||
Amino encoded as | |||
`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`) | |||
(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a | |||
single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`) | |||
The `power` is the new voting power for the validator, with the | |||
following rules: | |||
- power must be non-negative
- if power is 0, the validator must already exist, and will be removed from the
  validator set
- if power is non-0:
  - if the validator does not already exist, it will be added to the validator
    set with the given power
  - if the validator does already exist, its power will be adjusted to the given power
## InitChain Validator Updates | |||
ResponseInitChain has the option to return a list of validators. | |||
If the list is not empty, Tendermint will adopt it for the validator set. | |||
This way the application can determine the initial validator set for the | |||
blockchain. | |||
Note that if addresses are included in the returned validators, they must match
the address of the public key. | |||
ResponseInitChain also includes ConsensusParams, but these are presently | |||
ignored. | |||
## Query | |||
Query is a generic message type with lots of flexibility to enable diverse sets | |||
of queries from applications. Tendermint has no requirements from the Query | |||
message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish to.
That said, Tendermint makes a number of queries to support some optional | |||
features. These are: | |||
### Peer Filtering | |||
When Tendermint connects to a peer, it sends two queries to the ABCI application | |||
using the following paths, with no additional data: | |||
- `/p2p/filter/addr/<IP:PORT>`, where `<IP:PORT>` denotes the IP address and
  the port of the connection
- `/p2p/filter/id/<ID>`, where `<ID>` is the peer node ID (ie. the
  pubkey.Address() for the peer's PubKey)
If either of these queries returns a non-zero ABCI code, Tendermint will refuse
to connect to the peer.
## Info and the Handshake/Replay | |||
On startup, Tendermint calls Info on the Query connection to get the latest | |||
committed state of the app. The app MUST return information consistent with the | |||
last block it successfully completed Commit for.
If the app successfully committed block H but not H+1, then `last_block_height =
H` and `last_block_app_hash = <hash returned by Commit for block H>`. If the app | |||
failed during the Commit of block H, then `last_block_height = H-1` and | |||
`last_block_app_hash = <hash returned by Commit for block H-1, which is the hash | |||
in the header of block H>`. | |||
We now distinguish three heights, and describe how Tendermint syncs itself with | |||
the app. | |||
``` | |||
storeBlockHeight = height of the last block Tendermint saw a commit for | |||
stateBlockHeight = height of the last block for which Tendermint completed all | |||
block processing and saved all ABCI results to disk | |||
appBlockHeight = height of the last block for which the ABCI app successfully
completed Commit
``` | |||
Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight` | |||
Note also we never call Commit on an ABCI app twice for the same height. | |||
The procedure is as follows. | |||
First, some simple start conditions:
If `appBlockHeight == 0`, then call InitChain. | |||
If `storeBlockHeight == 0`, we're done. | |||
Now, some sanity checks: | |||
If `storeBlockHeight < appBlockHeight`, error | |||
If `storeBlockHeight < stateBlockHeight`, panic | |||
If `storeBlockHeight > stateBlockHeight+1`, panic | |||
Now, the meat: | |||
- If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`,
  replay all blocks in full from `appBlockHeight` to `storeBlockHeight`.
  This happens if we completed processing the block, but the app forgot its height.
- If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done.
  This happens if we crashed at an opportune spot.
- If `storeBlockHeight == stateBlockHeight+1`, we started processing the block but didn't finish:
  - If `appBlockHeight < stateBlockHeight`,
    replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`,
    and replay the block at `storeBlockHeight` using the WAL.
    This happens if the app forgot the last block it committed.
  - If `appBlockHeight == stateBlockHeight`,
    replay the last block (`storeBlockHeight`) in full.
    This happens if we crashed before the app finished Commit.
  - If `appBlockHeight == storeBlockHeight`,
    update the state using the saved ABCI responses but don't run the block against the real app.
    This happens if we crashed after the app finished Commit but before Tendermint saved the state.
[Moved](/docs/spec/software/abci.md) |
@ -0,0 +1,9 @@ | |||
We are working to finalize an updated Tendermint specification with formal | |||
proofs of safety and liveness. | |||
In the meantime, see the [description in the | |||
docs](http://tendermint.readthedocs.io/en/master/specification/byzantine-consensus-algorithm.html). | |||
There are also relevant but somewhat outdated descriptions in Jae Kwon's [original | |||
whitepaper](https://tendermint.com/static/docs/tendermint.pdf) and Ethan Buchman's [master's | |||
thesis](https://atrium.lib.uoguelph.ca/xmlui/handle/10214/9769). |
@ -1,33 +1 @@ | |||
# WAL | |||
The consensus module writes every message to the WAL (write-ahead log).
It also issues an fsync syscall through
[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this | |||
node (to prevent double signing). | |||
Under the hood, it uses | |||
[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group), | |||
which rotates files when those get too big (> 10MB). | |||
The total maximum size is 1GB. We only need the latest block and the block before it, | |||
but if the former is dragging on across many rounds, we want all those rounds. | |||
## Replay | |||
The consensus module will replay all the messages of the last height written to the WAL
before a crash (if one occurred).
The private validator may try to sign messages during replay because it runs
somewhat autonomously and does not know about the replay process.
For example, if we got all the way to precommit in the WAL and then crashed,
after we replay the proposal message, the private validator will try to sign a | |||
prevote. But it will fail. That's ok because we’ll see the prevote later in the | |||
WAL. Then it will go to precommit, and that time it will work because the | |||
private validator contains the `LastSignBytes` and then we’ll replay the | |||
precommit from the WAL. | |||
Make sure to read about [WAL | |||
corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption) | |||
and recovery strategies. | |||
[Moved](/docs/spec/software/wal.md) |
@ -0,0 +1,192 @@ | |||
# Application Blockchain Interface (ABCI) | |||
ABCI is the interface between Tendermint (a state-machine replication engine) | |||
and an application (the actual state machine). | |||
The ABCI message types are defined in a [protobuf | |||
file](https://github.com/tendermint/abci/blob/master/types/types.proto). | |||
For full details on the ABCI message types and protocol, see the [ABCI | |||
specification](https://github.com/tendermint/abci/blob/master/specification.rst).
Be sure to read the specification if you're trying to build an ABCI app! | |||
For additional details on server implementation, see the [ABCI | |||
readme](https://github.com/tendermint/abci#implementation). | |||
Here we provide some more details around the use of ABCI by Tendermint and | |||
clarify common "gotchas". | |||
## ABCI connections | |||
Tendermint opens 3 ABCI connections to the app: one for Consensus, one for | |||
Mempool, one for Queries. | |||
## Async vs Sync | |||
The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages. | |||
This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward | |||
transactions to the app before it's finished processing previous ones. | |||
Thus, DeliverTx and CheckTx messages are sent asynchronously, while all other
messages are sent synchronously. | |||
## CheckTx and Commit | |||
It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState, | |||
QueryState. The QueryState contains the latest committed state for a block. | |||
The CheckTxState and DeliverTxState may be updated concurrently with one another. | |||
Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen | |||
to CheckTxState. When Commit completes, it unlocks the mempool. | |||
Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState | |||
(ie. the new state from executing all the txs in the block). | |||
Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app | |||
tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock. | |||
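The three-state pattern can be sketched as follows. This is an illustration under simplifying assumptions, not an ABCI application: state is a plain map, there is no locking, and the `app`, `state`, `clone` and `commit` names are invented for the example.

```go
package main

import "fmt"

// state is a stand-in for real application state.
type state map[string]string

// clone returns an independent copy so later writes do not leak between states.
func (s state) clone() state {
	out := make(state, len(s))
	for k, v := range s {
		out[k] = v
	}
	return out
}

// app holds the three states described above.
type app struct {
	checkTxState   state
	deliverTxState state
	queryState     state
}

// commit resets the query and CheckTx states to copies of the DeliverTx state.
// This is safe here because the mempool is locked while Commit runs.
func (a *app) commit() {
	a.queryState = a.deliverTxState.clone()
	a.checkTxState = a.deliverTxState.clone()
}

func main() {
	a := &app{checkTxState: state{}, deliverTxState: state{}, queryState: state{}}
	a.deliverTxState["balance"] = "10"
	a.commit()
	fmt.Println(a.queryState["balance"], a.checkTxState["balance"])
}
```

Copying rather than aliasing matters: if `checkTxState` pointed at the same map as `deliverTxState`, CheckTx calls for the next block would mutate state that DeliverTx relies on.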
## EndBlock Validator Updates | |||
Updates to the Tendermint validator set can be made by returning `Validator` | |||
objects in the `ResponseEndBlock`:
``` | |||
message Validator { | |||
bytes address = 1; | |||
PubKey pub_key = 2; | |||
int64 power = 3; | |||
} | |||
message PubKey { | |||
string type = 1; | |||
bytes data = 2; | |||
} | |||
``` | |||
The `pub_key` currently supports two types: | |||
- `type = "ed25519"` and `data = <raw 32-byte public key>`
- `type = "secp256k1"` and `data = <33-byte OpenSSL compressed public key>`
If the address is provided, it must match the address of the pubkey, as
specified [here](/docs/spec/blockchain/encoding.md#Addresses).
(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public | |||
key](/docs/spec/blockchain/encoding.md#public-key-cryptography). | |||
For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey | |||
`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be | |||
Amino encoded as | |||
`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`) | |||
(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a | |||
single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`) | |||
The `power` is the new voting power for the validator, with the | |||
following rules: | |||
- power must be non-negative
- if power is 0, the validator must already exist, and will be removed from the
  validator set
- if power is non-0:
  - if the validator does not already exist, it will be added to the validator
    set with the given power
  - if the validator does already exist, its power will be adjusted to the given power
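The power rules above amount to a small update function. The sketch below keys validators by an address string and is only meant to illustrate the rules; `validatorSet` and `applyUpdate` are hypothetical names and this is not how Tendermint stores its validator set.

```go
package main

import (
	"errors"
	"fmt"
)

// validatorSet maps a validator address to its voting power (simplified).
type validatorSet map[string]int64

// applyUpdate applies one validator update according to the rules above.
func applyUpdate(set validatorSet, addr string, power int64) error {
	switch {
	case power < 0:
		return errors.New("power must be non-negative")
	case power == 0:
		// Removal requires that the validator already exists.
		if _, ok := set[addr]; !ok {
			return fmt.Errorf("cannot remove unknown validator %s", addr)
		}
		delete(set, addr)
	default:
		// Adds a new validator or adjusts the power of an existing one.
		set[addr] = power
	}
	return nil
}

func main() {
	set := validatorSet{"A": 10}
	fmt.Println(applyUpdate(set, "B", 5), set)
	fmt.Println(applyUpdate(set, "A", 0), set)
	fmt.Println(applyUpdate(set, "C", 0), set)
}
```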
## InitChain Validator Updates | |||
ResponseInitChain has the option to return a list of validators. | |||
If the list is not empty, Tendermint will adopt it for the validator set. | |||
This way the application can determine the initial validator set for the | |||
blockchain. | |||
Note that if addresses are included in the returned validators, they must match
the address of the public key. | |||
ResponseInitChain also includes ConsensusParams, but these are presently | |||
ignored. | |||
## Query | |||
Query is a generic message type with lots of flexibility to enable diverse sets | |||
of queries from applications. Tendermint has no requirements from the Query | |||
message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish to.
That said, Tendermint makes a number of queries to support some optional | |||
features. These are: | |||
### Peer Filtering | |||
When Tendermint connects to a peer, it sends two queries to the ABCI application | |||
using the following paths, with no additional data: | |||
- `/p2p/filter/addr/<IP:PORT>`, where `<IP:PORT>` denotes the IP address and
  the port of the connection
- `/p2p/filter/id/<ID>`, where `<ID>` is the peer node ID (ie. the
  pubkey.Address() for the peer's PubKey)
If either of these queries returns a non-zero ABCI code, Tendermint will refuse
to connect to the peer.
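On the application side, peer filtering comes down to matching the query path and returning a zero or non-zero code. The following is a hypothetical sketch: the function `filterPeer`, the ban lists, and the specific non-zero code are all assumptions, and a real app would do this inside its Query handler. A leading slash is accepted but not required.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	codeOK       uint32 = 0
	codeRejected uint32 = 1 // any non-zero code rejects the peer; the value is an assumption
)

// filterPeer inspects a peer-filter query path and returns an ABCI-style code.
// Paths that are not peer-filter queries are accepted.
func filterPeer(path string, bannedAddrs, bannedIDs map[string]bool) uint32 {
	p := strings.TrimPrefix(path, "/")
	switch {
	case strings.HasPrefix(p, "p2p/filter/addr/"):
		if bannedAddrs[strings.TrimPrefix(p, "p2p/filter/addr/")] {
			return codeRejected
		}
	case strings.HasPrefix(p, "p2p/filter/id/"):
		if bannedIDs[strings.TrimPrefix(p, "p2p/filter/id/")] {
			return codeRejected
		}
	}
	return codeOK
}

func main() {
	bannedAddrs := map[string]bool{"10.0.0.7:26656": true}
	fmt.Println(filterPeer("/p2p/filter/addr/10.0.0.7:26656", bannedAddrs, nil))
	fmt.Println(filterPeer("/p2p/filter/addr/10.0.0.8:26656", bannedAddrs, nil))
}
```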
## Info and the Handshake/Replay | |||
On startup, Tendermint calls Info on the Query connection to get the latest | |||
committed state of the app. The app MUST return information consistent with the | |||
last block it successfully completed Commit for.
If the app successfully committed block H but not H+1, then `last_block_height =
H` and `last_block_app_hash = <hash returned by Commit for block H>`. If the app | |||
failed during the Commit of block H, then `last_block_height = H-1` and | |||
`last_block_app_hash = <hash returned by Commit for block H-1, which is the hash | |||
in the header of block H>`. | |||
We now distinguish three heights, and describe how Tendermint syncs itself with | |||
the app. | |||
``` | |||
storeBlockHeight = height of the last block Tendermint saw a commit for | |||
stateBlockHeight = height of the last block for which Tendermint completed all | |||
block processing and saved all ABCI results to disk | |||
appBlockHeight = height of the last block for which the ABCI app successfully
completed Commit
``` | |||
Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight` | |||
Note also we never call Commit on an ABCI app twice for the same height. | |||
The procedure is as follows. | |||
First, some simple start conditions:
If `appBlockHeight == 0`, then call InitChain. | |||
If `storeBlockHeight == 0`, we're done. | |||
Now, some sanity checks: | |||
If `storeBlockHeight < appBlockHeight`, error | |||
If `storeBlockHeight < stateBlockHeight`, panic | |||
If `storeBlockHeight > stateBlockHeight+1`, panic | |||
Now, the meat: | |||
- If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`,
  replay all blocks in full from `appBlockHeight` to `storeBlockHeight`.
  This happens if we completed processing the block, but the app forgot its height.
- If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done.
  This happens if we crashed at an opportune spot.
- If `storeBlockHeight == stateBlockHeight+1`, we started processing the block but didn't finish:
  - If `appBlockHeight < stateBlockHeight`,
    replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`,
    and replay the block at `storeBlockHeight` using the WAL.
    This happens if the app forgot the last block it committed.
  - If `appBlockHeight == stateBlockHeight`,
    replay the last block (`storeBlockHeight`) in full.
    This happens if we crashed before the app finished Commit.
  - If `appBlockHeight == storeBlockHeight`,
    update the state using the saved ABCI responses but don't run the block against the real app.
    This happens if we crashed after the app finished Commit but before Tendermint saved the state.
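The whole decision procedure fits in one function that maps the three heights to a list of steps. This is a sketch of the logic as described here, not Tendermint's handshake code: the step names and `replayPlan` are invented, and every failed sanity check is returned as an error rather than distinguishing errors from panics.

```go
package main

import (
	"errors"
	"fmt"
)

// Step names are hypothetical labels for the actions described above.
const (
	stepInitChain         = "InitChain"
	stepReplayToStore     = "ReplayFullToStore"
	stepReplayToStorePrev = "ReplayFullToStoreMinus1"
	stepReplayLastWAL     = "ReplayLastWithWAL"
	stepReplayLastFull    = "ReplayLastInFull"
	stepApplySaved        = "ApplySavedABCIResponses"
)

// replayPlan returns the steps needed to sync the app with Tendermint.
func replayPlan(store, state, app int64) ([]string, error) {
	var plan []string
	// Start conditions.
	if app == 0 {
		plan = append(plan, stepInitChain)
	}
	if store == 0 {
		return plan, nil
	}
	// Sanity checks.
	switch {
	case store < app:
		return nil, errors.New("app is ahead of the block store")
	case store < state:
		return nil, errors.New("state is ahead of the block store")
	case store > state+1:
		return nil, errors.New("block store is more than one block ahead of state")
	}
	if store == state {
		if app < store {
			plan = append(plan, stepReplayToStore)
		}
		return plan, nil
	}
	// store == state+1: the last block was started but not finished.
	switch {
	case app < state:
		plan = append(plan, stepReplayToStorePrev, stepReplayLastWAL)
	case app == state:
		plan = append(plan, stepReplayLastFull)
	default: // app == store
		plan = append(plan, stepApplySaved)
	}
	return plan, nil
}

func main() {
	fmt.Println(replayPlan(5, 5, 3))
	fmt.Println(replayPlan(5, 4, 5))
}
```

Because the sanity checks guarantee `appBlockHeight <= storeBlockHeight` and `storeBlockHeight <= stateBlockHeight+1`, the three inner cases are exhaustive.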
@ -0,0 +1,33 @@ | |||
# WAL | |||
The consensus module writes every message to the WAL (write-ahead log).
It also issues an fsync syscall through
[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this | |||
node (to prevent double signing). | |||
Under the hood, it uses | |||
[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group), | |||
which rotates files when those get too big (> 10MB). | |||
The total maximum size is 1GB. We only need the latest block and the block before it, | |||
but if the former is dragging on across many rounds, we want all those rounds. | |||
## Replay | |||
The consensus module will replay all the messages of the last height written to the WAL
before a crash (if one occurred).
The private validator may try to sign messages during replay because it runs
somewhat autonomously and does not know about the replay process.
For example, if we got all the way to precommit in the WAL and then crashed,
after we replay the proposal message, the private validator will try to sign a | |||
prevote. But it will fail. That's ok because we’ll see the prevote later in the | |||
WAL. Then it will go to precommit, and that time it will work because the | |||
private validator contains the `LastSignBytes` and then we’ll replay the | |||
precommit from the WAL. | |||
Make sure to read about [WAL | |||
corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption) | |||
and recovery strategies. |