From b8f340afd05eca5d8524772a5bb027c4aefff877 Mon Sep 17 00:00:00 2001
From: Ethan Buchman
Date: Fri, 15 Jun 2018 22:56:26 -0700
Subject: [PATCH] docs/spec: some organizational cleanup

---
 docs/spec/README.md                           |  18 +-
 docs/spec/blockchain/pre-amino.md             | 246 ------------------
 docs/spec/consensus/abci.md                   | 193 +-------------
 docs/spec/consensus/consensus.md              |   9 +
 .../{blockchain => consensus}/light-client.md |   0
 docs/spec/consensus/wal.md                    |  34 +--
 docs/spec/software/abci.md                    | 192 ++++++++++++++
 docs/spec/software/wal.md                     |  33 +++
 8 files changed, 247 insertions(+), 478 deletions(-)
 delete mode 100644 docs/spec/blockchain/pre-amino.md
 create mode 100644 docs/spec/consensus/consensus.md
 rename docs/spec/{blockchain => consensus}/light-client.md (100%)
 create mode 100644 docs/spec/software/abci.md
 create mode 100644 docs/spec/software/wal.md

diff --git a/docs/spec/README.md b/docs/spec/README.md
index e2f6d1fa1..ab689d9d6 100644
--- a/docs/spec/README.md
+++ b/docs/spec/README.md
@@ -20,7 +20,9 @@ please submit them to our [bug bounty](https://tendermint.com/security)!
 
 ### Consensus Protocol
 
-- TODO
+- [Consensus Algorithm](/docs/spec/consensus/consensus.md)
+- [Time](/docs/spec/consensus/bft-time.md)
+- [Light-Client](/docs/spec/consensus/light-client.md)
 
 ### P2P and Network Protocols
 
@@ -31,9 +33,12 @@ please submit them to our [bug bounty](https://tendermint.com/security)!
 - [Mempool](https://github.com/tendermint/tendermint/tree/master/docs/spec/reactors/mempool): gossip transactions so they get included in blocks
 - Evidence: TODO
 
-### More
-- Light Client: TODO
-- Persistence: TODO
+### Software
+
+- [ABCI](/docs/spec/software/abci.md): Details about interactions between the
+  application and consensus engine over ABCI
+- [Write-Ahead Log](/docs/spec/software/wal.md): Details about how the consensus
+  engine preserves data and recovers from crash failures
 
 ## Overview
 
@@ -42,10 +47,9 @@ hash-linked batches of transactions. Such transaction batches are called "blocks
 Hence, Tendermint defines a "blockchain".
 
 Each block in Tendermint has a unique index - its Height.
-A block at `Height == H` can only be committed *after* the
-block at `Height == H-1`.
+Height's in the blockchain are monotonic.
 Each block is committed by a known set of weighted Validators.
-Membership and weighting within this set may change over time.
+Membership and weighting within this validator set may change over time.
 Tendermint guarantees the safety and liveness of the blockchain
 so long as less than 1/3 of the total weight of the Validator set
 is malicious or faulty.
diff --git a/docs/spec/blockchain/pre-amino.md b/docs/spec/blockchain/pre-amino.md
deleted file mode 100644
index edddeff46..000000000
--- a/docs/spec/blockchain/pre-amino.md
+++ /dev/null
@@ -1,246 +0,0 @@
-# Tendermint Encoding (Pre-Amino)
-
-## PubKeys and Addresses
-
-PubKeys are prefixed with a type-byte, followed by the raw bytes of the public
-key.
-
-Two keys are supported with the following type bytes:
-
-```
-TypeByteEd25519 = 0x1
-TypeByteSecp256k1 = 0x2
-```
-
-```
-// TypeByte: 0x1
-type PubKeyEd25519 [32]byte
-
-func (pub PubKeyEd25519) Encode() []byte {
-    return 0x1 | pub
-}
-
-func (pub PubKeyEd25519) Address() []byte {
-    // NOTE: the length (0x0120) is also included
-    return RIPEMD160(0x1 | 0x0120 | pub)
-}
-
-// TypeByte: 0x2
-// NOTE: OpenSSL compressed pubkey (x-cord with 0x2 or 0x3)
-type PubKeySecp256k1 [33]byte
-
-func (pub PubKeySecp256k1) Encode() []byte {
-    return 0x2 | pub
-}
-
-func (pub PubKeySecp256k1) Address() []byte {
-    return RIPEMD160(SHA256(pub))
-}
-```
-
-See https://github.com/tendermint/go-crypto/blob/v0.5.0/pub_key.go for more.
-
-## Binary Serialization (go-wire)
-
-Tendermint aims to encode data structures in a manner similar to how the corresponding Go structs
-are laid out in memory.
-Variable length items are length-prefixed.
-While the encoding was inspired by Go, it is easily implemented in other languages as well, given its intuitive design.
-
-XXX: This is changing to use real varints and 4-byte-prefixes.
-See https://github.com/tendermint/go-wire/tree/sdk2.
-
-### Fixed Length Integers
-
-Fixed length integers are encoded in Big-Endian using the specified number of bytes.
-So `uint8` and `int8` use one byte, `uint16` and `int16` use two bytes,
-`uint32` and `int32` use 3 bytes, and `uint64` and `int64` use 4 bytes.
-
-Negative integers are encoded via twos-complement.
-
-Examples:
-
-```go
-encode(uint8(6)) == [0x06]
-encode(uint32(6)) == [0x00, 0x00, 0x00, 0x06]
-
-encode(int8(-6)) == [0xFA]
-encode(int32(-6)) == [0xFF, 0xFF, 0xFF, 0xFA]
-```
-
-### Variable Length Integers
-
-Variable length integers are encoded as length-prefixed Big-Endian integers.
-The length-prefix consists of a single byte and corresponds to the length of the encoded integer.
-
-Negative integers are encoded by flipping the leading bit of the length-prefix to a `1`.
-
-Zero is encoded as `0x00`. It is not length-prefixed.
-
-Examples:
-
-```go
-encode(uint(6)) == [0x01, 0x06]
-encode(uint(70000)) == [0x03, 0x01, 0x11, 0x70]
-
-encode(int(-6)) == [0xF1, 0x06]
-encode(int(-70000)) == [0xF3, 0x01, 0x11, 0x70]
-
-encode(int(0)) == [0x00]
-```
-
-### Strings
-
-An encoded string is length-prefixed followed by the underlying bytes of the string.
-The length-prefix is itself encoded as an `int`.
-
-The empty string is encoded as `0x00`. It is not length-prefixed.
-
-Examples:
-
-```go
-encode("") == [0x00]
-encode("a") == [0x01, 0x01, 0x61]
-encode("hello") == [0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F]
-encode("¥") == [0x01, 0x02, 0xC2, 0xA5]
-```
-
-### Arrays (fixed length)
-
-An encoded fix-lengthed array is the concatenation of the encoding of its elements.
-There is no length-prefix.
-
-Examples:
-
-```go
-encode([4]int8{1, 2, 3, 4}) == [0x01, 0x02, 0x03, 0x04]
-encode([4]int16{1, 2, 3, 4}) == [0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04]
-encode([4]int{1, 2, 3, 4}) == [0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04]
-encode([2]string{"abc", "efg"}) == [0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67]
-```
-
-### Slices (variable length)
-
-An encoded variable-length array is length-prefixed followed by the concatenation of the encoding of
-its elements.
-The length-prefix is itself encoded as an `int`.
-
-An empty slice is encoded as `0x00`. It is not length-prefixed.
-
-Examples:
-
-```go
-encode([]int8{}) == [0x00]
-encode([]int8{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x02, 0x03, 0x04]
-encode([]int16{1, 2, 3, 4}) == [0x01, 0x04, 0x00, 0x01, 0x00, 0x02, 0x00, 0x03, 0x00, 0x04]
-encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x4]
-encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67]
-```
-
-### BitArray
-
-BitArray is encoded as an `int` of the number of bits, and with an array of `uint64` to encode
-value of each array element.
-
-```go
-type BitArray struct {
-    Bits int
-    Elems []uint64
-}
-```
-
-### Time
-
-Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970,
-rounded to the nearest millisecond.
-
-Times before then are invalid.
-
-Examples:
-
-```go
-encode(time.Time("Jan 1 00:00:00 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
-encode(time.Time("Jan 1 00:00:01 UTC 1970")) == [0x00, 0x00, 0x00, 0x00, 0x3B, 0x9A, 0xCA, 0x00] // 1,000,000,000 ns
-encode(time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")) == [0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00]
-```
-
-### Structs
-
-An encoded struct is the concatenation of the encoding of its elements.
-There is no length-prefix.
-
-Examples:
-
-```go
-type MyStruct struct{
-    A int
-    B string
-    C time.Time
-}
-encode(MyStruct{4, "hello", time.Time("Mon Jan 2 15:04:05 -0700 MST 2006")}) ==
-    [0x01, 0x04, 0x01, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x0F, 0xC4, 0xBB, 0xC1, 0x53, 0x03, 0x12, 0x00]
-```
-
-## Merkle Trees
-
-Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure.
-
-RIPEMD160 is always used as the hashing function.
-
-The function `SimpleMerkleRoot` is a simple recursive function defined as follows:
-
-```go
-func SimpleMerkleRoot(hashes [][]byte) []byte{
-    switch len(hashes) {
-    case 0:
-        return nil
-    case 1:
-        return hashes[0]
-    default:
-        left := SimpleMerkleRoot(hashes[:(len(hashes)+1)/2])
-        right := SimpleMerkleRoot(hashes[(len(hashes)+1)/2:])
-        return RIPEMD160(append(left, right))
-    }
-}
-```
-
-Note: we abuse notion and call `SimpleMerkleRoot` with arguments of type `struct` or type `[]struct`.
-For `struct` arguments, we compute a `[][]byte` by sorting elements of the `struct` according to
-field name and then hashing them.
-For `[]struct` arguments, we compute a `[][]byte` by hashing the individual `struct` elements.
-
-## JSON (TMJSON)
-
-Signed messages (eg. votes, proposals) in the consensus are encoded in TMJSON, rather than TMBIN.
-TMJSON is JSON where `[]byte` are encoded as uppercase hex, rather than base64.
-
-When signing, the elements of a message are sorted by key and the sorted message is embedded in an
-outer JSON that includes a `chain_id` field.
-We call this encoding the CanonicalSignBytes. For instance, CanonicalSignBytes for a vote would look
-like:
-
-```json
-{"chain_id":"my-chain-id","vote":{"block_id":{"hash":DEADBEEF,"parts":{"hash":BEEFDEAD,"total":3}},"height":3,"round":2,"timestamp":1234567890, "type":2}
-```
-
-Note how the fields within each level are sorted.
-
-## Other
-
-### MakeParts
-
-Encode an object using TMBIN and slice it into parts.
-
-```go
-MakeParts(object, partSize)
-```
-
-### Part
-
-```go
-type Part struct {
-    Index int
-    Bytes byte[]
-    Proof byte[]
-}
-```
diff --git a/docs/spec/consensus/abci.md b/docs/spec/consensus/abci.md
index 9c9e6a58b..82b88161e 100644
--- a/docs/spec/consensus/abci.md
+++ b/docs/spec/consensus/abci.md
@@ -1,192 +1 @@
-# Application Blockchain Interface (ABCI)
-
-ABCI is the interface between Tendermint (a state-machine replication engine)
-and an application (the actual state machine).
-
-The ABCI message types are defined in a [protobuf
-file](https://github.com/tendermint/abci/blob/master/types/types.proto).
-
-For full details on the ABCI message types and protocol, see the [ABCI
-specificaiton](https://github.com/tendermint/abci/blob/master/specification.rst).
-Be sure to read the specification if you're trying to build an ABCI app!
-
-For additional details on server implementation, see the [ABCI
-readme](https://github.com/tendermint/abci#implementation).
-
-Here we provide some more details around the use of ABCI by Tendermint and
-clarify common "gotchas".
-
-## ABCI connections
-
-Tendermint opens 3 ABCI connections to the app: one for Consensus, one for
-Mempool, one for Queries.
-
-## Async vs Sync
-
-The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages.
-This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward
-transactions to the app before it's finished processing previous ones.
-
-Thus, DeliverTx and CheckTx messages are sent asycnhronously, while all other
-messages are sent synchronously.
-
-## CheckTx and Commit
-
-It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState,
-QueryState. The QueryState contains the latest committed state for a block.
-The CheckTxState and DeliverTxState may be updated concurrently with one another.
-Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen
-to CheckTxState. When Commit completes, it unlocks the mempool.
-
-Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState
-(ie. the new state from executing all the txs in the block).
-
-Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app
-tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock.
-
-
-## EndBlock Validator Updates
-
-Updates to the Tendermint validator set can be made by returning `Validator`
-objects in the `ResponseBeginBlock`:
-
-```
-message Validator {
-  bytes address = 1;
-  PubKey pub_key = 2;
-  int64 power = 3;
-}
-
-message PubKey {
-  string type = 1;
-  bytes data = 2;
-}
-
-```
-
-The `pub_key` currently supports two types:
-  - `type = "ed25519" and `data = `
-  - `type = "secp256k1" and `data = <33-byte OpenSSL compressed public key>`
-
-If the address is provided, it must match the address of the pubkey, as
-specified [here](/docs/spec/blockchain/encoding.md#Addresses)
-
-(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public
-key](/docs/spec/blockchain/encoding.md#public-key-cryptography).
-For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey
-`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be
-Amino encoded as
-`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`)
-
-(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a
-single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`)
-
-The `power` is the new voting power for the validator, with the
-following rules:
-
-- power must be non-negative
-- if power is 0, the validator must already exist, and will be removed from the
-  validator set
-- if power is non-0:
-  - if the validator does not already exist, it will be added to the validator
-    set with the given power
-  - if the validator does already exist, its power will be adjusted to the given power
-
-## InitChain Validator Updates
-
-ResponseInitChain has the option to return a list of validators.
-If the list is not empty, Tendermint will adopt it for the validator set.
-This way the application can determine the initial validator set for the
-blockchain.
-
-Note that if addressses are included in the returned validators, they must match
-the address of the public key.
-
-ResponseInitChain also includes ConsensusParams, but these are presently
-ignored.
-
-## Query
-
-Query is a generic message type with lots of flexibility to enable diverse sets
-of queries from applications. Tendermint has no requirements from the Query
-message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish too.
-That said, Tendermint makes a number of queries to support some optional
-features. These are:
-
-### Peer Filtering
-
-When Tendermint connects to a peer, it sends two queries to the ABCI application
-using the following paths, with no additional data:
-
-  - `/p2p/filter/addr/`, where `` denote the IP address and
-    the port of the connection
-  - `p2p/filter/id/`, where `` is the peer node ID (ie. the
-    pubkey.Address() for the peer's PubKey)
-
-If either of these queries return a non-zero ABCI code, Tendermint will refuse
-to connect to the peer.
-
-## Info and the Handshake/Replay
-
-On startup, Tendermint calls Info on the Query connection to get the latest
-committed state of the app. The app MUST return information consistent with the
-last block it succesfully completed Commit for.
-
-If the app succesfully committed block H but not H+1, then `last_block_height =
-H` and `last_block_app_hash = `. If the app
-failed during the Commit of block H, then `last_block_height = H-1` and
-`last_block_app_hash = `.
-
-We now distinguish three heights, and describe how Tendermint syncs itself with
-the app.
-
-```
-storeBlockHeight = height of the last block Tendermint saw a commit for
-stateBlockHeight = height of the last block for which Tendermint completed all
-    block processing and saved all ABCI results to disk
-appBlockHeight = height of the last block for which ABCI app succesfully
-    completely Commit
-```
-
-Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight`
-Note also we never call Commit on an ABCI app twice for the same height.
-
-The procedure is as follows.
-
-First, some simeple start conditions:
-
-If `appBlockHeight == 0`, then call InitChain.
-
-If `storeBlockHeight == 0`, we're done.
-
-Now, some sanity checks:
-
-If `storeBlockHeight < appBlockHeight`, error
-If `storeBlockHeight < stateBlockHeight`, panic
-If `storeBlockHeight > stateBlockHeight+1`, panic
-
-Now, the meat:
-
-If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`,
-  replay all blocks in full from `appBlockHeight` to `storeBlockHeight`.
-  This happens if we completed processing the block, but the app forgot its height.
-
-If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done
-  This happens if we crashed at an opportune spot.
-
-If `storeBlockHeight == stateBlockHeight+1`
-  This happens if we started processing the block but didn't finish.
-
-  If `appBlockHeight < stateBlockHeight`
-    replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`,
-    and replay the block at `storeBlockHeight` using the WAL.
-    This happens if the app forgot the last block it committed.
-
-  If `appBlockHeight == stateBlockHeight`,
-    replay the last block (storeBlockHeight) in full.
-    This happens if we crashed before the app finished Commit
-
-  If appBlockHeight == storeBlockHeight {
-    update the state using the saved ABCI responses but dont run the block against the real app.
-    This happens if we crashed after the app finished Commit but before Tendermint saved the state.
+[Moved](/docs/spec/software/abci.md)
diff --git a/docs/spec/consensus/consensus.md b/docs/spec/consensus/consensus.md
new file mode 100644
index 000000000..1bf075773
--- /dev/null
+++ b/docs/spec/consensus/consensus.md
@@ -0,0 +1,9 @@
+We are working to finalize an updated Tendermint specification with formal
+proofs of safety and liveness.
+
+In the meantime, see the [description in the
+docs](http://tendermint.readthedocs.io/en/master/specification/byzantine-consensus-algorithm.html).
+
+There are also relevant but somewhat outdated descriptions in Jae Kwon's [original
+whitepaper](https://tendermint.com/static/docs/tendermint.pdf) and Ethan Buchman's [master's
+thesis](https://atrium.lib.uoguelph.ca/xmlui/handle/10214/9769).
diff --git a/docs/spec/blockchain/light-client.md b/docs/spec/consensus/light-client.md
similarity index 100%
rename from docs/spec/blockchain/light-client.md
rename to docs/spec/consensus/light-client.md
diff --git a/docs/spec/consensus/wal.md b/docs/spec/consensus/wal.md
index a2e03137d..589680f99 100644
--- a/docs/spec/consensus/wal.md
+++ b/docs/spec/consensus/wal.md
@@ -1,33 +1 @@
-# WAL
-
-Consensus module writes every message to the WAL (write-ahead log).
-
-It also issues fsync syscall through
-[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this
-node (to prevent double signing).
-
-Under the hood, it uses
-[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group),
-which rotates files when those get too big (> 10MB).
-
-The total maximum size is 1GB. We only need the latest block and the block before it,
-but if the former is dragging on across many rounds, we want all those rounds.
-
-## Replay
-
-Consensus module will replay all the messages of the last height written to WAL
-before a crash (if such occurs).
-
-The private validator may try to sign messages during replay because it runs
-somewhat autonomously and does not know about replay process.
-
-For example, if we got all the way to precommit in the WAL and then crash,
-after we replay the proposal message, the private validator will try to sign a
-prevote. But it will fail. That's ok because we’ll see the prevote later in the
-WAL. Then it will go to precommit, and that time it will work because the
-private validator contains the `LastSignBytes` and then we’ll replay the
-precommit from the WAL.
-
-Make sure to read about [WAL
-corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption)
-and recovery strategies.
+[Moved](/docs/spec/software/wal.md)
diff --git a/docs/spec/software/abci.md b/docs/spec/software/abci.md
new file mode 100644
index 000000000..9c9e6a58b
--- /dev/null
+++ b/docs/spec/software/abci.md
@@ -0,0 +1,192 @@
+# Application Blockchain Interface (ABCI)
+
+ABCI is the interface between Tendermint (a state-machine replication engine)
+and an application (the actual state machine).
+
+The ABCI message types are defined in a [protobuf
+file](https://github.com/tendermint/abci/blob/master/types/types.proto).
+
+For full details on the ABCI message types and protocol, see the [ABCI
+specificaiton](https://github.com/tendermint/abci/blob/master/specification.rst).
+Be sure to read the specification if you're trying to build an ABCI app!
+
+For additional details on server implementation, see the [ABCI
+readme](https://github.com/tendermint/abci#implementation).
+
+Here we provide some more details around the use of ABCI by Tendermint and
+clarify common "gotchas".
+
+## ABCI connections
+
+Tendermint opens 3 ABCI connections to the app: one for Consensus, one for
+Mempool, one for Queries.
+
+## Async vs Sync
+
+The main ABCI server (ie. non-GRPC) provides ordered asynchronous messages.
+This is useful for DeliverTx and CheckTx, since it allows Tendermint to forward
+transactions to the app before it's finished processing previous ones.
+
+Thus, DeliverTx and CheckTx messages are sent asycnhronously, while all other
+messages are sent synchronously.
+
+## CheckTx and Commit
+
+It is typical to hold three distinct states in an ABCI app: CheckTxState, DeliverTxState,
+QueryState. The QueryState contains the latest committed state for a block.
+The CheckTxState and DeliverTxState may be updated concurrently with one another.
+Before Commit is called, Tendermint locks and flushes the mempool so that no new changes will happen
+to CheckTxState. When Commit completes, it unlocks the mempool.
+
+Thus, during Commit, it is safe to reset the QueryState and the CheckTxState to the latest DeliverTxState
+(ie. the new state from executing all the txs in the block).
+
+Note, however, that it is not possible to send transactions to Tendermint during Commit - if your app
+tries to send a `/broadcast_tx` to Tendermint during Commit, it will deadlock.
+
+
+## EndBlock Validator Updates
+
+Updates to the Tendermint validator set can be made by returning `Validator`
+objects in the `ResponseBeginBlock`:
+
+```
+message Validator {
+  bytes address = 1;
+  PubKey pub_key = 2;
+  int64 power = 3;
+}
+
+message PubKey {
+  string type = 1;
+  bytes data = 2;
+}
+
+```
+
+The `pub_key` currently supports two types:
+  - `type = "ed25519" and `data = `
+  - `type = "secp256k1" and `data = <33-byte OpenSSL compressed public key>`
+
+If the address is provided, it must match the address of the pubkey, as
+specified [here](/docs/spec/blockchain/encoding.md#Addresses)
+
+(Note: In the v0.19 series, the `pub_key` is the [Amino encoded public
+key](/docs/spec/blockchain/encoding.md#public-key-cryptography).
+For Ed25519 pubkeys, the Amino prefix is always "1624DE6220". For example, the 32-byte Ed25519 pubkey
+`76852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85` would be
+Amino encoded as
+`1624DE622076852933A4686A721442E931A8415F62F5F1AEDF4910F1F252FB393F74C40C85`)
+
+(Note: In old versions of Tendermint (pre-v0.19.0), the pubkey is just prefixed with a
+single type byte, so for ED25519 we'd have `pub_key = 0x1 | pub`)
+
+The `power` is the new voting power for the validator, with the
+following rules:
+
+- power must be non-negative
+- if power is 0, the validator must already exist, and will be removed from the
+  validator set
+- if power is non-0:
+  - if the validator does not already exist, it will be added to the validator
+    set with the given power
+  - if the validator does already exist, its power will be adjusted to the given power
+
+## InitChain Validator Updates
+
+ResponseInitChain has the option to return a list of validators.
+If the list is not empty, Tendermint will adopt it for the validator set.
+This way the application can determine the initial validator set for the
+blockchain.
+
+Note that if addressses are included in the returned validators, they must match
+the address of the public key.
+
+ResponseInitChain also includes ConsensusParams, but these are presently
+ignored.
+
+## Query
+
+Query is a generic message type with lots of flexibility to enable diverse sets
+of queries from applications. Tendermint has no requirements from the Query
+message for normal operation - that is, the ABCI app developer need not implement Query functionality if they do not wish too.
+That said, Tendermint makes a number of queries to support some optional
+features. These are:
+
+### Peer Filtering
+
+When Tendermint connects to a peer, it sends two queries to the ABCI application
+using the following paths, with no additional data:
+
+  - `/p2p/filter/addr/`, where `` denote the IP address and
+    the port of the connection
+  - `p2p/filter/id/`, where `` is the peer node ID (ie. the
+    pubkey.Address() for the peer's PubKey)
+
+If either of these queries return a non-zero ABCI code, Tendermint will refuse
+to connect to the peer.
+
+## Info and the Handshake/Replay
+
+On startup, Tendermint calls Info on the Query connection to get the latest
+committed state of the app. The app MUST return information consistent with the
+last block it succesfully completed Commit for.
+
+If the app succesfully committed block H but not H+1, then `last_block_height =
+H` and `last_block_app_hash = `. If the app
+failed during the Commit of block H, then `last_block_height = H-1` and
+`last_block_app_hash = `.
+
+We now distinguish three heights, and describe how Tendermint syncs itself with
+the app.
+
+```
+storeBlockHeight = height of the last block Tendermint saw a commit for
+stateBlockHeight = height of the last block for which Tendermint completed all
+    block processing and saved all ABCI results to disk
+appBlockHeight = height of the last block for which ABCI app succesfully
+    completely Commit
+```
+
+Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight`
+Note also we never call Commit on an ABCI app twice for the same height.
+
+The procedure is as follows.
+
+First, some simeple start conditions:
+
+If `appBlockHeight == 0`, then call InitChain.
+
+If `storeBlockHeight == 0`, we're done.
+
+Now, some sanity checks:
+
+If `storeBlockHeight < appBlockHeight`, error
+If `storeBlockHeight < stateBlockHeight`, panic
+If `storeBlockHeight > stateBlockHeight+1`, panic
+
+Now, the meat:
+
+If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`,
+  replay all blocks in full from `appBlockHeight` to `storeBlockHeight`.
+  This happens if we completed processing the block, but the app forgot its height.
+
+If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done
+  This happens if we crashed at an opportune spot.
+
+If `storeBlockHeight == stateBlockHeight+1`
+  This happens if we started processing the block but didn't finish.
+
+  If `appBlockHeight < stateBlockHeight`
+    replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`,
+    and replay the block at `storeBlockHeight` using the WAL.
+    This happens if the app forgot the last block it committed.
+
+  If `appBlockHeight == stateBlockHeight`,
+    replay the last block (storeBlockHeight) in full.
+    This happens if we crashed before the app finished Commit
+
+  If appBlockHeight == storeBlockHeight {
+    update the state using the saved ABCI responses but dont run the block against the real app.
+    This happens if we crashed after the app finished Commit but before Tendermint saved the state.
diff --git a/docs/spec/software/wal.md b/docs/spec/software/wal.md
new file mode 100644
index 000000000..a2e03137d
--- /dev/null
+++ b/docs/spec/software/wal.md
@@ -0,0 +1,33 @@
+# WAL
+
+Consensus module writes every message to the WAL (write-ahead log).
+
+It also issues fsync syscall through
+[File#Sync](https://golang.org/pkg/os/#File.Sync) for messages signed by this
+node (to prevent double signing).
+
+Under the hood, it uses
+[autofile.Group](https://godoc.org/github.com/tendermint/tmlibs/autofile#Group),
+which rotates files when those get too big (> 10MB).
+
+The total maximum size is 1GB. We only need the latest block and the block before it,
+but if the former is dragging on across many rounds, we want all those rounds.
+
+## Replay
+
+Consensus module will replay all the messages of the last height written to WAL
+before a crash (if such occurs).
+
+The private validator may try to sign messages during replay because it runs
+somewhat autonomously and does not know about replay process.
+
+For example, if we got all the way to precommit in the WAL and then crash,
+after we replay the proposal message, the private validator will try to sign a
+prevote. But it will fail. That's ok because we’ll see the prevote later in the
+WAL. Then it will go to precommit, and that time it will work because the
+private validator contains the `LastSignBytes` and then we’ll replay the
+precommit from the WAL.
+
+Make sure to read about [WAL
+corruption](https://tendermint.readthedocs.io/projects/tools/en/master/specification/corruption.html#wal-corruption)
+and recovery strategies.
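
Review note on the relocated `docs/spec/software/abci.md` (not part of the patch): the "Info and the Handshake/Replay" procedure can be read as one decision over the three heights. The Go sketch below is only an illustration of that reading. The names `decideReplay`, `Action`, and the action constants are hypothetical and do not come from Tendermint, and the sketch returns an error where the text says "error" or "panic".

```go
package main

import (
	"errors"
	"fmt"
)

// Action is a hypothetical label for the step the handshake takes next.
type Action string

const (
	Done                Action = "done"
	ReplayInFull        Action = "replay blocks in full up to storeBlockHeight"
	ReplayInFullThenWAL Action = "replay in full up to storeBlockHeight-1, then last block via WAL"
	ReplayLastInFull    Action = "replay last block in full"
	ApplySavedResponses Action = "update state from saved ABCI responses"
)

// decideReplay mirrors the order of checks in the spec text:
// start conditions, sanity checks, then the replay cases.
func decideReplay(store, state, app int64) (callInitChain bool, act Action, err error) {
	// Start conditions.
	callInitChain = app == 0
	if store == 0 {
		return callInitChain, Done, nil
	}

	// Sanity checks (the text errors or panics here; the sketch returns errors).
	switch {
	case store < app:
		return callInitChain, "", errors.New("app is ahead of the block store")
	case store < state:
		return callInitChain, "", errors.New("state is ahead of the block store")
	case store > state+1:
		return callInitChain, "", errors.New("block store is more than one block ahead of state")
	}

	if store == state {
		if app < store {
			return callInitChain, ReplayInFull, nil
		}
		return callInitChain, Done, nil // app == store
	}

	// Remaining case: store == state+1, so block processing started but did not finish.
	switch {
	case app < state:
		return callInitChain, ReplayInFullThenWAL, nil
	case app == state:
		return callInitChain, ReplayLastInFull, nil
	default: // app == store
		return callInitChain, ApplySavedResponses, nil
	}
}

func main() {
	for _, h := range [][3]int64{{5, 5, 5}, {5, 5, 3}, {6, 5, 5}, {6, 5, 6}} {
		_, act, err := decideReplay(h[0], h[1], h[2])
		fmt.Println(h, act, err)
	}
}
```

Returning a label instead of performing the replay keeps the case analysis testable on its own. A real node would act on each case against the block store, the WAL, and the ABCI connection.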