This repository contains specifications for the Tendermint protocol. For the pdf, see the [latest release](https://github.com/tendermint/spec/releases).

There are currently two implementations of the Tendermint protocol,
maintained by two separate-but-collaborative entities:
one in [Go](https://github.com/tendermint/tendermint),
maintained by Interchain GmbH,
and one in [Rust](https://github.com/informalsystems/tendermint-rs),
maintained by Informal Systems.

There have been inadvertent divergences in the specs followed
by the Go implementation and the Rust implementation respectively.
However, we are working to reconverge these specs into a single unified spec.
Consequently, this repository is in a bit of a state of flux.

At the moment, the spec followed by the Go implementation
(tendermint/tendermint) is in the [spec](spec) directory,
while the spec followed by the Rust implementation
(informalsystems/tendermint-rs) is in the rust-spec
directory. TLA+ specifications are also in the rust-spec directory.
Over time, these specs will converge in the spec directory.
Once they have fully converged, we will version the spec moving forward.

Currently, all Tendermint nodes contain the complete sequence of blocks from genesis up to some height (typically the latest chain height). This will no longer be true when the following features are released:

- [Block pruning](https://github.com/tendermint/tendermint/issues/3652): removes historical blocks and associated data (e.g. validator sets) up to some height, keeping only the most recent blocks.
- [State sync](https://github.com/tendermint/tendermint/issues/828): bootstraps a new node by syncing state machine snapshots at a given height, but not historical blocks and associated data.

To maintain the integrity of the chain, the use of these features must be coordinated such that necessary historical blocks will not become unavailable or lost forever. In particular:

- Some nodes should have complete block histories, for auditability, querying, and bootstrapping.
- The majority of nodes should retain blocks longer than the Cosmos SDK unbonding period, for light client verification.
- Some nodes must take and serve state sync snapshots with snapshot intervals less than the block retention periods, to allow new nodes to state sync and then replay blocks to catch up.
- Applications may not persist their state on commit, and require block replay on restart.
- Only a minority of nodes can be state synced within the unbonding period, for light client verification and to serve block histories for catch-up.

However, it is unclear if and how we should enforce this. It may not be possible to technically enforce all of these without knowing the state of the entire network, but it may also be unrealistic to expect this to be enforced entirely through social coordination. This is especially unfortunate since the consequences of misconfiguration can be permanent chain-wide data loss.

As an example, we'll consider how the Cosmos SDK might make use of this.

The returned `retain_height` would be the lowest height that satisfies the following (a sketch of the computation follows the list):

- Unbonding time: the time interval in which validators can be economically punished for misbehavior. Blocks in this interval must be auditable e.g. by the light client.
- IAVL snapshot interval: the block interval at which the underlying IAVL database is persisted to disk, e.g. every 10000 heights. Blocks since the last IAVL snapshot must be available for replay on application restart.
- State sync snapshots: blocks since the _oldest_ available snapshot must be available for state sync nodes to catch up (oldest because a node may be restoring an old snapshot while a new snapshot was taken).
- Local config: archive nodes may want to retain more or all blocks, e.g. via a local config option `min-retain-blocks`. There may also be a need to vary retention for other nodes, e.g. sentry nodes which do not need historical blocks.
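
A minimal Go sketch of that computation, assuming all constraints have already been expressed as heights. The function name and parameters are hypothetical, not actual Cosmos SDK code:

```go
// retainHeight returns the lowest height satisfying all retention
// constraints above; returning 0 means "retain all blocks".
// Hypothetical sketch, not actual Cosmos SDK code.
func retainHeight(height, unbondingBlocks, lastIAVLSnapshot, oldestStateSyncSnapshot, minRetainBlocks int64) int64 {
	retain := height - unbondingBlocks // blocks in the unbonding period must stay auditable
	if lastIAVLSnapshot < retain {
		retain = lastIAVLSnapshot // blocks since the last IAVL snapshot are needed for replay
	}
	if oldestStateSyncSnapshot < retain {
		retain = oldestStateSyncSnapshot // state sync nodes replay blocks from the oldest snapshot
	}
	if fromConfig := height - minRetainBlocks; fromConfig < retain {
		retain = fromConfig // respect the operator's local configuration
	}
	if retain <= 0 {
		return 0
	}
	return retain
}
```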

### Positive

- Application-specified block retention allows the application to take all relevant factors into account and prevent necessary blocks from being accidentally removed.
- Node operators can independently decide whether they want to provide complete block histories (if local configuration for this is provided) and snapshots.

### Negative

- Social coordination is required to run archival nodes; failure to do so may lead to permanent loss of historical blocks.
- Social coordination is required to run snapshot nodes; failure to do so may lead to an inability to run state sync, and an inability to bootstrap new nodes at all if no archival nodes are online.

### Neutral

- Reduced block retention requires application changes, and cannot be controlled directly in Tendermint.
- Application-specified block retention may set a lower bound on disk space requirements for all nodes.

## References

- State sync ADR: <https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md>
- State sync issue: <https://github.com/tendermint/tendermint/issues/828>

### InitChain

- **Request**:
    - `InitialHeight (int64)`: Height of the initial block (typically `1`).
- **Response**:
    - `ConsensusParams (ConsensusParams)`: Initial
      consensus-critical parameters (optional).
    - `Validators ([]ValidatorUpdate)`: Initial validator set (optional).
    - `AppHash ([]byte)`: Initial application hash.
- **Usage**:
    - Called once upon genesis.
    - If `ResponseInitChain.Validators` is empty, the initial validator set will be the `RequestInitChain.Validators`.
    - If `ResponseInitChain.Validators` is not empty, it will be the initial
      validator set (regardless of what is in `RequestInitChain.Validators`).
    - This allows the app to decide if it wants to accept the initial validator
      set proposed by Tendermint (ie. in the genesis file), or if it wants to use
      a different one (perhaps computed based on some application-specific
      information in the genesis file). A sketch follows.
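
A minimal sketch of this choice in Go, using the ABCI types from the Go implementation; the `App` struct and its fields/helpers are hypothetical:

```go
import abci "github.com/tendermint/tendermint/abci/types"

func (app *App) InitChain(req abci.RequestInitChain) abci.ResponseInitChain {
	if app.overrideGenesisValidators {
		// Returning a non-empty set replaces RequestInitChain.Validators.
		return abci.ResponseInitChain{Validators: app.validatorsFromGenesisState(req.AppStateBytes)}
	}
	// Returning an empty set keeps the validators proposed in the genesis file.
	return abci.ResponseInitChain{}
}
```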

### Query

- **Request**:
    - `Data ([]byte)`: Raw query bytes. Can be used with or in lieu
      of Path.
    - `Path (string)`: Path of request, like an HTTP GET path. Can be
      used with or in lieu of Data.
        - Apps MUST interpret '/store' as a query by key on the
          underlying store. The key SHOULD be specified in the Data field.
        - Apps SHOULD allow queries over specific types like
          '/accounts/...' or '/votes/...'
    - `Height (int64)`: The block height for which you want the query
      (default=0 returns data for the latest committed block). Note
      that this is the height of the block containing the
      application's Merkle root hash, which represents the state as it
      was after committing the block at Height-1.
    - `Prove (bool)`: Return Merkle proof with response if possible
- **Response**:
    - `Code (uint32)`: Response code.
    - `Log (string)`: The output of the application's logger. May
      be non-deterministic.
    - `Info (string)`: Additional information. May
      be non-deterministic.
    - `Index (int64)`: The index of the key in the tree.
    - `Key ([]byte)`: The key of the matching data.
    - `Value ([]byte)`: The value of the matching data.
    - `Proof (Proof)`: Serialized proof for the value data, if requested, to be
      verified against the `AppHash` for the given Height.
    - `Height (int64)`: The block height from which data was derived.
      Note that this is the height of the block containing the
      application's Merkle root hash, which represents the state as it
      was after committing the block at Height-1.
    - `Codespace (string)`: Namespace for the `Code`.
- **Usage**:
    - Query for data from the application at current or past height (see the sketch below).
    - Optionally return Merkle proof.
    - Merkle proof includes self-describing `type` field to support many types
      of Merkle trees and encoding formats.
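
A sketch of the '/store' convention above, assuming the same `abci` import as in the earlier sketch; the key-value store and `App` fields are hypothetical, and error handling is elided:

```go
func (app *App) Query(req abci.RequestQuery) abci.ResponseQuery {
	switch req.Path {
	case "/store":
		// '/store' is a query by key on the underlying store; the key is in Data.
		value := app.store.Get(req.Data)
		return abci.ResponseQuery{Key: req.Data, Value: value, Height: app.lastCommittedHeight}
	default:
		return abci.ResponseQuery{Code: 1, Log: "unknown query path: " + req.Path}
	}
}
```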

### BeginBlock

- **Request**:
    - `Hash ([]byte)`: The block's hash. This can be derived from the
      block header.
    - `Header (struct{})`: The block header.
    - `LastCommitInfo (LastCommitInfo)`: Info about the last commit, including the
      round, and the list of validators and which ones signed the last block.
    - `ByzantineValidators ([]Evidence)`: List of evidence of
      validators that acted maliciously.
- **Response**:
    - `Tags ([]kv.Pair)`: Key-Value tags for filtering and indexing
- **Usage**:
    - Signals the beginning of a new block. Called prior to
      any DeliverTxs.
    - The header contains the height, timestamp, and more - it exactly matches the
      Tendermint block header. We may seek to generalize this in the future.
    - The `LastCommitInfo` and `ByzantineValidators` can be used to determine
      rewards and punishments for the validators, as sketched below. Note that validators here do not
      include pubkeys.
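
For example, an application might record absent signers and evidence for later punishment. A sketch with hypothetical `App` helpers (`markAbsent`, `slash`):

```go
func (app *App) BeginBlock(req abci.RequestBeginBlock) abci.ResponseBeginBlock {
	for _, vote := range req.LastCommitInfo.Votes {
		if !vote.SignedLastBlock {
			// Validator missed the last block; e.g. reduce its reward.
			app.markAbsent(vote.Validator.Address)
		}
	}
	for _, ev := range req.ByzantineValidators {
		// Evidence of misbehavior at ev.Height; punish the offending validator.
		app.slash(ev.Validator.Address, ev.Height)
	}
	return abci.ResponseBeginBlock{}
}
```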

### CheckTx

- **Request**:
    - `Tx ([]byte)`: The request transaction bytes.
    - `Type (CheckTxType)`: What type of `CheckTx` request is this? At present,
      there are two possible values: `CheckTx_New` (the default, which says
      that a full check is required), and `CheckTx_Recheck` (when the mempool is
      initiating a normal recheck of a transaction).
- **Response**:
    - `Code (uint32)`: Response code.
    - `Data ([]byte)`: Result bytes, if any.
    - `Log (string)`: The output of the application's logger. May
      be non-deterministic.
    - `Info (string)`: Additional information. May
      be non-deterministic.
    - `GasWanted (int64)`: Amount of gas requested for transaction.
    - `GasUsed (int64)`: Amount of gas consumed by transaction.
    - `Tags ([]kv.Pair)`: Key-Value tags for filtering and indexing
      transactions (eg. by account).
    - `Codespace (string)`: Namespace for the `Code`.
- **Usage**:
    - Technically optional - not involved in processing blocks.
    - Guardian of the mempool: every node runs CheckTx before letting a
      transaction into its local mempool.
    - The transaction may come from an external user or another node.
    - CheckTx need not execute the transaction in full, but may instead perform
      a lightweight yet stateful validation, like checking signatures and account
      balances, without running code in a virtual machine (see the sketch below).
    - Transactions where `ResponseCheckTx.Code != 0` will be rejected - they will not be broadcast to
      other nodes or included in a proposal block.
    - Tendermint attributes no other value to the response code.
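
A sketch of such a lightweight check; `decodeTx` and the account helpers are hypothetical:

```go
func (app *App) CheckTx(req abci.RequestCheckTx) abci.ResponseCheckTx {
	tx, err := decodeTx(req.Tx) // hypothetical decoder
	if err != nil {
		return abci.ResponseCheckTx{Code: 1, Log: "undecodable tx"}
	}
	// Stateful but lightweight: verify the signature and balance only;
	// do not execute the transaction.
	if !tx.VerifySignature() || app.balance(tx.Sender) < tx.Amount {
		return abci.ResponseCheckTx{Code: 2, Log: "bad signature or insufficient funds"}
	}
	return abci.ResponseCheckTx{Code: 0, GasWanted: tx.GasWanted}
}
```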

### DeliverTx

- **Request**:
    - `Tx ([]byte)`: The request transaction bytes.
- **Response**:
    - `Code (uint32)`: Response code.
    - `Data ([]byte)`: Result bytes, if any.
    - `Log (string)`: The output of the application's logger. May
      be non-deterministic.
    - `Info (string)`: Additional information. May
      be non-deterministic.
    - `GasWanted (int64)`: Amount of gas requested for transaction.
    - `GasUsed (int64)`: Amount of gas consumed by transaction.
    - `Tags ([]kv.Pair)`: Key-Value tags for filtering and indexing
      transactions (eg. by account).
    - `Codespace (string)`: Namespace for the `Code`.
- **Usage**:
    - The workhorse of the application - non-optional.
    - Execute the transaction in full (see the sketch below).
    - `ResponseDeliverTx.Code == 0` only if the transaction is fully valid.
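
A sketch of full execution, with the same hypothetical helpers as the CheckTx sketch plus a hypothetical `execute` that mutates the working state:

```go
func (app *App) DeliverTx(req abci.RequestDeliverTx) abci.ResponseDeliverTx {
	tx, err := decodeTx(req.Tx)
	if err != nil {
		return abci.ResponseDeliverTx{Code: 1, Log: "undecodable tx"}
	}
	// Execute in full against the working state; only Code == 0 marks the
	// transaction fully valid.
	if err := app.execute(tx); err != nil {
		return abci.ResponseDeliverTx{Code: 2, Log: err.Error()}
	}
	return abci.ResponseDeliverTx{Code: 0}
}
```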

### EndBlock

- **Request**:
    - `Height (int64)`: Height of the block just executed.
- **Response**:
    - `ValidatorUpdates ([]ValidatorUpdate)`: Changes to validator set (set
      voting power to 0 to remove).
    - `ConsensusParamUpdates (ConsensusParams)`: Changes to
      consensus-critical time, size, and other parameters.
    - `Tags ([]kv.Pair)`: Key-Value tags for filtering and indexing
- **Usage**:
    - Signals the end of a block.
    - Called after all transactions, prior to each Commit.
    - Validator updates returned by block `H` impact blocks `H+1`, `H+2`, and
      `H+3`, but only take effect on the validator set of `H+2` (see the sketch below):
        - `H+1`: NextValidatorsHash
        - `H+2`: ValidatorsHash (and thus the validator set)
        - `H+3`: LastCommitInfo (ie. the last validator set)
    - Consensus params returned for block `H` apply for block `H+1`.
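
A sketch returning validator changes (power 0 removes a validator); the pending-updates bookkeeping is hypothetical:

```go
func (app *App) EndBlock(req abci.RequestEndBlock) abci.ResponseEndBlock {
	// e.g. updates accumulated while executing this block's transactions.
	updates := app.pendingValidatorUpdates()
	// Updates returned for block H take effect on the validator set of H+2.
	return abci.ResponseEndBlock{ValidatorUpdates: updates}
}
```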

### Commit

- **Response**:
    - `Data ([]byte)`: The Merkle root hash of the application state
    - `RetainHeight (int64)`: Blocks below this height may be removed. Defaults
      to `0` (retain all).
- **Usage**:
    - Persist the application state (see the sketch below).
    - Return an (optional) Merkle root hash of the application state.
    - `ResponseCommit.Data` is included as the `Header.AppHash` in the next block
      - it may be empty.
    - Later calls to `Query` can return proofs about the application state anchored
      in this Merkle root hash.
    - Note developers can return whatever they want here (could be nothing, or a
      constant string, etc.), so long as it is deterministic - it must not be a
      function of anything that did not come from the
      BeginBlock/DeliverTx/EndBlock methods.
    - Use `RetainHeight` with caution! If all nodes in the network remove historical
      blocks then this data is permanently lost, and no new nodes will be able to
      join the network and bootstrap. Historical blocks may also be required for
      other purposes, e.g. auditing, replay of non-persisted heights, light client
      verification, and so on.
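
A sketch combining the app hash with the `retainHeight` computation sketched earlier; the state type and `App` fields are hypothetical:

```go
func (app *App) Commit() abci.ResponseCommit {
	appHash := app.state.Save() // persist state; returns its Merkle root hash
	return abci.ResponseCommit{
		Data: appHash,
		RetainHeight: retainHeight(app.height, app.unbondingBlocks,
			app.lastIAVLSnapshot, app.oldestStateSyncSnapshot, app.minRetainBlocks),
	}
}
```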

### ListSnapshots

- **Response**:
    - `Snapshots ([]Snapshot)`: List of local state snapshots.
- **Usage**:
    - Used during state sync to discover available snapshots on peers.
    - See `Snapshot` data type for details.

### LoadSnapshotChunk

- **Request**:
    - `Height (uint64)`: The height of the snapshot the chunk belongs to.
    - `Format (uint32)`: The application-specific format of the snapshot the chunk belongs to.
    - `Chunk (uint32)`: The chunk index, starting from `0` for the initial chunk.
- **Response**:
    - `Chunk ([]byte)`: The binary chunk contents, in an arbitrary format. Chunk messages cannot be
      larger than 16 MB _including metadata_, so 10 MB is a good starting point.
- **Usage**:
    - Used during state sync to retrieve snapshot chunks from peers (see the sketch below).
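
A sketch of serving a chunk from a hypothetical local snapshot store:

```go
func (app *App) LoadSnapshotChunk(req abci.RequestLoadSnapshotChunk) abci.ResponseLoadSnapshotChunk {
	// Look up the chunk by (height, format, index) in the local snapshot store.
	chunk := app.snapshots.LoadChunk(req.Height, req.Format, req.Chunk)
	return abci.ResponseLoadSnapshotChunk{Chunk: chunk}
}
```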

### OfferSnapshot

- **Request**:
    - `Snapshot (Snapshot)`: The snapshot offered for restoration.
    - `AppHash ([]byte)`: The light client-verified app hash for this height, from the blockchain.
- **Response**:
    - `Result (Result)`: The result of the snapshot offer.
        - `ACCEPT`: Snapshot is accepted, start applying chunks.
        - `ABORT`: Abort snapshot restoration, and don't try any other snapshots.
        - `REJECT`: Reject this specific snapshot, try others.
        - `REJECT_FORMAT`: Reject all snapshots with this `format`, try others.
        - `REJECT_SENDERS`: Reject all snapshots from all senders of this snapshot, try others.
- **Usage**:
    - `OfferSnapshot` is called when bootstrapping a node using state sync. The application may
      accept or reject snapshots as appropriate. Upon accepting, Tendermint will retrieve and
      apply snapshot chunks via `ApplySnapshotChunk`. The application may also choose to reject a
      snapshot in the chunk response, in which case it should be prepared to accept further
      `OfferSnapshot` calls.
    - Only `AppHash` can be trusted, as it has been verified by the light client. Any other data
      can be spoofed by adversaries, so applications should employ additional verification schemes
      to avoid denial-of-service attacks. The verified `AppHash` is automatically checked against
      the restored application at the end of snapshot restoration.
    - For more information, see the `Snapshot` data type or the [state sync section](apps.md#state-sync).

### ApplySnapshotChunk

- **Request**:
    - `Index (uint32)`: The chunk index, starting from `0`. Tendermint applies chunks sequentially.
    - `Chunk ([]byte)`: The binary chunk contents, as returned by `LoadSnapshotChunk`.
    - `Sender (string)`: The P2P ID of the node that sent this chunk.
- **Response**:
    - `Result (Result)`: The result of applying this chunk.
        - `ACCEPT`: The chunk was accepted.
        - `ABORT`: Abort snapshot restoration, and don't try any other snapshots.
        - `RETRY`: Reapply this chunk, combine with `RefetchChunks` and `RejectSenders` as appropriate.
        - `RETRY_SNAPSHOT`: Restart this snapshot from `OfferSnapshot`, reusing chunks unless
          instructed otherwise.
        - `REJECT_SNAPSHOT`: Reject this snapshot, try a different one.
    - `RefetchChunks ([]uint32)`: Refetch and reapply the given chunks, regardless of `Result`. Only
      the listed chunks will be refetched, and reapplied in sequential order.
    - `RejectSenders ([]string)`: Reject the given P2P senders, regardless of `Result`. Any chunks
      already applied will not be refetched unless explicitly requested, but queued chunks from these
      senders will be discarded, and new chunks or other snapshots rejected.
- **Usage**:
    - The application can choose to refetch chunks and/or ban P2P peers as appropriate. Tendermint
      will not do this unless instructed by the application.
    - The application may want to verify each chunk, e.g. by attaching chunk hashes in
      `Snapshot.Metadata` and/or incrementally verifying contents against `AppHash` (see the
      sketch below).
    - When all chunks have been accepted, Tendermint will make an ABCI `Info` call to verify that
      `LastBlockAppHash` and `LastBlockHeight` match the expected values, and record the
      `AppVersion` in the node state. It then switches to fast sync or consensus and joins the
      network.
    - If Tendermint is unable to retrieve the next chunk after some time (e.g. because no suitable
      peers are available), it will reject the snapshot and try a different one via `OfferSnapshot`.
      The application should be prepared to reset and accept it or abort as appropriate.
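
A sketch of per-chunk verification against hashes carried in `Snapshot.Metadata`. The metadata layout and `App` helpers are hypothetical, and the `Result` constants assume the names generated from the protobuf enum in the Go implementation:

```go
import (
	"bytes"
	"crypto/sha256"
)

func (app *App) ApplySnapshotChunk(req abci.RequestApplySnapshotChunk) abci.ResponseApplySnapshotChunk {
	sum := sha256.Sum256(req.Chunk)
	if !bytes.Equal(sum[:], app.expectedChunkHashes[req.Index]) {
		// Corrupt or forged chunk: refetch it and ban the sender.
		return abci.ResponseApplySnapshotChunk{
			Result:        abci.ResponseApplySnapshotChunk_RETRY,
			RefetchChunks: []uint32{req.Index},
			RejectSenders: []string{req.Sender},
		}
	}
	app.restoreChunk(req.Chunk) // apply the verified chunk to the state store
	return abci.ResponseApplySnapshotChunk{Result: abci.ResponseApplySnapshotChunk_ACCEPT}
}
```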

## Data Types

### Header

- **Fields**:
    - `Version (Version)`: Version of the blockchain and the application
    - `ChainID (string)`: ID of the blockchain
    - `Height (int64)`: Height of the block in the chain
    - `Time (google.protobuf.Timestamp)`: Time of the previous block.
      For most blocks it's the weighted median of the timestamps of the valid votes in the
      block.LastCommit, except for the initial height where it's the genesis time.
    - `LastBlockID (BlockID)`: Hash of the previous (parent) block
    - `LastCommitHash ([]byte)`: Hash of the previous block's commit
    - `ValidatorsHash ([]byte)`: Hash of the validator set for this block
    - `NextValidatorsHash ([]byte)`: Hash of the validator set for the next block
    - `ConsensusHash ([]byte)`: Hash of the consensus parameters for this block
    - `AppHash ([]byte)`: Data returned by the last call to `Commit` - typically the
      Merkle root of the application state after executing the previous block's
      transactions
    - `LastResultsHash ([]byte)`: Root hash of all results from the txs from the previous block.
    - `EvidenceHash ([]byte)`: Hash of the evidence included in this block
    - `ProposerAddress ([]byte)`: Original proposer for the block
- **Usage**:
    - Provided in RequestBeginBlock
    - Provides important context about the current state of the blockchain -
      especially height and time.
    - Provides the proposer of the current block, for use in proposer-based
      reward mechanisms.
    - `LastResultsHash` is the root hash of a Merkle tree built from `ResponseDeliverTx` responses (`Log`, `Info`, `Codespace` and `Events` fields are ignored).

### Version

- **Fields**:
    - `Block (uint64)`: Protocol version of the blockchain data structures.
    - `App (uint64)`: Protocol version of the application.
- **Usage**:
    - Block version should be static in the life of a blockchain.
    - App version may be updated over time by the application.

### Validator

- **Fields**:
    - `Address ([]byte)`: Address of the validator (the first 20 bytes of SHA256(public key))
    - `Power (int64)`: Voting power of the validator
- **Usage**:
    - Validator identified by address (derivation sketched below)
    - Used in RequestBeginBlock as part of VoteInfo
    - Does not include PubKey to avoid sending potentially large quantum pubkeys
      over the ABCI
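
A small illustration of the address derivation stated above:

```go
import "crypto/sha256"

// validatorAddress derives a validator address as the first 20 bytes of
// SHA256 over the raw public key bytes, as described above.
func validatorAddress(pubKey []byte) []byte {
	h := sha256.Sum256(pubKey)
	return h[:20]
}
```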

### ValidatorUpdate

- **Fields**:
    - `PubKey (PubKey)`: Public key of the validator
    - `Power (int64)`: Voting power of the validator
- **Usage**:
    - Validator identified by PubKey
    - Used to tell Tendermint to update the validator set

### VoteInfo

- **Fields**:
    - `Validator (Validator)`: A validator
    - `SignedLastBlock (bool)`: Indicates whether or not the validator signed
      the last block
- **Usage**:
    - Indicates whether a validator signed the last block, allowing for rewards
      based on validator availability

### PubKey

- **Fields**:
    - `Type (string)`: Type of the public key. A simple string like `"ed25519"`.
      In the future, may indicate a serialization algorithm to parse the `Data`,
      for instance `"amino"`.
    - `Data ([]byte)`: Public key data. For a simple public key, it's just the
      raw bytes. If the `Type` indicates an encoding algorithm, this is the
      encoded public key.
- **Usage**:
    - A generic and extensible typed public key

### Evidence

- **Fields**:
    - `Type (string)`: Type of the evidence. A hierarchical path like
      "duplicate/vote".
    - `Validator (Validator)`: The offending validator
    - `Height (int64)`: Height when the offense occurred
    - `Time (google.protobuf.Timestamp)`: Time of the block that was committed at the height that the offense occurred
    - `TotalVotingPower (int64)`: Total voting power of the validator set at
      height `Height`

### LastCommitInfo

- **Fields**:
    - `Round (int32)`: Commit round.
    - `Votes ([]VoteInfo)`: List of validator addresses in the last validator set
      with their voting power and whether or not they signed a vote.

### ConsensusParams

- **Fields**:
    - `Block (BlockParams)`: Parameters limiting the size of a block and time between consecutive blocks.
    - `Evidence (EvidenceParams)`: Parameters limiting the validity of
      evidence of byzantine behaviour.
    - `Validator (ValidatorParams)`: Parameters limiting the types of pubkeys validators can use.
    - `Version (VersionParams)`: The ABCI application version.

### BlockParams

- **Fields**:
    - `MaxBytes (int64)`: Max size of a block, in bytes.
    - `MaxGas (int64)`: Max sum of `GasWanted` in a proposed block.
        - NOTE: blocks that violate this may be committed if there are Byzantine proposers.
          It's the application's responsibility to handle this when processing a
          block!

### EvidenceParams

- **Fields**:
    - `MaxAgeNumBlocks (int64)`: Max age of evidence, in blocks.
    - `MaxAgeDuration (time.Duration)`: Max age of evidence, in time.
      It should correspond with an app's "unbonding period" or other similar
      mechanism.

Tendermint consensus guarantees the following specifications for all heights:

* agreement -- no two correct full nodes decide differently.
* validity -- the decided block satisfies the predefined predicate *valid()*.
* termination -- all correct full nodes eventually decide, provided the failure model holds.

If the failure model does not hold, each of these specifications may be violated.

The agreement property says that for a given height, any two correct validators that decide on a block for that height decide on the same block. That the block was indeed generated by the blockchain, can be verified starting from a trusted (genesis) block, and checking that all subsequent blocks are properly signed.

However, faulty nodes may forge blocks and try to convince users (light clients) that the blocks had been correctly generated. In addition, Tendermint agreement might be violated in the case where more than 1/3 of the voting power belongs to faulty validators: Two correct validators decide on different blocks. The latter case motivates the term "fork": as Tendermint consensus also agrees on the next validator set, correct validators may have decided on disjoint next validator sets, and the chain branches into two or more partitions (possibly having faulty validators in common) and each branch continues to generate blocks independently of the other.

We say that a fork is a case in which there are two commits for different blocks at the same height of the blockchain. The problem is to ensure that in those cases we are able to detect faulty validators (and not mistakenly accuse correct validators), and therefore incentivize validators to behave according to the protocol specification.

*Remark.* In the case more than 1/3 of the voting power belongs to faulty validators, also validity and termination can be broken. Termination can be broken if faulty processes just do not send the messages that are needed to make progress. Due to asynchrony, this is not punishable, because faulty validators can always claim they never received the messages that would have forced them to send messages.

## The Misbehavior of Faulty Validators

Forks are the result of faulty validators deviating from the protocol. In principle several such deviations can be detected without a fork actually occurring:

*Remark.* In isolation, Point 3 is an attack on validity (rather than agreement). However, the prevotes and precommits can then also be used to forge blocks.

1. amnesia: Tendermint consensus has a locking mechanism. If a validator has some value v locked, then it can only prevote/precommit for v or nil. Sending a prevote/precommit message for a different value v' (that is not nil) while holding a lock on value v is misbehavior.
2. spurious messages: In Tendermint consensus most of the message send instructions are guarded by threshold guards, e.g., one needs to receive *2f + 1* prevote messages to send a precommit. Faulty validators may send a precommit without having received the prevote messages.

Independently of a fork happening, punishing this behavior might be important to prevent forks altogether. This should keep attackers from misbehaving: if at most 1/3 of the voting power is faulty, this misbehavior is detectable but will not lead to a safety violation. Thus, unless they have more than 1/3 (or in some cases more than 2/3) of the voting power attackers have the incentive to not misbehave. If attackers control too much voting power, we have to deal with forks, as discussed in this document.

## Two types of forks

* Fork-Light. All correct validators decide on the same block for height *h*, but faulty processes (validators or not), forge a different block for that height, in order to fool users (who use the light client).

# Attack scenarios

## On-chain attacks

There are several scenarios in which forks might happen.

* F1. Equivocation: faulty validators sign multiple vote messages (prevote and/or precommit) for different values *during the same round r* at a given height h.

### Flip-flopping

Tendermint consensus implements a locking mechanism: If a correct validator *p* receives a proposal for value *v* and *2f + 1* prevotes for a value *id(v)* in round *r*, it locks *v* and remembers *r*. In this case, *p* also sends a precommit message for *id(v)*, which later may serve as proof that *p* locked *v*.

In subsequent rounds, *p* only sends prevote messages for a value it had previously locked. However, it is possible to change the locked value if in a future round *r' > r* the process receives a proposal and *2f + 1* prevotes for a different value *v'*. In this case, *p* could send a prevote/precommit for *id(v')*. This algorithmic feature can be exploited in two ways:

* F2. Faulty Flip-flopping (Amnesia): faulty validators precommit some value *id(v)* in round *r* (value *v* is locked in round *r*) and then prevote for different value *id(v')* in higher round *r' > r* without previously correctly unlocking value *v*. In this case faulty processes "forget" that they have locked value *v* and prevote some other value in the following rounds.

Some correct validators might have decided on *v* in *r*, and other correct validators decide on *v'* in *r'*. Here we can have branching on the main chain (Fork-Full).

* F3. Correct Flip-flopping (Back to the past): There are some precommit messages signed by (correct) validators for value *id(v)* in round *r*. Still, *v* is not decided upon, and all processes move on to the next round. Then correct validators (correctly) lock and decide a different value *v'* in some round *r' > r*. And the correct validators continue; there is no branching on the main chain.

However, faulty validators may use the correct precommit messages from round *r* together with a posteriori generated faulty precommit messages for round *r* to forge a block for a value that was not decided on the main chain (Fork-Light).

## Off-chain attacks

F1-F3 may contaminate the state of full nodes (and even validators). Contaminated (but otherwise correct) full nodes may thus communicate faulty blocks to light clients.

We consider three types of potential attack victims:

* FN: full node
* LCS: light client with sequential header verification
* LCB: light client with bisection based header verification

F1 and F2 can be used by faulty validators to actually create multiple branches on the blockchain. That means that correctly operating full nodes decide on different blocks for the same height. Until a fork is detected locally by a full node (by receiving evidence from others or by some other local check that fails), the full node can spread corrupted blocks to light clients.

F3 is similar to F1, except that no two correct validators decide on different blocks.

In addition, without creating a fork on the main chain, light clients can be contaminated by more than a third of validators that are faulty and sign a forged header.
F4 cannot fool correct full nodes, as they know the current validator set. Similarly, an LCS knows who the validators are. Hence, F4 is an attack against LCBs that do not necessarily know the complete prefix of headers (Fork-Light), as they trust a header that is signed by at least one correct validator (trusting period method).
The following table gives an overview of how the different attacks may affect different nodes. F1-F3 are *on-chain* attacks so they can corrupt the state of full nodes. Then if a light client (LCS or LCB) contacts a full node to obtain headers (or blocks), the corrupted state may propagate to the light client.
F4 and F5 are *off-chain*, that is, these attacks cannot be used to corrupt the state of full nodes (which have sufficient knowledge on the state of the chain to not be fooled).
| Attack | FN | LCS | LCB |
|:------:|:------:|:------:|:------:|
| F1 | direct | indirect | indirect |
| F2 | direct | indirect | indirect |
| F3 | direct | indirect | indirect |
| F4 | | | direct |
| F5 | | | direct |
**Q:** Light clients are more vulnerable than full nodes, because the former only verify headers and do not execute transactions. What kind of certainty is gained by a full node that executes transactions?
As a full node verifies all transactions, it can only be contaminated by an attack if the blockchain itself violates its invariant (one block per height), that is, in case of a fork that leads to branching.
## Detailed Attack Scenarios
### Equivocation-based attacks
In this attack, faulty validators sign multiple vote messages (prevote and/or precommit) for different values in the same round of some height. This attack can be executed on both full nodes and light clients.
#### Scenario 1: Equivocation on the main chain
Validators:

* CA - a set of correct validators with less than 1/3 of the voting power
* CB - a set of correct validators with less than 1/3 of the voting power
* CA and CB are disjoint
* F - a set of faulty validators with more than 1/3 of the voting power

Execution:
* Validators from the set CA and CB prevote for A and B, respectively.
* Faulty validators from the set F prevote both for A and B.
* The faulty prevote messages
  * for A arrive at CA long before the B messages
  * for B arrive at CB long before the A messages
* Therefore correct validators from set CA and CB will observe more than 2/3 of prevotes for A and B and precommit for A and B, respectively.
* Faulty validators from the set F precommit both values A and B.
* Thus, we have more than 2/3 commits for both A and B.
Consequences:
* Creating evidence of misbehavior is simple in this case, as we have multiple messages signed by the same faulty processes for different values in the same round.
* We have to ensure that these different messages reach a correct process (full node, monitor?), which can submit evidence.
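To make the evidence point concrete, here is a minimal sketch of how a monitor might flag such a fork from two commits for the same height. The `SignedHeader` wrapper and the `BlockID.Hash` field are assumptions for illustration; the sketch builds on the `Header` and `Commit` types defined later in this document and is not the actual evidence API.

```go
import "bytes"

// SignedHeader pairs a header with the commit that signed it (assumed type).
type SignedHeader struct {
	Header *Header
	Commit *Commit
}

// DetectFork reports whether two validly signed headers for the same height
// commit different blocks, i.e. whether we are looking at a fork.
func DetectFork(a, b SignedHeader) bool {
	if a.Header.Height != b.Header.Height {
		return false // different heights cannot be a fork at one height
	}
	return !bytes.Equal(a.Commit.BlockID.Hash, b.Commit.BlockID.Hash)
}
```

Signatures by the same validator appearing in both commits are exactly the messages that make evidence creation simple in this scenario.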
#### Scenario 2: Equivocation to a light client (LCS)
Validators:

* a set F of faulty validators with more than 2/3 of the voting power.
Execution:
* For the main chain, F behaves nicely.
* F coordinates to sign a block B that is different from the one on the main chain.
* The light client obtains B and trusts it, as it is signed by more than 2/3 of the voting power.
In order to detect such an equivocation-based attack, the light client would need to cross-check its state with other full nodes (or directly with validators).
### Flip-flopping: Amnesia-based attacks
In case of amnesia, faulty validators lock some value *v* in some round *r*, and then vote for a different value *v'* in higher rounds without correctly unlocking value *v*. This attack can be used both on full nodes and light clients.
#### Scenario 3: At most 2/3 of faults
Validators:

* F - a set of faulty validators with more than 1/3 but at most 2/3 of the voting power
* C - a set of correct validators
Execution:
* Faulty validators commit (without exposing it on the main chain) a block A in round *r* by collecting more than 2/3 of the voting power (containing correct and faulty validators).
* All validators (correct and faulty) reach a round *r' > r*.
* Some correct validators in C do not lock any value before round *r'*.
*Remark.* In this case, the more than 1/3 of faulty validators do not need to commit an equivocation (F1), as they only vote once per round in the execution.
Detecting faulty validators in the case of such an attack can be done by the fork accountability mechanism described in: <https://docs.google.com/document/d/11ZhMsCj3y7zIZz4udO9l25xqb0kl7gmWqNpGVRzOeyY/edit?usp=sharing>.
If a light client is attacked using this attack with more than 1/3 of voting power (and less than 2/3), the attacker cannot change the application state arbitrarily. Rather, the attacker is limited to a state a correct validator finds acceptable: in the execution above, correct validators still find the value acceptable; however, the block the light client trusts deviates from the one on the main chain.
Consequences:
* The validators in F1 will be detectable by the fork accountability mechanisms.
* The validators in F2 cannot be detected using this mechanism. Only in case they signed something which conflicts with the application can this be used against them. Otherwise they do not do anything incorrect.
* This case is not covered by the report <https://docs.google.com/document/d/11ZhMsCj3y7zIZz4udO9l25xqb0kl7gmWqNpGVRzOeyY/edit?usp=sharing> as it only assumes at most 2/3 of faulty validators.
**Q:** Do we need to define a special kind of attack for the case where a validator signs an arbitrary state? It seems that detecting such an attack requires a different mechanism that would require as evidence a sequence of blocks that led to that state. This might be very tricky to implement.
### Back to the past
In this kind of attack, faulty validators take advantage of the fact that they did not sign messages in some of the past rounds. Due to the asynchronous network in which Tendermint operates, we cannot easily differentiate between such an attack and delayed messages. This kind of attack can be used at both full nodes and light clients.
#### Scenario 5
Validators:

* C1 - a set of correct validators with 1/3 of the voting power
* C2 - a set of correct validators with 1/3 of the voting power
* C1 and C2 are disjoint
* F - a set of faulty validators with less than 1/3 of the voting power
* one additional faulty process *q*
* F and *q* violate the Tendermint failure model.
Execution:
* in a round *r* of height *h*, C1 precommits a value A.
* F and *q* "go back to the past" and sign precommit messages for value A in round *r*.
* Together with the precommit messages of C1, this is sufficient for a commit for value A.
Consequences:
* Only a single faulty validator that previously precommitted nil committed equivocation, while the other 1/3 of faulty validators actually executed an attack that has exactly the same sequence of messages as part of an amnesia attack. Detecting this kind of attack boils down to the mechanisms for equivocation and amnesia.
### Phantom validators
In case of phantom validators, processes that are not part of the current validator set but are still bonded (as the attack happens during their unbonding period) can be part of the attack by signing vote messages. This attack can be executed against both full nodes and light clients.
#### Scenario 6
Validators:

* F -- a set of faulty validators that are not part of the validator set on the main chain at height *h + k*
Execution:
* There is a fork, and there exist two different headers for height *h + k*, with different validator sets:
  * VS2 on the main chain
  * forged header VS2', signed by F (and others)
* a light client trusts a header for height *h* (and the corresponding validator set VS1).
* As part of bisection header verification, it verifies the header at height *h + k* with the new validator set VS2'.
Consequences:

Any attack on the light client involving a phantom validator will have needed to be initiated by 1/3+ lunatic validators that can forge a new validator set that includes the phantom validator. Only in that case will the light client accept the phantom validator's vote. We need only worry about punishing the 1/3+ lunatic cabal, which is the root cause of the attack.
*Assumption*: In the following, we assume that *untrusted_h.Header.height > trusted_h.Header.height*. We will quickly discuss the other case in the next section.
We consider the following set-up:
- the light client communicates with one full node
- the light client locally stores all the headers that have passed basic verification and that are within the light client trust period. In the pseudo code below we write *Store.Add(header)* for this. If a header fails to verify, then the full node we are talking to is faulty and we should disconnect from it and reinitialise with a new peer.
- If `CanTrust` returns *error*, then the light client has seen a forged header or the trusted header has expired (it is outside its trusted period).
  - In case of a forged header, the full node is faulty, so the light client should disconnect and reinitialise with a new peer. If the trusted header has expired, we need to reinitialise the light client with a new trusted header (that is within its trusted period), but we don't necessarily need to disconnect from the full node we are talking to (as we haven't observed full node misbehavior in this case). These rules are sketched in code after this list.
## Correctness of the Light Client Protocols
### Definitions
* `TRUSTED_PERIOD`: trusted period
* for realtime `t`, the predicate `correct(v,t)` is true if the validator `v` follows the protocol until time `t` (we will see about recovery later).
* Validator fields. We will write a validator as a tuple `(v,p)` such that
  * `v` is the identifier (i.e., validator address; we assume identifiers are unique in each validator set)
  * `p` is its voting power
* For each header `h`, we write `trust(h) = true` if the light client trusts `h`.
### Failure Model
Formally,

\[
\sum_{(v,p) \in validators(h.NextValidatorsHash) \wedge correct(v,\, h.Time + TRUSTED\_PERIOD)} p >
2/3 \sum_{(v,p) \in validators(h.NextValidatorsHash)} p
\]
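To make the condition concrete, here is a small sketch under a simplified validator representation (an assumption for illustration, not the actual types):

```go
type Validator struct {
	Addr  string
	Power int64
}

// correctPowerExceedsTwoThirds reports whether validators that remain correct
// until h.Time + TRUSTED_PERIOD hold more than 2/3 of the set's voting power.
func correctPowerExceedsTwoThirds(vals []Validator, correct func(addr string) bool) bool {
	var total, correctPower int64
	for _, v := range vals {
		total += v.Power
		if correct(v.Addr) {
			correctPower += v.Power
		}
	}
	// correctPower/total > 2/3, computed without division
	return 3*correctPower > 2*total
}
```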
The light client communicates with a full node and learns new headers. The goal is to locally decide whether to trust a header. Our implementation needs to ensure the following two properties:
- *Light Client Completeness*: If a header `h` was correctly generated by an instance of Tendermint consensus (and its age is less than the trusted period), the light client should eventually set `trust(h)` to true.
*`VerifySingle` correctness arguments*
Light Client Accuracy:
- Assume by contradiction that `untrustedHeader` was not generated correctly and the light client sets trust to true because `verifySingle` returns without error.
- `trustedState` is trusted and sufficiently new.
- By the Failure Model, less than `1/3` of the voting power is held by faulty validators => at least one correct validator `v` has signed `untrustedHeader`.
- As `v` is correct up to now, it followed the Tendermint consensus protocol at least up to signing `untrustedHeader` => `untrustedHeader` was correctly generated.
We arrive at the required contradiction.
Light Client Completeness:
- The check is successful if sufficiently many validators of `trustedState` are still validators in the height `untrustedHeader.Height` and signed `untrustedHeader`.
- If `untrustedHeader.Height = trustedHeader.Height + 1`, and both headers were generated correctly, the test passes.
However, in case of (frequent) changes in the validator set, the higher the `trustThreshold` is chosen, the more unlikely it becomes that `verifySingle` returns with an error for non-adjacent headers.
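The core voting-power check behind `verifySingle` can be sketched as follows, reusing the simplified `Validator` type from the sketch above; `trustThreshold` is passed as a fraction `num/denom` (e.g. 1/3), and details of the real implementation may differ:

```go
// signedPowerAboveThreshold reports whether validators of the trusted state
// that signed the untrusted header hold more than num/denom of the total
// voting power of the trusted validator set.
func signedPowerAboveThreshold(trustedVals []Validator, signed map[string]bool, num, denom int64) bool {
	var total, signedPower int64
	for _, v := range trustedVals {
		total += v.Power
		if signed[v.Addr] {
			signedPower += v.Power
		}
	}
	// signedPower/total > num/denom, computed without division
	return denom*signedPower > num*total
}
```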
- Assume by contradiction that the header at `untrustedHeight` obtained from the full node was not generated correctly and the light client sets trust to true because `VerifyBisection` returns without an error.
- `VerifyBisection` returns without error only if all calls to `verifySingle` in the recursion return without error (return `nil`).
With `VerifyBisection`, a faulty full node could stall a light client by creating a long sequence of headers that are queried one-by-one by the light client and look OK, before the light client eventually detects a problem. There are several ways to address this:
- Each call to `Commit` could be issued to a different full node.
- Instead of querying header by header, the light client tells a full node which header it trusts, and the height of the header it needs. The full node responds with the header along with a proof consisting of intermediate headers that the light client can use to verify. Roughly, `VerifyBisection` would then be executed at the full node.
- We may set a timeout on how long `VerifyBisection` may take (see the sketch below).
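A minimal sketch of the timeout option; the `VerifyBisection` signature and `TrustedState` type are assumed here for illustration:

```go
import (
	"context"
	"time"
)

// verifyWithTimeout bounds how long VerifyBisection may take.
func verifyWithTimeout(trusted TrustedState, height int64, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	errCh := make(chan error, 1) // buffered so the goroutine never blocks
	go func() { errCh <- VerifyBisection(trusted, height) }()

	select {
	case err := <-errCh:
		return err // verification finished in time
	case <-ctx.Done():
		// the full node is stalling us: abort and switch to another peer
		return ctx.Err()
	}
}
```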
## Header

A block header contains metadata about the block and about the consensus, as well as commitments to the data in the current block, the previous block, and the results returned by the application:
```go
type Header struct {
	// basic block info
	Version Version
	ChainID string
	Height  int64
	Time    Time

	// prev block info
	LastBlockID BlockID

	// hashes of block data
	LastCommitHash []byte // commit from validators from the last block
	DataHash       []byte // MerkleRoot of transaction hashes

	// hashes from the app output from the prev block
	ValidatorsHash     []byte // validators for the current block
	NextValidatorsHash []byte // validators for the next block
	ConsensusHash      []byte // consensus params for current block
	AppHash            []byte // state after txs from the previous block
	LastResultsHash    []byte // root hash of all results from the txs from the previous block

	// consensus info
	EvidenceHash    []byte // evidence included in the block
	ProposerAddress []byte // original proposer of the block
}
```
Further details on each of these fields are described below.
## Version
```go
type Version struct {
	Block uint64
	App   uint64
}
```
## Data
Data is just a wrapper for a list of transactions, where transactions are arbitrary byte arrays:
```go
type Data struct {
	Txs [][]byte
}
```
## Commit

Commit is a simple wrapper for a list of signatures, with one for each validator. It also contains the relevant BlockID, height and round:
```go
type Commit struct {
	Height     int64
	Round      int
	BlockID    BlockID
	Signatures []CommitSig
}
```
## CommitSig

`CommitSig` represents a signature of a validator, who has voted either for nil, a particular `BlockID`, or was absent. It's a part of the `Commit` and can be used to reconstruct the vote set given the validator set.
```go
type BlockIDFlag byte

const (
	// BlockIDFlagAbsent - no vote was received from a validator.
	BlockIDFlagAbsent BlockIDFlag = 0x01
	// BlockIDFlagCommit - voted for the Commit.BlockID.
	BlockIDFlagCommit = 0x02
	// BlockIDFlagNil - voted for nil.
	BlockIDFlagNil = 0x03
)

type CommitSig struct {
	BlockIDFlag      BlockIDFlag
	ValidatorAddress Address
	Timestamp        time.Time
	Signature        []byte
}
```
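To illustrate the reconstruction, here is a sketch of turning one `CommitSig` back into a `Vote`; the `PrecommitType` constant is assumed for illustration, and type conversions are elided:

```go
// voteFromCommitSig rebuilds the vote cast by the validator at index idx.
func voteFromCommitSig(c *Commit, idx int) *Vote {
	cs := c.Signatures[idx]
	if cs.BlockIDFlag == BlockIDFlagAbsent {
		return nil // no vote was received from this validator
	}
	var blockID BlockID // zero value stands for a nil vote
	if cs.BlockIDFlag == BlockIDFlagCommit {
		blockID = c.BlockID
	}
	return &Vote{
		Type:             PrecommitType, // commits consist of precommit votes
		Height:           c.Height,
		Round:            c.Round,
		BlockID:          blockID,
		Timestamp:        cs.Timestamp,
		ValidatorAddress: cs.ValidatorAddress,
		ValidatorIndex:   idx,
		Signature:        cs.Signature,
	}
}
```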
## Vote

A vote is a signed message from a validator for a particular block. The vote includes information about the validator signing it.
```go
type Vote struct {
	Type             byte
	Height           int64
	Round            int
	BlockID          BlockID
	Timestamp        Time
	ValidatorAddress []byte
	ValidatorIndex   int
	Signature        []byte
}
```
## EvidenceData
EvidenceData is a simple wrapper for a list of evidence:
```go
type EvidenceData struct {
	Evidence []Evidence
}
```
## Evidence
Evidence in Tendermint is used to indicate breaches in the consensus by a validator. It is implemented as the following interface.
```go
type Evidence interface {
	Height() int64                                     // height of the equivocation
	Time() time.Time                                   // time of the equivocation
	Address() []byte                                   // address of the equivocating validator
	Bytes() []byte                                     // bytes which comprise the evidence
	Hash() []byte                                      // hash of the evidence
	Verify(chainID string, pubKey crypto.PubKey) error // verify the evidence
	Equal(Evidence) bool                               // check equality of evidence

	ValidateBasic() error
	String() string
}
```
All evidence can be encoded and decoded to and from Protobuf with the `EvidenceToProto()` and `EvidenceFromProto()` functions. The [Fork Accountability](../consensus/light-client/accountability.md) document provides a good overview of the types of evidence and how they occur. For evidence to be committed on-chain, it must adhere to the validation rules of each evidence type and must not be expired. The expiration age, measured in both block height and time, is set in `EvidenceParams`. Each piece of evidence uses the timestamp of the block at which the evidence occurred to indicate its age.
### DuplicateVoteEvidence
`DuplicateVoteEvidence` represents a validator that has voted for two different blocks in the same round of the same height. Votes are lexicographically sorted on `BlockID`.
```go
type DuplicateVoteEvidence struct {
	VoteA *Vote
	VoteB *Vote

	Timestamp time.Time
}
```
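The consistency checks implied by this description might look like the following sketch (`BlockID.Hash` is assumed; the full validation rules also cover signatures and ordering):

```go
import (
	"bytes"
	"errors"
)

// checkDuplicateVotes verifies the core equivocation conditions.
func checkDuplicateVotes(e *DuplicateVoteEvidence) error {
	a, b := e.VoteA, e.VoteB
	if a.Height != b.Height || a.Round != b.Round || a.Type != b.Type {
		return errors.New("votes are not for the same height, round, and type")
	}
	if !bytes.Equal(a.ValidatorAddress, b.ValidatorAddress) {
		return errors.New("votes are not from the same validator")
	}
	if bytes.Equal(a.BlockID.Hash, b.BlockID.Hash) {
		return errors.New("votes are for the same block: no equivocation")
	}
	return nil
}
```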
### AmnesiaEvidence
`AmnesiaEvidence` represents a validator that has incorrectly voted for another block in a different round to the block that the validator was previously locked on. This form of evidence is generated differently from the rest. See this [ADR](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-056-proving-amnesia-attacks.md) for more information.
```go
type AmnesiaEvidence struct {
	*PotentialAmnesiaEvidence
	Polc *ProofOfLockChange
}
```
### LunaticValidatorEvidence
`LunaticValidatorEvidence` represents a validator that has signed for an arbitrary application state. This attack only applies to light clients.
```go
type LunaticValidatorEvidence struct {
	Header *Header
	Vote   *Vote

	InvalidHeaderField string

	Timestamp time.Time
}
```
The data structures used are illustrated below.

#### BlockchainReactor
```go
type BlockchainReactor struct {
	p2p.BaseReactor

	initialState sm.State // immutable
	state        sm.State

	blockExec *sm.BlockExecutor
	store     *store.BlockStore

	fastSync bool

	fsm          *BcReactorFSM
	blocksSynced int

	// Receive goroutine forwards messages to this channel to be processed in the context of the poolRoutine.
	messagesForFSMCh chan bcReactorMessage

	// Switch goroutine may send RemovePeer to the blockchain reactor. This is an error message that is relayed
	// to this channel to be processed in the context of the poolRoutine.
	errorsForFSMCh chan bcReactorMessage

	// This channel is used by the FSM and indirectly the block pool to report errors to the blockchain reactor and
	// the switch.
	eventsFromFSMCh chan bcFsmMessage
}
```
#### BcReactorFSM
- implements a simple finite state machine.
- has a state and a state timer.
- has a `BlockPool` to keep track of block requests sent to peers and blocks received from peers.
```go
type BcReactorFSM struct {
	logger log.Logger
	mtx    sync.Mutex

	startTime time.Time

	state      *bcReactorFSMState
	stateTimer *time.Timer
	pool       *BlockPool

	// interface used to call the Blockchain reactor to send StatusRequest, BlockRequest, reporting errors, etc.
	toBcR bcReactor
}
```
#### BlockPool
- maintains a peer set, implemented as a map of peer ID to `BpPeer`.
- maintains a set of requests made to peers, implemented as a map of block request heights to peer IDs.
- maintains a list of future block requests needed to advance the fast-sync. This is a list of block heights.
- keeps track of the maximum height of the peers in the set.
- uses an interface to send requests and report errors to the reactor (via FSM).
```go
type BlockPool struct {
	logger log.Logger
	// Set of peers that have sent status responses, with height bigger than pool.Height
	peers map[p2p.ID]*BpPeer
	// Set of block heights and the corresponding peers from where a block response is expected or has been received.
	blocks map[int64]p2p.ID

	plannedRequests   map[int64]struct{} // list of blocks to be assigned peers for blockRequest
	nextRequestHeight int64              // next height to be added to plannedRequests

	Height        int64 // height of next block to execute
	MaxPeerHeight int64 // maximum height of all peers
	toBcR         bcReactor
}
```
Some reasons for the `BlockPool` data structure content:
1. If a peer is removed by the switch, fast access is required to the peer and the block requests made to that peer, in order to redo them.
2. When block verification fails, fast access is required from the block height to the peer and the block requests made to that peer, in order to redo them.
3. The `BlockchainReactor` main routine decides when the block pool is running low and asks the `BlockPool` (via FSM) to make more requests. The `BlockPool` creates a list of requests and triggers the sending of the block requests (via the interface). The reason it maintains a list of requests is the redo operations that may occur during error handling. These are redone when the `BlockchainReactor` requires more blocks.
#### BpPeer
- keeps track of a single peer, with height bigger than the initial height.
- maintains the block requests made to the peer and the blocks received from the peer until they are executed.
- monitors the peer speed when there are pending requests.
```go
type BpPeer struct {
	logger log.Logger
	ID     p2p.ID

	Height                  int64                  // the peer reported height
	NumPendingBlockRequests int                    // number of requests still waiting for block responses
	blocks                  map[int64]*types.Block // blocks received or expected to be received from this peer
	blockResponseTimer      *time.Timer
	recvMonitor             *flow.Monitor
	params                  *BpPeerParams // parameters for timer and monitor

	onErr func(err error, peerID p2p.ID) // function to call on error
}
```
The diagram below shows the goroutines (depicted by the gray blocks), timers (shown on the left with their values) and channels (colored rectangles). The FSM box shows some of the functionality and is not a separate goroutine.
The interface used by the FSM is shown in light red with the `IF` block. This is used to:
- send block requests
- report peer errors to the switch - this results in the reactor calling `switch.StopPeerForError()` and, if triggered by the peer timeout routine, a `removePeerEv` is sent to the FSM and action is taken from the context of the `poolRoutine()`
- ask the reactor to reset the state timers. The timers are owned by the FSM while the timeout routine is defined by the reactor. This was done in order to avoid running timers in tests and will change in the next revision.
There are two main goroutines implemented by the blockchain reactor. All I/O operations are performed from the `poolRoutine()` context while the CPU intensive operations related to the block execution are performed from the context of the `executeBlocksRoutine()`. All goroutines are detailed in the next sections.
#### Receive()

Fast-sync messages from peers are received by this goroutine. It performs basic validation and:
- in helper mode (i.e. for request messages) it replies immediately. This is different from the proposal in adr-040, which specifies having the FSM handle these.
- forwards response messages to the `poolRoutine()`.
#### poolRoutine()
(name kept as in the previous reactor).
It starts the `executeBlocksRoutine()` and the FSM. It then waits in a loop for events. These are received from the following channels:
- `sendBlockRequestTicker.C` - every 10 msec the reactor asks the FSM to make more block requests up to a maximum. Note: currently this value is constant but could be changed based on low/high watermark thresholds for the number of blocks received and waiting to be processed, the number of blockResponse messages waiting in `messagesForFSMCh`, etc.
- `statusUpdateTicker.C` - every 10 seconds the reactor broadcasts status requests to peers. While adr-040 specifies this to run within the FSM, at this point this functionality is kept in the reactor.
- `messagesForFSMCh` - the `Receive()` goroutine sends status and block response messages to this channel and the reactor calls the FSM to handle them.
- `errorsForFSMCh` - this channel receives the following events:
  - peer remove - when the switch removes a peer
  - state timeout event - when FSM state timers trigger
  The reactor forwards these messages to the FSM.
- `eventsFromFSMCh` - there are two types of events sent over this channel:
  - `syncFinishedEv` - triggered when the FSM enters the `finished` state and calls the `switchToConsensus()` interface function.
  - `peerErrorEv` - the peer timer expiry goroutine sends this event over the channel for processing from the `poolRoutine()` context.

These events are dispatched in a single select loop, sketched below.
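In this sketch, `askFsmForMoreRequests`, `broadcastStatusRequest`, and `processFsmEvent` are stand-ins for the real handlers, not the actual reactor API; error handling and shutdown are elided:

```go
import "time"

func (bcR *BlockchainReactor) poolRoutineSketch() {
	sendBlockRequestTicker := time.NewTicker(10 * time.Millisecond)
	statusUpdateTicker := time.NewTicker(10 * time.Second)
	for {
		select {
		case <-sendBlockRequestTicker.C:
			bcR.askFsmForMoreRequests() // up to a maximum number of pending requests
		case <-statusUpdateTicker.C:
			bcR.broadcastStatusRequest() // kept in the reactor, not the FSM
		case msg := <-bcR.messagesForFSMCh:
			bcR.fsm.Handle(msg) // status and block responses from Receive()
		case errMsg := <-bcR.errorsForFSMCh:
			bcR.fsm.Handle(errMsg) // peer removals and state timeouts
		case event := <-bcR.eventsFromFSMCh:
			bcR.processFsmEvent(event) // syncFinishedEv or peerErrorEv
		}
	}
}
```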
#### executeBlocksRoutine()
Started by the `poolRoutine()`, it retrieves blocks from the pool and executes them:
- `processReceivedBlockTicker.C` - a ticker event is received over the channel every 10 msec and its handling results in a signal being sent to the `doProcessBlockCh` channel.
- `doProcessBlockCh` - events are received on this channel as described above and upon processing, blocks are retrieved from the pool and executed.
### FSM
![fsm](img/bc-reactor-new-fsm.png)
#### States
##### init (aka unknown)
The FSM is created in the `unknown` state. When started by the reactor (`startFSMEv`), it broadcasts Status requests and transitions to the `waitForPeer` state.
##### waitForPeer
In this state, the FSM waits for Status responses from a "tall" peer. A timer is running in this state to allow the FSM to finish if there are no useful peers.
If the timer expires, it moves to the `finished` state and calls the reactor to switch to consensus.
If a Status response is received from a peer within the timeout, the FSM transitions to the `waitForBlock` state.
##### waitForBlock
In this state the FSM makes Block requests (triggered by a ticker in the reactor) and waits for Block responses. There is a timer running in this state to detect if a peer is not sending the block at the current processing height. If the timer expires, the FSM removes the peer where the request was sent and all requests made to that peer are redone.
As blocks are received they are stored by the pool. Block execution is independently performed by the reactor and the result reported to the FSM:

- if there are no errors, the FSM increases the pool height and resets the state timer.
- if there are errors, the peers that delivered the two blocks (at height and height+1) are removed and the requests redone.
In this state the FSM may receive peer remove events in any of the following scenarios:

- the switch is removing a peer
- a peer is penalized because it has not responded to some block requests for a long time
- a peer is penalized for being slow
When processing of the last block (the one with height equal to the highest peer height) is successful, the FSM transitions to the `finished` state.
If after a peer update or removal the pool height is the same as `maxPeerHeight`, the FSM transitions to the `finished` state.
##### finished
When entering this state, the FSM calls the reactor to switch to consensus and performs cleanup.
#### Events
The following events are handled by the FSM:
```go
const (
	startFSMEv = iota + 1
	statusResponseEv
	blockResponseEv
	processedBlockEv
	makeRequestsEv
	stopFSMEv
	peerRemoveEv = iota + 256
	stateTimeoutEv
)
```
### Examples of Scenarios and Termination Handling
A few scenarios are covered in this section together with the current/proposed handling. In general, the scenarios involving faulty peers are made worse by the fact that they may quickly be re-added.
# Proposer Selection Procedure

This document specifies the Proposer Selection Procedure that is used in Tendermint to choose a round proposer.
As Tendermint is a “leader-based protocol”, the proposer selection is critical for its correct functioning.
At a given block height, the proposer selection algorithm runs with the same validator set at each round.
Between heights, an updated validator set may be specified by the application as part of the ABCIResponses' EndBlock.
## Requirements for Proposer Selection
This section covers the requirements, with Rx being mandatory and Ox optional requirements.
The following requirements must be met by the Proposer Selection procedure:
### R1: Determinism
Given a validator set `V`, and two honest validators `p` and `q`, for each height `h` and each round `r` the following must hold:

`proposer_p(h,r) = proposer_q(h,r)`

where `proposer_p(h,r)` is the proposer returned by the Proposer Selection Procedure at process `p`, at height `h` and round `r`.
### R2: Fairness
Given a validator set with total voting power P and a sequence S of elections, in any sub-sequence of S with length C*P a validator v must be elected as proposer once in every P/VP(v) elections on average, i.e. with frequency:
f(v) ~ VP(v) / P
where C is a tolerance factor for validator set changes with the following values:
- C == 1 if there are no validator set changes
- C ~ k when there are validator changes
*[this needs more work]*
## Basic Algorithm
At its core, the proposer selection procedure uses a weighted round-robin algorithm.
A model that gives a good intuition on how and why the selection algorithm works, and why it is fair, is that of a priority queue. The validators move ahead in this queue according to their voting power (the higher the voting power, the faster a validator moves towards the head of the queue). When the algorithm runs, the following happens:
- all validators move "ahead" according to their powers: for each validator, increase the priority by the voting power
- first in the queue becomes the proposer: select the validator with highest priority
- move the proposer back in the queue: decrease the proposer's priority by the total voting power
Notation:
- vset - the validator set
- n - the number of validators
- VP(i) - voting power of validator i
- A(i) - accumulated priority for validator i
- P - total voting power of the set
- prop - proposer
A simple view of the Selection Algorithm:
```md
def ProposerSelection (vset):

    // compute priorities and elect proposer
    for each validator i in vset:
        A(i) += VP(i)
    prop = max(A)
    A(prop) -= P
```
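A direct Go transcription of this pseudocode, illustrative only; ties are broken here by the lowest index, which is an assumption (the real implementation breaks ties deterministically by other means):

```go
type Val struct {
	Name     string
	VP       int64 // voting power
	Priority int64 // A(i)
}

// ProposerSelection advances every priority by its voting power, elects the
// validator with the highest priority, and moves it back by the total power P.
func ProposerSelection(vset []Val) int {
	var P int64
	for i := range vset {
		vset[i].Priority += vset[i].VP
		P += vset[i].VP
	}
	prop := 0
	for i := range vset {
		if vset[i].Priority > vset[prop].Priority {
			prop = i
		}
	}
	vset[prop].Priority -= P
	return prop
}
```

Running this on the set considered in the next section (p1 with power 1, p2 with power 3) yields the periodic proposer sequence p2, p1, p2, p2, matching the VP(v)/P frequencies required by R2.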
## Stable Set

Consider the validator set:

Validator | p1| p2
----------|---|---
VP        | 1 | 3

Assuming no validator changes, the following table shows the proposer priority computation over a few runs. Four runs of the selection procedure are shown; starting with the 5th run, the same values are computed.

Each row shows the priority queue and each process's place in it. The proposer is the closest to the head, the rightmost validator. As priorities are updated, the validators move right in the queue. The proposer moves left as its priority is reduced after election.
| | | |p1,p2| | | | | |A(p2)-= P
It can be shown that:

- At the end of each run k+1 the sum of the priorities is the same as at the end of run k. If a new set's priorities are initialized to 0, then the sum of priorities will be 0 at each run while there are no changes.
- The max distance between priorities is (n-1) * P. *[formal proof not finished]*
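Tracing the four runs for this set (P = 4, n = 2; the run-2 tie is broken toward p1 here for illustration) makes both claims concrete:

```md
run 1: A = (1, -1)    sum = 0, spread = 2
run 2: A = (-2, 2)    sum = 0, spread = 4 = (n-1) * P
run 3: A = (-1, 1)    sum = 0, spread = 2
run 4: A = (0, 0)     sum = 0, spread = 0
```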
## Validator Set Changes

Between proposer selection runs the validator set may change. Some changes have implications on the proposer election.

### Voting Power Change

Consider again the earlier example and assume that the voting power of p1 is changed to 4:

Validator | p1| p2
----------|---|---
VP        | 4 | 3

Let's also assume that before this change the proposer priorities were as shown in the first row (last run). As can be seen, the selection could run again, without changes, as before.
However, when a validator changes power from a high to a low value, some other validators may remain far back in the queue for a long time. This scenario is considered again in the Proposer Priority Range section.

As before:

- At the end of each run k+1 the sum of the priorities is the same as at run k.
- The max distance between priorities is (n-1) * P.
### Validator Removal

Consider a new example with set:

Validator | p1 | p2 | p3 |
--------- |--- |--- |--- |
VP        | 1  | 2  | 3  |

Let's assume that after the last run the proposer priorities were as shown in the first row, with their sum being 0. After p2 is removed, at the end of the next proposer selection run (penultimate row) the sum of priorities is -2 (minus the priority of the removed process).

The procedure could continue without modifications. However, after a sufficiently large number of modifications in the validator set, the priority values would migrate towards the maximum or minimum allowed values, causing truncations due to overflow detection.

For this reason, the selection procedure adds another __new step__ that centers the current priority values such that the priority sum remains close to 0.
The modified selection algorithm is:

```md
def ProposerSelection (vset):

    // center priorities around zero
    avg = sum(A(i) for i in vset)/len(vset)
    for each validator i in vset:
        A(i) -= avg

    // compute priorities and elect proposer
    for each validator i in vset:
        A(i) += VP(i)
    prop = max(A)
    A(prop) -= P
```
Observations:

- The sum of priorities is now close to 0. Due to integer division the sum is an integer in (-n, n), where n is the number of validators.
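As a small worked example of that integer-division remainder (priority values chosen for illustration):

```md
A = (1, 2, 4), n = 3
avg = (1 + 2 + 4)/3 = 2      (integer division of 7/3)
A - avg = (-1, 0, 2)         sum = 1, which lies in (-n, n) = (-3, 3)
```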
### New Validator

When a new validator is added, the same problem as the one described for removal appears: the sum of priorities in the new set is not zero. This is fixed with the centering step introduced above.

One other issue that needs to be addressed is the following. A validator V that has just been elected is moved to the end of the queue. If the validator set is large and/or other validators have significantly higher power, V will have to wait many runs to be elected. If V removes and re-adds itself to the set, it would make a significant (albeit unfair) "jump" ahead in the queue.

In order to prevent this, when a new validator is added, its initial priority is set to:

```md
A(V) = -1.125 * P
```

where P is the total voting power of the set including V.
If we consider the validator set:

Validator | p1 | p2 | p3
----------|----|----|----
VP        | 1  | 3  | 8

then p3 will start with proposer priority:

```md
A(p3) = -1.125 * (1 + 3 + 8) ~ -13
```

Note that since the current computation uses integer division, part of the penalty is lost when the total voting power of the set is less than 8.
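A short Go sketch of this computation, under the assumption that the 1.125 factor is realized as `P + P/8` in integer arithmetic (an assumed formulation for illustration; the actual implementation may differ in detail):

```go
package main

import "fmt"

// initialPriority returns -1.125 * P using integer arithmetic, with the
// 1.125 factor expressed as P + P/8 (assumed formulation for illustration).
func initialPriority(totalPower int64) int64 {
	return -(totalPower + totalPower/8)
}

func main() {
	fmt.Println(initialPriority(12)) // -(12 + 1) = -13, the A(p3) example above
	fmt.Println(initialPriority(7))  // -(7 + 0) = -7: the 12.5% penalty truncates away for P < 8
}
```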
In the next run, p3 will still be ahead in the queue, elected as proposer and moved back in the queue.

| | | | | | p3 | | | | p2| | p1|A(i)+=VP(i)
| | | | p1 | | p3 | | | | p2| | |A(p1)-=P
## Proposer Priority Range

With the introduction of centering, some interesting cases occur. Low power validators that join early in a set that includes high power validator(s) benefit from subsequent additions to the set. This is because these early validators run through more right shift operations during centering, operations that increase their priority.

As an example, consider the set where p2 is added after p1, with priority -1.125 * 80k = -90k. After the selection procedure runs once:
Then execute the following steps:

1. Add a new validator p3:

   Validator | p1  | p2 | p3
   ----------|-----|----|----
   VP        | 80k | 10 | 10

2. Run selection once. The notation '..p'/'p..' means very small deviations compared to the column priority.

At this point, while the total voting power is 20, the distance between priorities is 45k. It will take 4500 runs for p3 to catch up with p2.
In order to prevent these types of scenarios, the selection algorithm performs scaling of priorities such that the difference between min and max values is smaller than two times the total voting power.

The modified selection algorithm is:
```md
def ProposerSelection (vset):

    // scale the priority values
    diff = max(A) - min(A)
    threshold = 2 * P
    if diff > threshold:
        scale = diff/threshold
        for each validator i in vset:
            A(i) = A(i)/scale

    // center priorities around zero
    avg = sum(A(i) for i in vset)/len(vset)
    for each validator i in vset:
        A(i) -= avg

    // compute priorities and elect proposer
    for each validator i in vset:
        A(i) += VP(i)
    prop = max(A)
    A(prop) -= P
```
Observations:

- With this modification, the maximum distance between priorities becomes 2 * P.

Note also that even during steady state the priority range may increase beyond 2 * P. The scaling introduced here helps to keep the range bounded.
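Putting the three phases together, a self-contained Go sketch of the full procedure might look as follows (again with illustrative types and tie-breaking, not the actual tendermint/tendermint code):

```go
package main

import "fmt"

// Validator mirrors the notation above: VotingPower is VP(i), Priority is A(i).
type Validator struct {
	Name        string
	VotingPower int64
	Priority    int64
}

// proposerSelection is an illustrative sketch of the full procedure
// (scale, center, elect), not the actual implementation.
func proposerSelection(vset []*Validator) *Validator {
	var total int64
	for _, v := range vset {
		total += v.VotingPower
	}

	// scale the priority values so that max(A) - min(A) stays bounded by 2 * P
	lo, hi := vset[0].Priority, vset[0].Priority
	for _, v := range vset[1:] {
		if v.Priority < lo {
			lo = v.Priority
		}
		if v.Priority > hi {
			hi = v.Priority
		}
	}
	if diff, threshold := hi-lo, 2*total; diff > threshold {
		scale := diff / threshold
		for _, v := range vset {
			v.Priority /= scale
		}
	}

	// center priorities around zero
	var sum int64
	for _, v := range vset {
		sum += v.Priority
	}
	avg := sum / int64(len(vset))
	for _, v := range vset {
		v.Priority -= avg
	}

	// compute priorities and elect proposer
	prop := vset[0]
	for _, v := range vset {
		v.Priority += v.VotingPower
		if v.Priority > prop.Priority {
			prop = v
		}
	}
	prop.Priority -= total // A(prop) -= P
	return prop
}

func main() {
	// the example above: p2 was added with priority -1.125 * 80k = -90k
	vset := []*Validator{{"p1", 80000, 0}, {"p2", 10, -90000}}
	fmt.Println(proposerSelection(vset).Name) // prints: p1
}
```

Note that integer division in Go truncates toward zero, so the centering step can leave a small nonzero sum, as discussed in the Observations above.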
## Wrinkles

### Validator Power Overflow Conditions

The validator voting power is a positive number stored as an int64. When a validator is added, the `1.125 * P` computation must not overflow. As a consequence, the code handling validator updates (add and update) checks for overflow conditions, making sure the total voting power is never larger than the largest int64 `MAX`, with the property that `1.125 * MAX` is still in the bounds of int64. A fatal error is returned when an overflow condition is detected.
The proposer priority is stored as an int64. The selection algorithm performs additions and subtractions to these values and in the case of overflows and underflows it limits the values to:

```go
MaxInt64 = 1<<63 - 1
MinInt64 = -1 << 63
```
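One way to express this clipping behavior is a helper along the following lines (a sketch; the helper name and exact checks are illustrative, not necessarily the exact helper in the implementation):

```go
package main

import (
	"fmt"
	"math"
)

// safeAddClip adds two int64 priorities and clips to the int64 range
// instead of wrapping around on overflow or underflow.
func safeAddClip(a, b int64) int64 {
	if b > 0 && a > math.MaxInt64-b {
		return math.MaxInt64 // overflow: clip to the maximum
	}
	if b < 0 && a < math.MinInt64-b {
		return math.MinInt64 // underflow: clip to the minimum
	}
	return a + b
}

func main() {
	fmt.Println(safeAddClip(math.MaxInt64-1, 10))  // clips at MaxInt64
	fmt.Println(safeAddClip(math.MinInt64+1, -10)) // clips at MinInt64
}
```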
## Requirement Fulfillment Claims

__[R1]__

The proposer algorithm is deterministic, giving consistent results across executions with the same transactions and validator set modifications.

[WIP - needs more detail]

__[R2]__

Given a set of processes with the total voting power P, during a sequence of elections of length P, the number of times any process is selected as proposer is equal to its voting power. The sequence of the P proposers then repeats. If we consider the validator set:
Validator | p1| p2
----------|---|---
VP        | 1 | 3
Intuitively, a process v jumps ahead in the queue at most (max(A) - min(A))/VP(v) times until it reaches the head and is elected. The frequency is then:
```md
f(v) ~ VP(v)/(max(A) - min(A)) = 1/k * VP(v)/P
```

For the current implementation, this means v should be proposer at least VP(v) times out of k * P runs, with scaling factor k=2.
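This claim can be checked empirically with a small simulation over the toy set used earlier (illustrative code, not the real implementation):

```go
package main

import "fmt"

func main() {
	// the validator set above: VP(p1) = 1, VP(p2) = 3, so P = 4
	names := []string{"p1", "p2"}
	powers := []int64{1, 3}
	prios := []int64{0, 0}
	var total int64
	for _, vp := range powers {
		total += vp
	}

	k := int64(2) // the scaling factor of the current implementation
	counts := make([]int64, len(names))
	for run := int64(0); run < k*total; run++ {
		// basic selection step: A(i) += VP(i); elect max; A(prop) -= P
		prop := 0
		for i := range prios {
			prios[i] += powers[i]
			if prios[i] > prios[prop] {
				prop = i
			}
		}
		prios[prop] -= total
		counts[prop]++
	}

	for i, name := range names {
		// the claim: counts[i] >= VP(i) over k * P runs
		fmt.Printf("%s elected %d times (VP = %d)\n", name, counts[i], powers[i])
	}
}
```

Over the k * P = 8 runs, p1 is elected twice and p2 six times, satisfying the bound of at least VP(v) elections each.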