RFC: ReverseSync - fetching historical data (#224)

3 years ago · b39af911ae
--- a/rfc/005-reverse-sync.md
+++ b/rfc/005-reverse-sync.md
@ -0,0 +1,197 @@
 # RFC 005: ReverseSync - fetching historical data

 ## Changelog

 - 2021-04-19: Use P2P to gossip necessary data for reverse sync.
 - 2021-03-03: Simplify proposal to the state sync case.
 - 2021-02-17: Add notes on asynchronicity of processes.
 - 2020-12-10: Rename backfill blocks to reverse sync.
 - 2020-11-25: Initial draft.

 ## Author(s)

 - Callum Waters (@cmwaters)

 ## Context

 Two new features, [Block pruning](https://github.com/tendermint/tendermint/issues/3652)
 and [State sync](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-042-state-sync.md),
 meant nodes no longer needed a complete history of the blockchain. This
 introduced some challenges of its own, which were covered and subsequently
 tackled in [RFC-001](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md).
 That RFC allowed applications to set a block retention height: an upper bound on
 which blocks may be pruned. However, nodes that state sync past this upper bound
 (which is necessary, as snapshots must be saved within the trusting period for
 the assisting light client to verify) have no means of backfilling the blocks
 to meet the retention limit. This can be a problem, as nodes that state sync and
 then eventually switch to consensus (or fast sync) may not have the block and
 validator history needed to verify evidence, causing them to panic if they see
 a 2/3+ commit on what the node believes to be an invalid block.

 Thus, this RFC sets out to instil a minimum block history invariant amongst
 honest nodes.

 ## Proposal

 A backfill mechanism can be defined simply as an algorithm for fetching,
 verifying and storing the headers and validator sets of heights prior to the
 current base of the node's blockchain. To match the terminology used for
 other data retrieval protocols (e.g. fast sync and state sync), we
 call this method **ReverseSync**.

 We will define the mechanism in four sections:
 - Usage
 - Design
 - Verification
 - Termination

 ### Usage

 For now, we focus purely on the case of a state syncing node which, after
 syncing to a height, needs to verify historical data in order to be capable
 of processing new blocks. We denote the earliest height that the node
 needs to verify and store, in order to be able to verify any evidence that might
 arise, as `max_historical_height`, with a corresponding `max_historical_time`.
 Both height and time are necessary because evidence expiration is measured in
 both block height and BFT time. After acquiring `State`, we calculate these
 parameters as:

 ```go
 max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight)
 max_historical_time = max(GenesisTime, state.LastBlockTime.Sub(state.ConsensusParams.EvidenceAgeTime))
 ```
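The pseudocode above glosses over two Go details: `time.Time.Sub` takes another `Time` rather than a duration, and there is no built-in `max` for times. A minimal runnable sketch of the same calculation follows. The `State` struct, its flattened fields, `historicalBounds` and `demoState` are hypothetical stand-ins that mirror the pseudocode names, not real Tendermint types.

```go
package main

import (
	"fmt"
	"time"
)

// State is a hypothetical, trimmed-down stand-in for the node's state.
// Field names mirror the pseudocode above, not a real Tendermint type.
type State struct {
	InitialHeight     int64
	GenesisTime       time.Time
	LastBlockHeight   int64
	LastBlockTime     time.Time
	EvidenceAgeHeight int64         // evidence expiry measured in blocks
	EvidenceAgeTime   time.Duration // evidence expiry measured in BFT time
}

// historicalBounds returns the height and time that reverse sync must reach.
// Both values are clamped so that they never precede genesis.
func historicalBounds(s State) (int64, time.Time) {
	height := s.LastBlockHeight - s.EvidenceAgeHeight
	if height < s.InitialHeight {
		height = s.InitialHeight
	}
	// Adding a negative duration subtracts it from the timestamp.
	ts := s.LastBlockTime.Add(-s.EvidenceAgeTime)
	if ts.Before(s.GenesisTime) {
		ts = s.GenesisTime
	}
	return height, ts
}

// demoState builds an arbitrary example state for illustration only.
func demoState() State {
	genesis := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
	return State{
		InitialHeight:     1,
		GenesisTime:       genesis,
		LastBlockHeight:   1000,
		LastBlockTime:     genesis.Add(30 * 24 * time.Hour),
		EvidenceAgeHeight: 100,
		EvidenceAgeTime:   48 * time.Hour,
	}
}

func main() {
	height, ts := historicalBounds(demoState())
	fmt.Println(height, ts.Format(time.RFC3339))
}
```

With the example values, the height bound is 100 blocks below the last block and the time bound is 48 hours before the last block time. A state close to genesis would clamp to `InitialHeight` and `GenesisTime` instead.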

 Before starting either fast sync or consensus, we then run the following
 synchronous process:

 ```go
 func ReverseSync(max_historical_height int64, max_historical_time time.Time) error
 ```

 This process fetches and verifies blocks, working backwards from the node's
 current base, until it reaches a block `A` where
 `A.Height <= max_historical_height` and `A.Time <= max_historical_time`.

 Upon successfully reverse syncing, a node can now safely continue. As this
 feature is only used as part of state sync, one can think of this as merely an
 extension to it.

 In the future we may want to extend this functionality to allow nodes to fetch
 historical blocks for reasons of accountability or data accessibility.

 ### Design

 This section provides a high-level overview of the more important
 characteristics of the design, leaving the finer details to an ADR.

 #### P2P

 Implementation of this RFC will require the addition of a new channel and two
 new messages.

 ```proto
 message LightBlockRequest {
  uint64 height = 1;
 }
 ```

 ```proto
 message LightBlockResponse {
  Header header = 1;
  Commit commit = 2;
  ValidatorSet validator_set = 3;
 }
 ```
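As a rough illustration of how a peer might serve these messages, the sketch below mirrors them as plain Go structs and answers a request from an in-memory map. Everything here is hypothetical: real types would be generated from the `.proto` definitions, `lightBlockStore`, `handleRequest` and `demoStore` are invented names, and how a peer signals that it does not have a block is an assumption that the messages above do not specify.

```go
package main

import "fmt"

// Hypothetical Go mirrors of the proposed messages, trimmed for illustration.
type Header struct{ Height uint64 }
type Commit struct{ Height uint64 }
type ValidatorSet struct{ Validators []string }

type LightBlockRequest struct{ Height uint64 }

type LightBlockResponse struct {
	Header       *Header
	Commit       *Commit
	ValidatorSet *ValidatorSet
}

// lightBlockStore is a hypothetical in-memory view of what a peer has stored.
type lightBlockStore map[uint64]LightBlockResponse

// handleRequest looks up the requested height. The boolean reports whether
// the peer has the light block (an assumed convention, see the note above).
func handleRequest(store lightBlockStore, req LightBlockRequest) (LightBlockResponse, bool) {
	resp, ok := store[req.Height]
	return resp, ok
}

// demoStore returns a store holding a single light block at height 42.
func demoStore() lightBlockStore {
	return lightBlockStore{
		42: {
			Header:       &Header{Height: 42},
			Commit:       &Commit{Height: 42},
			ValidatorSet: &ValidatorSet{Validators: []string{"val-a", "val-b"}},
		},
	}
}

func main() {
	store := demoStore()
	for _, h := range []uint64{42, 41} {
		resp, ok := handleRequest(store, LightBlockRequest{Height: h})
		if !ok {
			fmt.Printf("height %d: not available\n", h)
			continue
		}
		fmt.Printf("height %d: header at %d with %d validators\n",
			h, resp.Header.Height, len(resp.ValidatorSet.Validators))
	}
}
```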

 The P2P path may also enable P2P-networked light clients and a state sync that
 does not need to rely on RPC.

 ### Verification

 ReverseSync is used to fetch the following data structures:
 - `Header`
 - `Commit`
 - `ValidatorSet`

 Nodes must also be able to verify these. Verification starts by retrieving the
 header at the base height from the block store. Working backwards from this
 trusted header, the node hashes each of the three fetched data structures and
 checks the results against the hashes recorded in the headers:

 1. The trusted header's last block ID matches the hash of the new header 
 ```go
 header[height].LastBlockID.Hash == hash(header[height-1])
 ```

 2. The trusted header's last commit hash matches the hash of the new commit
 ```go
 header[height].LastCommitHash == hash(commit[height-1])
 ```

 3. Given that the node now trusts the new header, check that the header's validator set
 hash matches the hash of the validator set
 ```go
 header[height-1].ValidatorsHash == hash(validatorSet[height-1])
 ```
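The three checks can be combined into a single step that extends trust from the header at `height` to the light block at `height-1`. The sketch below is illustrative only: it uses SHA-256 over an ad hoc encoding as a stand-in for Tendermint's real hashing, treats the commit and validator set as opaque bytes, and the names `verifyPrevious`, `hashHeader` and `demoPair` are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// Header is a simplified stand-in holding only the fields the checks need.
type Header struct {
	Height         int64
	LastBlockID    []byte
	LastCommitHash []byte
	ValidatorsHash []byte
}

// LightBlock bundles the three fetched structures. Commit and ValidatorSet
// are opaque encoded bytes here, purely for illustration.
type LightBlock struct {
	Header       Header
	Commit       []byte
	ValidatorSet []byte
}

// hash is a stand-in for the real hashing scheme.
func hash(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// hashHeader hashes an ad hoc encoding of the header fields.
func hashHeader(h Header) []byte {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "%d|%x|%x|%x", h.Height, h.LastBlockID, h.LastCommitHash, h.ValidatorsHash)
	return hash(buf.Bytes())
}

// verifyPrevious applies checks 1 to 3 to the light block directly below
// the trusted header.
func verifyPrevious(trusted Header, prev LightBlock) error {
	if prev.Header.Height != trusted.Height-1 {
		return errors.New("light block is not directly below the trusted header")
	}
	if !bytes.Equal(trusted.LastBlockID, hashHeader(prev.Header)) {
		return errors.New("header hash does not match trusted last block ID")
	}
	if !bytes.Equal(trusted.LastCommitHash, hash(prev.Commit)) {
		return errors.New("commit hash does not match trusted last commit hash")
	}
	// The previous header is now trusted, so its validator hash can be used.
	if !bytes.Equal(prev.Header.ValidatorsHash, hash(prev.ValidatorSet)) {
		return errors.New("validator set hash does not match header")
	}
	return nil
}

// demoPair builds a consistent trusted header (height 10) and the light
// block below it (height 9) from made-up data.
func demoPair() (Header, LightBlock) {
	vals := []byte("validator-set-at-9")
	commit := []byte("commit-for-9")
	prev := LightBlock{
		Header:       Header{Height: 9, ValidatorsHash: hash(vals)},
		Commit:       commit,
		ValidatorSet: vals,
	}
	trusted := Header{Height: 10, LastBlockID: hashHeader(prev.Header), LastCommitHash: hash(commit)}
	return trusted, prev
}

func main() {
	trusted, prev := demoPair()
	fmt.Println("valid light block:", verifyPrevious(trusted, prev))
	prev.ValidatorSet = []byte("tampered")
	fmt.Println("tampered light block:", verifyPrevious(trusted, prev))
}
```

Once a light block passes, its header becomes the trusted header for the next iteration, so trust is extended one height at a time towards the target.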

 ### Termination

 ReverseSync draws a lot of parallels with fast sync. An important consideration
 for fast sync that also extends to ReverseSync is termination. ReverseSync will
 finish its task when one of the following conditions has been met:

 1. It reaches a block `A` where `A.Height <= max_historical_height` and
 `A.Time <= max_historical_time`.
 2. None of its peers reports having the block at the height below the
 process's current block.
 3. A global timeout expires.

 This implies that we can't guarantee adequate history and thus the term
 "invariant" can't be used in the strictest sense. In the case that the first
 condition isn't met, the node will log an error and optimistically attempt 
 to continue with either fast sync or consensus.
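The three termination conditions can be sketched as a single loop. This is a simplified illustration under several assumptions: fetching is abstracted behind a hypothetical `fetchFunc`, verification and storage are elided, the global timeout is modelled with a `context` deadline, and the names `reverseSync`, `runDemo` and `demoTime` are invented for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// LightBlock carries only the fields needed for the termination check.
type LightBlock struct {
	Height int64
	Time   time.Time
}

// fetchFunc is a hypothetical hook that asks peers for the light block at a
// height. It returns false when no peer reports having it.
type fetchFunc func(ctx context.Context, height int64) (LightBlock, bool)

var (
	errNoPeerHasBlock = errors.New("no peer has the next block")
	errTimeout        = errors.New("global timeout reached")
)

// reverseSync walks backwards from base and returns the lowest height
// reached. A nil error means condition 1 was met.
func reverseSync(ctx context.Context, base, maxHeight int64, maxTime time.Time, fetch fetchFunc) (int64, error) {
	for height := base - 1; ; height-- {
		if ctx.Err() != nil {
			return height + 1, errTimeout // condition 3
		}
		lb, ok := fetch(ctx, height)
		if !ok {
			return height + 1, errNoPeerHasBlock // condition 2
		}
		// Verification and storage of lb are elided in this sketch.
		if lb.Height <= maxHeight && !lb.Time.After(maxTime) {
			return lb.Height, nil // condition 1
		}
	}
}

// demoTime assigns an artificial timestamp of one second per height.
func demoTime(height int64) time.Time { return time.Unix(height, 0) }

// runDemo reverse syncs from base 100 towards height 90 against peers that
// only hold blocks at or above lowest.
func runDemo(lowest int64) (int64, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	fetch := func(_ context.Context, height int64) (LightBlock, bool) {
		if height < lowest {
			return LightBlock{}, false
		}
		return LightBlock{Height: height, Time: demoTime(height)}, true
	}
	return reverseSync(ctx, 100, 90, demoTime(90), fetch)
}

func main() {
	fmt.Println(runDemo(1))  // peers hold the full history
	fmt.Println(runDemo(95)) // peers have pruned below height 95
}
```

In the second call the loop stops early with an error. Following the text above, a caller would log that error and still continue to fast sync or consensus.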

 ## Alternative Solutions

 The need for a minimum block history invariant stems purely from the need to
 validate evidence (although there may be some application-relevant needs as
 well). Because of this, an alternative could be to simply trust whatever the
 2/3+ majority has agreed upon and, in the case where a node is at the head of the
 blockchain, have it simply abstain from voting.

 As it stands, if 2/3+ vote on evidence that a node can't verify, the node will
 halt, in the same manner as when 2/3+ vote on a header that the node sees as
 invalid (perhaps due to a different app hash).

 Another alternative concerns the method by which the relevant data is retrieved.
 Instead of introducing new messages to the P2P layer, RPC could have been used.

 The aforementioned data is already available via the following RPC endpoints:
 `/commit` for `Header`s and `/validators` for `ValidatorSet`s. P2P was chosen
 instead, predominantly due to the instability of the current RPC infrastructure.

 ## Status

 Proposed

 ## Consequences

 ### Positive

 - Ensures a minimum block history invariant for honest nodes. This will allow
  nodes to verify evidence.

 ### Negative

 - State sync will be slower as more processing is required.

 ### Neutral

 - Serving validator sets through P2P would make it easier to
 extend P2P support to light clients and state sync.
 - In the future, it may also be possible to extend this feature to allow
 nodes to freely fetch and verify prior blocks.

 ## References

 - [RFC-001: Block retention](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md)
 - [Original issue](https://github.com/tendermint/tendermint/issues/4629)
--- a/rfc/README.md
+++ b/rfc/README.md
@ -25,3 +25,5 @@ Some RFC's will be presented at a Tendermint Dev Session. If you are an outside
 - [001-block-retention](./001-block-retention.md)
 - [002-nonzero-genesis](./002-nonzero-genesis.md)
 - [003-ed25519-verification](./003-ed25519-verification.md)
 - [004-abci++](./004-abci++.md)
 - [005-reverse-sync](./005-reverse-sync.md)