# ADR 080: ReverseSync - fetching historical data

## Changelog

- 2021-02-11: Migrate to tendermint repo (Originally [RFC 005](https://github.com/tendermint/spec/pull/224))
- 2021-04-19: Use P2P to gossip necessary data for reverse sync.
- 2021-03-03: Simplify proposal to the state sync case.
- 2021-02-17: Add notes on asynchronicity of processes.
- 2020-12-10: Rename backfill blocks to reverse sync.
- 2020-11-25: Initial draft.

## Author(s)

- Callum Waters (@cmwaters)

## Context

Two new features, [Block pruning](https://github.com/tendermint/tendermint/issues/3652)
and [State sync](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-042-state-sync.md),
meant nodes no longer needed a complete history of the blockchain. This
introduced some challenges of its own, which were covered and subsequently
tackled with [RFC-001](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-077-block-retention.md).
The RFC allowed applications to set a block retention height: an upper bound on
which blocks would be pruned. However, nodes that state sync past this upper bound
(which is necessary, as snapshots must be saved within the trusting period for
the assisting light client to verify) have no means of backfilling the blocks
to meet the retention limit. This could be a problem, as nodes that state sync and
then eventually switch to consensus (or fast sync) may not have the block and
validator history to verify evidence, causing them to panic if they see a 2/3+
commit on what the node believes to be an invalid block.

Thus, this RFC sets out to instil a minimum block history invariant amongst
honest nodes.

## Proposal

A backfill mechanism can simply be defined as an algorithm for fetching,
verifying, and storing the headers and validator sets of heights prior to the
current base of the node's blockchain. In matching the terminology used for
other data retrieving protocols (i.e. fast sync and state sync), we
call this method **ReverseSync**.

We will define the mechanism in four sections:

- Usage
- Design
- Verification
- Termination

### Usage

For now, we focus purely on the case of a state syncing node, which, after
syncing to a height, will need to verify historical data in order to be capable
of processing new blocks. We denote the earliest height that the node will
need to verify and store in order to be able to verify any evidence that might
arise as `max_historical_height`, with a corresponding `max_historical_time`.
Both height and time are necessary, as evidence expiration is measured in both
block height and BFT time. After acquiring `State`, we calculate these
parameters as:

```go
// Pseudocode: max over two times means "the later of the two".
max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight)
max_historical_time = max(GenesisTime, state.LastBlockTime.Add(-state.ConsensusParams.EvidenceAgeTime))
```

Before starting either fast sync or consensus, we then run the following
synchronous process:

```go
func ReverseSync(max_historical_height int64, max_historical_time time.Time) error
```

This fetches and verifies blocks prior to the node's current base until it
reaches a block `A` where `A.Height <= max_historical_height` and
`A.Time <= max_historical_time`.

Upon successfully reverse syncing, a node can safely continue. As this
feature is only used as part of state sync, one can think of it as merely an
extension to state sync.

In the future we may want to extend this functionality to allow nodes to fetch
historical blocks for reasons of accountability or data accessibility.

### Design

This section provides a high-level overview of some of the more important
characteristics of the design, saving the more tedious details for an ADR.

#### P2P

Implementation of this RFC will require the addition of a new channel and two
new messages.

```proto
message LightBlockRequest {
  uint64 height = 1;
}
```

```proto
message LightBlockResponse {
  Header header = 1;
  Commit commit = 2;
  ValidatorSet validator_set = 3;
}
```

The P2P path may also enable P2P networked light clients and a state sync that
doesn't need to rely on RPC.
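
To illustrate the request/response exchange, here is a rough sketch of how a peer might answer a `LightBlockRequest`. It assumes an in-memory store keyed by height. The Go structs are hand-written, simplified stand-ins for the generated protobuf types, and `lightBlockStore` and `handleRequest` are hypothetical names rather than part of any existing reactor.

```go
package main

import "fmt"

// Simplified stand-ins for the generated protobuf types.
type Header struct{ Height int64 }
type Commit struct{ Height int64 }
type ValidatorSet struct{ Validators []string }

type LightBlockRequest struct{ Height uint64 }

type LightBlockResponse struct {
	Header       *Header
	Commit       *Commit
	ValidatorSet *ValidatorSet
}

// lightBlockStore is a hypothetical lookup of light blocks keyed by height.
type lightBlockStore map[uint64]LightBlockResponse

// handleRequest returns the light block at the requested height and reports
// whether this peer actually has it.
func (s lightBlockStore) handleRequest(req LightBlockRequest) (LightBlockResponse, bool) {
	resp, ok := s[req.Height]
	return resp, ok
}

func main() {
	store := lightBlockStore{
		10: {
			Header:       &Header{Height: 10},
			Commit:       &Commit{Height: 10},
			ValidatorSet: &ValidatorSet{Validators: []string{"val-a"}},
		},
	}

	for _, h := range []uint64{10, 9} {
		resp, ok := store.handleRequest(LightBlockRequest{Height: h})
		if !ok {
			fmt.Printf("height %d: not available\n", h)
			continue
		}
		fmt.Printf("height %d: header at height %d\n", h, resp.Header.Height)
	}
}
```

The boolean result matters for termination: a requester needs to distinguish "this peer does not have the block" from a valid response.
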

### Verification

ReverseSync is used to fetch the following data structures:

- `Header`
- `Commit`
- `ValidatorSet`

Nodes will also need to be able to verify these. This can be achieved by first
retrieving the header at the base height from the block store. Starting from this
trusted header, the node hashes each of the three data structures and performs
the following checks:

1. The trusted header's last block ID matches the hash of the new header

   ```go
   header[height].LastBlockID == hash(header[height-1])
   ```

2. The trusted header's last commit hash matches the hash of the new commit

   ```go
   header[height].LastCommitHash == hash(commit[height-1])
   ```

3. Given that the node now trusts the new header, check that the header's validator set
   hash matches the hash of the validator set

   ```go
   header[height-1].ValidatorsHash == hash(validatorSet[height-1])
   ```
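
The three checks can be sketched as a single function. This is a toy model under stated assumptions: SHA-256 over ad hoc byte encodings stands in for the real header, commit, and validator set hashing, the commit and validator set are treated as opaque bytes, and the names `LightBlock`, `hashHeader`, `hashBytes`, and `verifyPrevious` are all hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// Header carries only the fields needed by the three checks.
type Header struct {
	Height         int64
	LastBlockID    []byte
	LastCommitHash []byte
	ValidatorsHash []byte
}

// LightBlock bundles the fetched data for one height.
type LightBlock struct {
	Header       Header
	Commit       []byte // opaque encoded commit (assumption)
	ValidatorSet []byte // opaque encoded validator set (assumption)
}

func hashBytes(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// hashHeader is a toy header hash, not the real Merkle-based one.
func hashHeader(h Header) []byte {
	buf := []byte(fmt.Sprintf("%d|", h.Height))
	buf = append(buf, h.LastBlockID...)
	buf = append(buf, h.LastCommitHash...)
	buf = append(buf, h.ValidatorsHash...)
	return hashBytes(buf)
}

// verifyPrevious checks the light block at trusted.Height-1 against the
// trusted header, in the order given in the text.
func verifyPrevious(trusted Header, prev LightBlock) error {
	if prev.Header.Height != trusted.Height-1 {
		return errors.New("light block is not at trusted height - 1")
	}
	if !bytes.Equal(trusted.LastBlockID, hashHeader(prev.Header)) {
		return errors.New("header hash does not match trusted LastBlockID")
	}
	if !bytes.Equal(trusted.LastCommitHash, hashBytes(prev.Commit)) {
		return errors.New("commit hash does not match trusted LastCommitHash")
	}
	// The new header is now trusted, so its own ValidatorsHash can be used.
	if !bytes.Equal(prev.Header.ValidatorsHash, hashBytes(prev.ValidatorSet)) {
		return errors.New("validator set hash does not match ValidatorsHash")
	}
	return nil
}

func main() {
	prev := LightBlock{
		Header:       Header{Height: 9, ValidatorsHash: hashBytes([]byte("validators@9"))},
		Commit:       []byte("commit@9"),
		ValidatorSet: []byte("validators@9"),
	}
	trusted := Header{
		Height:         10,
		LastBlockID:    hashHeader(prev.Header),
		LastCommitHash: hashBytes(prev.Commit),
	}
	fmt.Println(verifyPrevious(trusted, prev)) // <nil>

	prev.ValidatorSet = []byte("forged")
	fmt.Println(verifyPrevious(trusted, prev)) // reports the validator set mismatch
}
```

Once a light block passes, its header becomes the trusted header for the next step down, so the same function can be applied repeatedly towards lower heights.
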

### Termination

ReverseSync draws a lot of parallels with fast sync. An important consideration
for fast sync that also extends to ReverseSync is termination. ReverseSync will
finish its task when one of the following conditions has been met:

1. It reaches a block `A` where `A.Height <= max_historical_height` and
   `A.Time <= max_historical_time`.
2. None of its peers reports having the block at the height below the
   process's current block.
3. A global timeout is reached.

This implies that we can't guarantee adequate history and thus the term
"invariant" can't be used in the strictest sense. In the case that the first
condition isn't met, the node will log an error and optimistically attempt
to continue with either fast sync or consensus.

## Alternative Solutions

The need for a minimum block history invariant stems purely from the need to
validate evidence (although there may be some application-relevant needs as
well). Because of this, an alternative could be to simply trust whatever the
2/3+ majority has agreed upon and, in the case where a node is at the head of the
blockchain, to simply abstain from voting.

As it stands, if 2/3+ vote on evidence that a node can't verify, the node will
halt, in the same manner as when 2/3+ vote on a header that the node sees as
invalid (perhaps due to a different app hash).

Another alternative concerns the method with which the relevant data is retrieved.
Instead of introducing new messages to the P2P layer, RPC could have been used.

The aforementioned data is already available via the following RPC endpoints:
`/commit` for `Header`s and `/validators` for `ValidatorSet`s. It was
decided, predominantly due to the instability of the current RPC infrastructure,
that P2P be used instead.

## Status

Proposed

## Consequences

### Positive

- Ensures a minimum block history invariant for honest nodes. This will allow
  nodes to verify evidence.

### Negative

- State sync will be slower as more processing is required.

### Neutral

- Serving validator sets through P2P would make it easier to
  extend P2P support to light clients and state sync.
- In the future, it may also be possible to extend this feature to allow
  nodes to freely fetch and verify prior blocks.

## References

- [RFC-001: Block retention](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-077-block-retention.md)
- [Original issue](https://github.com/tendermint/tendermint/issues/4629)