# ADR 74: Migrate Timeout Parameters to Consensus Parameters

## Changelog

- 03-Jan-2022: Initial draft (@williambanfield)
- 13-Jan-2022: Updated to indicate work on upgrade path needed (@williambanfield)

## Status

Proposed

## Context

### Background

Tendermint's consensus timeout parameters are currently configured locally by each validator
in the validator's [config.toml][config-toml].
This means that the validators on a Tendermint network may have different timeouts
from each other. There is no reason for validators on the same network to configure
different timeout values. Proper functioning of the Tendermint consensus algorithm
relies on these parameters being uniform across validators.

The configurable values are as follows:

* `TimeoutPropose`
    * How long the consensus algorithm waits for a proposal block before issuing a prevote.
    * If no proposal arrives by `TimeoutPropose`, then the consensus algorithm will issue a nil prevote.
* `TimeoutProposeDelta`
    * How much the `TimeoutPropose` grows each round.
* `TimeoutPrevote`
    * How long the consensus algorithm waits after receiving +2/3 prevotes with
    no quorum for a value before issuing a precommit for nil.
    (See the [arXiv paper][arxiv-paper], Algorithm 1, Line 34)
* `TimeoutPrevoteDelta`
    * How much the `TimeoutPrevote` increases with each round.
* `TimeoutPrecommit`
    * How long the consensus algorithm waits after receiving +2/3 precommits that
    do not have a quorum for a value before entering the next round.
    (See the [arXiv paper][arxiv-paper], Algorithm 1, Line 47)
* `TimeoutPrecommitDelta`
    * How much the `TimeoutPrecommit` increases with each round.
* `TimeoutCommit`
    * How long the consensus algorithm waits after committing a block but before starting the new height.
    * This gives a validator a chance to receive slow precommits.
* `SkipTimeoutCommit`
    * If set, the node makes progress as soon as it has 100% of the precommits, without waiting out `TimeoutCommit`.

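
The relationship between a base timeout and its delta can be sketched as follows. This is a minimal illustration, assuming the delta is added once per round starting from round 0; the function name `roundTimeout` and the numeric values are hypothetical and not taken from this ADR.

```go
package main

import (
	"fmt"
	"time"
)

// roundTimeout returns base + delta*round, i.e. the timeout grows linearly
// with the round number. Linear growth from round 0 is an assumption of
// this sketch.
func roundTimeout(base, delta time.Duration, round int32) time.Duration {
	return base + delta*time.Duration(round)
}

func main() {
	// Illustrative values only.
	propose := 3 * time.Second
	proposeDelta := 500 * time.Millisecond

	for round := int32(0); round < 3; round++ {
		fmt.Printf("round %d: wait up to %v for a proposal\n",
			round, roundTimeout(propose, proposeDelta, round))
	}
}
```

Because every validator evaluates the same expression, validators only agree on how long to wait in a given round if they share the same base and delta values.
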
### Overview of Change

We will consolidate the timeout parameters and migrate them from the node-local
`config.toml` file into the network-global consensus parameters.

The 8 timeout parameters will be consolidated down to 6. These will be as follows:

* `TimeoutPropose`
    * Same as current `TimeoutPropose`.
* `TimeoutProposeDelta`
    * Same as current `TimeoutProposeDelta`.
* `TimeoutVote`
    * How long validators wait for votes in both the prevote
    and precommit phase of the consensus algorithm. This parameter subsumes
    the current `TimeoutPrevote` and `TimeoutPrecommit` parameters.
* `TimeoutVoteDelta`
    * How much the `TimeoutVote` will grow each successive round.
    This parameter subsumes the current `TimeoutPrevoteDelta` and `TimeoutPrecommitDelta`
    parameters.
* `TimeoutCommit`
    * Same as current `TimeoutCommit`.
* `EnableTimeoutCommitBypass`
    * Same as current `SkipTimeoutCommit`, renamed for clarity.

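
To make the 8-to-6 consolidation concrete, the sketch below folds a set of legacy values into the new shape. The type and function names are hypothetical. The ADR does not say how a single vote timeout would be derived from two differing legacy values, so taking the larger of the two is purely an assumption of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// legacyTimeouts mirrors the eight node-local settings (hypothetical type).
type legacyTimeouts struct {
	Propose, ProposeDelta     time.Duration
	Prevote, PrevoteDelta     time.Duration
	Precommit, PrecommitDelta time.Duration
	Commit                    time.Duration
	SkipCommit                bool
}

// timeoutParams mirrors the six consolidated parameters (hypothetical type).
type timeoutParams struct {
	Propose, ProposeDelta time.Duration
	Vote, VoteDelta       time.Duration
	Commit                time.Duration
	EnableCommitBypass    bool
}

func maxDuration(a, b time.Duration) time.Duration {
	if a > b {
		return a
	}
	return b
}

// consolidate maps the legacy settings onto the new ones.
// ASSUMPTION: when prevote and precommit values differ, the larger one is kept.
func consolidate(l legacyTimeouts) timeoutParams {
	return timeoutParams{
		Propose:            l.Propose,
		ProposeDelta:       l.ProposeDelta,
		Vote:               maxDuration(l.Prevote, l.Precommit),
		VoteDelta:          maxDuration(l.PrevoteDelta, l.PrecommitDelta),
		Commit:             l.Commit,
		EnableCommitBypass: l.SkipCommit,
	}
}

func main() {
	// Illustrative values only.
	legacy := legacyTimeouts{
		Propose: 3 * time.Second, ProposeDelta: 500 * time.Millisecond,
		Prevote: time.Second, PrevoteDelta: 500 * time.Millisecond,
		Precommit: 2 * time.Second, PrecommitDelta: 250 * time.Millisecond,
		Commit: time.Second,
	}
	fmt.Printf("%+v\n", consolidate(legacy))
}
```
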

A safe default will be provided by Tendermint for each of these parameters and
networks will be able to update the parameters as they see fit. Local updates
to these parameters will no longer be possible; instead, the application will control
updating the parameters. Applications using the Cosmos SDK will automatically be
able to change the values of these consensus parameters [via a governance proposal][cosmos-sdk-consensus-params].

This change is low-risk. While parameters are locally configurable, many running chains
do not change them from their default values. For example, initializing
a node on Osmosis, Terra, and the Cosmos Hub using their `init` command produces
a `config.toml` with Tendermint's default values for these parameters.

### Why this parameter consolidation?

Reducing the number of parameters is good for UX. Fewer superfluous parameters make
running and operating a Tendermint network less confusing.

Prevote and Precommit messages are similar in size and require similar amounts
of processing, so there is no strong need for them to be configured separately.

The `TimeoutPropose` parameter governs how long Tendermint will wait for the proposed
block to be gossiped. Blocks are much larger than votes and therefore tend to be
gossiped much more slowly. It therefore makes sense to keep `TimeoutPropose` and
`TimeoutProposeDelta` as parameters separate from the vote timeouts.

`TimeoutCommit` is used by chains to ensure that the network waits for the votes from
slower validators before proceeding to the next height. Without this timeout, the votes
from slower validators would consistently not be included in blocks and those validators
would not be counted as 'up' from the chain's perspective. Being down damages a validator's
reputation and causes potential stakers to think twice before delegating to that validator.

`TimeoutCommit` also prevents the network from producing the next height as soon as validators
on the fastest hardware with a summed voting power of +2/3 of the network's total have
completed execution of the block. Allowing the network to proceed as soon as the fastest
+2/3 completed execution would have a cumulative effect over heights, eventually
leaving slower validators unable to participate in consensus at all. `TimeoutCommit`
therefore allows networks to have greater variability in hardware. Additional
discussion of this can be found in [tendermint issue 5911][tendermint-issue-5911-comment]
and [spec issue 359][spec-issue-359].

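
The interplay between `TimeoutCommit` and the commit bypass flag can be sketched as a single decision. This is a simplified illustration rather than Tendermint's implementation; the function name `commitWait` and its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// commitWait returns how long a node waits after committing a block before
// starting the next height. With the bypass enabled and every precommit
// already received there is nothing left to wait for, so the wait is zero.
func commitWait(timeoutCommit time.Duration, bypassEnabled, haveAllPrecommits bool) time.Duration {
	if bypassEnabled && haveAllPrecommits {
		return 0
	}
	return timeoutCommit
}

func main() {
	// Illustrative value only.
	timeoutCommit := time.Second

	fmt.Println(commitWait(timeoutCommit, false, true)) // bypass disabled: full wait
	fmt.Println(commitWait(timeoutCommit, true, false)) // precommits missing: full wait
	fmt.Println(commitWait(timeoutCommit, true, true))  // bypass applies: no wait
}
```

Note that the bypass only triggers on 100% of the precommits, so slower validators' votes are still included whenever the wait is skipped.
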
## Alternative Approaches

### Hardcode the parameters

Many Tendermint networks run on similar cloud-hosted infrastructure. Therefore,
they have similar bandwidth and machine resources. The timings for propagating votes
and blocks are likely to be reasonably similar across networks. As a result, the
timeout parameters are good candidates for being hardcoded. Hardcoding the timeouts
in Tendermint would mean entirely removing these parameters from any configuration
that could be altered by either an application or a node operator. Instead,
Tendermint would ship with a set of timeouts and all applications using Tendermint
would use this exact same set of values.

While Tendermint nodes often run with similar bandwidth and on similar cloud-hosted
machines, there are enough points of variability to make configuring
consensus timeouts meaningful. Namely, Tendermint network topologies are likely to be
very different from chain to chain. Additionally, applications may vary greatly in
how long the `Commit` phase may take. Applications that perform more work during `Commit`
require a longer `TimeoutCommit` to allow the application to complete its work
and be prepared for the next height.

## Decision

The decision has been made to implement this work, with the caveat that the
specific mechanism for introducing the new parameters to chains is still being worked out.

## Detailed Design

### New Consensus Parameters

A new `TimeoutParams` `message` will be added to the [params.proto file][consensus-params-proto].
This message will have the following form:

```proto
// Assumes the enclosing file imports "google/protobuf/duration.proto".
message TimeoutParams {
  google.protobuf.Duration propose = 1;        // wait for a proposal block
  google.protobuf.Duration propose_delta = 2;  // per-round growth of propose
  google.protobuf.Duration vote = 3;           // wait in the prevote and precommit phases
  google.protobuf.Duration vote_delta = 4;     // per-round growth of vote
  google.protobuf.Duration commit = 5;         // wait after commit, before the next height
  bool enable_commit_timeout_bypass = 6;       // replaces SkipTimeoutCommit
}
```

This new message will be added as a field into the [`ConsensusParams`
message][consensus-params-proto]. The same default values that are [currently
set for these parameters][current-timeout-defaults] in the local configuration
file will be used as the defaults for these new consensus parameters in the
[consensus parameter defaults][default-consensus-params].

The new consensus parameters will be subject to the same
[validity rules][time-param-validation] as the current configuration values,
namely, each value must be non-negative.

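
The non-negativity rule can be sketched in Go as follows. The struct below is a hand-written stand-in for the generated protobuf type, and the `Validate` method and its error wording are assumptions of this sketch rather than the eventual Tendermint API.

```go
package main

import (
	"fmt"
	"time"
)

// TimeoutParams is a hypothetical Go counterpart of the proto message above.
type TimeoutParams struct {
	Propose                   time.Duration
	ProposeDelta              time.Duration
	Vote                      time.Duration
	VoteDelta                 time.Duration
	Commit                    time.Duration
	EnableCommitTimeoutBypass bool
}

// Validate rejects any negative duration, matching the rule that each
// value must be non-negative. It reports the first offending field.
func (p TimeoutParams) Validate() error {
	checks := []struct {
		name  string
		value time.Duration
	}{
		{"propose", p.Propose},
		{"propose_delta", p.ProposeDelta},
		{"vote", p.Vote},
		{"vote_delta", p.VoteDelta},
		{"commit", p.Commit},
	}
	for _, c := range checks {
		if c.value < 0 {
			return fmt.Errorf("timeout %s must be non-negative, got %v", c.name, c.value)
		}
	}
	return nil
}

func main() {
	// Illustrative values only.
	ok := TimeoutParams{Propose: 3 * time.Second, Vote: time.Second, Commit: time.Second}
	bad := TimeoutParams{Propose: 3 * time.Second, Vote: -time.Second}

	fmt.Println(ok.Validate())
	fmt.Println(bad.Validate())
}
```
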
### Migration

The new consensus parameters will be added during an upcoming release. In this
release, the old `config.toml` parameters will cease to control the timeouts and
an error will be logged on nodes that continue to specify these values. The specific
mechanism by which these parameters will be added to a chain is being discussed in
[RFC-009][rfc-009] and will be decided ahead of the next release.

The specific mechanism for adding these parameters depends on work related to
[soft upgrades][soft-upgrades], which is still ongoing.

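
One way a node could detect lingering timeout settings in order to log the error described above is sketched below. The key names, the `findDeprecated` helper, and the log wording are all assumptions of this sketch; the actual `config.toml` key spelling should be checked against the targeted Tendermint version.

```go
package main

import (
	"fmt"
	"sort"
)

// deprecatedTimeoutKeys lists the config.toml keys assumed to correspond to
// the eight legacy timeout settings. The spelling is an assumption.
var deprecatedTimeoutKeys = []string{
	"timeout-propose", "timeout-propose-delta",
	"timeout-prevote", "timeout-prevote-delta",
	"timeout-precommit", "timeout-precommit-delta",
	"timeout-commit", "skip-timeout-commit",
}

// findDeprecated returns, in sorted order, the deprecated keys that are
// still present in an already-parsed consensus section of the config.
func findDeprecated(consensusSection map[string]interface{}) []string {
	var found []string
	for _, key := range deprecatedTimeoutKeys {
		if _, ok := consensusSection[key]; ok {
			found = append(found, key)
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	// A parsed consensus section with one leftover timeout and one
	// unrelated, illustrative key.
	section := map[string]interface{}{
		"timeout-commit": "1s",
		"unrelated-key":  true,
	}
	for _, key := range findDeprecated(section) {
		fmt.Printf("ERROR: %q in config.toml is ignored; timeouts are now consensus parameters\n", key)
	}
}
```
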
## Consequences

### Positive

* Timeout parameters will be equal across all of the validators in a Tendermint network.
* Remove superfluous timeout parameters.

### Negative

### Neutral

* Timeout parameters require consensus to change.

## References

[consensus-params-proto]: https://github.com/tendermint/spec/blob/a00de7199f5558cdd6245bbbcd1d8405ccfb8129/proto/tendermint/types/params.proto#L11
[hashed-params]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/types/params.go#L49
[default-consensus-params]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/types/params.go#L79
[current-timeout-defaults]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/config/config.go#L955
[config-toml]: https://github.com/tendermint/tendermint/blob/5cc980698a3402afce76b26693ab54b8f67f038b/config/toml.go#L425-L440
[cosmos-sdk-consensus-params]: https://github.com/cosmos/cosmos-sdk/issues/6197
[time-param-validation]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/config/config.go#L1038
[tendermint-issue-5911-comment]: https://github.com/tendermint/tendermint/issues/5911#issuecomment-973560381
[spec-issue-359]: https://github.com/tendermint/spec/issues/359
[arxiv-paper]: https://arxiv.org/pdf/1807.04938.pdf
[soft-upgrades]: https://github.com/tendermint/spec/pull/222
[rfc-009]: https://github.com/tendermint/tendermint/pull/7524