
docs: PBTS synchrony issues runbook (#8129)

closes: #7756 

# What does this pull request change?
This pull request adds a new runbook for operators encountering errors related to the new Proposer-Based Timestamps algorithm. The goal of this runbook is to give operators a clear set of steps to follow if they are having issues producing blocks because of clock synchronization problems.
This pull request also renames the `*PrevoteDelay` metrics to drop the term `MessageDelay`. These metrics provide a combined view of `message_delay` + `synchrony`, so the old name was potentially confusing.
# Questions to reviewers
* Are there ways to make the set of steps clearer or are there any pieces that seem confusing?
Branch: pull/8147/head
Author: William Banfield (committed by GitHub)
Commit: 7c91b53999
3 changed files with 227 additions and 11 deletions:

1. docs/tools/debugging/proposer-based-timestamps-runbook.md (+216, -0)
2. internal/consensus/metrics.go (+9, -9)
3. internal/consensus/state.go (+2, -2)

docs/tools/debugging/proposer-based-timestamps-runbook.md (+216, -0)

@@ -0,0 +1,216 @@
---
order: 3
---
# Proposer-Based Timestamps Runbook
Version v0.36 of Tendermint added new constraints for the timestamps included in
each block created by Tendermint. The new constraints mean that validators may
fail to produce valid blocks or may issue `nil` `prevotes` for proposed blocks
depending on the configuration of the validator's local clock.
## What is this document for?
This document provides a set of actionable steps for application developers and
node operators to diagnose and fix issues related to clock synchronization and
configuration of the Proposer-Based Timestamps [SynchronyParams](https://github.com/tendermint/tendermint/blob/master/spec/core/data_structures.md#synchronyparams).
Use this runbook if you observe that validators are frequently voting `nil` for a block that the rest
of the network votes for or if validators are frequently producing block proposals
that are not voted for by the rest of the network.
## Requirements
To use this runbook, you must be running a node that has the [Prometheus metrics endpoint enabled](https://github.com/tendermint/tendermint/blob/master/docs/nodes/metrics.md)
and the Tendermint RPC endpoint enabled and accessible.
It is strongly recommended to also run a Prometheus metrics collector to gather and
analyze metrics from the Tendermint node.
## Debugging a Single Node
If you observe that a single validator is frequently failing to produce blocks or
voting `nil` for proposals that other validators vote for, and you suspect the cause
may be clock synchronization, use the following steps to debug and correct the issue.
### Check Timely Metric
Tendermint exposes a histogram metric for the difference between the timestamp in the proposal
and the time read from the node's local clock when the proposal is received.
The histogram exposes the following metrics on the Prometheus `/metrics` endpoint:
* `tendermint_consensus_proposal_timestamp_difference_bucket`
* `tendermint_consensus_proposal_timestamp_difference_sum`
* `tendermint_consensus_proposal_timestamp_difference_count`
Each metric is also labeled with the key `is_timely`, which can have a value of
`true` or `false`.
#### From the Prometheus Collector UI
If you are running a Prometheus collector, navigate to the query web interface and select the 'Graph' tab.
Issue a query for the following:
```
tendermint_consensus_proposal_timestamp_difference_count{is_timely="false"} /
tendermint_consensus_proposal_timestamp_difference_count{is_timely="true"}
```
This query will graph the ratio of proposals the node considered untimely to those it
considered timely. If the ratio is increasing, it means that your node is consistently
seeing more proposals that are far from its local clock. If this is the case, you should
check to make sure your local clock is properly synchronized to NTP.
#### From the `/metrics` url
If you are not running a Prometheus collector, navigate to the `/metrics` endpoint
exposed on the Prometheus metrics port with `curl` or a browser.
Search for the `tendermint_consensus_proposal_timestamp_difference_count` metric.
This metric is labeled with `is_timely`. Investigate the value of
`tendermint_consensus_proposal_timestamp_difference_count` where `is_timely="false"`
and where `is_timely="true"`. Refresh the endpoint and observe if the value of `is_timely="false"`
is growing.
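For example, assuming the default Prometheus listen port of `26660` and an illustrative `chain_id` label, the relevant lines might look like this (the counts here are made up):
```shell
$ curl -s localhost:26660/metrics | grep proposal_timestamp_difference_count
tendermint_consensus_proposal_timestamp_difference_count{chain_id="test-chain",is_timely="false"} 12
tendermint_consensus_proposal_timestamp_difference_count{chain_id="test-chain",is_timely="true"} 4320
```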
If you observe that `is_timely="false"` is growing, it means that your node is consistently
seeing proposals that are far from its local clock. If this is the case, you should check
to make sure your local clock is properly synchronized to NTP.
### Checking Clock Sync
NTP configuration and tooling are highly specific to the operating system and distribution
that your validator node is running. This guide assumes you have `timedatectl` installed with
[chrony](https://chrony.tuxfamily.org/), a popular tool for interacting with time
synchronization on Linux distributions. If you are using an operating system or
distribution with a different time synchronization mechanism, please consult the
documentation for your operating system to check the status and re-synchronize the daemon.
#### Check if NTP is Enabled
```shell
$ timedatectl
```
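On a typical Linux distribution, the output looks roughly like the following (values here are illustrative):
```
               Local time: Tue 2022-03-29 14:04:18 UTC
           Universal time: Tue 2022-03-29 14:04:18 UTC
                 RTC time: Tue 2022-03-29 14:04:18
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no
```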
From the output, ensure that `NTP service` is `active`. If `NTP service` is `inactive`, run:
```shell
$ timedatectl set-ntp true
```
Re-run the `timedatectl` command and verify that the change has taken effect.
#### Check if Your NTP Daemon is Synchronized
Check the status of your local `chrony` NTP daemon by running the following:
```shell
$ chronyc tracking
```
If the `chrony` daemon is running, you will see output that indicates its current status.
If the `chrony` daemon is not running, restart it and re-run `chronyc tracking`.
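When the daemon is running and healthy, the output looks roughly like the following (values here are illustrative):
```
Reference ID    : C0248F82 (ntp1.example.net)
Stratum         : 3
Ref time (UTC)  : Tue Mar 29 13:53:09 2022
System time     : 0.000032958 seconds slow of NTP time
Last offset     : -0.000052297 seconds
RMS offset      : 0.000039579 seconds
Frequency       : 10.101 ppm fast
Residual freq   : -0.001 ppm
Skew            : 0.047 ppm
Root delay      : 0.005123 seconds
Root dispersion : 0.001234 seconds
Update interval : 64.4 seconds
Leap status     : Normal
```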
The `System time` field of the response should show a value that is much smaller than 100
milliseconds.
If the value is very large, restart the `chronyd` daemon.
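On most `systemd`-based distributions this can be done with the following command (note that the service may be named `chrony` rather than `chronyd` on some distributions, such as Debian and Ubuntu):
```shell
$ systemctl restart chronyd
```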
## Debugging a Network
If you observe that a network is frequently failing to produce blocks and suspect
it may be related to clock synchronization, use the following steps to debug and correct the issue.
### Check Prevote Message Delay
Tendermint exposes metrics that help determine how synchronized the clocks on a network are.
These metrics are visible on the Prometheus `/metrics` endpoint and are called:
* `tendermint_consensus_quorum_prevote_delay`
* `tendermint_consensus_full_prevote_delay`
These metrics calculate the difference between the timestamp in the proposal message and
the timestamp of a prevote that was issued during consensus.
The `tendermint_consensus_quorum_prevote_delay` metric is the interval in seconds
between the proposal timestamp and the timestamp of the earliest prevote that
achieved a quorum during the prevote step.
The `tendermint_consensus_full_prevote_delay` metric is the interval in seconds
between the proposal timestamp and the timestamp of the latest prevote in a round
where 100% of the validators voted.
#### From the Prometheus Collector UI
If you are running a Prometheus collector, navigate to the query web interface and select the 'Graph' tab.
Issue a query for the following:
```
sum(tendermint_consensus_quorum_prevote_delay) by (proposer_address)
```
This query will graph the difference in seconds for each proposer on the network.
If the value is much larger for some proposers, then the issue is likely related to the clock
synchronization of their nodes. Contact those proposers and ensure that their nodes
are properly connected to NTP using the steps for [Debugging a Single Node](#debugging-a-single-node).
If the value is relatively similar for all proposers, you should next compare this
value to the `SynchronyParams` values for the network. Continue to the [Checking
Synchrony](#checking-synchrony) steps.
#### From the `/metrics` url
If you are not running a Prometheus collector, navigate to the `/metrics` endpoint
exposed on the Prometheus metrics port.
Search for the `tendermint_consensus_quorum_prevote_delay` metric. There will be one
entry of this metric for each `proposer_address`. If the value of this metric is
much larger for some proposers, then the issue is likely related to synchronization of their
nodes with NTP. Contact those proposers and ensure that their nodes are properly connected
to NTP using the steps for [Debugging a Single Node](#debugging-a-single-node).
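For example, assuming the default Prometheus listen port of `26660`, the output might contain lines like the following (addresses and values here are made up); the second proposer would be the one to contact:
```shell
$ curl -s localhost:26660/metrics | grep quorum_prevote_delay
tendermint_consensus_quorum_prevote_delay{chain_id="test-chain",proposer_address="2C38290F..."} 0.27
tendermint_consensus_quorum_prevote_delay{chain_id="test-chain",proposer_address="8E3F5A01..."} 2.98
```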
If the values are relatively similar for all proposers, compare them to the
`SynchronyParams` for the network. Continue to the [Checking
Synchrony](#checking-synchrony) steps.
### Checking Synchrony
To determine the currently configured `SynchronyParams` for your network, issue a
request to your node's RPC endpoint. For a node running locally with the RPC server
exposed on port `26657`, run the following command:
```shell
$ curl localhost:26657/consensus_params
```
The JSON output will contain a field named `synchrony` with the following structure:
```json
{
  "precision": "500000000",
  "message_delay": "3000000000"
}
```
The `precision` and `message_delay` values returned are listed in nanoseconds.
In the example above, the precision is 500ms and the message delay is 3s.
Remember, `tendermint_consensus_quorum_prevote_delay` is listed in seconds.
If the `tendermint_consensus_quorum_prevote_delay` value approaches the sum of `precision` and `message_delay`,
then the value selected for these parameters is too small. Your application will
need to be modified to update the `SynchronyParams` to have larger values.
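As a concrete check: with the example values above, `precision` (500ms) plus `message_delay` (3s) gives a total allowance of 3.5 seconds. A hypothetical Prometheus alerting expression that fires when any proposer's quorum delay consumes more than 80% of that budget (the threshold is arbitrary) might look like:
```
max(tendermint_consensus_quorum_prevote_delay) by (proposer_address) > 2.8
```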
### Updating SynchronyParams
The `SynchronyParams` are `ConsensusParams`, which means they are set and updated
by the application running alongside Tendermint. Updates to these parameters must
be passed to Tendermint during the `FinalizeBlock` ABCI method call.
If the application was built using the Cosmos SDK, then these parameters can be updated
programmatically using a governance proposal. For more information, see the [Cosmos SDK
documentation](https://hub.cosmos.network/main/governance/submitting.html#sending-the-transaction-that-submits-your-governance-proposal).
If the application does not implement a way to update the consensus parameters
programmatically, then the application itself must be updated to do so. More information on updating
the consensus parameters via ABCI can be found in the [FinalizeBlock documentation](https://github.com/tendermint/tendermint/blob/master/spec/abci++/abci++_methods_002_draft.md#finalizeblock).
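For illustration only, the sketch below shows how an application might return updated `SynchronyParams` from its `FinalizeBlock` handler. The exact interface, package paths, and field types depend on your Tendermint version and application framework; treat every name here as an assumption to verify against your dependencies:
```go
package app

import (
	"time"

	abci "github.com/tendermint/tendermint/abci/types"
	tmproto "github.com/tendermint/tendermint/proto/tendermint/types"
)

// App is a stand-in for your ABCI application type.
type App struct{ /* application state */ }

// synchronyUpdateApproved is hypothetical application logic, e.g. checking
// whether a governance proposal to change the parameters has passed.
func (app *App) synchronyUpdateApproved() bool { return false }

// FinalizeBlock is a hypothetical handler that raises the synchrony
// parameters when the application-defined condition is met. Field names and
// pointer-vs-value types should be checked against the proto definitions in
// your version of Tendermint.
func (app *App) FinalizeBlock(req abci.RequestFinalizeBlock) abci.ResponseFinalizeBlock {
	resp := abci.ResponseFinalizeBlock{
		// ... transaction results, validator updates, etc.
	}
	if app.synchronyUpdateApproved() {
		precision := 1 * time.Second    // raised from 500ms
		messageDelay := 6 * time.Second // raised from 3s
		resp.ConsensusParamUpdates = &tmproto.ConsensusParams{
			Synchrony: &tmproto.SynchronyParams{
				Precision:    &precision,
				MessageDelay: &messageDelay,
			},
		}
	}
	return resp
}
```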

internal/consensus/metrics.go (+9, -9)

@@ -92,12 +92,12 @@ type Metrics struct {
 	// be above 2/3 of the total voting power of the network defines the endpoint
 	// of the interval. Subtract the proposal timestamp from this endpoint
 	// to obtain the quorum delay.
-	QuorumPrevoteMessageDelay metrics.Gauge
+	QuorumPrevoteDelay metrics.Gauge
-	// FullPrevoteMessageDelay is the interval in seconds between the proposal
+	// FullPrevoteDelay is the interval in seconds between the proposal
 	// timestamp and the timestamp of the latest prevote in a round where 100%
 	// of the voting power on the network issued prevotes.
-	FullPrevoteMessageDelay metrics.Gauge
+	FullPrevoteDelay metrics.Gauge
 	// ProposalTimestampDifference is the difference between the timestamp in
 	// the proposal message and the local time of the validator at the time
@@ -257,17 +257,17 @@ func PrometheusMetrics(namespace string, labelsAndValues ...string) *Metrics {
 			Help:    "Time spent per step.",
 			Buckets: stdprometheus.ExponentialBucketsRange(0.1, 100, 8),
 		}, append(labels, "step")).With(labelsAndValues...),
-		QuorumPrevoteMessageDelay: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
+		QuorumPrevoteDelay: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
 			Namespace: namespace,
 			Subsystem: MetricsSubsystem,
-			Name:      "quorum_prevote_message_delay",
+			Name:      "quorum_prevote_delay",
 			Help: "Difference in seconds between the proposal timestamp and the timestamp " +
 				"of the latest prevote that achieved a quorum in the prevote step.",
 		}, append(labels, "proposer_address")).With(labelsAndValues...),
-		FullPrevoteMessageDelay: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
+		FullPrevoteDelay: prometheus.NewGaugeFrom(stdprometheus.GaugeOpts{
 			Namespace: namespace,
 			Subsystem: MetricsSubsystem,
-			Name:      "full_prevote_message_delay",
+			Name:      "full_prevote_delay",
 			Help: "Difference in seconds between the proposal timestamp and the timestamp " +
 				"of the latest prevote that achieved 100% of the voting power in the prevote step.",
 		}, append(labels, "proposer_address")).With(labelsAndValues...),
@@ -314,8 +314,8 @@ func NopMetrics() *Metrics {
 		BlockParts:                  discard.NewCounter(),
 		BlockGossipReceiveLatency:   discard.NewHistogram(),
 		BlockGossipPartsReceived:    discard.NewCounter(),
-		QuorumPrevoteMessageDelay:   discard.NewGauge(),
-		FullPrevoteMessageDelay:     discard.NewGauge(),
+		QuorumPrevoteDelay:          discard.NewGauge(),
+		FullPrevoteDelay:            discard.NewGauge(),
 		ProposalTimestampDifference: discard.NewHistogram(),
 	}
 }


internal/consensus/state.go (+2, -2)

@@ -2592,12 +2592,12 @@ func (cs *State) calculatePrevoteMessageDelayMetrics() {
 		_, val := cs.Validators.GetByAddress(v.ValidatorAddress)
 		votingPowerSeen += val.VotingPower
 		if votingPowerSeen >= cs.Validators.TotalVotingPower()*2/3+1 {
-			cs.metrics.QuorumPrevoteMessageDelay.With("proposer_address", cs.Validators.GetProposer().Address.String()).Set(v.Timestamp.Sub(cs.Proposal.Timestamp).Seconds())
+			cs.metrics.QuorumPrevoteDelay.With("proposer_address", cs.Validators.GetProposer().Address.String()).Set(v.Timestamp.Sub(cs.Proposal.Timestamp).Seconds())
 			break
 		}
 	}
 	if ps.HasAll() {
-		cs.metrics.FullPrevoteMessageDelay.With("proposer_address", cs.Validators.GetProposer().Address.String()).Set(pl[len(pl)-1].Timestamp.Sub(cs.Proposal.Timestamp).Seconds())
+		cs.metrics.FullPrevoteDelay.With("proposer_address", cs.Validators.GetProposer().Address.String()).Set(pl[len(pl)-1].Timestamp.Sub(cs.Proposal.Timestamp).Seconds())
 	}
 }

