Two new features, block pruning and state sync, meant that nodes no longer needed a complete history of the blockchain. This introduced challenges of its own, which were covered and subsequently tackled in RFC-001. That RFC allowed applications to set a block retention height: an upper bound on which blocks would be pruned. However, nodes that state sync past this upper bound (which is necessary, as snapshots must be saved within the trusting period for the assisting light client to verify) have no means of backfilling blocks to meet the retention limit. This could be a problem, as nodes that state sync and then eventually switch to consensus (or fast sync) may not have the block and validator history needed to verify evidence, causing them to panic if they see a 2/3 commit on what the node believes to be an invalid block.
Thus, this RFC sets out to instil a minimum block history invariant amongst honest nodes.
A backfill mechanism can simply be defined as an algorithm for fetching, verifying and storing headers and validator sets of a height prior to the current base of the node's blockchain. To match the terminology used for other data-retrieving protocols (i.e. fast sync and state sync), we call this method ReverseSync.
We will define the mechanism in four sections:
For now, we focus purely on the case of a state syncing node, which after syncing to a height will need to verify historical data in order to be capable of processing new blocks. We can denote the earliest height that the node will need to verify and store in order to be able to verify any evidence that might arise as the max_historical_height/time. Both height and time are necessary as this maps to the BFT time used for evidence expiration. After acquiring State, we calculate these parameters as:
max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight)
max_historical_time = max(GenesisTime, state.LastBlockTime.Sub(state.ConsensusParams.EvidenceAgeTime))
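The two formulas above can be sketched in Go as follows. This is a minimal illustration, not the actual implementation: the State struct here is a hypothetical, flattened stand-in that keeps only the fields the formulas mention, and HistoricalBounds and exampleState are names introduced purely for this example. Note that in Go, subtracting a duration from a time.Time is written as Add with a negated duration.

```go
package main

import "time"

// State is a hypothetical, trimmed-down stand-in for the node's state.
// Only the fields used by the two formulas are included.
type State struct {
	InitialHeight     int64
	LastBlockHeight   int64
	LastBlockTime     time.Time
	GenesisTime       time.Time
	EvidenceAgeHeight int64         // stands in for state.ConsensusParams.EvidenceAgeHeight
	EvidenceAgeTime   time.Duration // stands in for state.ConsensusParams.EvidenceAgeTime
}

// HistoricalBounds returns max_historical_height and max_historical_time,
// clamping both so they never fall before the start of the chain.
func HistoricalBounds(s State) (int64, time.Time) {
	height := s.LastBlockHeight - s.EvidenceAgeHeight
	if height < s.InitialHeight {
		height = s.InitialHeight
	}
	t := s.LastBlockTime.Add(-s.EvidenceAgeTime)
	if t.Before(s.GenesisTime) {
		t = s.GenesisTime
	}
	return height, t
}

// exampleState builds illustrative values only: a chain 30 days old at
// height 1000, with evidence valid for 100 blocks and 48 hours.
func exampleState() State {
	genesis := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
	return State{
		InitialHeight:     1,
		LastBlockHeight:   1000,
		LastBlockTime:     genesis.Add(30 * 24 * time.Hour),
		GenesisTime:       genesis,
		EvidenceAgeHeight: 100,
		EvidenceAgeTime:   48 * time.Hour,
	}
}
```

With the example values, the height bound is 900 and the time bound is 48 hours before the last block time. If the evidence age exceeds the length of the chain, both bounds fall back to the initial height and genesis time.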
Before starting either fast sync or consensus, we then run the following synchronous process:
func ReverseSync(max_historical_height int64, max_historical_time time.Time) error
Where we fetch and verify blocks until a block A where A.Height <= max_historical_height and A.Time <= max_historical_time.
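As a rough sketch of the loop this describes, the following walks backwards from the node's base until a block satisfies both bounds. It assumes a hypothetical fetch callback that both fetches and verifies the block at a given height. LightBlock, fetchFunc, reverseSync, exampleTime and exampleChain are illustrative names, and the signature differs from the one above so that the example can run on its own.

```go
package main

import (
	"errors"
	"time"
)

// LightBlock is a hypothetical stand-in carrying only the fields the loop inspects.
type LightBlock struct {
	Height int64
	Time   time.Time
}

// fetchFunc is a hypothetical callback that fetches and verifies one block.
type fetchFunc func(height int64) (LightBlock, error)

var errNoBlock = errors.New("block not available")

// reachedTarget reports whether block b satisfies
// b.Height <= maxHeight and b.Time <= maxTime.
func reachedTarget(b LightBlock, maxHeight int64, maxTime time.Time) bool {
	return b.Height <= maxHeight && !b.Time.After(maxTime)
}

// reverseSync fetches blocks below base, one height at a time, until the
// target is reached, the initial height is passed, or a fetch fails.
// It returns the lowest height that was successfully retrieved.
func reverseSync(base, initialHeight int64, fetch fetchFunc, maxHeight int64, maxTime time.Time) (int64, error) {
	newBase := base
	for h := base - 1; h >= initialHeight; h-- {
		b, err := fetch(h)
		if err != nil {
			return newBase, err
		}
		newBase = h
		if reachedTarget(b, maxHeight, maxTime) {
			break
		}
	}
	return newBase, nil
}

// exampleTime gives illustrative block times: one block per minute.
func exampleTime(h int64) time.Time {
	genesis := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
	return genesis.Add(time.Duration(h) * time.Minute)
}

// exampleChain returns a fetchFunc for a toy chain whose peers only
// hold blocks at or above the given lowest height.
func exampleChain(lowest int64) fetchFunc {
	return func(h int64) (LightBlock, error) {
		if h < lowest {
			return LightBlock{}, errNoBlock
		}
		return LightBlock{Height: h, Time: exampleTime(h)}, nil
	}
}
```

Both conditions must hold before the loop stops: in the toy chain, a node at base 100 with a height bound of 90 and a time bound equal to the time of block 80 keeps fetching until height 80.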
Upon successfully reverse syncing, a node can now safely continue. As this feature is only used as part of state sync, one can think of this as merely an extension to it.
In the future we may want to extend this functionality to allow nodes to fetch historical blocks for reasons of accountability or data accessibility.
This section provides a high-level overview of some of the more important characteristics of the design, saving the more tedious details for an ADR.
Implementation of this RFC will require the addition of a new channel and two new messages.
message LightBlockRequest {
  uint64 height = 1;
}
message LightBlockResponse {
  Header header = 1;
  Commit commit = 2;
  ValidatorSet validator_set = 3;
}
The P2P path may also enable P2P-networked light clients and a state sync that doesn't need to rely on RPC.
ReverseSync is used to fetch the following data structures:
Header
Commit
ValidatorSet
Nodes will also need to be able to verify these. This can be achieved by first retrieving the header at the base height from the block store. From this trusted header, the node hashes each of the three data structures and checks that they are correct.
The trusted header's last block ID matches the hash of the new header
header[height].LastBlockID == hash(header[height-1])
The trusted header's last commit hash matches the hash of the new commit
header[height].LastCommitHash == hash(commit[height-1])
Given that the node now trusts the new header, check that the header's validator set hash matches the hash of the validator set
header[height-1].ValidatorsHash == hash(validatorSet[height-1])
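The three checks above can be sketched as a single function. This is only an illustration under stated assumptions: the Header, Commit and ValidatorSet types are simplified, hypothetical stand-ins, and the hash functions use plain SHA-256 over a naive encoding rather than the hashing that real headers, commits and validator sets use. The explicit height check is also an addition made for the example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"errors"
)

// Simplified, hypothetical stand-ins for the three data structures.
type Header struct {
	Height         int64
	LastBlockID    []byte // hash of the header at Height-1
	LastCommitHash []byte // hash of the commit for Height-1
	ValidatorsHash []byte // hash of the validator set at Height
}

type Commit struct{ Signatures [][]byte }

type ValidatorSet struct{ Validators [][]byte }

// hashChunks is a toy hash: SHA-256 over length-prefixed chunks.
func hashChunks(chunks [][]byte) []byte {
	h := sha256.New()
	for _, c := range chunks {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(c)))
		h.Write(n[:])
		h.Write(c)
	}
	return h.Sum(nil)
}

func hashHeader(hd Header) []byte {
	var height [8]byte
	binary.BigEndian.PutUint64(height[:], uint64(hd.Height))
	return hashChunks([][]byte{height[:], hd.LastBlockID, hd.LastCommitHash, hd.ValidatorsHash})
}

func hashCommit(c Commit) []byte             { return hashChunks(c.Signatures) }
func hashValidatorSet(v ValidatorSet) []byte { return hashChunks(v.Validators) }

// verifyPrevious checks the data for height-1 against the trusted header at height.
func verifyPrevious(trusted, prev Header, commit Commit, vals ValidatorSet) error {
	if prev.Height != trusted.Height-1 {
		return errors.New("header is not at the height directly below the trusted header")
	}
	// header[height].LastBlockID == hash(header[height-1])
	if !bytes.Equal(trusted.LastBlockID, hashHeader(prev)) {
		return errors.New("last block ID does not match hash of new header")
	}
	// header[height].LastCommitHash == hash(commit[height-1])
	if !bytes.Equal(trusted.LastCommitHash, hashCommit(commit)) {
		return errors.New("last commit hash does not match hash of new commit")
	}
	// header[height-1].ValidatorsHash == hash(validatorSet[height-1])
	if !bytes.Equal(prev.ValidatorsHash, hashValidatorSet(vals)) {
		return errors.New("validators hash does not match hash of validator set")
	}
	return nil
}
```

Once verifyPrevious succeeds, the header at height-1 becomes the new trusted header and the same checks can be repeated one height lower.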
ReverseSync draws a lot of parallels with fast sync. An important consideration for fast sync that also extends to ReverseSync is termination. ReverseSync will finish its task when one of the following conditions has been met:
It reaches a block A where A.Height <= max_historical_height and A.Time <= max_historical_time.
This implies that we can't guarantee adequate history and thus the term "invariant" can't be used in the strictest sense. In the case that the first condition isn't met, the node will log an error and optimistically attempt to continue with either fast sync or consensus.
The need for a minimum block history invariant stems purely from the need to validate evidence (although there may be some application-relevant needs as well). Because of this, an alternative could be to simply trust whatever the 2/3+ majority has agreed upon, and in the case where a node is at the head of the blockchain, have it simply abstain from voting.
As it stands, if 2/3+ vote on evidence that a node can't verify, the node will halt, in the same manner as when 2/3+ vote on a header that the node sees as invalid (perhaps due to a different app hash).
Another alternative concerns the method by which the relevant data is retrieved. Instead of introducing new messages to the P2P layer, RPC could have been used.
The aforementioned data is already available via the following RPC endpoints: /commit for Headers and /validators for ValidatorSets. It was decided, predominantly due to the instability of the current RPC infrastructure, that P2P be used instead.
Proposed