Fastsync is a protocol that is used by a node to catch up to the current state of a Tendermint blockchain. Its typical use case is a node that was disconnected from the system for some time. The recovering node locally has a copy of a prefix of the blockchain, and the corresponding application state, which are slightly outdated. It then queries its peers for the blocks that were decided on by the Tendermint blockchain during the period the full node was disconnected. After receiving these blocks, it executes the transactions in the blocks in order to catch up to the current height of the blockchain and the corresponding application state.
In practice it is sufficient to catch up only close to the current height: the Tendermint consensus reactor implements its own catch-up functionality and can synchronize a node that is close to the current height of the blockchain, perhaps within 10 blocks of it. Fastsync should bring a node within this range.
Part I: Introduction of Tendermint blockchain terms that are relevant for the FastSync protocol.
Part II: Introduction of the problem addressed by the Fastsync protocol.
Fastsync Informal Problem statement: For the general audience, that is, engineers who want to get an overview of what the component is doing from a bird's eye view.
Sequential Problem statement: Provides a mathematical definition of the problem statement in its sequential form, that is, ignoring the distributed aspect of the implementation of the blockchain.
Part III: Distributed aspects of the fast sync problem, system assumptions and temporal logic specifications.
Computational Model: timing and correctness assumptions.
Distributed Problem Statement: temporal properties that formalize safety and liveness properties of fast sync in distributed setting.
Part IV: Specification of Fastsync V2 (the protocol underlying the current Golang implementation).
Definitions: Describes the inputs, outputs, and variables used by the protocol, as well as auxiliary functions.
FastSync V2: gives an outline of the solution, and details of the functions used (with preconditions, postconditions, error conditions).
Algorithm Invariants: invariants over the protocol variables that the implementation should maintain.
Part V: Analysis of Fastsync V2 that highlights several issues that prevent achieving some of the desired fault-tolerance properties. We also give some suggestions on how to address the issues in the future.
Analysis of Fastsync V2: describes undesirable scenarios of Fastsync V2, and why they violate desirable temporal logic specifications in an unreliable distributed system.
Suggestions to address the issues discussed in the analysis.
In this document we quite extensively use tags in order to be able to reference assumptions, invariants, etc. in future communication. In these tags we frequently use the following short forms:
We will briefly list some of the notions of Tendermint blockchains that are required for this specification. More details can be found here.
A set of blockchain transactions is stored in a data structure called block, which contains a field called header. (The data structure block is defined here). As the header contains hashes of the relevant fields of the block, for the purpose of this specification, we will assume that the blockchain is a list of headers, rather than a list of blocks.
The Tendermint blockchain is a list `chain` of headers.
During operation, new headers may be appended to the list one by one.
In the following, ETIME is a lower bound on the time interval between the times at which two successor blocks are added.
If a header is appended at time t then no additional header will be appended before time t + ETIME.
We assume the authenticated Byzantine fault model in which no node (faulty or correct) may break digital signatures, but otherwise, no additional assumption is made about the internal behavior of faulty nodes. That is, faulty nodes are only limited in that they cannot forge messages.
We observed that in the existing documentation the term validator refers to both a data structure and a full node that participates in the distributed computation. Therefore, we introduce the notions validator pair and validator node, respectively, to distinguish these notions in the cases where they are not clear from the context.
Given a full node, a validator pair is a pair (address, voting_power), where address is the address (i.e., the public key) of the full node, and voting_power is an integer denoting the full node's voting power.
In the Golang implementation the data type for a validator pair is called `Validator`.
A validator set is a set of validator pairs. For a validator set vs, we write TotalVotingPower(vs) for the sum of the voting powers of its validator pairs.
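For illustration, these notions can be sketched in Go as follows. The field and type names are ours, chosen for presentation only; the Golang implementation's type for a validator pair is `Validator`, as noted above.

```go
// Hypothetical minimal sketch of validator pairs and validator sets;
// names are illustrative, not the exact types of the Golang implementation.
type ValidatorPair struct {
	Address     []byte // address (public key) of the full node
	VotingPower int64  // the full node's voting power
}

type ValidatorSet []ValidatorPair

// TotalVotingPower sums the voting powers of all validator pairs in vs.
func TotalVotingPower(vs ValidatorSet) int64 {
	var sum int64
	for _, v := range vs {
		sum += v.VotingPower
	}
	return sum
}
```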
We define a predicate correctUntil(n, t), where n is a node and t is a time point. The predicate correctUntil(n, t) is true if and only if the node n follows all the protocols (at least) until time t.
A blockchain has the following configuration parameters:
If a block h is in the chain, then there exists a subset CorrV of h.NextValidators, such that:

- TotalVotingPower(CorrV) > 2/3 * TotalVotingPower(h.NextValidators), and
- every validator node in CorrV is correct until time h.Time + trustingPeriod.
Every correct full node locally stores a prefix of the current list of headers from **[TMBC-SEQ]**.
A full node has as input a block of the blockchain at height h and the corresponding application state (or the prefix of the current blockchain until height h). It has access to a set peerIDs of full nodes called peers that it knows of. The full node uses the peers to read blocks of the Tendermint blockchain (in a safe way, that is, it checks the soundness conditions), until it has read the most recent block and then terminates.
Fastsync gets as input a block of height h and the corresponding application state s that corresponds to the block and state of that height of the blockchain, and produces as output (i) a list L of blocks starting at height h to some height terminationHeight, and (ii) the application state when applying the transactions of the list L to s.
In Tendermint, the commit for block of height h is contained in block h + 1, and thus the block of height h + 1 is needed to verify the block of height h. Let us therefore clarify the following on the termination height: The returned value terminationHeight is the height of the block with the largest height that could be verified. In order to do so, Fastsync needs the block at height terminationHeight + 1 of the blockchain.
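To make the sequential problem statement concrete, here is a minimal sketch in Go. It assumes a hypothetical `fetchBlock` helper that returns the block at a given height (and fails once no higher block exists) and a hypothetical `applyTransactions` helper; it illustrates the input/output relation, not the distributed protocol.

```go
// Sketch of the sequential problem statement; fetchBlock, applyTransactions,
// and the errors import are assumptions made for illustration.
func sequentialSync(start Block, s State) ([]Block, State, error) {
	chain := []Block{start}
	prev := start
	for {
		next, err := fetchBlock(prev.Height + 1)
		if err != nil {
			break // no block above prev.Height is available
		}
		// the commit for the block of height h is contained in block h+1
		if !VerifyCommit(prev, next.Commit) {
			return nil, s, errors.New("invalid commit")
		}
		if prev.Height > start.Height {
			// the input state s already reflects the transactions of start
			s = applyTransactions(s, prev)
		}
		chain = append(chain, next)
		prev = next
	}
	// the last fetched block cannot be verified yet (its commit lives in the
	// next block), so terminationHeight is the height of its predecessor
	return chain[:len(chain)-1], s, nil
}
```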
Fastsync has to satisfy the following properties:
Let bh be the height of the blockchain at the time Fastsync starts. By assumption we have bh >= h. When Fastsync terminates, it outputs a list of all blocks from height h to some height terminationHeight >= bh - 1.
The above property is independent of how many blocks are added to the blockchain while Fastsync is running. It links the target height to the initial state. If Fastsync has to catch up on many blocks, it would be better to link the target height to a time close to the termination. This is captured by the following specification:
Let eh be the height of the blockchain at the time Fastsync terminates. There is a constant D >= 1 such that when Fastsync terminates, it outputs a list of all blocks from height h to some height terminationHeight >= eh - D.
Upon termination, the application state is the one that corresponds to the blockchain at height terminationHeight.
Fastsync eventually terminates.
We consider a node FS that performs Fastsync.
FS has access to a set peerIDs of IDs (public keys) of peers. During the execution of Fastsync, another protocol (outside of this specification) may add new IDs to peerIDs.
Peers can be faulty, and we do not make any assumptions about the number or ratio of correct/faulty nodes. Faulty processes may be Byzantine according to **[TMBC-AUTH-BYZ]**.
The system satisfies **[TMBC-AUTH-BYZ]** and **[TMBC-FM-2THIRDS]**. Thus, there is a blockchain that satisfies the soundness requirements (that is, the validation rules in [block]).
Communication between the node FS and all correct peers is reliable and bounded in time: there is a message end-to-end delay Delta such that if a message is sent at time t by a correct process to a correct process, then it will be received and processed by time t + Delta. This implies that we need a timeout of at least 2 Delta for remote procedure calls to ensure that the response of a correct peer arrives before the timeout expires.
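As an illustration, such a caller-side timeout of 2 Delta could be enforced as follows. This is a sketch: `statusRPC` is a hypothetical blocking RPC helper, and `delta` a configured bound; neither is part of the specification.

```go
// Sketch: enforce a 2*Delta timeout on a remote Status call.
// statusRPC is a hypothetical blocking RPC helper.
func statusWithTimeout(ctx context.Context, addr Address, delta time.Duration) (int64, error) {
	ctx, cancel := context.WithTimeout(ctx, 2*delta)
	defer cancel()

	type result struct {
		height int64
		err    error
	}
	ch := make(chan result, 1)
	go func() {
		h, err := statusRPC(addr) // blocking remote call
		ch <- result{h, err}
	}()

	select {
	case r := <-ch:
		return r.height, r.err
	case <-ctx.Done():
		// a correct peer would have answered within 2*Delta [FS-A-COMM];
		// treat addr as faulty from here on
		return 0, ctx.Err()
	}
}
```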
We do not assume that there is a correct full node in peerIDs. Under this assumption, no protocol can guarantee the combination of the properties [FS-SEQ-LIVE], [FS-SEQ-SAFE-START], and [FS-SEQ-SAFE-SYNC] described in the sequential specification above. Thus, in the (unreliable) distributed setting, we consider two kinds of termination (successful and failure), and we will specify below under which (favorable) conditions Fastsync is guaranteed to terminate successfully and to satisfy the requirements of the sequential problem statement:
Fastsync eventually terminates: it either terminates successfully or it terminates with failure.
As mentioned above, without assumptions on the correctness of some peers, no protocol can achieve the required specifications. Therefore, we consider the following (fairness) constraint in the safety and liveness properties below:
Initially, the set peerIDs contains at least one correct full node.
While in principle the above condition [FS-SOME-CORR-PEER] can be part of a sufficient condition to solve [FS-SEQ-LIVE] and [FS-SEQ-SAFE-START] and [FS-SEQ-SAFE-SYNC] in the distributed setting (their corresponding properties are given below), we will discuss in Part V that the current implementation of Fastsync (V2) requires the much stronger requirement **[FS-ALL-CORR-PEER]** given in Part V.
As this specification does not assume that a correct peer is at the most recent height of the blockchain (it might lag behind), the property [FS-SEQ-SAFE-START] cannot be ensured in an unreliable distributed setting. We consider the following relaxation. (Which is typically sufficient for Tendermint, as the consensus reactor then synchronizes from that height.)
Let maxh be the maximum height of a correct peer **[TMBC-CORR-FULL]** in peerIDs at the time Fastsync starts. If FastSync terminates successfully, it is at some height terminationHeight >= maxh - 1.
To address [FS-SEQ-SAFE-SYNC] we consider the following property in the distributed setting. See the comments below on the relation to the sequential version.
Under [FS-SOME-CORR-PEER], there exists a constant time interval TD, such that if term is the time Fastsync terminates and maxh is the maximum height of a correct peer **[TMBC-CORR-FULL]** in peerIDs at the time term - TD, then if FastSync terminates successfully, it is at some height terminationHeight >= maxh - 1.
TD might depend on timeouts etc. We suggest that an acceptable value for TD is in the range of approx. 10 sec., that is, the interval between two calls of `QueryStatus()`; see below. We use term - TD as the reference time, as we have to account for the communication delay between the peer and FS. After the peer has sent its last message to FS, the peer and FS run concurrently and independently. There is no assumption on the rate at which a peer can add blocks (e.g., it might be in the process of catching up itself). Hence, without additional assumptions, we cannot link [FS-DIST-SAFE-SYNC] to [FS-SEQ-SAFE-SYNC], in particular to the parameter D. We discuss a way to achieve this below.

Relation to [FS-SEQ-SAFE-SYNC]:
Under [FS-SOME-CORR-PEER], if peerIDs contains a full node that is "synchronized with the blockchain", and blockchainheight is the height of the blockchain at time term, then terminationHeight may even achieve blockchainheight - TD / ETIME; cf. **[TMBC-SEQ-APPEND-E]**, that is, the parameter D from [FS-SEQ-SAFE-SYNC] is in the range of TD / ETIME.
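As a worked example, assume TD = 10 sec (the suggested `QueryStatus()` interval) and, as an illustrative value, ETIME = 1 sec. Then D = TD / ETIME would be in the range of 10 blocks, which is within the catch-up range of the consensus reactor mentioned in the introduction.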
It is the same as the sequential version **[FS-SEQ-SAFE-STATE]**.
If there is one correct process in peerIDs [FS-SOME-CORR-PEER], Fastsync never terminates with failure. (Together with [FS-DIST-LIVE] that means it will terminate successfully.)
Here we provide a specification of the FastSync V2 protocol as it is currently implemented. The V2 design is the result of significant refactoring to improve the testability and determinism in the implementation. The architecture is detailed in ADR-43.
In the original design, a go-routine (thread of execution) was spawned for each block requested, and was responsible for both protocol logic and IO. In the V2 design, protocol logic is decoupled from IO by using three total threads of execution: a scheduler, a processor, and a demuxer.
The scheduler contains the business logic for managing peers and requesting blocks from them, while the processor handles the computationally expensive block execution. Both the scheduler and processor are structured as finite state machines that receive input events and emit output events. The demuxer is responsible for all IO, including translating between internal events and IO messages, and routing events between components.
Protocols in Tendermint can be considered to consist of two components: a "core" state machine and a "peer" state machine. The core state machine refers to the internal state managed by the node, while the peer state machine determines what messages to send to peers. In the FastSync design, the core and peer state machines correspond to the processor and scheduler, respectively.
In the case of FastSync, the core state machine (the processor) is effectively just the Tendermint block execution function, while virtually all protocol logic is contained in the peer state machine (the scheduler). The processor is only implemented as a separate component due to the computationally expensive nature of block execution. We therefore focus our specification here on the peer state machine (the scheduler component), capturing the core state machine (the processor component) in the single `Execute` function, defined below.
While the internal details of the `Execute` function are not relevant for the FastSync protocol and are thus not part of this specification, they will be defined in detail at a later date in a separate Block Execution specification.
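As an illustration of this architecture, the following sketch wires the two state machines together through a demuxer using Go channels. The types and routing here are ours, chosen for illustration; the actual implementation (see ADR-43) differs in detail.

```go
// Sketch of the V2 event flow; Event, processorEvent, and the channel-based
// routing are illustrative assumptions, not the implementation's types.
type Event interface{}

// events the scheduler emits for the processor (e.g., "block received")
type processorEvent struct{ Event }

type scheduler struct{ /* peer bookkeeping, pending requests, ... */ }
type processor struct{ /* block-execution state */ }

// Both state machines are pure: they consume one input event and emit
// output events, performing no IO themselves.
func (s *scheduler) handle(ev Event) []Event { /* protocol logic  */ return nil }
func (p *processor) handle(ev Event) []Event { /* block execution */ return nil }

// demux owns all IO: it routes peer messages into the scheduler, scheduler
// outputs into the processor or back out to peers, and processor outputs
// back out to peers.
func demux(s *scheduler, p *processor, fromPeers <-chan Event, toPeers chan<- Event) {
	for ev := range fromPeers {
		for _, out := range s.handle(ev) {
			if pe, ok := out.(processorEvent); ok {
				for _, res := range p.handle(pe.Event) {
					toPeers <- res
				}
			} else {
				toPeers <- out
			}
		}
	}
}
```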
We now introduce variables and auxiliary functions used by the protocol.
- `height`: initially startBlock.Height + 1 (height should be thought of as "the height of the next block we need to download")
- `state`: initially startState
- `peerIDs`: peer addresses [FS-A-PEER-IDS]
- `peerHeights`: stores for each peer the height it reported; initially 0
- `pendingBlocks`: stores for each height which peer was queried; initially nil for each height
- `receivedBlocks`: stores for each height which peer returned it; initially nil
- `blockstore`: stores for each height greater than startBlock.Height the block of that height; initially nil for all heights
- `peerTimeStamp`: stores for each peer the time a block was last received from it
- `pendingTime`: stores for a given height the time the block was requested
- `peerRate`: stores for each peer the rate of received data in bytes/second
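Collected in Go syntax, this state could be pictured as follows. This is a sketch only; the actual implementation splits this state between the scheduler and the processor, and the `State` and `Address` types are assumed here.

```go
// Sketch of the protocol variables; field names follow the list above.
type fastSyncState struct {
	height         int64                 // next block to download; initially startBlock.Height + 1
	state          State                 // application state; initially startState
	peerIDs        []Address             // known peers [FS-A-PEER-IDS]
	peerHeights    map[Address]int64     // last height reported by each peer; initially 0
	pendingBlocks  map[int64]Address     // which peer was queried for each height
	receivedBlocks map[int64]Address     // which peer returned the block of each height
	blockstore     map[int64]*Block      // blocks above startBlock.Height
	peerTimeStamp  map[Address]time.Time // time of the last block received per peer
	pendingTime    map[int64]time.Time   // time each pending block was requested
	peerRate       map[Address]int64     // received data rate per peer, in bytes/second
}
```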
```go
func VerifyCommit(b Block, c Commit) Boolean
```

Comment: this corresponds to `verifyCommit(chainID string, blockID types.BlockID, height int64, commit *types.Commit) error` in the current Golang implementation, which expects blockID and height (from the first block) and the corresponding commit from the following block. We use the simplified form for ease of presentation.

- Implementation remark
- Expected precondition
- Expected postcondition
- Error condition
Peers participating in FastSync exchange the following set of messages. Messages are encoded using the Amino serialization protocol. We define each message here using Go syntax, annotated with the Amino type name. The prefix `bc` refers to `blockchain`, which is the name of the FastSync reactor in the Go implementation.
// type: "tendermint/blockchain/BlockRequest"
type bcBlockRequestMessage struct {
Height int64
}
Remark:
msg.Height
> 0// type: "tendermint/blockchain/NoBlockResponse"
type bcNoBlockResponseMessage struct {
Height int64
}
Remark:
msg.Height
> 0// type: "tendermint/blockchain/BlockResponse"
type bcBlockResponseMessage struct {
Block *types.Block
}
Remark:
msg.Block
is a Tendermint block as defined in [block].msg.Block
!= nil// type: "tendermint/blockchain/StatusRequest"
type bcStatusRequestMessage struct {
Height int64
}
Remark:
msg.Height
> 0// type: "tendermint/blockchain/StatusResponse"
type bcStatusResponseMessage struct {
Height int64
}
Remark:
msg.Height
> 0Peers expose the following functions over remote procedure calls. The "Expected precondition" are only expected for correct peers (as no assumption is made on internals of faulty processes [FS-A-PEER]). These functions are implemented using the above defined message types.
In this document we describe the communication with peers via asynchronous RPCs.
```go
func Status(addr Address) (int64, error)
```

- Request message: `bcStatusRequestMessage`.
- Response message: `bcStatusResponseMessage`.
- Expected postcondition: returns the current `height` of the peer. [FS-A-COMM]

```go
func Block(addr Address, height int64) (Block, error)
```

- Request message: `bcBlockRequestMessage`.
- Response message: `bcBlockResponseMessage` or `bcNoBlockResponseMessage`.
- Expected postcondition: returns the block of the given `height` from the blockchain. [FS-A-COMM]
- Error condition: the peer reports that it does not have the block of the given `height` (`bcNoBlockResponseMessage`). [FS-A-COMM]

The protocol is described in terms of functions that are triggered by (external) events. The implementation uses a scheduler and a de-multiplexer to deal with communicating with peers and to trigger the execution of these functions:
- `QueryStatus()`: regularly (currently every 10 sec; necessarily an interval greater than 2 Delta) queries all peers from peerIDs for their current height [TMBC-CORR-FULL]. It does so by calling `Status(n)` remotely on all peers n.
- `CreateRequest`: regularly checks whether certain blocks have no open request. If a block does not have an open request, it requests one from a peer. It does so by calling `Block(n, h)` remotely on one peer n for a missing height h.
We have left the strategy for how peers are selected unspecified, and the currently existing implementations of Fastsync differ in this aspect. In V2, a peer p with the minimum number of pending requests that can serve the required height h is selected, that is, a peer with peerHeight(p) >= h.
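A sketch of this selection rule, assuming the `fastSyncState` sketch above and a hypothetical `pendingCount` helper that counts the entries of pendingBlocks assigned to a peer:

```go
// Sketch: pick the peer with the fewest pending requests among those whose
// reported height is at least h. pendingCount is a hypothetical helper.
func selectPeer(fs fastSyncState, h int64) (Address, bool) {
	var best Address
	found := false
	bestPending := int(^uint(0) >> 1) // max int
	for _, p := range fs.peerIDs {
		if fs.peerHeights[p] >= h {
			if n := pendingCount(fs, p); n < bestPending {
				best, bestPending, found = p, n, true
			}
		}
	}
	return best, found
}
```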
The functions `Status` and `Block` are called by asynchronous RPC. When they return, the following functions are called:

- `OnStatusResponse(addr Address, height int64)`: the full node with address addr returns its current height. The function updates the height information about addr, and may also increase TargetHeight.
- `OnBlockResponse(addr Address, b Block)`: the full node with address addr returns a block. It is added to blockstore. Then the auxiliary function `Execute` is called.
- `Execute()`: iterates over the blockstore, checks the soundness of the blocks, executes the transactions of each sound block, and updates state.
In addition to the functions above, the following two features are implemented in Fastsync V2:
Periodically, peerTimeStamp, peerRate, and pendingTime are analyzed. If a peer p has not provided a block recently (check of peerTimeStamp[p]) or has not provided sufficient data (check of peerRate[p]), then p is removed from peerIDs. In addition, pendingTime is used to estimate whether the peer that is responsible for the current height has provided the corresponding block on time.
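A sketch of such a monitoring pass; `maxIdle` and `minRate` are illustrative thresholds, not values taken from the implementation:

```go
// Sketch: periodically drop peers that are idle or too slow.
// maxIdle and minRate are assumed example thresholds.
func prunePeers(fs *fastSyncState, now time.Time, maxIdle time.Duration, minRate int64) {
	var kept []Address
	for _, p := range fs.peerIDs {
		idle := now.Sub(fs.peerTimeStamp[p]) > maxIdle // no recent block
		slow := fs.peerRate[p] < minRate               // insufficient data rate
		if !idle && !slow {
			kept = append(kept, p)
		}
	}
	fs.peerIDs = kept
}
```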
Fastsync V2 starts a timeout whenever a block is executed (that is, whenever the height is incremented). If the timeout expires before the next block is executed, Fastsync terminates with failure.
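This termination timeout could be sketched as follows, where each executed block signals on a `progress` channel; `timeoutDur` is an illustrative duration, not the implementation's configuration value.

```go
// Sketch: terminate with failure if no block is executed for timeoutDur.
func watchProgress(progress <-chan struct{}, timeoutDur time.Duration, fail func()) {
	timer := time.NewTimer(timeoutDur)
	defer timer.Stop()
	for {
		select {
		case <-progress: // a block was executed; height was incremented
			if !timer.Stop() {
				<-timer.C
			}
			timer.Reset(timeoutDur)
		case <-timer.C:
			fail() // Fastsync terminates with failure [FS-V2-TIMEOUT]
			return
		}
	}
}
```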
```go
func QueryStatus()
```

- Expected postcondition: `Status(n)` is called at each peer n in peerIDs.

```go
func OnStatusResponse(addr Address, ht int64)
```

- Timeout condition: if `OnStatusResponse(addr, ht)` was not invoked within 2 Delta after `Status(addr)` was called: addr not in peerIDs.

```go
func CreateRequest
```

- Expected postcondition: `Block` is called remotely at a peer addr in peerIDs for a missing height h.

```go
func OnBlockResponse(addr Address, b Block)
```

- Expected postcondition: `Execute` is called.
- Timeout condition: if `Execute` has been executed without error or was not executed, and `OnBlockResponse(addr, b)` was not invoked within 2 Delta after `Block(addr, h)` was called for b.Height = h: addr not in peerIDs.

```go
func Execute()
```
In contrast to the temporal properties above that define the problem statement, the following are invariants on the solution to the problem, that is, on the algorithm. These invariants are useful for verification, but can also guide the implementation.
It is always the case that state corresponds to the application state of the blockchain of that height, that is, state = chain[height - 1].AppState; chain is defined in **[TMBC-SEQ]**.
It is always the case that the set peerIDs only contains nodes that have not yet misbehaved (by sending wrong data or timing out).
For startBlock.Height <= i < height - 1, let b(i) be the block with height i in blockstore; it always holds that VerifyCommit(b(i), b(i+1).Commit) = true. This means that height can only be incremented if all blocks with lower height have been verified.
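For illustration, the invariant could be checked as follows. This is a sketch; `blockAt(i)` is a hypothetical helper that returns startBlock for i = startBlock.Height and blockstore[i] otherwise.

```go
// Sketch: check invariant [FS-VAR-BLOCK-INV] over the stored blocks.
func checkBlockInvariant(startHeight, height int64) bool {
	for i := startHeight; i < height-1; i++ {
		b, next := blockAt(i), blockAt(i+1)
		if b == nil || next == nil || !VerifyCommit(*b, next.Commit) {
			return false
		}
	}
	return true
}
```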
If two blocks are not matching [FS-V2-Verif], `Execute` dismisses both blocks and removes the peers that provided these blocks from peerIDs. If block a was correct and provided by a correct peer p, and block b was faulty and provided by a faulty peer, the protocol thus also removes the correct peer p from peerIDs.

By [FS-A-PEER] we do not put a restriction on the number of faulty peers, so faulty peers can make FS remove all correct peers from peerIDs. As a result, this version of Fastsync violates [FS-DIST-SAFE-SYNC].
Due to **[FS-ISSUE-KILL]**, from some point on, only faulty peers may be in peerIDs. They can thus control the rate at which Fastsync receives blocks. If the timeout duration from [FS-V2-TIMEOUT] is greater than the time it takes to add a block to the blockchain (ETIME in **[TMBC-SEQ-APPEND-E]**), the protocol may never terminate and thus violate [FS-DIST-LIVE]. This scenario is even possible if a correct peer is always in peerIDs, but faulty peers are regularly asked for blocks.
The issues [FS-ISSUE-KILL] and [FS-ISSUE-NON-TERM] explain why Fastsync V2 does not satisfy the property [FS-DIST-LIVE] relevant for termination. As a result, V2 only solves the specifications in a restricted form, namely, when all peers are correct:
At all times, the set peerIDs contains only correct full nodes.
With this restriction we can give the achieved properties:
Under [FS-ALL-CORR-PEER], Fastsync never terminates with failure.
Under [FS-ALL-CORR-PEER], Fastsync eventually terminates successfully.
In a fault-tolerance context this is problematic, as it means that faulty peers can prevent FastSync from terminating. We observe that this also touches the other properties, namely [FS-DIST-SAFE-START] and [FS-DIST-SAFE-SYNC]: termination at an acceptable height is conditional on successful termination. The properties above severely restrict the circumstances under which FastSync (V2) terminates successfully. As a result, large parts of the current implementation of FastSync are not fault-tolerant. We will discuss this, together with suggestions on how to solve it, after the description of the current protocol.
To avoid [FS-ISSUE-KILL], we observe that **[TMBC-FM-2THIRDS]** ensures that, from the point in time a block was created, more than two thirds of the validator nodes are correct until the trustingPeriod expires. Under this assumption, assume that the trusting period of startBlock has not expired by the time FastSync checks a block b1 with height startBlock.Height + 1. To do so, we first need to check whether the Commit in the block b2 with height startBlock.Height + 2 contains more than 2/3 of the voting power in startBlock.NextValidators. If this is the case, we can check VerifyCommit(b1, b2.Commit). If we perform the checks in this order, we observe:
Based on this reasoning, we can ensure to only remove faulty peers from peerIDs. That is, if we sequentially verify blocks starting with startBlock, we will never remove a correct peer from peerIDs and we will be able to ensure the following invariant:
If a peer never misbehaves, it is never removed from peerIDs. It follows that under [FS-SOME-CORR-PEER], peerIDs is always non-empty.
To ensure this, we suggest changing the protocol as follows:
[NewFS-A-INIT] is the suggested replacement of [FS-A-V2-INIT]. This will allow us to use the established trust to understand precisely which peer reported an invalid block in order to ensure the invariant [NewFS-VAR-TRUST-INV] below:
Let b(i) be the block in trustedBlockstore with b(i).Height = i. It holds that for startHeight < i < height - 1, VerifyCommit(b(i), b(i+1).Commit) = true.
We propose to update the function `Execute`. To do so, we first define the following helper functions:
```go
func ValidCommit(VS ValidatorSet, C Commit) Boolean
```
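The intended semantics, following the reasoning on **[TMBC-FM-2THIRDS]** above, can be sketched as follows; `checkSignature` is a hypothetical helper that verifies one validator's signature within the commit.

```go
// Sketch: ValidCommit(vs, c) should hold iff c contains valid signatures from
// validator pairs in vs holding more than 2/3 of TotalVotingPower(vs).
func ValidCommit(vs ValidatorSet, c Commit) bool {
	var signed int64
	for _, v := range vs {
		if checkSignature(v.Address, c) { // hypothetical signature check
			signed += v.VotingPower
		}
	}
	return 3*signed > 2*TotalVotingPower(vs)
}
```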
```go
func SequentialVerify {
    while (true) {
        b1 = blockstore[height];
        b2 = blockstore[height+1];
        if b1 == nil or b2 == nil {
            exit;
        }
        if ValidCommit(trustedBlockstore[height - 1].NextValidators, b2.Commit) {
            // we trust b2
            if VerifyCommit(b1, b2.Commit) {
                trustedBlockstore.Add(b1);
                height = height + 1;
            }
            else {
                // as we trust b2, b1 must be faulty;
                // we remove all blocks received from the faulty peer
                blockstore.RemoveFromPeer(receivedBlocks[height]);
                peerIDs.Remove(receivedBlocks[height]);
                exit;
            }
        } else {
            // b2 is faulty;
            // we remove all blocks received from the faulty peer
            blockstore.RemoveFromPeer(receivedBlocks[height + 1]);
            peerIDs.Remove(receivedBlocks[height + 1]);
            exit;
        }
    }
}
```
Then `Execute` just consists in calling `SequentialVerify` and then updating the application state to the (new) height.
```go
func Execute()
```

- Expected postcondition: `SequentialVerify` is executed.

As discussed above, the advantageous termination requirement is the combination of [FS-DIST-LIVE] and [FS-DIST-NONABORT], that is, Fastsync should terminate successfully in case there is at least one correct peer in peerIDs. For this we have to ensure that faulty processes cannot slow us down and provide blocks at a lower rate than the blockchain may grow. To ensure that, we will have to add an assumption on message delays.
2 Delta < ETIME; cf. **[TMBC-SEQ-APPEND-E]**.
This assumption implies that the timeouts for `OnBlockResponse` and `OnStatusResponse` are such that a faulty peer that tries to respond slower than 2 Delta will be removed. In the following we provide a rough estimate of the termination time in a fault-prone scenario. We assume that during a "long enough" finite good period no new faulty peers are added to peerIDs. Below we sketch how "long enough" can be estimated based on the timing assumptions in this specification.
Let Sigma be the (upper bound on the) time between two calls of `QueryStatus()`.

A time interval [begin, end] is a good period if:
In the analysis below we assume that the termination condition of Fastsync is height = TargetHeight in the postcondition of `Execute`. Therefore, [NewFS-A-STATUS-INTERVAL] does not interfere with this analysis. If a correct peer reports a new height "shortly before termination", this leads to an additional round trip to request and add the new block. Then [NewFS-A-DELTA] ensures that Fastsync catches up.
Arguments:
1. If a faulty peer p reports a faulty block, `SequentialVerify` will eventually remove p from peerIDs.
2. By `SequentialVerify`, if a faulty peer p reports multiple faulty blocks, p will be removed upon trying to check the block with the smallest height received from p.
3. Assume that whenever a block does not have an open request, `CreateRequest` is called immediately, which calls `Block(n)` on a peer. Say this happens at time t. There are two cases:
   a. a block is returned by time t + 2 Delta, or
   b. `Block(n)` timed out and n is removed from peerIDs.
4. Let f(t) be the number of faulty peers in peerIDs at time t; f(begin) = fmax.
5. Let t_i be the sequence of times at which `OnBlockResponse(addr, b)` is invoked or times out with b.Height = height + 1.
6. By 3., t_1 <= begin + 2 Delta.
7. By an inductive argument we prove for i > 0 that at each t_i either height is incremented or a faulty peer is removed from peerIDs. Argument: if the peer is faulty and does not return a block, the peer is removed; if it is faulty and returns a faulty block, `SequentialVerify` removes the peer (b). If the returned block is OK, height is increased (a).
By 2. and 7., faulty peers can delay incrementing the height at most fmax times, where each time "costs" 2 Delta seconds. We have an additional 2 Delta initial offset (3a) plus 2 Delta to get all missing blocks after the last fault has shown itself. (This assumes that an arbitrary number of blocks can be obtained and checked within one round trip of 2 Delta, which either needs a conservative estimation of Delta or a more refined analysis.) Thus we reach the targetHeight and terminate by time end.
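For illustration, the bound above sums up as 2 Delta (initial offset, 3a) + fmax * 2 Delta (one round trip per exposed fault) + 2 Delta (fetching the remaining blocks), that is, (fmax + 2) * 2 Delta in total. As an example, with the assumed values Delta = 1 sec and fmax = 5, a good period of 14 seconds would suffice for Fastsync to terminate.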
[block] Specification of the block data structure.