You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 
 
 

349 lines
16 KiB

Byzantine Consensus Algorithm
=============================
Terms
-----
- The network is composed of optionally connected *nodes*. Nodes
directly connected to a particular node are called *peers*.
- The consensus process in deciding the next block (at some *height*
``H``) is composed of one or many *rounds*.
- ``NewHeight``, ``Propose``, ``Prevote``, ``Precommit``, and
``Commit`` represent state machine states of a round. (aka
``RoundStep`` or just "step").
- A node is said to be *at* a given height, round, and step, or at
``(H,R,S)``, or at ``(H,R)`` in short to omit the step.
- To *prevote* or *precommit* something means to broadcast a `prevote
vote <https://godoc.org/github.com/tendermint/tendermint/types#Vote>`__
or `first precommit
vote <https://godoc.org/github.com/tendermint/tendermint/types#FirstPrecommit>`__
for something.
- A vote *at* ``(H,R)`` is a vote signed with the bytes for ``H`` and
``R`` included in its
`sign-bytes <block-structure.html#vote-sign-bytes>`__.
- *+2/3* is short for "more than 2/3"
- *1/3+* is short for "1/3 or more"
- A set of +2/3 of prevotes for a particular block or ``<nil>`` at
``(H,R)`` is called a *proof-of-lock-change* or *PoLC* for short.
State Machine Overview
----------------------
At each height of the blockchain a round-based protocol is run to
determine the next block. Each round is composed of three *steps*
(``Propose``, ``Prevote``, and ``Precommit``), along with two special
steps ``Commit`` and ``NewHeight``.
In the optimal scenario, the order of steps is:
::
NewHeight -> (Propose -> Prevote -> Precommit)+ -> Commit -> NewHeight ->...
The sequence ``(Propose -> Prevote -> Precommit)`` is called a *round*.
There may be more than one round required to commit a block at a given
height. Examples for why more rounds may be required include:
- The designated proposer was not online.
- The block proposed by the designated proposer was not valid.
- The block proposed by the designated proposer did not propagate in
time.
- The block proposed was valid, but +2/3 of prevotes for the proposed
block were not received in time for enough validator nodes by the
time they reached the ``Precommit`` step. Even though +2/3 of
prevotes are necessary to progress to the next step, at least one
validator may have voted ``<nil>`` or maliciously voted for something
else.
- The block proposed was valid, and +2/3 of prevotes were received for
enough nodes, but +2/3 of precommits for the proposed block were not
received for enough validator nodes.
Some of these problems are resolved by moving onto the next round &
proposer. Others are resolved by increasing certain round timeout
parameters over each successive round.
State Machine Diagram
---------------------
::
+-------------------------------------+
v |(Wait til `CommmitTime+timeoutCommit`)
+-----------+ +-----+-----+
+----------> | Propose +--------------+ | NewHeight |
| +-----------+ | +-----------+
| | ^
|(Else, after timeoutPrecommit) v |
+-----+-----+ +-----------+ |
| Precommit | <------------------------+ Prevote | |
+-----+-----+ +-----------+ |
|(When +2/3 Precommits for block found) |
v |
+--------------------------------------------------------------------+
| Commit |
| |
| * Set CommitTime = now; |
| * Wait for block, then stage/save/commit block; |
+--------------------------------------------------------------------+
Background Gossip
-----------------
A node may not have a corresponding validator private key, but it
nevertheless plays an active role in the consensus process by relaying
relevant meta-data, proposals, blocks, and votes to its peers. A node
that has the private keys of an active validator and is engaged in
signing votes is called a *validator-node*. All nodes (not just
validator-nodes) have an associated state (the current height, round,
and step) and work to make progress.
Between two nodes there exists a ``Connection``, and multiplexed on top
of this connection are fairly throttled ``Channel``\ s of information.
An epidemic gossip protocol is implemented among some of these channels
to bring peers up to speed on the most recent state of consensus. For
example,
- Nodes gossip ``PartSet`` parts of the current round's proposer's
proposed block. A LibSwift inspired algorithm is used to quickly
broadcast blocks across the gossip network.
- Nodes gossip prevote/precommit votes. A node NODE\_A that is ahead of
NODE\_B can send NODE\_B prevotes or precommits for NODE\_B's current
(or future) round to enable it to progress forward.
- Nodes gossip prevotes for the proposed PoLC (proof-of-lock-change)
round if one is proposed.
- Nodes gossip to nodes lagging in blockchain height with block
`commits <https://godoc.org/github.com/tendermint/tendermint/types#Commit>`__
for older blocks.
- Nodes opportunistically gossip ``HasVote`` messages to hint peers
what votes it already has.
- Nodes broadcast their current state to all neighboring peers. (but is
not gossiped further)
There's more, but let's not get ahead of ourselves here.
Proposals
---------
A proposal is signed and published by the designated proposer at each
round. The proposer is chosen by a deterministic and non-choking round
robin selection algorithm that selects proposers in proportion to their
voting power. (see
`implementation <https://github.com/tendermint/tendermint/blob/develop/types/validator_set.go>`__)
A proposal at ``(H,R)`` is composed of a block and an optional latest
``PoLC-Round < R`` which is included iff the proposer knows of one. This
hints the network to allow nodes to unlock (when safe) to ensure the
liveness property.
State Machine Spec
------------------
Propose Step (height:H,round:R)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Upon entering ``Propose``: - The designated proposer proposes a block at
``(H,R)``.
The ``Propose`` step ends: - After ``timeoutProposeR`` after entering
``Propose``. --> goto ``Prevote(H,R)`` - After receiving proposal block
and all prevotes at ``PoLC-Round``. --> goto ``Prevote(H,R)`` - After
`common exit conditions <#common-exit-conditions>`__
Prevote Step (height:H,round:R)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Upon entering ``Prevote``, each validator broadcasts its prevote vote.
- First, if the validator is locked on a block since ``LastLockRound``
but now has a PoLC for something else at round ``PoLC-Round`` where
``LastLockRound < PoLC-Round < R``, then it unlocks.
- If the validator is still locked on a block, it prevotes that.
- Else, if the proposed block from ``Propose(H,R)`` is good, it
prevotes that.
- Else, if the proposal is invalid or wasn't received on time, it
prevotes ``<nil>``.
The ``Prevote`` step ends: - After +2/3 prevotes for a particular block
or ``<nil>``. --> goto ``Precommit(H,R)`` - After ``timeoutPrevote``
after receiving any +2/3 prevotes. --> goto ``Precommit(H,R)`` - After
`common exit conditions <#common-exit-conditions>`__
Precommit Step (height:H,round:R)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Upon entering ``Precommit``, each validator broadcasts its precommit
vote. - If the validator has a PoLC at ``(H,R)`` for a particular block
``B``, it (re)locks (or changes lock to) and precommits ``B`` and sets
``LastLockRound = R``. - Else, if the validator has a PoLC at ``(H,R)``
for ``<nil>``, it unlocks and precommits ``<nil>``. - Else, it keeps the
lock unchanged and precommits ``<nil>``.
A precommit for ``<nil>`` means "I didn’t see a PoLC for this round, but
I did get +2/3 prevotes and waited a bit".
The Precommit step ends: - After +2/3 precommits for ``<nil>``. --> goto
``Propose(H,R+1)`` - After ``timeoutPrecommit`` after receiving any +2/3
precommits. --> goto ``Propose(H,R+1)`` - After `common exit
conditions <#common-exit-conditions>`__
common exit conditions
^^^^^^^^^^^^^^^^^^^^^^
- After +2/3 precommits for a particular block. --> goto ``Commit(H)``
- After any +2/3 prevotes received at ``(H,R+x)``. --> goto
``Prevote(H,R+x)``
- After any +2/3 precommits received at ``(H,R+x)``. --> goto
``Precommit(H,R+x)``
Commit Step (height:H)
~~~~~~~~~~~~~~~~~~~~~~
- Set ``CommitTime = now()``
- Wait until block is received. --> goto ``NewHeight(H+1)``
NewHeight Step (height:H)
~~~~~~~~~~~~~~~~~~~~~~~~~
- Move ``Precommits`` to ``LastCommit`` and increment height.
- Set ``StartTime = CommitTime+timeoutCommit``
- Wait until ``StartTime`` to receive straggler commits. --> goto
``Propose(H,0)``
Proofs
------
Proof of Safety
~~~~~~~~~~~~~~~
Assume that at most -1/3 of the voting power of validators is byzantine.
If a validator commits block ``B`` at round ``R``, it's because it saw
+2/3 of precommits at round ``R``. This implies that 1/3+ of honest
nodes are still locked at round ``R' > R``. These locked validators will
remain locked until they see a PoLC at ``R' > R``, but this won't happen
because 1/3+ are locked and honest, so at most -2/3 are available to
vote for anything other than ``B``.
Proof of Liveness
~~~~~~~~~~~~~~~~~
If 1/3+ honest validators are locked on two different blocks from
different rounds, a proposers' ``PoLC-Round`` will eventually cause
nodes locked from the earlier round to unlock. Eventually, the
designated proposer will be one that is aware of a PoLC at the later
round. Also, ``timeoutProposalR`` increments with round ``R``, while the
size of a proposal are capped, so eventually the network is able to
"fully gossip" the whole proposal (e.g. the block & PoLC).
Proof of Fork Accountability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Define the JSet (justification-vote-set) at height ``H`` of a validator
``V1`` to be all the votes signed by the validator at ``H`` along with
justification PoLC prevotes for each lock change. For example, if ``V1``
signed the following precommits: ``Precommit(B1 @ round 0)``,
``Precommit(<nil> @ round 1)``, ``Precommit(B2 @ round 4)`` (note that
no precommits were signed for rounds 2 and 3, and that's ok),
``Precommit(B1 @ round 0)`` must be justified by a PoLC at round 0, and
``Precommit(B2 @ round 4)`` must be justified by a PoLC at round 4; but
the precommit for ``<nil>`` at round 1 is not a lock-change by
definition so the JSet for ``V1`` need not include any prevotes at round
1, 2, or 3 (unless ``V1`` happened to have prevoted for those rounds).
Further, define the JSet at height ``H`` of a set of validators ``VSet``
to be the union of the JSets for each validator in ``VSet``. For a given
commit by honest validators at round ``R`` for block ``B`` we can
construct a JSet to justify the commit for ``B`` at ``R``. We say that a
JSet *justifies* a commit at ``(H,R)`` if all the committers (validators
in the commit-set) are each justified in the JSet with no duplicitous
vote signatures (by the committers).
- **Lemma**: When a fork is detected by the existence of two
conflicting `commits <./validators.html#commiting-a-block>`__,
the union of the JSets for both commits (if they can be compiled)
must include double-signing by at least 1/3+ of the validator set.
**Proof**: The commit cannot be at the same round, because that would
immediately imply double-signing by 1/3+. Take the union of the JSets
of both commits. If there is no double-signing by at least 1/3+ of
the validator set in the union, then no honest validator could have
precommitted any different block after the first commit. Yet, +2/3
did. Reductio ad absurdum.
As a corollary, when there is a fork, an external process can determine
the blame by requiring each validator to justify all of its round votes.
Either we will find 1/3+ who cannot justify at least one of their votes,
and/or, we will find 1/3+ who had double-signed.
Alternative algorithm
~~~~~~~~~~~~~~~~~~~~~
Alternatively, we can take the JSet of a commit to be the "full commit".
That is, if light clients and validators do not consider a block to be
committed unless the JSet of the commit is also known, then we get the
desirable property that if there ever is a fork (e.g. there are two
conflicting "full commits"), then 1/3+ of the validators are immediately
punishable for double-signing.
There are many ways to ensure that the gossip network efficiently share
the JSet of a commit. One solution is to add a new message type that
tells peers that this node has (or does not have) a +2/3 majority for B
(or ) at (H,R), and a bitarray of which votes contributed towards that
majority. Peers can react by responding with appropriate votes.
We will implement such an algorithm for the next iteration of the
Tendermint consensus protocol.
Other potential improvements include adding more data in votes such as
the last known PoLC round that caused a lock change, and the last voted
round/step (or, we may require that validators not skip any votes). This
may make JSet verification/gossip logic easier to implement.
Censorship Attacks
~~~~~~~~~~~~~~~~~~
Due to the definition of a block
`commit <validators.html#commiting-a-block>`__, any 1/3+
coalition of validators can halt the blockchain by not broadcasting
their votes. Such a coalition can also censor particular transactions by
rejecting blocks that include these transactions, though this would
result in a significant proportion of block proposals to be rejected,
which would slow down the rate of block commits of the blockchain,
reducing its utility and value. The malicious coalition might also
broadcast votes in a trickle so as to grind blockchain block commits to
a near halt, or engage in any combination of these attacks.
If a global active adversary were also involved, it can partition the
network in such a way that it may appear that the wrong subset of
validators were responsible for the slowdown. This is not just a
limitation of Tendermint, but rather a limitation of all consensus
protocols whose network is potentially controlled by an active
adversary.
Overcoming Forks and Censorship Attacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For these types of attacks, a subset of the validators through external
means should coordinate to sign a reorg-proposal that chooses a fork
(and any evidence thereof) and the initial subset of validators with
their signatures. Validators who sign such a reorg-proposal forego its
collateral on all other forks. Clients should verify the signatures on
the reorg-proposal, verify any evidence, and make a judgement or prompt
the end-user for a decision. For example, a phone wallet app may prompt
the user with a security warning, while a refrigerator may accept any
reorg-proposal signed by +½ of the original validators.
No non-synchronous Byzantine fault-tolerant algorithm can come to
consensus when ⅓+ of validators are dishonest, yet a fork assumes that
⅓+ of validators have already been dishonest by double-signing or
lock-changing without justification. So, signing the reorg-proposal is a
coordination problem that cannot be solved by any non-synchronous
protocol (i.e. automatically, and without making assumptions about the
reliability of the underlying network). It must be provided by means
external to the weakly-synchronous Tendermint consensus algorithm. For
now, we leave the problem of reorg-proposal coordination to human
coordination via internet media. Validators must take care to ensure
that there are no significant network partitions, to avoid situations
where two conflicting reorg-proposals are signed.
Assuming that the external coordination medium and protocol is robust,
it follows that forks are less of a concern than `censorship
attacks <#censorship-attacks>`__.