# Blockchain Reactor

The Blockchain Reactor's high-level responsibility is to enable peers who are
far behind the current state of the consensus to quickly catch up by downloading
many blocks in parallel, verifying their commits, and executing them against the
ABCI application.

Tendermint full nodes run the Blockchain Reactor as a service to provide blocks
to new nodes. New nodes run the Blockchain Reactor in "fast_sync" mode,
where they actively make requests for more blocks until they sync up.
Once caught up, "fast_sync" mode is disabled and the node switches to
using (and turns on) the Consensus Reactor.

## Message Types

```go
const (
	msgTypeBlockRequest    = byte(0x10)
	msgTypeBlockResponse   = byte(0x11)
	msgTypeNoBlockResponse = byte(0x12)
	msgTypeStatusResponse  = byte(0x20)
	msgTypeStatusRequest   = byte(0x21)
)
```

```go
type bcBlockRequestMessage struct {
	Height int64
}

type bcNoBlockResponseMessage struct {
	Height int64
}

type bcBlockResponseMessage struct {
	Block Block
}

type bcStatusRequestMessage struct {
	Height int64
}

type bcStatusResponseMessage struct {
	Height int64
}
```
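
As an illustration of how the type bytes and message structs relate, the sketch below maps a type byte to the name of the corresponding message. It assumes the type byte is available to the receiver before decoding, which the text above does not state; `messageName` is a hypothetical helper, not part of the reactor.

```go
package main

import "fmt"

// Type bytes as listed in the constants above.
const (
	msgTypeBlockRequest    = byte(0x10)
	msgTypeBlockResponse   = byte(0x11)
	msgTypeNoBlockResponse = byte(0x12)
	msgTypeStatusResponse  = byte(0x20)
	msgTypeStatusRequest   = byte(0x21)
)

// messageName is a hypothetical helper: it maps a type byte to the name of the
// message struct it denotes and reports whether the byte is known.
func messageName(t byte) (string, bool) {
	switch t {
	case msgTypeBlockRequest:
		return "bcBlockRequestMessage", true
	case msgTypeBlockResponse:
		return "bcBlockResponseMessage", true
	case msgTypeNoBlockResponse:
		return "bcNoBlockResponseMessage", true
	case msgTypeStatusResponse:
		return "bcStatusResponseMessage", true
	case msgTypeStatusRequest:
		return "bcStatusRequestMessage", true
	default:
		return "", false
	}
}

func main() {
	for _, t := range []byte{0x10, 0x12, 0x21, 0x7f} {
		name, ok := messageName(t)
		fmt.Printf("0x%02x -> %q known=%v\n", t, name, ok)
	}
}
```

An unknown byte is reported rather than ignored, so the caller can decide how to treat the sending peer.
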
## Architecture and algorithm

The Blockchain reactor is organised as a set of concurrent tasks:

- Receive routine of Blockchain Reactor
- Task for creating Requesters
- Set of Requesters tasks
- Controller task

![Blockchain Reactor Architecture Diagram](img/bc-reactor.png)

### Data structures

These are the core data structures necessary to provide the Blockchain Reactor logic.

The Requester data structure is used to track the assignment of a request for `block` at position `height` to a peer with ID equal to `peerID`.

```go
type Requester struct {
	mtx         sync.Mutex
	block       Block
	height      int64
	peerID      p2p.ID
	redoChannel chan p2p.ID // redo may be sent multiple times; the peer ID is used to identify repeats
}
```

Pool is the core data structure that stores the last executed block (`height`), the assignment of requests to peers (`requesters`), the current height and the number of pending requests for each peer (`peers`), the maximum peer height, etc.

```go
type Pool struct {
	mtx             sync.Mutex
	requesters      map[int64]*Requester
	height          int64
	peers           map[p2p.ID]*Peer
	maxPeerHeight   int64
	numPending      int32
	store           BlockStore
	requestsChannel chan<- BlockRequest
	errorsChannel   chan<- peerError
}
```

The Peer data structure stores, for each peer, its current `height` and the number of pending requests sent to it (`numPending`), etc.

```go
type Peer struct {
	id         p2p.ID
	height     int64
	numPending int32
	timeout    *time.Timer
	didTimeout bool
}
```

BlockRequest is an internal data structure used to denote the current mapping of a request for a block at some `height` to a peer (`PeerID`).

```go
type BlockRequest struct {
	Height int64
	PeerID p2p.ID
}
```

### Receive routine of Blockchain Reactor

This routine is executed upon message reception on the BlockchainChannel inside the p2p receive routine. A separate p2p receive routine (and therefore a separate receive routine of the Blockchain Reactor) is executed for each peer. Note that "try to send" does not block: it returns immediately if the outgoing buffer is full.

```go
handleMsg(pool, m):
  upon receiving bcBlockRequestMessage m from peer p:
    block = load block for height m.Height from pool.store
    if block != nil then
      try to send bcBlockResponseMessage(block) to p
    else
      try to send bcNoBlockResponseMessage(m.Height) to p

  upon receiving bcBlockResponseMessage m from peer p:
    pool.mtx.Lock()
    requester = pool.requesters[m.Height]
    if requester == nil then
      error("peer sent us a block we didn't expect")
      pool.mtx.Unlock() // release the lock before leaving the handler
      continue

    if requester.block == nil and requester.peerID == p then
      requester.block = m.Block
      pool.numPending -= 1 // atomic decrement
      peer = pool.peers[p]
      if peer != nil then
        peer.numPending--
        if peer.numPending == 0 then
          peer.timeout.Stop()
          // NOTE: we don't send Quit signal to the corresponding requester task!
        else
          trigger peer timeout to expire after peerTimeout
    pool.mtx.Unlock()

  upon receiving bcStatusRequestMessage m from peer p:
    try to send bcStatusResponseMessage(pool.store.Height) to p

  upon receiving bcStatusResponseMessage m from peer p:
    pool.mtx.Lock()
    peer = pool.peers[p]
    if peer != nil then
      peer.height = m.Height
    else
      peer = create new Peer data structure with id = p and height = m.Height
      pool.peers[p] = peer

    if m.Height > pool.maxPeerHeight then
      pool.maxPeerHeight = m.Height
    pool.mtx.Unlock()

onTimeout(p):
  send error message to pool error channel
  peer = pool.peers[p]
  peer.didTimeout = true
```
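
The bookkeeping done when a block response arrives can be sketched in Go as follows. This is a simplified illustration, not the reactor's code: the types are cut down to the fields the handler touches, peer IDs are plain strings, the timeout timers are omitted, and `addBlock` and `errUnexpectedBlock` are names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Simplified stand-ins for the data structures described above.
type Block struct{ Height int64 }

type requester struct {
	peerID string
	block  *Block
}

type peer struct{ numPending int32 }

type pool struct {
	mtx        sync.Mutex
	requesters map[int64]*requester
	peers      map[string]*peer
	numPending int32
}

var errUnexpectedBlock = errors.New("peer sent us a block we didn't expect")

// addBlock records a block received from peerID. It returns an error when no
// requester exists for the block's height, and ignores a block that is a
// repeat or comes from a peer other than the one the request was assigned to.
func (p *pool) addBlock(peerID string, b *Block) error {
	p.mtx.Lock()
	defer p.mtx.Unlock() // the lock is released on every return path

	r := p.requesters[b.Height]
	if r == nil {
		return errUnexpectedBlock
	}
	if r.block != nil || r.peerID != peerID {
		return nil
	}
	r.block = b
	p.numPending--
	if pr := p.peers[peerID]; pr != nil {
		pr.numPending--
	}
	return nil
}

func main() {
	p := &pool{
		requesters: map[int64]*requester{5: {peerID: "peerA"}},
		peers:      map[string]*peer{"peerA": {numPending: 1}},
		numPending: 1,
	}
	fmt.Println(p.addBlock("peerA", &Block{Height: 5})) // requested block
	fmt.Println(p.addBlock("peerA", &Block{Height: 9})) // nothing requested at height 9
}
```

Using `defer` for the unlock is a design choice of the sketch: it guarantees that the early return for an unexpected block cannot leave the pool locked.
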
### Requester tasks

A Requester task is responsible for fetching a single block at position `height`.

```go
fetchBlock(height, pool):
  while true do {
    peerID = nil
    block = nil
    peer = pickAvailablePeer(height)
    peerID = peer.id

    enqueue BlockRequest(height, peerID) to pool.requestsChannel
    redo = false
    while !redo do
      select {
        upon receiving Quit message do
          return
        upon receiving redo message with id on redoChannel do
          if peerID == id {
            mtx.Lock()
            pool.numPending++
            redo = true
            mtx.Unlock()
          }
      }
  }

pickAvailablePeer(height):
  selectedPeer = nil
  while selectedPeer == nil do
    pool.mtx.Lock()
    for each peer in pool.peers do
      if !peer.didTimeout and peer.numPending < maxPendingRequestsPerPeer and peer.height >= height then
        peer.numPending++
        selectedPeer = peer
        break
    pool.mtx.Unlock()

    if selectedPeer == nil then
      sleep requestIntervalMS

  return selectedPeer
```

When no peer is available, the Requester sleeps for `requestIntervalMS` before trying again.
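
A single pass of the peer selection can be sketched in Go as below. The sketch makes several assumptions for the sake of a deterministic, self-contained example: peers are held in a slice rather than a map, locking and the retry sleep are left to the caller, the value of `maxPendingRequestsPerPeer` is illustrative, and `pickPeer` is a hypothetical name.

```go
package main

import "fmt"

// Illustrative limit; the value used by the reactor is not given in the text.
const maxPendingRequestsPerPeer = 2

type peer struct {
	id         string
	height     int64
	numPending int32
	didTimeout bool
}

// pickPeer scans the peers once and returns the first one that has not timed
// out, has spare request capacity, and is at or above the wanted height. It
// reserves a request slot on that peer. It returns nil when no peer
// qualifies, in which case the caller would sleep and retry.
func pickPeer(peers []*peer, height int64) *peer {
	for _, p := range peers {
		if p.didTimeout || p.numPending >= maxPendingRequestsPerPeer || p.height < height {
			continue
		}
		p.numPending++
		return p
	}
	return nil
}

func main() {
	peers := []*peer{
		{id: "a", height: 100, didTimeout: true},
		{id: "b", height: 100, numPending: maxPendingRequestsPerPeer},
		{id: "c", height: 10},
		{id: "d", height: 100},
	}
	if p := pickPeer(peers, 50); p != nil {
		fmt.Println("picked", p.id, "pending", p.numPending)
	}
	fmt.Println("no peer for height 500:", pickPeer(peers, 500) == nil)
}
```
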

### Task for creating Requesters

This task is responsible for continuously creating and starting Requester tasks.

```go
createRequesters(pool):
  while true do
    if !pool.isRunning then break
    if pool.numPending < maxPendingRequests and size(pool.requesters) < maxTotalRequesters then
      pool.mtx.Lock()
      nextHeight = pool.height + size(pool.requesters)
      requester = create new requester for height nextHeight
      pool.requesters[nextHeight] = requester
      pool.numPending += 1 // atomic increment
      start requester task
      pool.mtx.Unlock()
    else
      sleep requestIntervalMS
      pool.mtx.Lock()
      for each peer in pool.peers do
        if !peer.didTimeout && peer.numPending > 0 && peer.curRate < minRecvRate then
          send error on pool error channel
          peer.didTimeout = true
        if peer.didTimeout then
          for each requester in pool.requesters do
            if requester.getPeerID() == peer.id then
              enqueue msg on requester's redoChannel
          delete(pool.peers, peer.id)
      pool.mtx.Unlock()
```
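
The slow-peer cleanup in the `else` branch can be sketched in Go as below. This is an illustration under stated assumptions: locking and the error channel are omitted, requester assignments are reduced to a `height -> peer ID` map, the unit of `curRate` is assumed, and `dropSlowPeers` is a hypothetical name. Instead of enqueueing redo messages it returns the heights that would need to be requested again.

```go
package main

import (
	"fmt"
	"sort"
)

type peer struct {
	id         string
	numPending int32
	curRate    int64 // observed receive rate; bytes per second is assumed here
	didTimeout bool
}

type pool struct {
	peers    map[string]*peer
	assigned map[int64]string // block height -> ID of the peer asked for it
}

// dropSlowPeers marks peers that have pending requests but receive below
// minRecvRate as timed out, removes every timed-out peer from the pool, and
// returns the sorted heights whose requests must be redone.
func (p *pool) dropSlowPeers(minRecvRate int64) []int64 {
	var redo []int64
	for id, pr := range p.peers {
		if !pr.didTimeout && pr.numPending > 0 && pr.curRate < minRecvRate {
			pr.didTimeout = true
		}
		if !pr.didTimeout {
			continue
		}
		for h, owner := range p.assigned {
			if owner == id {
				redo = append(redo, h)
			}
		}
		delete(p.peers, id) // deleting the current key during range is safe in Go
	}
	sort.Slice(redo, func(i, j int) bool { return redo[i] < redo[j] })
	return redo
}

func main() {
	p := &pool{
		peers: map[string]*peer{
			"slow": {id: "slow", numPending: 1, curRate: 10},
			"fast": {id: "fast", numPending: 1, curRate: 1000},
			"idle": {id: "idle"},
		},
		assigned: map[int64]string{3: "slow", 4: "fast", 5: "slow"},
	}
	fmt.Println(p.dropSlowPeers(100), len(p.peers)) // prints "[3 5] 2"
}
```

Note that the idle peer survives even though its rate is zero: a peer is only judged slow while it has requests pending.
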

### Main blockchain reactor controller task

```go
main(pool):
  create trySyncTicker with interval trySyncIntervalMS
  create statusUpdateTicker with interval statusUpdateIntervalSeconds
  create switchToConsensusTicker with interval switchToConsensusIntervalSeconds

  while true do
    select {
      upon receiving BlockRequest(Height, Peer) on pool.requestsChannel:
        try to send bcBlockRequestMessage(Height) to Peer

      upon receiving error(peer) on errorsChannel:
        stop peer for error

      upon receiving message on statusUpdateTickerChannel:
        broadcast bcStatusRequestMessage(bcR.store.Height) // message sent in a separate routine

      upon receiving message on switchToConsensusTickerChannel:
        pool.mtx.Lock()
        receivedBlockOrTimedOut = pool.height > 0 || (time.Now() - pool.startTime) > 5 Seconds
        ourChainIsLongestAmongPeers = pool.maxPeerHeight == 0 || pool.height >= pool.maxPeerHeight
        haveSomePeers = size of pool.peers > 0
        pool.mtx.Unlock()
        if haveSomePeers && receivedBlockOrTimedOut && ourChainIsLongestAmongPeers then
          switch to consensus mode

      upon receiving message on trySyncTickerChannel:
        for i = 0; i < 10; i++ do
          pool.mtx.Lock()
          firstBlock = pool.requesters[pool.height].block
          secondBlock = pool.requesters[pool.height + 1].block
          if firstBlock == nil or secondBlock == nil then
            pool.mtx.Unlock() // release the lock before skipping
            continue
          pool.mtx.Unlock()
          verify firstBlock using LastCommit from secondBlock
          if verification failed then
            pool.mtx.Lock()
            peerID = pool.requesters[pool.height].peerID
            redoRequestsForPeer(pool, peerID)
            delete(pool.peers, peerID)
            stop peer peerID for error
            pool.mtx.Unlock()
          else
            delete(pool.requesters, pool.height)
            save firstBlock to store
            pool.height++
            execute firstBlock
    }

redoRequestsForPeer(pool, peerID):
  for each requester in pool.requesters do
    if requester.getPeerID() == peerID then
      enqueue msg on redoChannel for requester
```
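
The switch-to-consensus check in the controller can be isolated as a pure function, which makes its three conditions easy to test. The following is a minimal sketch: the function name and parameter list are assumptions made for illustration, and only the three boolean conditions and the 5-second threshold come from the pseudocode above.

```go
package main

import (
	"fmt"
	"time"
)

// shouldSwitchToConsensus mirrors the three conditions evaluated on each
// switchToConsensusTicker tick. sinceStart is the time elapsed since the
// pool was started.
func shouldSwitchToConsensus(height, maxPeerHeight int64, numPeers int, sinceStart time.Duration) bool {
	receivedBlockOrTimedOut := height > 0 || sinceStart > 5*time.Second
	ourChainIsLongestAmongPeers := maxPeerHeight == 0 || height >= maxPeerHeight
	haveSomePeers := numPeers > 0
	return haveSomePeers && receivedBlockOrTimedOut && ourChainIsLongestAmongPeers
}

func main() {
	// Caught up with the tallest known peer.
	fmt.Println(shouldSwitchToConsensus(10, 10, 3, time.Second))
	// Still behind the tallest known peer.
	fmt.Println(shouldSwitchToConsensus(5, 10, 3, time.Second))
	// No block received and no peer height known, but the 5 seconds have passed.
	fmt.Println(shouldSwitchToConsensus(0, 0, 1, 6*time.Second))
}
```

The last case shows why `maxPeerHeight == 0` is part of the second condition: a node whose peers have not reported any height can still leave fast sync once the timeout has passed.
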

## Channels

The reactor defines `maxMsgSize` as the maximum size of incoming messages, and
`SendQueueCapacity` and `RecvBufferCapacity` as the maximum sizes of the sending
and receiving buffers, respectively. These limits are meant to prevent
amplification attacks by capping how much data we can receive from and send to
a peer.

Sending incorrectly encoded data will result in stopping the peer.
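
To illustrate the role of `maxMsgSize`, the sketch below rejects an oversized message before any decoding is attempted. It is hypothetical throughout: the text names the limit but not its value, so the constant is a placeholder, and `checkMsgSize` and `errMsgTooLarge` are names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// maxMsgSize is a placeholder value chosen for this sketch; the source names
// the limit but does not state its value.
const maxMsgSize = 1024

var errMsgTooLarge = errors.New("message exceeds maxMsgSize")

// checkMsgSize is a hypothetical guard that rejects incoming messages larger
// than maxMsgSize.
func checkMsgSize(msgBytes []byte) error {
	if len(msgBytes) > maxMsgSize {
		return errMsgTooLarge
	}
	return nil
}

func main() {
	fmt.Println(checkMsgSize(make([]byte, 16)))           // within the limit
	fmt.Println(checkMsgSize(make([]byte, maxMsgSize+1))) // one byte over the limit
}
```
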