# Blockchain Reactor

The Blockchain Reactor's high-level responsibility is to enable peers that are
far behind the current state of the consensus to quickly catch up by downloading
many blocks in parallel, verifying their commits, and executing them against the
ABCI application.

Tendermint full nodes run the Blockchain Reactor as a service to provide blocks
to new nodes. New nodes run the Blockchain Reactor in "fast_sync" mode,
where they actively make requests for more blocks until they sync up.
Once caught up, "fast_sync" mode is disabled and the node switches to
using (and turns on) the Consensus Reactor.

## Message Types

```go
const (
    msgTypeBlockRequest    = byte(0x10)
    msgTypeBlockResponse   = byte(0x11)
    msgTypeNoBlockResponse = byte(0x12)
    msgTypeStatusResponse  = byte(0x20)
    msgTypeStatusRequest   = byte(0x21)
)
```

```go
type bcBlockRequestMessage struct {
    Height int64
}

type bcNoBlockResponseMessage struct {
    Height int64
}

type bcBlockResponseMessage struct {
    Block Block
}

type bcStatusRequestMessage struct {
    Height int64
}

type bcStatusResponseMessage struct {
    Height int64
}
```
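
As a rough illustration of how a receiver might tell these messages apart once they are decoded, the sketch below switches on the Go type of the message. `Block` is reduced to a stand-in with only a `Height` field, and `describe` is a hypothetical helper introduced for this example, not part of the reactor.

```go
package main

import "fmt"

// Block is a stand-in for the real block type; only Height is modelled here.
type Block struct {
	Height int64
}

type bcBlockRequestMessage struct{ Height int64 }
type bcNoBlockResponseMessage struct{ Height int64 }
type bcBlockResponseMessage struct{ Block Block }
type bcStatusRequestMessage struct{ Height int64 }
type bcStatusResponseMessage struct{ Height int64 }

// describe is a hypothetical helper that reports what a decoded message means.
func describe(msg interface{}) string {
	switch m := msg.(type) {
	case bcBlockRequestMessage:
		return fmt.Sprintf("request for block %d", m.Height)
	case bcNoBlockResponseMessage:
		return fmt.Sprintf("peer has no block %d", m.Height)
	case bcBlockResponseMessage:
		return fmt.Sprintf("block %d", m.Block.Height)
	case bcStatusRequestMessage:
		return fmt.Sprintf("status request from a peer at height %d", m.Height)
	case bcStatusResponseMessage:
		return fmt.Sprintf("peer is at height %d", m.Height)
	default:
		return "unknown message"
	}
}

func main() {
	fmt.Println(describe(bcBlockRequestMessage{Height: 7}))
	fmt.Println(describe(bcBlockResponseMessage{Block: Block{Height: 7}}))
}
```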

## Architecture and algorithm

The Blockchain reactor is organised as a set of concurrent tasks:

- Receive routine of Blockchain Reactor
- Task for creating Requesters
- Set of Requester tasks
- Controller task

![Blockchain Reactor Architecture Diagram](img/bc-reactor.png)

### Data structures

These are the core data structures necessary to provide the Blockchain Reactor logic.

The Requester data structure is used to track the assignment of a request for `block` at position `height` to a peer with id equal to `peerID`.

```go
type Requester struct {
    mtx         Mutex
    block       Block
    height      int64
    peerID      p2p.ID
    redoChannel chan struct{}
}
```
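
One way the `redoChannel` could be signalled without blocking the sender is sketched below. The channel capacity of 1, the non-blocking send, the `newRequester` and `redo` helpers, and the use of `string` in place of `p2p.ID` are all assumptions made for this example.

```go
package main

import "fmt"

// Requester keeps only the fields needed for this sketch.
type Requester struct {
	height      int64
	peerID      string // stand-in for p2p.ID
	redoChannel chan struct{}
}

// newRequester is a hypothetical constructor; the buffer of 1 is an assumption.
func newRequester(height int64) *Requester {
	return &Requester{height: height, redoChannel: make(chan struct{}, 1)}
}

// redo asks the requester to pick a new peer. The send is non-blocking: if a
// redo signal is already queued, the new one is dropped. It reports whether
// the signal was queued.
func (r *Requester) redo() bool {
	select {
	case r.redoChannel <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	r := newRequester(10)
	fmt.Println(r.redo(), r.redo()) // true false
}
```

Dropping a second signal is harmless in this sketch, because one pending redo is enough to make the requester choose a new peer.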

Pool is the core data structure. It stores the last executed block (`height`), the assignment of requests to peers (`requesters`), the current height and the number of pending requests for each peer (`peers`), the maximum peer height, etc.

```go
type Pool struct {
    mtx             Mutex
    requesters      map[int64]*Requester
    height          int64
    peers           map[p2p.ID]*Peer
    maxPeerHeight   int64
    numPending      int32
    store           BlockStore
    requestsChannel chan<- BlockRequest
    errorsChannel   chan<- peerError
}
```

The Peer data structure stores, for each peer, its current `height`, the number of pending requests sent to it (`numPending`), etc.

```go
type Peer struct {
    id         p2p.ID
    height     int64
    numPending int32
    timeout    *time.Timer
    didTimeout bool
}
```

BlockRequest is an internal data structure that denotes the current mapping of a request for a block at some `height` to a peer (`PeerID`).

```go
type BlockRequest struct {
    Height int64
    PeerID p2p.ID
}
```

### Receive routine of Blockchain Reactor

This routine is executed upon message reception on the BlockchainChannel inside the p2p receive routine. There is a separate p2p receive routine (and therefore a separate receive routine of the Blockchain Reactor) for each peer. Note that "try to send" does not block: it returns immediately if the outgoing buffer is full.

```go
handleMsg(pool, m):
    upon receiving bcBlockRequestMessage m from peer p:
        block = load block for height m.Height from pool.store
        if block != nil then
            try to send bcBlockResponseMessage(block) to p
        else
            try to send bcNoBlockResponseMessage(m.Height) to p

    upon receiving bcBlockResponseMessage m from peer p:
        pool.mtx.Lock()
        requester = pool.requesters[m.Block.Height]
        if requester == nil then
            error("peer sent us a block we didn't expect")
            pool.mtx.Unlock() // release the lock before skipping this message
            continue

        if requester.block == nil and requester.peerID == p then
            requester.block = m.Block
            pool.numPending -= 1 // atomic decrement
            peer = pool.peers[p]
            if peer != nil then
                peer.numPending--
                if peer.numPending == 0 then
                    peer.timeout.Stop()
                    // NOTE: we don't send Quit signal to the corresponding requester task!
                else
                    trigger peer timeout to expire after peerTimeout
        pool.mtx.Unlock()

    upon receiving bcStatusRequestMessage m from peer p:
        try to send bcStatusResponseMessage(pool.store.Height) to p

    upon receiving bcStatusResponseMessage m from peer p:
        pool.mtx.Lock()
        peer = pool.peers[p]
        if peer != nil then
            peer.height = m.Height
        else
            peer = create new Peer data structure with id = p and height = m.Height
            pool.peers[p] = peer

        if m.Height > pool.maxPeerHeight then
            pool.maxPeerHeight = m.Height
        pool.mtx.Unlock()

onTimeout(p):
    send error message to pool error channel
    peer = pool.peers[p]
    peer.didTimeout = true
```
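
Two of the handlers above can be sketched in Go as follows. This is a minimal illustration, not the reactor's code: the block store is modelled as a plain map, `p2p.ID` is replaced by `string`, `Block` is a stand-in type, and the method names `replyToBlockRequest` and `handleStatusResponse` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Block is a stand-in for the real block type.
type Block struct{ Height int64 }

type bcBlockRequestMessage struct{ Height int64 }
type bcBlockResponseMessage struct{ Block Block }
type bcNoBlockResponseMessage struct{ Height int64 }
type bcStatusResponseMessage struct{ Height int64 }

type Peer struct {
	id     string // stand-in for p2p.ID
	height int64
}

// Pool keeps only the fields this sketch touches; store is a map standing in
// for the BlockStore.
type Pool struct {
	mtx           sync.Mutex
	store         map[int64]Block
	peers         map[string]*Peer
	maxPeerHeight int64
}

// replyToBlockRequest picks the reply for a block request: the block if we
// have it, otherwise a no-block response for the same height.
func (pool *Pool) replyToBlockRequest(m bcBlockRequestMessage) interface{} {
	if block, ok := pool.store[m.Height]; ok {
		return bcBlockResponseMessage{Block: block}
	}
	return bcNoBlockResponseMessage{Height: m.Height}
}

// handleStatusResponse records the height reported by peer p and raises
// maxPeerHeight if the peer is ahead of every peer seen so far.
func (pool *Pool) handleStatusResponse(p string, m bcStatusResponseMessage) {
	pool.mtx.Lock()
	defer pool.mtx.Unlock()
	if peer, ok := pool.peers[p]; ok {
		peer.height = m.Height
	} else {
		pool.peers[p] = &Peer{id: p, height: m.Height}
	}
	if m.Height > pool.maxPeerHeight {
		pool.maxPeerHeight = m.Height
	}
}

func main() {
	pool := &Pool{
		store: map[int64]Block{1: {Height: 1}},
		peers: map[string]*Peer{},
	}
	fmt.Printf("%T\n", pool.replyToBlockRequest(bcBlockRequestMessage{Height: 1}))
	fmt.Printf("%T\n", pool.replyToBlockRequest(bcBlockRequestMessage{Height: 9}))
	pool.handleStatusResponse("peer-1", bcStatusResponseMessage{Height: 42})
	fmt.Println(pool.maxPeerHeight)
}
```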

### Requester tasks

A Requester task is responsible for fetching a single block at position `height`.

```go
fetchBlock(height, pool):
    while true do
        peerID = nil
        block = nil
        peer = pickAvailablePeer(height)
        peerID = peer.id

        enqueue BlockRequest(height, peerID) to pool.requestsChannel
        redo = false
        while !redo do
            select {
                upon receiving Quit message do
                    return
                upon receiving message on redoChannel do
                    mtx.Lock()
                    pool.numPending++
                    redo = true
                    mtx.Unlock()
            }

pickAvailablePeer(height):
    selectedPeer = nil
    while selectedPeer == nil do
        pool.mtx.Lock()
        for each peer in pool.peers do
            if !peer.didTimeout and peer.numPending < maxPendingRequestsPerPeer and peer.height >= height then
                peer.numPending++
                selectedPeer = peer
                break
        pool.mtx.Unlock()

        if selectedPeer == nil then
            sleep requestIntervalMS

    return selectedPeer
```
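
The peer selection rule can be tried out in isolation. The sketch below performs a single pass over the peers (the retry loop and locking are left out), and the value of `maxPendingRequestsPerPeer`, the `pickPeer` name, and the `string` peer id are assumptions made for this example.

```go
package main

import "fmt"

// maxPendingRequestsPerPeer is a hypothetical value chosen for the example.
const maxPendingRequestsPerPeer = 2

type Peer struct {
	id         string // stand-in for p2p.ID
	height     int64
	numPending int32
	didTimeout bool
}

// pickPeer makes one pass over peers and reserves a request slot on the first
// peer that has not timed out, has spare capacity, and is at or above height.
// It returns nil when no peer qualifies; the caller would then sleep and retry.
func pickPeer(peers map[string]*Peer, height int64) *Peer {
	for _, peer := range peers {
		if !peer.didTimeout && peer.numPending < maxPendingRequestsPerPeer && peer.height >= height {
			peer.numPending++
			return peer
		}
	}
	return nil
}

func main() {
	peers := map[string]*Peer{
		"slow":  {id: "slow", height: 100, didTimeout: true},
		"short": {id: "short", height: 5},
		"good":  {id: "good", height: 100},
	}
	if p := pickPeer(peers, 50); p != nil {
		fmt.Println(p.id, p.numPending) // good 1
	}
}
```

Only `good` qualifies here: `slow` has timed out and `short` is below the requested height.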

### Task for creating Requesters

This task is responsible for continuously creating and starting Requester tasks.

```go
createRequesters(pool):
    while true do
        if !pool.isRunning then break
        if pool.numPending < maxPendingRequests and size(pool.requesters) < maxTotalRequesters then
            pool.mtx.Lock()
            nextHeight = pool.height + size(pool.requesters)
            requester = create new requester for height nextHeight
            pool.requesters[nextHeight] = requester
            pool.numPending += 1 // atomic increment
            start requester task
            pool.mtx.Unlock()
        else
            sleep requestIntervalMS
            pool.mtx.Lock()
            for each peer in pool.peers do
                if !peer.didTimeout && peer.numPending > 0 && peer.curRate < minRecvRate then
                    send error on pool error channel
                    peer.didTimeout = true
                if peer.didTimeout then
                    for each requester in pool.requesters do
                        if requester.getPeerID() == peer.id then
                            enqueue msg on requester's redoChannel
                    delete(pool.peers, peer.id)
            pool.mtx.Unlock()
```
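
The throttling decision and the choice of the next height can be sketched as below. The sketch assumes that a new requester is created only while both limits are respected; the limit values, the `canCreateRequester` and `makeNextRequester` names, and the empty-struct placeholder for requesters are all hypothetical.

```go
package main

import "fmt"

// Hypothetical limits, kept small so the example output is short.
const (
	maxPendingRequests = 3
	maxTotalRequesters = 5
)

// Pool keeps only the fields used by this sketch; requester contents are elided.
type Pool struct {
	height     int64
	requesters map[int64]struct{}
	numPending int32
}

// canCreateRequester reports whether both limits still leave room.
func (pool *Pool) canCreateRequester() bool {
	return pool.numPending < maxPendingRequests && len(pool.requesters) < maxTotalRequesters
}

// makeNextRequester registers a requester for the next unassigned height and
// returns that height, or false if the pool is at one of its limits.
func (pool *Pool) makeNextRequester() (int64, bool) {
	if !pool.canCreateRequester() {
		return 0, false
	}
	nextHeight := pool.height + int64(len(pool.requesters))
	pool.requesters[nextHeight] = struct{}{}
	pool.numPending++
	return nextHeight, true
}

func main() {
	pool := &Pool{height: 10, requesters: map[int64]struct{}{}}
	for {
		h, ok := pool.makeNextRequester()
		if !ok {
			break
		}
		fmt.Println("requester for height", h)
	}
}
```

With these limits the loop creates requesters for heights 10, 11 and 12, then stops because three requests are pending.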

### Main blockchain reactor controller task

```go
main(pool):
    create trySyncTicker with interval trySyncIntervalMS
    create statusUpdateTicker with interval statusUpdateIntervalSeconds
    create switchToConsensusTicker with interval switchToConsensusIntervalSeconds

    while true do
        select {
            upon receiving BlockRequest(Height, Peer) on pool.requestsChannel:
                try to send bcBlockRequestMessage(Height) to Peer

            upon receiving error(peer) on errorsChannel:
                stop peer for error

            upon receiving message on statusUpdateTickerChannel:
                broadcast bcStatusRequestMessage(bcR.store.Height) // message sent in a separate routine

            upon receiving message on switchToConsensusTickerChannel:
                pool.mtx.Lock()
                receivedBlockOrTimedOut = pool.height > 0 || (time.Now() - pool.startTime) > 5 Seconds
                ourChainIsLongestAmongPeers = pool.maxPeerHeight == 0 || pool.height >= pool.maxPeerHeight
                haveSomePeers = size of pool.peers > 0
                pool.mtx.Unlock()
                if haveSomePeers && receivedBlockOrTimedOut && ourChainIsLongestAmongPeers then
                    switch to consensus mode

            upon receiving message on trySyncTickerChannel:
                for i = 0; i < 10; i++ do
                    pool.mtx.Lock()
                    firstBlock = pool.requesters[pool.height].block
                    secondBlock = pool.requesters[pool.height + 1].block
                    if firstBlock == nil or secondBlock == nil then
                        pool.mtx.Unlock() // release the lock before skipping this iteration
                        continue
                    pool.mtx.Unlock()
                    verify firstBlock using LastCommit from secondBlock
                    if verification failed
                        pool.mtx.Lock()
                        peerID = pool.requesters[pool.height].peerID
                        redoRequestsForPeer(pool, peerID)
                        delete(pool.peers, peerID)
                        stop peer peerID for error
                        pool.mtx.Unlock()
                    else
                        delete(pool.requesters, pool.height)
                        save firstBlock to store
                        pool.height++
                        execute firstBlock
        }

redoRequestsForPeer(pool, peerID):
    for each requester in pool.requesters do
        if requester.getPeerID() == peerID
            enqueue msg on redoChannel for requester
```
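
The condition checked on each `switchToConsensusTicker` tick can be written as a pure function, which makes its cases easy to test. In this sketch the function name `isCaughtUp` and its parameter list are assumptions; the three intermediate names and the 5 second threshold follow the pseudocode above.

```go
package main

import (
	"fmt"
	"time"
)

// isCaughtUp mirrors the switch-to-consensus check: we need at least one peer,
// either some progress or 5 seconds of waiting, and a height that is not
// behind the tallest peer we know of.
func isCaughtUp(height, maxPeerHeight int64, numPeers int, sinceStart time.Duration) bool {
	receivedBlockOrTimedOut := height > 0 || sinceStart > 5*time.Second
	ourChainIsLongestAmongPeers := maxPeerHeight == 0 || height >= maxPeerHeight
	haveSomePeers := numPeers > 0
	return haveSomePeers && receivedBlockOrTimedOut && ourChainIsLongestAmongPeers
}

func main() {
	fmt.Println(isCaughtUp(5, 10, 3, time.Second))   // still behind the tallest peer
	fmt.Println(isCaughtUp(10, 10, 3, time.Second))  // caught up
	fmt.Println(isCaughtUp(0, 0, 3, 6*time.Second))  // no blocks anywhere, but we waited long enough
	fmt.Println(isCaughtUp(10, 10, 0, 6*time.Second)) // no peers
}
```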

## Channels

The reactor defines `maxMsgSize` for the maximum size of incoming messages, and
`SendQueueCapacity` and `RecvBufferCapacity` for the maximum sending and
receiving buffers respectively. These limits are meant to prevent amplification
attacks by setting an upper bound on how much data we can receive from and send
to a peer.

Sending incorrectly encoded data will result in stopping the peer.
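
A size check of this kind might look like the sketch below. The value of `maxMsgSize`, the `checkMsgSize` helper, and the `errMsgTooLarge` error are hypothetical and only illustrate rejecting an oversized message before it is decoded.

```go
package main

import (
	"errors"
	"fmt"
)

// maxMsgSize is a hypothetical limit chosen for this example.
const maxMsgSize = 1024

var errMsgTooLarge = errors.New("message exceeds maxMsgSize")

// checkMsgSize rejects raw messages larger than maxMsgSize so that oversized
// input is dropped before any decoding work is done.
func checkMsgSize(msgBytes []byte) error {
	if len(msgBytes) > maxMsgSize {
		return errMsgTooLarge
	}
	return nil
}

func main() {
	fmt.Println(checkMsgSize(make([]byte, 100)))
	fmt.Println(checkMsgSize(make([]byte, 4096)))
}
```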