package store

import (
	"fmt"
	"sync"

	"github.com/pkg/errors"
	dbm "github.com/tendermint/tm-db"

	"github.com/tendermint/tendermint/types"
)

/*
BlockStore is a simple low-level store for blocks.

There are three types of information stored:
 - BlockMeta:   Meta information about each block
 - Block part:  Parts of each block, aggregated w/ PartSet
 - Commit:      The commit part of each block, for gossiping precommit votes

Currently the precommit signatures are duplicated in the Block parts as
well as the Commit. In the future this may change, perhaps by moving
the Commit data outside the Block. (TODO)

NOTE: BlockStore methods will panic if they encounter errors
deserializing loaded data, indicating probable corruption on disk.
*/
type BlockStore struct {
	db dbm.DB

	mtx    sync.RWMutex // guards height
	height int64
}

// NewBlockStore returns a new BlockStore with the given DB,
// initialized to the last height that was committed to the DB.
func NewBlockStore(db dbm.DB) *BlockStore {
	bsjson := LoadBlockStoreStateJSON(db)
	return &BlockStore{
		height: bsjson.Height,
		db:     db,
	}
}

// Height returns the last known contiguous block height.
func (bs *BlockStore) Height() int64 {
	bs.mtx.RLock()
	defer bs.mtx.RUnlock()
	return bs.height
}

// LoadBlock returns the block with the given height.
// If no block is found for that height, it returns nil.
func (bs *BlockStore) LoadBlock(height int64) *types.Block {
	var blockMeta = bs.LoadBlockMeta(height)
	if blockMeta == nil {
		return nil
	}

	var block = new(types.Block)
	buf := []byte{}
	for i := 0; i < blockMeta.BlockID.PartsHeader.Total; i++ {
		part := bs.LoadBlockPart(height, i)
		if part == nil {
			// The meta exists but a part is missing. Report this as
			// corruption instead of dereferencing a nil part below.
			panic(fmt.Sprintf("Error reading block: missing part %v at height %v", i, height))
		}
		buf = append(buf, part.Bytes...)
	}
	err := cdc.UnmarshalBinaryLengthPrefixed(buf, block)
	if err != nil {
		// NOTE: The existence of meta should imply the existence of the
		// block. So, make sure meta is only saved after blocks are saved.
		panic(errors.Wrap(err, "Error reading block"))
	}
	return block
}

// LoadBlockPart returns the Part at the given index
// from the block at the given height.
// If no part is found for the given height and index, it returns nil.
func (bs *BlockStore) LoadBlockPart(height int64, index int) *types.Part {
	var part = new(types.Part)
	bz := bs.db.Get(calcBlockPartKey(height, index))
	if len(bz) == 0 {
		return nil
	}
	err := cdc.UnmarshalBinaryBare(bz, part)
	if err != nil {
		panic(errors.Wrap(err, "Error reading block part"))
	}
	return part
}

// LoadBlockMeta returns the BlockMeta for the given height.
// If no block is found for the given height, it returns nil.
func (bs *BlockStore) LoadBlockMeta(height int64) *types.BlockMeta {
	var blockMeta = new(types.BlockMeta)
	bz := bs.db.Get(calcBlockMetaKey(height))
	if len(bz) == 0 {
		return nil
	}
	err := cdc.UnmarshalBinaryBare(bz, blockMeta)
	if err != nil {
		panic(errors.Wrap(err, "Error reading block meta"))
	}
	return blockMeta
}

// LoadBlockCommit returns the Commit for the given height.
// This commit consists of the +2/3 and other Precommit-votes for block at `height`,
// and it comes from the block.LastCommit for `height+1`.
// If no commit is found for the given height, it returns nil.
func (bs *BlockStore) LoadBlockCommit(height int64) *types.Commit {
	var commit = new(types.Commit)
	bz := bs.db.Get(calcBlockCommitKey(height))
	if len(bz) == 0 {
		return nil
	}
	err := cdc.UnmarshalBinaryBare(bz, commit)
	if err != nil {
		panic(errors.Wrap(err, "Error reading block commit"))
	}
	return commit
}

// LoadSeenCommit returns the locally seen Commit for the given height.
// This is useful when we've seen a commit, but there has not yet been
// a new block at `height + 1` that includes this commit in its block.LastCommit.
// If no seen commit is found for the given height, it returns nil.
func (bs *BlockStore) LoadSeenCommit(height int64) *types.Commit {
	var commit = new(types.Commit)
	bz := bs.db.Get(calcSeenCommitKey(height))
	if len(bz) == 0 {
		return nil
	}
	err := cdc.UnmarshalBinaryBare(bz, commit)
	if err != nil {
		panic(errors.Wrap(err, "Error reading block seen commit"))
	}
	return commit
}

// SaveBlock persists the given block, blockParts, and seenCommit to the underlying db.
// blockParts: Must be parts of the block
// seenCommit: The +2/3 precommits that were seen which committed at height.
//             If all the nodes restart after committing a block,
//             we need this to reload the precommits to catch-up nodes to the
//             most recent height. Otherwise they'd stall at H-1.
func (bs *BlockStore) SaveBlock(block *types.Block, blockParts *types.PartSet, seenCommit *types.Commit) {
	if block == nil {
		panic("BlockStore can only save a non-nil block")
	}
	height := block.Height
	if g, w := height, bs.Height()+1; g != w {
		panic(fmt.Sprintf("BlockStore can only save contiguous blocks. Wanted %v, got %v", w, g))
	}
	if !blockParts.IsComplete() {
		panic("BlockStore can only save complete block part sets")
	}

	// Save block meta
	blockMeta := types.NewBlockMeta(block, blockParts)
	metaBytes := cdc.MustMarshalBinaryBare(blockMeta)
	bs.db.Set(calcBlockMetaKey(height), metaBytes)

	// Save block parts
	for i := 0; i < blockParts.Total(); i++ {
		part := blockParts.GetPart(i)
		bs.saveBlockPart(height, i, part)
	}

	// Save block commit (duplicate and separate from the Block).
	// block.LastCommit holds the commit for the previous height, hence height-1.
	blockCommitBytes := cdc.MustMarshalBinaryBare(block.LastCommit)
	bs.db.Set(calcBlockCommitKey(height-1), blockCommitBytes)

	// Save seen commit (seen +2/3 precommits for block)
	// NOTE: we can delete this at a later height
	seenCommitBytes := cdc.MustMarshalBinaryBare(seenCommit)
	bs.db.Set(calcSeenCommitKey(height), seenCommitBytes)

	// Save new BlockStoreStateJSON descriptor
	BlockStoreStateJSON{Height: height}.Save(bs.db)

	// Done!
	bs.mtx.Lock()
	bs.height = height
	bs.mtx.Unlock()

	// Flush
	bs.db.SetSync(nil, nil)
}

// saveBlockPart persists a single part of the block at the given height.
// It panics unless height is exactly one above the current store height.
func (bs *BlockStore) saveBlockPart(height int64, index int, part *types.Part) {
	if height != bs.Height()+1 {
		panic(fmt.Sprintf("BlockStore can only save contiguous blocks. Wanted %v, got %v", bs.Height()+1, height))
	}
	partBytes := cdc.MustMarshalBinaryBare(part)
	bs.db.Set(calcBlockPartKey(height, index), partBytes)
}

//-----------------------------------------------------------------------------

// calcBlockMetaKey returns the db key for a BlockMeta: "H:<height>".
func calcBlockMetaKey(height int64) []byte {
	return []byte(fmt.Sprintf("H:%v", height))
}

// calcBlockPartKey returns the db key for a block part: "P:<height>:<partIndex>".
func calcBlockPartKey(height int64, partIndex int) []byte {
	return []byte(fmt.Sprintf("P:%v:%v", height, partIndex))
}

// calcBlockCommitKey returns the db key for a block commit: "C:<height>".
func calcBlockCommitKey(height int64) []byte {
	return []byte(fmt.Sprintf("C:%v", height))
}

// calcSeenCommitKey returns the db key for a seen commit: "SC:<height>".
func calcSeenCommitKey(height int64) []byte {
	return []byte(fmt.Sprintf("SC:%v", height))
}

//-----------------------------------------------------------------------------

var blockStoreKey = []byte("blockStore")

// BlockStoreStateJSON is the block store state JSON structure.
type BlockStoreStateJSON struct {
	Height int64 `json:"height"`
}

// Save persists the blockStore state to the database as JSON.
func (bsj BlockStoreStateJSON) Save(db dbm.DB) {
	bytes, err := cdc.MarshalJSON(bsj)
	if err != nil {
		panic(fmt.Sprintf("Could not marshal state bytes: %v", err))
	}
	db.SetSync(blockStoreKey, bytes)
}

// LoadBlockStoreStateJSON returns the BlockStoreStateJSON as loaded from disk.
// If no BlockStoreStateJSON was previously persisted, it returns the zero value.
func LoadBlockStoreStateJSON(db dbm.DB) BlockStoreStateJSON {
	bytes := db.Get(blockStoreKey)
	if len(bytes) == 0 {
		return BlockStoreStateJSON{
			Height: 0,
		}
	}
	bsj := BlockStoreStateJSON{}
	err := cdc.UnmarshalJSON(bytes, &bsj)
	if err != nil {
		panic(fmt.Sprintf("Could not unmarshal bytes: %X", bytes))
	}
	return bsj
}