package consensus

import (
	"context"
	"fmt"
	"os"
	"path"
	"sync"
	"testing"
	"time"

	"github.com/fortytw2/leaktest"
	"github.com/stretchr/testify/mock"
	"github.com/stretchr/testify/require"

	abcicli "github.com/tendermint/tendermint/abci/client"
	"github.com/tendermint/tendermint/abci/example/kvstore"
	abci "github.com/tendermint/tendermint/abci/types"
	cfg "github.com/tendermint/tendermint/config"
	cryptoenc "github.com/tendermint/tendermint/crypto/encoding"
	tmsync "github.com/tendermint/tendermint/internal/libs/sync"
	"github.com/tendermint/tendermint/internal/mempool"
	mempoolv0 "github.com/tendermint/tendermint/internal/mempool/v0"
	"github.com/tendermint/tendermint/internal/p2p"
	"github.com/tendermint/tendermint/internal/p2p/p2ptest"
	"github.com/tendermint/tendermint/internal/test/factory"
	"github.com/tendermint/tendermint/libs/log"
	tmcons "github.com/tendermint/tendermint/proto/tendermint/consensus"
	sm "github.com/tendermint/tendermint/state"
	statemocks "github.com/tendermint/tendermint/state/mocks"
	"github.com/tendermint/tendermint/store"
	"github.com/tendermint/tendermint/types"
	dbm "github.com/tendermint/tm-db"
)

var (
	defaultTestTime = time.Date(2019, 1, 1, 0, 0, 0, 0, time.UTC)
)

// reactorTestSuite bundles the in-memory network with the per-node consensus
// states, reactors, new-block subscriptions and p2p channels used by the tests.
type reactorTestSuite struct {
	network             *p2ptest.Network
	states              map[p2p.NodeID]*State
	reactors            map[p2p.NodeID]*Reactor
	subs                map[p2p.NodeID]types.Subscription
	stateChannels       map[p2p.NodeID]*p2p.Channel
	dataChannels        map[p2p.NodeID]*p2p.Channel
	voteChannels        map[p2p.NodeID]*p2p.Channel
	voteSetBitsChannels map[p2p.NodeID]*p2p.Channel
}

// chDesc returns a channel descriptor that sets only the channel ID.
func chDesc(chID p2p.ChannelID) p2p.ChannelDescriptor {
	return p2p.ChannelDescriptor{
		ID: byte(chID),
	}
}

// setup creates an in-memory network of numNodes nodes, starts one consensus
// reactor per node (backed by the matching entry in states) and subscribes to
// each node's new-block events. size is used both as the p2p channel buffer
// size and as the capacity of the new-block subscriptions.
func setup(t *testing.T, numNodes int, states []*State, size int) *reactorTestSuite {
	t.Helper()

	rts := &reactorTestSuite{
		network:  p2ptest.MakeNetwork(t, p2ptest.NetworkOptions{NumNodes: numNodes}),
		states:   make(map[p2p.NodeID]*State),
		reactors: make(map[p2p.NodeID]*Reactor, numNodes),
		subs:     make(map[p2p.NodeID]types.Subscription, numNodes),
	}

	rts.stateChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(StateChannel), new(tmcons.Message), size)
	rts.dataChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(DataChannel), new(tmcons.Message), size)
	rts.voteChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(VoteChannel), new(tmcons.Message), size)
	rts.voteSetBitsChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(VoteSetBitsChannel), new(tmcons.Message), size)

	i := 0
	for nodeID, node := range rts.network.Nodes {
		state := states[i]

		reactor := NewReactor(
			state.Logger.With("node", nodeID),
			state,
			rts.stateChannels[nodeID],
			rts.dataChannels[nodeID],
			rts.voteChannels[nodeID],
			rts.voteSetBitsChannels[nodeID],
			node.MakePeerUpdates(t),
			true,
		)

		reactor.SetEventBus(state.eventBus)

		blocksSub, err := state.eventBus.Subscribe(context.Background(), testSubscriber, types.EventQueryNewBlock, size)
		require.NoError(t, err)

		rts.states[nodeID] = state
		rts.subs[nodeID] = blocksSub
		rts.reactors[nodeID] = reactor

		// simulate handling of InitChain in the handshake
		if state.state.LastBlockHeight == 0 {
			require.NoError(t, state.blockExec.Store().Save(state.state))
		}

		require.NoError(t, reactor.Start())
		require.True(t, reactor.IsRunning())

		i++
	}

	require.Len(t, rts.reactors, numNodes)

	// start the in-memory network and connect all peers with each other
	rts.network.Start(t)

	t.Cleanup(func() {
		for nodeID, r := range rts.reactors {
			require.NoError(t, rts.states[nodeID].eventBus.Stop())
			require.NoError(t, r.Stop())
			require.False(t, r.IsRunning())
		}

		leaktest.Check(t)
	})

	return rts
}

// validateBlock checks that the size of the block's last commit equals the
// number of active validators and that every commit signature belongs to an
// active validator.
func validateBlock(block *types.Block, activeVals map[string]struct{}) error {
	if block.LastCommit.Size() != len(activeVals) {
		return fmt.Errorf(
			"commit size doesn't match number of active validators. Got %d, expected %d",
			block.LastCommit.Size(), len(activeVals),
		)
	}

	for _, commitSig := range block.LastCommit.Signatures {
		if _, ok := activeVals[string(commitSig.ValidatorAddress)]; !ok {
			return fmt.Errorf("found vote for inactive validator %X", commitSig.ValidatorAddress)
		}
	}

	return nil
}

// waitForAndValidateBlock waits, concurrently for each of the first n
// subscriptions, for one new block, validates it against activeVals and then
// submits txs to the corresponding node's mempool.
func waitForAndValidateBlock(
	t *testing.T,
	n int,
	activeVals map[string]struct{},
	blocksSubs []types.Subscription,
	states []*State,
	txs ...[]byte,
) {
	fn := func(j int) {
		msg := <-blocksSubs[j].Out()
		newBlock := msg.Data().(types.EventDataNewBlock).Block

		require.NoError(t, validateBlock(newBlock, activeVals))

		for _, tx := range txs {
			require.NoError(t, assertMempool(states[j].txNotifier).CheckTx(context.Background(), tx, nil, mempool.TxInfo{}))
		}
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			// defer so that wg.Wait still returns if an assertion exits the goroutine
			defer wg.Done()
			fn(j)
		}(i)
	}

	wg.Wait()
}

// waitForAndValidateBlockWithTx waits, concurrently for each of the first n
// subscriptions, until all of txs have been seen in new blocks, validating
// every block it receives along the way.
func waitForAndValidateBlockWithTx(
	t *testing.T,
	n int,
	activeVals map[string]struct{},
	blocksSubs []types.Subscription,
	states []*State,
	txs ...[]byte,
) {
	fn := func(j int) {
		ntxs := 0
	BLOCK_TX_LOOP:
		for {
			msg := <-blocksSubs[j].Out()
			newBlock := msg.Data().(types.EventDataNewBlock).Block

			require.NoError(t, validateBlock(newBlock, activeVals))

			// check that txs match the txs we're waiting for.
			// note they could be spread over multiple blocks,
			// but they should be in order.
			for _, tx := range newBlock.Data.Txs {
				require.EqualValues(t, txs[ntxs], tx)
				ntxs++
			}

			if ntxs == len(txs) {
				break BLOCK_TX_LOOP
			}
		}
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			// defer so that wg.Wait still returns if an assertion exits the goroutine
			defer wg.Done()
			fn(j)
		}(i)
	}

	wg.Wait()
}

// waitForBlockWithUpdatedValsAndValidateIt waits, concurrently for each of the
// first n subscriptions, for a block whose last commit size equals
// len(updatedVals), then validates that block against updatedVals.
func waitForBlockWithUpdatedValsAndValidateIt(
	t *testing.T,
	n int,
	updatedVals map[string]struct{},
	blocksSubs []types.Subscription,
	css []*State,
) {
	fn := func(j int) {
		var newBlock *types.Block

	LOOP:
		for {
			msg := <-blocksSubs[j].Out()
			newBlock = msg.Data().(types.EventDataNewBlock).Block
			if newBlock.LastCommit.Size() == len(updatedVals) {
				break LOOP
			}
		}

		require.NoError(t, validateBlock(newBlock, updatedVals))
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			// defer so that wg.Wait still returns if an assertion exits the goroutine
			defer wg.Done()
			fn(j)
		}(i)
	}

	wg.Wait()
}

func TestReactorBasic(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(t, config, n, "consensus_reactor_test", newMockTickerFunc(true), newCounter)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}

	wg.Wait()
}


func TestReactorWithEvidence(t *testing.T) {
	config := configSetup(t)

	n := 4
	testName := "consensus_reactor_test"
	tickerFunc := newMockTickerFunc(true)
	appFunc := newCounter

	genDoc, privVals := factory.RandGenesisDoc(config, n, false, 30)
	states := make([]*State, n)
	logger := consensusLogger()

	for i := 0; i < n; i++ {
		stateDB := dbm.NewMemDB() // each state needs its own db
		stateStore := sm.NewStore(stateDB)
		state, err := sm.MakeGenesisState(genDoc)
		require.NoError(t, err)
		thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))

		defer os.RemoveAll(thisConfig.RootDir)

		ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
		app := appFunc()
		vals := types.TM2PB.ValidatorUpdates(state.Validators)
		app.InitChain(abci.RequestInitChain{Validators: vals})

		pv := privVals[i]
		blockDB := dbm.NewMemDB()
		blockStore := store.NewBlockStore(blockDB)

		// one for mempool, one for consensus
		mtx := new(tmsync.RWMutex)
		proxyAppConnMem := abcicli.NewLocalClient(mtx, app)
		proxyAppConnCon := abcicli.NewLocalClient(mtx, app)

		mempool := mempoolv0.NewCListMempool(thisConfig.Mempool, proxyAppConnMem, 0)
		mempool.SetLogger(log.TestingLogger().With("module", "mempool"))
		if thisConfig.Consensus.WaitForTxs() {
			mempool.EnableTxsAvailable()
		}

		// mock the evidence pool
		// everyone includes evidence of another double signing
		vIdx := (i + 1) % n
		ev := types.NewMockDuplicateVoteEvidenceWithValidator(1, defaultTestTime, privVals[vIdx], config.ChainID())
		evpool := &statemocks.EvidencePool{}
		evpool.On("CheckEvidence", mock.AnythingOfType("types.EvidenceList")).Return(nil)
		evpool.On("PendingEvidence", mock.AnythingOfType("int64")).Return([]types.Evidence{
			ev}, int64(len(ev.Bytes())))
		evpool.On("Update", mock.AnythingOfType("state.State"), mock.AnythingOfType("types.EvidenceList")).Return()

		evpool2 := sm.EmptyEvidencePool{}

		blockExec := sm.NewBlockExecutor(stateStore, log.TestingLogger(), proxyAppConnCon, mempool, evpool, blockStore)
		cs := NewState(thisConfig.Consensus, state, blockExec, blockStore, mempool, evpool2)
		cs.SetLogger(log.TestingLogger().With("module", "consensus"))
		cs.SetPrivValidator(pv)

		eventBus := types.NewEventBus()
		eventBus.SetLogger(log.TestingLogger().With("module", "events"))
		err = eventBus.Start()
		require.NoError(t, err)
		cs.SetEventBus(eventBus)

		cs.SetTimeoutTicker(tickerFunc())
		cs.SetLogger(logger.With("validator", i, "module", "consensus"))

		states[i] = cs
	}

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}
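
	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// We expect for each validator that is the proposer to propose one piece of
		// evidence.
		go func(s types.Subscription) {
			// Deferred so that wg.Wait cannot hang if the goroutine exits early.
			defer wg.Done()

			msg := <-s.Out()
			block := msg.Data().(types.EventDataNewBlock).Block

			// Report with t.Errorf rather than require: require calls t.FailNow,
			// which is only valid on the goroutine running the test function.
			if got := len(block.Evidence.Evidence); got != 1 {
				t.Errorf("expected 1 piece of evidence in the block, got %d", got)
			}
		}(sub)
	}

	wg.Wait()
}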

func TestReactorCreatesBlockWhenEmptyBlocksFalse(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(
		t,
		config,
		n,
		"consensus_reactor_test",
		newMockTickerFunc(true),
		newCounter,
		func(c *cfg.Config) {
			c.Consensus.CreateEmptyBlocks = false
		},
	)

	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// send a tx
	require.NoError(
		t,
		assertMempool(states[3].txNotifier).CheckTx(
			context.Background(),
			[]byte{1, 2, 3},
			nil,
			mempool.TxInfo{},
		),
	)

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}

	wg.Wait()
}

func TestReactorRecordsVotesAndBlockParts(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(t, config, n, "consensus_reactor_test", newMockTickerFunc(true), newCounter)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}

	wg.Wait()

	// Require at least one node to have sent block parts, but we can't know which
	// peer sent it.
	require.Eventually(
		t,
		func() bool {
			for _, reactor := range rts.reactors {
				for _, ps := range reactor.peers {
					if ps.BlockPartsSent() > 0 {
						return true
					}
				}
			}

			return false
		},
		time.Second,
		10*time.Millisecond,
		"number of block parts sent should've increased",
	)

	nodeID := rts.network.RandomNode().NodeID
	reactor := rts.reactors[nodeID]
	peers := rts.network.Peers(nodeID)

	ps, ok := reactor.GetPeerState(peers[0].NodeID)
	require.True(t, ok)
	require.NotNil(t, ps)
	require.Greater(t, ps.VotesSent(), 0, "number of votes sent should've increased")
}

func TestReactorVotingPowerChange(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(
		t,
		config,
		n,
		"consensus_voting_power_changes_test",
		newMockTickerFunc(true),
		newPersistentKVStore,
	)

	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// map of active validators
	activeVals := make(map[string]struct{})
	for i := 0; i < n; i++ {
		pubKey, err := states[i].privValidator.GetPubKey(context.Background())
		require.NoError(t, err)

		addr := pubKey.Address()
		activeVals[string(addr)] = struct{}{}
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}

	wg.Wait()

	blocksSubs := []types.Subscription{}
	for _, sub := range rts.subs {
		blocksSubs = append(blocksSubs, sub)
	}

	val1PubKey, err := states[0].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)

	val1PubKeyABCI, err := cryptoenc.PubKeyToProto(val1PubKey)
	require.NoError(t, err)

	updateValidatorTx := kvstore.MakeValSetChangeTx(val1PubKeyABCI, 25)
	previousTotalVotingPower := states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower,
		states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)

	updateValidatorTx = kvstore.MakeValSetChangeTx(val1PubKeyABCI, 2)
	previousTotalVotingPower = states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, states[0].GetRoundState().LastValidators.TotalVotingPower(), previousTotalVotingPower,
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)

	updateValidatorTx = kvstore.MakeValSetChangeTx(val1PubKeyABCI, 26)
	previousTotalVotingPower = states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower,
		states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)
}

func TestReactorValidatorSetChanges(t *testing.T) {
	config := configSetup(t)

	nPeers := 7
	nVals := 4
	states, _, _, cleanup := randConsensusNetWithPeers(
		config,
		nVals,
		nPeers,
		"consensus_val_set_changes_test",
		newMockTickerFunc(true),
		newPersistentKVStoreWithPath,
	)
	t.Cleanup(cleanup)

	rts := setup(t, nPeers, states, 100) // buffer must be large enough to not deadlock

	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// map of active validators
	activeVals := make(map[string]struct{})
	for i := 0; i < nVals; i++ {
		pubKey, err := states[i].privValidator.GetPubKey(context.Background())
		require.NoError(t, err)

		activeVals[string(pubKey.Address())] = struct{}{}
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)

		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}

	wg.Wait()

	newValidatorPubKey1, err := states[nVals].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)

	valPubKey1ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey1)
	require.NoError(t, err)

	newValidatorTx1 := kvstore.MakeValSetChangeTx(valPubKey1ABCI, testMinPower)

	blocksSubs := []types.Subscription{}
	for _, sub := range rts.subs {
		blocksSubs = append(blocksSubs, sub)
	}

	// wait till everyone makes block 2
	// ensure the commit includes all validators
	// send newValTx to change vals in block 3
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, newValidatorTx1)

	// wait till everyone makes block 3.
	// it includes the commit for block 2, which is by the original validator set
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, newValidatorTx1)

	// wait till everyone makes block 4.
	// it includes the commit for block 3, which is by the original validator set
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	// the commits for block 4 should be with the updated validator set
	activeVals[string(newValidatorPubKey1.Address())] = struct{}{}

	// wait till everyone makes block 5
	// it includes the commit for block 4, which should have the updated validator set
	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	updateValidatorPubKey1, err := states[nVals].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)

	updatePubKey1ABCI, err := cryptoenc.PubKeyToProto(updateValidatorPubKey1)
	require.NoError(t, err)

	updateValidatorTx1 := kvstore.MakeValSetChangeTx(updatePubKey1ABCI, 25)
	previousTotalVotingPower := states[nVals].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, updateValidatorTx1)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, updateValidatorTx1)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)
	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, states[nVals].GetRoundState().LastValidators.TotalVotingPower(), previousTotalVotingPower,
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower, states[nVals].GetRoundState().LastValidators.TotalVotingPower(),
	)

	newValidatorPubKey2, err := states[nVals+1].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)

	newVal2ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey2)
	require.NoError(t, err)

	newValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, testMinPower)

	newValidatorPubKey3, err := states[nVals+2].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)

	newVal3ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey3)
	require.NoError(t, err)

	newValidatorTx3 := kvstore.MakeValSetChangeTx(newVal3ABCI, testMinPower)

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, newValidatorTx2, newValidatorTx3)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, newValidatorTx2, newValidatorTx3)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	activeVals[string(newValidatorPubKey2.Address())] = struct{}{}
	activeVals[string(newValidatorPubKey3.Address())] = struct{}{}

	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	removeValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, 0)
	removeValidatorTx3 := kvstore.MakeValSetChangeTx(newVal3ABCI, 0)

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, removeValidatorTx2, removeValidatorTx3)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, removeValidatorTx2, removeValidatorTx3)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	delete(activeVals, string(newValidatorPubKey2.Address()))
	delete(activeVals, string(newValidatorPubKey3.Address()))

	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)
}