package consensus

import (
	"context"
	"fmt"
	"os"
	"path"
	"sync"
	"testing"
	"time"

	"github.com/fortytw2/leaktest"
	"github.com/stretchr/testify/mock"
	"github.com/stretchr/testify/require"

	abcicli "github.com/tendermint/tendermint/abci/client"
	"github.com/tendermint/tendermint/abci/example/kvstore"
	abci "github.com/tendermint/tendermint/abci/types"
	cfg "github.com/tendermint/tendermint/config"
	cryptoenc "github.com/tendermint/tendermint/crypto/encoding"
	tmsync "github.com/tendermint/tendermint/internal/libs/sync"
	"github.com/tendermint/tendermint/internal/test/factory"
	"github.com/tendermint/tendermint/libs/log"
	mempl "github.com/tendermint/tendermint/mempool"
	"github.com/tendermint/tendermint/p2p"
	"github.com/tendermint/tendermint/p2p/p2ptest"
	tmcons "github.com/tendermint/tendermint/proto/tendermint/consensus"
	sm "github.com/tendermint/tendermint/state"
	statemocks "github.com/tendermint/tendermint/state/mocks"
	"github.com/tendermint/tendermint/store"
	"github.com/tendermint/tendermint/types"
	dbm "github.com/tendermint/tm-db"
)

var (
	defaultTestTime = time.Date(2019, 1, 1, 0, 0, 0, 0, time.UTC)
)

type reactorTestSuite struct {
	network             *p2ptest.Network
	states              map[p2p.NodeID]*State
	reactors            map[p2p.NodeID]*Reactor
	subs                map[p2p.NodeID]types.Subscription
	stateChannels       map[p2p.NodeID]*p2p.Channel
	dataChannels        map[p2p.NodeID]*p2p.Channel
	voteChannels        map[p2p.NodeID]*p2p.Channel
	voteSetBitsChannels map[p2p.NodeID]*p2p.Channel
}

func chDesc(chID p2p.ChannelID) p2p.ChannelDescriptor {
	return p2p.ChannelDescriptor{
		ID: byte(chID),
	}
}

func setup(t *testing.T, numNodes int, states []*State, size int) *reactorTestSuite {
	t.Helper()

	rts := &reactorTestSuite{
		network:  p2ptest.MakeNetwork(t, p2ptest.NetworkOptions{NumNodes: numNodes}),
		states:   make(map[p2p.NodeID]*State),
		reactors: make(map[p2p.NodeID]*Reactor, numNodes),
		subs:     make(map[p2p.NodeID]types.Subscription, numNodes),
	}

	rts.stateChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(StateChannel), new(tmcons.Message), size)
	rts.dataChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(DataChannel), new(tmcons.Message), size)
	rts.voteChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(VoteChannel), new(tmcons.Message), size)
	rts.voteSetBitsChannels = rts.network.MakeChannelsNoCleanup(t, chDesc(VoteSetBitsChannel), new(tmcons.Message), size)

	i := 0
	for nodeID, node := range rts.network.Nodes {
		state := states[i]

		reactor := NewReactor(
			state.Logger.With("node", nodeID),
			state,
			rts.stateChannels[nodeID],
			rts.dataChannels[nodeID],
			rts.voteChannels[nodeID],
			rts.voteSetBitsChannels[nodeID],
			node.MakePeerUpdates(t),
			true,
		)

		reactor.SetEventBus(state.eventBus)

		blocksSub, err := state.eventBus.Subscribe(context.Background(), testSubscriber, types.EventQueryNewBlock, size)
		require.NoError(t, err)

		rts.states[nodeID] = state
		rts.subs[nodeID] = blocksSub
		rts.reactors[nodeID] = reactor

		// simulate handle initChain in handshake
		if state.state.LastBlockHeight == 0 {
			require.NoError(t, state.blockExec.Store().Save(state.state))
		}

		require.NoError(t, reactor.Start())
		require.True(t, reactor.IsRunning())

		i++
	}

	require.Len(t, rts.reactors, numNodes)

	// start the in-memory network and connect all peers with each other
	rts.network.Start(t)

	t.Cleanup(func() {
		for nodeID, r := range rts.reactors {
			require.NoError(t, rts.states[nodeID].eventBus.Stop())
			require.NoError(t, r.Stop())
			require.False(t, r.IsRunning())
		}

		leaktest.Check(t)
	})

	return rts
}

// validateBlock checks that the block's last commit has as many signatures as
// there are active validators and that every signature comes from an active
// validator.
func validateBlock(block *types.Block, activeVals map[string]struct{}) error {
	if block.LastCommit.Size() != len(activeVals) {
		return fmt.Errorf(
			"commit size doesn't match number of active validators. Got %d, expected %d",
			block.LastCommit.Size(), len(activeVals),
		)
	}

	for _, commitSig := range block.LastCommit.Signatures {
		if _, ok := activeVals[string(commitSig.ValidatorAddress)]; !ok {
			return fmt.Errorf("found vote for inactive validator %X", commitSig.ValidatorAddress)
		}
	}

	return nil
}

// waitForAndValidateBlock waits, on each of the first n subscriptions, for the
// next new-block event, validates that block against activeVals, and then
// submits txs to the corresponding node's mempool.
func waitForAndValidateBlock(
	t *testing.T,
	n int,
	activeVals map[string]struct{},
	blocksSubs []types.Subscription,
	states []*State,
	txs ...[]byte,
) {
	fn := func(j int) {
		msg := <-blocksSubs[j].Out()
		newBlock := msg.Data().(types.EventDataNewBlock).Block

		require.NoError(t, validateBlock(newBlock, activeVals))

		for _, tx := range txs {
			require.NoError(t, assertMempool(states[j].txNotifier).CheckTx(tx, nil, mempl.TxInfo{}))
		}
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			fn(j)
			wg.Done()
		}(i)
	}

	wg.Wait()
}

// waitForAndValidateBlockWithTx waits, on each of the first n subscriptions,
// until all of txs have appeared in committed blocks, validating every block
// received along the way.
func waitForAndValidateBlockWithTx(
	t *testing.T,
	n int,
	activeVals map[string]struct{},
	blocksSubs []types.Subscription,
	states []*State,
	txs ...[]byte,
) {
	fn := func(j int) {
		ntxs := 0
	BLOCK_TX_LOOP:
		for {
			msg := <-blocksSubs[j].Out()
			newBlock := msg.Data().(types.EventDataNewBlock).Block

			require.NoError(t, validateBlock(newBlock, activeVals))

			// check that txs match the txs we're waiting for.
			// note they could be spread over multiple blocks,
			// but they should be in order.
			for _, tx := range newBlock.Data.Txs {
				require.EqualValues(t, txs[ntxs], tx)
				ntxs++
			}

			if ntxs == len(txs) {
				break BLOCK_TX_LOOP
			}
		}
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			fn(j)
			wg.Done()
		}(i)
	}

	wg.Wait()
}

// waitForBlockWithUpdatedValsAndValidateIt waits, on each of the first n
// subscriptions, for the first block whose last commit has len(updatedVals)
// signatures, and then validates that block against updatedVals.
func waitForBlockWithUpdatedValsAndValidateIt(
	t *testing.T,
	n int,
	updatedVals map[string]struct{},
	blocksSubs []types.Subscription,
	css []*State,
) {
	fn := func(j int) {
		var newBlock *types.Block

	LOOP:
		for {
			msg := <-blocksSubs[j].Out()
			newBlock = msg.Data().(types.EventDataNewBlock).Block
			if newBlock.LastCommit.Size() == len(updatedVals) {
				break LOOP
			}
		}

		require.NoError(t, validateBlock(newBlock, updatedVals))
	}

	var wg sync.WaitGroup
	wg.Add(n)

	for i := 0; i < n; i++ {
		go func(j int) {
			fn(j)
			wg.Done()
		}(i)
	}

	wg.Wait()
}
func TestReactorBasic(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(config, n, "consensus_reactor_test", newMockTickerFunc(true), newCounter)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}
	wg.Wait()
}

func TestReactorWithEvidence(t *testing.T) {
	config := configSetup(t)

	n := 4
	testName := "consensus_reactor_test"
	tickerFunc := newMockTickerFunc(true)
	appFunc := newCounter

	genDoc, privVals := factory.RandGenesisDoc(config, n, false, 30)
	states := make([]*State, n)
	logger := consensusLogger()

	for i := 0; i < n; i++ {
		stateDB := dbm.NewMemDB() // each state needs its own db
		stateStore := sm.NewStore(stateDB)
		// fail fast instead of silently continuing with a zero-value state
		state, err := stateStore.LoadFromDBOrGenesisDoc(genDoc)
		require.NoError(t, err)
		thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))
		defer os.RemoveAll(thisConfig.RootDir)
		ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
		app := appFunc()
		vals := types.TM2PB.ValidatorUpdates(state.Validators)
		app.InitChain(abci.RequestInitChain{Validators: vals})

		pv := privVals[i]
		blockDB := dbm.NewMemDB()
		blockStore := store.NewBlockStore(blockDB)

		// one for mempool, one for consensus
		mtx := new(tmsync.RWMutex)
		proxyAppConnMem := abcicli.NewLocalClient(mtx, app)
		proxyAppConnCon := abcicli.NewLocalClient(mtx, app)

		mempool := mempl.NewCListMempool(thisConfig.Mempool, proxyAppConnMem, 0)
		mempool.SetLogger(log.TestingLogger().With("module", "mempool"))
		if thisConfig.Consensus.WaitForTxs() {
			mempool.EnableTxsAvailable()
		}

		// mock the evidence pool
		// everyone includes evidence of another double signing
		vIdx := (i + 1) % n
		ev := types.NewMockDuplicateVoteEvidenceWithValidator(1, defaultTestTime, privVals[vIdx], config.ChainID())
		evpool := &statemocks.EvidencePool{}
		evpool.On("CheckEvidence", mock.AnythingOfType("types.EvidenceList")).Return(nil)
		evpool.On("PendingEvidence", mock.AnythingOfType("int64")).Return([]types.Evidence{
			ev}, int64(len(ev.Bytes())))
		evpool.On("Update", mock.AnythingOfType("state.State"), mock.AnythingOfType("types.EvidenceList")).Return()

		evpool2 := sm.EmptyEvidencePool{}

		blockExec := sm.NewBlockExecutor(stateStore, log.TestingLogger(), proxyAppConnCon, mempool, evpool)
		cs := NewState(thisConfig.Consensus, state, blockExec, blockStore, mempool, evpool2)
		cs.SetLogger(log.TestingLogger().With("module", "consensus"))
		cs.SetPrivValidator(pv)

		eventBus := types.NewEventBus()
		eventBus.SetLogger(log.TestingLogger().With("module", "events"))
		err = eventBus.Start()
		require.NoError(t, err)
		cs.SetEventBus(eventBus)

		cs.SetTimeoutTicker(tickerFunc())
		cs.SetLogger(logger.With("validator", i, "module", "consensus"))

		states[i] = cs
	}

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// We expect each validator that is the proposer to propose one piece of
		// evidence.
		go func(s types.Subscription) {
			// Deferred so wg.Wait cannot hang if require.Len aborts this goroutine.
			defer wg.Done()
			msg := <-s.Out()
			block := msg.Data().(types.EventDataNewBlock).Block
			require.Len(t, block.Evidence.Evidence, 1)
		}(sub)
	}
	wg.Wait()
}

func TestReactorCreatesBlockWhenEmptyBlocksFalse(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(
		config,
		n,
		"consensus_reactor_test",
		newMockTickerFunc(true),
		newCounter,
		func(c *cfg.Config) {
			c.Consensus.CreateEmptyBlocks = false
		},
	)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// send a tx
	require.NoError(t, assertMempool(states[3].txNotifier).CheckTx([]byte{1, 2, 3}, nil, mempl.TxInfo{}))

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}
	wg.Wait()
}

func TestReactorRecordsVotesAndBlockParts(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(config, n, "consensus_reactor_test", newMockTickerFunc(true), newCounter)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}
	wg.Wait()

	// Require at least one node to have sent block parts, but we can't know which
	// peer sent it.
	require.Eventually(
		t,
		func() bool {
			for _, reactor := range rts.reactors {
				for _, ps := range reactor.peers {
					if ps.BlockPartsSent() > 0 {
						return true
					}
				}
			}
			return false
		},
		time.Second,
		10*time.Millisecond,
		"number of block parts sent should've increased",
	)

	nodeID := rts.network.RandomNode().NodeID
	reactor := rts.reactors[nodeID]
	peers := rts.network.Peers(nodeID)

	ps, ok := reactor.GetPeerState(peers[0].NodeID)
	require.True(t, ok)
	require.NotNil(t, ps)
	require.Greater(t, ps.VotesSent(), 0, "number of votes sent should've increased")
}

func TestReactorVotingPowerChange(t *testing.T) {
	config := configSetup(t)

	n := 4
	states, cleanup := randConsensusState(
		config,
		n,
		"consensus_voting_power_changes_test",
		newMockTickerFunc(true),
		newPersistentKVStore,
	)
	t.Cleanup(cleanup)

	rts := setup(t, n, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// map of active validators
	activeVals := make(map[string]struct{})
	for i := 0; i < n; i++ {
		pubKey, err := states[i].privValidator.GetPubKey(context.Background())
		require.NoError(t, err)
		addr := pubKey.Address()
		activeVals[string(addr)] = struct{}{}
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}
	wg.Wait()

	blocksSubs := []types.Subscription{}
	for _, sub := range rts.subs {
		blocksSubs = append(blocksSubs, sub)
	}

	val1PubKey, err := states[0].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)
	val1PubKeyABCI, err := cryptoenc.PubKeyToProto(val1PubKey)
	require.NoError(t, err)

	updateValidatorTx := kvstore.MakeValSetChangeTx(val1PubKeyABCI, 25)
	previousTotalVotingPower := states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower,
		states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)

	updateValidatorTx = kvstore.MakeValSetChangeTx(val1PubKeyABCI, 2)
	previousTotalVotingPower = states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, states[0].GetRoundState().LastValidators.TotalVotingPower(), previousTotalVotingPower,
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)

	updateValidatorTx = kvstore.MakeValSetChangeTx(val1PubKeyABCI, 26)
	previousTotalVotingPower = states[0].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlockWithTx(t, n, activeVals, blocksSubs, states, updateValidatorTx)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)
	waitForAndValidateBlock(t, n, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, previousTotalVotingPower, states[0].GetRoundState().LastValidators.TotalVotingPower(),
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower,
		states[0].GetRoundState().LastValidators.TotalVotingPower(),
	)
}

func TestReactorValidatorSetChanges(t *testing.T) {
	config := configSetup(t)

	nPeers := 7
	nVals := 4
	states, _, _, cleanup := randConsensusNetWithPeers(
		config,
		nVals,
		nPeers,
		"consensus_val_set_changes_test",
		newMockTickerFunc(true),
		newPersistentKVStoreWithPath,
	)
	t.Cleanup(cleanup)

	rts := setup(t, nPeers, states, 100) // buffer must be large enough to not deadlock
	for _, reactor := range rts.reactors {
		state := reactor.state.GetState()
		reactor.SwitchToConsensus(state, false)
	}

	// map of active validators
	activeVals := make(map[string]struct{})
	for i := 0; i < nVals; i++ {
		pubKey, err := states[i].privValidator.GetPubKey(context.Background())
		require.NoError(t, err)
		activeVals[string(pubKey.Address())] = struct{}{}
	}

	var wg sync.WaitGroup
	for _, sub := range rts.subs {
		wg.Add(1)
		// wait till everyone makes the first new block
		go func(s types.Subscription) {
			<-s.Out()
			wg.Done()
		}(sub)
	}
	wg.Wait()

	newValidatorPubKey1, err := states[nVals].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)
	valPubKey1ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey1)
	require.NoError(t, err)
	newValidatorTx1 := kvstore.MakeValSetChangeTx(valPubKey1ABCI, testMinPower)

	blocksSubs := []types.Subscription{}
	for _, sub := range rts.subs {
		blocksSubs = append(blocksSubs, sub)
	}

	// wait till everyone makes block 2
	// ensure the commit includes all validators
	// send newValTx to change vals in block 3
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, newValidatorTx1)

	// wait till everyone makes block 3.
	// it includes the commit for block 2, which is by the original validator set
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, newValidatorTx1)

	// wait till everyone makes block 4.
	// it includes the commit for block 3, which is by the original validator set
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	// the commits for block 4 should be with the updated validator set
	activeVals[string(newValidatorPubKey1.Address())] = struct{}{}

	// wait till everyone makes block 5
	// it includes the commit for block 4, which should have the updated validator set
	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	updateValidatorPubKey1, err := states[nVals].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)
	updatePubKey1ABCI, err := cryptoenc.PubKeyToProto(updateValidatorPubKey1)
	require.NoError(t, err)
	updateValidatorTx1 := kvstore.MakeValSetChangeTx(updatePubKey1ABCI, 25)
	previousTotalVotingPower := states[nVals].GetRoundState().LastValidators.TotalVotingPower()

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, updateValidatorTx1)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, updateValidatorTx1)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)
	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	require.NotEqualf(
		t, states[nVals].GetRoundState().LastValidators.TotalVotingPower(), previousTotalVotingPower,
		"expected voting power to change (before: %d, after: %d)",
		previousTotalVotingPower, states[nVals].GetRoundState().LastValidators.TotalVotingPower(),
	)

	newValidatorPubKey2, err := states[nVals+1].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)
	newVal2ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey2)
	require.NoError(t, err)
	newValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, testMinPower)

	newValidatorPubKey3, err := states[nVals+2].privValidator.GetPubKey(context.Background())
	require.NoError(t, err)
	newVal3ABCI, err := cryptoenc.PubKeyToProto(newValidatorPubKey3)
	require.NoError(t, err)
	newValidatorTx3 := kvstore.MakeValSetChangeTx(newVal3ABCI, testMinPower)

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, newValidatorTx2, newValidatorTx3)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, newValidatorTx2, newValidatorTx3)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	activeVals[string(newValidatorPubKey2.Address())] = struct{}{}
	activeVals[string(newValidatorPubKey3.Address())] = struct{}{}

	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)

	removeValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, 0)
	removeValidatorTx3 := kvstore.MakeValSetChangeTx(newVal3ABCI, 0)

	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states, removeValidatorTx2, removeValidatorTx3)
	waitForAndValidateBlockWithTx(t, nPeers, activeVals, blocksSubs, states, removeValidatorTx2, removeValidatorTx3)
	waitForAndValidateBlock(t, nPeers, activeVals, blocksSubs, states)

	delete(activeVals, string(newValidatorPubKey2.Address()))
	delete(activeVals, string(newValidatorPubKey3.Address()))

	waitForBlockWithUpdatedValsAndValidateIt(t, nPeers, activeVals, blocksSubs, states)
}